Techno Ladder: 2012-05-20

Saturday, May 26, 2012

Code Smells & Coding Standards

What is a smell? It's a symptom of a possible deeper problem. Say you are cooking something and notice an unusual smell: you know at once that something may have gone wrong. A novice cook cannot tell that the smell indicates a problem, because identifying a smell takes some experience. The same applies to computer programming.
A code smell is a hint that something has gone wrong somewhere in the code. Code smells are possible bad practices used while coding. The following are all referred to as code smells.

1. Large files
2. Large functions
3. Large function parameter list
4. Commented Code
5. Complex Conditional Logic
6. Switch-case or if-else on someone else's data
7. Duplication of information
8. Multiple meanings to a variable - Same variable is used in different ways inside a single function or single class
9. Inconsistent parameter ordering across methods
10. Feature Envy - A method needs too much information from another object.
11. Data clumps - Data items that are always used together.
12. Parallel inheritance hierarchies
13. Lazy class - A class with no clear purpose
14. Middle man - Too many delegating methods
15. Temp Field
16. Message Chains
17. Data classes - Class with only getter and setter methods.

Coding Standards

1. Avoid TODOs and FIXMEs - Code carrying them still passes the test cases, which hides possible errors and gives the wrong impression in the test report that everything is implemented.

2. Document the side effects - Document why rather than how, and keep the documentation up to date

3. Use tabs or spaces but not both

4. Put curly braces at the same level

5. Do not have long line lengths

6. Do not encode type information in variable names

7. Avoid being abstract: do not use variable names like 'it', 'everything', 'data', 'handle', 'stuff', 'perform' etc.

8. Whenever copy paste occurs, refactor into a method

10. Do not have comment history in the file header if you are using version control

11. Do not mention bug ids in the source; mention them in the commit message instead.

In other words, do the opposite of what is described at http://thc.org/root/phun/unmaintain.html

Handling Software Errors

All programmers deal with software errors. In this post I will talk about the types of errors and how to deal with them, without being specific to any programming language, and also touch upon code smells. Broadly, errors can be classified as follows.
1. User Input Error: The user keys in invalid data and the program does not handle it
2. Programmer Error: Coding errors and broken code
3. Exceptional Circumstances: Loss of network connectivity, running out of memory

What to do with errors?

1. Raise an error when something goes wrong
2. Detect all possible error reports
3. Handle them appropriately
4. Propagate errors when we can't handle them

Error handling Guidelines

1. Never ignore an error condition. If you do not know how to handle the problem, signal a failure back to the calling code.
2. Do not sweep an error under the rug and hope for the best.
3. Do not return null to signal an error condition.
4. Too many getter and setter methods indicate an 'ASK & Decide' approach; eliminate or minimize getters and setters. Moving from the ASK & Decide approach to the TELL approach gives better encapsulation and a better design. Remember that get/set methods violate encapsulation. Reducing them improves code quality and readability.
5. Develop code that is easy to use and 'hard to misuse'. This reduces the chance of an error.
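
The guidelines above can be sketched in Java. This is only a minimal illustration: the class names (AccountRepository, ReportService, AccountNotFoundException) are hypothetical and not taken from any real system. The lookup raises an exception instead of returning null, and the layer that cannot handle the problem lets the error propagate to its caller.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception used to signal a failed lookup to the calling code.
class AccountNotFoundException extends RuntimeException {
    AccountNotFoundException(String id) {
        super("No account with id: " + id);
    }
}

// Hypothetical repository; names are illustrative only.
class AccountRepository {
    private final Map<String, Integer> balances = new HashMap<>();

    void open(String id, int openingBalance) {
        if (id == null || id.isEmpty()) {
            // Raise an error as soon as something is wrong.
            throw new IllegalArgumentException("id must not be empty");
        }
        balances.put(id, openingBalance);
    }

    // Signals failure with an exception instead of returning null.
    int balanceOf(String id) {
        Integer balance = balances.get(id);
        if (balance == null) {
            throw new AccountNotFoundException(id);
        }
        return balance;
    }
}

class ReportService {
    private final AccountRepository repository;

    ReportService(AccountRepository repository) {
        this.repository = repository;
    }

    // This layer cannot recover from a missing account,
    // so it does not catch the exception and lets it propagate.
    String summary(String id) {
        return id + ": " + repository.balanceOf(id);
    }
}
```

The caller of summary() decides what to do with AccountNotFoundException; nothing in between swallows it.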

Code Review Principles

If you think about it, code review is an overhead. Does it add any value for the customers? The answer is no. If you tell your customers that you need 50 hours for code reviews, they will ask why you do code reviews at all. Why don't you write the code right the first time? Let's answer another question. Does code review add value to the company? Maybe, but very little. The bottom line is that code reviews are not for our customers but for us. Certain companies impose code review as a mandatory step in the software development process, and those companies find little value in it.

In my view, code reviews are essential because of factors such as the different experience levels of developers, or a newly joined employee modifying code before understanding the entire system. There are different kinds of code review: self review (which every developer does), peer review (code reviewed by your co-workers) and review by an expert who understands the entire system. In this post, I am going to talk about best practices and guidelines for code reviews and touch upon something called assumptions.

What are the types of code reviews?

There are two types: (1) Static and (2) Logic. Static code reviews check that naming conventions are followed and look for syntax problems, common mistakes and so on. Today, static code reviews are done by tools, since they mostly apply standard rules. Some examples of static code review tools are FindBugs, PMD and Checkstyle. In the second type, Logic, the program logic is reviewed for logical errors, code smells, defects introduced between requirements and design or between design and code, dead code and so on.

What is dead code and how is it identified?

Dead code is unreachable code. It is present in the source files but has no execution path; in other words, no test case can be defined that exercises it. Dead code is therefore identified and eliminated during code reviews. Identifying it is tricky unless test cases have been written for everything. Code coverage reports are one way to find dead code; however, as I said, this requires one hundred percent test coverage. Just looking up the references to a given method across the entire source project does not suffice. Some methods are never referenced directly but still get called via reflection. There are also callback methods, which get called in response to certain events.

Some examples of dead code are: int n = 3+1; if (n==5) { //some code }, and unused variables. The best way to identify this kind of code is to look at the warnings given by the IDE. We may have to increase the warning level (every IDE has an option for this) to see all possible warnings, because by default IDEs ignore some of them.
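
As a small sketch of the two cases above (the class and method names are made up for illustration): compute() contains a branch that can never run and an unused variable, while onEvent() has no direct caller in the source and still is not dead, because it is invoked by name through reflection.

```java
import java.lang.reflect.Method;

class DeadCodeExample {
    static int compute() {
        int n = 3 + 1;      // n is always 4
        int unused = 42;    // unused variable: IDEs usually warn about this
        if (n == 5) {       // always false, so this branch is dead code
            return -1;
        }
        return n;
    }

    // No direct references anywhere in the source, yet still reachable:
    // callByName() below invokes it through reflection.
    static String onEvent() {
        return "handled";
    }

    static String callByName(String methodName) {
        try {
            Method method = DeadCodeExample.class.getDeclaredMethod(methodName);
            return (String) method.invoke(null); // static method, so no receiver
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot invoke " + methodName, e);
        }
    }
}
```

A plain "find references" search would report onEvent() as unused, which is why reference counts alone are not enough to declare code dead.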

What else to look for in the code review?

Global states: Typically global variables, singletons and static variables are the global states in the code. The problems with global states are (1) Non-locality - Global state can be read and modified by any other part of the program, making it difficult to remember or reason about every possible use and side effect. So a change in one place can affect seemingly unrelated code. (2) Implicit Coupling - Between the global state and the modules that use it. (3) Concurrency Issues - Global state is shared by every thread, so any change to it can cause concurrency problems. So minimize the use of global state:

1. No global variables
2. Minimize class static variables
3. Minimize singletons
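
One way to follow these rules, shown here only as a sketch with hypothetical class names, is to let an object own the state and pass it explicitly to the code that needs it. The coupling then becomes visible in the constructor instead of being implicit.

```java
// Global state: any code anywhere can read or change this counter.
class GlobalCounter {
    static int count = 0;
}

// Alternative: the state is owned by an object.
class Counter {
    private int count;

    void increment() {
        count++;
    }

    int value() {
        return count;
    }
}

// The dependency on Counter is explicit in the constructor.
class RequestHandler {
    private final Counter counter;

    RequestHandler(Counter counter) {
        this.counter = counter;
    }

    void handle() {
        counter.increment();
    }
}
```

Two handlers built with different Counter instances cannot affect each other, which is not true for code that touches GlobalCounter.count. This sketch does not address thread safety.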

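Magic Numbers: Look for unexplained numeric literals, and also for constants that are named after their value rather than their meaning. Consider declarations such as int TWENTY_FIVE = 26 or int ONE = 1: the name says nothing about the purpose and becomes misleading as soon as the value changes.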

Code Smells: A code smell is the intuitive feeling that something is wrong with the code. Highly experienced and knowledgeable developers have a feeling for good design. They routinely practice good design without thinking about it too much.

(Diagram: summary of an effective code review.)

ASSUMPTIONS

There are two types of assumptions a developer makes while coding: (1) implicit assumptions and (2) explicit assumptions. Assuming something during development will have side effects. Let's take a simple example to better understand the two types. Say a developer declares char filename[256]. Here it is assumed that a file name always fits in 256 chars (255 characters plus the terminating null); this is an explicit assumption. The declaration also has a side effect: the code works only on English operating systems. When it has to run on a Japanese OS it can crash, since it does not handle Unicode; this is an implicit assumption.

Typical implicit assumptions

1. User is expected to call function X() only once.
2. User is expected to call function X() before calling function Y().
3. There are no side effects of overwriting the value of Z
4. Assumptions about underlying OS (Supported OS, file name sizes, path separators etc)
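
The first two assumptions in the list can be turned into explicit checks. The following is a minimal sketch with a hypothetical Connection class, where open() plays the role of X() and read() the role of Y(); a caller who breaks the assumed order gets an immediate, descriptive error.

```java
// Hypothetical class: open() must be called exactly once, and before read().
class Connection {
    private boolean opened;

    void open() {
        if (opened) {
            throw new IllegalStateException("open() must be called only once");
        }
        opened = true;
    }

    String read() {
        if (!opened) {
            throw new IllegalStateException("open() must be called before read()");
        }
        return "data"; // placeholder payload for the sketch
    }
}
```

The assumption is still there, but it is now written down in code and enforced, instead of living only in the developer's head.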

Friday, May 25, 2012

Principles of Object Oriented Design

The first and foremost factor in software development is that requirements change. This poses the greatest challenge for software developers: to have a design that is open to changed requirements, or to entirely new requirements being added. You might have heard about the design debt quadrant, right?
The second key factor is that, over time, design degrades. In other words, design debt increases.

The characteristics of a good design are that it is robust, extensible, maintainable and so on. What about the characteristics of a bad design? They are rigidity (making a change in the software is difficult), fragility (the tendency of the software to break in several places after a single change), immobility (inability to reuse parts of the software elsewhere) and viscosity (it is easy to do the wrong thing but hard to do the right thing, so developers tend to avoid the 'clean fix').

What causes design degradation? Changes that introduce new and unplanned dependencies. To contain them, we need dependency firewalls.

Well, we have seen the design challenges present in any software development activity today. These challenges can be beaten by following certain object-oriented design principles, which I am going to talk about in this post. First I will explain the class design principles and then move on to the package design principles, which are fairly easy to understand.

Principles of Class Design

1. Single Responsibility Principle: This principle says that a class should have one, and only one, reason to change. For example, a Rectangle class which draws the rectangle as well as calculates its area does not follow SRP. To resolve this, we need two rectangle classes, one for drawing the rectangle and another for calculating the area. In this context a responsibility is defined as "a reason for change".

2. Open Closed Principle: This principle says that a method/class should be "open for extension but closed for modification". New functionality is implemented by a new class or a new method. Make all member variables private. Minimize the use of getter and setter methods. The key to OCP is abstraction. Add new code for new functionality; don't modify the existing code. Concrete implementations of the interfaces can be added anywhere (open for extension), but existing code must stay closed when new functionality arrives.
For a detailed example of OCP, have a look at this link

3. Liskov Substitution Principle: This principle says that in class hierarchies, subclasses should be substitutable for their base classes. The answer to the question "Is a square a rectangle?" is that it depends on the interface. Interfaces define the behavior; they also define assumptions and constraints. A square can be a rectangle or not, depending on how we design the interface. See an example for this

4. Dependency Inversion Principle: This principle says that the key is to depend upon abstractions: abstract interfaces do not change, and concrete classes can be replaced. Avoid deriving from concrete classes, and avoid associating to or aggregating concrete classes. See an example for this.

5. Interface Segregation Principle: This principle says that clients should not be forced to depend upon interfaces that they do not use. Do not depend on concretions. When a class collects interfaces for various purposes, it's called a fat interface. Click here to see an example for this.
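
A compact Java sketch of how some of these principles can look together. The Shape interface, Circle, RectangleRenderer and AreaCalculator are assumptions made for illustration; only the Rectangle example comes from the text above. Geometry and drawing live in separate classes (SRP), a new shape is added without editing existing classes (OCP), and AreaCalculator depends only on the Shape abstraction (DIP).

```java
import java.util.List;

// Abstraction that callers depend on.
interface Shape {
    double area();
}

// Knows only its own geometry.
class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    double width() { return width; }
    double height() { return height; }

    @Override
    public double area() {
        return width * height;
    }
}

// Added later without touching Rectangle or AreaCalculator.
class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

// Drawing is a separate responsibility with its own reason to change.
class RectangleRenderer {
    String render(Rectangle r) {
        return "Rectangle " + r.width() + " x " + r.height();
    }
}

// Works with any Shape, including ones written after this class.
class AreaCalculator {
    static double totalArea(List<? extends Shape> shapes) {
        double total = 0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }
}
```

The two accessors on Rectangle exist only so the renderer can do its job; a real design would weigh that against the advice to minimize getters.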

Principles of Package Design

A package is a group of classes. The following are certain principles to consider while designing packages.

1. Acyclic Dependency Principle: It says that dependencies between packages must NOT form cycles. The key is to break dependency cycles by introducing abstract interfaces, splitting packages or reorganizing packages, and then to reduce coupling. The basic rule that I follow is to visualize the high-level dependencies and then rationalize them, separating interfaces from implementations to break the dependencies that I don't want.

2. Stable Dependency Principle: This principle says that the dependencies between packages in a design should be in the direction of the stability of the packages. To be more precise, a package should only depend upon packages that are more stable than itself. So how do we measure the stability of a package? There is a calculation based on something called afferent coupling (Ca). For detailed information on package stability measurement, look at this under measuring stability section. There are tools to do this, for example JDepend for Java.
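
The text above only names afferent coupling (Ca). As an assumption about which measure is meant, the sketch below uses the commonly cited instability metric I = Ce / (Ca + Ce), where Ce (efferent coupling) counts outgoing dependencies; 0 means maximally stable and 1 maximally unstable. The class and method names are hypothetical.

```java
class PackageMetrics {
    // ca: afferent coupling, the number of outside classes that depend on this package.
    // ce: efferent coupling, the number of outside classes this package depends on.
    static double instability(int ca, int ce) {
        if (ca < 0 || ce < 0) {
            throw new IllegalArgumentException("coupling counts must not be negative");
        }
        if (ca + ce == 0) {
            return 0.0; // assumption: an isolated package is treated as stable
        }
        return (double) ce / (ca + ce);
    }

    // Stable Dependency Principle: depend only on packages that are at least as stable.
    static boolean dependencyAllowed(double fromInstability, double toInstability) {
        return toInstability <= fromInstability;
    }
}
```

For example, a package with three incoming and one outgoing dependency has an instability of 1 / (3 + 1) = 0.25, so a package with instability 0.8 may depend on it, but not the other way round.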

Well, the rationale behind these principles is that a design cannot remain static. We design some packages with the expectation that they will undergo changes.

Design By Contract Principle

( Fundamental design principles continued from previous post...)

DESIGN BY CONTRACT (DBC): This principle is about how the elements of software collaborate with each other. In a typical complex software system, the contracts are defined first, and then the components that need to talk to each other are defined. Typically the contracts are defined by a system architect who understands the entire system and knows how the components should interact. In the Java programming language, contracts are expressed as interfaces.
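
A minimal sketch of a contract expressed as a Java interface. The TemperatureSensor contract and its rules are hypothetical; the point is that the interface, together with its documented precondition and postcondition, is agreed first, and each component is then written against it.

```java
// Hypothetical contract agreed between the sensor team and the thermostat team.
interface TemperatureSensor {
    /**
     * Precondition: none.
     * Postcondition: the result is in degrees Celsius and never below -273.15.
     */
    double readCelsius();
}

// One side of the contract: an implementation that honours the postcondition.
class FixedSensor implements TemperatureSensor {
    private final double value;

    FixedSensor(double value) {
        if (value < -273.15) {
            throw new IllegalArgumentException("below absolute zero: " + value);
        }
        this.value = value;
    }

    @Override
    public double readCelsius() {
        return value;
    }
}

// The other side: a client that relies only on what the contract promises.
class Thermostat {
    private final TemperatureSensor sensor;

    Thermostat(TemperatureSensor sensor) {
        this.sensor = sensor;
    }

    boolean heatingNeeded(double targetCelsius) {
        return sensor.readCelsius() < targetCelsius;
    }
}
```

Thermostat and FixedSensor can be developed by different people and still fit together at integration time, because both were written against the same interface.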

Let's see the consequences of not designing by contract. If the contracts are poorly or incompletely defined, or not defined at all, developer integration is going to be a nightmare and will put the schedule at risk, because during DIT (Developer Integration Testing) the developers need to talk to all the other developers working on different components of the same system and correct the contracts. Individually developed components will not collaborate well. This needs to be avoided.

Eiffel is one programming language that follows the design by contract principle and is a good example of it; we may want to have a look at this language. Some real-world contracts are as follows

1. Device Drivers
2. Standards and Protocols
3. COM (Component Object Model)
4. SOA (Service Oriented Architecture)

A real-world example of design by contract is the relationship between employee and employer. The employee signs an agreement before accepting the employer's appointment letter, and both parties must adhere to what was signed in the agreement, in other words a contract.

MINIMIZING THE IMPACT OF CHANGE

This principle says "Keep the Software Soft". Have you seen a software system where you modify a line of code, assume your modification works well, and in fact it has broken several pieces of functionality? If the answer is yes, then the system is probably not following this principle. This typically happens as software grows, because the design degrades over time. The principle of Minimizing the Impact of Change is about how we can avoid these situations, and we are encouraged to apply the following guidelines.

1. Isolate Code: Isolate the code that has one responsibility, meaning that the code responsible for a given task has to be in one place and should not be scattered across several files. This obviously reduces the impact of change, since any change to that task can be made in one place. As far as impact is concerned, you need to test only the isolated code.

2. Minimize Dependencies: Dependencies are another hurdle in software development. Changing something in a software component will require changes in its dependent components as well. Hence dependencies need to be reduced, and a complete piece of functionality should be implemented by one component that does not depend on any other to complete it. In software systems dependencies can't be avoided entirely; however, they must be arranged so that modifying a component does not require changing or recompiling its dependencies. One example is to develop reusable core components and use them wherever required: the core components are built and tested once and do not require any further change. Maintain unidirectional dependencies and avoid circular ones.

3. Standardize/Define Contracts: Defining contracts is a way of standardizing how different software components must behave. The tricky part is that modifying a contract needs to be done carefully. Do not just add a public method to an interface without giving it thorough thought, since you are modifying the initial agreement.

4. Reuse - Don't Reinvent: Reuse as much as possible and avoid duplicates in the code. Duplicate code always increases the impact of a change, since the person changing the code tends to forget to make the same change in the other copies.

5. Continuous Verification (Fail Fast): Use assertions and automated tests for continuous verification, and run the automated test cases after each change. Another best practice is to use Test Driven Development (TDD) for continuous verification.

Fundamental Design Principles

Design principles are the guidelines for better software development. In this post I'm going to write about 4 fundamental design principles which I thought are very essential.

FAIL FAST PRINCIPLE

Fail fast is a very effective principle that every software system should adopt. It says: never hide an error; report it as soon as it occurs, along with the corrective action(s). This is essential because the error stays contained. Let’s take the example of a page in a web application where the user needs to key in lots of data. The page has many text fields. The user starts filling them in, enters invalid data in the last text box, and the application shows the message “Error Occurred, please start over”, forcing the user to enter all the fields again from the beginning. This is a real pain for the user. Would anyone use this website? Of course the answer is NO. This web application does not follow the Fail Fast principle. We need to remember "Success is 99% failure", and Fail Fast is all about this.

Let’s look at the different types of errors/failures in software and how we can apply Fail Fast to each of them.

What are the types of failures?

1. User Input Error: Tell the user about the error right away! Do not report errors at step 10 and ask him/her to start all over again. Ask users to correct errors immediately. One way to handle this type of error is to not let the user move to the next field until valid data has been entered in the current one. As a best practice, user input errors can also be reduced. Have you seen how Google Search does a spell check as the user types the search keywords? Google autocorrects the typed words if the spelling mistake is obvious. Drop-downs can be supplied if we know the values a text field can take. Provide a calendar widget rather than having users type in the date, and so on.

2. Programming Error: These are all coding errors. They can be avoided by following certain coding best practices. The key to avoiding programming errors is to use asserts. An assert is a tiny piece of code that checks a condition and throws an error if the condition is not met. Whenever you are about to write a comment, consider writing an assert instead. For example: //Variable “a” should not be negative -> assert (a >= 0). Typical places to use asserts are at the beginning of a function, at the end of a complicated procedure to check that the result is plausible, to check postconditions and so on. Note that too many asserts can lead to fragile software that seems to fail every now and then, so they need to be used carefully.

3. Error Conditions: A condition in the software that can legitimately happen. It is not an illegal condition.

4. Violation of Assumptions: A violation of an assumption is a condition in the software which should never arise. A typical example is assuming that the program will always run on the Windows platform. Remember that software requirements change over time, and your program can be asked to run on Linux or MacOS.

5. Compilation Errors: IDEs (Integrated Development Environments) help compilation errors fail fast. Compilation errors can be reduced by having fewer or no dependencies, increasing the IDE’s error levels and so on.
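
As a sketch of failing fast on programming errors in Java, the hypothetical Transfer class below checks its arguments at the very beginning of the constructor, so a bad value is reported where it enters the system rather than much later. Explicit checks are used instead of the assert keyword because Java assertions are disabled at run time unless the JVM is started with -ea.

```java
import java.util.Objects;

// Hypothetical value object; the names and rules are illustrative only.
class Transfer {
    final String fromAccount;
    final String toAccount;
    final long amountInCents;

    Transfer(String fromAccount, String toAccount, long amountInCents) {
        // Fail at the point of the mistake, with a message naming the culprit.
        this.fromAccount = Objects.requireNonNull(fromAccount, "fromAccount must not be null");
        this.toAccount = Objects.requireNonNull(toAccount, "toAccount must not be null");
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("amount must be positive, got " + amountInCents);
        }
        if (fromAccount.equals(toAccount)) {
            throw new IllegalArgumentException("cannot transfer to the same account");
        }
        this.amountInCents = amountInCents;
    }
}
```

An invalid Transfer can never exist, so code that receives one does not need to re-check it.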

Advantages of Fail Fast Principle

1. Easier debugging: if an error is reported late, it is very difficult to figure out the root cause of the issue. You all know debugging is unpredictable.
2. Fail Fast ensures that the distance between the point of failure and the point where the failure is reported is small.
3. Fail Fast is the key to robust software, since it reduces debugging cost, pain and schedule risk.

TELL, DON’T ASK PRINCIPLE

The principle says: use a TELL interface. This means that you just tell the interface what you want and let it do the rest. In a typical ASK interface, we query a lot of data before getting our job done. A good example: if you want to go from one floor to another using an elevator, you just press the up or down button, irrespective of which floor the elevator is on. This is a good TELL interface. Now suppose the elevator implemented the ASK principle and you wanted to move from the 2nd floor to the ground floor. You would first need to know where the elevator is: if it is on the ground floor you would press the up button, and if it happens to be on the 5th floor you would press the down button.

“Interfaces based on TELL are better than the interfaces based on ASK”

To put it in object-oriented terms: tell objects what they need to do. Don’t ask them questions about their state, make a decision yourself, and then tell them what to do.

The logic you are implementing is probably the object’s responsibility and not yours. For you to make decisions outside the object violates the encapsulation.

Asking objects for their values is a bad idea; instead, just tell the objects what you want from them.

Let’s take a Java code example.
-----------------------------------------------------------------
| public String findGrade(int testScore) {
|     TestGrade grade = new TestGrade();
|     String strScore;
|     // The caller decides which grade applies, then asks the object for it.
|     if (testScore >= 90) {
|         strScore = grade.getAGrade();
|     } else if (testScore >= 80) {
|         strScore = grade.getBGrade();
|     } else if (testScore >= 70) {
|         strScore = grade.getCGrade();
|     } else if (testScore >= 60) {
|         strScore = grade.getDGrade();
|     } else {
|         strScore = grade.getFGrade();
|     }
|     return strScore;
| }
-----------------------------------------------------------------

The above code asks the object for the grade, so it follows the ASK principle. It can be refactored to follow TELL:
-----------------------------------------------------------------
| public String findGrade(int testScore) {
|     TestGrade grade = new TestGrade();
|     return grade.findAndGetGrade(testScore);
| }
-----------------------------------------------------------------

Well, the difference between the two code snippets is obvious. The code written with TELL is easier to read and more robust. The TELL principle says that you need to minimize getters and setters.
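
The post does not show TestGrade itself. One possible shape for it, given purely as an assumption, is below: the thresholds from the ASK version move inside the object, and the per-grade getters disappear. GradeReport is a hypothetical holder for the findGrade method.

```java
// Assumed implementation: the grading rule now lives with the object that owns it.
class TestGrade {
    String findAndGetGrade(int testScore) {
        if (testScore >= 90) {
            return "A";
        } else if (testScore >= 80) {
            return "B";
        } else if (testScore >= 70) {
            return "C";
        } else if (testScore >= 60) {
            return "D";
        }
        return "F";
    }
}

// Hypothetical caller: it tells TestGrade what it wants and makes no decisions itself.
class GradeReport {
    public String findGrade(int testScore) {
        TestGrade grade = new TestGrade();
        return grade.findAndGetGrade(testScore);
    }
}
```

If the grade boundaries change, only TestGrade is edited; no caller needs to know the thresholds.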

Advantages of TELL, DON'T ASK Principle

1. Enables programmers to think declaratively and not procedurally
2. We give commands rather than queries, and the code is more readable
3. Fixing the responsibility of a class helps modularity, abstraction and encapsulation
4. Minimizes the use of getter and setter methods.

........ Article Continues to this post