Monday, February 27, 2012

Software protection through code obfuscation

Obfuscation is the process of making program code difficult to understand by applying various transformations without affecting the program logic. Obfuscated code is hard for human beings to read, and it also makes it harder for decompilers to reverse engineer the code. It is done mainly to keep competitors or attackers from studying your code and uncovering the underlying software design. Byte code obfuscation refers to obfuscating Java byte code so that the byte code is difficult to understand and difficult for decompilers to reverse engineer. Many byte code obfuscation tools are in use at software organizations today. Java byte code is well known to be easy to decompile into source that is very close to the original. This is largely a consequence of Java's "write once, run anywhere" promise: class files are a portable format that keeps class, method, and field names so that any JVM can load and link them.

I worked with byte code obfuscation for many years on Java software products, using an obfuscation tool called Zelix KlassMaster, and I thought I would share some of that experience. This short article covers what obfuscation tools actually do to make Java code hard to understand and reverse engineer, to what extent obfuscation protects your software, what difficulties can arise from obfuscating your Java byte code, and some best practices to follow.

What exactly does obfuscation do?

The basic operation that every obfuscation tool performs is renaming: meaningful names are replaced with meaningless ones. In other words, the tool scrambles the class, method, and variable names in your Java byte code so that they no longer tell the reader anything. Obfuscation can also change the control flow of the program, making it difficult to understand and to trace the program's execution. The tools let users configure what to change and what to leave alone, which gives you control over how far the code is obfuscated. Obfuscation can do the following things:
  • Change your class names: What happens when a Java class name gets changed? All of its references have to be updated with the new name, and the obfuscation tool takes care of that. All the referenced classes must be on the obfuscation tool's class path. The tool must be able to find every referenced class; if a referenced class is not found on the obfuscator's class path, you can expect an error and obfuscation cannot continue.
  • Change the Java package: Imagine the amount of change required throughout your Java project if you change a package name. The classes that refer to the renamed package may live in different jar files. The obfuscation tool takes care of updating every reference in the referencing classes, much as you would otherwise have to update the import statements in the source. I found this feature especially useful when I had to change a package name for software branding purposes. Yes, certain branding requirements call for the Java package name to include a different company name. For example, say my original package is "com.companyabc" and branding requires the package to be "com.companyxyz". You can achieve this by using this feature of obfuscation.
  • Trim the method/variable names: You can tell the obfuscation tool which methods to trim (rename to short, meaningless names) based on their access modifiers. This is very important when your obfuscated jar file is going to be used by someone else, that is, when you are developing a reusable library. In that case you need to exclude public and protected methods from renaming, unless the classes that are going to use your library are obfuscated together with it by the same obfuscation tool. The tools provide an option for this.
  • Change the control flow: You can have the obfuscation tool change the control flow of your program. You can choose from moderate to aggressive flow obfuscation. Aggressive flow obfuscation is usually not recommended if you want your byte code to run on several JVMs.
  • Other: Apart from the above, obfuscation can delete deprecated or unknown attributes and line number information from your class files. Removing line number information from the byte code makes debugging your Java programs harder. Some tools let you debug obfuscated code by recovering the line number information from the change log. The change log is a record of the changes applied to the code, and it can be used to map obfuscated code back to the original non-obfuscated code for debugging purposes. It's good practice to find out up front whether the obfuscation tool you are choosing provides this ability.
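
To make the change log idea concrete, here is a minimal sketch in Java of translating an obfuscated exception trace back to original names. It is only an illustration: the "obfuscated -> original" log format, the class name TraceTranslatorSketch, and the sample names are all assumptions made for this example. Real tools, Zelix KlassMaster included, use their own log formats and ship their own trace translation utilities.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: maps obfuscated stack frames back to original names.
class TraceTranslatorSketch {

    // Matches the "class.method" part of a frame such as "at a.b.c(Unknown Source)".
    private static final Pattern FRAME = Pattern.compile("at ([\\w.$<>]+)\\(");

    /** Parses lines of the assumed form "obfuscated.name -> original.name". */
    static Map<String, String> parseChangeLog(String log) {
        Map<String, String> names = new HashMap<>();
        for (String line : log.split("\\R")) {
            String[] parts = line.split("->");
            if (parts.length == 2) {
                names.put(parts[0].trim(), parts[1].trim());
            }
        }
        return names;
    }

    /** Rewrites every frame whose name appears in the map; other frames are left as they are. */
    static String translate(String trace, Map<String, String> names) {
        Matcher m = FRAME.matcher(trace);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String original = names.getOrDefault(m.group(1), m.group(1));
            // quoteReplacement guards against '$' in inner class names.
            m.appendReplacement(out, Matcher.quoteReplacement("at " + original + "("));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String log = "a.b.c -> com.companyabc.billing.Invoice.computeTotal";
        String trace = "java.lang.IllegalStateException: bad total\n"
                + "\tat a.b.c(Unknown Source)";
        System.out.println(translate(trace, parseChangeLog(log)));
    }
}
```

Note that the translated frame still shows "Unknown Source" instead of a file name and line number. Recovering those depends on what the tool recorded in its change log, which is why it is worth checking for this ability before choosing a tool.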

Here are some rules of thumb to remember for obfuscated code to work. Say you have 20 jar files that must work together. You cannot obfuscate a few of them and leave the others non-obfuscated. A mix of obfuscated and non-obfuscated jars generally will not work unless the obfuscated jar files were produced with the extensible library option. If you choose to obfuscate as an extensible library, the tool leaves public class names and public/protected method names unmodified. This case needs extra attention.
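
The extensible library rule can be sketched as a small renaming policy in Java. This is a hypothetical illustration of the idea, not how Zelix KlassMaster or any other tool is implemented. The names RenamePolicySketch, Account, and planRenames are made up for the example, and it only plans new names for methods; a real obfuscator also rewrites the byte code and every reference to the renamed members.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: decides which method names survive obfuscation.
class RenamePolicySketch {

    // Sample library class with one method per access level.
    static class Account {
        public void deposit(long cents) { audit(); }
        protected void onChange() { }
        void recalculate() { }
        private void audit() { }
    }

    /**
     * Plans an original-name to new-name mapping for the methods declared in a class.
     * In extensible library mode, public and protected names are kept.
     * Overloaded methods are not handled in this sketch.
     */
    static Map<String, String> planRenames(Class<?> type, boolean extensibleLibrary) {
        Method[] methods = type.getDeclaredMethods();
        // getDeclaredMethods() has no guaranteed order, so sort for stable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        Map<String, String> plan = new LinkedHashMap<>();
        int counter = 0;
        for (Method m : methods) {
            if (m.isSynthetic()) {
                continue; // skip compiler-generated methods
            }
            int mods = m.getModifiers();
            boolean partOfApi = Modifier.isPublic(mods) || Modifier.isProtected(mods);
            boolean keep = extensibleLibrary && partOfApi;
            plan.put(m.getName(), keep ? m.getName() : shortName(counter++));
        }
        return plan;
    }

    // Produces a, b, ..., z, then a1, b1, and so on.
    static String shortName(int index) {
        String letter = String.valueOf((char) ('a' + index % 26));
        return index < 26 ? letter : letter + (index / 26);
    }

    public static void main(String[] args) {
        System.out.println("extensible library: " + planRenames(Account.class, true));
        System.out.println("full obfuscation:   " + planRenames(Account.class, false));
    }
}
```

With the extensible library flag set, callers in a non-obfuscated jar can still find deposit and onChange by name, while recalculate and audit are renamed. With the flag off, every name changes, which is why a non-obfuscated jar that calls into the obfuscated one would fail to link.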

Does obfuscation really protect your software?

Sadly, the answer is no, not completely. Even obfuscated code can be reverse engineered, so obfuscation should not be the only means of protecting your software. If you use obfuscation carefully and make use of all its features, it will make reverse engineering the code considerably harder. That is the reason to use obfuscation.

Conclusion

It is worth asking whether obfuscating the code is really worth it. As I said, obfuscation can cause problems for developers when it comes to debugging. Customers are going to send you the exception traces that your program threw, and traces from obfuscated code will not have line numbers, unlike those from non-obfuscated code. Although you can map an exception trace back to exact line numbers using the obfuscation change log, doing so takes extra debugging time. If you have other means of protecting your software, such as an End User License Agreement (EULA), it is good to rethink whether you need obfuscation.
