Jai’s Weblog – Tech, Security & Fun…

Tech, Security & Fun…

  • Jaibeer Malik

    Jaibeer Malik
  • View Jaibeer Malik's profile on LinkedIn
  • Subscribe

  • Feedburner

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 29 other followers

  • Archives

  • Categories

  • Stats

    • 379,210
  • Live Traffic

Auditing – Story of what is happening behind the scenes

Posted by Jai on September 3, 2008

More than an year back, I had chance to audit a Social Networking Website, pretty famous in India, developed in PHP. Looking at the code I thought that what matters more to these people, the immediate revenue model or the long term model. How really different developers and the management people perceive the code quality?

Some of the top level findings from that audit are:

  • In one of the PHP files profile.php twenty plus Ajax calls were made on page load for which the response time was very high. Indiscriminate Ajax use can multiply the concurrent load, which in turn can choke the web server, DB server, physical connections, etc.

  • No caching or searching strategy was used to avoid repeated data retrieval from DB, especially for data which doesn’t change too frequently.

  • No connection pooling for management of DB connections.

  • No pagination on UI pages, as the number would grow the whole data to be displayed on the page.

  • No domain model in the application.

  • No framework used even for very standard operations. Reuse tried-and-tested 3rd party components (PEAR) as far as possible instead of writing such components from scratch, since they’re likely to be more optimized.

  • Use dedicated content server(s) for serving images and other content.

  • No logical modules separation in the application. e.g. security, persistence, logging, etc.

  • No separation of the business logic, presentation logic and database layer. From view part to database connections, everything is done on a single page, which is hard to maintain the code and causes scalability and performance problems. For example in home.php, it does some business logic checks on top by selecting data from database and depending on that the html code is echoed.

  • Very common use of “Select * “ in sql queries even when single value is used.

  • In one of the PHP files home.php 24 times the database hits are done. eg. to find information of logged in user etc..

  • No logic of abstract or utility methods in the application. As all these checks are common for each user, so it can be separated in different class which will help in maintain the code.

  • Maintain JavaScript and style sheets code in separate JS files and CSS files.

  • Use localization techniques instead of hard coding UI labels inside PHP files. Prevent the hard coding of language specific data.

  • No limit on maximum size of Audio, Video and Images.

Few days back, I got another chance to audit another Social Network Website, focusing on very niche market in US, developed in Java using latest J2ee technologies and latest frameworks. Initially it did sound very promising and surely they are doing pretty good business also. But let us see how the story goes once I started looking into what actually happening behind the scenes.

Some of the findings are:

  • No separation of business logic layer and database layer.

  • Lot of duplicate code through out the application. Same method doing the same thing copied in different classes. Same method in service layer and database layer. Lot of code duplicity in query building.

  • There are no test cases for some of the packages. Very minimal test cases for complex functionalities.

  • No default in switch statements at many places. Method contains a switch statement where one case branch will fall through to the next case.

  • Empty check for String: if ((selected != null) && !StringUtils.isEmpty(selected))

  • Extensive use of if-else statements through out the code which has increased the Cyclomatic Complexity at many places.

  • There are places where method input parameters are not used at all. At many places, local variables declared and not used anywhere.

  • At many places, nested if statement can be merged in single if statement.

  • No need to if-else statements in case returning boolean value.

  • No need of toString() for String objects.

  • Avoid instantiating Boolean, Long, Integer wrapper classes etc. Because it is object heavy instead valueOf() can be used.

  • Redundant null checks.

  • Info and error messages are hard coded inside the code, can be extracted out in a properties file to keep in a centralized place.

  • Same code for two branches, if(<condition>){ //some operation} else { //same operation }

  • Using non-short-circuit logic rather than short-circuit logic if (value != null & (!SomeEnum.VAL.equals(value)))

  • Null pointer dereference, there is a branch of statement that, if executed, guarantees that a null value will be dereferenced, which would generate a NullPointerException when the code is executed. Not enough test cases to cover such scenarios.

If we look closely at the above projects, both the websites are running fine and doing fare business. But the problems we see above are just an indication of the larger problems which may still to be discovered. The common problems at high level which both covers are:

  1. Code Duplication

  2. Big size files

  3. No separation of concern

  4. Performance inefficient

  5. Large cyclomatic complexity

  6. Unused variables, parameters, methods and even classes

  7. Over complicated expressions

  8. Design not evolving over a period of time

  9. Lack of developer knowledge

  10. Lack of adequate test cases.

Just looking at the code from the third person perspective, you can easily make out that the code is not suited for portability, maintainability and scalability etc.

In terms of recommendations I would say to pro actively look for the such code quality issues during the development phase of the project itself. There are so many Static Code Analysis tools available to make the job easy for you. Few to name which I used during these audits are Simian, PMD, Cobertura, Clover, FindBugs etc. The best option is to integrate these tools with the build process and keep a watch on the quality on regular basis. Catching the problem at early stage will help in minimizing the bigger problems which may occur later on. Most of the above issues can easily be caught with these tools and can be fixed in few minutes rather than to get these as bug. It is helpful even in case of lack of developer knowledge because you have the facts/suggestions from these tools in front of you.

Well, I have to agree that over the period of time the quality has improved a bit but still in what respect. Is it anything to do with the domain we are talking about here or is it just the two different projects or is it the business model people keep in mind to keep going. Is it really the process, may be code review practices or tight deadlines? Is it only the developers who write the code? Is it also the management team which does not provide enough time for re-factoring? Based on the experience and on interaction with the different development teams I could make out one thing that it is the way the developers and the management team perceive the code quality and take the necessary steps to keep it up to the mark.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: