Archive for the 'Programming' Category

Jan 02 2009

No Testing = FAIL

Published by Niyaz PK under Internet, Programming

For a national paper presentation contest at a reputed engineering college, the organizers developed a website for participant registration.

Before opening the website, they changed the database table name to a better one. (You know, developers have a habit of naming tables and variables after their pets.)

After a month of operation, they recently closed the registration.

When they checked the database for the details of the participants, they were in for a surprise. Guess what? The database was empty.

Why? They forgot to change the table name in the code.

Two things I cannot understand:

  1. Why didn't the application throw any error?
  2. Why didn't they test it after it went live?

What are they going to do with the paper presentation contest now? I am anxious.

5 responses so far

Dec 01 2008

Hashing is not a substitute for string comparison

Published by Niyaz PK under Programming

The other day I saw an interesting abuse of hash functions.

In an application that processed strings, there was a part where it compared medium-sized strings. Instead of using the built-in string comparison routine, they calculated the hash values of the strings and compared the hashes. Clever?

Wrong!

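Computing the hash value (MD5 in this case) of a string is expensive: the hash function has to process every byte of both strings, whereas a direct comparison can stop at the first character that differs. Equal hashes also do not strictly guarantee equal strings, because different strings can collide. If you do not retain the hash values for future use, just compare the strings directly to know whether they are equal.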
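
A minimal Python sketch of the two approaches, for illustration only. The original application is not shown, so the function names equal_by_hash and equal_directly are made up here:

```python
import hashlib


def equal_by_hash(a: str, b: str) -> bool:
    """The approach criticized above: hash both strings, then compare digests."""
    # Each md5() call has to read every byte of its input.
    digest_a = hashlib.md5(a.encode("utf-8")).digest()
    digest_b = hashlib.md5(b.encode("utf-8")).digest()
    return digest_a == digest_b


def equal_directly(a: str, b: str) -> bool:
    """The built-in comparison: no hashing and no extra allocations."""
    return a == b
```

Both functions agree on ordinary inputs, but the second one does strictly less work. Timing them with the standard timeit module on your own data is an easy way to see the difference.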

4 responses so far

Nov 10 2008

Don’t catch exceptions if you are not going to handle them

Published by Niyaz PK under Programming

It is a bad idea to have an application with exception handling like this:

catch(SQLException sqlexception)
{
        //This is not supposed to happen
}

This is not supposed to happen?!!
Seriously?

We all know that exceptions are not supposed to happen. Still we catch them so that if they do happen, we can take some steps to prevent additional damage.

It is good to handle exceptions and give meaningful messages to the user. At the same time, avoid using phrases like “An error has occurred” in messages. Tell what error occurred exactly, and suggest some steps to solve the issue. Try sending the stack trace to some log file so that somebody can clearly understand where and why the error occurred.

Exception handling is not the art of concealing every single error from the user and pretending that no error occurred.
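
Here is a rough Python sketch of what handling (rather than swallowing) a database error could look like. It is an illustration under assumptions, not the code from the application above: the participants table, the save_participant function and the message wording are all hypothetical, and sqlite3 stands in for whatever database is really in use.

```python
import logging
import sqlite3

logger = logging.getLogger("registration")


def save_participant(conn: sqlite3.Connection, name: str, email: str):
    """Insert a participant and return (ok, message_for_the_user)."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO participants (name, email) VALUES (?, ?)",
                (name, email),
            )
        return True, "Registration saved."
    except sqlite3.IntegrityError:
        # Log the full stack trace, then tell the user exactly what went wrong.
        logger.exception("Duplicate registration for %r", email)
        return False, (
            f"The address {email} is already registered. "
            "Use a different address or contact the organizers."
        )
    except sqlite3.Error:
        # Covers, for example, a missing or renamed table.
        logger.exception("Could not save participant %r", email)
        return False, (
            "The registration database could not be written to. "
            "Please try again in a few minutes or contact the organizers."
        )
```

Every failure ends up in the log with its stack trace, and the user gets a message that names the problem and suggests a next step.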

6 responses so far

Oct 30 2008

Quick tips for database programmers

Published by Niyaz PK under Programming, Tips

Since you are smart, you know these things anyway. Just tell your friends too.

Avoid using select * from in your queries.

Why? Selecting every column fetches data the application never uses, which wastes database I/O, network bandwidth and application memory.

Alternative: Use something like select required_column1, required_column2,… from table_name instead.

Avoid using code like a = resultSet.fields(1); in your application.

Why? When somebody changes the query, 1 may refer to another column.

Alternative: Use code like a = resultSet.fields("column_name");

Never assume the same order of records when a query is run multiple times.

Why? Without an order by clause, SQL does not guarantee any particular row order. The database engine is free to return rows in whatever order its optimizer finds cheapest, and that can change with indexes, data volume and other system conditions.

Alternative: Use an order by clause in your queries wherever the order matters. (Remember that there is a performance penalty associated with the order by clause.)
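
The three tips can be sketched together in a few lines of Python. This is only an illustration: the participants table and the list_participants function are hypothetical, and sqlite3 is used because it ships with Python.

```python
import sqlite3


def list_participants(conn: sqlite3.Connection):
    """Return (name, email) pairs in a guaranteed order."""
    # sqlite3.Row lets us read columns by name instead of by position.
    conn.row_factory = sqlite3.Row
    cursor = conn.execute(
        # Name only the columns we need, and state the order explicitly.
        "SELECT name, email FROM participants ORDER BY name"
    )
    return [(row["name"], row["email"]) for row in cursor]
```

If somebody later adds a column to the query, row["name"] keeps pointing at the right value, which row[0] might not.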

No responses yet

Oct 10 2008

The truth about ignorance

Published by Niyaz PK under General, Programming

We all know a lot about a whole lot of things.
We know a little about a lot of other things too.

We have heard of a lot of stuff that we don’t know anything about. That is, we all know that there are things we don’t know. These things are a billion times more numerous than the things we know about.

Then there is stuff that we don’t know, and we don’t even know that we don’t know it. Believe me - these things are a billion times more numerous than the stuff you think you don’t know about.

Scary indeed!

The Dunning-Kruger effect says something along the same lines:

  1. Incompetent individuals tend to overestimate their own level of skill.
  2. Incompetent individuals fail to recognize genuine skill in others.
  3. Incompetent individuals fail to recognize the extremity of their inadequacy.
  4. If they can be trained to substantially improve their own skill level, these individuals can recognize and acknowledge their own previous lack of skill.

The next time you think you know all (or nothing) about something, think again.

No responses yet

Oct 08 2008

The funny caching problem

Published by Niyaz PK under Programming

There are many situations where a solution only magnifies the problem it is supposed to solve. Here is one case I came across.

There was an application that queried data from the database. It was working fine till somebody got the idea that if the data coming from the database is cached in the application, the performance of the application can be improved manifold. And guess what? They implemented the idea and now they are reportedly having unprecedented performance issues in the application.

Caching is supposed to increase the performance of a system when there are a large number of hits in the cache. In our case, caching degraded the performance of the application as time progressed. The reason? The database contained a large number of tables and records, and the small cache could hold only a tiny fraction of that data. As a result, the hit rate of the cache was very low, so almost every request paid for a cache lookup and then for the database query as well.

So when to use caching in a system?

There are many things to consider before you implement caching in your application. The main points are:

  1. What are the trade-offs? How much memory should be allotted to the cache? How much processing power will the cache take? Will these degrade the performance?
  2. What will be the cache hit rate? If the hit rate is too low, the time spent looking up and maintaining the cache outweighs the database queries it saves, and the cache may in fact reduce the performance of the system.
  3. Are there any other bottlenecks in the system that can be addressed before the cache is implemented? Instead of focusing your efforts on designing and developing a caching system for your application, it may make more sense to find and remove any other bottlenecks that may be degrading the performance.
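
To put a rough number on the hit-rate question: if every request pays a cache cost c (lookup plus bookkeeping) and every miss additionally pays the database cost d, the average cost per request is c + (1 - h) * d for hit rate h. That is lower than d only when h > c / d. The Python sketch below is a toy model under those assumptions. The function names, the uniform access pattern and the LRU eviction policy are choices made for this example, not details of the application described above.

```python
import random
from collections import OrderedDict


def lru_hit_rate(num_keys: int, capacity: int, num_requests: int, seed: int = 0) -> float:
    """Simulate uniformly random reads through an LRU cache; return the hit rate."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(num_requests):
        key = rng.randrange(num_keys)
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True              # pretend we fetched it from the database
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / num_requests


def break_even_hit_rate(cache_cost: float, db_cost: float) -> float:
    """Hit rate above which the cache is cheaper than always querying the database."""
    return cache_cost / db_cost
```

With uniform access, a cache that holds 100 out of 10,000 records can only hit about one request in a hundred, so it pays off only if its own overhead is well under one percent of a database query. Real access patterns are usually skewed, which is why measuring the actual hit rate matters.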

I hope these points come in handy for at least some of you. If you have other points to add, please do.

No responses yet

Sep 11 2008

Sanitizing user data: How and where to do it

Published by Niyaz PK under Programming, Security

User data can be dangerous. Whatever the user supplies as data, especially in a web application, cannot be assumed to be safe. On the contrary, there are many malicious users who try to exploit every security vulnerability in your application. XSS, CSRF and SQL injection attacks are familiar to most of you. (If not, go figure them out and come back fast.) In order to protect your application from such attacks, you need to sanitize user data so that it does not do anything harmful to your system.

[Image: Exploits of a Mom]

A big question being discussed vigorously in the web development community is:

Where to sanitize the user data? Should it be done in the input stage where the data is being entered by the user or in the output stage where the data is being displayed to the user?

The solution, in my opinion (and in the opinion of a large group of experts in this field), is to sanitize in two steps: validation and SQL escaping before the data goes into the database, and sanitization (filtering and escaping) before it goes to the output.

So the process essentially boils down to validation in the input and escaping in the output. Here are the reasons why you should go by this method instead of escaping and sanitization in the input alone:

  1. The way data needs to be sanitized depends on the context in which the data will be used. For example, if the data is to be stored in a database, we need to escape the ' character to prevent SQL Injection attacks. If the data is to be displayed in the HTML output, we need to escape characters such as <, >, & and quotes to prevent XSS attacks. In the input stage we cannot anticipate the ways in which the data is going to be used. So it is better to sanitize the data just before the output stage, when it is clear where the data is going.
  2. You cannot always be sure that the data in the database is sanitized. You cannot guarantee that it came from the sources you anticipated. There is a chance that the data ended up in the database through a path where you have not placed your input sanitizer. What if a user directly edited the database to add some data? What if there are loopholes in your sanitizer? What if the data was placed by an SQL injection attack against your database? All these points tell us that we need to sanitize user data where it is being used, that is, in the output stage.
  3. There may be other applications which use the data from your database. For example, an application written in COBOL may be using the data from the database to generate some reports. If the data already in the database is in the form of &lt;script&gt;&nbsp;hello&nbsp;world, the COBOL application will not be able to make sense of the data. It will have to implement its own decoder to read the data. This is a very painful process. We can avoid situations like this if we do not push processed data into the database.
  4. It is always best to have pure unaltered data in the database so that it can be easily processed by all the applications using the data. Once we sanitize the data before it is stored in the database, there is no going back. It is really hard to get the original data supplied by the user back after all this filtering and escaping. On the other hand, if we have unaltered data in the database, it is easy to escape it later as each application using the data requires.
  5. As the above points show, sanitization in the output is needed anyway. If we encode the user data in the input as well as in the output, the data ends up doubly encoded and will not be useful at all. There is no need to encode twice. So it is always recommended to encode your data to the target format just before passing it to the target system.
  6. Users have reported security holes in applications like phpMyAdmin when they display database values without encoding them to HTML format. The developers of phpMyAdmin anticipated the data in the user databases to be free of any malicious code, but that may not be the case. So your application needs output sanitization, especially if you are using data from outside sources. Never trust any data coming your way.
  7. Assume that you are using input sanitization. If there is some bug in the sanitizer, malicious data will creep into the database, and now you have to fix the sanitizer and also remove all the malicious data from your database. This can be a very tedious job. But if you were using an output sanitizer, you would just have to modify the code to fix the security hole.
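
The double-encoding problem from point 5 is easy to reproduce. The short Python sketch below uses the standard html module. The sample string and the helper name as_seen_in_browser are made up for this illustration.

```python
import html

raw = "Tom & Jerry <3"

# Encoding once, at output time, is what the page should receive.
encoded_once = html.escape(raw)
# Encoding again (input stage plus output stage) mangles the text.
encoded_twice = html.escape(encoded_once)


def as_seen_in_browser(markup: str) -> str:
    """Undo one level of HTML encoding, roughly as a browser does when rendering text."""
    return html.unescape(markup)
```

After a single encoding the reader sees the original text again. After two, the reader sees the entity codes themselves (&amp; and &lt;) instead of the characters.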

So how to do this two-step sanitization? Here is how:

  1. User data comes in
  2. Validate the data
  3. If valid, do SQL escaping and store in the database. (mysql_real_escape_string( ) in PHP)
  4. If invalid, reject the data. Don’t try to modify the data and push it into the database. This will do more harm than good. The user will think that the data went through successfully while the data in the database will be something else. So just accept or reject the user data. Don’t try to alter it.
  5. Output: If the data is going to an HTML page, escape for HTML (htmlentities( ) in PHP). If the data is going to a Unix command line, escape for the shell (escapeshellarg( ) in PHP). If the data is going to a URL, URL-encode the data (urlencode( ) in PHP), and so on.
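
The steps above can be sketched in Python as follows. Treat it as one possible shape under stated assumptions rather than a reference implementation: the comments table, the 500-character limit and all function names are hypothetical. Note also that the sketch swaps manual SQL escaping for a parameterized query, which lets the database driver handle quoting and serves the same purpose as the escaping step.

```python
import html
import shlex
import sqlite3
from urllib.parse import quote


def store_comment(conn: sqlite3.Connection, text: str) -> bool:
    """Validate, then store the value exactly as the user supplied it."""
    if not text or len(text) > 500:  # assumed validation rule
        return False                 # reject; never silently alter the data
    with conn:
        # Parameterized query: the driver takes care of quoting.
        conn.execute("INSERT INTO comments (body) VALUES (?)", (text,))
    return True


def for_html(value: str) -> str:
    """Escape for an HTML page."""
    return html.escape(value)


def for_shell(value: str) -> str:
    """Escape for use as one argument on a Unix command line."""
    return shlex.quote(value)


def for_url(value: str) -> str:
    """Percent-encode for use inside a URL."""
    return quote(value, safe="")
```

The database keeps the raw text, and each output path applies the escaping that fits its own target.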

In the validation step, check for the proper encoding of the data - URL/UTF-7/Unicode/US-ASCII etc. Then check that the data uses the proper character set. Allow only the characters which are really needed for the application. Put a limit on the length of the input data. Remember that an attacker usually makes use of long strings to craft an attack. Check whether the data format is correct or not. Phone numbers should contain only digits; email addresses should follow the specific email format, etc.
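
A minimal validation sketch in Python, assuming UTF-8 input. The length limit, the phone number rule (7 to 15 digits) and the deliberately simple email pattern are assumptions for this example, and real applications will need rules that fit their own fields.

```python
import re
from typing import Optional

MAX_FIELD_LENGTH = 100  # assumed limit; choose one that fits the field
PHONE_RE = re.compile(r"[0-9]{7,15}")
# Simplified pattern: it accepts common addresses, not every valid one.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def decode_input(raw: bytes) -> Optional[str]:
    """Check the encoding: return the text, or None if it is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def is_valid_phone(value: str) -> bool:
    """Length check first, then digits only."""
    return len(value) <= MAX_FIELD_LENGTH and PHONE_RE.fullmatch(value) is not None


def is_valid_email(value: str) -> bool:
    """Length check first, then the expected format."""
    return len(value) <= MAX_FIELD_LENGTH and EMAIL_RE.fullmatch(value) is not None
```

Each check only answers yes or no. Nothing here tries to repair bad input, which matches the accept-or-reject rule above.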

Always use the methods or frameworks provided by your language/platform to do the escaping and encoding/decoding. Most of the languages out there support these operations. Java is an exception though: its standard library has no HTML encoder, so when you are using Java you need your own methods or a well-tested third-party library to handle HTML encoding/decoding.

Finally, when sending the data to the web browser, remember to set the proper encoding for the web page. This can be done using the Content-Type response header or using meta tags. It is advisable to use both methods. Forgetting this step can aid some types of XSS attacks.
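
As a small illustration, the Python sketch below declares the character set in both places. The function name build_html_response and the page skeleton are hypothetical, and how the headers are actually sent depends on your web framework.

```python
import html


def build_html_response(body_text: str, charset: str = "utf-8"):
    """Return (headers, body_bytes) with the charset declared twice."""
    # 1. In the HTTP response header.
    headers = [("Content-Type", f"text/html; charset={charset}")]
    # 2. In a meta tag inside the page itself.
    page = (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="{charset}"><title>Example</title></head>\n'
        f"<body><p>{html.escape(body_text)}</p></body></html>\n"
    )
    return headers, page.encode(charset)
```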

Other concerns

Some websites need to output user input as HTML itself, for example websites that allow HTML editing. In this case you cannot do encoding in your application. Remember to add proper filtering mechanisms to allow only the tags that are intended to be used. Always block potentially dangerous tags such as <script></script>.
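
One way to sketch such a filter in Python is with the standard html.parser module. This is a simplified illustration, not a production sanitizer: the allowed tag list is an assumption, all attributes are dropped, and the names AllowlistFilter and filter_html are made up here. For real sites a maintained sanitizing library is the safer choice.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}  # assumed allowlist
DROP_CONTENT_TAGS = {"script", "style"}         # drop these and their contents


class AllowlistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT_TAGS:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            self.parts.append(f"<{tag}>")  # attributes are deliberately dropped

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(escape(data))  # text is always re-escaped


def filter_html(markup: str) -> str:
    """Keep allowed tags, drop everything else, escape all text."""
    parser = AllowlistFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)
```

The filter is built around what is allowed rather than what is forbidden, so an unknown tag is dropped by default instead of slipping through.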


No responses yet
