Archive for September, 2008

Sep 24 2008

24 x 7 support is offline !!!

Published by Niyaz PK under General

I saw the following image on a website:

24x7-support-is-offline

If your 24 x 7 service is offline most of the time, why bother calling it a 24 x 7 service?

7 responses so far

Sep 11 2008

Sanitizing user data: How and where to do it

Published by Niyaz PK under Programming, Security

User data can be dangerous. Whatever the user supplies as data, especially in a web application, cannot be assumed to be safe. On the contrary, there are many malicious users who try to exploit every security vulnerability in your application. XSS, CSRF and SQL injection attacks are familiar to most of you. (If not, go figure it out and come back fast.) To protect your application from such attacks, you need to sanitize user data so that it does not do anything harmful to your system.

Exploits-of-a-mom

A big question being discussed vigorously in the web development community is:

Where to sanitize the user data? Should it be done in the input stage where the data is being entered by the user or in the output stage where the data is being displayed to the user?

The solution, in my opinion (and in the opinion of a large group of experts in this field), is to do dual sanitization: one round of validation and SQL escaping before the data goes into the database, and one round of sanitization (filtering and escaping) before the data goes to the output.

So the process essentially boils down to validation on input and escaping on output. Here are the reasons why you should go by this method instead of escaping and sanitizing on input alone:

  1. The way data needs to be sanitized depends on the context in which the data is going to be used. For example, if the data is to be stored in a database, we need to escape the ' character to prevent SQL injection attacks. If the data is to be displayed in the HTML output, we need to escape the < and > characters to prevent XSS attacks. In the input stage we cannot anticipate the ways in which the data is going to be used. So it is better to sanitize the data just before the output stage, when it is clear where the data is going.
  2. You cannot always be sure that the data in the database is sanitized data. You cannot guarantee that it came from the sources you anticipated. There is a chance that the data ended up in the database through a path where you have not placed your input sanitizer. What if a user directly edited the database to add some data? What if there are loopholes in your sanitizer? What if the data was placed by an SQL injection attack against your database? All these points tell us that we need to sanitize user data where it is being used, that is, in the output stage.
  3. There may be other applications which use the data from your database. For example, an application written in COBOL may be using the data from the database to generate some reports. If the data already in the database is in the form of &lt;script&gt;&nbsp;hello&nbsp;world, the COBOL application will not be able to make sense of it. It will have to implement its own decoder to read the data. This is a very painful process. We can avoid situations like this if we do not push processed data into the database.
  4. It is always best to have pure, unaltered data in the database so that it can be easily processed by all the applications using the data. Once we sanitize the data before it is stored in the database, there is no going back. It is really hard to get the original data supplied by the user back after all this filtering and escaping. On the other hand, if we have unaltered data in the database, it is easy to escape it later for each application using the data.
  5. As the points above show, sanitization on output is needed anyway. If we encode the user data on input as well as on output, the data ends up doubly encoded and is no longer useful. There is no need to encode twice. So it is always recommended to encode your data to the target format just before passing the data to the target system.
  6. Users have reported security holes in applications like phpMyAdmin when they display database values without encoding them for HTML. The developers of phpMyAdmin anticipated the data in the user databases to be free of any malicious code, but that may not be the case. So your application needs output sanitization, especially if you are using data from outside sources. Never trust any data coming your way.
  7. Assume that you are using input sanitization. If there is some bug in the sanitizer, malicious data will creep into the database, and now you have to fix the sanitizer and also remove all the malicious data from your database. This can be a very tedious job. But if you were using an output sanitizer, you would only have to modify the code to fix the security hole.
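Points 1 and 5 above are easy to see in a few lines. Here is a minimal sketch in Python (my choice for illustration; the sample string and variable names are made up) that encodes one raw value for three different targets, and then shows what happens when HTML escaping is applied both on input and on output:

```python
import html
import shlex
from urllib.parse import quote, unquote

# Hypothetical raw user input, kept unaltered.
raw = "<b>Tom & Jerry's</b>"

# The same raw value needs a different encoding for each target.
as_html = html.escape(raw)       # for an HTML page
as_url = quote(raw, safe="")     # for a URL component
as_shell = shlex.quote(raw)      # for a POSIX shell argument

# Each encoding can be reversed because the raw value was never altered.
url_round_trip = unquote(as_url)

# Escaping on input AND on output encodes the data twice.
stored_escaped = html.escape(raw)             # "sanitized" before storing
double_encoded = html.escape(stored_escaped)  # escaped again on output

print(as_html)
print(double_encoded)
```

The second printed line is what a browser would show as literal entity text instead of the original characters, which is the double encoding problem described in point 5.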

So how do you do this two-step sanitization? Here is how:

  1. User data comes in
  2. Validate the data
  3. If valid, do SQL escaping and store in the database (mysql_real_escape_string() in PHP).
  4. If invalid, reject the data. Don’t try to modify the data and push it into the database. This will do more harm than good. The user will think that the data went through successfully while the data in the database will be something else. So just accept or reject the user data. Don’t try to alter it.
  5. Output: If the data is going to an HTML page, escape for HTML (htmlentities() in PHP). If the data is going to a Unix command line, escape for the shell (escapeshellarg() in PHP). If the data is going to a URL, URL-encode the data (urlencode() in PHP), and so on.
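The steps above can be sketched roughly as follows. This is only an illustration in Python, not the PHP functions named in the list: instead of manual SQL escaping it uses parameterized queries from the standard sqlite3 module, which serve the same purpose of keeping data out of the SQL text. The table name, the length limit and the function names are all assumptions made for the example.

```python
import html
import sqlite3

MAX_COMMENT_LENGTH = 500  # hypothetical limit for this example


def is_valid_comment(text: str) -> bool:
    """Step 2: accept or reject the input; never alter it."""
    return 0 < len(text) <= MAX_COMMENT_LENGTH and text.isprintable()


def store_comment(conn: sqlite3.Connection, text: str) -> bool:
    """Steps 3 and 4: store valid data unaltered, reject invalid data."""
    if not is_valid_comment(text):
        return False
    # The ? placeholder keeps the data separate from the SQL statement.
    conn.execute("INSERT INTO comments (body) VALUES (?)", (text,))
    conn.commit()
    return True


def render_comments_html(conn: sqlite3.Connection) -> str:
    """Step 5: escape for HTML only at the moment of output."""
    rows = conn.execute("SELECT body FROM comments ORDER BY id").fetchall()
    return "\n".join(f"<p>{html.escape(body)}</p>" for (body,) in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")
store_comment(conn, "Robert'); DROP TABLE comments;--")
store_comment(conn, "<script>alert(1)</script>")
page = render_comments_html(conn)
print(page)
```

The database keeps exactly what the user typed, and the HTML escaping happens only in render_comments_html, so another application reading the same table sees the raw data.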

In the validation step, check for the proper encoding of the data: URL/UTF-7/Unicode/US-ASCII etc. Then check that the data uses the proper character set. Allow only the characters which are really needed for the application. Put a limit on the length of the input data. Remember that an attacker usually makes use of long strings to craft an attack. Check whether the data format is correct or not. Phone numbers should contain only digits, email addresses should follow the email address format, and so on.
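As a rough sketch of these checks in Python: the length cap, the digit range for phone numbers and the email pattern below are my own assumptions for illustration, and the email pattern is deliberately simple rather than a complete implementation of the email address format.

```python
import re
from typing import Optional

MAX_LENGTH = 254  # hypothetical cap on input length
PHONE_RE = re.compile(r"[0-9]{7,15}")  # ASCII digits only, assumed range
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def decode_utf8(raw: bytes) -> Optional[str]:
    """Check the encoding first: reject bytes that are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def matches(raw: bytes, pattern: "re.Pattern[str]") -> bool:
    """Encoding check, then length limit, then allowed characters and format."""
    text = decode_utf8(raw)
    if text is None or len(text) > MAX_LENGTH:
        return False
    return pattern.fullmatch(text) is not None


def is_valid_phone(raw: bytes) -> bool:
    return matches(raw, PHONE_RE)


def is_valid_email(raw: bytes) -> bool:
    return matches(raw, EMAIL_RE)
```

Note that each function only answers yes or no. It never tries to repair the input, which matches the accept-or-reject rule in the step list.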

Always use the methods or frameworks provided by your language/platform to do the escaping and encoding/decoding. Most of the languages out there support these operations. Java is an exception though: its standard library has no HTML encoder, so when you are using Java you have to write your own methods to handle HTML encoding/decoding or rely on a third-party library.

Finally, when sending the data to the web browser, remember to set the proper encoding for the web page. This can be done using the Content-Type response header or using a meta tag. It is advisable to use both methods. Forgetting this step can aid some types of XSS attacks.
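Here is a small sketch of declaring the encoding in both places. It is written in Python and independent of any web framework; the function name and the choice of UTF-8 are assumptions for the example.

```python
import html


def build_page(title: str, body_text: str):
    """Return (headers, body bytes) with the charset declared twice."""
    # 1. The encoding declared in the HTTP response header.
    headers = [("Content-Type", "text/html; charset=utf-8")]
    # 2. The same encoding declared again in a meta tag.
    document = (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head><body>\n"
        f"<p>{html.escape(body_text)}</p>\n"
        "</body></html>\n"
    )
    # The bytes actually sent must use the encoding that was declared.
    return headers, document.encode("utf-8")
```

The point of the sketch is that the header, the meta tag and the actual byte encoding all agree, so the browser never has to guess.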

Other concerns

Some websites need to output user input as HTML itself, for example websites that allow HTML editing. In this case you cannot do encoding in your application. Remember to add proper filtering mechanisms to allow only the tags that are intended to be used. Always block potentially dangerous tags such as <script></script>.
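To give an idea of what such a filter looks like, here is a minimal allowlist sketch using Python's standard html.parser module. The set of allowed tags is an assumption, all attributes are dropped, and this is only an illustration of the idea; a real site should use a well-tested sanitizing library rather than a hand-rolled filter.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}  # hypothetical allowlist


class AllowlistFilter(HTMLParser):
    """Rebuild the markup, keeping only allowlisted tags without attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes such as onclick are dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # All text, including the text inside blocked tags, is escaped.
        self.parts.append(html.escape(data))


def filter_html(untrusted: str) -> str:
    parser = AllowlistFilter()
    parser.feed(untrusted)
    parser.close()
    return "".join(parser.parts)


print(filter_html('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>'))
```

The filter works by rebuilding the output from scratch instead of trying to delete bad pieces from the input, so anything it does not recognize simply never makes it into the result.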

No responses yet

Sep 10 2008

Robots.txt is not a security measure

Published by Niyaz PK under Security

I am increasingly coming across people who think the robots.txt file can be used to prevent search engine crawlers from crawling sensitive data on their websites. Seriously.

This is just plain wrong. The data to exclude using a robots.txt file is unwanted, redundant or useless data. An entry in the robots.txt file cannot protect your sensitive data from going out. Sensitive data should not be left open on your website in the first place.

There are many malicious crawlers which crawl only the pages blocked by the robots.txt file of every website. I bet a lot of interesting stuff turns up in their search results.
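To see why, remember that robots.txt is a plain text file anyone can read. The sketch below, in Python, uses a made-up robots.txt (the paths are hypothetical) and shows how little work it takes to turn the Disallow entries into a list of places to look:

```python
# A made-up robots.txt; the paths are hypothetical examples.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backups/db.sql
Disallow: /tmp/
"""


def disallowed_paths(robots_txt: str) -> list:
    """Collect every path listed in a Disallow line."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


print(disallowed_paths(ROBOTS_TXT))
```

A well-behaved crawler uses this list to stay away. A malicious one can use the very same list as a map.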

2 responses so far

Sep 06 2008

Why Google Chrome may not be the big revolution you think it is

Published by Niyaz PK under Google, Internet

The web is celebrating the advent of the new competitor in the browser arena - Google Chrome. Here are some points you should note before jumping to the conclusion that Google Chrome is a huge revolution in browser history.

The bugs

Here are some bugs I saw in the Google browser:

1. The Task Manager is a major feature in Chrome. But let us get this straight: it does not always work as intended. Look at the screenshot below:

Google-Chrome

Chrome crashed and the Task Manager option was disabled so that I could not check what was wrong. If the task manager is disabled when the browser crashes, what is the point in having a task manager in the first place?

2. The browser crashes too often. I used the browser on three different machines and it tends to crash once in a while on each of them. This is very annoying considering the fact that the latest versions of Firefox and IE are rather robust.

There is even a very simple way to crash the browser.

Just type “:%” in the omnibox (address bar). Voila !!!

Google-Chrome-Crashed

Another annoying thing is that when the browser crashes, it crashes every single instance of the browser running, not just the current window.

Other Interesting facts

3. Chrome is not the fastest in terms of JavaScript performance, it seems. Two different tests confirm this. In one test Firefox is ahead of Chrome, while in the other Safari is ahead. It is just that Chrome outperforms the other browsers in certain tests which Google handpicked itself. Anyway, it should be noted that the Internet Explorer versions all lag behind by great margins in every single test.

4. Tab manipulation and the omnibox are not really a big step as far as web development is concerned. Those are just usability tweaks that can be incorporated into any browser without much effort.

5. Chrome does not support add-ons as of now. This is a huge drawback when compared to Firefox. Without add-ons, the functionality of Chrome is very limited. We can only hope that Google will incorporate support for add-ons in the next iteration of the browser. The Firefox tribe (the early adopters) will hesitate to switch to Chrome because of this one single drawback.

6. Regional language support is poor. I still cannot find out how to render some regional language web pages correctly in Chrome. In particular, the option to change the font used for a regional language is missing.

7. There are many other smaller glitches, like the absence of a full-screen mode, the absence of an option to restart downloads etc.

So what is there to be excited about in Chrome? The multi-threading capability and the ability to isolate tabs are not big innovations either. The IE team has been experimenting with this for IE8.

A much loved feature would have been support for JavaScript multithreading. But Chrome does not support that either. There is no real innovation on the rendering engine front either. Chrome is just reusing the WebKit rendering engine which powers the Safari browser.

The only big thing about Chrome is the new, fast V8 JavaScript engine and its capabilities. I am not sure if we can bank on that for creating wonders on the web.

As of now, I will go back to my much loved Firefox 3. The Beta 2 version of IE8 also looks promising. It does have a lot of new features. It will make many Microsoft fans very happy. I will wait until Google comes up with something really different, something really game-changing.

11 responses so far