On Internet Explorer

Jun 05 2010

Web developers always complain about the amazing ways in which different versions of the infamous Internet Explorer break their websites. Currently there are three versions of IE in the wild giving nightmares to any decent web developer – versions 6, 7 and 8. I think the problem with IE is not just the IE team’s reluctance to conform to the latest web standards.

The biggest problem with IE is that its release cycle is a total failure. It is obvious to everyone but Microsoft. Listed below are the years in which the three versions of IE were released:

  • IE6 – 2001
  • IE7 – 2006
  • IE8 – 2009

Compare this to the recent major point releases of Google Chrome:

  • Chrome 3 – 2009
  • Chrome 4 – 2010
  • Chrome 5 – 2010
  • Chrome 6 – 2010 (expected)

Some might argue that it is not fair to compare just these dates without knowing the details of the versioning system the browser teams use, but this argument still does not turn the blame away from the IE team. Google Chrome released many small updates between these major releases, sometimes even on a weekly basis.

The web browser should not be considered just another desktop application. It is something that billions of people use every single day. It is the most important application on your computer. It is something that should be updated at least every month or so rather than every 5 years. Currently the patches from Microsoft for IE are related only to security issues. Meanwhile Firefox too is thinking about making its update mechanism silent and automatic (i.e. without user intervention), similar to what Google Chrome does. If the IE team is not going to release updates for their browser frequently enough, why bother releasing it at all?

Now think about it in clear terms – a lion’s share of users are not diligent enough to care about the version of the browser they are using. Outside the tech world, many do not know that better browser versions are available. You have to update the software without the user taking the initiative. How hard is it to figure this out?

If you go to the IE9 site, you will see the following score for IE9 preview version in the Acid3 test:

Impressive? Barely so. The current version of Google Chrome (5.0) already passes the Acid3 test with a score of 100! On top of that there are still no reliable reports on when this priceless edition of IE9 will finally ship after all these months of working on polishing the CSS rounded corners. Yes, there are lengthy posts in the IE team blog about how closely they follow the specs of CSS rounded corners, while they don’t dare to open their mouth about the <canvas> tag!

IE9 boasts hardware accelerated graphics rendering for faster performance, but what that means is that IE9 will not be available for Windows XP users. Keep in mind that Windows XP is the most used operating system in the world. This in turn means that when IE9 is released, web developers will have to support four different versions of IE.

One part of me prays that they ship a better version of IE soon, while the other paranoid part of me prays that they stop shipping IE altogether. Keeping the history of IE in mind, I do have reasons to be paranoid.

Links:

  1. IE team blog
  2. The CSS Corner: About CSS corners – IEBlog
  3. IE9 Acid3 Test


What programming language should I learn?

Mar 02 2010

Sameer asks what programming language he should learn. Below is an edited version of my reply.

The plumber comes to your home and asks: “What tool do you want me to use?”.

What will be your reply? “I don’t care! Fix the damn problem”.

That’s right. Everybody wants to get their work done, and get their problems fixed. They don’t care what tools you use. As a developer your job is to solve the problems your customers have in the most effective manner. This in turn means that you cannot use the same tool for every type of problem. Can you use an electric drill to fix a small leak in the pipes? No. You may want to use duct tape for that.

“Which language should I learn?” is the wrong question to begin with. Languages are tools in the bag of a software engineer. Before deciding upon the programming languages you want to learn, you should decide what type of problems you would like to work on. Would you like to work on web technologies? Would you like to work in the Linux ecosystem? Would you like to work on mobile platforms? There are a million different niches in the programming world, and you have to ask yourself all the questions of this kind that come to your mind and then decide which language suits your choice.

This does not mean that the tools are not important. They are; but they are not more important than the problem at hand.

How many programming languages should you learn?

There is no point in trying to learn as many languages as you can. What you should do is to try and learn about as many languages as you can, and then decide which languages you should gain expertise in.

Going back to our analogy, what tools do you think a plumber should carry in his bag? “Enough tools to get his work done.”

Exactly. If you know how to use just one tool, you may be forced to work with other people who can use other tools. This happens in most corporate IT companies. In large companies you will be working with other people who have expertise in programming languages and tools which you don’t know how to use. This has the advantage that these people will be real rock-stars in their own narrow fields. If instead you want to work in places like startups (or as a freelance developer), you may want to know a little bit of every type of tool out there.

Of course you don’t have to know how to use every type of screwdriver. You just have to be an expert in using one good screwdriver model. Similarly you don’t have to be an expert in every web development language. Just learn a pretty decent one and you should be fine.

Every programming domain has its own set of tools to help you develop software. If you are developing an enterprise website, you may be working with technologies like Core Java, Servlets, EJBs, XML, Unix shell scripts, log parsers, databases, various web servers etc. This means that in addition to programming languages there are many other technologies related to programming that you should master in order to be a good programmer.

One more thing you should know – programming languages are inherently different from each other. Some languages are easy to program in (e.g. Python) while others are difficult (C/C++). I am not referring to the effort needed to learn the language. I am referring to the effort required to write a program after you have learned the language. If you work as a programmer in an IT company, you will probably learn a new language (maybe as per business requirements) in a very short time span. You will start writing decent code in about 1 week to 3 months. Then the only thing that will matter is which language you really prefer to work with. So don’t worry much about which language is easier to learn; worry about which language is easier to use. (There is a correlation here though. You will find that in most cases the languages that are easier to learn are easier to program in too.)

You can learn a lot about programming from forums where smart programmers hang out (e.g. Proggit and Hacker News). Read the top articles and ask your questions there; you will get in-depth answers.

The biggest secret:

You will become a good programmer only by – programming a lot. Many students don’t program outside their labs and college projects, and they never become good programmers. Try to do some coding in your free time. Try to solve Project Euler problems in your favorite programming language, or try to build a website of your own.

Having said all this, here are some specific tips. These may or may not work in your case:

  • Enterprise development: Learning Java is a good idea. Java is used in many software shops as the primary language. It will take you a long way in most situations. At the same time, I have some objections to using the language from a startup programmer’s point of view. Read the discussions here too.
  • Web development: Stay away from PHP. It is a badly designed language. Instead, learn Django or Ruby on Rails. If you prefer Microsoft technologies use ASP.Net MVC.
  • Windows development: Learn C# (and probably not Visual Basic). For running C# applications in Linux, check out the Mono project.
  • There are many excellent programming tools or IDEs you should try to master. Eclipse is a popular IDE. Notepad++ is a popular code editor.
  • You should learn about stuff like Regular Expressions, Unicode, Information Security etc. (I cannot even attempt to list all the topics)
  • Try to keep up with new technologies. You don’t have to learn all the latest languages, but try to have an awareness of the latest trends in programming. For example, web development, mobile phone development etc are areas where lots of innovations are happening. You don’t want to miss any of those if you are interested in those fields. Then again, the forums I mentioned above will come in handy.
  • Learning just one language is not a very good idea. Learning a lot of languages is also not a good idea. Strike a balance between the two extremes and try to be good in at least 2-3 different programming languages in different fields. (As explained earlier, different languages are used to solve different types of problems)

Good luck!


Wanted: Superman

Feb 08 2010

Some time back, I could not resist sending the following reply to this person at the recruitment agency:

Dear Sandhya,

Is this a joke? Did anyone really go through the so called assignments before forwarding them to me? Are they asking me to build two full-fledged websites in 12 hours? Best of luck finding candidates who can do that.

BTW, if you can get candidates to build these sites for you, why do you have to recruit them? I have to admit, this is an excellent way to get your work done for free.

I am sorry. I am genuinely not interested in working for such a stupid company.

(Thanks for contacting me with the offer though. Really appreciate that. Let me know if there are openings in companies which are not looking for superman as their programmer.)

- Niyaz

I am refraining from attaching the two assignments they asked me to complete before the interview. Not because I don’t want to expose the actual company that tried to trick me into building the websites for free, but because the requirements, which would put Facebook to shame, would make you sick for the rest of the day.


The flow of PageRank

Feb 02 2010

You may be familiar with Google’s PageRank technology. Google considers lots of variables to calculate the PageRank of your website. This is a discussion on an extremely simplified version of PageRank.

Assume that we rank the websites by the number and quality of incoming links. The quality of an incoming link is defined as a function of the PageRank of the site that links to the other.

Let us take an example. The following figure shows how a small group of websites link to each other.

Note that website F does not have any incoming link while website G does not have any outgoing link.

Now from the given graph of links we have to find out the (relative) PageRank of each of the websites. Initially we will assume that all the pages have the same PageRank. Now we count the number of incoming links to each site and change the PageRank according to the number of incoming links.

We define PageRank of site A as:

PR(A) = Σ PR(x)/L(x)

where L(x) = number of outgoing links in site x

and x denotes the sites linking to A.

When you run this algorithm for the first time, the PageRank of every page gets updated. Now the problem is that since the PageRank values of all the incoming pages have been updated, we have to re-calculate the PageRank of the pages again to take the new values into consideration. (You can predict this problem just by noticing that the function is a recursive one.) The same problem surfaces in every iteration of the algorithm.

The question is: if the PageRanks change in every iteration, how do we know when to stop iterating? Do the PageRanks ever stabilize? (The proper term is convergence.)

Here is a Python script that repeats the PageRank calculation over and over to find out whether the values converge or not. The output values are represented as percentages. (Google considers this value as the probability of a person visiting any particular website.)

The chart below shows how the PageRank changes after each iteration:

As you can see the PageRanks fluctuate highly in the initial iterations and then they stabilize. This means that the PageRank function converges.
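The iteration can be sketched in a few lines of Python. This is not the script linked above, only a minimal sketch: the link graph is invented for illustration (it merely keeps the two properties mentioned earlier, F has no incoming links and G has no outgoing links), and the names `LINKS`, `pagerank_step` and `simulate` as well as the tolerance are assumptions.

```python
# Hypothetical link graph: each site maps to the sites it links to.
# F has no incoming links and G has no outgoing links, as in the text,
# but the individual links are made up for this sketch.
LINKS = {
    "A": ["B", "D"],
    "B": ["C", "D"],
    "C": ["A", "D"],
    "D": ["A", "G"],
    "E": ["A", "D"],
    "F": ["D", "E"],
    "G": [],
}


def pagerank_step(links, ranks):
    """Apply PR(A) = sum of PR(x)/L(x) once, then rescale to percentages."""
    new_ranks = {site: 0.0 for site in links}
    for source, targets in links.items():
        for target in targets:
            new_ranks[target] += ranks[source] / len(targets)
    total = sum(new_ranks.values())
    if total == 0:
        raise ValueError("no PageRank is flowing through the graph")
    return {site: 100.0 * value / total for site, value in new_ranks.items()}


def simulate(links, tolerance=1e-9, max_iterations=100):
    """Repeat the update until no value moves by more than `tolerance`.

    Returns (ranks, iterations); iterations is None if the values did
    not settle within max_iterations.
    """
    ranks = {site: 100.0 / len(links) for site in links}  # equal start
    for iteration in range(1, max_iterations + 1):
        new_ranks = pagerank_step(links, ranks)
        change = max(abs(new_ranks[site] - ranks[site]) for site in links)
        ranks = new_ranks
        if change < tolerance:
            return ranks, iteration
    return ranks, None


if __name__ == "__main__":
    final_ranks, iterations = simulate(LINKS)
    print("settled after", iterations, "iterations")
    for site in sorted(final_ranks):
        print(site, round(final_ranks[site], 2))
```

Stopping when the largest change between two iterations drops below a small tolerance is one simple way to decide that the values have converged; other stopping rules would work too.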

Another thing to note is that adding more nodes to the graph did not seem to affect the convergence. Even if you double the number of sites in the collection, the number of iterations taken to converge stays almost the same. Others have also reached the same results (ppt). The PageRank function is analogous to electric current flowing through a mesh. Even if there are a lot of nodes and sources, the current flow stabilizes (and stabilizes really fast).

Also note that site D has the highest PageRank, which is to be expected as it has the most incoming links. Site F has the lowest PageRank because it does not have incoming links.

According to this algorithm, linking out to other sites does not reduce the PageRank of your website. There is a problem though. Take the case of site G. It does not link to any other site. This means that the PageRank is not flowing out of site G to any other site. If site G linked to other sites, it would have increased the PageRank of the other sites by a tiny bit. (This case affects only the first link out of any site.) To solve this problem, Google divides the PageRank of sites like these (called sinks) among all other sites. You may also want to read about the damping factor.
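Here is a rough sketch of how sink handling and a damping factor might be combined. It is an illustration under assumptions, not Google's actual implementation: the graph is invented, the damping value 0.85 is simply the figure usually quoted, and `damped_pagerank` is a hypothetical name.

```python
# Hypothetical link graph (same shape as described in the text:
# F has no incoming links, G is a sink with no outgoing links).
LINKS = {
    "A": ["B", "D"],
    "B": ["C", "D"],
    "C": ["A", "D"],
    "D": ["A", "G"],
    "E": ["A", "D"],
    "F": ["D", "E"],
    "G": [],
}


def damped_pagerank(links, damping=0.85, iterations=50):
    """Iterate PageRank with a damping factor and sink redistribution.

    Every site gets a small base share (1 - damping) / N. A sink passes
    its damped rank to all *other* sites in equal parts, so no rank is
    lost and the values keep summing to 1.
    """
    count = len(links)
    ranks = {site: 1.0 / count for site in links}
    for _ in range(iterations):
        new_ranks = {site: (1.0 - damping) / count for site in links}
        for source, targets in links.items():
            if not targets:  # sink: pretend it links to every other site
                targets = [site for site in links if site != source]
            share = damping * ranks[source] / len(targets)
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


if __name__ == "__main__":
    for site, rank in sorted(damped_pagerank(LINKS).items()):
        print(site, round(100 * rank, 2))
```

With this variant even site F ends up with a small non-zero PageRank, because every site receives the base share and a slice of the sink's rank.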

Before leaving can you explain why the PageRank of site A is greater than that of site B?


A quick little puzzle

Jan 27 2010

Here is a puzzle for you:

In a group of people, 40% are men and the rest are women. It is also given that in this group 40% of the men and 10% of the women are obese.

What is the probability that an obese person in the group is a man?

When you find the solution, post it as a comment or send it to me. The solution will be explained in a later post.


Founder equity

Jan 19 2010

A question about splitting equity between co-founders:

I envision a 51 – 49 split. Do you believe this to be fair?

A 50-50 split will bring more trust and synergy in the startup. In the long run it will far outweigh any (non-existent) advantage you think you will have in keeping the 1% share to yourself.

If you have to use the 1% share to end an argument, you have already lost. In deadlock situations flipping a coin will be better than using the 1% share to force your choices every time on the other founder. Remember the saying: Win an argument and lose a friend.

If you haven’t done any significant work on the startup before the other founder joins, give them an equal share.


Chess programming and such

Jan 16 2010

Another story from my college days. Be warned, long article.

Engineering colleges in Kerala require students to submit a mini project in the third year of their computer science and engineering courses. This is in addition to the main project which is to be submitted in the last year of the course.

Two of my classmates, Praveen Kumar (who in our circles is regarded as equal to Jon Skeet) and Philip, and I formed our team for this mini project. We were so excited about this project that we started discussions about it long before the official project start time. After a lot of late night debates we decided to develop a chess playing program.

The three of us were sharing a rented house near college and we used to develop software utilities for various purposes#1. The main point to note here is that we developed these programs in – wait for it – Visual Basic 6.

We were very comfortable with Visual Basic, and since it is very easy to develop a good chess UI in Visual Basic, we started building a prototype of the application in VB6. We thought that once the basic logic was done and working, we would port it to a better and faster language. Unfortunately, we worked on the code so long and hard that the code base grew larger and larger.

Testing the program was somewhat tricky. The alpha-beta based algorithm was in a working condition and the program was generating and making some basic moves, but we were facing two problems:

Problem 1: How do we know if the computer has played correctly? There is no way to really make sure that the computer has played the correct move (for any given depth) because we ourselves don’t know the correct move! Of course if you are a good chess player you can find out a good move for a particular board position, but that does not solve the problem. First of all you cannot be sure that your move is the best move. What if there were better moves that you just did not see? Secondly, chess players do not think exactly like a computer. They do not calculate the moves strictly using an algorithm. Some of their decisions are based on intuition.

The very best chess players can just look at a chess position and see the 2-3 lines of play that they should analyze instead of analyzing all the 30+ available moves. Computers cannot do that. Computers have to look at all the moves to decide whether to analyze a line of play to a greater depth or not. Humans are good at chess strategy while computers are good at tactics. This means that a human cannot tell whether a computer is playing the perfect game (for any given depth) or not.

Problem 2: Since our implementation was not complete and no optimizations were made initially, the computer would take a very long time to think and make its move. We had to wait something like 4 minutes to get each move from the computer, and it was taking up much of our development time. Verifying the correctness of the program was going to take a lot of time.

One good way to test a chess program is to make it play against itself#2. So we would code late into the night, and in the mornings before going to college we would start the program and make it play against itself. When we came back we would look at the logs and see what happened. As excited as we were, the results were often depressing. Most of the time the game would go into a loop where each player played the same moves back and forth. The reason was that once the chess AI#3 found the best move for a particular board position, it just played that move, disregarding other facts such as the move history. The best way to fix this is to tell the program to consider the move history too when deciding which piece to move.
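One simple way to take the history into account can be sketched as follows. This is an assumption about how such a fix might look, not our original VB6 code: `pick_move`, the tuple layout and the position strings are all made up for the example.

```python
def pick_move(candidates, history):
    """Pick the best-scored move that does not repeat an earlier position.

    candidates: list of (score, move, resulting_position) tuples.
    history:    set of positions that have already occurred in the game.
    Falls back to the plain best move when every option repeats.
    """
    ranked = sorted(candidates, key=lambda item: item[0], reverse=True)
    for _score, move, position in ranked:
        if position not in history:
            return move
    return ranked[0][1]


if __name__ == "__main__":
    moves = [(5, "Nf3", "position-1"), (4, "e4", "position-2")]
    # "position-1" was already on the board, so the second-best move wins.
    print(pick_move(moves, {"position-1"}))
```

A real engine would compare board positions (for example by a hash of the board) rather than strings, but the idea is the same: a move that only recreates an old position is skipped as long as there is an alternative.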

Another reason for frequent loops was that both the players in the game were of the same strength (they were both thinking x moves ahead). To fix this and to see some real results we made them of different strengths. One of the players would think x moves ahead while the other would think x-2 or something like that. This improvement helped us see some real results. When we came back to check the results at the end of the day, we could see one of the players checkmated!

If you are into computers, chess programming is one of the best ways to have fun. There are a lot of intricacies in the implementation so that you can tweak the program again and again to get performance improvements. The best chess programming tutorial (for beginners) I have come across is the one from GameDev.net.

I remember one other small bug we faced. Here is how a computer chess program finds out the best possible move:

Find out all the valid moves available to the player. Assume that you played the move and find out the new board position. Now look at the new board position from the perspective of the opponent and try to figure out what move he is most likely to make. If he makes that move, what will the new board position look like? What move will you make to counter his move?

You can go on and on, iteratively deeper and deeper into this search tree#4. The deeper you search, the better your decision (move) will be. For each board position the algorithm will assign a score. The score determines how good the board position is. If one of your pieces is captured, you lose that much score. For example if you lose a pawn, you lose 1 point.
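The search described above can be sketched on a toy tree. This is only an illustration of minimax with alpha-beta pruning, not our chess engine: the tree is a hand-made nested list whose leaves are scores from the computer's point of view, and the names `alphabeta`, `best_move` and `TREE` are assumptions.

```python
import math

# Toy game tree: the computer has three moves, the opponent two replies
# to each. The leaf numbers are invented board scores for the computer.
TREE = [[3, 5], [2, 9], [0, 1]]


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Return the score of `node` assuming both sides play their best."""
    if not isinstance(node, list):  # leaf: the score of a board position
        return node
    if maximizing:  # computer to move: take the highest score
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent will never allow this line
                break
        return best
    best = math.inf  # opponent to move: assume the reply worst for us
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best


def best_move(tree):
    """Index of the root move with the best guaranteed score."""
    return max(range(len(tree)), key=lambda i: alphabeta(tree[i], False))


if __name__ == "__main__":
    print("score:", alphabeta(TREE, True), "move index:", best_move(TREE))
```

In a real engine the children would be generated from the valid moves of a board position and the leaves would come from an evaluation function; the nested list just keeps the sketch short.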

Now look at the following game tree. Blue represents a move by the computer and red, a move by the opponent. The numbers indicate the points gained by the computer in each step.

For example in the first line of play, the computer gains 5 points in the first move (probably by capturing a rook) and in the counter-move the opponent fights back by gaining 3 points (probably by capturing a bishop/knight). Now as you may remember, the algorithm just looks at the total points of the player at the end of each line of play. In both of the above cases, the total gain by the computer is 2 points.

If the computer chooses between these two lines of play randomly, you are in for trouble. The problem is that the computer does not know what happens after the last move in the line of play. This means that if you follow the first line of play you are almost certain to have gained 2 points after the next two moves, while in the second line of play you will have lost 3 points after the next two moves. Maybe you can gain 5 points later in the game, but what if, two moves later, when you can see further into the tree, you realize that the move you saw earlier would be disastrous? Then you will have to change your game, and this means that you cannot get those points.

There are two ways to solve this issue. The first is to prefer early points to later ones. In our example, it is better to choose the first line of play over the second one.
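One possible way to prefer early points is to weigh each gain by how soon it arrives. The sketch below is an assumption for illustration: the discount of 0.9 per move is arbitrary, and the second line of play (lose 3 first, gain 5 later) is a guess at the missing figure.

```python
def discounted_score(gains, discount=0.9):
    """Sum the gains of a line of play, trusting later moves a bit less."""
    return sum(gain * discount ** ply for ply, gain in enumerate(gains))


# Hypothetical lines of play: points gained by the computer at each step.
FIRST_LINE = [5, -3]   # gain 5 now, lose 3 afterwards
SECOND_LINE = [-3, 5]  # lose 3 now, gain 5 afterwards

if __name__ == "__main__":
    print(sum(FIRST_LINE), sum(SECOND_LINE))  # plain totals are equal
    print(discounted_score(FIRST_LINE) > discounted_score(SECOND_LINE))
```

Both lines have the same plain total, but once later gains count for slightly less, the line that collects its points early scores higher.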

The second way is to use quiescence search:

The horizon effect can be mitigated by extending the search algorithm with a quiescence search. This gives the search algorithm ability to look beyond its horizon for a certain class of moves of major importance to the game state, such as captures.
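A bare-bones version of the idea might look like the sketch below. It is not what we implemented: the position format (a dict with a static `score` for the side to move and a list of `captures`) is invented for the example, and real engines add many refinements.

```python
import math


def quiescence(position, alpha=-math.inf, beta=math.inf):
    """Keep searching capture moves until the position is quiet.

    position["score"] is the static evaluation from the point of view of
    the side to move; position["captures"] lists the positions reached by
    each available capture (seen from the other side, hence the minus).
    """
    stand_pat = position["score"]  # value if we simply stop capturing
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for child in position.get("captures", []):
        score = -quiescence(child, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha


# Hypothetical position: statically worth +1, but a capture is available
# that leaves the opponent 4 points down with nothing to take back.
NOISY = {"score": 1, "captures": [{"score": -4, "captures": []}]}
QUIET = {"score": 2, "captures": []}

if __name__ == "__main__":
    print(quiescence(NOISY), quiescence(QUIET))
```

The normal search would call something like this at its depth limit instead of returning the static score directly, so that a line of play is never judged in the middle of an exchange of pieces.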

We made a lot of modifications to the code. We used to take printouts of the source code (about 100 pages) to college and look for optimizations that could be applied to the code. Chess programming is one of the areas where over-optimization is not frowned upon.

So after all the hard work, the program was working fine and it could play a decent game of chess (given enough time of course).

Translating the source code

We were just beginning to feel happy about how the project was going when it hit us – the college put a restriction that all projects must be done in Java. We had to port the application to Java soon.

Rewriting the whole application from scratch seemed time consuming and demotivating. We needed a faster way. What about converting the VB6 source code to Java code automatically? Of course for this to work we would have to write a VB to Java language translator. It seemed too difficult considering the fast approaching deadline for project submission. We downloaded some code translation software from the internet and tried it out, but none of the tools seemed to work perfectly. Converting a complex VB6 UI having a fair amount of custom animations to Java is almost impossible even for an advanced translator.

Then we had an idea. Regex!

Of course Regular Expressions are not ideal for parsing source code of any kind. We had no time, so we decided to try anyway.

In the case of any decent chess playing program, there are two basic parts: a chess engine and the UI module. The chess engine does all the complex calculations – recording the user moves, finding out whether a move is valid or not, thinking about the best move for the computer etc. The UI module displays the chess board to the user and allows the user to make a move using the mouse. Now as you may have already inferred, the UI part is not very translatable between VB6 and Java. We decided to develop the UI from scratch in Java.

The interesting thing about the chess engine is that it contains lots and lots of calculations, conditions and loops which help it to make an intelligent move. I should probably note that a chess engine cannot be considered intelligent as such. A chess engine crunches millions of numbers very fast and finds out the best move the computer can make against a human opponent. The moves may look intelligent to a human player, but a chess engine cannot be regarded as an example of an AI engine. It is just a number cruncher beneath its layers. Anyway the point is that we had hundreds of pages of code that consisted purely of algorithms which aided the computer in playing a better game of chess.

As we found out, converting these algorithms from VB6 code to Java was not as hard as it seemed. We employed the powers of Regular Expressions to do that. Here is a glimpse of what we did.

1. Replaced:

If with if(
For with for(
Then with {
Else with } else {
End if, Next, End Function etc with }
True with true
False with false
And with &&
Or with ||
Mod with %
Exit For with break;
= in conditional statements with ==

etc.

2. Added ; (semicolon) at the end of lines which did not start with any of the above keywords.

3. Changed the array referencing code. Something like Board(x,y) in vb6 code became board[x][y] in Java code.

4. There were some complex conditional statements in the VB code that we thought would be impossible to convert to Java using Regex. We had to convert these manually.

5. Converted looping statements of the form

For i = 0 To NumMoves - 1 to

for(int i = 0; i <= numMoves - 1; i++){

6. Variable declarations of the form

Dim index As Integer to

int index = 0;

There was more of this kind.

Of course it was not all Regex stuff. It is impossible to get it right with Regex alone. The small program we wrote to translate the code read each line of VB code separately and converted it to Java, mostly using Regex. The whole translation (Regex replacement) was done in several passes. We never implemented a full-blown translator or language parser.
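To give a feel for the line-by-line approach, here is a small sketch in Python. It is not the translator we wrote, only an illustration covering a handful of the rules listed above; the function names are made up, and real VB6 code has many constructs this does not handle.

```python
import re


def convert_operators(text):
    """Swap VB6 operator keywords and literals for their Java spellings."""
    for pattern, java in [(r"\bAnd\b", "&&"), (r"\bOr\b", "||"),
                          (r"\bMod\b", "%"), (r"\bTrue\b", "true"),
                          (r"\bFalse\b", "false")]:
        text = re.sub(pattern, java, text)
    # Board(x, y) -> board[x][y]; only handles simple one-word indexes.
    return re.sub(r"\bBoard\((\w+),\s*(\w+)\)", r"board[\1][\2]", text)


def convert_condition(condition):
    """Turn a VB6 condition into a Java one ('=' becomes '==')."""
    condition = condition.replace("<>", "!=")
    # A lone '=' that is not part of <=, >=, != or == is a comparison.
    condition = re.sub(r"(?<![<>!=])=(?!=)", "==", condition)
    return convert_operators(condition)


def translate_line(line):
    """Translate one line of (simple) VB6 code to Java."""
    line = line.strip()
    if not line:
        return ""
    match = re.match(r"^If (.+) Then$", line)
    if match:
        return f"if ({convert_condition(match.group(1))}) {{"
    if line == "Else":
        return "} else {"
    if re.match(r"^(End If|End Function|Next( \w+)?)$", line):
        return "}"
    if line == "Exit For":
        return "break;"
    match = re.match(r"^For (\w+) = (.+) To (.+)$", line)
    if match:
        var, start, end = match.groups()
        # VB6's To bound is inclusive, hence '<=' in the Java loop.
        return f"for (int {var} = {start}; {var} <= {end}; {var}++) {{"
    match = re.match(r"^Dim (\w+) As Integer$", line)
    if match:
        return f"int {match.group(1)} = 0;"
    # Anything else is treated as a plain statement.
    return convert_operators(line) + ";"


if __name__ == "__main__":
    for vb_line in ["Dim index As Integer",
                    "If x = 1 And y <> 2 Then",
                    "score = Board(x, y) Mod 8",
                    "End If"]:
        print(translate_line(vb_line))
```

Note how quickly the special cases pile up even in this tiny sketch; that is exactly why the output still needed a careful manual pass.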

Even after this entire Regex circus the translation was not complete. We had to read through most of the code and change a lot of things to get it working correctly. There are many differences between the languages (like array indexes starting at 0 in Java and at 1 in VB6) and those needed to be taken care of. It did take us a bit of effort to get the details correct, but finally we completed converting the code to Java with much less effort than completely rewriting it.

Please don’t let this article make you think that I advocate using Regex for this type of work. I don’t.

#1More on that later. Remind me if I forget. I promise that this will change the life of many undergraduates. ;)

#2Another good way is to pit it against better chess engines.

#3I included the word “AI” there just so that I could write this note. A chess playing algorithm cannot be considered Artificial Intelligence. Computer chess is a number crunching problem. Implementing the algorithms in the most efficient manner is the best way to get good results. There is no intelligence involved (compared to real AI techniques such as Neural Networks).

#4It is limited by the computing power you have. On average, about 30 valid moves are available in each chess position. If you are going to search 8 moves ahead, you will have to search through 30^8 nodes (about 6.6 × 10^11), which is a very large number.
