Finding the number of votes

Jan 09 2010

A little bit of math fun.

You visit this cool website and see a poll there. You want to know how many people voted in the poll. The number of people who voted is important if you want to validate the results. For example, look at the following result:

If you don’t know how many people voted in this poll, you will be left wondering what this result means. Does it mean that most people think IE6 is the best web browser? Or is it just that a single person voted in favor of IE6? You can calculate the number of people who voted in the poll:

1. Note down the current status of the poll:

2. Cast your vote. Note down the new results:

In the above case I voted Yes. Here is how we calculate the number of people who voted:

Let x = % of Yes votes before I voted

Let y = % of Yes votes after I voted

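Total number of people who voted = (100 - x) / (y - x)

In our example, number of people who voted = (100 - 67) / (75 - 67) = 33/8 ≈ 4 (the displayed percentages are rounded, so round the result to the nearest whole number).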

Remember that the result counts your vote too.
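
To see why the formula works: if Y out of N people had voted Yes before my vote, then x = 100·Y/N and y = 100·(Y+1)/(N+1), and (100 - x) / (y - x) simplifies to N + 1. The arithmetic can be sketched as a small C function. This is only an illustration: the name estimate_total_votes and the -1 return convention are my own assumptions, and it assumes x and y are the percentages of the option you voted for.

```c
/*
 * Estimate the total number of votes in a poll, including your own.
 *   x: percentage for your option before you voted
 *   y: percentage for the same option after you voted
 * Returns -1 when the percentage did not increase (y <= x), because
 * (100 - x) / (y - x) cannot be evaluated meaningfully in that case.
 * The function name and the -1 convention are illustrative choices.
 */
long estimate_total_votes(double x, double y)
{
    double quotient;

    if (y <= x) {
        return -1;
    }
    quotient = (100.0 - x) / (y - x);
    /* Displayed percentages are rounded, so round to the nearest integer. */
    return (long)(quotient + 0.5);
}
```

Calling estimate_total_votes(67.0, 75.0) reproduces the example above: 33/8 = 4.125, which rounds to 4 votes.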

Many websites do not want you to know the exact number of people who voted in the poll (mostly because there are not many votes). This is why they allow you to see the results only after you have voted (another reason, of course, is to make you vote).

You can usually get around this by opening a new session with the server (for example, by clearing your cookies) and voting again. Note that if the percentages do not change after you vote, it just means that a lot of people participated in the poll, and it is difficult (though not impossible) to work out the number of votes.

Elegant logic puzzles

Dec 25 2009

Nick Yee on elegant logic puzzles:

…an elegant logic puzzle is one that can be told to anyone age 10 and up and doesn’t rely on gimmicks, but always seems impossible to anyone when first told. In other words, the problem is tough but the solution is satisfactorily simple once explained. The solution doesn’t involve a person or tool that wasn’t explicitly stated in the problem itself…

He then goes on to pose two elegant logic puzzles. For a curious mind, both of them are extremely rewarding to muse over, and the solutions are clean and elegant too. Go solve them if you haven’t already.

Get cached images from your visitors

Dec 12 2009

Jeff Atwood (of Coding Horror fame) was in for a horror when he realized that his server had crashed, his data was gone and, for some reason, the backup mechanism had not been working. All the data for Coding Horror and the StackOverflow blog disappeared.

Since his blog is very popular, many archiving systems, including the Google cache, have copies of the pages, and I hope that by now they have recovered the complete textual data. The biggest problem in this case is getting back the images. Few archiving services are likely to have a complete backup of the website’s images.

So what should Jeff do now?

Since Coding Horror is a high-traffic blog, I think there is a way to get back at least some of the images. (The probability of this working depends a lot on the traffic to the website and a bit of luck.)

Here are the steps:

  1. Configure the web server to return 304 for every image request. The HTTP status code 304 means that the file has not been modified, so the browser will load the file from its cache if it is present there. (credit: this SuperUser answer)
  2. On every page of the website, add a small script to capture the image data and send it to the server.
  3. Save the image data on the server.
  4. Convert the pixel data back into the original images. Voila!

Capturing the image data

We are going to use the Canvas functionality in HTML5 to get back the image data.

Here is the code you should insert into the pages of the website. It finds all the images on the current page, draws each one onto an HTML canvas, reads the pixel data for the image and sends it to the server through an Ajax POST.

This PHP script saves the data to files on the server. (Can PHP rescue Jeff? ;) To be fair, the server-side code is trivial.) Note that the files themselves will not be images; they will just contain the pixel data of the images. In addition, we also save the original file name and the image dimensions, which means that we can easily reconstruct the original images from this data. Data from every visitor is saved in a different file, just to make sure that you have enough redundancy. (Watch out for this redundancy filling up your server disks.)

Remember that this is proof-of-concept code. You will have to modify it before using it in a production environment and to get some real use from it. There are many limitations. It goes without saying that you will get the image data back from users only if they have the images cached in their browsers. The script will work only in the latest versions of Chrome, Firefox, Safari, Opera etc. (don’t expect it to work in IE for the next decade). In addition, remember that the pixel data will be many times bigger than the original file, so you may have to carefully analyze the disk space usage of this script. (I guess in an emergency, none of this really matters.)

You should edit the post URL in the script to match your domain name.

Finally, I have tested the code and it seems to be working (for me, at least). You need to include jQuery in the pages that use this script, and remember that, due to browser security restrictions, you will have to place all these files under the same domain name. Please tell me if there are any other flaws in the code.

[Updated: code changes to reduce the file size by 50%. The decimal numbers were converted to hex and the spaces in between the numbers removed. The file sizes can be further reduced by using the full character set.]
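
The encoding described in this update can be illustrated with a small sketch. This is not the original script (which ran in the browser); it is a hypothetical C illustration, with function names of my own choosing, and it assumes that every pixel component (0 to 255) is written as exactly two hex digits so that no separators are needed when decoding.

```c
#include <stddef.h>

static const char HEX_DIGITS[] = "0123456789abcdef";

/* Write each byte as two hex digits; out must hold 2 * count + 1 chars. */
void pixels_to_hex(const unsigned char *pixels, size_t count, char *out)
{
    size_t i;

    for (i = 0; i < count; i++) {
        out[2 * i]     = HEX_DIGITS[(pixels[i] >> 4) & 0x0F];
        out[2 * i + 1] = HEX_DIGITS[pixels[i] & 0x0F];
    }
    out[2 * count] = '\0';
}

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode a hex string back into bytes.
 * Returns the number of bytes written, or -1 if the input is malformed
 * (odd length, non-hex character) or does not fit in `capacity` bytes.
 */
long hex_to_pixels(const char *hex, unsigned char *pixels, size_t capacity)
{
    size_t n = 0;

    while (hex[0] != '\0') {
        int hi = hex_value(hex[0]);
        int lo = (hi < 0) ? -1 : hex_value(hex[1]);

        if (hi < 0 || lo < 0 || n >= capacity) {
            return -1;
        }
        pixels[n++] = (unsigned char)((hi << 4) | lo);
        hex += 2;
    }
    return (long)n;
}
```

With this scheme a value such as 255 takes two characters ("ff") instead of up to four ("255" plus a space), which is roughly where the size reduction comes from.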

The General Pirate License

Nov 15 2009

I have come across many situations where I wanted to share an idea but never wanted the idea to be attributed to me, mainly because the original idea never came from me and also because I did not know who the original idea came from.

There are other instances where I want to share some source code that is almost working, but may contain bugs and untested edge conditions that can produce undocumented results (if there is any documentation at all). I want people to use the source code to solve their problems, but in most cases they may have to modify the code to fit their particular problem. I don’t want to take the blame if anything goes wrong, and I don’t ever want to support the source code, other than in the cases where I really want to.

I hereby propose a new license to help you in cases like these to promote sharing of your content in the best possible way – by allowing others to pirate it. This license should be used if you want your ideas or work to be shared and modified freely, but you don’t want to maintain or support the original ideas unless you really wish to do so.

(Edited to remove unwanted clauses. Thanks Scott)

The General Pirate License (GPiL)

1. This work may be copied as many times as you wish, modified in any way you want and published in any medium you like, provided you adhere to all the seven rules in this license.

2. You will not attribute the modified version of the work/product to the original author.

3. Once you modify the work, you should clearly identify the work as yours and you will be responsible for supporting and maintaining the work (if required).

4. You will not disclose the identity of the original author (i.e. your source) without written permission from that person.

5. You will publish your modified version of this work under this same license (i.e. the General Pirate License).

6. All uses of this work will be at your own risk. The original author is liable to give you support for this work unless (a) you have modified any part of this work/product, or (b) they do not wish to do so.

7. The source where you obtained this work from may or may not be the real original author of this work, but that is not the point here. The real point is the work itself, not the author.

There is a reason why all works under GPiL (General Pirate License) should continue to be under GPiL (clause 5). Any project under GPiL is supposed to be free to be modified and copied at will. According to the above rules, it is perfectly legal even to sell any intellectual property guarded by the GPiL. In essence, GPiL allows you to do whatever you want with the IP: copy, modify, share, sell etc. The only restrictions that apply are the seven rules of GPiL. Clause 5 ensures that even if somebody is making a profit from an IP, they have to share their version under GPiL. This is how the basic spirit of GPiL is carried forward.

That’s all. Let me know if any amendments can be made to this license to make it more piratical.

Now spread the word mate, and get sailin’. Arrrr!

Geocities won’t be missed

Nov 11 2009

Yahoo!’s shutdown of geocities has been in the news for the past couple of months. Even though the sweet memories of starting our very first pages on geocities will linger for a long time, I think we are not going to miss geocities much. Yahoo! may have their own reasons# to close geocities, but I think, all in all, it was good that geocities got shut down.

If you think about it, geocities did not matter any more.

The traffic to geocities had been declining very rapidly over the last few years. Old geocities pages were very rarely featured in search engine results. I don’t remember getting a geocities page as a result for any of my Google searches (maybe my queries are too specialized and biased).

I would argue that geocities did not have much quality content. Most of the pages on geocities were personal pages that were “under construction” for eternity. Newer users never signed up for geocities. Social networking was in and creating personal pages was out, and users flocked to Facebook and the like. If anyone wanted to create pages so badly, they usually started a blog on Blogger or WordPress. After Yahoo!’s announcement of a probable closure of geocities, much of the quality data was moved by the users to other sites. All this meant that the pages on geocities no longer mattered. It was just the junk of the internet that ought to be cleaned out.

The biggest impact the closure of geocities will have on the web is on search engine results. Even though pages from geocities were not prominently featured in the search results, they always polluted the long-tail results. (38 million pages do carry a very long tail with them.) Most biggies in the search engine business have removed geocities from their index. There is another big aftereffect: the search engine rankings of other websites will be affected. You see, these 38 million geocities web pages had lots and lots of outbound links. Remember that these links are old and do carry significant weight. If these links are removed from the PageRank calculations, the search results will not be the same. I hope that the search results will improve at least a little bit.

By the way, if you were sleeping for the past few months and missed the party, and you really want to get some of your pages back from geocities, you can try to get the data back from the Reocities project or from the Internet Archive.

#Every reason is economic. Isn’t it?

Why do we Startup?

Nov 08 2009

Did you know that 9 out of 10 startups fail? We are talking about the serious startups. In the not-so-serious type, almost all of them fail. So why do I still want to start a startup?

In the grand scheme of things, it does not matter whether I succeed or not. What matters is that 1 out of 10 startups does succeed. Here we mean success in the changing-the-lives-of-others-for-good sense, not in the absolute economic sense. When you stand for a vision as ambitious as changing the lives of others, you want to maximize the chances of humanity getting better and changing for good.

We cannot improve the ratio of startups succeeding. It will always be 1:10. So how do we increase the number of successful startups? By increasing the number of startups.

After a few years we will not be here in this world, but the changes we bring and the ideas we spread will remain. We have the responsibility to become the enablers of a newer and better world. I believe startups are perfect pathways to a better future.

Recursion & bad examples

Nov 07 2009

If you ask typical computer science graduates from Kerala to write a program to print the nth Fibonacci number, most of them* will give you the following function:

int fibonacci (int n){
    if(n<2){
        return n;
    }else{
        return fibonacci(n-1) + fibonacci(n-2);
    }
}

So far so good, except that the answer is wrong.

Naive recursion is the worst way to find a Fibonacci number, because the number of function calls grows exponentially with n. The last time I checked, it was impractical to compute even the 50th Fibonacci number this way on a personal computer! (The result would not fit in a 32-bit int either.)

If it is impractical to calculate even the 50th Fibonacci number using this function, how could you possibly teach something like this in a computer science course? Fibonacci numbers should be calculated by adding the numbers up linearly in a loop or by using a direct (closed-form) formula. Of course, for some applications you can speed up the recursion by remembering the results of the child nodes in the call tree (memoization), thereby avoiding the same calculations in other branches.
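
The loop and the remembered-results variant described above might be sketched as follows. The names fibonacci_iterative, fibonacci_memo and FIB_MAX, and the choice of unsigned long long, are my own assumptions and not part of the original function; with a 64-bit unsigned type the results are exact up to n = 93.

```c
/* Largest n whose Fibonacci number fits in an unsigned 64-bit integer. */
#define FIB_MAX 93

/*
 * Iterative version: one addition per step, O(n) time, O(1) space.
 * For n > FIB_MAX the unsigned result silently wraps around.
 */
unsigned long long fibonacci_iterative(unsigned int n)
{
    unsigned long long previous = 0; /* F(0) */
    unsigned long long current = 1;  /* F(1) */
    unsigned int i;

    if (n == 0) {
        return previous;
    }
    for (i = 2; i <= n; i++) {
        unsigned long long next = previous + current;
        previous = current;
        current = next;
    }
    return current;
}

/* Cache for the memoized version; 0 means "not computed yet". */
static unsigned long long fib_cache[FIB_MAX + 1];

/*
 * Recursive version that remembers results, so each F(k) is computed once.
 * Returns 0 for n > FIB_MAX (an illustrative convention for "too large").
 */
unsigned long long fibonacci_memo(unsigned int n)
{
    if (n < 2) {
        return n;
    }
    if (n > FIB_MAX) {
        return 0;
    }
    if (fib_cache[n] == 0) {
        fib_cache[n] = fibonacci_memo(n - 1) + fibonacci_memo(n - 2);
    }
    return fib_cache[n];
}
```

Both functions return 12586269025 for n = 50 after about fifty additions, where the naive recursive function would need tens of billions of calls.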

The scariest part is yet to come. In many colleges they use finding the nth Fibonacci number as the primary example for teaching recursion!

Why not teach students the best possible way to find the nth Fibonacci number? Why not teach a real-world example of recursion? Is it necessary to teach the concepts of computer science using lousy examples?

*Take blanket statements with a grain of salt.
