
A couple of thoughts come to mind:

1. If I were Microsoft, I wouldn't trust Google's index. How would I know they aren't doing subtle things to the index to give themselves an advantage?

2. Having the resources to keep a live snapshot of the web is one of the big players' advantages. Opening the index, while good for the web, would not necessarily be good for the company. Google could mitigate that by licensing the data: for data more than X hours old, you get free access; for data newer than that, you pay a license fee to Google. Furthermore, Google could integrate the data with its cloud hosting, making it trivial to create map/reduce implementations that use the data.

3. On the other hand, what a great opportunity the index could provide for startups. Maintaining a live index of the web is costly and getting more and more difficult as people lock down their robots.txt. Being able to immediately test your algorithms against the whole web would be a godsend for ensuring they work with the huge dataset and that your performance is sufficient.

Here's hoping Google goes forward with it!


