
One could plausibly scrape a significant portion of the adjacency of the network from chained searches for '25 things' and similar notes.


The only problem is accessing all those results Google has stored away. My attempts to access their database have mostly failed, except when I jumped through some rather massive hoops. They restrict you to the first ~60 results if you do what they want and use the API, and ~250 results if you cheat and scrape the pages directly.

That's what makes Google a little disturbing to me. Their database is (in theory) open, but to get any more than a very small segment at a time you have to craft ridiculous queries.


That's what makes this work, though --- you only need one result, one name and the associated 'tagged' names. Then you search each of the tagged names.

60 results is plenty to get one good result per name --- heck, you don't even need that many. Even just one good result per ply will get reasonable results.
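
A rough sketch of that chained search, as a breadth-first crawl. Everything named here is an assumption for illustration: `search` is a hypothetical callable you would supply yourself (it takes a name and returns the names tagged in one result for it), and `max_ply` is just a depth cap. No real search API is being called.

```python
from collections import deque

def crawl_adjacency(seed, search, max_ply=2):
    """Breadth-first crawl of tagged names.

    `search` is a hypothetical callable: given a name, it returns the
    names tagged in one result for that name (possibly an empty list).
    """
    adjacency = {}
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        name, ply = queue.popleft()
        tagged = list(search(name))
        adjacency[name] = tagged
        if ply >= max_ply:
            continue  # record this name's tags, but do not go deeper
        for other in tagged:
            if other not in seen:
                seen.add(other)
                queue.append((other, ply + 1))
    return adjacency
```

Each name is searched at most once, so the number of queries grows with the number of distinct names found rather than with the number of results per query.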



