
A search index is usually made up of smaller independent pieces, often called segments. So you can download and process the data progressively on a local machine, upload the resulting segments to object storage, and then run queries against them there. That's what we did for this project: https://quickwit.io/blog/commoncrawl

Also an interesting blog post here: https://fulmicoton.com/posts/commoncrawl/
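
To make the segment idea above concrete, here is a toy sketch in Python. It is not Quickwit's actual format or API: `build_segment` and `search` are hypothetical names, each "segment" is just an in-memory inverted index, and a plain list stands in for object storage.

```python
from collections import defaultdict


def build_segment(docs):
    """Build one self-contained inverted index (a "segment")
    from a batch of (doc_id, text) pairs."""
    postings = defaultdict(set)
    for doc_id, text in docs:
        for term in text.lower().split():
            postings[term].add(doc_id)
    return dict(postings)


def search(segments, term):
    """Query every segment independently and merge the hits."""
    hits = set()
    for segment in segments:
        hits |= segment.get(term.lower(), set())
    return sorted(hits)


# Process the corpus one batch at a time; each batch yields an
# independent segment that could be uploaded as soon as it is built.
batches = [
    [(1, "common crawl data"), (2, "search index segments")],
    [(3, "object storage for segments")],
]
segments = [build_segment(batch) for batch in batches]  # stand-in for object storage

print(search(segments, "segments"))
```

Because no segment depends on another, the batches can be indexed on a small machine one after the other (or on several machines in parallel), and a query only has to fan out over the segments and merge the results.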



Huh, that's pretty expensive. For comparison, my search engine has an operational cost of ~$50/mo.



