
Dear lord, that is a lot of wonderful reference material.

I'm not sure how they can afford so much (seemingly) original material and still be low-key enough that I hadn't come across them before.



Some companies buy up the rights to the old or out-of-print books that this material originally came from, then cut them up and digitise them automatically. There was a post on HN quite some time ago in which somebody described his process for doing exactly this with a repair-your-car book, and how he used it to make money from ads via SEO hits.


Do you remember roughly when this was? I've been searching for the post on HN and can't find anything.


I'm afraid not. I've been searching as well and couldn't find it. As I recall, his process was:

1. secure the rights to the book (he knew the author personally/through family)

2. cut the book open and run it through a high resolution scanner

3. use ImageMagick to preprocess the images

4. run OCR on the pages and convert them to markdown

5. have a compiler convert his markdown and images to HTML
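
For anyone curious what steps 3 to 5 might look like in practice, here is a rough Python sketch. To be clear, this is my guess and not the setup from the original post: I'm assuming Tesseract for the OCR (the post I remember didn't name a tool as far as I can recall), the specific ImageMagick options are my own choice, and `text_to_markdown` and `markdown_to_html` are toy stand-ins for whatever cleanup and "compiler" he actually used. It also assumes the `convert` and `tesseract` binaries are on your PATH (newer ImageMagick installs may prefer `magick`).

```python
import html
import shutil
import subprocess
from pathlib import Path


def preprocess_cmd(src: Path, dst: Path) -> list[str]:
    """ImageMagick command: grayscale, straighten the page, stretch contrast."""
    return ["convert", str(src), "-colorspace", "Gray",
            "-deskew", "40%", "-normalize", str(dst)]


def ocr_cmd(image: Path, out_base: Path, lang: str = "eng") -> list[str]:
    """Tesseract command; it writes the recognised text to <out_base>.txt."""
    return ["tesseract", str(image), str(out_base), "-l", lang]


def text_to_markdown(text: str, title: str) -> str:
    """Hypothetical cleanup: blank lines separate paragraphs, wrapped lines are rejoined."""
    paragraphs = [" ".join(chunk.split()) for chunk in text.split("\n\n")]
    body = "\n\n".join(p for p in paragraphs if p)
    return f"# {title}\n\n{body}\n"


def markdown_to_html(md: str) -> str:
    """Toy converter: handles only '# ' headings and plain paragraphs."""
    parts = []
    for block in md.strip().split("\n\n"):
        if block.startswith("# "):
            parts.append(f"<h1>{html.escape(block[2:])}</h1>")
        else:
            parts.append(f"<p>{html.escape(block)}</p>")
    return "\n".join(parts)


def process_page(scan: Path, workdir: Path, title: str) -> str:
    """Run steps 3-5 for one scanned page and return the HTML."""
    for tool in ("convert", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    clean = workdir / f"{scan.stem}-clean.png"
    out_base = workdir / scan.stem
    subprocess.run(preprocess_cmd(scan, clean), check=True)
    subprocess.run(ocr_cmd(clean, out_base), check=True)
    # Tesseract appends ".txt" to the output base name itself.
    text = Path(str(out_base) + ".txt").read_text(encoding="utf-8")
    return markdown_to_html(text_to_markdown(text, title))
```

You would call something like `process_page(Path("page-001.png"), Path("work"), "Changing the oil")` once per scanned page. In a real run the OCR output would need a fair amount of manual cleanup before it is worth publishing, and a proper Markdown renderer would replace the toy converter.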

EDIT: FOUND THE LINK: https://news.ycombinator.com/item?id=4974055


Thanks for tracking that down! Very interesting.



