
I did web scraping professionally for two years, on the order of 10M pages per day. Performance with a browser is abysmal and it needs tonnes of memory, so it isn't financially viable at that volume. We used browsers for some jobs, but rendered content usually isn't a problem: you can simulate the API calls the page makes (the common case) and read the JSON, or regex the data out of the inline script and work with that.

I'd say 99% of the time you can get by without a browser.
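
To make the "regex the script" route concrete, here is a minimal sketch in Python. Everything specific in it is an assumption for illustration: the sample HTML, the `window.__INITIAL_STATE__` variable name, and the `extract_state` helper are all hypothetical, and real sites name and shape their embedded data differently.

```python
import json
import re

# Hypothetical page source; the variable name and payload are made up.
HTML = """
<html><body>
<div id="app"></div>
<script>window.__INITIAL_STATE__ = {"products": [{"id": 1, "name": "Widget"}]};</script>
</body></html>
"""

# Match up to the start of the JSON value, including whitespace after "=".
STATE_RE = re.compile(r"window\.__INITIAL_STATE__\s*=\s*")


def extract_state(html: str) -> dict:
    """Pull the embedded JSON object out of an inline script, no browser needed."""
    match = STATE_RE.search(html)
    if match is None:
        raise ValueError("embedded state not found")
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (the ";" and the closing tag here).
    state, _end = json.JSONDecoder().raw_decode(html, match.end())
    return state


if __name__ == "__main__":
    state = extract_state(HTML)
    print(state["products"][0]["name"])
```

The regex only locates where the data starts; `raw_decode` does the actual parsing, which avoids having to match balanced braces with a pattern.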



Fully agree. It takes some thought :)



