
Google search isn't a static site; the results are dynamically generated based on what it knows about you (location, browser language, recent searches from your IP, recent searches from your account, and everything else it has learned from trying to sell ad slots to that device).

That being said, there isn't anything wrong with using Scrapy for this. If you're more familiar with web browsers than with Python, something like https://github.com/puppeteer/puppeteer can also be a quick way to scrape a site: it gives you a headless browser controlled by whatever you script in Node.js.



I see. I am familiar with Python, but I don't need something as heavy as Scrapy. Ideally I am looking for something very lightweight and fast that can just parse the DOM using CSS selectors.
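
For a sense of how small "parse the HTML and pick elements by selector" can be, here is a rough sketch using only Python's standard-library html.parser. The names SimpleSelector and select are made up for illustration, and it only understands a single simple selector (a tag name, .class, or #id), with no combinators or attribute selectors, so treat it as a toy under those assumptions rather than a real selector engine.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; ignored when tracking nesting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SimpleSelector(HTMLParser):
    """Collect the text of elements matching one simple selector.

    Hypothetical helper: supports only "tag", ".class", or "#id".
    A match nested inside another match is folded into the outer one.
    """

    def __init__(self, selector):
        super().__init__()
        self.selector = selector
        self.results = []  # text of each matched element
        self._depth = 0    # > 0 while inside a matched element
        self._buf = []     # text pieces of the current match

    def _matches(self, tag, attrs):
        attrs = dict(attrs)
        if self.selector.startswith("#"):
            return attrs.get("id") == self.selector[1:]
        if self.selector.startswith("."):
            return self.selector[1:] in (attrs.get("class") or "").split()
        return tag == self.selector

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif self._matches(tag, attrs):
            self._depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.results.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)


def select(html, selector):
    """Return the text content of every element matching the selector."""
    parser = SimpleSelector(selector)
    parser.feed(html)
    parser.close()
    return parser.results


html = ('<ul><li class="item">One</li>'
        '<li class="item hot">Two <b>bold</b></li>'
        '<li id="last">Three</li></ul>')
print(select(html, ".item"))
print(select(html, "#last"))
```

This assumes reasonably well-formed markup: an unclosed non-void tag inside a match would throw off the depth counter. It also only works on HTML you have already fetched, so it won't help with pages that build their content in JavaScript.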



