Daniel Imfeld

Web Scrapers

Written 2020-08-28
  • SDKs
    • https://sdk.apify.com/ looks like the best option I've seen so far. It supports both Cheerio and Puppeteer crawlers with the same API.
  • Services
    • Original list by Kumar Thangdu (@datarade on Twitter)
    • http://scrapinghub.com
    • http://www.outwit.com/products/hub/
    • http://fullcontact.com
    • http://emailhunter.co
    • http://clearbit.com
    • http://toofr.com
    • http://import.io
    • http://apifier.com (Kumar's number one favorite)
    • http://elink.club
    • http://www.eliteproxyswitcher.com/ - ;)
    • http://www.uipath.com/
    • http://diffbot.com
    • DiffBot Crawly
    • http://cloudscrape.com
    • https://commoncrawl.org/
    • http://www.fminer.com/
    • https://scraperwiki.com/
    • http://nutch.apache.org/
    • http://www.ubotstudio.com/index7
    • http://mozenda.com
    • http://fivefilters.org/
    • https://data-miner.io
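
As a rough illustration of why the "same API" point in the SDKs note matters: if a static-HTML crawler (Cheerio-style) and a browser-based crawler (Puppeteer-style) accept the same page handler, you can switch between them without rewriting the extraction logic. The sketch below is not the Apify SDK's API. Every name in it (`createStaticCrawler`, `createBrowserCrawler`, `FakeSite`, `render`) is hypothetical, and an in-memory map stands in for the network so it runs offline.

```typescript
// Hypothetical sketch; none of these names come from the Apify SDK.

// What a page handler sees, whichever crawler loaded the page.
interface PageContext {
  url: string;
  title: string | null;
}

type PageHandler = (ctx: PageContext) => void;
type Loader = (html: string) => string;

// Stand-in for the network (an assumption, so the sketch runs offline).
type FakeSite = Map<string, string>;

interface Crawler {
  run(urls: string[]): void;
}

// Extracts the text of the first <title> element, if any.
function extractTitle(html: string): string | null {
  const match = /<title>([^<]*)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}

// Shared crawl loop; the loader decides how raw HTML is prepared.
function createCrawler(site: FakeSite, load: Loader, handlePage: PageHandler): Crawler {
  return {
    run(urls: string[]): void {
      for (const url of urls) {
        const html = site.get(url);
        if (html === undefined) continue; // skip URLs the fake site lacks
        handlePage({ url, title: extractTitle(load(html)) });
      }
    },
  };
}

// Cheerio-style: parse the HTML exactly as the server sent it.
function createStaticCrawler(site: FakeSite, handlePage: PageHandler): Crawler {
  return createCrawler(site, (html) => html, handlePage);
}

// Puppeteer-style: let the page's scripts run first; `render` simulates that.
function createBrowserCrawler(site: FakeSite, handlePage: PageHandler, render: Loader): Crawler {
  return createCrawler(site, render, handlePage);
}

const site: FakeSite = new Map([["https://example.com/app", "<title>Loading</title>"]]);

// Simulates a client-side script that sets the real title after load.
const render: Loader = (html) => html.replace("Loading", "Dashboard");

// One handler, reused unchanged by both crawlers.
const seen: string[] = [];
const handlePage: PageHandler = ({ url, title }) => {
  seen.push(`${url} -> ${title}`);
};

createStaticCrawler(site, handlePage).run(["https://example.com/app"]);
createBrowserCrawler(site, handlePage, render).run(["https://example.com/app"]);
console.log(seen);
```

The handler is identical in both runs. Only the loading step differs, which is the property that makes it cheap to start with the lighter static crawler and move to a real browser only for pages that need JavaScript to render.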
