The chances are pretty low that a scraper would hit multiple unrelated Cloudflare websites, since Cloudflare is used by only a very small share of websites. Scrapers are usually interested in particular websites; they don't just scrape random sites.

What other characteristics can you detect? You can't really rely on the IP address, since ISPs such as AOL route many users through the same IP address. You can't rely on headers or referrer strings, since those can easily be faked. Search engines such as Google have also been known to use non-Google IPs to check whether a site is cloaking. And you say you analyze the reputation of an IP, but users' IP addresses change all the time. Many scrapers do use server farms and cloud services such as AWS, but a lot are moving to European data centers as well, and those IP addresses are harder to get reputation data for (they're not in ARIN, etc.).


