LinkedIn Files Lawsuit to Battle Profile Scraping Bots

Citing the Computer Fraud and Abuse Act (CFAA) and the Digital Millennium Copyright Act, the professional social network has filed suit against 100 unnamed defendants for using bots that, LinkedIn claims, intentionally violate its terms of service as well as multiple state and federal laws. The publicly available court filing includes some information about how the company combats bots and profile scraping, as well as some of the measures the alleged abusers are taking.


In a suit filed in early August, LinkedIn is seeking to identify 100 “Doe Defendants,” or currently unknown parties, that have been using automated software programs to extract, or scrape, data from LinkedIn pages since December 2015. The complaint accuses these persons or entities of using a massive botnet to evade LinkedIn’s abuse and bot detection systems, and claims that this activity constitutes “unauthorized access” under the CFAA. According to TechCrunch, filing the lawsuit is a first step in asking the court to disclose the identities of the individuals or entities behind the IP addresses associated with the profile scraping and bot activity. The suit also alleges violations of the Digital Millennium Copyright Act and the California Penal Code, and demands a jury trial.


The court filing explains how bot activity and user profile scraping go against LinkedIn’s User Agreement, arguing that the activity violates “an array of federal and state laws.” The company describes some of its automated bot detection techniques, including monitoring web traffic and limiting both the number of profiles a LinkedIn user account can view and how quickly it can view them. The complaint names several internal systems LinkedIn has in place to protect against this type of activity: FUSE, which limits the number of LinkedIn profiles a user account can view; Quicksand, which monitors the patterns of web page requests and presents challenges (such as CAPTCHA) or restricts offending accounts; and Sentinel, which scans, throttles, and blocks suspicious or offending IP addresses. LinkedIn also details its use of IP blacklisting and machine learning models to identify groups of offending IP addresses.
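To make the idea of a per-account view limit concrete, here is a minimal Python sketch of a sliding-window limiter. It is not LinkedIn’s FUSE implementation, which the filing does not describe in detail; the class name, the cap, and the window length are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class ProfileViewLimiter:
    """Hypothetical per-account limiter: caps profile views within a sliding time window."""

    def __init__(self, max_views, window_seconds):
        self.max_views = max_views
        self.window_seconds = window_seconds
        # account id -> timestamps (in seconds) of views that were allowed
        self._views = defaultdict(deque)

    def allow(self, account_id, now):
        """Return True and record the view if the account is under its cap."""
        views = self._views[account_id]
        # Drop timestamps that have aged out of the window.
        while views and now - views[0] >= self.window_seconds:
            views.popleft()
        if len(views) >= self.max_views:
            return False  # over the cap: a real system might challenge or block here
        views.append(now)
        return True


# Illustrative values only: at most 3 profile views per 60 seconds.
limiter = ProfileViewLimiter(max_views=3, window_seconds=60)
results = [limiter.allow("account-a", t) for t in (0, 1, 2, 3)]
```

In this sketch the fourth request from the same account inside the window is refused, while other accounts are unaffected. A limiter of this kind constrains a single account, which is why scrapers that spread requests across many accounts or IP addresses call for additional, network-level checks.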


Because not all web scraping is harmful (search engine indexing, for example), LinkedIn also maintains a list of whitelisted IP addresses that it allows to “query and index the LinkedIn website, without being subject to all of LinkedIn’s security measures,” as the company explains in the lawsuit. However, by working through a whitelisted third-party cloud service provider, the site abusers and their profile scraping bots have been able to circumvent these security measures as well.
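A rough Python sketch shows why a whitelisted range can become a loophole. It assumes the whitelist is a simple list of network ranges checked before bot detection runs; the filing does not say how LinkedIn’s whitelist actually works, and the addresses below are reserved documentation ranges, not real data.

```python
import ipaddress

# Hypothetical allowlist using documentation-only ranges (RFC 5737).
ALLOWLISTED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # stand-in for a search engine crawler range
    ipaddress.ip_network("198.51.100.0/24"),  # stand-in for a shared cloud provider range
]


def is_allowlisted(ip):
    """Return True if the address falls inside any allowlisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in ALLOWLISTED_NETWORKS)


def should_apply_bot_checks(ip):
    """Allowlisted traffic skips the checks; everything else is inspected."""
    return not is_allowlisted(ip)
```

Under these assumptions, any client that can send requests from inside the shared cloud provider range is treated like a trusted crawler, because the check looks only at the source address and not at who is operating the machine behind it.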


The suit claims these bad actors have caused, and continue to threaten, irreparable harm to LinkedIn, and the company is asking the court for “expedited discovery to learn the identity of the Doe Defendants.”


For more information:


LinkedIn sues anonymous data scrapers


LinkedIn Lawsuit – Case Document Filing (PDF)