More Local News Caught in Flood of Unrelated Copyright Takedown Requests


Lumen has identified another large group of connected takedown requests, many of which are targeting local news sites seemingly by accident. Although Google does not seem to have improperly removed any news from its search index, this is yet another example of an organized takedown campaign with so many obvious errors that it almost certainly lacks any significant human oversight on the part of the senders.

The most striking similarity within this heap of requests is their descriptions. Like the series of takedowns related to a former Russian Olympian detailed in a previous post, the notices in this new group, which purportedly come from at least ten different senders and at least ten different countries, share a word-for-word identical description. It begins by claiming “the whole [infringing] site is full of pictures of the models working in our company,” and ends by pressing for “urgent intervention.”

Over 60,000 notices have used the description in the past year, and the connections between them don’t stop there. All 60,000+ notices include at least a few links with the word “escort” somewhere in the URL. Most of the links in the individual notices contain Turkish text along with the term, suggesting that the notices primarily seek the de-indexing of Turkish escort websites. (Ironically, only 74 of the notices list Turkey as their country of origin.) More intriguingly, while DMCA takedown requests in the Lumen database contain anywhere from one to over twenty thousand allegedly infringing URLs, many, though not all, of the 60,000 notices we reviewed alleged copyright infringement at almost exactly 900 different URLs (the range was from 899 to 901), with significant overlap of specific URLs between notices.
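For readers who want to look for this kind of pattern themselves, the two signals described above (a shared description and a URL count clustered around 900) can be checked with a few lines of Python. This is only a rough sketch under stated assumptions: the record layout (`id`, `description`, `urls`), the helper names, and the sample notices are all hypothetical and invented for illustration. They are not Lumen's data format or the method used for this post.

```python
from collections import defaultdict


def normalize(description):
    """Collapse case and whitespace so trivially different copies match."""
    return " ".join(description.lower().split())


def group_by_description(notices):
    """Map each normalized description to the notices that use it."""
    groups = defaultdict(list)
    for notice in notices:
        groups[normalize(notice["description"])].append(notice)
    return groups


def url_count_in_range(notice, low=899, high=901):
    """True if the notice lists between `low` and `high` URLs, inclusive."""
    return low <= len(notice["urls"]) <= high


def fake_urls(n):
    """Placeholder URLs for the invented sample notices below."""
    return [f"https://listing.example/escort/{i}" for i in range(n)]


# Hypothetical sample records, not real notices.
notices = [
    {"id": "n1", "description": "Example shared description text.", "urls": fake_urls(900)},
    {"id": "n2", "description": "example  shared description TEXT.", "urls": fake_urls(899)},
    {"id": "n3", "description": "A different description.", "urls": fake_urls(12)},
]

groups = group_by_description(notices)
largest = max(groups.values(), key=len)  # the biggest cluster of matching notices
cluster_ids = [n["id"] for n in largest]
near_900_ids = [n["id"] for n in largest if url_count_in_range(n)]
```

In this toy data, the first two notices fall into one cluster despite differences in case and spacing, and both sit inside the 899 to 901 window.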

However, sprinkled into the allegedly infringing URLs in these notices are some from innocuous and obviously non-escort websites, including a Turkish hotel’s reviews on TripAdvisor, a 2016 India Today news story about a cricket presenter, and an NBC Bay Area story about the recent CrowdStrike outage. Articles from local news sites were the most common type of false positive among the targeted URLs. For example, one notice contains 99 links to Fox 5 News San Diego, including URLs for articles about abortion rights, Bangladeshi elections, local crime, and a rescue of zoo animals, among others.

A few weeks ago, a Lumen blog post showed how a likely use of keyword searches to identify infringing sites may be resulting in news media and scientific blogs getting excluded from Google search — in that case, some ocean-related articles were the subject of a notice from an OnlyFans performer named Ocean. Keyword searching seems to play a role here too: many of the local news pages contained the word “escort” somewhere in the text of the article. This also explains the hotel review page on TripAdvisor, as one of the popular questions about the hotel listed on the page is about escorts. Perhaps the URLs in each takedown request, then, are the results for a variety of keyword searches relating to escorts in Turkey, taking place over a period of time.
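To make the keyword hypothesis concrete, here is a minimal Python sketch of how a naive substring match would sweep up unrelated pages. It is not known how the senders actually assemble their URL lists, so this is purely an assumption-based illustration; the function name `flag_pages`, the page texts, and the `.example` URLs are all invented.

```python
def flag_pages(pages, keyword):
    """Return URLs whose text contains `keyword`, ignoring case.

    A naive substring match like this has no notion of context, so a
    news story that mentions a "police escort" is flagged the same way
    as an actual escort listing.
    """
    keyword = keyword.lower()
    return [url for url, text in pages.items() if keyword in text.lower()]


# Hypothetical pages, invented for illustration only.
pages = {
    "https://listing.example/profile": "Escort services available in the city.",
    "https://local-news.example/parade": "Officers provided a police escort for the parade.",
    "https://local-news.example/outage": "A software outage grounded flights on Friday.",
}

flagged = flag_pages(pages, "escort")
```

Here the parade story is flagged alongside the listing page, while the outage story is not. A page with no mention of the term would not be pulled in by this kind of matching alone.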

Yet these hypotheses can’t explain all the mistakes. The NBC Bay Area article about CrowdStrike contains no mention of the word “escort,” nor any obvious connection to Turkey. The same goes for the India Today article and the zoo animal rescue. So there must be more to the erroneous inclusions than merely unsanitized keyword searches.

Luckily, as far as we could tell, Google has not de-indexed any of the local news sites targeted in this set of takedown requests (though many of the requests are still listed as “pending” on Google’s transparency reports as of this writing).

While the exact mechanisms behind this campaign are unclear, both its scale and the wrongful inclusion of news articles in unrelated requests point toward an automated system, one that functions without regard for any possible collateral damage to the information ecosystem.