Copyright trolls and online identification

My previous post dealt with copyright surveillance and algorithmic judgement, and here I want to focus on a particular kind of copyright surveillance and enforcement that has achieved a special sort of notoriety in recent years: copyright trolling.

Some of this is based on my most recent article, The Copyright Surveillance Industry, which appears in the open-access journal Media and Communication. I’m also working on a future piece that deals with copyright enforcement, privacy, and how IP addresses and persons become linked.

Why this matters

First, copyright trolling is having an enormous impact, with hundreds of thousands of defendants named in US and German lawsuits in just a few years. Precedent-setting cases in other countries (such as Australia and Canada) have been determining whether this practice (sometimes called “speculative invoicing”) can spread into new jurisdictions. Some legal scholars have described copyright trolling as a “blight”, an abuse of the legal system, or a kind of “legal ransom”. Defendants must choose whether to pay what the troll demands, or face the prospect of an expensive (and sometimes embarrassing) legal fight. Balganesh makes a strong argument that this exploitative, profit-based use of the legal system disrupts the traditional “equilibrium” of copyright’s underenforcement.

Studying copyright trolling cases can also help us come to terms with the question of personal identification and attribution on the internet – what it means to connect traces of online activity to human bodies and the devices with which they interact. The thorny question of how to link persons to digital flows has been a topic of intense interest for a variety of surveillance institutions, including advertisers and intelligence agencies. Legal institutions around the world have been struggling with related questions in trying to assign responsibility for data communicated over the internet. Copyright trolling is just one example of this problem, but it’s one that is currently playing out in a number of countries on a massive scale.

What is a copyright troll?

Copyright trolls are the products of contemporary copyright regimes, internet technologies, and creative legal entrepreneurs. No one self-identifies as a troll, so the label is pejorative, and used to criticise certain kinds of copyright plaintiffs.

The term is derived from “patent trolls”: patent-owning entities that demand payments from companies allegedly infringing their patents. Like patent trolls, copyright trolls demand payments following alleged infringement of copyright. The difference is that a typical patent troll does not produce anything of value, and simply generates income through settlements and lawsuits. While the term “copyright troll” is usually reserved for law firms engaging in “trollish” practices, these firms represent copyright owners that do produce creative work for sale. It is typically the law firms that drive trolling practices. Some reserve the term “troll” strictly to describe those legal firms that acquire the ability to sue from copyright owners under certain terms (namely, to pass along a percentage of any settlements received to the copyright owner). The law firms can then exercise their copyright enforcement power autonomously.

The line between what is and is not a troll is more difficult to draw in copyright than in patent law, since the law firms involved can point to a legitimate business that they are protecting and particular works being “pirated”. This has not stopped a number of authors from trying to come up with a workable way of delineating trolls from other plaintiffs, but these definitions end up encompassing only a particular slice of trolling operations (given their variability and opportunistic adaptability). There are varying degrees of autonomy that trolling law firms exercise: while some have a free hand in pursuing their legal strategies, others take direction from copyright owners. Because of this, I avoid labelling any specific companies as copyright trolls. Instead (and largely in agreement with Sag, 2014), I refer to copyright trolling as a practice – one that threatens large numbers of individuals with copyright infringement claims, with the primary goal of profiting from settlements rather than proceeding to trial on the merits of a case (see Curran, 2013, p. 172).

How copyright trolling works

In theory, copyright trolling can develop wherever a copyright owner stands to profit from initiating lawsuits against alleged infringers. The now-infamous Righthaven attempted to build its business model around suing people who were sharing news articles. Currently, Canadian government lawyers are accusing Blacklock’s Reporter of being a copyright troll, after the site filed suit against several departments and agencies for unauthorized sharing of the site’s articles. My focus here will be on the most common form of copyright trolling: suing people accused of file-sharing copyrighted works. Because the defendants in these cases are listed as “Does” until identified, and plaintiffs typically file suit against multiple defendants (sometimes hundreds or thousands) at once, these cases can be called Multi-defendant John/Jane Doe Lawsuits. They begin with the collection of IP addresses tied to alleged infringement, proceed to the identification of internet subscribers assigned those IP addresses (discovery), and conclude with claims made against these subscribers in the hope of reaching settlements or (if defendants do not respond) default judgements.

A copyright surveillance company is used to monitor file-sharing networks (principally BitTorrent), where IP addresses of those engaged in file-sharing can be recorded. Just as the activities and IP addresses of downloaders and uploaders are largely visible on BitTorrent, so are the activities of copyright surveillance companies. This is because collecting information on file-sharing cannot be achieved without some level of interaction: connections need to be established with file-sharers so that their IP addresses can be recorded. Once a copyright surveillance company has collected the IP addresses involved in sharing a particular file, it hands them over to a law firm. While there are allegations that a particular German-based copyright surveillance company has been the driving force behind many US copyright trolling cases, typically the surveillance company exits the picture once IP addresses have been collected.
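One reason addresses are so visible on BitTorrent is that peers learn about each other through peer lists. As a rough sketch, and not a description of any particular company's tooling, the Python below decodes the "compact" IPv4 peer list format that trackers can return (BEP 23), in which each peer is six bytes: a four-byte address followed by a two-byte port in network byte order. The function name and the sample bytes are my own, and the addresses come from ranges reserved for documentation.

```python
import ipaddress
import struct


def decode_compact_peers(blob: bytes) -> list[tuple[str, int]]:
    """Decode a compact IPv4 peer list: 6 bytes per peer (4-byte IP, 2-byte port)."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        # "!4sH": network byte order, 4 raw address bytes, unsigned 16-bit port
        raw_ip, port = struct.unpack_from("!4sH", blob, offset)
        peers.append((str(ipaddress.IPv4Address(raw_ip)), port))
    return peers


# Made-up sample: two peers using documentation-only addresses (RFC 5737).
sample = bytes([192, 0, 2, 17, 0x1A, 0xE1, 198, 51, 100, 200, 0xC8, 0xD5])
print(decode_compact_peers(sample))
# → [('192.0.2.17', 6881), ('198.51.100.200', 51413)]
```

Anyone who joins a swarm can obtain lists like this, which is also why monitoring activity tends to be observable by others in the same swarm. A list of addresses says nothing by itself about who was using a connection.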

The next step is to identify the persons “behind” these IP addresses, and the only way to make this link is through the cooperation or forced compliance of an ISP. Since blocks of IP addresses are assigned to particular ISPs, a law firm can determine which ISPs’ customers to pursue by checking their list of recorded IP addresses. Copyright trolls have to be selective, targeting particular ISPs on the basis of geography (jurisdiction) or other factors. ISPs vary in their levels of cooperation with copyright owners that seek to identify allegedly infringing subscribers. In some cases it has been possible to get an ISP to forward a settlement letter without disclosing the identity of the subscriber (for instance, by abusing Canada’s notice-and-notice system), but in general the troll must obtain a court order for the ISP to identify its subscribers. In the UK and Canada, a court order used in a lawsuit to compel information from a third party like an ISP is known as a Norwich order. In the US, courts can issue subpoenas for ISP records.
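To make the block-assignment point concrete, here is a minimal Python sketch of sorting recorded addresses by the ISP whose block they fall in. Everything specific in it is an assumption for illustration: the ISP names are invented, the ranges are documentation-only blocks (RFC 5737), and a real lookup would rely on registry or routing data rather than a hand-written table.

```python
import ipaddress

# Hypothetical allocation table: invented ISP names, documentation-only ranges.
ISP_BLOCKS = {
    "ExampleNet": ["192.0.2.0/24"],
    "SampleTel": ["198.51.100.0/24"],
}


def isp_for(ip, blocks=ISP_BLOCKS):
    """Return the name of the ISP whose block contains ip, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for isp, cidrs in blocks.items():
        if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs):
            return isp
    return None


def group_by_isp(ips):
    """Group recorded addresses by ISP; unmatched addresses go under None."""
    grouped = {}
    for ip in ips:
        grouped.setdefault(isp_for(ip), []).append(ip)
    return grouped


recorded = ["192.0.2.17", "198.51.100.200", "203.0.113.5", "192.0.2.99"]
print(group_by_isp(recorded))
# → {'ExampleNet': ['192.0.2.17', '192.0.2.99'], 'SampleTel': ['198.51.100.200'], None: ['203.0.113.5']}
```

The lookup stops at the ISP. Going from an address and timestamp to a subscriber name requires the ISP's own records, which is why the court order matters.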

It is this “discovery phase” of a lawsuit that has generated the most public information about how copyright trolling operates, since, as previously mentioned, the plaintiffs in these cases generally avoid proceeding to trial. Instead, they use the legal system to identify individuals who can credibly be threatened by a large penalty if they do not settle an infringement claim. ISPs are effectively caught between the plaintiff and the alleged infringers during the discovery phase, and can behave in a number of different ways. In the US, Verizon has recently opposed a particularly burdensome subpoena from Malibu Media. In Australia, a group of ISPs have jointly opposed efforts to identify thousands of their subscribers in a precedent-setting case that continues to unfold. In Canada, Bell, Videotron and Cogeco complied with a court order to identify subscribers in 2012, but TekSavvy took a different approach in a subsequent case involving the same copyright owner, Voltage Pictures. TekSavvy claimed it could not oppose the motion to identify its subscribers (an argument disputed by Knopf), but it did go further than the Canadian incumbents in the previous case, and CIPPIC was granted intervenor status to argue against disclosure and for the privacy interests of subscribers.

Once IP addresses have been linked to subscriber names and addresses, the trolling operation can begin collecting settlements from defendants. Subscribers who ignore the copyright owner’s demands may end up subject to a default judgement, and those who protest their innocence may end up in a lengthy back-and-forth with lawyers, which in the US has included forensic examination of computers and polygraph tests.

IP addresses

In copyright trolling, the main challenge is linking IP addresses to corresponding subscriber information, which often requires a court order. But once this link is made, what does it mean? Is it evidence that the subscriber infringed copyright?

In criminal internet investigations (such as child pornography), IP addresses are only ever used as supporting evidence. IP addresses do not identify people, but they do become a crucial piece of information in tying people to digital flows and fragments. In a criminal case, the knowledge provided by this association can open the door to a further search of a property and computer hardware, ultimately leading to a conviction. In a copyright trolling lawsuit, an IP address leads to the disclosure of subscriber information, which leads to the subscriber receiving a settlement offer/demand (unless the copyright owner chooses not to send one, after discovering the subscriber’s identity). It is all well and good to argue that an IP address does not identify a person, until you are a person at the receiving end of one of these letters. At that point, you, as an identified person, have some decisions to make.

I will spend more time talking about IP addresses specifically in a subsequent post, as these digital identifiers are important in a variety of contexts besides copyright trolling. In the meantime, I’ll be paying attention to the drawn-out saga of the TekSavvy – Voltage case and how courts around the world learn from each other in dealing with copyright trolling.

The Copyright Surveillance Industry

My most recent publication, The Copyright Surveillance Industry, appears in a special surveillance-themed issue of the open-access journal Media and Communication. In it, I examine the industry that has developed to monitor the unauthorized use and distribution of copyrighted works online. The same companies often help to facilitate copyright enforcement, targeting either allegedly infringing content, or the persons allegedly engaged in infringement. These enforcement actions include sending vast numbers of algorithmically-generated takedown requests to service providers, blocking uploaded content that matches the characteristics of certain files, and supporting the lawsuits filed by “copyright trolls” and law firms engaged in “speculative invoicing”.

The scale and scope of the copyright surveillance industry

An interesting fact about the copyright surveillance industry, given the scale of its interventions (for example, hundreds of millions of Google takedown requests and copyright trolls targeting hundreds of thousands of defendants in both the US and Germany) is the industry’s relatively small size. It is certainly much smaller than the multi-billion dollar industry which develops technological defenses against infringement (known as digital rights management [DRM]), or the billions of dollars flowing through police, security, and military-serving surveillance companies. Copyright surveillance companies with just a handful of employees can leverage algorithmic methods to achieve online coverage on a massive scale. While some of their methods are closely guarded (notably, copyright trolls typically avoid proceeding to trial where their evidence would be subject to scrutiny), small teams of academics working with limited resources to track online file-sharing have achieved similar results.

The first wave of copyright surveillance companies were founded in 1999 and 2000, during the rapid rise of Napster. As file-sharing moved to other platforms, new firms sprang up and some were bought out by larger players. In 2005 MediaDefender (one of the more notable firms at the time, with major music, film, and software clients) was bought for $43 million. Another notable surveillance company, Media Sentry, was bought for $20 million in the same year. This appears to have been a time when enthusiasm for the industry was high. Four years later Media Sentry was sold to MediaDefender’s owner for less than $1 million. Subsequent acquisitions have involved undisclosed amounts of money, but this is generally an industry that deals in millions and tens of millions of dollars, and in which a large company might have several dozen employees.

Today, larger and more notable copyright surveillance companies include Irdeto and MarkMonitor – both the product of industry mergers and buyouts. MarkMonitor, which bought the prominent tracking firm DtecNet in 2010, was reported to have 400 employees in five countries in 2012. Irdeto entered the copyright surveillance market in 2011 when it bought the monitoring firm BayTSP and its 53 employees. These companies offer copyright monitoring and enforcement as just part of their “anti-piracy” or “brand protection” services. There are also smaller and more dedicated companies such as Evidenzia in Germany and Canipre in Canada, and more shadowy players such as Guardaley and its various alleged “shell companies”. Copyright owners (or the law firms that represent them) will seek out and hire these firms. Alternatively, surveillance companies drum up business by approaching content owners, informing them that their content is being “pirated”, and offering their services.

Algorithmic surveillance

I’ll discuss copyright trolling and identification based on IP addresses in a subsequent post, but I want to take this post to discuss the sort of algorithmic surveillance commonly used in copyright enforcement. We see algorithmic surveillance wherever there is lots of data to scan and not enough discerning sets of eyeballs to go around, but the copyright surveillance industry has, since its beginnings, been driven by the need to comb through vast online domains, and to do so quickly and inexpensively (ideally, with as little human intervention and supervision as possible).

Much of what is reported, removed, blocked, or flagged as a result of these algorithms is rather uncontroversial from the perspective of copyright law. That is to say, a court might support the algorithm’s judgement that a particular act or piece of content counts as copyright infringement. But algorithms inevitably make mistakes, some of which are so ridiculous that it is clear no thinking human was involved in the process. These include misidentifying promotional content such as official websites and advertisements as copyright infringement. In at least one instance, a copyright enforcement company misidentified its own notices of infringement as actual instances of infringement and issued a takedown notice for them, resulting in a sort of algorithmic feedback loop. These automated misidentifications also result in removing legitimate content belonging to other copyright owners. In one 2011 case, Warner Brothers was accused of repeatedly and willfully issuing mistaken takedown requests. In response, the company essentially argued that it believed its identifications were accurate at the time, and mistakes were not willful because the volume of infringement meant that human beings were unable to fully supervise its automated monitoring.
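A toy example shows how easily this kind of mistake arises. The Python sketch below is purely hypothetical: I am not claiming any company matches content this way, and the page list, URLs, and rule (flag any page whose text mentions the title) are invented for illustration. Even so, it reproduces the pattern described above, flagging an official site and a takedown notice alongside the one listing that might actually infringe.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str


def naive_flags(pages, title):
    """Hypothetical rule: flag every page whose text mentions the title."""
    needle = title.lower()
    return [page.url for page in pages if needle in page.text.lower()]


# Invented pages on reserved .example domains.
pages = [
    Page("https://tracker.example/t/1", "Download Example Movie 1080p torrent"),
    Page("https://studio.example/example-movie", "Example Movie: official site and trailer"),
    Page("https://notices.example/n/42", "Takedown notice regarding Example Movie"),
    Page("https://blog.example/recipes", "Ten weeknight dinners"),
]

for url in naive_flags(pages, "Example Movie"):
    print(url)
# Prints the tracker, studio, and notice URLs; only the first is a plausible target.
```

Tightening the rule reduces false positives but never removes the underlying problem: the match is on surface features, not on whether a use is authorized.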

While there are plenty of examples of algorithms behaving badly in the world of copyright enforcement, it is important to remember that what counts as copyright infringement is often not an easy determination to make. Courts continue to struggle with copyright law’s grey areas, with judges disagreeing on a variety of issues. This is particularly the case with various kinds of “user-generated content”, such as mashups, home videos, or parodies uploaded to YouTube. To make things worse, copyright owners often tolerate or even encourage unauthorized uses of their work (such as fan videos and other forms of fan culture) online. Expecting algorithms to adjudicate what counts as infringement in these circumstances has more to do with the business models of the web and media industries than copyright law. The same can be said for the expectation that users can identify which of their actions count as infringement in advance, and that users who are mistakenly targeted can appeal algorithmic errors when they occur. Ultimately, however, copyright law supports and legitimates these practices, given that the potential penalties for not playing ball with copyright owners far exceed the consequences for abuse or automated carelessness in copyright enforcement.

Internet and digital technologies have opened new possibilities for individuals to create, consume, and distribute content. However, areas of contact between individuals and copyright owners have also increased. Legal and extra-judicial copyright enforcement mechanisms are being employed on a mass scale, based on questionable identifications of individuals and content, and often with limited recourse for those affected. We are likely to see continued calls to make the algorithms involved more accountable, and for ways to determine who can be held accountable for an algorithm’s decisions.