Another Page in the Google Books Saga: Appeals Court Blesses Mass Digitization Project as Fair Use
Those of us of a certain age (read: old) still recall standing in line at the bank of copy machines in the school library, quarters in hand, waiting to copy a few pages of a key piece of research found in the stacks. Those noisy machines have now largely been replaced by coffee shops and smartphone recharging stations ... as have, in many cases, the library stacks themselves.
Perhaps fittingly, when Google decided in the early 2000s to digitize the world’s books, it began by partnering with large libraries to copy portions of their collections, including many that are now out of print. Those libraries themselves generally held no copyright interest in the books they provided to Google, but rather held unique, expansive physical collections of books that made them a good fit for Google’s project. Partnering with libraries (rather than, for instance, buying up books from bookstores) was also clever on Google’s part: It put Google in league with academic and research institutions, who are more often on the fair use side of the copyright infringement equation.
In 2005, several authors brought a putative class action against Google, claiming that the reproduction of their books as part of the Google Library Project infringed their copyright interest. In 2013, the Southern District of New York rejected those claims, finding the project a fair use.
The Second Circuit has now affirmed the lower court’s decision, stating in its opening sentence that the case “tests the boundaries of fair use,” which is no doubt true: Google’s digitization of literally millions of copyrighted works, all under the fair use defense, is unprecedented. Judge Leval’s opinion relies on the large body of fair use jurisprudence on “transformativeness” developed over the past several decades, which he in large part originated with his seminal “Toward a Fair Use Standard” article in 1990. His decision is characteristically filled with bite-sized nuggets of fair use wisdom that will no doubt be cited in many future briefs and decisions. Still, while Judge Leval is careful to provide caveats that limit the Court’s decision to the specific uses (and protections) encompassed by Google’s particular project, with this decision the Second Circuit has to some degree opened another door allowing further monetization by third parties around existing, copyright-protected works.
The Google Library Project
Once books are scanned by Google, the resulting digitized contents are deployed in the following manner. First, its Google Books search function allows users to search for words and phrases within the database, which returns a list of books containing those terms. The search results also include a brief description of the book, as well as a short “snippet” – usually about one-eighth of a page – of the text surrounding the term(s) for which the user was searching. Google has also partnered with many publishers and authors to make available larger portions of certain books, and to provide links for purchasing the books online. Books no longer under copyright (that is, in the public domain) may be displayed in full.
Crucially, Google also places strict limits on the manner in which search results are provided. First, Google Books will only show at most three snippets from any book in response to a search. Second, Google “blacklists” one page in every ten from each book, and one snippet per page. It also does not allow the viewing of any snippets for certain works where seeing a snippet might provide too much key content, such as cookbooks (this restriction appears to reflect the fair use concern, under the third factor, of taking the “heart” of a work). Third, Google excludes books from snippet view upon request of the copyright holder.
In addition to providing the Google Books search function, Google provides a digital copy of each book it digitizes to the original library or other institution that provided the physical copy for scanning. Its contracts with those institutions generally require the institution to follow copyright law in using the digital copies, and not to freely allow dissemination to the public.
Background on Fair Use
Beginning with first principles, Judge Leval’s decision cites the constitutional and theoretical underpinnings of copyright law, noting that “the ultimate, primary intended beneficiary” of copyright protection is “the public, whose access to knowledge” is served by providing “rewards for authorship” to individual authors who bring their works to the public. The decision then notes that sometimes this interest in expanding public knowledge conflicts with the monopoly given to authors, and that the fair use doctrine has arisen to address this conflict.
Judge Leval is careful to note that the fair use “test” is not a rigid one, and rather that each of the four fair use factors found in Section 107 of the Copyright Act “stands as part of a multifaceted assessment of the crucial question: how to define the boundary limit of the original author’s exclusive rights in order to best serve the overall objectives of the copyright law to expand public learning while protecting the incentives of authors to create for the public good.”
In his analysis, Judge Leval concentrates primarily on the first factor (purpose and character of the use) and the fourth factor (effect on the potential market for the copyrighted work), as most courts now do. Notably, he summarizes the law on transformativeness under the first factor in a clear, succinct way and ties it back to the underlying purpose of copyright law, stating that “transformative uses tend to favor a fair use finding because a transformative use is one that communicates something new and different from the original or expands its utility, thus serving copyright’s overall objective of contributing to public knowledge.” He also cautions against relying too heavily on the word “transformative” itself, as it is merely “a suggestive symbol for a complex thought,” and distinguishes transformation of purpose under the first factor from the concept of transforming a work’s form in defining derivative works, an overlap that has confounded litigants and judges in other cases.
Fair Use Analysis
Getting down to business, Judge Leval then examines the search functionality of the Google project under each factor, readily finding that the search functionality itself serves a “highly transformative” purpose. That purpose is allowing searchers to identify and locate “significant information about those books” themselves (emphasis in original). In other words, Judge Leval blesses the provision of what is essentially metadata about the copyrighted works – the inclusion of certain terms, the frequency with which those terms are used – as well as data about the collected works as a whole, such as the use of specific words during historical periods. Elsewhere in the decision, he also notes that for the most part, copyright owners simply hold no interest in the type of information being provided by Google.
More troublesome for the Court is the question of the snippets, where clearly some portion of protected expression, not just metadata, is being provided to searchers. Here, Judge Leval holds that the provision of this contextual information is critical, and “adds important value,” to the transformative purpose of the search tool. As such, the first factor favors Google.
After finding that the second fair use factor (nature of the copyrighted work) played no role in this case, Judge Leval turns to the third factor (the amount and substantiality of the portion used), finding that the taking of the entirety of a work is justified where it is “literally necessary to achieve” the transformative purpose, such as for Google’s basic search functionality. Judge Leval’s approach mirrors that of other modern cases on digital copyright such as Perfect 10, Inc. v. Amazon.com, Inc. and Kelly v. Arriba Soft Corp., where any prior conceptual arguments against copying of works in their entirety have fallen in the face of technological uses that require access to the whole work in order to function.
With respect to the provision of snippets, Judge Leval acknowledges that depending on how such content was provided, such a use might not pass muster under the third factor. He therefore rests his conclusions on the specifics of the Google project, and the limitations imposed by it (described above). “The result of these restrictions is, so far as the record demonstrates, that a searcher cannot succeed, even after long extended effort to multiply what can be revealed, in revealing through a snippet search what could usefully serve as a competing substitute for the original.” Based on the evidence before the Court, no more than 16% of the original content could be provided through snippets, and this would require an extraordinary effort on the part of the user; moreover, the 16% would consist of “fragmentary and scattered” snippets that could not be considered “substantial” under the third factor.
Completing his fair use analysis, Judge Leval finds that the fourth factor favors fair use because the ability to search the text of the book to determine whether it includes selected words does not act as a substitute for the books themselves. He does note, however, that “[e]ven if the purpose of the copying is for a valuably transformative purpose, such copying might nonetheless harm the value of the copyrighted original if done in a manner that results in widespread revelation of sufficiently significant portions of the original as to make available a significantly competing substitute.” In the case of the Google snippets – again as currently offered – he concludes that while this function might possibly cause the loss of some sales, as a whole there would not be a significant effect on the potential market for the copyrighted work.
In addition to the fair use analysis, Judge Leval makes quick work of the plaintiffs’ argument that they have a derivative copyright interest in providing search and snippet functions as to their own works. He holds that “[t]he copyright resulting from the Plaintiffs’ authorship of their works does not include an exclusive right to furnish the kind of information about the works that Google’s programs provide to the public. For substantially the same reasons, the copyright that protects Plaintiffs’ works does not include an exclusive derivative right to supply such information through query of a digitized copy.” In other words, the right to create derivative works does not extend to a “right to supply information about that work of the sort communicated by Google’s search functions.”
Judge Leval expresses sympathy for the plaintiffs’ remaining argument, that the storage by Google (presumably forever) of digitized copies of plaintiffs’ books risks exposure to hackers and pirates. However, ultimately he finds that the concern is simply not supported by the evidence in the record, which instead supported Google’s argument that the digital files were secure.
Distribution to Libraries
With respect to plaintiffs’ claim that Google violates their copyright by distributing digital copies to the libraries that provide physical copies for scanning, Judge Leval describes Google’s activity as the “creation for each library of a digital copy of that library’s already owned book in order to permit that library to make fair use through provision of digital searches,” and he then finds this use to be non-infringing. In other words, Judge Leval posits Google as a service provider that merely digitizes works for institutions that want to make fair use of those works. The plaintiffs, to the contrary, argued that Google could not convert potential fair uses by the institutions into fair use by Google, which created the digital copies for its own commercial purposes. Rather than addressing this point specifically, the Court’s decision merely brushes aside the possibility that the libraries might use the digital copies to infringe as speculative.
One of the plaintiffs’ most compelling arguments, albeit one that was difficult to make to an appellate court reviewing a fair use decision on a specific evidentiary record, was that a ruling in favor of Google would open the floodgates for other parties to engage in mass digitization without putting in place all of the controls Google employs.
We are likely, in particular, to see a similar push to expand the definition of “snippet,” especially as search is applied to other elements besides mere text. What portions of a video, for instance, might be returned as a search result without infringing, and how much video before and after that portion could be considered necessary for context?
As always, those who seek to use copyrighted works for new and different purposes without permission will need to consider carefully whether they are using only the portions of the underlying works necessary to serve those purposes, and copyright holders will need to carefully consider which uses by third parties truly impact the market for the original work.