September 24, 2024
Almost two years ago, I wrote about LinkedIn’s suit against hiQ Labs, Inc. In that case, LinkedIn sued hiQ Labs for scraping its users’ public profiles and selling the results as part of an employee training and retention tool. There, the Court found that hiQ Labs violated the social media company’s terms of service because, as it states very clearly in LinkedIn’s user agreement, “NO SCRAPING.” (I’m paraphrasing, loudly.)
We now have a second court decision ruling against scraping — but for a very different reason than in the hiQ action.
This time, the venue is the 11th Circuit Court of Appeals and it’s that court’s second decision in the case since the dispute began in 2016. In its first decision (back in 2020) the 11th Circuit wrote: “Warning: This gets pretty dense (and difficult) pretty quickly.” That’s true! But don’t be scared. I think we can summarize it all succinctly without getting lost.
The plaintiff is Compulife Software, Inc., whose products are a database and software that allow licensees (generally, insurance agents) to compare life insurance quotes. These agents/licensees can incorporate Compulife’s products into their websites, but the public can also access Compulife’s products on its own site, www.term4sale.com.
The defendants are a group of individuals who used bots to scrape Compulife’s publicly accessible site and database and built their own, competing insurance quote site. This group (they never actually formed a business entity) obtained the source code for Compulife’s software under false pretenses. (One of the group’s members contacted Compulife, claiming that he worked for one of Compulife’s licensees, and asked for a copy of the source code. Compulife gave it to him.) The defendants used this code to engineer the scraping of Compulife’s website.
Based on this, Compulife accused the defendants of violating the federal Defend Trade Secrets Act, as well as the analogous Florida Uniform Trade Secrets Act. (There were also copyright infringement claims relating to defendants’ unauthorized use of Compulife’s software, but that’s for another day.) To prevail on either claim, Compulife had to establish that (1) it had a trade secret, and (2) the defendants misappropriated Compulife’s trade secret.
Initially, the District Court held that Compulife didn’t have a protectable trade secret because its entire database could be accessed by the public. However, in its 2020 decision, the Appeals Court reversed this, concluding the database was indeed a trade secret because, among other things, Compulife “goes to great lengths to secure its database” and because, even though the individual, publicly available quotes on the Compulife site were not trade secrets, Compulife’s compilation of them could be.
On this latest appeal, the main issue was whether the defendants’ use of bots to scrape Compulife’s database was misappropriation. The 11th Circuit, in addition to reaffirming its original holding that Compulife’s database was a trade secret, concluded that defendants misappropriated that secret when they used bots to “commit a scraping attack that acquired millions of variable-dependent insurance quotes.” That quantity was a key factor: As the Court wrote, “even if individual quotes that are publicly available lack trade secret status, the whole compilation of them (which would be nearly impossible for a human to obtain through the website without scraping) can still be a trade secret,” and the defendants’ use of bots to do what a human could not manually accomplish represented improper means.
The Appeals Court, however, was careful not to condemn scraping as a whole, writing “[i]t is important to note that scraping and related technologies (like crawling) may be perfectly legitimate.” (Italics from the court’s opinion).
This seems pretty straightforward, particularly given defendants’ acquisition of Compulife’s code under false pretenses. However, I’m curious to see future rulings that shed more light on when scraping is legitimate and, more importantly, on what factors courts look at to determine when scraping is OK and when it’s not. Is it the sheer volume of material taken? The impact on the plaintiff’s business? Something else?
When the 11th Circuit (or another court) enlightens us, I’m sure I’ll be back to write about it.
September 9, 2024
Last week, the U.S. Court of Appeals for the Second Circuit affirmed a federal judge’s March 2023 holding that the Internet Archive’s practice of digitizing library books and making them freely available to readers on a strict one-to-one ratio was not fair use. For reasons I’ll get into below, the outcome is pretty unsurprising. It’s also worth looking at because it likely previews some of the arguments we’ll hear in the case between the New York Times and OpenAI (creators of ChatGPT) and Microsoft if (or when) that case makes it to the Second Circuit. (Quick summary of my post on the subject: The New York Times Company filed suit late in December against Microsoft and several OpenAI affiliates, alleging that by using New York Times content to train its algorithms, the defendants infringed on the media giant’s copyrights, among other things.)
First, some background. The Internet Archive is a not-for-profit organization “building a digital library of Internet sites and other cultural artifacts in digital form” whose “mission is to provide Universal Access to All Knowledge.” To achieve this rather lofty goal, the Archive created its Open Library by scanning printed books in its possession or in the possession of a partner library and lending out one digital copy of a physical book at a time, in a system it dubs Controlled Digital Lending.
Enter COVID-19. During the height of the pandemic, when everyone was stuck at home without much to do, the Archive launched the National Emergency Library. This did away with Controlled Digital Lending and allowed almost unlimited access to each digitized book in its collection.
Not surprisingly, book publishers, who sell electronic copies of books to both individuals and libraries, were not thrilled. Four big-time publishers — Hachette, Penguin Random House, Wiley, and HarperCollins — sued the Internet Archive for copyright infringement, targeting both its National Emergency Library and Open Library as “willful digital piracy on an industrial scale.”
The Internet Archive responded that these projects constituted fair use and, therefore, did not infringe on the publishers’ copyrights. To back this up, the Archive claimed it was using technology “to make lending more convenient and efficient” because its work allowed users to do things that were not possible with physical books, such as permitting “authors writing online articles [to] link directly to” a digital book in the Archive’s library. The Archive also insisted its library was not supplanting the market for the publishers’ products.
The District Court rejected these arguments, holding that no case or legal principle supported the Archive’s defense that “lawfully acquiring a copyrighted print book entitles the recipient to make an unauthorized copy and distribute it in place of the print book, so long as it does not simultaneously lend the print book.” The judge also deemed the concept of Controlled Digital Lending “an invented paradigm that is well outside of copyright law.”
In affirming the District Court’s ruling, the Second Circuit applied the four-factor test for fair use that looks at: (1) the purpose and character of the use; (2) the nature of the copyrighted work; (3) the portion of the copyrighted work used (as compared to the entirety of the copyrighted work); and (4) the impact of the allegedly fair use on the potential market for or value of the copyrighted work.
The first factor — the purpose and character of the use — is broken down into two subsidiary questions: Does the new work transform the original, and is it of a commercial nature or is it for educational purposes? Neither the District Court nor the Court of Appeals bought the Internet Archive’s claim that its Open Library was transformative. The Court of Appeals held that the digital books provided by the Internet Archive “serve the same exact purpose as the original; making the authors’ works available to read.” (The Court of Appeals did find that, as a not-for-profit entity, the Internet Archive’s use of the books was not commercial.)
On the second factor, which is generally unimportant here, the Court of Appeals also found in favor of the publishers. Of greater significance is factor three, which looks at how much of the copyrighted work is at issue. Copying a sentence or a paragraph of a book-length work is more likely to be fair use than copying the entire book which, of course, is exactly what the Internet Archive was doing. Another win for the publishers.
And arguments on factor four, the impact on the market for the publishers’ products, didn’t work out any better for the Internet Archive. Notably, the Court of Appeals found that the Internet Archive was copying the publishers’ books for the exact same purpose as the original works offered by the publishers, thus naturally impacting their market and value.
So what are the takeaways here as we look ahead to the case between the New York Times and OpenAI/Microsoft?
On the one hand, OpenAI/Microsoft have copied entire articles from the Times (and the numerous other plaintiffs that are suing OpenAI and Microsoft), which will hurt OpenAI/Microsoft’s claims of fair use. Likewise, OpenAI/Microsoft’s fair use arguments won’t get very far if the Times can show that ChatGPT’s output is negatively impacting the market for its work or functioning as a substitute for journalism.
On the other hand, if OpenAI/Microsoft can show that ChatGPT’s output transformed the Times’ content, they may be able to prevail on fair use.
In any event, the case between OpenAI and Microsoft and The New York Times is likely to involve a lot more ambiguity than the Internet Archive matter, with the potential to result in new interpretations of copyright law with massive consequences for media and technology companies worldwide.
August 26, 2024
I spend a lot of time here nerding out about interesting cases and the many provocative types of legal conflict that continually arise. Keeping up with trending issues is an important part of what I do, and the latest disputes are more fascinating than ever.
But there’s another big part of my job that I talk about less that is just as captivating: Working with clients.
Why do I generally keep mum about this? Obviously, I can’t reveal any privileged client information. Also, litigation can be stressful, clients can sometimes have meltdowns and throw tantrums, and I’m not going to write about people’s bad moments even if they might be instructive for others. Finally, unlike reading and interpreting statutes and cases, working with clients isn’t something I learned in law school. It is a skill gradually acquired over years of practice and continual improvement. As a result, I (and most other lawyers I know) don’t really have an academic framework to organize and disseminate my expertise about working with clients.
But fear not: I’ve got a few things to share. Specifically, emotions, beliefs and behaviors I’ve seen that cause clients (and attorneys) needless stress and can make it harder to produce good results. Recognizing and anticipating these problem areas can help clients and attorneys have much better experiences as they navigate difficult litigation. (Also, it never hurts for me to put my thoughts down so that I can come back to them. Everybody wins!)
- The “It’s not fair!” syndrome. I think that a lot of people come to me feeling they’ve been treated in a way that is unfair and they expect “the law” to be on their side, and for lawyers and courts to make things right. In an ideal world that is exactly what would happen but, alas, as should be obvious to anyone over the age of 4, we don’t live in an ideal world. “The law” is made up of people. People with wildly differing beliefs and agendas. People who sometimes just plain get things wrong (that’s why we have appellate courts). Moreover, what’s fair to one person might not be fair to someone else. It’s important people put aside that powerful “it’s not fair!” feeling and focus instead on getting attainable, satisfactory results.
- “And another thing!” A lot of times people are determined to tell the opposite side in a dispute everything they’ve done wrong. But, in my experience, not every little thing matters. It is better to have one or two really good examples of why you’re right and/or the other side is wrong rather than throw everything but the kitchen sink at them. Doing so cuts down on needless back and forth and keeps the focus on those points that have power to change the situation. Plus, keeping some weapons in your arsenal in case you need them later is always a good idea.
- “Same thing, same result.” Often, when people come to me, they’ve already spent a lot of time going back and forth with their adversary and discussions have fallen into a predictable pattern. For example, your side keeps asking for information and the other side keeps ignoring these requests. And on and on. If you keep doing the same thing, you’ll probably keep getting the same result. That’s frustrating. If you want something different to happen, you and your attorney need to be willing to try something different.
- “I’m not going to tell you.” If you’re a client, err on the side of telling your attorney too much, not too little. I cannot stress this enough. It’s much harder for me to help you solve a problem if I don’t have all the relevant information about the dispute, the opponent, and yourself. Anything can come up in a case, and the more unexpected it is, the more detrimental it can be — and the more stressful for my clients and myself. If I know about it, I can anticipate and plan for it.
- Finally, it’s important to draw lines (I’m not going to say, “in the sand!”). When you’re making demands of your adversary or laying out expected results, set boundaries and stick with them (of course, always be willing to adjust if you receive new information). If you don’t enforce a boundary, it can be a lot harder to get the other side to believe that this time you really mean it. When a client panics and suddenly wants to cave in on something their attorney doesn’t want to budge on, it causes tension between the two of you and jeopardizes your negotiating power going forward.
In all this, there’s a difference between understanding potential behaviors and eliminating them. But recognizing these patterns is definitely a productive first step toward ensuring a smoother, less stressful process for everyone involved in a litigation.