Perplexity and the Perplexing Legalities of Data Scraping

Of the many lawsuits media giants have filed against AI companies for copyright infringement, the one filed by Dow Jones & Co. (publisher of the Wall Street Journal) and NYP Holdings Inc. (publisher of the New York Post) against Perplexity AI adds a new wrinkle. 

Perplexity is a natural-language search engine that generates answers to user questions by scraping information from sources across the web, synthesizing the data and presenting it in an easily digestible chatbot interface. Its makers call it an “answer engine” because it’s meant to function like a mix of Wikipedia and ChatGPT. The plaintiffs, however, call it a thief that is violating Internet norms to take their content without compensation.

To me, this represents a particularly stark example of the problems with how AI platforms are operating vis-a-vis copyrighted materials, and one well worth analyzing.

According to its website, Perplexity pulls information “from the Internet the moment you ask a question, so information is always up-to-date.” Its AI seems to work by combining a large language model (LLM) with retrieval-augmented generation (RAG — oh, the acronyms!). As this is a blog about the law, not computer science, I won’t get too deep into this but Perplexity uses AI to improve a user’s question and then searches the web for up-to-date info, which it synthesizes into a seemingly clear, concise and authoritative answer. Perplexity’s business model appears to be that people will gather information through Perplexity (paying for upgraded “Pro” access) instead of doing a traditional web search that returns links the user then follows to the primary sources of the information (which is one way those media sources generate subscriptions and ad views).

Part of this requires Perplexity to scrape the websites of news outlets and other sources. Web scraping is an automated method to quickly extract large amounts of data from websites, using bots to find requested information by analyzing the HTML content of web pages, locating and extracting the desired data and then aggregating it into a structured format (like a spreadsheet or database) specified by the user. The data acquired this way can then be repurposed as the party doing the gathering sees fit. Is this copyright infringement? Probably, because copyright infringement is when you copy copyrighted material without permission. 
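For the technically curious, here is a minimal sketch of the scraping steps just described: parse the HTML, pull out the desired pieces, and collect them into a structured format. It is only an illustration using Python's standard library. The sample markup, the "headline" and "byline" class names, and the function names are all made up for this example, and nothing here is meant to describe how Perplexity or any other real scraper works.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page markup, invented for illustration only.
SAMPLE_HTML = """
<html><body>
  <article><h2 class="headline">Markets rally</h2><span class="byline">A. Writer</span></article>
  <article><h2 class="headline">Storm hits coast</h2><span class="byline">B. Reporter</span></article>
</body></html>
"""


class ArticleParser(HTMLParser):
    """Collects the text of elements whose class is 'headline' or 'byline'."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per <article> element
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.rows.append({"headline": "", "byline": ""})
        css_class = dict(attrs).get("class")
        if css_class in ("headline", "byline") and self.rows:
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("h2", "span"):
            self._field = None


def scrape(html):
    """Return a list of {'headline': ..., 'byline': ...} dicts found in html."""
    parser = ArticleParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Aggregate the extracted rows into spreadsheet-style CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["headline", "byline"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = scrape(SAMPLE_HTML)
print(to_csv(rows))
```

A real scraper would first download pages over the network and would run across thousands of them, which is where the speed and volume come from. The extraction step itself looks much like this.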

To make matters worse, at least according to Dow Jones and NYP Holdings, Perplexity seems to have ignored the Robots Exclusion Protocol. This is a voluntary standard, implemented through a website’s robots.txt file, that tells crawlers and scraping bots which parts of a site they should stay out of. However, despite the fact that these media outlets deploy this protocol, Perplexity spits out verbatim copies of some of the Plaintiffs’ articles and other materials.
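To make the Robots Exclusion Protocol a little more concrete, here is a small sketch of how a well-behaved bot might check a site's robots.txt rules before fetching a page, using Python's standard urllib.robotparser module. The rules, the bot names ("ExampleBot", "OtherBot") and the URLs are hypothetical and are not taken from any of the parties' actual websites.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, user_agent, url):
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
article = "https://news.example.com/articles/today"

print(may_fetch(parser, "ExampleBot", article))  # ExampleBot is barred from the whole site
print(may_fetch(parser, "OtherBot", article))    # other bots may read articles
print(may_fetch(parser, "OtherBot", "https://news.example.com/private/x"))  # but not /private/
```

Note that the check is something the bot chooses to run. Nothing in the protocol itself physically stops a scraper that skips it, which is why ignoring robots.txt is framed as a violation of Internet norms rather than as breaking through a lock.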

Of course, Perplexity has a defense, of sorts. Its CEO accuses the Plaintiffs and other media companies of being incredibly shortsighted, and wishing for a world in which AI didn’t exist. Perplexity says that media companies should work with, not against, AI companies to develop shared platforms. It’s not entirely clear what financial incentives Perplexity has offered or will offer to these and other content creators.

If anything, it seems like Perplexity is the one that is incredibly shortsighted. The whole premise of copyright law is that if people are economically rewarded they will create new, useful and insightful (or at least, entertaining) materials. If Perplexity had its way, these creators would either not be paid at all or have to accept whatever it is that Perplexity deigns to offer. Presumably, this would not end well for the content creators and, eventually, there would be no more reliable, up-to-date information to scrape. Moreover, Perplexity’s self-righteous claim that media companies just want to go back to the Stone Age (i.e., the 20th century) seems premised on a desire for a world in which the law allows anyone who wants copyrighted material to just take it without paying for it. And that’s not how the world works, at least for now.

Happier Life, Better Business

The year is drawing to a close, which means I’m looking back on the good and the bad of 2024 and trying to focus, naturally, on the good. Among the good things of 2024 are three key realizations that have helped improve my legal practice. I’m sharing because I think they can be of value to anyone starting, growing, or managing a business. 

One thing that really hit home this year is that when you’re a business owner, business problems are personal problems (and vice versa). This isn’t because a personal problem means that I’m making less money or that I take every frustrating or difficult situation personally. 

I, like most everyone else, enjoy doing the things I’m good at and don’t like to do stuff that feels hard or stresses me out. But, as a business owner, just because something isn’t easy doesn’t mean I can avoid dealing with it. I still have to either slog through it or go back to bed and pull the covers up over my head (I never do that but hey, technically it’s an option). 

There is, however, a third and, IMHO, better option: Understand why the task is hard and figure out how to make it less hard.

As I’ve talked about before, I used to dread posting to this blog because I was sure that some anonymous Internet troll was going to get offended by something I said, scold me for getting a fact wrong or get all huffy over a misplaced comma. I became so focused on not upsetting anyone or making mistakes that I ended up churning out some pretty pedestrian content. Worse, it took me FOREVER and a day to write anything because I obsessively examined and reexamined every damn word. Unsurprisingly, this did not make it easy to regularly post new material.

Acknowledging these feelings was a huge first step in overcoming them. It enabled me to look at my fears objectively and consider if there was any actual data to support their existence (surprise: there wasn’t!). Ultimately, addressing these personal fears and starting to make more regular and compelling blog posts turned out to have huge results for my business, as this blog has measurably helped attract new clients (for which I am extremely grateful). 

The second big revelation is that it’s not only okay to be choosy when taking on clients, it’s critical for my sanity and my firm’s success. For a long time, I operated as if every potential client might be my last. Irrational, to be sure, but also pretty normal. As a result, I felt like I was endangering my business and financial future if I didn’t say yes to any matter that even vaguely fit into my area of expertise. That meant ending up saddled with work that wasn’t profitable or, worse, made me miserable because I either wasn’t interested in the subject or the client didn’t value my insights, knowledge or ideas. Perfect example: Have I litigated securities fraud issues? Sure. Could I do so again? Of course. Do I want to? No! Securities fraud cases are not something I enjoy, nor will they lead to more of the cases I thrive on. In other words, while taking on cases or clients that aren’t a good fit may put money into my pocket in the short term, they don’t result in work I can excel at and people I enjoy working with. That’s where I need to focus my attention. Now, I am way more selective, and while I know that turning down work sounds a little crazy if you’re just starting your own business, it’s been a game changer for me. 

Which leads to my final big discovery of the year: By saying no to things that don’t serve my firm’s (and ultimately, my own) long-term interests, I have more time to focus on doing and getting work that I DO want. My time and energy are finite resources (this is really the BIG realization) and by using them more efficiently I’ve seen rapid, tangible results in the growth of my practice. I’m happier, my clients are happier, and my family is happier. And that’s ALL good, this year and for the years to come.

OpenAI’s Texts and DMs: Business or Personal?

If you’ve been following this blog, you’re familiar with the copyright infringement cases the New York Times and the Authors Guild have brought against OpenAI, makers of ChatGPT. So familiar, in fact, I won’t summarize these suits again. You can find a prior post about these cases here. The current dispute is interesting, at least to me (social media + law = fun for a nerd like me!) because it is another data point on how courts grapple with the blurry line between business and personal communications on social media.

Taking a step back for the non-litigators and non-lawyers in the room: In litigation, the parties must exchange materials that could have a bearing on the case. This generally covers a pretty broad range of materials and requires each party to produce all such materials that are in its “possession, custody, or control.” A party can also subpoena a non-party to the case for relevant materials in the non-party’s “possession, custody, or control.” However, where possible, it’s generally better to get discovery materials from a party instead of a non-party.

Turning back to the cases against OpenAI, the Authors Guild asked the tech company to produce texts and social media direct messages from more than 30 current and former employees, including some of the company’s top executives. It claims these communications may shed light on the issues in the case.

OpenAI has pushed back strongly. It claims that its employees’ social media accounts and personal phones are, well, personal and, therefore, not in its control. It also contends the Guild’s request might intrude on these persons’ privacy. OpenAI also rejects the Guild’s assumption that OpenAI’s search of its internal materials relevant to the case will be inadequate without its employees’ and former employees’ texts and DMs. It sniffs that the Guild should wait until it receives OpenAI’s documents before presuming as much (how rude!). 

The Authors Guild has responded by pointing to OpenAI employees’ posts on X (yes, formerly Twitter) that clearly indicate they used their “personal” social media for work purposes. Same goes for their phones which, while they may not be paid for by the company, seem to have been used to text about business. 

So, who’s right here? For starters, it seems pretty likely that, at least for current OpenAI employees, OpenAI could just tell people to turn over DMs and text messages. Assuming the employees don’t object or refuse, this should be enough to establish that OpenAI has “control.” Refusing to produce DMs and text messages without apparently having taken this basic step seems like a really good way to piss off the Magistrate Judge hearing this issue, especially if the employees violated OpenAI policies requiring work-related communications to take place on devices and accounts owned by the company (it should have such policies if it doesn’t!) or if the communications were clearly within the scope of an employee’s employment. Without that basic showing, it seems likely that the Authors Guild will prevail.

If it does (or if it doesn’t) there will be more about it here!

IP in a Partnership: Who Owns What?

I talk a lot here about aspects of intellectual property law. It’s an area I find pretty fascinating because it has to do with how a society encourages people to create, and the law embodies beliefs about how to accomplish that. I also talk a lot about partnership disputes which, along with IP work, form a big part of my practice.

Sometimes, when you put two good things together you get something great (Reese’s!). Other times, though, you just get a mess. (Melted chocolate in your pocket? OK, I’ll stop now.) Often, it’s my job to sort out the issues created when partnership disagreements intertwine with intellectual property issues — specifically, who owns a company’s IP when a partnership falls apart.

In such disputes, there are a few rules that usually apply. I’ve found these are often unknown to or misunderstood by the people involved in these scenarios. So let’s run through them.

  1. Just because two people or a larger group didn’t formally register a company doesn’t mean there isn’t a partnership. In New York (where I primarily practice) and in other states, courts can find that people entered into a partnership even if they never filed paperwork to create a business entity. There is a range of factors that can come into play here but, in general, courts will look at whether the parties shared the business’s profits and losses; jointly managed or controlled the business; contributed money to the business; and/or whether they intended to be partners. Why does this matter? Because, during the existence of a partnership, the partners owe each other fiduciary duties, meaning they must treat each other fairly and, importantly, no individual partner can claim the partnership’s property for herself or himself.

  2. Thus, even if a partner registers a partnership’s trademark in her or his name, that trademark belongs to the partnership, not to the individual partner. For example, if a business operates under or sells a product with a name and/or logo, one of the members of the business can’t take ownership of that name or logo by individually obtaining a trademark registration for it. Nor can they exclude other members of the business from using the name or logo if the partnership breaks up.

  3. Copyright rules are different! Generally speaking, a copyright vests in the creator, not the company. This means that if partners (either individually or together) create a work that is copyrighted or copyrightable, the copyright goes to the creator or creators, not the business. Moreover, under copyright law, transferring a copyright requires a written document, so if any owner wants to transfer a copyrighted work from themselves to the business, they need to have a document that says so.

  4. On a related note, just because something is created by a partner under the auspices of the business doesn’t mean it’s a “work for hire” and thus belongs to the business from the moment of its creation. Something only becomes a work for hire in two situations: (a) if it’s prepared by an employee within the scope of his or her employment; or (b) if it’s specially commissioned, falls into one of the categories of works listed in the Copyright Act, and there’s a signed written agreement stating that the material is a work for hire.

  5. Finally, the idea for a business is usually not protectable because, in general, ideas are not protectable intellectual property (I know, that sounds counterintuitive). Copyright law protects the expression of an idea, not the idea itself. So if you say to a friend, “Hey, we should open a business making ice cream for cats,” and your friend goes out and starts up Kitty Kreameries, you’re not entitled to any ownership of it. You have to put in the work and actually do the thing, not just think of the thing.  

No one starts a business with others expecting things to turn sour. But it happens a LOT. So the overall lesson here: If you’re entering into or already in a business with others, whether you’ve formally created it or not, be aware of what belongs to you and what belongs to the business as a whole so you won’t be taken by surprise if it all comes crashing down someday.

Digging Through Yesterday to Plan for Tomorrow

First, a disclaimer: bear with me on this one. Even though I start off with descriptions of the various offices I’ve inhabited since 2021 and my struggles furnishing them, the tale does lead to some lessons that are worth thinking about as we prepare for the inevitable onslaught of articles and emails about how to plan for 2025. 

Like many people with desk jobs, I worked from home during the pandemic. It wasn’t a big deal, as I was used to meetings on Zoom and my bookkeeper, assistant, and paralegal had always been remote. 

In the fall of 2021, as COVID was starting to ease, New York City decided to install a new water main outside my bedroom/office. The ensuing construction cacophony was the end of my working from home.

I moved into a private office within a small shared space that was pretty great in many ways. It had a big window, a lovely view of the Manhattan skyline and one of my neighbors was a floral designer, which meant I frequently had fresh flowers in my office. However, there were rarely any other people around, so it still felt like I was stuck in my bedroom. After that, I moved to another space with the hopes that I would have a regular officemate. Unfortunately, that didn’t work out as planned and I found myself still mostly alone every day. 

About a year ago, I moved once again to my current office, which is in downtown Brooklyn. Third time is indeed the charm. There’s a nice mix of having other people around, but a door I can close when I’m on the phone or need to concentrate.

Even with this upgrade, though, my actual office was pretty bleak. My furniture amounted to a junky old filing cabinet, a hand-me-down bookshelf, and a depressingly blank Zoom background. Mostly, this was because I just hadn’t had time to find furniture that I like.

Recently this changed. I finally had some time to buy a new bookcase and filing cabinet. They’re quite nice and certainly a big improvement over my prior decor. 

Of course, these purchases meant I had to transfer everything from the old furniture to the new. That archaeological dig unearthed a bunch of articles I had printed out and hand-scrawled notes I’d thrown in a folder to come back to later. As I read through this collection, I quickly realized almost all this material had to do, in some way, with growing a business. I soon became thoroughly engrossed in reading, stopped checking my email, let my computer go to sleep, and left my phone on the other side of the room. 

It was an interesting journey through the past few years of my practice. Some of these articles and ideas were no longer relevant, as they contained ideas or advice I’d tried that didn’t work for me, or experiences I’ve subsequently written about here. But a lot of it still resonated and, as I worked my way through this stuff, it became pretty clear that there were some recurring themes. Nothing particularly earthshaking or radical, but ideas that are definitely worth revisiting. More importantly, the process — particularly being separated from my phone and other distractions — allowed me to step back and see connections that I had forgotten about or previously missed.

So what are the lessons here? First, creating a strategy for growing a business isn’t a one-and-done deal. What worked a few months ago might not work now, or could be ripe for further improvement. Through my review of this collection of material, I could see the evolution in my thinking and approach, and sort out what worked, examine whether improvements were possible, and chuck the stuff that didn’t work or was no longer relevant.

In the next two-and-a-half months we’re all going to be bombarded with articles, commercials, and general blather advising us to plan for 2025, and my experience reading my articles and notes reinforced how you can’t plan for the future without assessing where you’ve been. Looking back on decisions and moves I’ve made is essential for taking stock of what works (and what doesn’t) and how to deploy resources in the future. Simply having an idea once, implementing it and never reexamining it can too easily lead to stagnation. 

And what’s the best way to do this? By freeing ourselves from distraction! Stepping back from our phones and computers (and even some of the idle office chitchat I now enjoy that I missed so much during the pandemic) allows us to get new perspectives and see the connections between what we’ve done before and the results it has led to. Because the past is the strongest foundation we can build upon for the future.

Too Much Information: Social Media Subverts the Statute of Limitations for Defamation Suits

Over the last few years there have been several cases of professional models suing “gentlemen’s clubs” (a/k/a strip clubs) for defamation. These suits involve the clubs grabbing the models’ pics off the Internet and using them on social media to promote their entertainment. (Weirdly, all of these suits are against strip clubs in New England. Draw your own conclusions.) None of this is particularly surprising. However, one current case has raised the interesting question of when the statute of limitations begins to run on defamation claims stemming from social media posts. 

In this case, five models are suing Club Alex in Stoughton, Massachusetts, alleging the club used their photos in Facebook posts, creating the impression the models worked as dancers there. That’s defamation! 

The club pushed back, noting that the offending posts were made between 2013 and 2015, but the models didn’t bring the lawsuit until 2021 — well after the three-year statute of limitations for defamation claims in Massachusetts had expired. On those grounds, the federal Court hearing the case initially granted the club’s motion to dismiss. 

The models asked the Court to reconsider that decision. And, amazingly, the Court did! 

Why would a federal judge basically admit, “OK, maybe I was wrong”? In a nutshell, here’s why: In some cases, Massachusetts (and most other states) uses a “discovery rule” to determine when the statute of limitations starts to run. This avoids the unfairness of having statutes of limitations expire before a “Plaintiff knew or reasonably should have known that she may have been harmed by the conduct of another.” The models argued that this should also apply here because the vast ocean of information on social media meant they didn’t know (and couldn’t be reasonably expected to know) about the misappropriation of their images until years after the posts. What’s more, even if they had suspected misuse of the images, it’s very difficult to manually search thousands of strip clubs’ social media pages and websites, especially when search engines can’t search images without names.
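To see why the starting point matters so much, here is a rough sketch of the arithmetic in Python. The three-year period, the 2013 to 2015 posts and the 2021 filing come from the case as described above. The exact days, the discovery date and the function names are assumptions picked purely for illustration. Real limitations analysis involves much more than adding three years to a date.

```python
from datetime import date

LIMITATIONS_YEARS = 3  # the three-year period described above


def deadline(accrual, years=LIMITATIONS_YEARS):
    """Return the accrual date shifted forward by the limitations period."""
    try:
        return accrual.replace(year=accrual.year + years)
    except ValueError:
        # Accrual on Feb 29 and the target year is not a leap year.
        return accrual.replace(year=accrual.year + years, day=28)


def is_timely(filed, accrual):
    """True if the suit was filed on or before the computed deadline."""
    return filed <= deadline(accrual)


posted = date(2015, 12, 31)    # assumed: the latest a 2015 post could be
filed = date(2021, 1, 1)       # assumed: the earliest a 2021 filing could be
discovered = date(2020, 1, 1)  # hypothetical date the models learned of the posts

print(is_timely(filed, accrual=posted))      # clock starts at publication
print(is_timely(filed, accrual=discovered))  # clock starts at discovery
```

With these assumed dates, the first check comes out False and the second True: the same lawsuit is years too late if the clock starts when the post goes up, and comfortably on time if it starts when the models found out.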

Recognizing the models’ point, the District Court sent the issue to the Supreme Judicial Court (SJC) of Massachusetts, the highest court in that state, asking “under what circumstances, if any, is material publicly posted to social media platforms ‘inherently unknowable’ for purposes of applying the discovery rule in the context of defamation, right of publicity, right to privacy and related tort claims?”

The SJC held that, in the context of social media posts, a determination of when the statute of limitations begins to run should not be based on the date of publication, but rather “requires a fact-intensive, totality of the circumstances analysis to determine what the Plaintiff knew or should have known about the social media publication.” (It noted that this is not required where postings are widely available and readily searchable). 

The SJC instructed judges faced with this issue to consider things like: “how widespread the distribution was;” whether the posting could be readily located by a search; if there is technology that could assist in locating potentially offending posts; and how widely the images are distributed and, thus, how hard or easy it is to separate authorized uses from unauthorized uses. 

Here, this means that the models can continue to pursue their defamation claims against the club.

A final thought: In a way, this is the flipside of the Netflix case involving their series Baby Reindeer and the lawsuit against them by Fiona Harvey, which I wrote about here. In that case, the information on social media enabled Internet sleuths to out someone whose identity was meant to be concealed, whereas in this case, the volume of information on social media makes it more difficult to find out when someone’s persona is being used without their knowledge. Whichever way you look at it, one thing is certain: Controlling our identities (and our lives) is waaay harder than it used to be.

Data Scraping: How Much is Too Much?

Almost two years ago, I wrote about LinkedIn’s suit against hiQ Labs, Inc. In that case, LinkedIn sued hiQ Labs for scraping its users’ public profiles and selling the results as part of an employee training and retention tool. There, the Court found that hiQ Labs violated the social media company’s terms of service because, as it states very clearly in LinkedIn’s user agreement, “NO SCRAPING.” (I’m paraphrasing, loudly.)

We now have a second court decision ruling against scraping — but for a very different reason than in the hiQ action. 

This time, the venue is the 11th Circuit Court of Appeals and it’s that court’s second decision in the case since the dispute began in 2016. In its first decision (back in 2020) the 11th Circuit wrote: “Warning: This gets pretty dense (and difficult) pretty quickly.” That’s true! But don’t be scared. I think we can summarize it all succinctly without getting lost. 

The plaintiff is Compulife Software, Inc., whose products are a database and software that allows licensees (generally, insurance agents) to compare life insurance quotes. These agents/licensees can incorporate Compulife’s products into their websites, but the public can also access Compulife’s products on its own site, www.term4sale.com. 

The defendants are a group of individuals who used bots to scrape Compulife’s publicly accessible site and database and built their own, competing insurance quote site. This group (they never actually formed a business entity) obtained the source code for Compulife’s software under false pretenses. (One of the group’s members contacted Compulife, claiming that he worked for one of Compulife’s licensees, and asked for a copy of the source code. Compulife gave it to him.) The defendants used this code to engineer the scraping of Compulife’s website.

Based on this, Compulife accused the defendants of violating the federal Defend Trade Secrets Act, as well as the analogous Florida Uniform Trade Secrets Act. (There were also copyright infringement claims relating to defendants’ unauthorized use of Compulife’s software, but that’s for another day). To prevail on either claim, Compulife had to establish that (1) it had a trade secret, and (2) the defendants misappropriated Compulife’s trade secret. 

Initially, the District Court held that Compulife didn’t have a protectable trade secret because its entire database could be accessed by the public. However, in its 2020 decision, the Appeals Court reversed this, concluding the database was indeed a trade secret because, among other things, Compulife “goes to great lengths to secure its database” and that even though the individual, publicly-available quotes on the Compulife site were not trade secrets, Compulife’s compilation of them could be. 

On this latest appeal, the main issue was whether the defendants’ use of bots to scrape Compulife’s database was misappropriation. The 11th Circuit, in addition to reaffirming its original holding that Compulife’s database was a trade secret, concluded that defendants misappropriated that secret when they used bots to “commit a scraping attack that acquired millions of variable-dependent insurance quotes.” That quantity was a key factor: As the Court wrote, “even if individual quotes that are publicly available lack trade secret status, the whole compilation of them (which would be nearly impossible for a human to obtain through the website without scraping) can still be a trade secret,” and the defendants’ use of bots to do what a human could not manually accomplish represented improper means.

The Appeals Court, however, was careful not to condemn scraping as a whole, writing “[i]t is important to note that scraping and related technologies (like crawling) may be perfectly legitimate.” (Italics from the court’s opinion).

This seems pretty straightforward, particularly given defendants’ acquisition of Compulife’s code under false pretenses. However, I’m curious to see future rulings that shed more light on when scraping is legitimate and, more importantly, what factors courts look at to determine when scraping is OK and when it’s not. Is it the sheer volume of material taken? The impact on the plaintiff’s business? Something else?

When the 11th Circuit (or another court) enlightens us, I’m sure I’ll be back to write about it. 

Fair Use or Foul Play? Free Books Cross the Line

Last week, the U.S. Court of Appeals for the Second Circuit affirmed a federal judge’s March 2023 holding that the Internet Archive’s practice of digitizing library books and making them freely available to readers on a strict one-to-one ratio was not fair use. For reasons I’ll get into below, the outcome is pretty unsurprising. It’s also worth looking at because it likely previews some of the arguments we’ll hear in the case between the New York Times and OpenAI (creators of ChatGPT) and Microsoft if (or when) that case makes it to the Second Circuit. (Quick summary of my post on the subject: The New York Times Company filed suit late in December against Microsoft and several OpenAI affiliates, alleging that by using New York Times content to train its algorithms, the defendants infringed on the media giant’s copyrights, among other things.)

First, some background. The Internet Archive is a not-for-profit organization “building a digital library of Internet sites and other cultural artifacts in digital form” whose “mission is to provide Universal Access to All Knowledge.” To achieve this rather lofty goal, the Archive created its Open Library by scanning printed books in its possession or in the possession of a partner library and lending out one digital copy of a physical book at a time, in a system it dubs Controlled Digital Lending. 

Enter COVID-19. During the height of the pandemic, when everyone was stuck at home without much to do, the Archive launched the National Emergency Library. This did away with Controlled Digital Lending and allowed almost unlimited access to each digitized book in its collection. 

Not surprisingly, book publishers, who sell electronic copies of books to both individuals and libraries, were not thrilled. Four big-time publishers — Hachette, Penguin Random House, Wiley, and HarperCollins — sued the Internet Archive for copyright infringement, targeting both its National Emergency Library and Open Library as “willful digital piracy on an industrial scale.”

The Internet Archive responded that these projects constituted fair use and, therefore, did not infringe on the publishers’ copyrights. To back this up, the Archive claimed it was using technology “to make lending more convenient and efficient” because its work allowed users to do things that were not possible with physical books, such as permitting “authors writing online articles [to] link directly to” a digital book in the Archive’s library. The Archive also insisted its library was not supplanting the market for the publishers’ products.

The District Court rejected these arguments, holding that no case or legal principle supported the Archive’s defense that “lawfully acquiring a copyrighted print book entitles the recipient to make an unauthorized copy and distribute it in place of the print book, so long as it does not simultaneously lend the print book.” The judge also deemed the concept of Controlled Digital Lending “an invented paradigm that is well outside of copyright law.”

In affirming the District Court’s ruling, the Second Circuit applied the four-part test for fair use that looks at: (1) the purpose and character of the use; (2) the nature of the copyrighted work; (3) the portion of the copyrighted work used (as compared to the entirety of the copyrighted work); and (4) the impact of the allegedly fair use on the potential market for or value of the copyrighted work.

The first factor — the purpose and character of the use — is broken down into two subsidiary questions: Does the new work transform the original, and is it of a commercial nature or is it for educational purposes? Neither the District Court nor the Court of Appeals bought the Internet Archive’s claim that its Open Library was transformative. The Court of Appeals held that the digital books provided by the Internet Archive “serve the same exact purpose as the original; making the authors’ works available to read.” (The Court of Appeals did find that, as a not-for-profit entity, the Internet Archive’s use of the books was not commercial.) 

On the second factor, which is generally unimportant here, the Court of Appeals also found in favor of the publishers. Of greater significance is factor three, which looks at how much of the copyrighted work is at issue. Copying a sentence or a paragraph of a book-length work is more likely to be fair use than copying the entire book which, of course, is exactly what the Internet Archive was doing. Another win for the publishers.

And arguments on factor four (the impact on the market for the publishers’ products) didn’t work out any better for the Internet Archive. Notably, the Court of Appeals found that the Internet Archive was copying the publishers’ books for the exact same purpose as the original works offered by the publishers, thus naturally impacting their market and value.

So what are the takeaways here as we look ahead to the case between the New York Times and OpenAI/Microsoft?

On the one hand, OpenAI/Microsoft have copied entire articles from the Times (and from the numerous other plaintiffs that are suing OpenAI and Microsoft), which will hurt OpenAI/Microsoft’s claims of fair use. Likewise, OpenAI/Microsoft’s fair use arguments won’t get very far if the Times can show that ChatGPT’s output is negatively impacting the market for its work or functioning as a substitute for journalism.

On the other hand, if OpenAI/Microsoft can show that ChatGPT’s output transformed the Times’ content, they may be able to prevail on fair use.

In any event, the case between OpenAI and Microsoft and The New York Times is likely to include a lot more ambiguity than in the Internet Archive matter, with the potential to result in new interpretations of copyright law with massive consequences for media and technology companies worldwide.

Five Tips for Happier Clients (And More Productive Cases)

I spend a lot of time here nerding out about interesting cases and the many provocative types of legal conflict that continually arise. Keeping up with trending issues is an important part of what I do, and the latest disputes are more fascinating than ever. 

But there’s another big part of my job that I talk about less that is just as captivating: Working with clients. 

Why do I generally keep mum about this? Obviously, I can’t reveal any privileged client information. Also, litigation can be stressful, clients can sometimes have meltdowns and throw tantrums, and I’m not going to write about people’s bad moments even if they might be instructive for others. Finally, unlike reading and interpreting statutes and cases, working with clients isn’t something I learned in law school. It is a skill gradually acquired over years of practice and continual improvement. As a result, I (and most other lawyers I know) don’t really have an academic framework to organize and disseminate my expertise about working with clients.

But fear not: I’ve got a few things to share. Specifically, emotions, beliefs and behaviors I’ve seen that cause clients (and attorneys) needless stress and can make it harder to produce good results. Recognizing and anticipating these problem areas can help clients and attorneys have much better experiences as they navigate difficult litigation. (Also, it never hurts for me to put my thoughts down so that I can come back to them. Everybody wins!) 

  1. The “It’s not fair!” syndrome. I think that a lot of people come to me feeling they’ve been treated in a way that is unfair and they expect “the law” to be on their side, and for lawyers and courts to make things right. In an ideal world that is exactly what would happen but, alas, as should be obvious to anyone over the age of 4, we don’t live in an ideal world. “The law” is made up of people. People with wildly differing beliefs and agendas. People who sometimes just plain get things wrong (that’s why we have appellate courts). Moreover, what’s fair to one person might not be fair to someone else. It’s important that people put aside that powerful “it’s not fair!” feeling and focus instead on getting attainable, satisfactory results.

  2. “And another thing!” A lot of times people are determined to tell the opposite side in a dispute everything they’ve done wrong. But, in my experience, not every little thing matters. It is better to have one or two really good examples of why you’re right and/or the other side is wrong rather than throw everything but the kitchen sink at them. Sticking to your strongest examples cuts down on needless back and forth and keeps the focus on those points that have power to change the situation. Plus, keeping some weapons in your arsenal in case you need them later is always a good idea.

  3. “Same thing, same result.” Often, when people come to me, they’ve already spent a lot of time going back and forth with their adversary and discussions have fallen into a predictable pattern. For example, your side keeps asking for information and the other side keeps ignoring these requests. And on and on. If you keep doing the same thing, you’ll probably keep getting the same result. That’s frustrating. If you want something different to happen, you and your attorney need to be willing to try something different.

  4. “I’m not going to tell you.” If you’re a client, err on the side of telling your attorney too much, not too little. I cannot stress this enough. It’s much harder for me to help you solve a problem if I don’t have all the relevant information about the dispute, the opponent, and you. Anything can come up in a case, and the more unexpected it is, the more detrimental it can be, and the more stressful for my clients and me. If I know about it, I can anticipate and plan for it.

  5. Finally, it’s important to draw lines (I’m not going to say, “in the sand!”). When you’re making demands of your adversary or laying out expected results, set boundaries and stick with them (of course, always be willing to adjust if you receive new information). If you don’t enforce a boundary, it can be a lot harder to get the other side to believe that this time you really mean it. When a client panics and suddenly wants to cave in on something their attorney doesn’t want to budge on, it causes tension between the two of you and jeopardizes your negotiating power going forward.

In all this, there’s a difference between understanding potential behaviors and eliminating them. But recognizing these patterns is definitely a productive first step toward ensuring a smoother, less stressful process for everyone involved in a litigation.

A TikTok “Aesthetic” Goes to Court

Everything should be clicking (as it were) for TikTok influencer Sydney Nicole Gifford. She has half a million followers who eat up her posts promoting home and fashion items from Amazon, propelling her to the kind of celebrity that garnered coverage in People for her pregnancy. But alas, Gifford is apparently a little too influential.

She claims fellow TikToker Alyssa Sheil is copying her posts and using Gifford’s visual style to promote the same products! And yes, Gifford is now suing Sheil, in a case that could shake up the world of social media influencers and potentially make it harder for influencers to create content without fear of accusations of copying. 

According to the complaint, which was filed in District Court in Texas, Gifford “spends upwards of ten hours a day, seven days a week, researching unique products and services that may fit her brand identity, testing and assessing those products and services, styling photos and videos promoting such products and editing posts…” for social media. As a result, according to the complaint, “Sydney has become well-known for promoting certain goods from Amazon, including household goods, apparel, and accessories, through original photo and video works…” 

The lawsuit goes on to allege that defendant Sheil “replicated the neutral, beige, and cream aesthetic of [Gifford’s] brand identity, featured the same or substantially [the same] Amazon products promoted by [Gifford], and contained styling and textual captions replicating those of [Gifford’s] posts.” It says at least 40 of Sheil’s posts feature “identical styling, tone, camera angle and/or text” to Gifford’s. Here’s a pretty obvious one, with Gifford on the left and Sheil on the right.

In the suit, Gifford is claiming, among other things, trade dress infringement, violation of the Digital Millennium Copyright Act, copyright infringement (she has registered copyrights for some of her posts and videos), and unfair competition. 

Does Gifford have a case? Here’s what I think: 

  1. To prevail on the claim for infringement of her trade dress, Gifford will have to establish, at a minimum, that consumers associate her “aesthetic” with her. That may be difficult because, at least to my eye, the style of Gifford’s posts doesn’t seem wildly different from that of a lot of other influencers. (I am so not her target audience and I’m doing my best not to dunk on her “aesthetic,” but I have to put “aesthetic” in quotes to convey my eyeroll.) 
  2. The claim under the Digital Millennium Copyright Act is based on the fact that Sheil removed Gifford’s name or social media handle from posts. This is, shall we say, a novel argument given that the intent of the DMCA is to prevent people from circumventing digital rights management software. This is not that. At all.
  3. The copyright claim is going to raise a lot of questions about exactly how original these social media posts are and, as a result, how much protection under copyright law they are entitled to. Gifford and other social media influencers might find out that they don’t like the answer to this question. 
  4. If Gifford is able to establish that consumers associate her “aesthetic” with her, she could win the battle… but lose the war because it might open her up to lawsuits by other influencers who claim that she copied their look.

Meanwhile, Sheil has asked the Court to dismiss Gifford’s case. 

Thinking more broadly, a decision or decisions on the copyright claim could have implications for appropriation artists and others who closely copy another creator’s work. Which is one reason it will be fascinating to see how this plays out. And yes, I know I often end these posts saying something like that. Because it’s true! This case, like so many IP lawsuits lately, especially those that involve AI, is going where no court has gone before (or even imagined possible ten years ago). Every one of these potential decisions could have massive socioeconomic impact, with a real effect on how a lot of people earn a living and how the rest of us spend a lot (probably too much) of our time.