New York Times Sues Microsoft and OpenAI for Copyright Infringement in AI

The New York Times has sued Microsoft and OpenAI for copyright infringement based on the output of their AI models. Read the complaint here. Some excerpts from the complaint:

57. Despite its early promises of altruism, OpenAI quickly became a multi-billion-dollar for-profit business built in large part on the unlicensed exploitation of copyrighted works belonging to The Times and others. Just three years after its founding, OpenAI shed its exclusively nonprofit status. It created OpenAI LP in March 2019, a for-profit company dedicated to conducting the lion’s share of OpenAI’s operations—including product development—and to raising capital from investors seeking a return. OpenAI’s corporate structure grew into an intricate web of for-profit holding, operating, and shell companies that manage OpenAI’s day-to-day operations and grant OpenAI’s investors (most prominently, Microsoft) authority and influence over OpenAI’s operations, all while raising billions in capital from investors. The result: OpenAI today is a commercial enterprise valued as high as $90 billion, with revenues projected to be over $1 billion in 2024.

91. While OpenAI has not released much information about GPT-4, experts suspect that GPT-4 includes 1.8 trillion parameters, which is over 10X larger than GPT-3, and was trained on approximately 13 trillion tokens. The training set for GPT-3, GPT-3.5, and GPT-4 was comprised of 45 terabytes of data—the equivalent of a Microsoft Word document that is over 3.7 billion pages long. Between the Common Crawl, WebText, and WebText2 datasets, the Defendants likely used millions of Times-owned works in full in order to train the GPT models.

92. Defendants repeatedly copied this mass of Times copyrighted content, without any license or other compensation to The Times. As part of training the GPT models, Microsoft and OpenAI collaborated to develop a complex, bespoke supercomputing system to house and reproduce copies of the training dataset, including copies of The Times-owned content. Millions of Times Works were copied and ingested—multiple times—for the purpose of “training” Defendants’ GPT models.

93. Upon information and belief, Microsoft and OpenAI acted jointly in the large-scale copying of The Times’s material involved in generating the GPT models programmed to accurately mimic The Times’s content and writers. Microsoft and OpenAI collaborated in designing the GPT models, selecting the training datasets, and supervising the training process.

Jeffrey H. Skatoff, Esq.
