NY Times vs. OpenAI

00:01

Hey everybody, welcome back to the Elon Musk Podcast.

00:05

This is a show where we discuss the critical crossroads that shape SpaceX, Tesla, X, The Boring Company, and Neuralink, and I'm your host, Will Walden. The New York Times has filed a lawsuit against OpenAI and Microsoft, alleging the unauthorized use of millions of its articles to train and operate ChatGPT. Now, this legal action is the most recent among several filed by creators and publishers, including Sarah Silverman and author George R.R. Martin, amongst

00:34

others. These suits are against tech companies for using their work to develop large language models without their permission. Central to these lawsuits is the practice of scraping, which involves collecting vast amounts of internet data to train AI models like ChatGPT. Web crawlers designed to index and download web content are increasingly feeding AI models, raising concerns among content creators about copyright infringement and fair

01:02

compensation. The New York Times claims its content was used significantly in the Common Crawl data set, which OpenAI has admitted to using for training earlier versions of ChatGPT. But legal experts are divided on whether using internet data falls under fair use. Commercial use is a key consideration in determining

01:28

fair use. Now, many AI companies, initially nonprofits, eventually develop profitable products, like OpenAI. Websites have started blocking web crawlers to protect their content. Now, there are two methods to do this. One's based on mutual respect, and another uses technology to identify and block bad behavior by

01:48

bots that act differently from human users. The reduction in accessible data for web crawlers could benefit content creators but might also hinder other users, like researchers. In the past, web scrapers were used to collect data about competitors, and some people still use them for that. But you can also get tracking and privacy data from these trackers. And now there's an increased reliance on web crawling for archiving digital content.

02:17

This modern technique captures online primary sources, preserving them as historical records. Major publishers have engaged in discussions with OpenAI now about licensing content for AI training. However, reaching agreement on pricing and terms has been challenging, indicating a complex negotiating landscape, and confidential talks have been ongoing between top US media companies and OpenAI recently.

02:40

Organizations like Gannett, News Corp, and IAC have been part of these discussions, according to sources very familiar with these negotiations. Now, Microsoft, who's a huge investor in OpenAI with millions of dollars invested, has also participated in these talks. The talks have been complicated by the rapid development of AI applications, raising important questions about the future of the media

03:02

industry. OpenAI has expressed respect for content creators' rights and the need for mutually beneficial collaborations, as indicated in their deals with The Associated Press and Axel Springer. The media industry, having previously lost significant advertising revenue to tech giants, is cautious about undervaluing its content in deals with AI companies. There's also a concern about AI applications potentially spreading misinformation by inaccurately citing articles.

03:28

Some news organizations have successfully negotiated deals with OpenAI, like The Associated Press and Axel Springer, like I said before. However, companies like Bloomberg and The Washington Post have opted to focus on their own AI strategies instead of collaborating with OpenAI now. Despite these tensions, though, some industry executives acknowledge the potential benefits of AI for journalism. The mutual dependency between news organizations and AI firms shows the need for a

03:54

balanced and swift resolution to these disputes. The lawsuit underscores the growing tension between the media industry and AI tech as well, potentially reshaping the news landscape. Microsoft and OpenAI are accused of using copyrighted content to train AI services like ChatGPT, allegedly causing significant financial damages. Microsoft and OpenAI have been silent in response to the lawsuit. The case represents a major challenge to OpenAI's practice of scraping web content.

04:24

This has been common practice for ChatGPT since its debut, and the company has attempted to secure licensing deals with publishers to address all these issues. And now OpenAI faces multiple lawsuits from various content producers, highlighting the complex legal terrain surrounding AI and copyright right now. The outcome of these cases could set an important precedent for large language models and their interaction with content creators.

04:51

And Microsoft is OpenAI's largest supporter. It's integrated the startup's AI tools into its products, and the lawsuit alleges that Microsoft's use of the New York Times content has significantly boosted its market value. Now, the New York Times spokesperson also emphasized the legal requirement for obtaining permission before using their work for commercial purposes, a requirement they allege Microsoft and OpenAI have not met.

05:17

And the resolution of this case could have significant implications for the future of AI in relation to copyrighted content. Hey, thank you so much for listening today. I really do appreciate your

05:29

support. If you could take a second and hit the subscribe or the follow button on whatever podcast platform you're listening on right now, I'd greatly appreciate it. It helps out the show tremendously and you'll never miss an episode. Each episode is about 10 minutes or less to get you caught up quickly. And please, if you want to support the show even more, go to patreon.com/stage Zero. Please take care of yourselves and each other. I'll see you tomorrow.

Transcript source: Provided by creator in RSS feed
