[Update] Apple, Salesforce break silence on claims they used 'swiped YouTube videos' to train AI

Two tech giants clear up recent accusations.
By Kimberly Gedeon
YouTube UI on an iPhone
Apple provides a statement on the matter. Credit: Photosince / Mashable

UPDATE: Jul. 18, 2024, 4:44 p.m. EDT Salesforce reached out to Mashable with a comment in response to Wired's report.

A new report claimed that tech giants including Apple, Nvidia, Anthropic, and Salesforce used data from "thousands of YouTube videos" to train AI. The investigation, performed by Proof News and published on Wired, alleged that subtitles from 173,000 YouTube videos were swiped for the companies' AI models.

Called "YouTube Subtitles," the dataset contains video transcripts from educational channels like Khan Academy, MIT, and Harvard, as well as the Wall Street Journal, NPR, and the BBC. Material from YouTube stars like PewDiePie, Marques Brownlee, and MrBeast was discovered, too.

We haven't heard from Anthropic yet after reaching out for comment, but Apple and Salesforce have each issued a response to Wired's report.

Will Apple use this data for Apple Intelligence and other AI services?

The short answer is no, but here's the longer response for those who don't identify with the "TLDR" crowd:

In an email to Mashable, Apple said that its open-source language model, OpenELM, indeed used the dataset, but not in the way some may be thinking.

The OpenELM project is a part of Apple's ongoing effort to benefit the broader research community. In other words, according to Apple, the OpenELM model was created for research purposes only and will not underpin any of Apple's machine learning-powered hardware or AI services, including Apple Intelligence.


For the uninitiated, Apple Intelligence is the company's new suite of AI features, which was revealed at WWDC 2024 (Apple's annual event where the company spills the beans on what's to come with its software offerings, including iOS and iPadOS).

Apple Intelligence, for example, can help summarize text, whether it's an email or text message, for quicker interactions with friends, loved ones, coworkers, and more. It will also underpin more entertainment-focused features like Genmoji, which generates new iOS emojis with a prompt. There's also Image Playground, which lets users create AI-generated images on the fly.

Genmoji demo at WWDC 2024
New Genmoji feature coming to iOS 18. Credit: Apple

As for its consumer-facing AI features, Apple highlighted that it gives websites the option to opt out of having their content used for AI training. Apple said that its generative models are built and fine-tuned using high-quality data, including licensed content from publishers and stock image companies, alongside publicly available data on the web.

To put it succinctly, Apple doesn't deny that its open-source language model, OpenELM, used the dataset, but wants to make clear that it will not underpin any of its AI services, including Apple Intelligence.

Salesforce claims academic-based usage

In an email to Mashable, Salesforce also offered its side of the story:

"The Pile dataset referred to in the research paper was used to train an AI model in 2021 for academic and research purposes," a Salesforce rep said. "The dataset was publicly available and released under a permissive license."

What does Nvidia have to say?

We also reached out to Nvidia for comment, but the company, known for bringing AI to much of its gaming hardware and many of its services, declined to issue a statement.

We will update this article if we hear anything from Anthropic.

Kimberly Gedeon
East Coast Tech Editor

Kimberly Gedeon, at Mashable since 2023, is a tech explorer who enjoys doing deep dives into the most popular gadgets, from the latest iPhones to the most immersive VR headsets. She's drawn to strange, avant-garde, bizarre tech, whether it's a 3D laptop, a gaming rig that can transform into a briefcase, or smart glasses that can capture video. Her journalism career kicked off about a decade ago at MadameNoire where she covered tech and business before landing as a tech editor at Laptop Mag in 2020.

