
Runway's AI video generator is trained on thousands of scraped YouTube videos


Quick Reading: Runway trained its AI text-to-video generator on thousands of YouTube videos and pirated movies, including videos from channels owned by major entertainment companies such as Netflix and Disney. The company downloaded the videos with a web crawler and used proxies to avoid being blocked by Google. Runway has received significant investments from Google's parent company Alphabet and from Nvidia to develop tools that let users create photorealistic AI-generated videos; its latest model, Gen-3 Alpha, was released in June. Runway's dataset also contains links to piracy websites, raising concerns about copyright infringement and policy violations. Runway's co-founder has said the company trains on a "filtered internal dataset" but has given no specifics. In response, Google pointed to an earlier statement that using the platform's videos to train AI is a "clear violation" of its terms. Runway did not immediately respond to a request for comment.

Runway's AI training data source

According to 404 Media, Runway trained its AI text-to-video generator on thousands of YouTube videos and pirated movies. A spreadsheet detailing this training data includes links to YouTube channels owned by major entertainment companies, including Netflix, Disney, Nintendo, and Rockstar Games, as well as popular creators like MKBHD, Linus Tech Tips, and Sam Kolder. The spreadsheet also lists channels from news organizations such as The Verge, The New Yorker, Reuters, and Wired. According to a former Runway employee, "The channels in this spreadsheet are the result of a company-wide effort to find quality videos to build a model."

The role of web crawlers in data collection

The training data was collected by a large-scale web crawler that downloaded videos from the identified channels and used proxies to avoid being blocked by Google. Runway, an AI startup backed by significant investments from Google's parent company Alphabet and from Nvidia, has developed tools that enable users to create photorealistic AI-generated videos, including videos in specific animation styles. Its latest tool, Gen-3 Alpha, was released in June with the ability to "create videos in any style you can imagine." Like other AI models, Gen-3 Alpha needs a wide range of content to be trained effectively.

Concerns about piracy and policy violations

In addition to the YouTube channels, 404 Media found that Runway's dataset contained links to piracy sites such as KissCartoon, which lets users watch anime and other animated content for free. It is unclear whether Runway used all of the videos in the spreadsheet to train its Gen-3 Alpha model, and the full extent of the training data may never be known. In a June interview with TechCrunch, Runway co-founder Anastasis Germanidis said that the company uses a "filtered internal dataset" for model training, but did not elaborate further. When contacted for comment, Google directed The Verge to a statement from YouTube CEO Neal Mohan, who said in April that using the platform's videos to train AI would constitute a "clear violation." The Verge also asked Runway for comment but did not receive an immediate response.
