Verified source report

The Atlantic created a searchable database of the music used to train AI

Atlantic reporter Alex Reisner recently uncovered four datasets of music being used to train AI models and made them fully searchable for the public. Two of the sets are absolutely enormous at 12 million and 9 million tracks. The other two are much smaller, but still represent a significant amount of training data at over 100,000 songs each. According to Reisner, the sets have been downloaded thousands of times and, while it's impossible to know exactly who has used them, Google and Stability have both confirmed in research papers that they have used them. Some of the source […]

What happened

According to The Verge’s source item, “The Atlantic created a searchable database of the music used to train AI,” Atlantic reporter Alex Reisner recently uncovered four datasets of music being used to train AI models and made them fully searchable for the public. Two of the sets are absolutely enormous at 12 million and 9 million tracks. The other two are much smaller, but still represent a significant amount of training data at over 100,000 songs each. According to Reisner, the sets have been downloaded thousands of times and, while it’s impossible to know exactly who has used them, Google and Stability have both confirmed in research papers that they have used them. Some of the source […]

Context

The development sits in VINI’s Technology file for readers following technology, science, product policy, markets, infrastructure, and the public consequences of innovation. The original report is linked so readers can check the source account, follow later updates, and compare new coverage against the first published record. The source item is dated 2026-06-20T18:46:48+00:00.

What to watch

Open questions include whether primary sources issue follow-up statements, whether local or market impacts become clearer, and whether additional reporting changes the timeline or adds material context.

Source

Primary source: “The Atlantic created a searchable database of the music used to train AI,” via The Verge. VINI cites and links the source; it does not reproduce the publisher’s full article text without rights clearance.
