Is Spotify’s Newly Patented A.I. Plagiarism Detector a Data Collection Scheme?
‘Spotify wants machine-made music the same way Uber wants self-driving cars’
OneZero’s General Intelligence is a roundup of the most important artificial intelligence and facial recognition news of the week.
Spotify, the music streaming giant with a reputation for underpaying artists, has staked its claim to a technology that it says could protect musicians from plagiarism allegations, according to a patent recently granted by the European Union.
According to the patent application filed in 2019, before publishing a song, or even when writing it, an artist would share a “lead sheet” with Spotify, a document that outlines a song’s melody, chords, and sometimes lyrics. The A.I. algorithm would translate the sheet music into a more machine-friendly format, and then compare it to music already in Spotify’s database. Spotify told OneZero that not every one of its patents becomes a part of its product, and wouldn’t say whether the system had been implemented or not.
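To make the idea concrete, here is a minimal sketch of how a lead-sheet comparison might work. It is an illustration only, not the method described in Spotify’s patent. It assumes a melody can be reduced to a list of MIDI pitch numbers, converts each melody into the steps between consecutive notes so that a tune played in a different key still matches, and scores similarity with Python’s standard `difflib.SequenceMatcher`. The function names, the tiny catalog, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def to_intervals(pitches):
    """Convert MIDI pitch numbers to semitone steps between consecutive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def melody_similarity(a, b):
    """Return a score from 0 to 1 for how alike two melodies are, ignoring key."""
    matcher = SequenceMatcher(None, to_intervals(a), to_intervals(b), autojunk=False)
    return matcher.ratio()


def find_similar(candidate, catalog, threshold=0.8):
    """Return (title, score) pairs scoring at or above threshold, best match first."""
    scored = [(title, melody_similarity(candidate, melody))
              for title, melody in catalog.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical catalog: song titles mapped to melodies as MIDI pitches.
catalog = {
    "Existing Song A": [60, 62, 64, 65, 67, 65, 64, 62],
    "Existing Song B": [72, 71, 69, 67, 65, 64, 62, 60],
}

# A "new" melody that is Song A moved up a whole step.
new_melody = [62, 64, 66, 67, 69, 67, 66, 64]

matches = find_similar(new_melody, catalog)
print(matches)
```

In this toy example the transposed melody is flagged as a match for Song A but not Song B. A real system would also have to account for rhythm, chords, and partial overlaps, none of which this sketch attempts.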
Plagiarism is a critical issue in the music industry. Think back to the “Blurred Lines” legal battle, in which Robin Thicke and Pharrell Williams were made to pay more than $5 million to Marvin Gaye’s family after a court determined they had copied one of the songwriter’s hits. Using this new system, a plagiarized melody could be nixed before the virtual ink even dried.
Other platforms already have systems for identifying existing copyrighted music, such as YouTube’s Content ID. Spotify’s approach differs in two ways: it is aimed at musicians creating new music rather than at people adding already copyrighted music to videos, and it works from sheet music, whereas systems like Content ID analyze the audio itself.
But are musicians and labels going to trust Spotify, of all companies, as some sort of legal cover against plagiarism?
George Howard, professor of music business at Berklee College of Music, is skeptical. Howard is the former president of music label Rykodisc, owner of an eponymous consulting company, and co-founder of the Music Audience Exchange, which helps artists license music to brands. He also co-founded TuneCore, which helps artists sell music to large streaming platforms like Spotify.
“I don’t think in any scenario anyone can say Spotify’s motive is to be helping artists,” Howard said. “Music and podcasts are their product. If that’s their job, this becomes a tool to either protect themselves from litigation or generate more works that they don’t have to pay royalties for. As an artist myself and a musician, both of those I find offensive.”
Howard suggests that the tool gives Spotify more cover in defending against plagiarism claims than it gives musicians. During a lawsuit, the company could point to its proactive approach to mitigating the scourge of plagiarism. An artist accused of copying a song, by contrast, would likely find that an assurance from a proprietary and untested tool that a melody wasn’t stolen doesn’t hold up in court.
Spotify’s A.I. researchers are also some of the field’s preeminent scholars on creating A.I.-generated music, and Howard fears the music data freely submitted by artists could be valuable in building algorithms that generate music on their own, without human musicians.
“Spotify wants machine-made music the same way Uber wants self-driving cars,” Howard said.
Others have also speculated about Spotify’s potential to create A.I.-generated music, especially after the firm hired François Pachet, head of Spotify’s Creator Technology Research Lab, though Spotify has dodged questions about whether that’s one of its ultimate goals.
Pachet is listed as an inventor on Spotify’s patent application. He is well-known in the world of A.I. for his decades of research on creating algorithms that can produce new music, as well as delving into why humans like specific music. Much of Pachet’s previous work has focused on using lead sheets as the standard format for machines to understand music.
If there is anyone who could use a few more lead sheets to train music-making algorithms, it’s Pachet.
Pachet’s most famous project, Flow Machines, relied heavily on lead sheets as the medium for music generation. Flow Machines received press attention after being hailed as generating the first A.I.-produced pop song. As a part of the five-year project, which spanned from 2012 to 2017, Pachet and his team built a database with more than 12,000 machine-readable lead sheets to train their algorithms.
In 2016, Pachet co-authored a paper describing an algorithm that generated new music in the style of Bach, which I wrote about at the time. Months later, in early 2017, Pachet co-authored a study titled “Sampling Variations of Lead Sheets,” in which the team broadened the scope of the research outside of just Bach. For instance, the algorithm generated a lead sheet for a version of Duke Ellington and John Coltrane’s “In A Sentimental Mood” in the style of The Beatles.
Despite that history, there’s no evidence right now to confirm that the plagiarism detector is a play to capture artists’ songwriting data.
But the patent serves as a reminder of the fine print in any free tool offered by a company that traffics in machine learning. Oftentimes, the data users provide can be more valuable than the tool itself.