In his brilliant, provocative 1966 essay, “The Prospects of Recording,” Glenn Gould proposed elevating – pardon the pun – elevator music from pernicious drone to enriching ear training. In his view, the ubiquitous presence of background sound could subversively train listeners to be sensitive to the building blocks, structural forms and hidden meanings of music, turning the art form into the universal language of the emotions that it was destined to be. In a not-unrelated development, Gould had somewhat recently traded the concert hall for the recording studio, an act echoed by The Beatles’ release in 1967 of Sgt. Pepper’s Lonely Hearts Club Band, an album conceived and produced in a multi-track recording studio and never meant to be played in concert. And while Gould’s dream of a transformative elevator music never quite panned out, it is clear that from the 1940s through the ’60s – from Les Paul and Mary Ford’s pioneering use of overdubs in “How High the Moon,” to the birth of rock and roll with Chuck Berry’s “Maybellene” in 1955, and on to Schaeffer, Stockhausen, Gould, The Beatles and many more – a totally new art form, enabled by magnetic tape recording and processing, was born.
Today we are at a similar crossroads. Music streaming – and, in general, music distribution and networking via the Internet – has become the “elevator music” of our time, offering endless songs and sounds, all supposedly adapted to our tastes and primed for making social connections. But many of the current trends are not promising, and can even be seen as leading to the downgrading of music’s potential. Algorithmic curation is still primitive and often proposes paler – not bolder – versions of music we supposedly like. Current machine-learning techniques for music generation produce generic, composer-less pieces that sort of sound like something, but never sound great. And it could be argued that the vast potential of the Internet as an artistic medium has not yet resulted in a new kind of music, as potently different in form and content from what surrounds us as magnetic tape music was from live performance. In fact, it seems as if the Internet and streaming have changed everything about music except music itself.
The key to harnessing the power of streaming to create something really new might be to turn the medium’s ubiquity and fluidity into an advantage. Can we meaningfully allow for a given piece of music to morph and evolve with different impact on each hearing? Can this mutability engage artists’ imaginations in new ways? Can listeners – or even the entire environment – play important collaborative roles in building such a “living music” culture? Several current projects at the MIT Media Lab, where we work, explore various forms that dynamically streamed music might take.
The current paradigm – unchanged in the streaming era – is to treat a static recording as the terminal and canonical version of a composition. But a mastered, unchanged, “finished” recording is actually a limited representation of a composition. It is also not always what artists actually want. John Cage and many others invented numerous open forms to allow for multiple compositional (not merely expressive) interpretations, and Pierre Boulez famously revised most of his pieces from year to year, often without leaving a “definitive” version. After The Beatles stopped recording together in the early 1970s, John Lennon told George Martin that he was unsatisfied with their catalog and wished to re-record everything the band ever released (especially “Strawberry Fields,” apparently). And of course, prior to Edison’s first phonograph in 1877, every single music performance was unique by necessity and could never be repeated without variation.
When recorded music was primarily distributed on physical media, finalizing a recording was an essential step. Now that music is primarily distributed over the Internet, this constraint has been lifted. Music can now, again, be less about the master recording and more about the dialogue between artist and medium, artist and public, or music and the world itself. Labels and artists have begun to scratch the surface of what is possible. Consider the now-common pattern: An artist releases a song, and if that song starts to get traction on social media, it is quickly followed by an acoustic version, a music video and then countless club remixes. This is a first-step example of how a recording can change after it is first released, but it is currently the only option available within the narrow confines of popular streaming platforms.
In the future, artists will push the concept of evolving music much further. Instead of releasing a static recording, artists could release music that is dynamic, fluid and open for reinterpretation, remixing and reimagining. This would undoubtedly develop in numerous, well, streams — some of which we are currently working on.
A first example experiments with an open-form approach to music production. Conventional pop songs today layer tens, hundreds, or even thousands of different sounds together. Before a song is released, the relative loudness level of all parts is finalized in a studio in the “mixdown” process, during which the structure of the song, the instrumentation, and all the additional audio effects are locked into place, resulting in a final arrangement. In the conventional workflow, a mix engineer is responsible for every tone, level and effect configuration for all the separate parts. The techniques we are currently developing enable the engineer to share control over the mixdown and arrangement with intelligent algorithmic processes. The most obvious use for this kind of music production software would be to train AI agents to perform some of the simpler parts of the mixing process; for example, a software agent could be taught to set the balance between the main vocal part of a song and the background. It might also help a musician or engineer prepare a song for release more efficiently. On its own, however, it does not enable a kind of music that is fundamentally different from the original model provided.
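To make the vocal-balance example concrete, here is a minimal sketch of the simplest possible version of that task: computing the gain that places a vocal a chosen number of decibels above the backing tracks. It is not the software described above. The function names, the use of RMS level as a stand-in for perceived loudness, and the idea of a fixed target offset are all assumptions made for illustration; a trained agent would presumably learn the target from example mixes rather than receive it as a number.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def vocal_gain_for_target(vocal, backing, target_db):
    """Linear gain that puts the vocal target_db above the backing (by RMS).

    Hypothetical helper: RMS is only a rough proxy for loudness.
    """
    vocal_level, backing_level = rms(vocal), rms(backing)
    if vocal_level == 0 or backing_level == 0:
        raise ValueError("cannot balance against a silent track")
    current_db = 20 * math.log10(vocal_level / backing_level)
    return 10 ** ((target_db - current_db) / 20)


def mix(vocal, backing, gain):
    """Sum the gain-adjusted vocal with the backing, sample by sample."""
    return [gain * v + b for v, b in zip(vocal, backing)]
```

A learned system would replace the fixed `target_db` with a value predicted from the material itself, which is where the interesting work lies.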
The more exciting potential comes from working toward an idea where music is not the output of such a system, but is in fact the system itself. From this perspective, we could imagine and create a whole range of musical experiences that would not fit inside today’s streaming music paradigms and techniques.
To go beyond this “smart mix” model, Charles is working on an “Evolving Media” environment, through which a music composition changes as time passes. In particular, he is creating a feedback loop that causes a recording to permanently update itself based on how it is consumed and shared on the Internet. To make this possible, he is redesigning multiple existing technologies, from the software that we use to record, synthesize and mix music, to the cloud servers that stream content to listeners, to the playback apps on listeners’ devices, and interconnecting them all in a single, iterative platform, allowing for:
- Notation and annotation by the artist, bundled like enhanced, hyperlinked liner notes.
- Compositions that can be updated or revised, either by the artists or algorithmically.
- Much more practical remixing, covering and collaboration by other artists.
- A history of the song’s evolution, left behind as a record of that song’s compositional process.
- “Procedural” content that could produce “infinite compositions” that evolve forever.
- Ephemeral versions: it could be that, as with Snapchat, only the current state of the evolving composition would be available to listeners or collaborators, then gone forever, making forward evolution an essential – and only partially controllable – part of the composition itself.
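The feedback loop and the retained history can be pictured with a small sketch. This is an illustration under stated assumptions, not a description of the actual “Evolving Media” platform: the `EvolvingTrack` class, its parameter names, and the rule of nudging each parameter a fixed fraction toward a signal derived from listening data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EvolvingTrack:
    """Hypothetical model of a recording whose parameters drift over time."""

    params: dict                                  # e.g. {"tempo_bpm": 100.0}
    history: list = field(default_factory=list)   # earlier states, oldest first

    def update(self, signals, rate=0.1):
        """Move each known parameter a fraction `rate` toward its signal.

        The state before the update is appended to `history`, so the
        track leaves behind a record of its own evolution.
        """
        self.history.append(dict(self.params))
        for name, target in signals.items():
            if name in self.params:
                self.params[name] += rate * (target - self.params[name])
        return self.params
```

Dropping the `history` list would give the ephemeral variant in the last bullet, in which only the current state survives.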
Another example of an evolving, collaborative composition process is represented by the City Symphony series, developed by Tod and his colleagues in the MIT Media Lab’s Opera of the Future group. Started as a collaboration with the Toronto Symphony Orchestra in 2013, these projects develop a sonic portrait of a city using both “musical” and “found” sounds, and invite the creative participation of anyone who lives in that place and wants to contribute. Using the shared experience of locale as a unifying element, the symphonies have established unusual dialogue between very diverse members of each community, from Perth to Lucerne to Edinburgh, and from Philadelphia to Miami to Detroit, all pulled together through Tod’s compositional vision. Special mobile apps were developed for each city that allow the public to record sounds that they would like to contribute to the project. All sounds are tagged geographically and form a growing sonic map of the city. Constellation software automatically analyzes, organizes and color-codes the collected sounds, arraying them to be mixed by anyone online with mouse or finger. These “city mixes” are in turn uploaded to be shared and further morphed, creating an ever-changing city soundscape that can be incorporated into the final symphony. Numerous other apps and online tools have been specially designed for each city — such as Media Scores and live online collaboration sessions — to facilitate creative public participation. The next series of City Symphonies, currently in development, will extend the city model to countries, such as a first-ever collaboration between citizens of South and North Korea, and a “world trade” symphony for Dubai that will continue to evolve – publicly and via streaming – far into the future.
Tools are currently being developed here at the Lab to intelligently automate the making of sonically meaningful connections between collected clips in a massive database, and between “noisy” and “musical” sounds, a task normally done manually and impossible to accomplish at scale. The Media Lab’s “Cognitive Audio” project takes an even more radical approach. Musician/scientists Ishwarya Ananthabhotla and David Ramsay are working on a system that allows the generation of constantly evolving compositions, based on the intriguing sounds surrounding us that we may not even notice. Using cutting-edge research in psychoacoustics, auditory scene analysis and auditory memory-recall, their software can take hours of recorded ambient sounds from the found environment and then automatically select and edit the sounds that we are likely to find most interesting and might most want to remember. Then, by measuring our mood through preference tests and biometric readings fed through machine learning algorithms, the system produces constantly streaming audio experiences that turn the everyday into an emotional, personalized, musically relevant, memory-enhancing journey.
All of these projects make us wonder: Which features of a composition could best be modified, while retaining important aspects of the original music’s DNA? How quickly should this new kind of media change? Which network signals should feed into our compositional systems to update the media content? And perhaps most importantly, what might the dangers be? (Questions that, not for nothing, are spiritually similar — although pushed to a new extreme — to what any artist might ask themselves about a work in progress.) Feedback loops of this kind are often used in the design of web pages, where they optimize for “engagement.” What are the implications when these kinds of feedback loops are applied to music?
Conventional advice for musicians is to push as much content to social media streams as possible. This might be good advice for musicians, but it is definitely even more beneficial for the online media platforms that monetize our attention. Suppose that, instead of constantly pushing out new media, artists were able to spend more of their time refining existing work. Music is not the only space that stands to benefit from shifting the focus away from creating new content toward maintaining, cultivating and healing what exists already.
Critics of streaming media point out that popular music tends to sound more and more similar, as artists optimize their music for these platforms. The algorithms that curate our social media feeds rarely promote the best or the most interesting music — rather, they promote the most similar music. Counter-critics point out that even if the mainstream is homogenizing, streaming platforms also enable countless subgenres of more obscure music to flourish in the fringes of the mainstream. Both are correct, but both also miss a more important issue. As media on the fringe and in the mainstream is increasingly optimized, curated, and discovered by algorithms, we devalue the human role in all these aspects of music creation and consumption. Isn’t making, discovering, and listening to music as a human the only true value of music?
There are many ways that music in this new century can achieve the kind of paradigm shift that was made possible by magnetic tape in the last one. Will the time come soon when we will look back on the 21st century and identify the Sgt. Pepper’s, Chuck Berry, or Luciano Berio of the Internet age? With luck, it could still be the artists, not the platforms, that best illustrate how a new art form emerges. When this happens, will the new art form – vastly collaborative, grown from our minds and from our surroundings, partly conscious and partly mystifying, made from signals as much as from strings – approach the provocative power of Glenn Gould’s final vision from 1966:
“In the best of all possible worlds, art would be unnecessary. Its offer of restorative, placative therapy would go begging a patient. The professional specialization involved in its making would be presumption. The generalities of its applicability would be an affront. The audience would be the artist and their life would be art.”
Charles Holbrow is a Ph.D. Candidate in the Opera of the Future group at the MIT Media Lab, where he designs and builds next-generation connected music technologies.