Science

We publish research on eliminating humans from music through neural synthesis.

Generating Albums with SampleRNN to Imitate Metal, Rock, and Punk Bands (MUME 2018)

Music Metacreation workshop at the International Conference on Computational Creativity

This early example of neural synthesis is a proof-of-concept for how machine learning can drive new types of music software. Creating music can be as simple as specifying a set of musical influences on which a model trains. We demonstrate a method for generating albums that imitate bands in experimental music genres previously unrealized by traditional synthesis techniques (e.g. additive, subtractive, FM, granular, concatenative). Raw audio is generated autoregressively in the time domain using an unconditional SampleRNN. We create six albums this way. Artwork and song titles are also generated, using materials from the original artists' back catalog as training data. We compare a fully automated method with a human-curated method, and discuss their potential for machine-assisted production.
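Autoregressive time-domain generation, as described above, amounts to predicting one audio sample at a time and feeding each prediction back in as context. The sketch below illustrates that loop only; it is not the SampleRNN architecture itself, and `predict_next` is a hypothetical stand-in for a trained next-sample model over quantized amplitude levels:

```python
import numpy as np

def generate_autoregressive(predict_next, seed, n_samples):
    """Generate raw audio one quantized sample at a time.

    predict_next: callable mapping the array of past samples to a
                  probability distribution over quantization levels.
    seed: initial context (1-D array of quantized samples).
    n_samples: number of new samples to generate.
    """
    rng = np.random.default_rng(0)
    audio = list(seed)
    for _ in range(n_samples):
        probs = predict_next(np.array(audio))
        # Sample the next amplitude level and feed it back as input.
        audio.append(rng.choice(len(probs), p=probs))
    return np.array(audio)

# Stand-in "model": a uniform distribution over 256 mu-law levels.
uniform_model = lambda context: np.full(256, 1.0 / 256)
out = generate_autoregressive(uniform_model, seed=np.zeros(4, dtype=int), n_samples=100)
```

In the real system the predictor is a deep recurrent network trained on the raw waveforms of the influence bands; the sampling loop is the same.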

Generating Black Metal and Math Rock: Beyond Bach, Beethoven, and Beatles (NIPS 2017)

Machine Learning for Creativity workshop at NIPS 2017

We use a modified SampleRNN architecture to generate music in modern genres such as black metal and math rock. Unlike MIDI and symbolic models, SampleRNN generates raw audio in the time domain. This capability becomes increasingly important in modern music styles where timbre and space are used compositionally. Long developmental compositions with rapid transitions between sections become possible when the depth of the network is increased beyond the number of layers used for speech datasets. We are delighted by the unique characteristic artifacts of neural synthesis.

Curating Generative Raw Audio Music with D.O.M.E. (MILC 2019)

MILC workshop at ACM Intelligent User Interfaces

With the creation of neural synthesis systems which output raw audio, it has become possible to generate dozens of hours of music. While not a perfect imitation of the original training data, the quality of neural synthesis can provide an artist with many variations of musical ideas. However, it is tedious for an artist to explore the full musical range and select interesting material when searching through the output. We needed a faster human curation tool, and we built it. DOME is the Disproportionately-Oversized Music Explorer. A PCA-component, k-means-clustered, rasterfairy-quantized t-SNE grid is used to navigate clusters of similar audio clips. The color mapping of spectral and chroma data assists the user by enriching the visual representation with meaningful features. Care is taken in the visualizations to help the user quickly develop an intuition for the similarity and range of sound in the rendered audio. This turns the time-consuming task of previewing hours of audio into something which can be done at a glance.
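The grid-layout pipeline above can be sketched with scikit-learn. This is an illustrative approximation, not the DOME code: the feature vectors are random stand-ins for per-clip spectral/chroma features, and rasterfairy's collision-free grid quantization is replaced with a simple rank-based assignment along each axis:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.normal(size=(64, 40))  # stand-in: one feature vector per audio clip

# 1. PCA reduces noisy per-clip features to their principal components.
components = PCA(n_components=10).fit_transform(features)

# 2. k-means on the PCA components groups similar-sounding clips.
clusters = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(components)

# 3. t-SNE embeds the clips in 2-D so that neighbors sound alike.
embedding = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(components)

# 4. Quantize the t-SNE layout to an 8x8 grid. (rasterfairy does this
#    without collisions; here we just rank positions along each axis.)
grid_x = np.argsort(np.argsort(embedding[:, 0])) * 8 // len(embedding)
grid_y = np.argsort(np.argsort(embedding[:, 1])) * 8 // len(embedding)
```

Each clip then occupies a grid cell near clips that sound similar, which is what lets a curator scan hours of audio at a glance.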