I’ve been enjoying the combination of deep learning and sound design as a hobby recently and wanted to check out fastgen from Magenta’s NSynth for resynthesis on a well-known set of samples. The Roland 808 drum machine has a classic and unmistakable sound, so I thought it would be interesting to hear how it sounds after NSynth resynthesis.
I used the basic 808 drum sample pack that comes as part of the Ableton Live software. I encoded each wav file using a pre-trained WaveNet model and then used fastgen.synthesize on each encoding. Encoding took around a minute per sample using the CPU on a 2014 MacBook Air. Decoding the audio took a bit longer, usually anywhere from 10 to 15 minutes.
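For anyone who wants to try the same thing, here is a minimal sketch of the encode-then-decode step for a single sample. It assumes Magenta is installed and a pre-trained WaveNet checkpoint has been downloaded. The `output_path` helper, the `gen_` filename prefix, and the `ckpt_path`, `out_dir` and `sample_length` arguments are hypothetical names chosen for illustration, not necessarily what I used.

```python
import os

SAMPLE_RATE = 16000  # the NSynth WaveNet models work on 16 kHz audio


def output_path(wav_path, out_dir):
    """Hypothetical naming scheme: 808/kick.wav -> <out_dir>/gen_kick.wav."""
    return os.path.join(out_dir, "gen_" + os.path.basename(wav_path))


def resynthesize(wav_path, out_dir, ckpt_path, sample_length):
    """Encode one drum sample with the pre-trained WaveNet, then decode it to audio."""
    # Imported lazily so output_path() can be used without Magenta installed.
    from magenta.models.nsynth import utils
    from magenta.models.nsynth.wavenet import fastgen

    # Load the wav file, resampled to 16 kHz and truncated to sample_length samples.
    audio = utils.load_audio(wav_path, sample_length=sample_length, sr=SAMPLE_RATE)

    # Encoding is the quick step: it returns a small array of embeddings.
    encoding = fastgen.encode(audio, ckpt_path, sample_length)

    # Decoding is the slow step: WaveNet generates the audio one sample at a time.
    save_path = output_path(wav_path, out_dir)
    fastgen.synthesize(encoding, save_paths=[save_path], checkpoint_path=ckpt_path)
    return save_path
```

Calling `resynthesize` once per file in the sample pack would produce one generated wav per drum sound. Since drum hits are short, keeping `sample_length` small is the easiest way to keep decoding time down.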
For the most part the samples sound OK, but they are a bit lo-fi. Here’s a sample of the resynthesized drum machine playing a simple non-quantized beat:
The low frequencies on the kick drum sound heavy and fairly accurate. Here is a graph of the encodings of the generated sample:
I think the strangest-sounding generated sample is the “open hi-hat”. The frequencies in this sample sit more in the mid to high range, and the model isn’t quite sure how to interpret some of them. It seems to give off a wobbly effect, as seen in the following graph:
After I resynthesized each sample using NSynth, I loaded all of the newly generated samples into an Impulse sampler in Ableton Live and created a primitive drum machine, as seen in the video at the bottom of the page. I would love to eventually take this further and create an Ableton Instrument that mixes multiple drum machines together.
I put a Jupyter notebook on GitHub which has the encoding graphs of each of the samples if you would like to take a further look: https://github.com/sunhypnotic/858-drum-synth
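If you want to draw a similar graph yourself, the sketch below shows one way it could be done; it is not copied from the notebook. It assumes `encoding` is the array returned by `fastgen.encode`, with shape (1, frames, channels), and it relies on the NSynth WaveNet autoencoder producing one 16-channel frame for every 512 audio samples at 16 kHz, which works out to 32 ms per frame. The function names are hypothetical.

```python
HOP_LENGTH = 512     # audio samples per encoding frame
SAMPLE_RATE = 16000  # Hz


def frame_times(n_frames, hop_length=HOP_LENGTH, sample_rate=SAMPLE_RATE):
    """Start time, in seconds, of each encoding frame."""
    return [i * hop_length / sample_rate for i in range(n_frames)]


def plot_encoding(encoding, title):
    """Plot every embedding channel of one encoding against time.

    Assumes `encoding` has shape (1, n_frames, n_channels).
    """
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    frames = encoding[0]              # drop the batch dimension
    times = frame_times(len(frames))
    plt.figure(figsize=(10, 4))
    plt.plot(times, frames)           # one line per embedding channel
    plt.title(title)
    plt.xlabel("time (s)")
    plt.ylabel("embedding value")
    plt.show()
```

Putting time in seconds on the x-axis makes it easier to line up features of the graph, such as the wobble in the open hi-hat, with what you hear in the audio.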
And here’s a video that covers the project a bit more: