Generative Music | Performance-RNN

In my past performances, I encoded and trained each track in my MIDI files (melody and harmony) separately, then played the generated tracks at the same time.

But I don't think that was the right way to generate two-track music, because the melody and harmony should correlate, and two separately trained models never see each other's track.

So for this assignment, I tried using the Performance_RNN workflow to encode the dataset and train the model.

Dataset

I used the same dataset that I trained on for my 2nd performance: 20 MIDI files of Yoko Shimomura’s game soundtracks. Each of these files has a melody (right hand) track and a harmony (left hand) track.

The issue with these tracks is that their MIDI formatting has no ‘note_off’ messages; the end of each note is indicated with ‘velocity=0’ instead. I was not sure that the sequencing algorithm would be able to parse these MIDI files. But from Magenta’s GitHub, it seems like Performance_RNN has a config that takes a spectrum of velocity values into account:

"The performance_with_dynamics model includes velocity changes quantized into 32 bins"

I don't know what the warnings mean, so I ignored them.
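
On the ‘velocity=0’ worry: by MIDI convention, a note_on with velocity 0 counts as a note_off, so a parser that follows the convention should still recover note lengths from files like mine. Below is a minimal sketch of that pairing logic in plain Python. It is not Magenta's actual code: the event tuples and the names pair_notes and velocity_bin are hypothetical, and the even-width binning is only my assumption of how velocities could be quantized into 32 bins.

```python
import math

def pair_notes(events):
    """Turn (time, kind, pitch, velocity) events into (pitch, start, end, velocity) notes."""
    active = {}  # pitch -> (start time, velocity) of the note currently sounding
    notes = []
    for time, kind, pitch, velocity in events:
        # MIDI convention: a note_on with velocity 0 is treated as a note_off.
        is_off = kind == "note_off" or (kind == "note_on" and velocity == 0)
        if is_off:
            if pitch in active:
                start, start_velocity = active.pop(pitch)
                notes.append((pitch, start, time, start_velocity))
        elif kind == "note_on":
            active[pitch] = (time, velocity)
    return notes

def velocity_bin(velocity, num_bins=32):
    """Map a MIDI velocity (1-127) to a bin in 1..num_bins (assumed even-width bins)."""
    bin_size = math.ceil(127 / num_bins)
    return (velocity - 1) // bin_size + 1

# Hypothetical events in the style of my files: no note_off, only velocity=0.
events = [
    (0.0, "note_on", 60, 80),
    (0.5, "note_on", 60, 0),
    (0.5, "note_on", 64, 70),
    (1.0, "note_on", 64, 0),
]
notes = pair_notes(events)
```

With these made-up events, both notes get an end time even though there is no explicit note_off anywhere in the list.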

Training

I first tried training the model locally for 1000 steps. The final loss was 4.4152 and the perplexity was 82.698395.
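
The two numbers are really the same measurement: perplexity is the exponential of the cross-entropy loss (assuming the loss is in natural-log units, which these values are consistent with). A quick check in Python:

```python
import math

loss = 4.4152                # final loss reported after 1000 steps
perplexity = math.exp(loss)  # perplexity = e ** loss

print(round(perplexity, 2))  # about 82.7, matching the reported perplexity
```

Roughly speaking, a perplexity of about 83 means the model is as unsure as if it were picking evenly among about 83 possible next events.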

Then I moved on to Paperspace, where I trained for 20,000 steps twice and for 10,000 steps once. I didn’t realize that the machine gets rid of the tmp folder if you shut the console down, so I wasted a good 2 x 3 hours....
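
One way to avoid losing a run like this is to copy the training folder out of tmp to somewhere persistent before shutting down. A minimal sketch, assuming Python 3.8 or newer; backup_run_dir is a hypothetical helper of mine, not part of Magenta or Paperspace, and the folder locations are whatever your own setup uses:

```python
import shutil
from pathlib import Path

def backup_run_dir(run_dir, backup_dir):
    """Copy everything in run_dir (checkpoints, logs) into backup_dir."""
    run_dir, backup_dir = Path(run_dir), Path(backup_dir)
    # dirs_exist_ok lets repeated backups copy into the same folder (Python 3.8+).
    shutil.copytree(run_dir, backup_dir, dirs_exist_ok=True)
    # Return the copied file names so it is easy to eyeball what was saved.
    return sorted(p.name for p in backup_dir.iterdir())
```

Calling it with the tmp run folder as the first argument and a persistent folder as the second, every so often during training, would have saved me those hours.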