Generative Music | Duet with Generative Model (POC) Performance 2

November 12, 2018

In the first performance, I frankenstein-ed three of my favorite songs with a Markov chain and generated a MIDI file as the output. For this performance, I essentially planned to do the same, but with two added objectives: (1) having a more meaningful data corpus, and (2) designing a performance with the generated music that makes sense.
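The generation step in that first performance was a Markov chain over encoded note tokens. Here is a minimal sketch of the idea (the chain order and the uniform sampling are assumptions, not necessarily what the original script did):

```python
import random
from collections import defaultdict

def build_markov_chain(tokens, order=1):
    """Map each state (a tuple of tokens) to the tokens that follow it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        state = tuple(tokens[i:i + order])
        chain[state].append(tokens[i + order])
    return chain

def generate(chain, seed, length=50):
    """Walk the chain from a seed state, sampling a successor at each step."""
    state = tuple(seed)
    out = list(seed)
    for _ in range(length):
        successors = chain.get(state)
        if not successors:
            break  # dead end: the state never appeared in the corpus
        out.append(random.choice(successors))
        state = tuple(out[-len(seed):])
    return out
```

Fed a token list from an encoded song (e.g. `['n60', '_d240', 'n64', ...]`), `generate` produces a new token sequence that locally resembles the corpus.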

 

This project has been more of a proof-of-concept of a "duet" pipeline.

 

(1) Meaningful data corpus

 

Playing games was a huge part of my childhood, and a significant aspect of the experience was definitely the soundtrack. Even now, every time I listen to a battle theme from the games I played, I can still feel the adrenaline rush. The songs can still take me back to a time when I experienced magic and lived in the game's fantasy world.

 

So I collected 20 MIDI files of game soundtracks composed by Yoko Shimomura from Musescore. She has composed for a few of my favorite JRPGs (but also for Super Mario and Street Fighter) and is arguably the most prominent female composer in the gaming industry.

 

Even though 20 files is not a big enough data set, I decided to go ahead because I knew the MIDI formatting would be consistent and hence easier to encode.

 

This is the encoding code (taken from the first performance):

 

import mido  # the encoder parses mido's message string representation

mid = mido.MidiFile('shimomura_track.mid')  # hypothetical input path
notes = []
index = 0
isSaveTime = False

for message in mid.tracks[0]:
    message_components = str(message).split()
    if message.type == 'note_on':
        for item in message_components:
            if 'note=' in item:
                note = item.split('note=')[1]
            if 'velocity=' in item:
                vel = int(item.split('velocity=')[1])
                if vel == 0:
                    # velocity 0 means end of note:
                    # only save the time information (duration) at the end of a note
                    isSaveTime = True
                else:
                    # velocity above 0 means start of note: don't save time yet
                    isSaveTime = False
                    print('note: ' + note)
                    notes.append('n' + str(note))
            if 'time=' in item:
                if isSaveTime:
                    dur = int(item.split('time=')[1])
                    # a very short duration means the current note plays at the
                    # same time as the previous note, so skip it
                    if dur > 50:
                        print('duration: ' + str(dur))
                        notes.append('_d' + str(dur) + ' ')
                        index += 1
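The token list is then flattened into a plain-text training corpus. A minimal sketch (the corpus filename is an assumption; note that the encoder above already appends a trailing space to each duration token, so simply concatenating the tokens yields space-separated note/duration pairs):

```python
# notes is the list built by the encoder, e.g.:
notes = ['n60', '_d240 ', 'n64', '_d240 ']

# hypothetical corpus file for training
with open('corpus.txt', 'w') as f:
    f.write(''.join(notes))
# corpus.txt now contains: "n60_d240 n64_d240 "
```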

 

(2) Designing a performance with generated music

 

Coming into this, I knew that the output of training a model on the MIDIs would mostly be gibberish. Shimomura's songs are hardly homogeneous: some are fast-tempo and aggressive, while others sound like relaxing lullabies.

 

Transposing everything into the same key and normalizing the tempo would also be too much work.

 

So how do I design a performance that still makes sense and is meaningful, even without musical similarity or listenability? I designed a performance pipeline that could emulate how I experience Shimomura's music:

  1. I would “play” a game, or try to evoke certain sounds that remind me of the games I played as a child.

  2. The program would respond to me playing and generate a “soundtrack”.

  3. I would play more, and the program would respond further. It’s like a "duet/dialog" between the player and the soundtrack composer.

Technical Pipeline

 

Prep Work: TensorFlow

  1. Encode the MIDI files into text files, then train the model.

 

Live Processing: JS

  1. Listen for MIDI inputs, and play a note for each input with Tone.js.

  2. Store the inputs in the same MIDI encoding format.

  3. Once input is complete, use it as the seed for generation with ml5.js.

  4. Translate the generated text to sounds with Tone.js.
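In the performance, step 4 happens in JS with Tone.js, but the parsing logic can be sketched language-agnostically. Here it is in Python, using the token format from the encoder above (the exact handling of malformed tokens in the real JS code is an assumption):

```python
import re

def decode(text):
    """Turn generated text like 'n60_d240 n64_d120 ' back into
    (midi_note, duration_ticks) pairs, skipping malformed tokens
    that the model may produce."""
    events = []
    for tok in text.split():
        m = re.match(r'n(\d+)_d(\d+)$', tok)
        if m:
            events.append((int(m.group(1)), int(m.group(2))))
    return events
```

Each `(note, duration)` pair can then be scheduled on a synth, with ticks converted to seconds using the file's tempo and ticks-per-beat.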

 

(source code to be posted soon)

 

 

 

Future Improvements

 

  1. Bigger data set, and accounting for different keys and tempos.

  2. Better encoding(!!) – instead of splitting the 2 piano melodies (MIDI tracks) into two different encodings, encode them together into one file.

  3. Faster / more responsive generation. Perhaps use a timer function instead of space/enter as generation triggers.

  4. Use game controllers instead of MIDI keyboard.

  5. More real-instrument-sounding output.

 
