In this essay, I will try to summarize and explore the basic notions of cognitive musicology, a branch of cognitive science, as well as examples and concepts of computational models of musical creativity. As my background is in computer science, I will approach these topics through my technical knowledge of programming and machine learning, giving examples of current, publicly available algorithms for creating music, all of them using deep learning (the newest and most popular machine learning method), and I hope this will result in an insightful, simple, short but helpful essay.
The goal of computational modelling is to simulate a really complex behaviour, something that would not be possible without the use of computer systems. In cognitive musicology, these models are used to try to understand how musical knowledge is represented, stored, perceived, performed, and generated in the brain, all of it expressed as a well-structured computer program. Before the popularization of computers, cognitive scientists could only theorise on paper about the mechanisms behind their models, which was a massive pen-and-paper exercise, enormously time-consuming and error-prone. However, with the advent of fast computers and in the era of big data, high-quality databases of stimuli are available, and it is possible to embody a cognitive theory as a computer program and to test its consequences repeatedly (Wiggins and Pearce, 2009).

When we talk about musical creativity in a computational context, we need to talk about machine learning. The old idea that a computer cannot be creative because it can only do what it has explicitly been programmed to do no longer holds: neural networks and deep learning algorithms are able to learn and create new information without any task-specific programming; the system essentially learns by itself and generates creative output.
I will focus on deep learning because it is the machine learning method that is gaining popularity [1] and being adopted by the community of computer scientists. One of the musical creativity projects using deep learning is the Magenta project, an open-source Google creation that has two goals: "It's first a research project to advance the state of the art in music, video, image and text generation. So much has been done with machine learning to understand content—for example speech recognition and translation; in this project we explore content generation and creativity. Second, Magenta is building a community of artists, coders, and machine learning researchers." [2]
At the date this essay was written (31/12/2017), the project already has 12 models, each of them with different characteristics and using Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM). It is worth explaining what RNNs and LSTMs are and how they are used in musical creativity; all the models are available on the GitHub page of the Magenta project [3].

RNNs are Artificial Neural Networks (ANN) with loops in them, allowing information to persist (a shortcoming of a standard ANN). A recurrent neural network can be thought of as multiple copies of the same network, each passing a message to the next one, as shown in Figure 1. An Artificial Neural Network is a biologically inspired network of artificial neurons that is simulated on a computer and acquires knowledge through learning [4].
Figure 1. An unrolled recurrent neural network. Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
In theory, RNNs might be able to connect previous information to the present task, but in practice it does not work like that, because RNNs are only able to use recent information to perform the present task. When the gap between the relevant information and the point where it is needed becomes very large, standard RNNs are not able to connect the information, and that is when LSTMs are used. LSTMs are a special kind of RNN, designed for learning long-term dependencies.
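To make the idea of "copies of the same network passing a message forward" concrete, here is a minimal sketch of an unrolled RNN step in plain Python with NumPy. The note encoding, the sizes, and the randomly initialised weights are my own illustrative assumptions, not Magenta's actual implementation:

```python
import numpy as np

# Toy vocabulary: each melody note is an integer (here, a pitch class 0-11).
VOCAB_SIZE = 12    # assumption: 12 pitch classes
HIDDEN_SIZE = 16   # assumption: small hidden state, for illustration only

rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(HIDDEN_SIZE, VOCAB_SIZE))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(HIDDEN_SIZE, HIDDEN_SIZE))  # hidden -> hidden (the "loop")
W_hy = rng.normal(scale=0.1, size=(VOCAB_SIZE, HIDDEN_SIZE))   # hidden -> output

def one_hot(note):
    x = np.zeros(VOCAB_SIZE)
    x[note] = 1.0
    return x

def rnn_unroll(notes):
    """Run the same cell over the whole sequence; h is the 'message' passed forward."""
    h = np.zeros(HIDDEN_SIZE)
    outputs = []
    for note in notes:                      # each iteration is one "copy" of the network
        h = np.tanh(W_xh @ one_hot(note) + W_hh @ h)
        outputs.append(W_hy @ h)            # scores over the possible next notes
    return outputs, h

melody = [0, 4, 7, 4, 0]                    # a toy C-major arpeggio as pitch classes
scores, final_state = rnn_unroll(melody)
print(scores[-1].argmax())                  # (untrained) guess for the next note
```

The same weight matrices are reused at every step; that reuse is exactly the loop that lets information persist from one note to the next.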
The difference between a RNN and a LSTM is exactly the repeating module: while in a RNN the structure is a really simple one, a single neural network layer, in a LSTM there are four layers interacting in a more complex way, which gives the network the power to remember long-term information, as can be seen in Figure 2. Not all LSTMs have exactly this inner structure; there are different versions of LSTM that use slightly different gates. For more information on the different versions, and also on the logic of these modules, visit the GitHub page of a machine learning researcher [4].
Figure 2. On the left, a RNN; on the right, a LSTM with the four-layer module. Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
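As a rough illustration of those four interacting layers, here is a single LSTM step written out in NumPy, following the standard forget/input/candidate/output gate formulation described on the page cited above. The weight shapes and initialisation are again illustrative assumptions, not any particular library's implementation:

```python
import numpy as np

HIDDEN_SIZE, INPUT_SIZE = 16, 12            # assumed sizes, for illustration only
rng = np.random.default_rng(1)

def init(rows, cols):
    return rng.normal(scale=0.1, size=(rows, cols))

# One weight matrix per gate, each acting on the concatenation [h_prev, x].
W_f, W_i, W_c, W_o = (init(HIDDEN_SIZE, HIDDEN_SIZE + INPUT_SIZE) for _ in range(4))
b_f, b_i, b_c, b_o = (np.zeros(HIDDEN_SIZE) for _ in range(4))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev):
    """One LSTM cell update: the four 'layers' are the f, i, c_tilde and o computations."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(W_f @ z + b_f)        # forget gate: what to erase from the cell state
    i = sigmoid(W_i @ z + b_i)        # input gate: what new information to store
    c_tilde = np.tanh(W_c @ z + b_c)  # candidate values for the cell state
    o = sigmoid(W_o @ z + b_o)        # output gate: what to expose as the new hidden state
    c = f * c_prev + i * c_tilde      # cell state: the long-term memory
    h = o * np.tanh(c)                # hidden state passed on to the next step
    return h, c

x = np.zeros(INPUT_SIZE); x[0] = 1.0  # one-hot encoding of the current note (assumed)
h, c = lstm_step(x, np.zeros(HIDDEN_SIZE), np.zeros(HIDDEN_SIZE))
```

The cell state c is what carries information across many steps; the gates decide what is kept, what is added, and what is exposed, and that is what gives LSTMs their long-term memory.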
One of the main problems in using machine learning to generate sequences, such as melodies, is creating long-term structure. Long-term structure comes very naturally to human beings, but it is much harder to reproduce in machines. Basic machine learning systems can generate a short melody, but they have trouble generating a longer melody that follows a chord progression or a multi-bar song structure of verses and choruses. Similarly, they can produce a screenplay with grammatically correct sentences, but they fail to produce a compelling plot line. This is where LSTMs achieve the best results to date, and it fits the field of musical creativity perfectly, because without long-term structure the content produced by recurrent neural networks often seems wandering and random [5].
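To see why long-term structure is hard, it helps to look at how a melody is typically turned into training data: a flat sequence of note tokens cut into fixed-length windows. The encoding below is a simplified illustration of my own (real systems such as Magenta use richer event encodings that also capture timing and rests):

```python
# A melody as a flat sequence of MIDI pitches (toy example, no rhythm or rests).
melody = [60, 62, 64, 65, 67, 65, 64, 62, 60, 60, 67, 67, 69, 69, 67]

def training_windows(notes, context=4):
    """Cut the melody into (context, next_note) pairs for next-note prediction."""
    pairs = []
    for t in range(context, len(notes)):
        pairs.append((notes[t - context:t], notes[t]))
    return pairs

for context, target in training_windows(melody)[:3]:
    print(context, "->", target)

# A model trained only on such short windows never "sees" a whole verse or chorus,
# which is why its output can follow the last few notes locally yet wander over
# longer spans; this is the gap that LSTMs try to close.
```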
The Magenta project is written in Python, and to install and execute some of the models presented there it is necessary to set up the environment first. However, there are a lot of creations using these models that can be found on the demo page [6]. It is possible to execute them and see the results, like NSynth, an interactive AI Experiment that lets you choose pairs of instruments and mix them to create new sounds, or A.I. Duet, which lets the user play some piano notes while the machine creates a sequence in response, using everything it learned about musical concepts by building a map of notes and timings during training.
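Purely to illustrate the call-and-response idea behind A.I. Duet, the toy sketch below "answers" a primer using a first-order Markov table of note transitions. This simple counting model stands in for the trained neural network only so that the example stays self-contained; it is not how A.I. Duet is actually implemented:

```python
import random
from collections import defaultdict

# "Training": count which note tends to follow which in a tiny corpus of melodies.
corpus = [[60, 62, 64, 62, 60, 67, 65, 64, 62, 60],
          [60, 64, 67, 64, 60, 62, 64, 65, 67, 60]]

transitions = defaultdict(list)
for melody in corpus:
    for a, b in zip(melody, melody[1:]):
        transitions[a].append(b)

def respond(primer, length=8):
    """Continue the user's primer by sampling one learned transition at a time."""
    current = primer[-1]
    response = []
    for _ in range(length):
        current = random.choice(transitions[current]) if transitions[current] else current
        response.append(current)
    return response

print(respond([60, 62, 64]))   # a short machine "answer" to the user's three notes
```

A.I. Duet replaces the counting table with a recurrent network trained on many melodies, but the interaction loop is the same: take the user's notes as a primer and sample a continuation from what the model has learned.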
As a conclusion of this paper, the recent enthusiasm for machine learning and deep learning as an inspiration for art is evident, and it is possible to find plenty of other experiments in the field of musical creativity, such as DeepJazz [7] and BachBot [8]. There is still a fairly long way to go before machines can produce really complex music, deal better with polyphonic music and harmonies, and combine different melodies like some of the masterpieces created by humans, but with the rapid advancement of computers and algorithms it is a natural road for the future, and it is just a matter of time before machines achieve that.