AI for Music and Visual Art

Jean-Baptiste Thiebaut

In its current, sophisticated form, Artificial Intelligence Art draws on the vast quantity of images and knowledge available to create scenes that have never existed, using only a text prompt that describes the scene.


In the examples below, we created images using prompts like “futuristic learner of music technology in a distant galaxy, in the style of Jackson Pollock” using DALL-E or Stable Diffusion. 

Stable Diffusion

Dream Studio is the web app for generating AI Art with Stable Diffusion. You can generate a limited number of images for free, and it’s very simple to get started.


As of yesterday (28th September) you can use DALL-E 2 like you would Dream Studio, without a waiting list. 


Some other ML-powered tools and resources

  • Midjourney, AI image generator
  • RunwayML, video editing
  • Descript, text-based video editing (we use it and love it)
  • Two Minute Papers, YouTube channel, exploring developments in the fields of AI
  • Dreambooth, a GitHub implementation of Google’s Dreambooth with Stable Diffusion
  • Jasper (not sure what they do, but they advertise a lot)
  • We’ll add more here; let us know your favourites on Discord.


What about copyright?

There are still issues of ownership that need to be addressed. AI Art generators use materials available on the web to train their models, which might include copyrighted work. While it is perfectly fine to prompt a drawing in the style of a famous painter, it is less clear what will happen with copyrighted materials.

In the example above, we’ve used the prompt “futuristic apple logo” in DALL-E and Dream Studio. DALL-E strayed significantly further from the original Apple logo than Stable Diffusion did. At this point, it seems that we would get into trouble if we claimed that the Stable Diffusion render of the Apple logo was our own. Read more about legal issues on Silicon Republic.

What’s the technology behind this?

Essentially, Machine Learning is the glue that binds the images, the interpretation of the text and the render together. AI Generators have trained their models on millions of images (a reported 650m in the case of DALL-E, according to an OpenAI paper), learning to recognise painters, trends and styles (such as pop art or comics).

Google is credited with the first AI Generator, DeepDream, with its recognisable Van Gogh renderings featuring bizarre eyes and animals blending into the picture. They’ve published several papers on the technology, including ongoing work on Imagen, which isn’t publicly available yet but is claimed to produce the most photorealistic results.

What about Generative Music? 

Generative Music has been a research topic for quite some time, and Machine Learning is already making its way into commercial plug-ins and applications. It doesn’t feel like quite as much of a breakthrough as AI image generators, though.

The example video above, Daddy’s Car, is the outcome of a research project by Sony led by François Pachet (who’s now a Research Director at Spotify). The melody for the parts and vocals of the song was created entirely by training a model on the Beatles catalogue, so that the song resembles something the Beatles might have written. Lyrics were written on top of the music, and it was recorded in the studio by a band.

Many of these techniques have now found their way into creative workflows, to create drum patterns, melodies, arpeggios and more. 
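
To give a feel for what “training a model on existing music” can mean, here is a minimal sketch in Python. It is not the method behind Daddy’s Car, which relied on far more sophisticated research tools. It is a toy first-order Markov chain, one of the oldest generative music techniques: it records which note follows which in a training melody, then walks those transitions at random to produce a new one. The training melody (MIDI note numbers) and the function names are made up for illustration.

```python
import random
from collections import defaultdict

# Hypothetical training melody as MIDI note numbers (60 = middle C).
# Invented for this example, not taken from any real song.
TRAINING_MELODY = [60, 62, 64, 65, 67, 65, 64, 62, 60, 64, 67, 72, 67, 64, 60]


def train_markov(melody):
    """Record, for each note, the list of notes that followed it."""
    transitions = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)
    return dict(transitions)


def generate(transitions, start, length, seed=None):
    """Walk the transition table to build a new melody of `length` notes."""
    rng = random.Random(seed)  # a fixed seed makes the result repeatable
    note = start
    result = [note]
    for _ in range(length - 1):
        choices = transitions.get(note)
        if choices:
            # Notes that followed more often appear more often in the list,
            # so they are proportionally more likely to be picked.
            note = rng.choice(choices)
        else:
            # Dead end (a note that was never followed by anything): restart.
            note = start
        result.append(note)
    return result


if __name__ == "__main__":
    model = train_markov(TRAINING_MELODY)
    print(generate(model, start=60, length=16, seed=1))
```

Every step in the generated melody is a transition that occurred somewhere in the training melody, which is why the output tends to sound related to the source material. Swapping the note numbers for drum hits or chord names gives pattern generators in the same spirit.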

How to get started?

Many artists who teach here at Music Hackspace are using Machine Learning in their work, and we have a growing collection of courses to help you get started with generative music and AI Art. Browse below for a selection of courses on generative music and generative art.