
you know all this neural net nonsense? you can do it with audio too apparently...



I'm not sure how it would all work. Apparently it's a LOT of work, listening to millions of songs (?), then maybe you can feed it white noise and it will make music?
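For what it's worth, the "feed it white noise" part is roughly how the image version works: you freeze a trained net and do gradient ascent on the input until some unit fires hard. A toy stdlib-only sketch with a single made-up "neuron" standing in for the net (everything here is illustrative, not real DeepDream code):

```python
import random

random.seed(0)
N = 64
# one fixed random "neuron" standing in for a whole trained network
w = [random.gauss(0, 1) for _ in range(N)]

def activation(x):
    # the unit fires on inputs that resemble its weight pattern
    return sum(xi * wi for xi, wi in zip(x, w))

def dream(x, steps=200, lr=0.01):
    # gradient ascent on the INPUT (not the weights):
    # d(activation)/dx = w, so each step nudges x toward the pattern
    for _ in range(steps):
        x = [xi + lr * wi for xi, wi in zip(x, w)]
    return x

noise = [random.gauss(0, 0.1) for _ in range(N)]  # "white noise" start
dreamed = dream(noise)
```

After enough steps the noise has been pulled toward whatever the unit "likes"; with audio, that output would then need to be played back somehow.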

 

 

or zoom in on a track (slow it down?), having only taught it amen breaks, and it starts to sketch something out.

Where's Autechre when you need them

  Beethoven, ages ago, said:

To play a wrong note is insignificant. To play without passion is inexcusable

Could do it with spectral/FFT synthesis? It'd probably sound like low-quality mp3s though.

i.e. feed it spectrogram images of certain sounds and try to get it to extrapolate from there. Not sure if it'd be useful, but probably worth a try

Edited by modey
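The spectrogram route is easy to prototype: slice the audio into frames and take the magnitude of each frame's DFT, which gives you the "image" you'd feed to the image code. A dependency-free sketch (a real pipeline would use numpy/librosa with a proper FFT, windowing, and overlap; the sample rate and frame size here are arbitrary):

```python
import math

SR = 8000      # sample rate (illustrative)
FRAME = 256    # samples per spectrogram column

def dft_mag(frame):
    # naive magnitude DFT -- one spectrogram column (slow but stdlib-only)
    N = len(frame)
    return [abs(sum(frame[n] * complex(math.cos(2 * math.pi * k * n / N),
                                       -math.sin(2 * math.pi * k * n / N))
                    for n in range(N)))
            for k in range(N // 2)]

# quarter-second 440 Hz test tone
tone = [math.sin(2 * math.pi * 440 * n / SR) for n in range(SR // 4)]
spectrogram = [dft_mag(tone[i:i + FRAME])
               for i in range(0, len(tone) - FRAME + 1, FRAME)]

# the tone should peak near bin 440 / (SR / FRAME), i.e. around bin 14
peak = max(range(len(spectrogram[0])), key=spectrogram[0].__getitem__)
```

Going back from a dreamed spectrogram to audio is the hard part (phase is lost), which is probably where the "low-quality mp3" character would come from.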

 

Can this be done on audio? Video?

Yes. To make a video, you can run the code on each individual frame of the video and then stitch the frames back together afterwards, though more efficient approaches are discussed in this thread. The best resource for learning about this is here: https://github.com/graphific/DeepDreamVideo
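The frame loop itself is simple to sketch, and one of the "more efficient ways" is DeepDreamVideo's trick of blending each raw frame with the previous dreamed frame to cut flicker. Here `process_frame` is a hypothetical stand-in for the actual dream step (the real repo extracts frames with ffmpeg and runs Caffe on each one):

```python
def process_frame(frame):
    # hypothetical stand-in for the DeepDream pass; here it just
    # brightens every pixel so the pipeline has something to do
    return [[min(255, px + 10) for px in row] for row in frame]

def dream_video(frames, blend=0.5):
    # DeepDreamVideo-style loop: mix each raw frame with the previous
    # dreamed frame before processing, reducing frame-to-frame flicker
    out, prev = [], None
    for frame in frames:
        if prev is not None:
            frame = [[int(blend * a + (1 - blend) * b)
                      for a, b in zip(ra, rb)]
                     for ra, rb in zip(frame, prev)]
        prev = process_frame(frame)
        out.append(prev)
    return out

video = [[[0, 100, 200]] for _ in range(3)]  # three tiny 1x3 "frames"
out = dream_video(video)
```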

If you wish to make one of those zoom-into-an-image-really-far gifs like this one then you should follow the guide here: (TODO: guide link)

To perform this on audio you really need to know what you are doing. Audio works better with RNNs than CNNs, and you will need to create a large corpus of simple music to train your RNN on.
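To get a feel for the train-then-generate loop on a "corpus of simple music", here is a deliberately dumb stand-in: a first-order Markov chain over quantized note values instead of an RNN. The workflow (train on a corpus, then sample the model) has the same shape; everything here is illustrative:

```python
import random

random.seed(1)

def train(corpus):
    # record which value follows which -- the Markov "model"
    model = {}
    for a, b in zip(corpus, corpus[1:]):
        model.setdefault(a, []).append(b)
    return model

def generate(model, seed, n):
    # sample the model one step at a time, like an RNN generating audio
    out = [seed]
    for _ in range(n):
        out.append(random.choice(model.get(out[-1], list(model))))
    return out

# "corpus of simple music": a looping 4-step pattern of quantized levels
corpus = [0, 3, 1, 2] * 50
model = train(corpus)
melody = generate(model, 0, 16)
```

Because the toy corpus is a strict loop, the model just reproduces it; an RNN trained on real music would capture much longer-range structure than one-step transitions.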

https://www.reddit.com/r/deepdream/comments/3cawxb/what_are_deepdream_images_how_do_i_make_my_own/

 


Yeah, I wanna hear what this would sound like so bad. Any guess how soon we'd get to hear an example? Surely someone's gonna give it a go.

 

Even after reading the thread on the image versions of this, I still don't really get what's going on, so I can't really judge how difficult it'd be.

very...?

 

 

You have to let the net listen to millions of songs, apparently. It's harder than images, according to one article I read. Even if we could somehow get the software and/or be able to run it..


A DNN is just meant to recognize patterns, not output them. All they're doing with images is amplifying certain levels of abstraction to see what the neurons are picking out. With audio, what you'd start out with is a bunch of single notes at layer 1, and then at, say, layer 12 you'd have the complete Mozart symphony or whatever, which means you'd have to feed it the entire symphony (or enough of it to be unique in the system) for those neurons to fire and recognize the pattern. It's the same thing as seeing a dog or whatever in an image. A neural net is basically a set of constraints that trigger only when the right inputs are presented, from simple, short-scale features up to big complex objects.
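That layers-of-abstraction picture can be put into rough numbers with receptive-field arithmetic: if each layer halves the time resolution, the stretch of audio a single unit "hears" doubles per layer. The kernel/stride choices below are made up for illustration, not any real architecture:

```python
SR = 44100  # CD-quality samples per second

def receptive_field(layer):
    # with kernel 2 / stride 2 at every layer, a unit's input span
    # doubles each layer: 2 samples at layer 1, 4 at layer 2, ...
    return 2 ** layer

# seconds of audio one unit spans at a few depths:
# layer 1 sees a couple of samples, layer 12 about 0.09 s (a note
# onset), layer 22 about 95 s (a whole section of a piece)
spans = {layer: receptive_field(layer) / SR for layer in (1, 12, 22)}
```

So "single notes at layer 1, the whole symphony at the top" is plausible in principle, but the top layers need very deep (or dilated/recurrent) stacks to span minutes of audio.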

 

So with audio, you'd end up with different-length "audio clips": single notes, chord progressions, entire choruses or even entire tracks, blended together or in a row depending on the code. I don't know anything about programming audio software, so it's a little hard to see how to output the results. Either samples, or some kind of synthesis generator that can output exactly what one wants?

 

See, this is where I'm really interested in DNNs, because it would be nice if the computer could have an understanding of music theory, and beyond that a taste for what's modern or what's old, and then be able to output original music that way. In theory all these abstract thoughts humans have should just be networks of neurons etc. Otherwise you're kind of doing a randomizing effect, unless it stumbles upon some kind of music theory by accident, IDK

Edited by coax

Yeah ^ that's why, when you run the DD, you get the output numbers going up in resolution through the inception layers.

 

would it be simple notes, or would it be more like looking at the wave from afar and then zooming in on it, with realtime being the final inception layer?


I have no clue. I imagine there are many ways to go about it. I mean you could do pure audio, or you could go the "convert to MIDI" route and work with the notes. I'm not sure what would be the best thing to feed it, or what would be the best way to output it when you amplify layers to render them etc. Would be cool to input Autechre as pure audio and then have it put those harmonies and frequencies etc. back together in various layers, but don't ask me how to do it ;e

 

Just to add, I *think* you should be able to have abstraction layers for several dimensions, like frequency, harmony, amplitude etc., which means it's not just a waveform but different aspects of audio.

Edited by coax

Shazam_Music_Signature_Database.torrent - 5,349,309 GB - seed: 5 / leech: 398

 ▰ SC-nunothinggg.comSC-oldYT@peepeeland

  On 4/22/2014 at 8:07 AM, LimpyLoo said:

All your upright-bass variation of patanga shitango are belong to galangwa malango jilankwatu fatangu.

don't look at me guys, took me a day to install deep dream


  On 7/16/2015 at 12:57 PM, lala said:

don't look at me guys

 

staring2.jpg


Guest skibby
lala, on 09 Jul 2015 - 2:18 PM, said:

im not sure how it would all work, apparently its a LOT of work, listening to millions of songs (?), then you can maybe feed it white noise and it will make music?

 

they already did it, it's called jazz

Jazz_Silhouette.jpg

Edited by skibby

Lol I think I'm thinking too far ahead of what this is capable of haha.

 

Cheers peace7, I nearly spat out my drink


  On 7/17/2015 at 7:32 PM, oh2 said:

evidently the audio here was generated by neural net stuff

 

Isn't that the battle music for FF7 on the PC, before it recently got 'fixed' in the Steam release?

I haven't eaten a Wagon Wheel since 07/11/07... ilovecubus.co.uk - 25ml of mp3 taken twice daily.

Yeah, you would think; an image is like audio, as MIDI data is like vectors?


  On 7/20/2015 at 11:16 AM, mcbpete said:

 

  On 7/17/2015 at 7:32 PM, oh2 said:

evidently the audio here was generated by neural net stuff

 

Isn't that the battle music for FF7 on the PC, before it recently got 'fixed' in the Steam release?

 

No, it's the latest Liturgy album.

  • 3 weeks later...

Given that streaming services can automatically recognise copyrighted audio when you upload it, why is it that we don't have a reverse sound search equivalent of reverse image search?
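That recognition step is audio fingerprinting, and a reverse sound search would be the same machinery pointed the other way: hash landmarks from the spectrogram (Shazam's published scheme pairs up spectral peaks), index the hashes, then look up a query clip's hashes. A toy stdlib-only sketch; the frame size and peak-pair scheme are heavily simplified illustrations:

```python
import math

FRAME = 128  # samples per analysis frame (illustrative)

def peak_bin(frame):
    # dominant frequency bin of one frame via a naive DFT, skipping DC
    N = len(frame)
    mags = [abs(sum(frame[n] * complex(math.cos(2 * math.pi * k * n / N),
                                       -math.sin(2 * math.pi * k * n / N))
                    for n in range(N)))
            for k in range(1, N // 2)]
    return 1 + max(range(len(mags)), key=mags.__getitem__)

def fingerprint(samples):
    # Shazam-style landmarks, crudely: pairs of successive spectral peaks
    peaks = [peak_bin(samples[i:i + FRAME])
             for i in range(0, len(samples) - FRAME + 1, FRAME)]
    return set(zip(peaks, peaks[1:]))

def tone(freq_bin, frames):
    # synthetic test signal whose peak sits exactly on freq_bin
    return [math.sin(2 * math.pi * freq_bin * n / FRAME)
            for n in range(FRAME * frames)]

track = tone(5, 4) + tone(9, 4)   # the "song" in the database
query = tone(5, 2) + tone(9, 2)   # a short clip of the same song
noise = tone(20, 4)               # unrelated material
```

A matching clip shares hashes with the indexed track while unrelated audio shares none; a real system adds time offsets to the hashes and votes on the alignment.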
