From New York Times Opinion, this is the Ezra Klein Show. We are already awash in crappy AI content. Some of it is crappy commercial AI content that wants to sell you things. Some of it is crappy AI art. And amidst all this complaining, it got me interested: what does it mean right now to be making good AI art? And so I read this profile of the AI artist and musician Holly Herndon in the New Yorker, and then separately this DJ I met mentioned her work to me.
So I figured I should check this out. And so I went and listened to her 2019 album, Proto, which was done alongside an AI voice trained on her voice and others. And I was walking to work when the song Fear Uncertainty Doubt came on. I just stopped walking. What makes so much AI art so bad, in my opinion, is that it's so generic. These are generative systems. We keep calling them generative, but generative is such a strange word for it. When we use that term,
it usually means it helps you get somewhere new. But these systems are mimics. They help you go somewhere old. They can help us write or draw or compose like anyone else. But I find it much harder, when using them, to become more like yourself. And most of what I see coming out of people using them is riffing on others in this very obvious way. What I like about Herndon's art is that she uses AI to become weirder, stranger, more uncanny, more personal. It's going in the exact opposite direction.
And some of her art questions the entire way these systems work. She and her partner Matt Dryhurst did this project at the Whitney Biennial this year, where they created an image generator based on images of Herndon, or at least what the AI system seemed to think she looked like, which is she's got this very striking copper hair. And so the way it understood her was really around the striking copper hair. She is, as she put it, a haircut.
And so they manipulated these images, and they made this AI system where anybody can generate any image in the style of what AI systems think Holly Herndon is. So you can generate an image of a house, and it'll have this long, flowing copper hair. And it'll tag itself as an image of Holly Herndon. And because it's in the Whitney Biennial, these images have a sort of authority in the way these AI scrapers work.
And so as they are scraping the internet for images in the future, she is potentially poisoning their idea of what she is. She is taking control over the AI's idea of Holly Herndon. I find that fascinating: AI art that is acting as a kind of sabotage of AI systems, and of the lack of voice we have in how we appear in them. Along with a bunch of collaborators, Herndon has a lot of projects trying to blaze a trail toward not just good AI art, but fair AI economics and ethics.
And so I wanted to have her on the show to talk about it. As always, my email is ezrakleinshow at nytimes.com. Holly Herndon, welcome to the show. Thanks, it's great to be here. So something I find fascinating about you is that you grew up singing in church choirs. Then you moved to Berlin after college and got deep into Berlin techno. And I think those are, respectively, the most human and the most inhuman forms of music that human beings make. So how did they shape you?
Yeah, that's a really good question. I mean, I feel like I'm such a product of the environments that I've spent a lot of time in. So I'm really interested in folk singing traditions coming from East Tennessee, of course, growing up in a town next to where Dolly Parton's from. She always loomed large. Then I spent a lot of time in Berlin, and so, of course, electronic music and techno have played a really big part in my story.
And then also moving to the Bay Area, where I got really deeply interested in technology. I feel like even though techno does have a kind of synthetic palette and maybe does sound inhuman, the rituals that happen around the music are very human and very sweaty and very embodied. So I think if you experience that culture in person, it feels less inhuman. But why does that magic happen? So I was in Berlin, and I was down in the sort of big room in the bunker.
That's the way it felt to me, anyway. And I always say the music felt like being inside of a machine gun, but in a good way. It was actually the most inhuman music I've ever heard, and I like electronic music. But meanwhile, as you say, what's happening around it is so human. I mean, all these people engaged in this most physical, sweaty, smelly ritual of dancing together.
How do you understand both the meaning and the function of it? Why does music like that create that kind of transcendence? I mean, this might sound strange, but music is a kind of coordination technology. So a 4-4 techno beat is maybe the clearest communication of that. It's so easy to participate in. It's fairly easy to make. It's also fairly easy to dance to and understand.
So I feel like, if I want to call it a kind of protocol, it's an easy way to communicate what to do in that scenario. So I think that that's why people have organized around it so much. When I go out and listen to the further reaches of techno in Berlin, or in New York where I live, I'll often find myself, late at night, thinking: every piece of sound in this music is a choice. And when that choice sounds very artificial, right?
When it sounds like something so removed from somebody playing strings or somebody singing, I think: this person wanted to communicate in this extraordinarily machine-like way. And this has been happening for a long time, I mean, talk boxes and synthesizers and all these technologies. And I'm curious, as somebody who's made some of that music, or is at least deep within the culture that has made it: what is appealing about that? I mean, you said it creates this very sweaty human ritual.
But first there is this transition of the person into something that does not sound like people. It sounds like music that robots might make. It sounds like music from a faraway culture. Hmm. Maybe there's something about living in such a technologically mediated world that makes us want to find how we fit into that as humans. And music is such an innate part of being a human.
I mean, as a performer on the laptop, I was always trying to find a way to make the laptop feel really embodied, because at the time when I started performing a lot, there was this criticism that, oh, you could be checking your email, this doesn't really feel like a live performance. So I started using my voice as a kind of input stream.
And the thing that I found really liberating about using my voice in that way is that I could do anything to digitally manipulate my voice, to make it be so much more than it is physically. But what I really enjoyed was using my voice as a kind of controller or data stream. And then it could do things that I couldn't imagine; once I put it in the laptop, I was able to process it in specific ways.
So there's something about trying to come to terms with the systems around us by working through them and working with them, collaborating, that maybe helps us understand where we sit in that feedback loop. So in a minute, I want to play a clip of a piece of music you made, but first I want to talk about how you made it. So tell me about Spawn. Spawn was a kind of AI baby experiment. Proto was released in 2019, and Spawn came about two years before that.
So at the time, you know, it was a very different time, especially for audio; a lot of the visual models were developed earlier. But eventually things got better. We started playing with a project called SampleRNN and some other software. And you'll still hear, from the stems that we might play later, the vocal quality, the sound quality, from 2017, 2018.
You know, to me, it sounded like the really early recordings that you can find on YouTube, I think of the earliest audio recording. It sounds really scratchy and super low fidelity. That's what the audio sounded like back in the day. And so it was this real issue of trying to get the high-fidelity recordings that I was doing with my ensemble in the studio to live in the same universe as this really scratchy, lo-fi audio that I was generating through Spawn.
Why don't we play a bit of that, because you kindly shared the stems for the song Swim. And maybe we should start here by playing the ensemble, the sort of chorus you brought together to sing for the album. That's really beautiful and really human. And now, on the other side, I want to play the Spawn track on its own. What am I hearing when I hear that somewhat nightmarish Spawn there?
So Spawn was trained on the voices of the ensemble. And back then, we couldn't deal with polyphony, which means more than one note at a time. So what we had to do was break the music into individual lines. And then we would feed each line to Spawn, who would then sing it back through the voice of our ensemble. And I think we were feeding it through with either a voice synthesizer or a piano. I can't remember. It's been so long.
So we basically used this idea, which is called timbre transfer. That's where the computer learns the logic of one sound and kind of superimposes that onto the performance of another. So that's what we did. We had the ensemble sing a variety of phrases. We trained Spawn on their voices. And then we did a timbre transfer. We fed her the line that we wanted her to sing. And then she sang it back to us.
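The intuition behind timbre transfer, keep the melodic line, swap in a different sonic character, can be sketched in a toy way. This is emphatically not the neural model Herndon used; the sample rate, harmonic amplitude profiles, and note values below are all invented for illustration.

```python
import math

SR = 16000  # sample rate in Hz (an arbitrary choice for this sketch)

def synthesize(pitches_hz, harmonic_amps, note_dur=0.25):
    """Render a melody as a sum of harmonics.

    pitches_hz    -- the melodic line, one fundamental frequency per note
    harmonic_amps -- relative harmonic amplitudes, a crude stand-in for
                     the 'timbre' a model would learn from a voice
    """
    samples = []
    for f0 in pitches_hz:
        for n in range(int(SR * note_dur)):
            t = n / SR
            # Sum the harmonics; the amplitude profile carries the timbre.
            s = sum(a * math.sin(2 * math.pi * f0 * (k + 1) * t)
                    for k, a in enumerate(harmonic_amps))
            samples.append(s)
    return samples

# 'Timbre transfer', toy version: same melody, two harmonic profiles.
melody = [220.0, 246.9, 261.6]       # the line fed to the system
flute_like = [1.0, 0.1, 0.02]        # mostly the fundamental
voice_like = [1.0, 0.6, 0.4, 0.25]   # richer upper harmonics

dry = synthesize(melody, flute_like)
sung = synthesize(melody, voice_like)
print(len(dry), len(sung))  # same melody length, different timbre
```

The point of the sketch is only the separation of roles: the melody is the score Spawn was fed, while the learned amplitude profile plays the part of the ensemble's voices.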
And hearing that, one question you could have is: well, what do you need Spawn for? Why not just have a human being sing into a talk box, or use a synthesizer or Ableton? We can make people's voices sound strange already with Auto-Tune. What is the value of Spawn here? I think overall Spawn has a unique timbral quality that I actually really love, because it really is a snapshot in time.
It doesn't sound like that anymore. It sounds really clean and, yeah, really high fidelity. But I almost have this romanticism around that, almost like a vinyl hiss or a pop, for that very particular period of time in machine learning research. But also, I felt like I really needed to be making my own models and dealing with the subject directly in order to have a really informed opinion about it.
And I'm really glad that I made that decision, because it's informed so much of the work that I do today. Even the very basic understanding that a model's output is so tied to the training data, the input. I don't know that I would have come to the profundity of that had I not been training my own models, and that's really informed all of the work that I've done since then. So I think sometimes you just have to deal with the technology in order to make informed work around that technology.
We're going to come back to the profundity of that, because I actually think it's really important. But I want to do two things before we do. One is to play a bit of the full song Swim, so people can hear where this ended up. And then I want to play something you just released more recently using, I don't know if you'd call this an updated Spawn, but what you're calling Holly+, which is this much more modern voice model trained on your voice, that you had cover Dolly Parton's Jolene.
So obviously the unearthly quality is gone. What am I hearing? Who, what, is singing? So that is a voice model trained on my voice. I worked with some researchers in Barcelona at a studio called Voctro Labs at the time, and Holly+ was born. And as you can hear, it's leaps and bounds better, higher in fidelity and ability, than little Spawn. So basically, that version of Holly+, there are multiple versions, there's a version that can be performed in real time, but this particular version is a score-reading piece of software. So I basically just write out a score with the text written out in phonemes, and then the software spits out a basically pitch-perfect performance of that song. And then we have Ryan Norris playing a beautiful human guitar accompaniment.
That's a use case that I'm fascinated by, and that I imagine will become more and more common in the future: a model trained on a person, that one can almost autonomously create with, as if it were that person. You can imagine a model that generates questions I could ask somebody, or a model trained on all of my columns that can spit out an op-ed. What is your relationship with that? Do you see it as an extension of what you can do? Do you see it as a kind of partner you can collaborate with? Or is it just some version of you that lets you scale?
Because you can't take commissions to sing from everybody out in the public, but they can all go to Holly+ and get it to sing on their behalf. So what is your relationship with this nascent other you, or at least other voice of you, that now exists in the world? I think I'm probably an outlier in my relationship here, because my practice involved so much vocal processing. If you listen to Movement or Platform, the albums before Proto, before I started working with machine learning, I was already taking my voice and kind of mangling it beyond recognition, turning it into a machine itself. And so to make a model of my voice felt like the natural next step in an already very highly mediated process with my voice. I don't expect everyone to have that relationship. I don't really see the Holly+ voice as something that replaces me in any way.
It's something I have fun playing with. I can attempt to perform things that I wouldn't normally be able to. You know, I did a performance with Maria Arnal in Barcelona, and I mean, that music is so difficult to perform, I could never sing that. She can do all of these amazing melismatic diva runs that I could never dream of, but my voice model could do it, and that was really fun. And it didn't confuse me to think, OK, I can do that now. It was more just fun to hear myself do something that I know I couldn't do alone, acoustically. So I guess for me, it's maybe like an extension or an augmentation of my own self. So what did Holly+ add to that cover of Jolene? I mean, you could have just sung a few tracks of harmony and added them above the melody.
So what does the AI mean to you in it, specifically? Well, I think that one is perhaps a little personal, because growing up in East Tennessee, Dolly Parton was kind of the patron saint of that region. And the kind of music that I usually perform has very heavily processed vocals and is usually a bit more abstract than a Dolly Parton song.
It was almost like I wouldn't afford myself that, or allow myself that, but I would allow Holly+ to do it, because there is this level of removal. It's almost like Holly+ can perform things that I would be too bashful to perform myself. Oh, that's really interesting, the idea that having another version of yourself out there could give you license to try things you wouldn't otherwise try. Yeah, like Jolene. I mean, I love Jolene as a project, but it doesn't have the same ghostliness and quality as the music on Proto, which is why I didn't release it as an album. You know, it's just not as interesting somehow. I guess the other thing, there's a question of meaning here that I've been circling in my own playing around with AI. I spent a bunch of time recently creating sort of AI friends and therapists, and trying to understand the relational AIs that you can build now.
And on the one hand, I was amazed at how technically good a lot of them were. At the same time, I find I never end up coming back. I find it very hard to make the habit sticky, or the relationship sticky. When I sit with my friend or my partner, the fact that they are choosing to be there with me is separate from the things that they're saying. And an experience I'm having with a lot of AI projects is that the output is pretty good, right? Holly+ sings really well, or the therapist friend I made on Kindroid texts in a way that, if you had just shown me the text, I would not know it's not a human being. But there's the absence of the meaning that another person brings. The fact that I know it's Holly+: it's a cool project, but I'm not going to keep listening to it.
The fact that I know the Kindroid can't not show up to talk to me, that it's a relationship I control totally, robs the interaction of meaning in a way that makes it hard for me to keep coming back to it. And so, as somebody who works a lot with the question of meaning, and sees a lot of these AI efforts happening, how do you think about what imbues them with meaning, and in what cases they end up feeling hollow?
It's really funny. We did a live performance of Proto, I guess in 2019, in New York, and we had the ensemble on the stage. And afterwards, someone came up to me and they said: I really enjoyed the show, but I don't understand what it has to do with AI. And actually, that was the biggest compliment that I could receive, because I wasn't trying to project this kind of super-future, AI high-tech story. I was trying to show all of the human relationships and the human singing that go into training these models. That's something I was really trying to get at with that album: some of the things that the computer can do, some of the coordination that it can do, is remarkable, but it can also free us up to just be more human together, to really focus on the parts that we really, really want to focus on, which is just enjoying that moment of singing on stage together.
I'm also not so interested in necessarily having an AI therapist. That's not what I find interesting or compelling about the space. I'm interested in exploring some of the weirdnesses, and how we as a society define different things. That's the kind of stuff that I'm interested in, not having a kind of AI chat pet. You've also said that with AI, the art is the model, not necessarily the output of the model.
Yeah, that's one thing that we're exploring quite a bit. So one of the potentials around machine learning is that you're not limited to just a single output. You can create a model, whether that's of my own singing voice or of my own image or likeness, and you can allow other people to explore the logic of that model and to prompt their way through your world. So it's almost like inviting people into your subjectivity, or inviting someone into the video game of your art. So I think it has a lot of potential to be interesting in a kind of collaborative way with your audience. One term that we're often using is protocol art: basically understanding that any work that's made is a kind of seed for infinite generations. So we're trying to lean into that. For example, we made a sculpture, a project called Ready Weight.
We also make it available as a package, with an embedding and a LoRA and all the tools that anyone would need to be able to explore that sculpture in latent space. Or, you know, when we made the model of my voice with Holly+, we made that publicly available, so anyone could make work with it. So that's the example of protocol art, where it really becomes a collaborative experience between myself and the people who are engaging with my work. And in a way, art is kind of always a little bit like that. It's a conversation between the work that you're making and the viewer, or the recipient. But that becomes a little bit more complicated and fun, I think, in an AI world. You wrote something in 2018 that I think is worth exploring, where you said that AI is a deceptive, overabused term, and collective intelligence is more useful. Why?
Because I really do see it as a kind of aggregate human intelligence. It's trained on all of us. Specifically, when you look at music, it's trained on human bodies performing very special tasks. And I think it does humans a great disservice to try to remove that from the equation.
I think that's why I like to draw a parallel also to choral music, because I see it as a kind of coordination technology in the same lineage as group singing. I think it's a part of our evolutionary story, and I think it's a great human accomplishment that should be celebrated as such.
I want to explore what changes when you emphasize the collectivity of these models, the fact that they are in some ways an aggregate of all of us, versus the artificiality of them, right? Artificial intelligence, which really emphasizes: no, there's something that somebody has written into software over here, they're unearthly, they're a new kind of thing.
And one thing is actually, I think, economic: there's this whole question about who gets compensated, who's going to make the money off of this, and what all this training data is going to end up doing economically. And it does seem very different to me if you understand these as, on some level, a societal output, something built on a kind of commons, as opposed to a tremendous leap and feat of technology that is the individual result of software geniuses working in garages and office parks somewhere. Yeah, I mean, that basically summarizes the work that I've been doing for the last several years, kind of shouting that from the rooftops. Because if you see it through that lens, it becomes something really beautiful and something to be celebrated, and also something that's not entirely new. You know, we've been embarking on collective projects for the entirety of our humanity, to make things that are bigger than ourselves. And so if we can find a way to make that work in the real world, with the kind of future of the economy, then yeah, I think it behooves us to figure that out. But the not-entirely-new part feels important to me. The degree to which this is all a continuum often feels underplayed in conversations about AI, about the future of work, about humans and machines.
But there's also a way in which you see the AI companies using this argument to say that they should be given much more free rein, and much fuller profits, over the products of these models. Because they say: look, we're not doing anything different than any other artist, or anyone, ever has. Scientists today work off of the collective body of knowledge of the science before them. You know, Holly Herndon is influenced by folk music and choral music and German techno, and everybody is always absorbing what has come before them and mixing it into something new. That's all we're doing. We're not doing something new. We're not making a copyright infringement. So how do you understand the effort to use the collectivity, right, the fact that human beings have always been engaged in collective projects, when we do give people a lot of individual ownership and authorship over their works? What might be different here in the scale and the nature of what these models are doing?
So, okay, I think that there's a middle ground that can work for everyone, that can allow people to experiment and have fun with this technology while also compensating people. So spawning is a neologism that I like to use to describe what's happening here. It's a 21st-century corollary to sampling, but it's really distinctly different, and that difference, I think, is really important.
It's different in what it can do, and also in how it came about. What it can do, we've kind of gone into already: you train a model on the logic of one thing to be able to perform new things through that logic. So it's distinctly different from sampling, which is really a one-to-one reproduction of a sound created by someone else, that can then be processed and treated to make something new. But with spawning, you can actually perform as someone else based on information trained about them. So that's distinctly different.
But also in the way that it comes about. With sampling, it's this one-to-one reproduction. With spawning, it's a little bit more of a gray area in terms of intellectual property, because you're not actually making a copy. The machine is ingesting that media, if you want to call it that: looking at it, reading it, listening to it, learning from it. So I kind of land in, I like to call it, the sexy middle ground
between people who are all for open use of everything and people who want really strict IP lockdown. And that's one of the reasons why spawning then mutated even further into an organization, Spawning, which I co-founded with three other people, Matt Dryhurst, Patrick Hoepner, and Jordan Meyer, to try to figure out this messy question of, essentially, data manners. How do we handle data manners around AI training? Because what's happening right now isn't working for everyone.
Are there experiments that you find exciting, or that you've conducted and found the results promising?
Yeah, I mean, I think Holly+ was a really fun experiment, because people then actually used my voice, and we were able to sell some works through that and generate a small profit, enough to be able to continue to build the tools for the community. So that was a fun experiment that I think really worked. And there's one experiment that I'm running right now that I'm really excited about. My partner Matt Dryhurst and I have an exhibition at the Serpentine in London in October. And as part of that, we are recording choirs across the UK, I think there's 16 in total, and they're joining a data trust. We've hired a data trust to pilot this idea of governance, where we're trying to work out some of the messy issues around how a data trust might work. And then we'll negotiate with that data trust directly as to how we can use their data in the exhibition and moving forward.
I think it's a really fun experiment. And also, because it's singing and it's choral music, it's not really sensitive health data, so we can really experiment and try out different ways to make this work without dealing with such sensitive information. So I'm really excited to see how people engage with that. And, you know, how much do people really want to deal with the kind of day-to-day governance of their data? That's also a big question.
You were saying earlier that often the model is the art, but in this case, the governance is the art. You know, in this case, I think the model and the governance and the protocol around it are all the art. This idea of control is interesting, though. I mean, it came out a while ago that Facebook, that Meta, had been training its AI on a huge cache of pirated books.
And I think my book was in there, my wife's book was in there, the books of virtually everybody I know were in there. And so a bunch of authors sued. And some part of me wanted to be paid for my inclusion, but I didn't want to not be included, either. And it reminds me of social media, where at a certain point, whether you wanted to be on social media or not,
it was sort of important that you had something representing you there, right? It could be not your real photo; you could have some control over it. But if you didn't do it, then you had absolutely no control over what you appeared as online. And it probably wasn't plausible that you could appear as nothing online, so maybe something you didn't want would be your top Google search result. And here it's going to get even weirder, because you can't really have your homepage in the artificial intelligence model. All you are is training data.
And so there's something very strange about this. If before, all you were was a kind of profile, which was a very flat version of you, now you're training data, which is a very warped version of you. And there's this question of how you have any control over that data, if you want to participate but you want some definition over how you participate.
There's no real obvious avenue toward that. There's none at the moment, but I think that that's coming. I think people will opt in under terms that they feel comfortable with, to be able to shape the way that they appear in this new space. I don't think it's tenable that people have no agency over how they appear in the future of the internet.
That feels idealistic to me. I feel like we've been going through it for a long time, where I would have said this level of data theft or use is not tenable, the level of surveillance is not tenable, the level of flattening in the way we treat each other on social media, it doesn't feel like this is going to hold.
I am amazed that people are still on X, as hostile as that platform has become to many of them. It's just so impossible to imagine leaving that they will accept something they really feel angry about. They really feel like the way it is run is hostile to them, that it is degraded. But, you know, what are you going to do? I'm amazed at how powerful the what-are-you-going-to-do impulse is in life.
Well, I mean, I totally get that. But what we decided to do was to try to build a universal opt-out standard, and it's actually gaining traction; there's precedent in the EU AI Act. Ideally, it would be something there from the beginning: all training data would have been opt-in, people would have been asked permission from the outset. But that's not how things played out. So now we're in a position where we're building tools where people can really easily opt out the data they don't want to have included in these models. We have an API where you can install that on your website and easily have everything on your website not be included in crawling. So I do think that there are things we can do. It requires a little bit of legislation, it requires a little bit of diplomacy, but I don't think that we should just throw up our hands and say, okay, it's over. If we are able to get the opt-out as a kind of standard, then I think you can start to build an economy around an opt-in. Something that I'm really proud of: we just announced Source.Plus, and I'm not trying to shill here, but I think this is a really important part of this conversation. We put together a data set of all public-domain data, and it's huge, and people should be training their base models there. And then you can allow people to opt in to fine-tune those models and create an economy around that. If you have a public-domain base-layer model, then you can actually create an economy around it. But I don't think we should give up.
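Spawning's actual API isn't shown in this conversation, so here is only a loose sketch of the opt-out idea: a site that has opted out declines requests whose user-agent matches a known AI training crawler. GPTBot and CCBot are real, publicly documented crawler names; the function, the list, and the substring-matching rule are illustrative assumptions, not Spawning's implementation.

```python
# User-agent substrings of some publicly documented AI training crawlers.
# This list is illustrative, not exhaustive.
AI_CRAWLERS = ("GPTBot", "CCBot", "Google-Extended")

def allow_request(user_agent: str, opted_out: bool) -> bool:
    """Return False when the site has opted out and an AI crawler calls."""
    if not opted_out:
        return True
    return not any(bot.lower() in user_agent.lower() for bot in AI_CRAWLERS)

print(allow_request("Mozilla/5.0 (compatible; GPTBot/1.0)", opted_out=True))
print(allow_request("Mozilla/5.0 (Windows NT 10.0)", opted_out=True))
```

In practice this kind of signal is usually expressed declaratively, for example through robots.txt rules under the Robots Exclusion Protocol, rather than as server-side code; the sketch only shows the decision being made.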
I definitely don't think we should give up. For a lot of people, they need to make a living out of the work they're doing. Yes. One thing that I find inspiring about the idea of thinking of it as a collective intelligence is that it maybe points a way toward modes of collective ownership, or modes of collective compensation. At least in the space of art, when you're thinking about this idea that you might have your voice out there for anybody to use, I think for a lot of people that's scary, right? I mean, we're very used to business models that are about: nobody can use this thing of mine unless they pay me. We have patents, we have copyrights. What does that spark for you? What are the ways to do this in a more collective, open-source way that you think might work, to make it possible for people to live but also to create? Well, I think first and foremost, it should not be a one-size-fits-all solution. I mean, we're talking about art, and that encompasses so many different practices that function economically in so many different ways. That's something that was really devastating, I think, when it came to streaming. Streaming was really revolutionary, and
it was wonderful for a lot of people but it was really devastating for a lot of other people because everything had to have the same economic logic as pop music and a lot of experimental music doesn't follow that per play valuation logic lot of experimental music is about the idea and you just need access to that idea once you don't need to listen to it on repeat and so if the access to that idea cost a fraction of a cent that's going to be really difficult to pay for it.
It's almost more you almost need more like a movie model where you pay a little bit more to gain access to that idea I think what's really needed is that people have the ability to create whatever sub cultures and whatever kind of economic models work for their sub cultures and aren't squeezed into a kind of sausage factory where everything has to follow the same logic.
So I know you and your partner are working on a book for a forthcoming exhibition that has, I think, the most triggering possible title to people in my industry: all media is training data. What's the argument there?
Yeah. So this is a book that's a series of commissioned essays and interviews between me and Matt about our approach to AI and data over the past 10 years. I do realize that this is kind of triggering for a lot of people, but I think it's something worth recognizing as a whole:
As soon as something becomes captured in media, as soon as something becomes machine-legible, it has the potential to be part of a training canon. And I think we need to think about what we're creating, moving forward, with that new reality.
You know, a lot of the work we're doing around the exhibition is creating training data deliberately, so we're treating training data as artworks themselves. I'm writing a songbook that a collective of choirs across the UK will all be singing from, and those songs were written specifically to train an AI. All of the songs cover all of the phonemes of the English language, so the AI can get the full scope of the sound of each voice.
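A minimal sketch of the coverage idea Herndon describes: before recording, you can check whether a set of songs jointly covers a target phoneme inventory. The phoneme subset and per-song phoneme sets below are toy assumptions for illustration; a real pipeline would derive phonemes from lyrics with a pronunciation lexicon such as CMUdict.

```python
# Toy check: do these songs jointly cover the target phoneme inventory?
# A tiny illustrative subset of English phonemes (ARPAbet-style symbols).
PHONEMES = {"AA", "IY", "S", "T", "M", "N"}

def phoneme_coverage(songs):
    """Return (fraction of inventory covered, phonemes still missing)."""
    seen = set()
    for song in songs:
        seen.update(song)  # each song is the set of phonemes it contains
    covered = seen & PHONEMES
    missing = PHONEMES - seen
    return len(covered) / len(PHONEMES), missing

# Hypothetical phoneme sets for two draft songs.
songs = [
    {"AA", "S", "T"},
    {"IY", "M"},
]

ratio, missing = phoneme_coverage(songs)
print(ratio)    # fraction covered so far
print(missing)  # phonemes that still need a song
```

With these toy inputs, five of the six target phonemes are covered and `N` still needs a song, which is exactly the kind of gap a deliberately written songbook would close.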
So we're playing with this idea of making deliberate training data, what we like to call "mind children," that we're sending to the future.

I want to talk about what's triggering in it for a minute, because I think when people hear that, they might think of media in the sense of the news. And I'm actually least worried about the news, because the news is covering new things that happened, things that are not in the training data.
But media, if you think about it broadly, visual media and music and all the other things human beings create: I think what people hear in "all media is training data" is that everything we do will be replaceable, right? That the AI is going to learn how to do it, and it's going to be able to spit it back at us.
And then it doesn't need us anymore. When we become the training data, we're sort of training our replacement, right? Like the very grim stories that would come out of factories before they outsourced somewhere, where people are training the people who are going to replace them at a lower cost. Is that how you see it? If you're training data, does that mean you're replaceable?
Art is a very complex cultural web. It's a conversation, it's something that's performed, it's a dialogue, it's situated in time and place. We wouldn't confuse a poster of the Mona Lisa for the Mona Lisa; those are two different things. So I'm not worried about artists being replaced, or about infinite media meaning that artists have no role in meaning-making anymore.

I think the meaning-making becomes all the more important. I do think we have to contend with a future where we have infinite media, where the single image perhaps no longer carries the same weight it did before. So there are some things to contend with, but I think we won't be replaced, and I think it'll be weird and wonderful.
There are a bunch of programs coming out now that use AI to generate a sort of endless amount of pretty banal music for a purpose. I have this one I downloaded called Endel, and it's like: Do you want music for focus? Do you want music to sleep to? And it's fine. If I heard it on one of those playlists on Spotify, I wouldn't think much of it.
And I think it points towards this world where the view is: we're going to know what we want, and what we're going to want is a generic version of it, and we're going to be able to get it in vast quantities, forever. But you're an artist, and you said something in an interview I saw you give about how reality always ends up weirder, how it always mutates against what people are expecting of it.
And so I wonder how much you suspect, or see, the possibility of this sameness that AI makes possible, this kind of endless amount of generic content, leading to some kind of backlash, where people actually get weirder in response: weirder with these projects, but also more interested in things created by humans, in the same way that a lot of artisanal food movements got launched by the rise of fast food. How much do you think about backlash, and the desire for differentiation, as something that will shape culture?

Well, there's a lot in there. I mean, the backlash has been huge. I think AI has certainly joined the ranks of the culture wars, especially on Twitter. So I think the backlash is already there. But I think we're also in really early days.
Some of the examples that you gave, I feel like they're kind of trying to please everyone, and as we move into a situation where your specific taste profile is being catered to more, I think it will feel less mid and more bespoke. One direction some of this can go: I think a lot of people are really focused on prompting at the moment, because that's how we're interfacing with a lot of models.
But in the future, it might look more like you have a kind of taste profile, where the model understands your tastes and your preferences and the things that you are drawn to, and just automatically generates whatever media would please you, so that the production-to-consumption pipeline is collapsed in that moment. One of the things that I always appreciated as a young person growing up was hearing things that I didn't like and didn't understand. And that was something I always found really difficult with algorithmic recommendation systems: I just kept getting fed what they already knew I liked. When I was being exposed to new music as a young person, I really needed to hear things that I didn't like, to expand my palate.

So that's one risk: you could just have a kind of stagnation of taste if people are constantly being catered to. So I think people will crave something different, or will crave to be challenged. Some people won't, but some people will.

That's what occurred to me while I was looking through a lot of your work: what I enjoyed about it was that you were using the relationship with the generative system to make yourself, and the work, stranger.
And it felt refreshing to me, because my experience using ChatGPT or Claude, or anything, really, is that so often it makes me more generic. There's this way in which AI feels like this great flattening: it'll give you a kind of lowest common denominator of almost anything human beings have done before. And the danger of that feels to me like a push towards sameness.
Whereas a lot of your art feels to me like a push towards weirdness, and a kind of sense that you can interact with different versions of these systems in a less sanded-down way, and find something that neither a human nor a machine could create alone. Is that a reasonable read of what you're doing? Is there something there?
Yeah, I think that's largely because I use my own training data. I create training data specifically for the purpose of training models, rather than using something that's just laid out for me. I think you get a lot of mid, or averaging, from these really large public models, because that's basically their purpose: they're supposed to be a catch-all. But I'm not interested in the catch-all. I'm interested in this weird kind of vocal expression, or in this other weird thing.
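Herndon's do-it-yourself point can be illustrated in miniature. The toy "model" below is just character-bigram counts, nothing like her actual voice models, but it shows the mechanism she's describing: a model trained only on your own corpus reproduces that corpus's habits rather than an averaged-out catch-all.

```python
# Toy illustration: a "model" trained only on a personal corpus
# picks up that corpus's quirks, not a population average.
from collections import Counter, defaultdict

def train_bigram(text):
    """Count character bigrams in a personal corpus."""
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def most_likely_next(model, ch):
    """Most frequent character to follow `ch` in the training text."""
    return model[ch].most_common(1)[0][0]

# "Training data" you made yourself: any idiosyncratic text you choose.
my_corpus = "la la la loo la la la loo"
model = train_bigram(my_corpus)
print(most_likely_next(model, "l"))  # prints "a": the model mirrors this corpus
```

Swap in a different personal corpus and the predictions change with it, which is the whole point: the model's character comes from the data you curated, not from a generic scrape.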
And so that's what I really want to create training data around, and really focus on, for whatever my model is. So I think people should just get into training their own models.

I want to end by going back to a song from Proto. It's one of the stranger songs on the album, and I thought maybe we could talk about what it's doing so people can hear it. So why don't we play a clip of "Godmother." What's happening there?
Yeah. When I delivered that single to 4AD, I was like, here's a single for the next album, and they were like, OK, what do we do with this? So this is, I guess, a really early voice model trained on my voice. If you compare that to the "Jolene" song, that's basically how far we've come in the last five years, which I think is just remarkable.
The speed is incredible. So I trained Spawn on my voice, my singing voice, and then I fed Spawn stems from a collaborator of mine named Jlin. So Spawn is attempting to sing Jlin's stems through my voice. And Jlin's music is very percussive, it's mostly percussion sounds, so it ends up being this almost weird beatboxing kind of thing, because it's trying to make sense of those sounds through my voice.
Well, here I want to play a clip of Jlin. This is one of my favorite songs from her; it's called "The Precision of Infinity."

And yeah, you'd think that's a machine, but it's something a human being is doing that a machine quite cannot. There's a Philip Glass sample in there. It's beautiful. But I don't know, it's funny when you say that Spawn feels so old, because something I like about it is that, compared to a lot of what's coming out now, it's strange, and in that it feels much more modern. It feels truer to how AI feels to me than the much more polished things we're currently hearing and seeing, where all this effort is being made to make it seem normal. And I think the reason Proto sounded very current to me when I heard it for the first time this year is that, in sounding abnormal, it feels more actually of this moment, which feels very strange, even as everybody keeps trying to make it seem not that strange.
Well, thank you. I appreciate that. This AI conversation has been going for so long; the hype had kind of already started back then. And I feel like, with so many things that were being marketed as AI, it was kind of misleading what the AI was actually doing, or how sophisticated things were. At the time, a lot of people were creating AI scores and then having either humans perform them, or having really slick digital instruments perform them. So it was giving this impression that everything was really slick and polished and finished. And that's why we decided to focus on audio as a material specifically: because you could hear how scratchy and weird and unpolished things were at that time.
I wanted to meet the technology where it was, and that required a whole mixing process with Marta Salogni, an amazing mixing engineer in London, to try to get the human bodies in the slick studio to occupy the same space as the kind of crunchy, lo-fi Spawn sounds. But it was really important to me that I wasn't trying to do the whole smoke-and-mirrors of, this is some glossy future thing, when it wasn't, because I actually found the weirdness in there so much more beautiful.
You're somebody who has now been playing around with models for years, and working in these more decentralized possibilities. I think if you're outside this and don't have any particular AI software engineering expertise, as I don't, and as I think most of my listeners don't,

you see, well, these models are by OpenAI, by Google, by Facebook; it feels like no ordinary person can do this, right? These are companies getting billions of dollars. So how are you able to participate in this world of models? How much expertise do you need? How do you figure out what the interesting projects are? If someone wants to understand this kind of world of homebrew AI, so to speak,
how did you get started, and where should they start?

That's a really good question. I mean, the landscape has changed so much since I started. I would say, first, you can interact with publicly available models, and once you kind of understand how those are working, then I would just do the really boring work of reading the academic research papers. They are tedious; take your time, drink a coffee, watch the YouTube video where they presented it at a conference.

If you're interested in learning more, the information is out there. You just kind of have to roll up your sleeves and get your hands dirty.

I guess that is a good place to end. So, always our final question: What are three books you would recommend to the audience?

OK. So Reza Negarestani wrote a book called Intelligence and Spirit. It's a pretty dense philosophical book about intelligence and spirituality that I think is really great.
On a lighter side, Children of Time by Adrian Tchaikovsky is a really enjoyable AI science fiction novel about intelligent, genetically modified spiders.

That's a favorite book of mine.

Yeah, it's so good. So you kind of see the kind of society and technology that a superintelligent spider society would build, which I love. And then there's a book called Plurality, which was led by Glen Weyl and Audrey Tang and a wide community of contributors. I also contributed a small part to this book.
It's about the future of collaborative technology and democracy. And it was actually written in an open, collaborative, democratic way, which I think is really interesting. So check it out.

Holly Herndon, thank you very much.

Thanks so much. This was really fun.

This episode of The Ezra Klein Show is produced by Annie Galvin. Fact-checking by Michelle Harris. Our senior engineer is Jeff Geld, with additional mixing by Aman Sawhney. Our senior editor is Claire Gordon.
The show's production team also includes Rollin Hu, Elias Isquith and Kristin Lin. We have original music by Isaac Jones and Aman Sawhney, and audience strategy by Kristina Samulewski and Shannon Busta. The executive producer of New York Times Opinion Audio is Annie-Rose Strasser. Special thanks to Sonia Herrero and Jack Hamilton.