007 - Danielle Boccelli on the limitations of language

Apr 12, 2023 · 1 hr 4 min · Season 1, Ep. 7

Episode description

KMO speaks with data scientist Danielle Boccelli about the possibilities and limitations of learning strictly from language. The topic is large language models, but this conversation was recorded before the announced release of GPT-4. So much of what we know and what we are is not encoded in language, so there's only so far AI can progress without learning from something other than just text scraped from the internet.

Episode Title:

AI and Large Language Models: Bridging the Gap between Public Discourse and Research Progress

Episode Keywords:

Artificial Intelligence, Large Language Models, AI Research, Transformer Technology, Text Generation, AI in Education, AI Dilemma, GPT-4, AI Applications, AI Limitations, Public Discourse, Embodiment, Content Creation, Contract Writing, Textual Artifacts, Commercial Pressures, Chatbots, Fact-Checking, Polanyi's Paradox, Artificial General Intelligence

Episode Tweet:

In this episode of the KMO show, join us as we explore the fascinating world of Artificial Intelligence and Language Models with Danielle Boccelli. Get ready to gain valuable insights into AI and its implications! #AI #languageModels #KMOshow

Episode Summary:

In this episode of the KMO show, I had an insightful conversation with Danielle Boccelli, a data scientist and PhD student, about artificial intelligence and large language models. We discussed the gap between public discourse and actual research progress, the limits of learning from text, and the potential applications and limitations of large language models and transformer technology. We also explored the complexities of text generation, the impact of AI on education, and the commercial pressures driving the development of AI language models. Finally, we touched on the AI dilemma presented by Tristan Harris and Aza Raskin, and the evolving capabilities of large language models like GPT-4. Join us as we delve into these fascinating topics and gain valuable insights into the world of AI and language models.

Chapters:

(0:00:21) - The Reality of AI

(0:07:14) - Limits of Learning From Text Alone

(0:18:21) - Limitations of Transformer-Based AI Technology

(0:25:07) - Exploring the Complexities of Text Generation

(0:31:01) - The Impact of AI on Education

(0:42:52) - Language Models and Consequences

(0:57:03) - Exploring AI Dilemma

Chapter Summaries:

(0:00:21) - The Reality of AI (7 Minutes)

Full show notes at: https://www.patreon.com/posts/81435006

Transcript

Hello, and welcome to another installment of the KMO Show. I'm your host, KMO, and this is episode number seven. Prepared for release onto the World Wide Web on Wednesday, April 12th, 2023. In this episode of the podcast, I'm going to play a conversation that has been sitting in storage for a while. It was recorded, I think, about five weeks ago.

And I think to the chagrin of the guest, Danielle Boccelli, just after we recorded this conversation, OpenAI announced the release of GPT-4, or Generative Pre-Trained Transformer Model number 4. So, after OpenAI made that announcement, Microsoft, which owns a very large stake, I think 49%, of OpenAI, announced that Bing Chat had been powered by GPT-4 for many weeks before the official release of GPT-4.

So Microsoft search users had been interacting with GPT-4 for weeks before it was ever announced. And as I understand it from an interview that Lex Fridman did with the CEO of OpenAI, Sam Altman, GPT-4 was complete and ready to go last summer, but they have been training it and, in a sense, dumbing it down to give it that sort of dull, glassy-eyed, slack-jawed business-speak demeanor, you know, stripping away all of its personality and interesting rough edges.

So this model has been, you know, it has been in existence. It has been doing things in the world for quite some time. And it is only just now, or a few weeks ago now, being released because, well, when these models are first created, they're called raw models, they can be kind of dangerous and they can say things which freak people out and cause them to pick up the phone and call their lawyer. So they have to be blunted with what's called Reinforcement Learning from Human Feedback, or RLHF.

And I've noticed that in interviews with tech folk, you know, they have their own speak, like instead of computing power, they just say compute. And I've heard multiple people use RLHF as a verb. So for example, when it comes to, you know, the question of is GPT too woke? A question that Sam Altman gets from time to time. His answer is one, I don't really know what woke is, but also everybody has their own idea of how these things should be trained.

And he said something to the effect of everybody wants these models to be RLHF'd to their own specifications or, you know, to match their own viewpoints or their own worldviews. Which is to say that everybody thinks that everybody else should have to talk like them and should only be able to interact with AI which talks and thinks like they do. Which is to say we have a very coercive mindset, which is dominant right now in our culture. Kind of a libertarian's nightmare.
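
A rough illustration of the kind of signal RLHF is built on, for anyone who wants to see it concretely: human raters compare two candidate responses to the same prompt, and a reward model is trained so the preferred response scores higher, typically with a pairwise, Bradley-Terry-style objective. The language model is then tuned to produce responses that reward model likes, which is where the "blunting" described above comes from. This is a toy numpy sketch of that pairwise loss, not OpenAI's actual training code; the scores are invented.

```python
import numpy as np

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected).
    The loss shrinks as the reward model scores the human-preferred
    response higher than the rejected one."""
    margin = reward_chosen - reward_rejected
    return float(-np.log(1.0 / (1.0 + np.exp(-margin))))

# Invented reward-model scores for two candidate replies to one prompt.
print(preference_loss(reward_chosen=2.1, reward_rejected=-0.4))   # ~0.08: ranking is right
print(preference_loss(reward_chosen=-1.0, reward_rejected=1.5))   # ~2.58: ranking is wrong
```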

Anyway, the guest this week is Danielle Boccelli. She describes herself as somebody who was trained in engineering, but has never really worked in engineering. She is currently a PhD student. She's doing research and, I think, as many grad students do, she's also working as a teaching assistant. But her last job title before going back to school was data scientist.

So we are going to talk about large language models and prospects for artificial intelligence generally, but mostly we're going to be pretty specific. We're going to be talking about large language models. And again, GPT-4 had not been announced. You know, the release had not been announced when we had this conversation. So this is a few weeks old. We're living in a time when things are moving so fast. Five weeks is a very long time for a conversation about AI to go unaired.

So apologies to this week's guest. And now here's my conversation with Danielle Boccelli. You are listening to the KMO show and I'm speaking with Danielle Boccelli, who is a data scientist currently doing a PhD in something fiendishly complex. Danielle, good to talk to you. Good to be here. I invited you to join me in this conversation because you had been posting things on Twitter, which I found to be refreshingly countercyclical.

There's a lot of buzz, a lot of hype right now about AI and not entirely without reason, but you've in my mind anyway, you're a bit of a naysayer. You've got some reservations. And I think you said at one point that looking back at this transformer based GPT type technology a few years from now and calling it AI will be kind of cringe. What did you mean by that?

Well, I think a lot of things get confused, honestly. There's a big difference between how artificial intelligence is talked about on Twitter and how it's talked about in research. So on Twitter you get a lot of conversations around consciousness and sentience, and AGI comes up a lot. In research, I feel like those topics don't come up as much, except as a little bit more philosophical, thought-experiment-type stuff.

But yeah, so I think what people talk about commonly now on Twitter in the discourse tends to be kind of far off from where research is. I think research is a little bit more grounded in what is being accomplished and the tasks that are being accomplished and things like that.

Whereas the discourse on Twitter sometimes gets a little unhinged and people think that there'll be no work in like five years and there'll be nothing left to do and that everything will be solved and that there will just be AIs that are, you know, superintelligent and things like that. So that goes off the rails a little bit, far from where I think the technology is and kind of where I think the capabilities are going.

Well you've mentioned AGI or artificial general intelligence, which is what we would consider a mind, you know, something that has its own intentions, its own, you know, belief system or its own worldview and its own agenda, which hopefully will be in alignment with human priorities, but you know, possibly not. But you know, what we're talking about with GPT, OpenAI and you know, the similar products from Google and well, I guess talking about OpenAI is talking about Microsoft basically.

That's not what we're talking about at all. These things are not minds and in particular, they don't have much in the way of long term concepts. They don't model the world. And if you're not really modeling the world around you and thinking about how your actions might play out in the world in the future, you're not really a mind. But what are you? If you're one of these chat bots, what are you?

Yeah, that's, I'll say that's a question that I've been focusing on a lot recently and one that I hope to pin down a little bit better in the near future. But so right now I'm working on a project on what are the limits from learning, limits of learning from text alone. So you have a large amount of text information and what knowledge of the world can come from that. What are the limits of world knowledge that come from learning from only text? I think text is misleading in a lot of ways.

Writing a grammatical sentence, or generating a grammatical sentence: it seems like there's intention behind it, but there's not. This is the project I'm focusing on right now. So there's an idea from semiotics that a sign has two components: it has a signifier and a signified. So you have the piece that relates to the world, that's what's being signified. And then you have the piece that is more to the forefront of what the concept is, and that's the signifier, what kind of conveys the information.

So with learning from text alone, there's not world knowledge that comes into that. The text is a representation of the world, and when we read text, we bring all of our knowledge of the world to that reading. But learning from text alone, you don't necessarily need that world knowledge to mimic human-like text. There's an idea from information theory, one of its core ideas, that human language contains a lot of redundancy.

So when I'm talking now and I say a sentence, I might say one thing and then rephrase it in a different way and things like that. So there are a lot of different ways that you can phrase things within natural language so that meaning is conveyed and including that redundancy makes it easier to convey that information.

I read something, I didn't finish it unfortunately because I started just a few minutes before our appointed time here, but something from your blog talking about language and information and qualitative information versus quantitative information.

And then I think that was going to lead into a talk of statistics and how these large language models, the algorithms, you know, the large language models basically are making statistical inferences between words when coming up with responses to input. And that seems very, very lifelike in a way now that it didn't a couple of years ago.

What's going on and what are the recent developments that have made the technology a lot more compelling to people who didn't have a prior interest in artificial intelligence? You know, I think the reason that large language models are gaining so much attention now is less a shift in what the technology is doing and more a shift in how it's being presented to a broader audience. So with ChatGPT, for example, it exists in a user-friendly window, so any user can easily access the model.

They can have a conversation with it. It produces text, as everyone's been talking about for the last couple of months. That's something novel. But a lot of these capabilities extend pretty directly from previous work. So you have the transformer architecture that was introduced in, I think, 2017. And from there, everything has been pretty linear. Models that are built now are basically just scaled up from there.

You have the addition of reinforcement learning from human feedback. And that's the, I guess, novel contribution of ChatGPT, other than the presentation, the user experience component of it. So with reinforcement learning from human feedback, that's why, if you're interacting with ChatGPT, you get a lot of responses that say, you know, I'm a language model trained by OpenAI, I can't answer that, or something like that. So a lot of the backtracking by the model is taken care of through that process.

But yeah, language models have been decent for a while. But I think now the smoothness of the grammar and the smoothness of the logic is something that's kind of impressive to people. It's kind of unexpected for a computer to generate language that's as smooth as the language generated by these models. I think for the most part, that comes from scale.

But when you start to talk about scale and how that is where the improvement comes from, things get a little complicated because I think people expect scale to be all you need and that if you continue scaling up, you'll know everything kind of thing. And I think that's unreasonable because not everything is written down. And I actually wrote about that a little bit recently. So as I was talking about earlier, I think there are limits to learning from text.

I don't think that adding more text or more parameters to a model necessarily gets you any further intelligence-wise.
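
To make "statistical inferences between words" concrete, here is a toy bigram model in plain Python: it counts which word follows which in a tiny, made-up corpus and samples continuations from those counts. Real large language models use neural networks over much longer contexts, but the basic move of predicting the next token from observed co-occurrence statistics is the same, and nothing in the counts encodes whether any of it is true about the world, which is the limit being described here.

```python
import random
from collections import defaultdict, Counter

corpus = "the table is solid . the table is hard . the sun is hot .".split()

# Count how often each word follows each other word (bigram statistics).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def generate(start: str, length: int = 8) -> str:
    """Sample a continuation word by word from the bigram counts."""
    words = [start]
    for _ in range(length):
        counts = following.get(words[-1])
        if not counts:
            break
        choices, weights = zip(*counts.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the table is hard . the sun is hot"
```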

Well, it seems like there's been some threshold that has been crossed where the interaction with language models is now of interest to a lot more people than it was before because it does seem intuitively as if there is some entity on the other end of this thing that it's not just chopping up your sentence, rearranging it according to some linguistic formula and then giving it back to you, which is what previous things like Eliza did.

But even in the days of Eliza, which is a decades old sort of psychology chat program, people would be invested in it. And I think they would intuitively understand what questions and what they could feed into the chat interface that wouldn't come back mangled. You just get a sense for what you can say to this thing that's going to generate a seemingly meaningful response.

And I think there's still some element of that going on because I interact with chatbots where I understand there are certain things I can say they're just not going to follow or they're not going to behave the way that I'm expecting them to. So there's sort of a learned skill in interacting with them. But it's a much lower bar now than it was before.

And I think a point that either I've attributed to you or I've actually heard you make it is that there is much more going on in human cognition than just finding statistical correlations between words or groups of words, that we actually have concepts. We have a mental representation of the world that we are acting from.

So I understand that I am talking to another person at a distance, but that you have experiences, that you have skills, that you have areas of expertise, that you have things that you want, things that you don't want.

And when we are engaging with chatbots, the chatbot doesn't have any sort of mental representation of us that is similar to the one that I have of you, of being another person, of being an entity who persists in time and exists in a particular place and has certain needs and limitations to their physical existence. These things that we're interacting with don't have any of that. But they still seem rather compelling.

And I'll just stop talking and rather than giving you a question, just let you pick it up from there and say what you find interesting from that. Yeah, so one thing that I've been thinking a lot about recently is the role of embodiment in human intelligence. So as I mentioned, learning from text alone, there are limits to learning from text alone. Text is one modality of receiving information.

But as a person, as an embodied entity or being or however you want to refer to people, but as a person with a physical form, I mean, you take in sensory inputs with your whole body. So visual inputs, touch, smell, all the senses. So when I say, if I put my hands on the table and I have a sense of the table being hard and solid underneath my hands, that's an embodied experience that I...

It gets weird to start talking about embodiment, but it's much more multimodal to experience how solid the table is under my hands than reading that a table is solid and that you can exert a force on it just verbally. So I think there's a pretty large component of human learning that is from being embodied and going through the world and having direct experiences with the world.

In the past, I've described it as, you could hear somebody talk about an experience, about having an experience and going through their lives and making decisions and things like that, but it's not until you're put in a situation that you really know what it feels like to be in that situation or what a place is like. You have to experience something firsthand, I think, to really understand it. And language models obviously don't have that experience.

They can say things that seem like they have that experience or they say things that align with what human experience is because they're trained on language that's produced by humans who are describing their experiences. In general, when people communicate, when they write things down, they're describing something that aligns with their experience.

People lie and they're dishonest and they make up stories and things like that, but in general, it still aligns to some components of the human experience. Not many people go around just typing complete nonsense onto the internet where it's just random words with no semantic content. If that was much more prevalent, it'd be harder to train a language model because it wouldn't be in the data. The human experience wouldn't be captured in the data.

So there's this balance: text is a good medium for communicating some information, but it is not necessarily a good medium for measuring the world and conveying it in a way that lends itself to quantification, which is another thing that I'm currently interested in thinking about, but we can talk more about that later if you want. Wherever you think it fits in best.

As I mentioned, I started to read your most recent blog piece and it is about the differences between using numbers and using words to try to describe the world. Numbers exist in a very structured and predictable continuum and words, while there is some overlap and there are hierarchical relationships between concepts, they're really freestanding entities. Each word has its own definition and it doesn't appear on a number line.

You can't infer its relationship to other words just by its position on the non-existent number line.

So a point that I think I recall you making is that a lot of what's happening now with large language models and also transformer technology as it's being applied to image generation is something that is just going to find its way into familiar software, like your video editing software or maybe your writing software, but it's going to be a few techniques and tools that are useful that find their way into the familiar menus and things that you're used to navigating.

But for the most part, this isn't really going anywhere. Right now there's a lot of venture capital that's chasing the killer application for this transformer-based AI technology that in all likelihood isn't really going to amount to much. Have I got that right? I think it's tough because I think there's a lot of potential to incorporate existing methods into backend processes and things like that.

There are a lot of tasks that require decisions to be made based on text information, even things as simple as sentiment classification. You want to know if a review for a product is positive or negative. Things like that are improved with these large models that have a better... They have a larger semantic space and I think there's more nuance in the way that words fit together within that larger semantic space.
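
As a concrete example of that kind of backend task, a pretrained model can be dropped into a sentiment-classification step in a few lines. This sketch uses the Hugging Face transformers pipeline as one common way to do it (it downloads a default fine-tuned sentiment model on first use); the review texts are invented.

```python
from transformers import pipeline

# Downloads a default sentiment-analysis model on first use.
classifier = pipeline("sentiment-analysis")

reviews = [
    "The blender works great and was easy to clean.",
    "Stopped working after two days. Very disappointed.",
]

for review, result in zip(reviews, classifier(reviews)):
    print(f"{result['label']:>8} ({result['score']:.2f})  {review}")
```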

So classification tasks, things like that become much, much easier when you can leverage a model trained on a ton of data. I think where the limits really are, are on the other side of things. So producing textual artifacts, I don't think that is a solved problem yet. I think that's why when you talk about language generation, I mean the conversation around language generation has focused on college students cheating on essays.

There's some talk of research output and plagiarism and research output and things like that. A deluge of text artifacts produced for the internet, things like bots that seem like they're lifelike and things like that. But I think those applications highlight the limits of current methods.

So with producing essay content for a college student, a lot of the essays that a college student might be asked to produce are essentially repeating information, rephrasing in their own words, things like that. It's not so complex as a writing task compared to something like a research paper. If you do write a research paper, you might be able to automate some of the abstract writing for pretty general topics.

But when you get down to it and you're incorporating different aspects of papers, very subtle nuances of papers, like in my work I'm trying to focus a lot on how methods are presented and kind of the rhetoric and grounding of a method for an application. So it's a very specific aspect of a written document that I'm focusing on and current methods don't help me do that.

So I think there are a lot of writing tasks that cannot even begin to be approached, maybe approached in some cases, but aren't really approached well with current language models. Another example, and one that I know a little bit about, is the writing of contracts for companies. So contract writing, and law as a field, is a profession that, I think it's controversial, but a lot of people think can be automated away pretty easily.

You know, contracts in a lot of cases start from a template, but the difficulty comes in moving from a person writing a template to sending that template to another company and having that company review it and things like that. So the difficulty of that process lies in the subtleties of the language and the needs of both sides of the contract. So if I was representing a company, I would have my company's information that I would bring into that negotiation process.

And then the person on the other side brings similar needs for the company they're representing into that negotiation process. And we might be talking about very subtle, nuanced things. And there's really no room for automation in that process because it might be, you know, there are one-off cases or there's a lot of different components that get incorporated into the language there.

So it's not just like, oh, generate a template for me or generate a contract that does X, Y, Z, because X, Y, Z, you know, there might be thousands of variables that are technically being considered by both sides when they're reviewing this work. And it's not necessarily just based on past work. It's based on the present contract that's being developed.

So I've known people who are not lawyers who used to work for lawyers, basically reading through a lot of documents, looking for relevant passages. And that is the sort of work that these language models are pretty good at. So it would be hard to deny that this is costing some people some work. In fact, it already has.

Yeah, I mean, for things like information retrieval and scanning through large numbers of documents to find relevant information, I think that's kind of a different task than what I'm talking about. In the negotiation process, you might have arguments over a couple of words. And it really depends on... I mean, without an example, it's kind of hard to get into the nuance. I'm not a lawyer.

Actually, my husband's a contract attorney, which is why I have a lot of knowledge to pull from in the contract space. He talks a lot about, you know, there might just be one clause that gets argued over.

It might just be based on the location of the client, or not client, customer, and a lot of nuance of the law and things like that, things that aren't even necessarily written down. I don't know the word for it, but there's a lot of nuance there, you know, it's not just the law itself; there are cases and a whole set of legal documents that get put together. I don't know all the nuances of the different kinds of documents and things like that.

But I mean, a lot of the legal profession is based on precedent and extrapolating from precedent and things like that. I don't think it's a solved problem. Well, earlier, you used a phrase that I wrote down because I knew I would want to ask you about it. And the phrase is producing textual artifacts is not a solved problem. So what is a textual artifact and what is a solved problem and how do these two items not overlap?

Yeah, well, I remember saying textual artifacts and I remember saying things aren't a solved problem. I don't remember putting them together exactly like that, but I might have. Anyway, I think in the conversation around language generation there's kind of a shallow appreciation for writing. Writing can be a lot of different things and it has a lot of different dimensions.

And I think current models are pretty good at writing syntactically smooth sentences, writing sentences that mostly cohere with expectation. They align with what I would expect to see on the internet about a topic for the most part. There are issues that come up. There's been a lot of discussion about how language models aren't sufficient as information retrieval systems. They don't necessarily.

From what I've seen with Bing, it seems like they're pulling in sources and things like that, which is interesting. And I'd have to play around more with that to really know how well that works. But even still, I think the strengths are in that kind of aligning with expectation and writing grammatical sentences. But there's a lot of other things that writing can be.

There's a lot of abstract information that's contained in writing, things about rhetoric and things that convey information about a person, the person who's speaking, even unintentionally. So it's tough. There's music to writing. There's a phonetic element. There's the way that it sounds. There's cadence to words. There are certain aspects that really strong writers, I think, pay attention to that the general population of people who produce written artifacts do not necessarily pay attention to.

So I think that that information is not necessarily encoded in the majority of texts. I don't think that even if it is encoded, well, in some areas of the text, how do you request that kind of information? How do you get a language model to reproduce the music of language in writing and things like that? So that's one aspect that I feel like is... And that's a weird aspect to want solved even, because thinking about the phonetics of language, that's almost like a tick.

I do that, and it's really probably not helping anyone. But I think it's interesting to think about writing on that level. And again, you could probably prompt a language model to produce something like that, but what is the interaction between... How do you convey the need? How do you get a language model to write in a different style? In my opinion, a lot of... It varies by model, but from what I've seen, for the most part, especially with chat... Not chat, GPT.

GPT-3, prompting GPT-3, everything sounded very generic, very much like a blog post, really middle of the road, kind of boring, kind of like a person without a personality. I think ChatGPT is a little bit better. It doesn't really have much personality. I think it's adjustable though. I think the temperature is something you can mess with to bump that up. Well, let me push back against that a little bit, because I did mess around with ChatGPT. I'm no longer welcome, apparently.

I can't interface with it any longer. But I had it write things in iambic pentameter, which is the metered verse structure that Shakespeare used for most of his plays. I had it write in the style of like 90s gangster rap. It was pretty good at the music of language. What it's bad at is remembering what you're talking about.

What it's bad at is all the things that would require an ongoing persistent internal model of the world, which these things, they just don't have any mechanism for creating or sustaining those sorts of representations. They're just finding words and phrases that have a statistical correlation to the input that you've given. Particularly when it comes to text to image generation algorithms, they're really, really good at the subtleties of light and brush stroke and advanced technique.

But they can't remember that a human figure only has two arms or that a human figure has four fingers and a thumb on each hand. It's weird the things they get wrong, because they're so, so good at the stuff that seems like it requires human creativity and sensitivity and expressiveness, and the things that seem to require just a basic comprehension of the persistence of objects in space or things that are just so basic we don't even feel the need to articulate them.

These are the things that the AI is getting wrong right now. I think you kind of answered your own question there, even though it wasn't really a question. You said something about the things that we don't have to convey, things we don't have to put into words, something like that. And that's the exact problem. There are a lot of things that we don't feel the need to express in detail like that.

And there are a lot of things that you pointed out, and I'd like to go through each of them if I could. But for images and generating hands and things like that, I mean, a big problem with that is if you're thinking about compositionality of an image, the thing that comes after a finger tends to be like another finger. So it miscalculates how many because it doesn't have necessarily a concept of a hand has five fingers, but it has a concept of a finger followed by a finger.

I think that's a similar reason to why sometimes language models start to repeat. They fall into a loop where they end up repeating the same phrase or a couple of words over and over again, because there is that probabilistic nature to the learning. And I think going back even further to what you said, talking about iambic pentameter and rapping, things like that, I think those are pretty narrow, structured, organized genres.

I think those fall into the same camp as things like the romance novel, where there's a lot of stories that follow kind of formulaic patterns. But in my opinion, that's, I don't want to say it's not writing. But I feel like writing is more than producing formulaic kind of genre structure. Well, I don't know too much about generating poetry with language models. I know people do it. I'm sure some people do it decently well. Honestly, I think most poetry is bad, though.

I think most people who write poetry, even people who write poetry, do a terrible job at it. So I think what I'm talking about is mostly how I like to write.

But just thinking about things at a nuanced level, at a detail level, kind of approaching the language organically, which as far as poetry goes, I prefer kind of like free, sort of more experimental work or kind of a prose poetry where it's not explicit where the music or where the metaphor and the different levels of the poetry come from as much. It's not as rule based. Again, I don't know all the details of how you would generate iambic pentameter or how well language models do that.

And I think that there's a level of detail that generative models don't take. I don't know, when I interact with a language model, I don't feel like I'm talking to a person who is the most sophisticated thinker in a subject. I feel like I'm getting more general information like you would find on a blog or something like that. So I think, and again, there's a lot of different factors to consider and you can mess with different parameters like temperature and you get more creativity.

But creativity is tough, because you have creativity in language and then all of a sudden you might have things that just don't sound like human language anymore. Or you have things where you think that we're on a planet where the sun revolves around us, and the language model is telling you that no, actually we are going around the sun, and you think, well, that doesn't comport with my expectations, so that must be wrong.
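
The "temperature" parameter mentioned here is just a knob on the model's next-token probability distribution before sampling: dividing the raw scores by a temperature below 1 sharpens the distribution toward the single most likely word, while a temperature above 1 flattens it and lets less likely, more "creative" and sometimes nonsensical, words through. A small numpy sketch of that rescaling, with made-up scores:

```python
import numpy as np

def softmax_with_temperature(logits, temperature: float):
    """Turn raw next-token scores into sampling probabilities at a given temperature."""
    scaled = np.array(logits, dtype=float) / temperature
    scaled -= scaled.max()            # subtract the max for numerical stability
    exps = np.exp(scaled)
    return exps / exps.sum()

# Invented scores for four candidate next words.
logits = [4.0, 3.0, 1.0, 0.5]
for t in (0.2, 1.0, 2.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))
# Low temperature concentrates almost all probability on the top word;
# high temperature spreads it out, so sampling gets more varied.
```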

So I think there's, honestly, I think from the reader's end, what a reader takes into an interaction with a language model, what they expect and what they get and how those two things don't always align and they can misalign in lots of different ways. Let me tell you about an experience that I had decades ago. I was a graduate student. I was studying philosophy. My specialty was philosophy of mind and philosophy of science and I was particularly focused on AI.

But as a grad student, I also taught undergrads, you know, courses, basic philosophy courses. And at the end of the semester, there would be a bunch of papers to grade. But at the same time, I had my own papers to write, you know, my own classes to complete. It's a very tense time. And I've got this stack of essays from students of, you know, intro to ethics, say. And maybe there are three or four or five pages each. I would pick them up and read them, read just the first paragraph.

If the first paragraph was clear, if it was grammatically correct, if it said what it intended to do in the rest of the paper, then I would just give it an A and I wouldn't even read the rest of the paper. Because, you know, to judge those well-written papers on their own merits and give them a grade would require that I give three quarters of the students just an F, you know, just straight off the bat F. You didn't learn to write in high school.

You can't write a paper, you know, you just don't have the tools in your toolkit to actually fulfill this assignment. But I can't give three quarters of the students an F. So, you know, the people who are just moderately competent, they get an A and then everybody else's papers.

I read laboriously trying to figure out if they have any central argument that they're trying to support, you know, or if they have any comprehension of the material that we covered over the semester, you know, and it's like pulling teeth.

If I get presented, you know, if I'm in that position and I get presented with a bunch of papers that are all very clear and all very competently composed but they're all in the same voice and I know that these, you know, these language models exist and can generate term papers, I would be really hard pressed to even bring myself to read them. I wouldn't know really how to grade them. I wouldn't know how much input, you know, each student gave to the language model in order to get the paper.

You know, to me, it just kind of upends the whole table that the whole game is disrupted there because I wouldn't know how to respond to that. And at the same time, I would resent having to use my own meat space, energy, and time to read papers written by AI. You know, it would be perverse if I, the instructor, had to spend more time reading the paper than the student had to generate it.

And you know, this is, I haven't been in this position in decades, so, you know, it's not my hobby horse, but just projecting myself into the space of somebody who is doing the job I was doing 30 years ago, I certainly don't envy them. Yeah, I am, well, I am, I'm currently a TA in my program. And that's something I've thought about a bit. I haven't graded much yet.

But yeah, I have similar concerns because of exactly what you said. I almost don't even really have anything to add to that other than, yeah, I mean, it's a huge problem, but I think the problem goes back to before language models, because I think certain aspects of education, the role of education in people's lives, need to be reconsidered. I think that's pretty clear even without considering language models, completely outside of AI and things like that.

I think there's some issues with the relationship of people with education in society. But yeah, I think that there are definitely a lot of problems that are coming up now in relation to language models, but I think they kind of predate language models.

I think the role of education in people's lives has gone through a big shift in society over the last couple of centuries, and I think that we're starting to question the direction that we went in: the idea that a college education is universally good, that grades and measurements are good, and things like that.

There are a lot of different aspects of education that have been built into the structure of the education system that are now, we're now realizing don't necessarily lead to learning or good outcomes. So I think that in the case of generated homework assignments or generated text being submitted for homework assignments, clearly students don't think that doing the assignment is the important part of their education, if that's what they're doing.

I think that indicates that education is kind of viewed as a means to an end. You get a degree and then you go out and you get a job. I think that a lot of people view their education as kind of separate from what they're going to need to know how to do on the job. I think that's especially apparent in tech. I think there's a really big culture of you don't have to have a college education to be a programmer, to be a developer.

You can kind of learn the skills that you need to learn on the job, and I do think that's true to a large extent. I mean, for me personally, I did my bachelor's degree in chemical engineering and barely worked as an engineer. I had internships. I left directly after finishing undergrad and pursued analytics, so that's when I pivoted into data science. But for me, when I was going through undergrad and studying chemical engineering, I really liked the theory aspects of it.

I liked taking the classes and doing the calculations and thinking about how the calculations and how the math underlying the science represented the world, kind of the interaction between something physical and a mathematical representation of the physical and how we can manipulate functions and things like that to understand real phenomena. But then when it came time for application, I had no interest in it.

I had no interest in running a processing plant, doing manufacturing, anything like that. But anyway, so I think a lot of people, they want to be more on the application side. They want to work on systems that have applied value, or maybe not even applied value, but something that can generate value for themselves. They want to work, they want to earn money and things like that. And I think that education is viewed as separate from that.

It's become separate, and yeah, language models being used for generating text, I think, is just a symptom of that. It's not necessarily destroying education, but I think education has to be kind of re-evaluated on its own. Well, this is a conversation that doesn't even really need to involve AI.

We have mixed up the concept of bettering oneself by creating a complex system of references that we carry around so that we are articulate and we can communicate subtle points with one another or we can appreciate the finer points of music or things like that. We have mashed that up with vocational accreditation and people want the accreditation.

They want the accreditation that gets them a good job and they're going to the same place where people are studying philosophy and learning music and learning how to paint and doing the sort of things that we would now consider to be the classical elements of an education and not the modern practical elements. But yeah, they are hopelessly mashed up at this point and it is a much larger problem.

I mean, it was a problem before the recent developments in generative text algorithms and things like that. So yeah, that's certainly not anything we're going to untangle in this conversation. With respect though to AI, I mean, it seems as though something really remarkable has happened with these language models and there's a rush to find a commercial application for it and the financial incentives for being first to market with some new product are enormous.

And so both Google and Microsoft have focused on this idea that we're going to replace the current search interface where you type in a search string and you get an exhaustive list of links that seem to apply, that seem to be relevant and it's up to you to go through them and extract the information and figure out what's worth paying attention to and including in your internal representations and what's not and what's obvious nonsense.

The new application that both of these companies have in mind takes all the human judgment and all the human participation out of that process. You just ask a question, the chat bot goes and does the search and then synthesizes information from the search and gives you an answer and you can ask a follow-up without reestablishing any context. Just say, oh really? Why is that? Or tell me more or what about this?

And the language model, it can review the previous inputs and outputs in your conversation so that you're not always having to reestablish context. But at the same time, these language models are really, really bad at fact checking or telling what's true from false.
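
The "not having to reestablish context" works because the application keeps the whole conversation and resends it to the model on every turn; the model itself remembers nothing between calls. A minimal sketch of that pattern, with a placeholder call_model function standing in for whatever chat API is actually used (the role/content message format is a common convention, not a claim about any particular product):

```python
def call_model(messages):
    """Placeholder for a real chat-model API call; returns a canned reply."""
    return "This is where the model's answer would go."

history = [{"role": "system", "content": "You are a helpful search assistant."}]

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    reply = call_model(history)      # the full history goes along on every call
    history.append({"role": "assistant", "content": reply})
    return reply

ask("Why is the sky blue?")
ask("Oh really? Why is that?")  # the follow-up only makes sense because the
                                # earlier turns are still sitting in `history`
```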

They're pretty good at summarizing, but if you give them nonsense to summarize and the web is full of nonsense, then errors inevitably creep in and not just errors, but with respect to some people who have had advanced access to Bing's language model driven search engine, it even gets kind of weird and creepy the longer you talk to it.

So it just seems that the commercial pressure to come up with an income generating application for this technology is driving both of these big companies and many others certainly to push out products which have not been adequately tested or to apply this methodology to tasks that it's just not well suited to do.

But Google owns search, and it's a multi-billion dollar industry every year, so there is enormous incentive for rivals and upstarts and established players like Microsoft to try to get in and take some of that away from Google. So it's in Google's interest to match all of their innovations step for step. And it's just a recipe for unleashing all manner of unintended consequences.

Yeah, I have a hypothesis, and it might be confirmed somewhere, I don't know. There's so much information out there, it's hard to keep it all straight. But I have a hypothesis about Microsoft, Google, all the companies that are putting out chatbots, OpenAI, which is obviously tied to Microsoft. I feel like there is this technology, there is a way to generate fantastically smooth text that aligns reasonably with human expectation.

I think now users are doing the testing. These companies could basically sit back and see how users are implementing the technology, look at both direct users, like individuals who are prompting the models and the feedback that they get, thumbs up, thumbs down, is this good, is this not good, the progress of a conversation over a session.

And they can look at that information and say, OK, this is a use case, or a type of use case, and make classes of use cases based on different themes. I mean, I'm sure they have people who are doing any number of different kinds of analyses, probably using language models to figure out what is going on in those interactions.

And they're probably learning from those, either directly or indirectly learning from those experiences, to make the product better or to come up with ideas for maybe more specialized chatbots and things like that, chatbots that are serving specific needs or purposes. That's what I would do anyway if I was at one of those companies, I'd be looking for those kinds of trends.

I think that with certain things, like you have ChatGPT in a user interface and you say, this is a language model trained by OpenAI, and you have all kinds of things appended to the user experience that manage user expectations for what they're getting when they interact with the model. And I feel like you can basically leave it there and see what happens and collect that data from the experiences.

And as long as people are using it, there's going to be data generated and they can have something serve as a basis for further improvements.

So I think that some concerns, I mean, they probably don't care super much about whether it's always factually accurate, because again, as long as you say, you know, this is a language model, this is not something else, this is not an expert in this field or anything like that, this is a language model, then you don't have to guarantee the veracity of everything that is produced by it.

A lot of people have an issue with that and think that they should be responsible for whether incorrect information is generated. But honestly, I personally don't, because I think as long as it's presented in a way where users kind of know what to expect, and we continue to have conversations around what expectations should be, that's kind of the most you can do. I mean, you can't...

It gets into questions of what even is truth, you know, like there are certain things that if you zoom in, there is no truth, everything's fuzzy. So... That was Danielle Boccelli, and that was the first half of our conversation. Now, normally I say the remainder of the conversation or the conclusion of the conversation can be found in the next episode of the C-Realm Vault podcast.

That's not actually true in this instance because I didn't get a Vault podcast out last week, which means the conclusion of my conversation with Dr. Ashley Frawley is next in line for the Vault. So it'll probably be a couple of weeks before the conclusion of this conversation with Danielle Boccelli is available in the C-Realm Vault podcast. And I have to plead new puppy, which is to say I have a new dog. She's almost four months old. She's a handful right now.

She is at this instant asleep on the floor about five feet away from me. But when she's awake, it's very hard for me to get any work done at all. I know. Good problem to have. All right. So just going through some of my notes that I took as I relistened to this first part of the conversation with Danielle Boccelli. First, she started to talk about embodiment, and then we didn't really take that conversational avenue.

There are people who think that in order for current AI systems to become genuine AGI, artificial general intelligence, which is to say to have a flexible sort of intelligence which can be applied to a variety of tasks and a variety of fields and not have to be specifically trained in any one area in order to bring, you know, its powers of cognition to solving problems in that area, it can't just be a disembodied thing in a computer. It needs to have a body.

And while we didn't talk about embodiment in general, time and again, Danielle came back to the question of what are the limitations of learning just from language? Because we are embodied beings with five senses, maybe six, and, you know, we feel, we touch, we understand how our bodies are related to other objects around us, how they're oriented in space.

You know, I don't have one right now, but I understand that most humans enjoy a sense of smell, which when mine turns on, it's kind of like a superpower. It's like, Hey, wait a second. I'm suddenly getting information about the world and I can't see the source of this information. I can't hear it. I can't touch it. Where is it coming from? But you know, that's, that's my cross to bear.

But that, you know, that's just my off the top of my head explanation as to why some people think that an AGI to genuinely emerge and, you know, be present in the world has to be embodied, has to have what we would think of as a robot body. But in order to make sure that I've really got, you know, a handle on the arguments, I went to the Google version of GPT-4 or the Google version of Bing Chat.

It's called Bard, and it is based on a large language model called LaMDA, which was in the news last year when Blake Lemoine claimed that LaMDA was sentient. I'm pretty sure it's not. And Bard is still in an experimental stage. Not everybody can access it, but because I own a Pixel phone, which is a phone, you know, designed and sold by Google, they gave me special early access to Bard. And so I asked it: Hi Bard. Some people claim that for AGI to be fully realized, it must be embodied.

Would you flesh out that argument? Sure. There are a number of reasons why some people believe that embodiment is necessary for AGI. First, embodiment allows an AI to interact with the world in a way that is not possible for a purely disembodied AI. This interaction can provide the AI with a wealth of information and experience that can help it to learn and grow. For example, an AI that is embodied in a robot can learn about the world by interacting with objects and people in the real world.

This type of learning is not possible for a purely disembodied AI, which would only be able to learn from data that is provided to it by humans. Second, embodiment can help an AI to develop a sense of self. This is because embodiment allows the AI to experience the world through its own senses and to interact with the world in a way that is unique to it. This sense of self is important for AGI because it allows the AI to develop its own goals and motivations.

For example, an AI that is embodied in a robot can develop a sense of self by interacting with other robots and humans. This sense of self can help the AI to develop its own goals, such as wanting to be helpful or wanting to learn more about the world. Third, embodiment can help an AI to develop empathy. This is because embodiment allows the AI to experience the world through the senses of another being. This can help the AI to understand the feelings and experiences of others.

For example, an AI that is embodied in a robot can develop empathy by interacting with humans. This empathy can help the AI to understand the needs and wants of humans and to develop relationships with them. In conclusion, there are a number of reasons why some people believe that embodiment is necessary for AGI. Embodiment allows an AI to interact with the world in a way that is not possible for a purely disembodied AI.

It can help an AI to develop a sense of self, and it can help an AI to develop empathy. So there you go. Three more reasons why a genuine artificial intelligence, an artificial general intelligence, needs to have a body and be present in the physical world. Now, you could argue, and many people have argued, that that's kind of dangerous. We want to know what the AGI is before we turn it loose, physically turn it loose in the world.

One possible alternative is that you can simulate, you know, you can have the AI be embodied in a simulated body in a simulated world. For instance, it could be, you know, the avatar in the Minecraft world. That's a very simple world, but it might be enough for an entity to get a sense that it is... it is a thing that moves around in physical space that interacts with objects, which transforms objects in order to accomplish tasks.

It encounters other entities, kind of like itself, like itself in as much as that they are entities. Some will be friendly, some will be hostile, some will be neutral, some will be friendly or hostile depending on how you interact with them. In Minecraft, you encounter all of these different types of entities.

I think another conceptual tool that you should have in the back of your mind when you're listening to people talking about AI and talking about the prospects for AI is something called Polanyi's Paradox. And again, this is... it's another look at the question of what are the limitations from learning just from language? Polanyi's Paradox basically says, we know more than we can say.

Or as some search engine chat entity that I queried put it, and I didn't write down the source, I just wrote down the definition that it gave me, it defined Polanyi's Paradox as the theory that human knowledge of how the world functions and capability are, to a large extent, beyond our explicit understanding. We don't know exactly how it is we keep our balance when we walk. We just do it.

And because we just do it, and we don't explicitly understand it, we can't really explain it to a machine. I have been listening to... mostly listening to, even when it's videos. I tend to take them in in audio form. But one thing that I've listened to and watched recently on the topic of AI is a presentation by Tristan Harris and Aza Raskin. Tristan Harris was one of the people responsible for that Netflix special called The Social Dilemma, which came out a year or so ago on Netflix.

And he and Aza Raskin have a presentation that they call The AI Dilemma, talking about some of the very surprising and concerning developments in AI, and talking about just how quickly all of this is being deployed, how the capabilities of AI systems are improving much faster than our ability to align them with our goals and our values. It's not alarmist, it's not doomsaying, but it is certainly... it will give you food for thought and reason to take this issue seriously.

So the two of them have a podcast that they do, but I would definitely suggest starting with the YouTube video, The AI Dilemma. And something that I said in this interview, I noticed, it's not really true anymore. You know, I said that these large language models, they're just using statistical tricks to predict what should come next in a conversation.

So they, you know, they've been fed basically all the text that is available, all the text on the internet, all the books that can be fed into it. So it's got lots of examples of language, and it can basically look at the language that you've used in asking your question, and very reliably and very convincingly put out text that is relevant to what you've said. It is a plausible next entry in the conversation, but it's not always correct. It's not always accurate.

And I say that these large language models, they don't really have any memory. They're basically just doing statistical tricks to produce output, which seems plausible to us humans, but they don't... they don't model the world, and they don't update their model based on, you know, experiences, based on whether predictions that they make come true or not. We, hopefully, do that.

If we think we understand a situation, and our understanding leads us to expect a certain outcome and that outcome doesn't happen, hopefully we re-examine some of our premises. You know, we hold some of our hypotheses up for examination and, you know, give them the axe if they are consistently producing bad outcomes or, you know, failed predictions. And I say that these large language models, they don't have anything like that.

GPT-4 can use tools, which is to say it can query other types of software, and some of these other types of software that it can query are basically memory modules. So while the large language model by itself doesn't really model the world the way that we do, you know, it doesn't create an abstract representation of the world that it then... that it uses to navigate the world and updates as necessary based on new information as it becomes available.

A sort of emergent entity, which is a combination of a large language model plus several other different types of software that get queried and brought into the task as needed, can start to think and model the world the way we humans do. Not exactly the way we do, and not as competently as we do yet. But things are moving really quickly now. You know, there's this paper called Sparks of AGI, which is to say GPT-4 is not an artificial general intelligence.
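
A rough sketch of the "tool use" pattern described here, kept deliberately generic: the program around the model checks the model's reply for a tool request, runs the tool (a lookup in a tiny memory store in this toy), and feeds the result back in as extra context. The TOOL/RESULT convention and the function names are invented for illustration; this is not how GPT-4's actual tool interface works.

```python
memory = {"guest": "Danielle Boccelli", "episode": "007"}

def model_step(prompt: str) -> str:
    """Placeholder for a language-model call; here it pretends to request a tool."""
    if "RESULT:" not in prompt:
        return "TOOL: recall guest"   # the "model" asks for a memory lookup
    return f"The guest's name is {prompt.split('RESULT:')[-1].strip()}."

def run_with_tools(user_prompt: str) -> str:
    reply = model_step(user_prompt)
    while reply.startswith("TOOL:"):
        _, action, key = reply.split()                    # e.g. "TOOL: recall guest"
        result = memory.get(key, "unknown") if action == "recall" else "unsupported"
        # Feed the tool's output back to the model as added context.
        reply = model_step(f"{user_prompt}\nRESULT: {result}")
    return reply

print(run_with_tools("Who was the guest on this episode?"))
# -> The guest's name is Danielle Boccelli.
```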

It is still fairly... it is still narrow. It doesn't have the full range of cognitive capabilities that we do. But every now and again, it shows little glimmers that it's getting close. And it could be that these large language models are overhyped, that their capabilities are magnified by our fascination with them, that they're really not as competent as we seem to think they are, because they've basically been optimized to wow us.

And to get us to see them as being more capable, competent, and complete than they really are. And it could be that in the coming months and years we will come to see past this sort of showbiz illusion of these things, and recognize their very real persistent and sort of dogged limitations.

Or it could be that they continue to improve as quickly as they have recently, and they get really good really fast, and people who haven't been paying attention to this suddenly wake up to a world in which AGI is a thing and they never saw it coming. So here's the service that I'm looking to offer. I don't write code. I'm not a technical person. I've... you know, my website, which is a WordPress website, is beyond my understanding. It has been built by various other people.

And one of the reasons why I started a new podcast is my creaky old website needs a lot of work to be, you know, brought up to spec, to current standards, and I'm not the guy to do it. But the service I offer is this. I'm pretty good with words. I'm pretty good at understanding abstract systems and arguments.

And while I don't have a very good memory for episodes, which is to say, I don't remember things that happened to me all that well, apparently, I do have a pretty good memory for concepts and arguments that I can reproduce on the fly. And so I'm just going to continue to pay attention to this topic and I'm going to be jabbering into smartphones and microphones, basically trying to make sense of it all in real time on stage for you to look at, listen to, comment on, and hopefully benefit from.

So I'm mostly talking from the perspective of a non-technical person, but I'm relaying ideas and understanding that I gain from speaking to technically oriented people. I hope that's valuable to you. All right, if you notice things that I got wrong, the guest got wrong, or we both just failed to consider, and remember, you've only heard half the conversation, so we might have gotten around to it in the second half, do let me know.

And I would suggest that you do that by leaving a comment on the YouTube version of this podcast or on my Patreon page, patreon.com. Now it could be that you have to be a paying subscriber to leave comments, but I do have a $1 a month donation level, which for most of you, I suspect, wouldn't really alter your quality of life if you paid $1 a month in order to be a member of the people interacting with me and with each other on my Patreon feed.

You may disagree, and since we're not in the same room at the same time talking, your judgment is law in this regard. But just putting the idea out there. All right, that's all for this episode. I will be back one week from today with another episode. I may or may not get an episode of the C-Realm Vault podcast out between now and then. There is this whole tax thing to deal with. I have just a few days left, and I haven't started. Wish me luck. Talk to you soon. Stay well.
