
AI Tone-of-Voice Recognition: Transforming Education

Dec 17, 2024, 28 min

Episode description

In this episode of Startuprad.io, we explore how AI tone-of-voice recognition is revolutionizing industries like education and startups. Our guest, Daniel Iglesias, CEO and Founder of Digi-Sapiens, shares insights into their groundbreaking technology that analyzes not just what is said but how it’s said.

Revolutionizing Communication with AI

What if your startup could decode emotional tones, improve communication, and deliver personalized user experiences? That’s exactly what AI tone-of-voice recognition brings to the table. Whether you’re a startup founder, investor, or tech enthusiast, this episode will open your eyes to the endless possibilities of this cutting-edge technology.

Key Topics Covered:
  • The Tech Behind AI Tone-of-Voice Recognition: How Digi-Sapiens’ tools go beyond traditional speech recognition by focusing on tone, cadence, and emotion.

  • Transforming Education: Learn about Laletu, a reading tutor powered by Digi-Sapiens, which supports students by evaluating their fluency and comprehension levels through tone-of-voice analysis.

  • Real-World Applications for Startups: Discover how startups can leverage this technology to enhance customer interactions, train employees, and even develop personalized learning tools.

  • Challenges and Opportunities: Daniel shares insights into navigating ethical considerations, overcoming biases in AI, and scaling this technology for global markets.

Notable Insights:
  • 80% of communication is non-verbal – Tone matters more than you think!

  • Digi-Sapiens’ technology helps address declining reading proficiency among students, providing tools that adapt to individual learners.

  • Learn why Digi-Sapiens was crowned Startup of the Year 2024 by Frankfurt Forward, a testament to their innovation and social impact.

Guest Spotlight: Daniel Iglesias

With a background in banking and technology consulting, Daniel took a bold step in 2020 to create Digi-Sapiens. His passion for education and social impact drove the development of AI-powered tools that are making a difference in schools and startups worldwide.

Why This Episode Matters:

This episode isn’t just about AI – it’s about using technology to solve real-world problems. Whether you’re an educator, a founder looking to optimize team dynamics, or an investor scouting the next big innovation, you’ll walk away with actionable insights.

Call to Action:

Enjoyed the episode? Don’t forget to subscribe to Startuprad.io on your favorite podcasting platform, leave a review, and share this episode with your network! Follow us on social media and let us know your thoughts using #StartupradioAI.

Startuprad.io™ - All Rights Reserved

Transcript

Hello, and welcome, everybody. Guys, it is getting close to Christmas. This is Joe from Startuprad.io, and therefore I'm bringing you a very special bonus episode this week, very shortly before Christmas. But nonetheless, I would like to welcome Daniel here. Hey. How are you doing? Hey, Joe. I'm doing well. Thanks. Thanks for having me. My pleasure. We may tell our audience that this recording is sponsored by Frankfurt, meaning the business development

agency who is also supporting Frankfurt Forward. And the reason you are here: you guys won Startup of the Year 2024. Congratulations on that. Did this recognition... First, can you tell us a little bit about you and your company before we get into the specific questions? Yeah. Sure. So Daniel Iglesias is my name. I am from the Rhine-Main region. I am married to a teacher. My professional background is in banking technology consulting. That's what I did before I founded Digi-Sapiens in 2020.

And what we do at Digi-Sapiens is we produce speech recognition systems that recognize not just what is being said, but rather how it's being said. So it's a speech recognition system that is geared towards special use cases that are relevant to measuring how well somebody speaks or reads. And with the help of this speech recognition technology, which is AI that we've built ourselves, you can build use cases, especially in the area of

education. So that's where we started. And the first use case that was built with our speech recognition technology was Laletu. Laletu stands for Laut-Lese-Tutor in German, which means reading-aloud tutor. So it's a tool that listens to students in schools from 2nd to 7th grade in Germany. While they read aloud, it analyzes how well they do that, so it carries out a diagnosis.

And after that diagnosis, they are trained to become better readers based on the diagnosis that was carried out before. So basically... I'm looking for a question here. Sure. When you said your tool assesses how well they are speaking, with the tone-of-voice recognition: is it only how well they vocalize the tones? Or is it also that you can deduce some level of their understanding of what they're reading? The speech recognition technology itself is

audio technology, so it listens. So it detects everything that can be caught by a microphone. And this is in contrast to what I call regular speech recognition systems, where the goal is to detect the probable intent. So, what are you probably meaning? There, the goal is to carry out a task or, what I'm looking for, a command. You have to carry out a command. Mhmm. Yeah. Play this and that song or whatever. In contrast to that, we really transcribe and listen to what has

actually been said. So this includes errors, like the ones I have already made a few times now. And, regarding the text, repetitions, things that have been left out or added, the tonality of things, and also whether you pronounce words correctly. Yeah. So that's what the speech recognition technology does. But we also develop systems that help students understand the text better by generating differentiated quizzes. So, you cannot ask

every reader the same question. It must be adapted to their reading level and also their understanding capabilities. So the complexity of the questions and the possible answers also must be adapted. So, all in all, we're in the business of providing education specialists with the necessary tools to build very innovative, adaptive tools for learning reading skills or language skills. That is exactly what I had in mind.

I vividly remember when, for the first time in my life, I understood a Chinese joke about foreigners: instead of "hao hao", she said "hao hao". So the one means "good good", the other means "mouse mouse", different tones. So the question here is, how many languages can you do? And isn't there something like Chinese, where there's a different... if I ask, the one means "please ask". So if you ask for directions,

the other one means "please kiss". Well, I made an older Chinese lady on the streets of Beijing really blush. How many of those differences could you actually do? Because off the top of my head, yes, of course, English is a little bit difficult to pronounce, Spanish as well. But if the tone really makes a difference, like in languages like Chinese, Cantonese, and so on and so forth, that is where your tool would really come in handy. So my question would be, how

much can you do there? Could you give us an idea of the granularity and the languages you cover? Yes. So as of now, we're covering German and English. We did not experiment with Chinese, since the Chinese education technology market is highly regulated and basically closed to foreigners. But in my team, I have dedicated experts in working with non-Roman languages, so especially also Indian languages.

There's 40-something. Sorry that I cannot recall the exact number, but there are more than 40 official languages spoken in India. We can work with those. We can also work with Arabic character sets. But as of now, in terms of solutions that are at hand, we can work with the German language, and next year we will also launch the English version of the API, and other languages.

We have the skills and capabilities to train and fine-tune models with a short pilot project that we need to carry out with potential customers. So that means you already... by the way, I linked it in the show notes. La Le Lu. How is it? La, le, lu... There's a song that you sing to children before they go to bed, right? It's called La Le Lu. Yeah? And we have some similarities there. So it's called

too. It's about 3. Mhmm. So, sorry, just typing here that we also linked the song in the show notes. My question... so this already establishes something we could deduce from what you're saying. So, basically, you are not a customer-facing product. You're one of the tools, the APIs, that others could include, could work with in developing their own client-facing B2B, B2C, B2G

tools. Right? Yes. With an asterisk: we have developed Laletu for our partner, Ernst Klett Verlag. So we do develop platforms and applications, but we rather license our speech technology to partners, B2B, B2G, B2C, whatever. We are already discussing the tool. But before that, because I have so many questions, I actually wanted to ask you

where this idea is coming from. I do believe I have an idea, since you're married to a teacher, but you have been working in the banking, finance, and technology consulting triangle. So where did the idea come from? How did you get that? And especially the question, when did you decide to jump ship to really do this full time? Okay. So, like you said, I was about 17 or 18 years into banking technology, and I always had the goal to promote young people in achieving, how do you call

it, higher education. Let's put it this way: to get the most out of their potential. So I did trainings in schools on how to apply for a job, how do I choose the right job for me, et cetera. And I always had this passion for helping young people. So that passion was always there. But 2019 was the time when I really thought about what I can do with the skills and the knowledge that I have to really have an

impact on our youth on a bigger scale. Because this was the time shortly after I became the father of a daughter, and I have observed certain trends in our society in Germany. 40 to 50% of the children have a background with non-German parents, or a migration background, as you would call it in Germany. And this leads to some hurdles and some difficulties in the school system, and we also, at the same time, have a shortage of

teachers. And I have also observed what happens in the market regarding the upcoming rise of AI, robotics, automation. I was part of it in banking. So I added the deterioration of reading skills to the higher requirements regarding job skills and came up with a

perspective that I didn't like for the future of my children. So that's why I decided to take my skills, which are general management, business development, and technology understanding, and work together with the best experts I can find in terms of reading capabilities and reading training, and excellent technology experts, bring those things together, and build what you find today. Mhmm. The tone-of-voice recognition is fascinating. You already told us the reading-aloud tutor is

something you developed for a customer. Could you also share another example where an external client is already using your tool? The second example is in the making. It's not ready to be shared publicly, but I must openly say that in the past we were highly invested in building that tool, which is one of a kind. It took a lot of attention and all of our resources to get it running in time and make it

scalable and stable and user friendly. So now that the product is fully marketable and almost all bugs are fixed, yeah, now we are able and ready to focus on new projects and partners. I see. We talked about partners here and winning new clients. Winning the Frankfurt Forward award is a huge accomplishment. What do you think made Digi-Sapiens stand out among the competition this year? I think it's the social impact

dimension of what we're doing. We are a for-profit social enterprise, so we are here to do good and earn some money at the same time. And I think the audience liked that idea. And I think there are a lot of edtechs out there, but pulling it off in the way that we did, by partnering with such a renowned brand as Ernst Klett Verlag as a first project, and at the same time building such a unique technology like we do,

I think that's what impressed the jury and caused a lot of support in the audience. I see. I see. Your technology has potential across industries. We already know you're working in and focused on education, but could you also see some other industries where you could, like, in the future, a few years down the road, apply it?

Yes. Outside of education, there is also the entertainment and gaming industry that could work with our technology, where you could use it to build games that are based on reading skills, which would be some sort of educational games at the same time. But you can also use it to build, how do you call it, a presentation trainer or speech trainer that helps you become a better speaker for public speeches.

The creativity and opportunities are unlimited for everything that involves the ability to carefully listen to what is being said and how. You know? Mhmm. I was wondering, I'm sure you thought a lot about potential use cases. Could you share the most interesting, the most quirky one you already came up with? No need that it actually exists, but one where you thought, theoretically, our idea could also be applied to that.

Yes. So we have applied it to learning to read, or reading promotion, but a very, very relevant use case is also in the area of language learning. You have seen big companies like Babbel, et cetera, use it in some way. Yeah. And we envision other ways, much more focused on dialogues, that our technology could be used to promote language skills or learn a new language. So we are able to evaluate how

things are being pronounced. That's one major skill. We can analyze literally what has been said. We can analyze 4 ds, all areas of language that we can analyze. And I think this is relevant if you want to learn a language properly. You have given some examples from Chinese. If you listen to people talking German with all their accents and dialects, we also have ways to tackle dialects, because of the way you use your mouth, your

tongue, your teeth. So your complete speech apparatus is also something that we can derive from the audio signal, and combining all those measurements into a cohesive didactical concept is something really unique that we haven't seen so far. Mhmm. Going into a slightly different topic, because everybody talks about accuracy, like fantasizing AI, and

ethical use of AI. With the analysis of tones, you're collecting potentially sensitive information, or your clients do, and you process this. How do you ensure the ethical use and the accuracy when handling the sensitive data? Yeah. So in Germany, it's always a relevant question whether it's personal, identity-related data.

You know? That's one major question. And the thing is, if voice really is such data, you would need to have some registry of confirmed identities that are linked to a voice profile or voice biometrics profile for a data leak or whatever to pose some danger. You know? Mhmm. And this is not the case and will never be the case. Well, let's not say it will never be the case. I don't know what happens in the year 2100. But as of now, we don't have a voice register, a public one.

And the question is also, even if this existed on a government level, do companies, other individuals, or criminals have access to this registry, and can they use it to harm you? And I don't see that. When you ask me about ethics in my context, we regard the topic of ethics in terms of accessibility to our solutions. So can somebody from Bavaria use it the same way as somebody from

Saxony can use it? And can somebody with a Turkish or Arab accent use it like somebody from Hanover without any accent? And the answer is yes. So we aim to, and we put a lot of effort into, avoiding any biases in our speech recognition system by training it very profoundly with different accents and dialects to make sure that it works with every user. Yeah. So that's how we look at that. So you put a lot of effort into that,

making it possible for everybody to be understood. We may tell the audience that there are some people who speak with very hefty local accents, not only from Bavaria, but also Thales, Saxony, and so on and so forth, but also Plattdeutsch in the very north. It's really hard to understand when you're from a different area. Many Americans will have an idea when I talk about somebody with a very

heavy southern drawl, or something that's also hard to understand. So you took care to cover all those people and not disqualify somebody there. So I do believe there was a lot of development work going into that. What challenges did you face, and how did you overcome them in developing digital... Digi-Sapiens, not digital. Digi-Sapiens. Sorry. Yes. So yeah. There were a lot. So which one can I...

yeah? So we started as a company that wanted to provide speech recognition systems only, and then we were suddenly in the position to develop a whole platform. So in a short time, we had to set up a team that was able to do that, build a product team, or UX and front-end development team, around our core technology team, and build a product that fulfills high expectations. And this was a

real challenge. To be honest, we, like a lot of other startups, were forced to publish a product a year ago that was not perfect, far from perfect.

So we got that feedback in the beginning, but we worked really, really hard with our team, which also includes the first-level support, who is directly in contact with our schools that use it and the partners, to really get a first-hand impression of what is working well and what is not, and they are very much integrated into our development process. And we took every piece of feedback very

seriously. And, yeah, now, a year after we launched, I am confident to say that we have a very unique, innovative, and highly effective reading promotion solution that we can be really proud of. I see. The AI world is developing pretty rapidly. You are right now, I would say, on the cutting edge of development. How do you make sure that you remain there as one of the top solutions, keeping up with the AI developments? We

thanks for that feedback. Yeah. And we really work hard to be seen this way. So what we do is we invest a lot into R&D. Most of our money goes into R&D. We publish papers. We participate in international conferences, where we also take on tasks and compete against other teams in optimizing models, quantizing models, and raising accuracy. And we always come out on top, also leaving

huge names behind us. So we regard this as a sport, to develop new methods, overcome hurdles, and, yeah, really try to be on the cutting or bleeding edge when it comes to sophisticated speech recognition and NLP solutions. Yes. Daniel, I'm sure there will be questions on where the papers are. I do have a few suspects when you mention something like this. After the official end of the interview, you will give me the link, and I'll post it in the show

notes. I do have a few certain suspects that always request something like this. Yes, Claude from Paris, I'm looking at you. Exactly. And then they can dig through it. So, let us go into the very last part of the interview, because I've now already been bothering you for almost 40 minutes in this online meeting and more than 25 minutes in the actual interview. So, don't worry. There are only a few more questions left. I was wondering: winning an award like Frankfurt Forward often

reflects strong local support. How has the Frankfurt ecosystem contributed to your journey so far? There is a lot. So our ecosystem is very regional. Our investor is regional. Our network, which helped us find our first customer, is from the region. We get a lot of recognition and inquiries due to that prize. So, it's mostly visibility

and also recognition. So when you approach somebody and they ask you who you are and what you do, and you mention the Startup of the Year award from Frankfurt Forward, especially in the region, everybody stops questioning whether what you are doing is sound and makes sense. So the intro and entry into conversations, building partnerships, is much easier. Mhmm. And now only 3 more questions left. You are now a leader in a very specialized niche in

AI. But what advice would you offer to other startups looking to innovate in the AI and tech space? Some kind of skills, like processes you have learned so far, not only taking your business ideas from influencers on Instagram. Yeah. I don't know if I'm the right one to give advice. Yeah. I'm not the tech guy in my company. But what I would generally suggest when looking at tech is not looking at a hype or the technology, but rather trying to solve real-world problems.

So in Germany, we have 25% of the children who, at the age of 15, do not understand what they read, and they don't understand what they read because they are not fluent in reading. So that's a huge societal problem, and that was the initial thing that led me to found Digi-Sapiens. And I would really not focus on the technology or the AI for the sake of the technology, but rather on using tech as a means to solve a real problem. Yeah. And if it takes

AI, it's fine. If not, not. But one also has to say, in German there can be very difficult sentences and structures. Everybody who tried, either in German or in a translated version, to read Kafka should know what we're talking about here. So, therefore, it can be difficult. There's a saying in German: German language, difficult language. So, actually, it says good things about you guys that you started with the German language and mastered it for your tool, Digi-Sapiens.

We usually close out with 2 more questions, and they're usually pretty simple and usually end with a yes. But I'll ask them anyway. Are you open to talking to new investors? And as always, I'll link your LinkedIn profile down here in the show notes, wherever you're watching this. No, sorry, this is audio, no watching. But anyway, you can find it either directly in your tool... I'm sorry, not every tool allows links you can then click.

So, basically, you could go to our blog, startuprad.io/blog, and there we link Daniel's profile. Plus, when you are expanding, when you're growing as a young company, I am sure you're also open to having applications from potential new employees. Right? Yes. Yes. We're always looking for good people.

Is there a career website that I could link, or should people simply reach out to you via email? Yeah. The latter. Directly reach out to us. We still have a way to go regarding an HR department, so everything's handled by the team, depending on the competency somebody's applying for. So you can use the public channels or directly contact me, and I will forward it to the colleagues. Again, go to the blog, to the LinkedIn

profile. Guys, it was a pleasure talking to you, Daniel. Thank you very much. Thank you very much for answering difficult questions for more than 30 minutes here and putting up with my stupid interruptions. Thank you very much. Thank you. Nice being here. Yeah. Thanks for having me here, and, yeah, Merry Christmas and a Happy New Year to everyone. Thank you. Yeah. Merry Christmas. Happy New Year to you as well. Thank you, guys. Bye bye.

Transcript source: Provided by creator in RSS feed