13: Is this real, Data Science, or is it a fantasy?

Oct 12, 2021 | 46 min | Season 1, Ep. 13

Episode description

Over the last decade, we have seen tremendous advances in big data, data science, artificial intelligence and machine learning. Every company wants to be a tech-first company now, and wants to “do data science”. Companies can probably double their valuation by just adding a “.ai” to their names. Companies that actually use artificial intelligence and machine learning may have an even higher premium on their valuations.

However, is Data Science worth the hype? Is AI going to take over the world? And why is data science being eaten by computer science? What happened to classical analytics, operations research and statistics?

This week’s guest is someone who did data science even before the phrase had been invented.

Amaresh Tripathy is SVP and Analytics Business Leader at Genpact. Till recently he was a Partner with PWC, leading the firm’s Data & Analytics Consulting, and helped build a $500mm business. Previously, Amaresh founded and co-led the Information and Analytics Practice for Diamond Management & Technology Consultants, and also serves as Adjunct Professor of Data Science and Business Analytics at the University of North Carolina, Charlotte.

Amaresh has helped Fortune 500 companies in multiple industries (healthcare, retail & consumer, communications) define and implement their analytics and AI strategies and institutionalize data-enabled decision making. He has led organizations in embedding analytics in their front, middle and back office functions and in managing the change process.


Show Notes:

00:03:00: Definitions - data science, artificial intelligence, machine learning, etc.
00:04:15: The rise of computer science and machine learning
00:10:15: The problem with Kaggle, and the “race for accuracy”
00:11:30: How to scale analytics without doing bad data analysis
00:18:00: How selling data science has changed over the last decade
00:23:00: The interaction between business and Data Science
00:26:30: “Creating bilinguals at scale”
00:30:30: Machine learning trying to eat data science
00:39:00: Comparing data science practices across countries

Links:

Thomas Davenport and DJ Patil on Data Science as the “sexiest job of the 21st century” (2012 article)

Hal Varian on statistics as a “sexy job”


Data Chatter is a podcast on all things data. It is a series of conversations with experts and industry leaders in data, and each week we aim to unpack a different compartment of the "data suitcase".

The podcast is hosted by Karthik Shashidhar. He is a blogger, newspaper columnist, book author and a former data and strategy consultant. Karthik currently heads Analytics and Business Intelligence for Delhivery, one of India’s largest logistics companies. 

You can follow him on twitter at @karthiks, and read his blog at noenthuda.com/blog

Transcript

Intro / Opening

On the other hand, I think we definitely have lost some of the discipline of analysis in the name of analytics. Right now, you are seeing a lot more scaling from, I would say, a tech-first mindset rather than a decision-first mindset, and over a period of time I think it will find its own balance and settle out. When I think about talent scarcity and all these challenges that we talked about, the biggest challenge is getting those bilingual folks.

Hello, and welcome to Data Chatter, the podcast on all things data. This podcast is a series of conversations with experts and industry leaders in data. Each week, we aim to unpack a different compartment of the data suitcase. I am your host, Karthik Shashidhar. I'm a blogger, newspaper columnist, book author and a former data and strategy consultant. I currently head Analytics and Business Intelligence for Delhivery, one of India's largest logistics companies. You can follow me on Twitter at @karthiks, that is K-A-R-T-H-I-K-S, and read my blog at noenthuda.com, that is N-O-E-N-T-H-U-D-A dot com. All opinions expressed in this podcast belong to me and my guests, and do not reflect the views of any organizations we might be associated with. Nothing discussed in this podcast should be taken as [unclear].

Over the last decade, we have seen tremendous advances in big data, data science, artificial intelligence and machine learning. Every company wants to be a tech-first company now and wants to do data science. Companies can probably double their valuations by just adding a dot AI to their names. Companies that actually use artificial intelligence and machine learning may have an even higher premium on their valuations. Is data science worth the hype? Is AI going to take over the world? Why has data science been eaten by computer science over the years? And what happened to classical analytics, operations research, statistics, etc.?

To answer all this, we have someone who worked in data science even before the phrase had been coined. Amaresh Tripathy is SVP and Analytics Business Leader at Genpact. Till recently, he was a Partner with PwC, leading the firm's Data and Analytics Consulting, and helped build a 500 million dollar business. Previously, Amaresh founded and co-led the Information and Analytics Practice for Diamond Management and Technology Consultants, and he also serves as an Adjunct Professor of Data Science and Business Analytics at the University of North Carolina, Charlotte.

Definitions - data science, artificial intelligence, machine learning, etc.

Let's start with the fundamentals. By your definition, what is data science? What is analytics? What's machine learning? What's artificial intelligence?

Yeah, Karthik, I actually don't even think of any of those things. All these terms have been so bastardized that where one ends and the other begins, or where they overlap, is hard to even get your head around. What I really think about is: how do you make better decisions? What are the decisions? How do you make them more efficient? And to do that, you have to think about data, insights, outcomes and process, right? Or workflow of some sort. So: data, insight, workflow, actions and outcome. In some ways, that's it, and that's what matters.

And then, to go from data to insights, there are many things you need to drive, and you could apply all kinds of techniques and all kinds of terminology. All the things that you talked about fall somewhere into that, and artificial intelligence and the automation stuff probably fit into that too. In some ways it doesn't really matter. I really start by thinking about decisions and work backwards.

The rise of computer science and machine learning

Everything else is pretty much a marketing term.

No, so recently, actually, I was rereading this famous article by Thomas Davenport and DJ Patil in HBR from 2012; I'll link it in the show notes. This is the one that calls data scientist the sexiest job of the 21st century, or something like that. And I was rereading it and I was like, okay, this is all about hypothesis testing, designing experiments, how you collect data and so on. There's nothing about machine learning.

But in the industry, I found that some people sort of use these two interchangeably. Somebody once even told me: if you're not doing machine learning, then how do you call yourself a data scientist? So why have machine learning, and programming in general, taken center stage in data science over the last 10 years?

Yeah, that's really interesting. I'll just go back to that same article. I think the article quotes Hal Varian, Google's chief economist, right? And he was an economist to start with, and he talks about it from his vantage point. And you hear the whole notion of design of experiments and everything that you referred to there. But what has happened, for those of us who have been in the space of analytics for a whole bunch of years now, is this.

It used to be statisticians, econometricians, maybe operations research people who used to do this, right? The initial analytics 1.0 wave was essentially those folks.

And then, obviously, a bunch of things happened in the world of computing, and you heard the words big data, and then a lot of it has become less and less about thinking about the decisions and more about the order of accuracy, because you could compute your way through it.

You could brute force your way through it, getting more and more into numerical and mathematical techniques that allow you to brute force and solve the problem. Which is a good thing in some ways, because it creates access to some of these techniques.

And essentially, analytics has evolved into more of a discipline of computer science and computing than some of the earlier things we talked about, which is why you see machine learning and artificial intelligence and those terms being thrown around. But essentially it's, I would say, the march of computer science on this whole field, and that has a huge set of advantages: I think it democratizes it, and it scales it up in a very different way.

You could use it in all kinds of things. I still remember there were experiments that I would run, SAS code that would go on for 24 hours, and you don't see as many of those right now. But on the other hand, I think we definitely have lost some of the discipline of analysis in the name of analytics.

Can you elaborate a bit on that, on what we have lost?

If you think about analytics, and if you think about decision making, I mean, why are we doing any of this? It is fundamentally to change some behavior, some decisions, and make them more effective. To go and do that, basically, you're trying to mathematically model reality in the form of an equation. And reality is pretty damn hard to model. There are all kinds of things that you don't understand. You don't know what's going on and everything.

There is an art to it, through the whole process and in what you think about. We talk about bias and explainability as if they are new, like they were invented yesterday. People have been thinking about that from the very, very beginning. And that notion of thinking about the mathematics as a mathematical representation of reality: there is an art to it. When you actually start thinking in those contexts, you don't think about automating, putting all the data together, doing all the feature engineering at one go, trying all of it in an XGBoost model, then doing an ensemble model and finding an answer.

That is like: okay, I'm throwing the kitchen sink at it, and yes, some version of reality is there. But there is a structured process around thinking about what the statistics tell you, what it actually means in real life, and whether it would even make any difference to the decision. That, I think, is what I'm talking about. There is an art and a process, and it used to be learned through an apprenticeship model.

You learn that thing over a period of time. In a world which is a computing-first world, you think in terms of calling functions from scikit-learn, and replacing that.

So I think this is a sort of constant argument, if I might call it that, that I have with some of the younger people in my team, because I have come from a conventional sort of background.

I learned statistics first, then I learnt econometrics, and then started applying it in business and so on. So for me, it's very much like: you need to understand the data. You need to know what you're modeling. Why do you want an XGBoost here and not a random forest, for example? What is it about this data set that suits it for XGBoost? But for the younger people in my team, for a lot of them, the approach is like: here are 20 different models, here is this data set. Let's just apply everything and then see which gives the best accuracy, and we'll just pick that.

Yes, yes. I mean, in the world of Kaggle, it's all a competition about accuracy. Accuracy towards what, and for why? Sometimes that's lost.

Yeah, no, actually Kaggle reminds me of that. I mean, I think it has done a huge disservice. I mean, it's done some good work. It's put some interesting problems out there. It's allowed lots of people to work on interesting things and showcase their work in public and so on. But I think what's happened is it's just become an accuracy race,

The problem with Kaggle, and the "race for accuracy"

right? It's almost like you're trying for the best R-squared or whatever, right? So I think that's one problem with Kaggle, where you just go for the highest accuracy rather than, let's say, having the best model that you can understand, and things like that.

I think where I'm going with it is: when you start modeling reality, and you start thinking in terms of decisions, you actually force yourself not to go after the accuracy, but after the pragmatism and the practicality of getting something done. And by the way, there's the psychology of decisions, behavioral economics and everything; there is a whole host of other things that really matter. You might have the best answer, but you still won't make a decision based on it, because there is a whole host of other things going on, which is what makes the whole field so fascinating. There are whole disciplines there: there is behavioral economics, there is experimental design, there is statistics, there is operations research. There is computing, but it's one of the 15 things that actually are important. But it kind of sucks the oxygen out of a whole lot of other things. Sometimes you just have to be careful about that.

So, and I think what's happened is, I think you are particularly

How to scale analytics without doing bad data analysis

in a good position to answer this. So if you think about it, analytics has scaled massively over the last 10, 15 years since you joined the industry. A lot more companies are using it, which means that we need a lot more people to deliver this, and so on. So when you scale so massively, how do you do it without dropping quality, without dropping your standards, making sure that you don't do bad data analysis?

Yeah, it's hard, because it's seen as: okay, if you learn three tools, you're kind of getting it. But my fundamental belief has always been that you have to understand the context of decision-making, and to understand the context of decision-making, you have to actually understand the workflow, right? And not a whole lot of attention is paid to the process part of that equation.

You always make decisions in context, and you make every decision as part of a sequence of things that is going on. You have to understand that context, whether you are making a delivery route choice, or a choice around which application to accept for giving a loan, or deciding which forecast you are going to use to plan your production. These are all decisions.

And these are very real-life, very practical decisions, and you need to understand how all of this works together. There is a salesperson who's trying to forecast, and he's trying to protect his [unclear], and there is the process that he follows when he's talking to his or her clients. How does the whole process work? There is a richness around that if you understand: okay, what are you influencing in that whole system? That becomes the starting point. And actually, in some ways it's not very hard. These are a practical set of things that go on, and you work backwards from there. You go backwards from that and try to understand: okay, what do I need to influence? And what techniques do I need in order to do that?

It's basically almost like going left all the time. The more you are able to do that, the better. I would say companies or organizations that think like that can scale up pretty effectively. It's just damn hard to do that, especially because it's easier to just say: listen, you need to understand these three tools, these two techniques, this one certification, and you go.

So it's about that balance. So, to answer your question about scale: you absolutely can scale analytics, data science, the whole thing, but the question, I think, is whether the mindset is the right one to scale or not. And I think as an industry we will kind of get there. But right now you are seeing a lot more scaling from, I would say, a tech-first mindset rather than a decision-first mindset, and over a period of time I think it will find its own balance and settle out.

Yeah. Because I think what's also happened is that, especially in some older organizations, you'll obviously have a few old-timers, let's say, who would have been sort of decision-first, and they are somehow able to percolate their way of thinking through the organization, and that works great. But what I also see now is that in newer, usually sort of younger companies, pretty much everybody who's, let's say, working in analytics or data science has graduated fairly recently, in the tech-first world, post 2010. Because if you think about it, 2010 was sort of a massive inflection point for the industry, I think because of a combination of big data and cloud and faster computing, which enabled better machine learning and things like that.

Like, when I did my undergrad, nobody was working on AI. So from my generation of computer science people, I don't think anybody really works in core data science.

No, exactly, I agree. I mean, it was the AI winter. That was the term.

Yeah, so I think 2010 was when you had all the data sitting in the cloud, everything sitting in one place, in Google Cloud or AWS, and some of those things became a lot more mainstream. A lot of the tooling came out, and it just made it a lot easier.

That's why I normally think about: have you started in this world after 2010 or pre-2010? That kind of tells you a little bit about what your mindset is.

Yep, that's what I found. I mean, I am grossly generalizing here based on a few data points. But if you have a team that is mostly post-2010, people who have sort of looked at data science from a very tech-first perspective, then it's possible for the entire organization to sort of get lost in the tech and not really look at the decision part of it, right? So, considering that, given the distribution of ages and so on, if most teams are sort of of this nature, how do you make sure that you don't end up doing what I would just call bad data analysis?

Yeah, it's not bad; it's good, it's fun in some ways. I mean, there are some fantastic data scientists after 2010 also; I'm just over-generalizing. But you will know very quickly. I look at organizations all the time.

Do you have relevance? Even if you are embedded in the product and the platform of the company, most of the time, in the business decision-making set of things, do people care about what you're doing? Do they actually use any of it, right? And that automatically tells you a lot. And some of it, I mean, what's really amazing is chief data and analytics officers becoming so prevalent so quickly as a function, but you can also see the growing pains. Essentially, they are the CIOs of the 1990s, going through the growing pains of: how do you establish what these folks are? What do they do? How do they make themselves relevant in an organization? So we see some of those challenges.

So, yeah, I would say the system kind of works itself out, because in the larger business analytics kind of context, those teams

How selling data science has changed over the last decade

have less and less relevance, and things change, and the next version comes along.

Makes sense. Changing tracks a little bit: I think we've spoken about what data science has been like and how it's evolved. From a selling perspective, how have things changed in terms of organizations being more receptive to this? I think 10 years back, there used to be huge concerns about: can I give my data to a third party to get them to give us insights? And so on. So how has that changed over the years?

Yeah. I mean, I've always been on the professional services and selling side of it. There is the selling of that, and then there is the analytics group selling internally to their stakeholders. I think I've learned this on both sides of it. I think two things have happened.

Number one: this discipline was, I would say, a back-office discipline in the early 2000s. Now, over a period, the good part of it, whether it was big data, AI, all the hype and the buzzwords that came along (and what becomes more or less important changes every few years), is that it has become a very business discussion, right? I have had clients who would ask: okay, how can I get AI faster, whatever that means? But the point is, there's an education journey that actually happens. There is a lot more awareness that we need to do this, and why it's important, and everything. I would say between 2010 and 2015 (before that, there were early movers) there was a massive education and awareness that happened in the corporate world around these topics. And I would say even before that, a little bit. And that actually has opened people up. I would say the vast majority of people are at "we absolutely need to do this". The why you need to do it, and exactly what you need to do, we are still, I think, working out.

And now there's this new transformation, whatever word you use. In that broader context, people fairly early realized: I'm going to use the technology in my ecosystem in a very different way, to interact with my customers, interact with my employees, or interact with my partners in the ecosystem. All of that stuff is happening, and with COVID-19, which is crazy, everyone realized you can do it in very different ways. So that whole awareness is creating a huge pull towards it.

The data side of it, by the way: pre-2010, that was actually a lot easier discussion. It was: okay, you want to do this, here's the data, do whatever. Right now, the data discussion is a lot more involved; data has suddenly become a corporate asset, right? And so everyone is starting to think about how do I manage the data and everything. But there's a lot more progress being made also on how to manage the data, how to de-identify it, how we're going to do that. So that thing has progressed well.

So yeah, it's a supply-side market right now. It has been for the last 24 months, maybe 48 months, and I see it continuing: very much a supply-side market rather than a demand-side market. There's a ton of demand out there. From our perspective, where the rubber meets the road, however, is: yes, there are a lot of things happening, but how many of them are being used? How many decisions are you making better because of this? On that maturity, we are still, to use a cricket 50-over analogy, probably in the first 20 overs.

Now, the other thing is that one of the other guests on the podcast had mentioned this in a financial services, investment banking context: right now, it appears that the purpose of having a data science team is not to do data science. It's to have a data science team.

There's a little bit of that, like: I need one of those, and then I'll figure out what to do with them. Yes. I mean, I would say, at least in the last 24 months, if nothing else, we are moving from the early adopters to scale. You just sense it: everyone wants to do it at scale, in a few areas which have been relatively proven out.

So yeah, there is a little bit of vanity in having things like that. I see less and less of that, though.

Yeah, because I think there's only so far that vanity can go, right? I remember one of my old clients who was like: okay, what you've done is fine, but can you put a little bit of machine learning into this? It will be good for our investors.

It is that, by the way. That is the other big piece of it: the investors. A lot of it is driven by investors, and overnight everything became AI, right?

I mean, sometimes one of my

The interaction between business and Data Science

jokes is like: yeah, if you're able to sort a column, let's say with a sort function, isn't that an AI function?

So this brings me to one other thing. Again, I think we were talking about business impact and things like that. I guess you deal with a lot of business folks who are looking to introduce AI or whatever into their process. So what is the equation between the business guys and the analytics guys nowadays? Because one view that I have seen in the past is that either the business doesn't understand it and just lets them do what they want, or they just assume that whatever the analytics team has done is great: it's artificial intelligence, so we ought to use it, kind of thing. So how has that evolved over time?

Yeah, actually, the second I don't see a whole lot. Number one, the whole notion of business guys and analytics guys, that divide, the way we even talk about it: I think that will go away. It's a matter of time, because if you go and look at the business school courses of today, whether it's MBA classes or marketing classes, or even studying to become a social scientist, you have analytics and data science courses in pretty much every curriculum out there.

At least in most of the universities, you will go and do that, and then you have data science courses also. So I think that whole barrier of understanding what some of these techniques are and what they mean is very rapidly going away on one end of it.

On the flip side of it, right now, it is seen as though these are the technology people and this is the business person, and the technology people continue hyping up newer and newer things that are coming in, and the business guys are like: what are you telling me? This thing makes no sense. You don't understand the pragmatic aspect of it. Which is why, at least for, I would say, the next few years, this whole notion of what I call bilinguals becomes very important: people who can understand analytics and who can understand the business, or the context of the domain part of it, and how it all interacts, and who can go from one to the other, right?

You could be a very good marketing person who understands the technology and the analytics portion of it, or you could be a very good analytics person or tech-first person who has picked up marketing. But you need to be bilingual to be successful right now. When I think about talent scarcity and all these challenges that we talked about, the biggest challenge is getting those bilingual folks in place. The moment you get that, you will be able to bridge that divide, and that is probably going to be the mechanism to bridge and drive value in the next few years, until I think a whole lot of bilinguals show up, because that's what we are training people for.

Yeah, because I think I was sort of an early adopter, in some sense, of being a bilingual: somebody who's probably better in business than most data people, and better in data than most business people. I found myself in this sort of bucket 10-15 years back. And I found it difficult to sell, in some sense, because there are very few bilinguals, and even now I don't see a whole lot of them. I either have to take business people and sort of teach them programming, or take

"Creating bilinguals at scale"

take data people and sort of try to make the make sure they First and the business and so on. So it's a bit of a that's I still see that as being a challenge for the next few years, at least it is. And if you have bilingual set scale that becomes a very big differentiator, how do you create bilingual set scale? If you, it's probably, if you start from a business mindset and go to technical, they have a better chance by my sense of it versus.

If you start as a, just from a technical impact on the business. Because if you have it, This goes back to the, my process, my decision making in the context of something. If I understand that, and the tools are becoming easier and easier to take. And think those things are becoming easier easier to understand that. I know how to apply the right way. There are some things you have

to be careful about. But I think that is how you do bilinguals at scale. I mean, if you think about it, the big industry buzzword around this is data literacy. There's some value in it. I mean, if done well, that has probably the best ROI, because you start creating bilinguals at scale within any organization, and they're able to pull the rest of it through. If you are able to

solve the pull problem. Most of the approaches have been, for a discipline that has grown by worshipping the brute-force technique, to try to brute-force more technology through this. Rather than doing that, if you start solving for the pull problem, which is how you would create the pull on the other side rather than creating more push, I think you'll have much better results. And why did you say that?

Like, you think it's better to teach technology to business people than to teach business to technology people? Fundamentally, because it's becoming simpler, right? I mean, if you look at SAS code versus Python code, that alone will tell you a little bit about how easy this is. Or think about the libraries that you had to write in C or Java versus what you have now, like scikit-learn in Python, which you can just learn.

Or think about, to really date myself, Crystal Reports, where you required a programmer to put a table and chart together, versus what you can do in Power BI or Tableau on the basic stuff. You can learn a few R things. It's just becoming a lot simpler and more democratized, and technology companies fundamentally want that.

They want a lot more democratization. This is the arms race, and they want everyone to have a gun. It's basically that. And once you have that, then it becomes a lot easier. That's why I think fundamentally you are going with that flow, where we're all creating the environment to make more of that thing easier.

So the technology barrier is a little bit less, even though things are very confusing in other ways. You don't have to be a big programmer right now, because of all of these things. Yeah. But I guess one of the downsides, at least if you ask the more technical people, what they'll tell you is that if you have people who don't have a math background who are just applying these things as black-box techniques,

then that will increase the likelihood of bad analysis in some sense. That is absolutely right. But there's a great analogy, and I used to do a lot of work with Motorola: Motorola versus Apple. The Motorola radio, I think, was a fantastic radio, and you could hear a lot better and everything.

The iPhone didn't win because it had better voice reception or better call quality. It won because it was more usable, and it created an ecosystem in a very different way. So yeah, I mean, there are dangers of what you were just talking about, and there would be bad analysis. You have to figure out ways to protect against bad analysis, and there could be liabilities around bad analysis. There could be regulations in this space, I think, in the next few years.

So all of that stuff notwithstanding, I think the volume and the number of decisions that you are going to influence, on balance, would probably be

Machine learning trying to eat data science

the better trade-off. And I think that is the side we will go towards, and we have to figure out how to manage the rest of it as we go along. Okay, I think I'm going to take another big jump right now. So far we have talked about organizations, business, everything. So coming back to analytics itself, right? As you mentioned, when it started, it was mostly stats people and OR people who were into

this, and so on. But now, I know a few friends who started off as hardcore OR people but who now do nothing but machine learning. It's almost as if machine learning has eaten up every other branch of data science. So what's happening? Why is it that it has taken over so much, and is this sustainable? In some ways, is it like one technology replacing another?

My simple hypothesis is tooling, right? The tooling available to do machine learning is a lot more democratized than the tooling available to do, say, design of experiments or operations research, right? I mean, if you really think about it, from CPLEX to Gurobi in the OR world, and I'm an older guy,

so yeah, there have been a few things that have happened, but a lot more innovation, a lot more thinking has happened on the machine learning side, and using the tooling there has become a lot easier. That's one side of it. The second is the number of use cases. I mean, at the end of the day, sorting and sequencing certain things in a smart way has a lot more use cases.

I mean, for all the variations you have of regular regression and logistic regression, there are just a lot more use cases to do that. It is easier to go and apply machine learning. I mean, if you're doing regression, you are doing supervised learning of some sort, and there are just a lot more use cases.

With optimization, there are fewer of those use cases, comparatively. And third, it's easier to take the art part out and do that. I mean, the other big thing is simulation. Simulation and system dynamics, for instance, are super powerful, especially when you're in a sparse-data world and you have a lot of expertise to build on. Very powerful, but it doesn't scale, and the tools for doing it, they're

still clunky. So part of it is tooling, part of it is where the use cases are, and part of it is where you can very quickly industrialize certain things. So it has gone the machine learning way for a reason. And over a period of time, because that is where the demand is, even with a lot of OR problems you start thinking, can I make them prediction problems rather

than optimization problems? And there are ways you can think about that, and that's what people are doing. I mean, you are essentially working in an OR company, and when we were chatting, you were telling me you probably use more machine learning, and you have made a lot of those problems prediction problems versus

optimization problems. Yeah. I mean, I do have some OR people in my team, because I'm still old school like that, where if you formulate the problem in a nice and elegant way, then you'll be able to solve it far more cleanly rather than throwing the kitchen sink at it. But what I find is that even as

a sub-problem to the OR, we end up doing some machine learning models to be inputs to the OR thing. So it's not one or the other at the end of it. That's it. As I said, more and more problems. I mean, there is a great book, Prediction Machines, by Professor Ajay Agrawal, who is an economist who studies AI. And his thing is, you make

it simpler and make it easier, and you end up framing more problems as prediction problems. He has this whole theory around prediction machines, but that's the core of it. Why? Because we are making more and more problems prediction problems. So you incorporate more

of that stuff. Yeah, you'll still need the OR stuff, but if you make it easier, if you bring the cost of prediction down, you will create more prediction problems. That's fundamentally what's happening. Yeah, I think there's sort of a virtuous cycle in machine learning, because everything is open source, everything is open, everything is easy to use, like the theoretical three lines of scikit-learn code.

You can do whatever you want. So it's easy to run, which means that you have a lot more people doing it. And so the tooling again becomes better, and it just leads to much better outcomes. Better outcomes or not, you are just going to use it a lot more and apply it a lot more. It only needs to be an incrementally better outcome than what you

started with. Yeah, but the other thing I was trying to ask is, do you think there's some sort of an arbitrage in data science? Because I've long felt that, the way a lot of people do data science, et cetera, there are things that can be done better. But it's difficult to sell that. You think you can do better than a lot of

people through, let's say, being bilingual or whatever it is. But is there some arbitrage, for lack of a better word, in the way data science is done nowadays? And how do you exploit the inefficiencies in the system right now? Because right now, you know, there's a lot of hype around data science, hype around machine learning.

So clearly, if it's a bubble, there is an arbitrage somewhere. Yeah, every bubble has an arbitrage. And then you go back to the core competency notion, right? I mean, this is the question I ask a lot of friends and colleagues and clients: what is the IP in some of these things that we are doing in analytics? If it's three lines of scikit-learn and then we are going to apply an algorithm, that is definitely not the IP, right?

But data access, how you put it together and how you organize it, there the IP is absolutely there. And that, I think, is important, but it is very contextual to the problem and how you are doing it. The very big IP in an organization, I think, is actually how you incorporate it and drive the change in decision-making, the cultural change, the process.

That is where a lot of the IP is. And so if you start thinking from the core competence model, if this is where my core IP is, I'm going to concentrate on that one, and it absolutely makes sense to take the rest of it and go with more specialized people who just do that, right? So that's an arbitrage opportunity from an organization perspective. And so it's not better or worse. It's about thinking about where you can do it,

which is where you get scale economics. I'm going to go and drive that. So that's vector number one: we can do better from a cost and speed perspective, and both are very important. In terms of the quality perspective, we've already talked about Kaggle-like better algorithms. From a practical perspective, for most problems, 90% or 85% versus 95% won't matter, right? It will be the last mile of that

cultural change and everything. That's where the things are going to happen. I don't think we will arbitrage that out, and a lot of the rest doesn't matter. So yeah, that's kind of where, in professional services, you see a lot of the value. That's one. And the other aspect of it is in product companies. They start thinking about what will be in the platform and what will be the ecosystem around the platform.

And that's how they make decisions, in my mind, which is: okay, in platform, I'm going to do it internally, and around the platform there are a lot of other things that will happen. I'll create some sort of

a partner ecosystem. I'll open up a lot of things you can do. So rather than thinking about arbitrage, I think about core competency and where the value is: what is the IP that cannot be replicated, or that no one else is going to build for you? How you want to deploy in your organization, within your own cultural nuances and everything, when you're operating

in 15 different countries, no one else is going to bother about creating that. You have to figure that out yourself. Everything else, someone else has probably done the same thing five times, 15 times, 30 times. You can probably leverage some of it, if it's not a core

Comparing data science practices across countries

part of your core product in some ways. That's, I think, one of the ways to think about it: where is my IP? Not so much arbitrage from a cost perspective, but from a speed and decision quality perspective, somewhere over there.

You mentioned how to deploy it in different countries and so on. Since you run a very large global organization, I think you would have worked with organizations pretty much everywhere. So how has the uptake of analytics and data science varied across geographies over the last 10 years? How was it in, say, 2010? How is it now? How does Europe differ from the US, and so on? I mean, it's actually really interesting for larger companies.

The correlation is a very simple one: who has invested? Where has the ERP investment gone, and in which order, and so on. Because once you put the process part and the data together, right, you have figured out the process. Now you make the process intelligent, and all that. If you want to make broader decision making intelligent, you need to have some of the basic

stuff and plumbing in place. Which is why you see the US as a massive market, Europe as another massive market, and everything. So that's a very easy, high-level thing to look at. But what I see in emerging economies, especially if you do work in Asia, is completely crazy. I mean, I was at a conference in Hong Kong a few

years ago, with insurance companies from Hong Kong to Mainland China. To give you an example: we had some Mainland Chinese companies presenting, and then we had companies from, like, Australia presenting. Right? So the whole of Japan, Australia, you could see the entire ecosystem. What's fascinating is that the use cases were like Star Trek, like the Matrix, on one side, and like the 1970s on the other. I mean, it was that stark in some of the things they were talking about.

We are talking about an insurance company that recruits 100,000 agents into its network in an automated way, through video chat, through AI, right? They are doing that right now. These guys are talking about all of these kinds of things, and the next guy is talking about a churn model, the kind of thing you go to school to learn, for preventing some customers from leaving. And I was like, okay.

That's 1980. Someone attempted it in some ways back then, and by 2010 everyone had some version of it, but these guys are doing something else altogether. And there is a little bit of a cultural thing about AI, I think, in China, which people do appreciate. But until you see it, and this was the first time I saw it, you don't realize: okay, they are operating in a completely different world altogether right now.

So they have kind of leapfrogged. The point I'm trying to make is that there are geographies which are going to just leapfrog. I think India is probably also in kind of the same situation, with scale and the whole notion of a national focus around it.

Like, the ERP world goes directly to a cloud ERP world, so very quickly all the legacy things don't even matter. Your regulations and your perspectives on data privacy probably have something to do with it, and just the focus around an entire ecosystem. I think there are a few geographies where a bunch of these things will fall in place, like in mobile phones. You think about fintech and M-Pesa in Kenya versus the rest.

We are still figuring out, in a lot of the advanced economies, how to make some of that work. It's kind of the same thing. There'll be a leapfrog effect on analytics and AI in certain emerging economies, which will be very fascinating to see. I mean, in India, in education, I think I can see that. So some of that stuff will happen, but the Western world has essentially followed a very simple thing:

Where the ERP dollars have been spent, and if I've spent those dollars, the analytics will probably follow. That's fundamentally what a lot of the companies have followed. Okay, Amaresh, this has been a fascinating conversation. I mean, I loved our conversation over the last 45 minutes. So in closing, do you have any final things to say about what we see as the hype in data science that's going on right now?

How is it going to play out? I mean, there are hype cycles for a reason, right? So for folks who started their careers in the late 90s, like me, we saw the internet hype cycle in some ways, right? I mean, there was a bust, and I started just before the bust of the dot-com, and there was a little bit of a quiet period. But because we had invested so much and we had done so many great things at that point in

time, you just suddenly saw all kinds of productivity, all kinds of cool things happening. So, given that, yeah, there'll probably be a little bit of disillusionment at some point, and then it scales up. Or you can argue this has been going on, the last eight or nine years were a little bit of the disillusionment, and now the productivity plateau will start in a very different way. So I don't know what the right answer is

going to be, but one thing is there: the hype definitely leads somewhere. Some of the things we've been talking about lead to more general acceptance. And that, I think, is a given, and I think it will be huge. I think it was Arthur C. Clarke who said it, right? You are normally wrong about the future: you overestimate it in the short term and underestimate it in the long term.

This is one of those trends, one of those things where I think we are absolutely underestimating the long term, and the hype and everything is because we are overestimating in the short term. I think both of those things are happening at the same time, which is fascinating to see. And part of the hype will go away, because the business analyst of the future is

the data scientist of today. So everyone is going to be a data scientist or a business analyst of some sort, whatever you call it. You call them data scientists or you call them business analysts, the new business analyst 2.0. So yes, it will be a lot more pervasive. It is just a matter of fact that it will happen. So that one is definitely coming.

And I mean, in some ways, I feel blessed and super excited to just see how everything unfolds. For someone like you and me, in some ways we have given our careers to this whole thing, and it's been evolving in front of us. Thank you for listening to Data Chatter. If you liked this show, please leave a comment, share and subscribe to the podcast. You can find this podcast on Apple Podcasts, Spotify, or wherever else you go to

get your podcasts. Once again, this is Karthik signing off. Thank you.

Transcript source: Provided by creator in RSS feed