1. Excel-lent Graphics

Jun 22, 2021 | 58 min | Season 1 | Ep. 1

Episode description

When we read or talk about “data science”, most of the talk is around modelling - the maths behind it, the “cool” modelling techniques, what kind of CPUs or GPUs are required, and all that. What we normally talk less about is how data science interacts with business.

In this inaugural episode of Data Chatter, I talk to S Anand, co-founder and CEO of Gramener, about this so-called “interaction layer”. Our conversation is almost completely focussed on two such interfaces - Microsoft Excel, and data visualisation. We talk about various aspects of what it takes to communicate data to business, and pros and cons of different tools.

Anand is a co-founder of Gramener, a data science company. He leads a team that automates insights from data and narrates these as visual data stories. He is recognized as one of India's top 10 data scientists, and is a regular TEDx speaker.

Show Notes:

00:03:40 -  On how Anand was “always a data guy”
00:12:01 -  Anand’s first tryst with automated infographics
00:15:30 -  What visualisations work best for whom?
00:22:00 -  Visual Basic and Python
00:27:15 -  “Gymnastics in Excel”
00:32:00 -  Creating choropleths using Excel
00:41:20 -  Google Sheets
00:45:30 -  Business Intelligence Tools, such as Tableau, Power BI, etc.
00:52:09 -  Pie charts

Links:

Gramener: https://gramener.com
Anand’s website: http://www.s-anand.net
Tufte’s seminal book: https://www.edwardtufte.com/tufte/books_vdqi

Data Chatter is a podcast on all things data. It is a series of conversations with experts and industry leaders in data, and each week we aim to unpack a different compartment of the "data suitcase".

The podcast is hosted by Karthik Shashidhar. He is a blogger, newspaper columnist, book author and a former data and strategy consultant. Karthik currently heads Analytics and Business Intelligence for Delhivery, one of India’s largest logistics companies. 

You can follow him on Twitter at @karthiks, and read his blog at noenthuda.com/blog

Transcript

Intro / Opening

So the first thing that I put together for them was actually a mini data visualization of sorts, a scatter plot matrix, which of course didn't exist, even by name, at that point. This was in 2000. Ninety percent of the stuff you can do in Excel. Nine percent of the stuff you can do in these tools. For the one percent that neither of these can do, go to JavaScript, Python, whatever you need. Welcome to Data Chatter, the podcast on all things data.

This podcast is a series of conversations with experts and industry leaders in data, and each week we aim to unpack a different compartment of the data suitcase. I'm your host, Karthik Shashidhar. I'm a blogger, newspaper columnist, book author, and a former data and strategy consultant. I currently head Analytics and Business Intelligence for Delhivery, one of India's largest logistics

companies. You can follow me on Twitter at @karthiks and read my blog at noenthuda.com, that is n-o-e-n-t-h-u-d-a dot com. All opinions expressed in this podcast belong to me and the podcast guests, and do not reflect the views of any organizations we may be associated with. Nothing discussed in this podcast should be taken as financial or legal advice. Now on to today's show. When we think of data, many of us instinctively think of

spreadsheets. And that means Microsoft Excel, the world's most popular spreadsheet software, at the time of recording at least. In some ways, the use of Excel in the world of data analytics is much underappreciated. Another thing that many of us instinctively think of when we think about data is graphics and visualizations: bar graphs, line graphs, dashboards and the like. In this inaugural episode of Data Chatter, we bring together these two

obvious concepts in analytics. My guest today is S Anand, or Stud Anand as he's known in IIM Bangalore circles. Anand is a co-founder of Gramener, a data science company. He leads a team that automates insights from data and narrates these as visual data stories. He is recognized as one of India's top 10 data scientists and is a regular TEDx speaker. Anand is a gold medalist at IIM Bangalore and an alumnus of IIT Madras, London Business School, IBM, Infosys, Lehman Brothers and BCG.

Most importantly, he has hand-transcribed every Calvin and Hobbes strip ever, and dreams of watching every film on the IMDb Top 250. You can follow him on Twitter at @sanand0, that is s-a-n-a-n-d-zero, and his website is s-anand.net, that is s dash a-n-a-n-d dot net. So how did you get into this? Because you were a strategy consultant at BCG, from what I know. So how did you go from there to being a hardcore data guy? No, the strategy consulting was the mistake.

On how Anand was "always a data guy"

I was actually at it again. OK, so, meaning, I first used computers when I was, what, in class 6. We had a BBC Micro at school, and they taught us programming, and I thought that was really cool. That's very impressive, considering you are much older than me and I started in class 7. So yeah, '86 is when I started. OK. And then in '87, my dad bought me a ZX Spectrum.

So initially it was playing games and then learning a little bit of BASIC, and then slowly it went into generating Fibonacci series, and then slowly went into fractals, creating a Mandelbrot set slightly more efficiently, and creating a 3D rendering with Phong shading and stuff like that. All of it on a tiny ZX Spectrum with 48K. So it was fun.

Yeah, and I was hoping to get into a computer science degree, but pretty much no college gave me a computer science degree. The closest I could get to that was IIT-M, where I basically ticked off every branch other than chemical engineering. And then the prof who was supposed to be guiding me looked at this and said, no, for your rank you will get chemical engineering, so you're supposed to tick chemical engineering. It separately don't like intentionally.

No, it's OK. Go ahead. So I spent those four years trying to do anything other than chemical engineering, and the computer science department was my retreat. OK. So in '96, when I got a job offer with IBM, I said, OK, great. Good riddance. Goodbye. And it was three good years of that. Then '99 is where the mistake started. Actually, the mistake started in '96, when I wrote CAT and didn't get through. So I was pretty pissed off.

So I decided to write CAT again just to show them that I can get through, and I wrote it and I got through, and then, you know, this whole pressure builds up: no, no, you have to go there. So I got the winter. I am a interviews income. You have to go to any sense too far away as IIM-B, that's as far as I was able to resist or IBM. Was thinking, I'd be most impacted. So have you was enough? OK, OK. And so it was almost a repeat of 1992.

Again, enough peer pressure. So I said, OK, let's do it. Yeah. An MBA, and then it just continued after that: now you have to get into an investment banking career or a consulting career. Luckily, I did an internship at Lehman Brothers and knew that investment banking is nice and good and all that, but not for me. So I said, chalo, let's learn something. That lasted for four years at BCG, where I basically tried to work my way into every technology project that was there. In fact, I want strong

but there was a piece of work we were going to do for Oracle and others, the first technology project that had come our way. This was based out of Delhi. I was in Bombay. By default, consultants from Delhi would get staffed. So I remember calling our partner James, and that was the morning when the engagement was, I think, about to start, a kickoff or something. I said, James, I'm at the airport. If you tell me, I'll buy a ticket, get on a flight and come over there. If not, I'll go back.

He said, yeah, if you're that desperate, then OK, hop on the flight and come. By 2005, I had clarity that, yeah, I'm going to get back into technology. Of course, having done strategy consulting, nobody really gives me a job in technology. Yeah, so it was disconcerting. Yeah. Now, going from, say, McKinsey to Google or Facebook or wherever is quite common today.

But yeah, most people said, look, the best I can do is give you a sales job or a marketing job. I said, I want to program. So Infosys Consulting was the closest, where it was a quasi-technology, quasi-consulting kind of a role, and there I was mostly sitting and coding, doing stuff like this, IMDb to goofy. So it wasn't quite a shift away from strategy consulting. It was a desperate claw back, from '99 to 2005, to technology, which is what I really love. Yep. OK, this is very interesting.

So I probably took a slightly alternate path here, because I got my first computer when I was 11 or 12, in a couple of years learnt to code, then started doing these random graphic things on our 386, things like that. Then I decided I wanted to study computer science, actually managed to get computer science in IIT Madras, and then I completely fell out of love with computers. For those four years, I completely sort of.

I hated programming. I was reputed to be the best programmer in my school, but I completely hated programming. So for me, IIM-B was a respite. It was like, I want to get away from tech. And so IIM-B happened. Very similar to you, an internship at JPMorgan happened, and I was like, no, I don't want to do this. Then I joined A.T. Kearney.

And then I said, I don't want to do this either, and I left within the first few months. And then, yeah, I mean, I went to various places, went back to investment banking for a bit and so on. But yeah, you know, it took me about five years after graduating from computer science in IIT Madras to start liking to code again. Oh, wow. So maybe, in some way, by not doing computer science and doing chemical, you managed to sort of retain your love for

coding through your shit. Okay, then I have to thank that Professor, not curse him. Hey, it's also where it's from Infinity to the thing you say to grammar or like you you deserve. Yeah, yeah, 2011, my ex-boss from IBM, ROM. He reached out and said, took all of us old guys. We are looking to do something you're interested. I said was, yeah. Absolutely. Let's try again. Okay. Okay, awesome. So, I did like, I mean, you sort of like, I mean, I also like we sort of bumped into each other

at various places. I first knew you as a sort of visualization guru. So how did you get into data visualization, which is not exactly programming, right? So, I think there are two paths that led me there. The earlier one was just an interest in graphics programming. It's like, I like playing games. Yep. The question becomes, now can I create games? And for that, I have to learn some of the basics of graphics. Fractals were always an interest, and 3D

rendering was always an interest. And my dad's an architect, so at his office I was interning of sorts, creating 3D models for him and sketches and stuff like that. That was one phase where I was generally interested in programming stuff that generates visuals, even if it's a simple matter of ray tracing a scene. There's software called all free which I've been using for quite some time.

Of course, Autodesk's 3D Studio was another that I'd been using quite a bit. That was on one side, but the data hadn't really come in at that point. Data came independently. During my IIM days, I got into mostly playing around with financial data and a bunch of other things.

But during my BCG days, I did get an interest in visual design, simply in terms of slide design, for example, because I really sucked at it. So for many years I had been spending time trying to read and learn the principles of design. And honestly, the one book that taught me what little I know of design was this book by Robin Williams, not the actor, called The Non-Designer's Design Book.

OK, and it's such a brilliant book. It tells you a few simple principles that you follow to make sure that you get design that is not bad. Not design that is good, but design that is not bad. And while I was doing that, since I was familiar with programming, I said, what does it take to make design programmatic and automate as much as I could? And since I was also looking at data left, right and center, what does it take to present this in a cleaner, nicer, professional way?

So some of the early data visualizations that I put

Anand's first tryst with automated infographics

together, I distinctly remember. One was when I was at Infosys Consulting. I was doing this piece of work for Tesco.com, the British retailer's website. So one of the things they had was a 40-page PowerPoint presentation that talked about the weekly statistics from Google Analytics. I said, what would it take to create this in the form of an infographic? A brochure, a high-quality design, that takes the data from Google Analytics and programmatically

generates it. So I put something together. It was an interesting experiment in many ways. It was a very information-dense design. I had read Edward Tufte by then, so I had a sense of what kind of information-dense displays one can create. Put it together. It was a Python script, I still probably have it somewhere, that pulled the data. And then I tried an experiment: what kind of attention would this grab, and who does it really attract? So it was late one Tuesday

evening. I took color printouts of this after everybody had left, which in the UK is easy. So, yes, basically around him or something. OK, they worked a little harder there. I left these printouts on pretty much everyone's desks, with my hypothesis being, well, not even a hypothesis, my question being: who is likely to pick this up? OK, so then I came in the next day, and I hadn't put my name on it. I hadn't put anything, so nobody knows where this comes from.

Yeah, so this report is sitting there, and I check mid-morning. It is interesting. The reports had been picked up by every one of the execs. Pretty much everyone in the marketing team had picked it up. Only the head of operations had picked it up. No one from the vendor teams had even touched the report. The report was just lying there, kind of thing.

So it gave a beautiful perspective of who actually is interested in this kind of data. And of course that landed me in trouble, because there was a big witch hunt of sorts that started, saying, OK, who's this person who seems to have gotten access to all of our data and is doing stuff with it? OK, what's happening? Yeah. It ended up well, but that was the first time I realized. What happened was then the head of marketing, after I

shared that. Yeah. The head of marketing came over and said, why did you do this? I said, why do you ask? She said, well, look, this is a professionally designed report, so I thought we had paid someone to do this, and I had no clue whom we had paid. I said, oh, OK. Something that's professional. Yes, not a bad thing to start off with. Yeah. That's where the idea hit that this could be a thing. This was before data visualization

was even a thing. Yep. Which year was this? 2008 or 2009. Wow. So, a question. You said that in marketing everybody had picked it up, in operations only the head had picked it up and nobody else was interested, and in the vendor team nobody was interested at all. Why do

What visualisations work best for whom?

you think that happened? And let's use this to get into what kind of visualizations, or what kind of reports, work for which kind of people, based on what they're doing. So one thing I've learned is finance teams actually do well with numbers. Not that they do poorly with visualizations, but they do so much better with the numbers, and need the numbers so much, that

we are probably better off staying away from... I shouldn't even say finance. I should really say accounts. Yeah, that's more apt. Yep. I learned that the hard way, because the purpose of a visualization is to give an overview, to say this is bigger than this. For a manager, one cent versus $100 is 10,000x different. To an accountant, a discrepancy is a discrepancy. And that is something that I missed for quite some time, to be honest.

Visualizations invariably work for senior execs, because ultimately it helps them get a perspective. What's big? What's important? What should I focus on? Because prioritization is one thing that visualization helps dramatically in, and that's one of the key challenges. Analysts, I increasingly find, are using this because it helps them from an exploration perspective. Again, it's a prioritization thing. What should I focus on?

Where should I dive in? Yeah, and that is happening typically at junior levels, when people are saying, OK, if I can use a visualization, it will help me dive deeper. That works across fields, whether it's sales and marketing, or operations, or admin, or even HR. I'm now finding that the less numerically savvy the people are, the more visually attuned they are. I think it's simply a style of thinking or working.

So net-net, I have two or three simple rules of thumb on whether a person will dive into a visualization. Senior versus junior: seniors dive in more. Numerically savvy versus numerically non-savvy: the numerically non-savvy are more likely to dive in more. Yep. And actually, it's just these two; the third is a subset of the second.

OK, so mapping this back to your original example, I guess the marketing people were likely to be, I guess, less numerically savvy, and so they all picked it up and liked the report. Is that it? No, I think even if I had just the numbers there, they would have picked it up, because this was literally about sales and marketing. In this case, it was sort of domain bias. I suspect that if I had put together something similar for

operations, the entire operations team would have picked it up and only the sales and marketing head would have picked it up. Fair enough. And what explains the vendors not taking it, given it was completely relevant to them? The vendor managers didn't even care: yes, you did tell me to do this. Yes, sir. OK. OK, you know, we are very busy. I mean, as you might know, for the last few months I've been working for Delhivery, having taken

up a job after a very long time. So I have a friend in finance. She was like, when will you do some analytics visualizations for us? I was like, look, I'm not doing visualizations for you, because your needs are too precise for

me like actually, right? Because the finance is like it's all about like which is what I mean going back 15 years, which is what I figured out during my little internship in Investment Banking. As well that click the need for precision was like, yeah, I'm like you need thousand two hundred dollars have come 2001. 98 you have word just round it and you're dead and they're like, no, we need to. We need to match the last penny and I was like, okay. Yeah, I guess we're visualization.

works is when you want to show broad trends, let's say, or when you want to... So I guess if you have to present it to finance people or otherwise numerically savvy people, I guess it's more about, yes, give them tables, and give them well-formatted tables, exactly in the areas that they are looking for. Very fine. Where visualizations are helpful is to help them see stuff that they otherwise don't

see. To give you an example: you said the two dollar founding of a new gas. Now their point is that two dollars, two thousand dollars, it's all the same. An error is an error. In which case the metric that we are tracking starts becoming different. It's not the magnitude of error, it's the presence of an error. And then we get into something roughly along the realms of data quality.

And now, can we visualize data quality? Can we visualize fraud? but it's data T. We did something interesting there. There was an issue where they were getting fraudulent purchases, adulterated batches, and a big part of the problem was trying to trace where this was coming from. OK. So what they did was they actually took about 100,000 batches and looked at the flow of these hundred thousand

batches from beginning to end. And at the end, these are batches that they had manually re-inspected and found that a certain proportion of them were adulterated. And then they had the whole supply chain there. Tracing the flow of these adulterated batches across the supply chain, and being able to visualize that, now that's an auditor's delight. This particular guy, oh, I know this guy. He's actually colluding with this fellow because he's his brother-in-law at this plant.

So I know why this is happening. Ah, yep. That's very good. Yeah, which is harder to do with the numbers, and it wasn't something that they had thought of. So generally we think about presenting it this way: give them what they know they want as numbers. Give them what they didn't even know they wanted as visuals. That works really, really well. And, and what do you what do you use to buy?

I mean, now I guess you've come a long way, having been running this company for a decade and things like that. So what did you start with? You said you wrote a Python script for your original thing when you were working for Tesco. So how has your tool journey evolved over the course of time? So the one constant in the tool

Visual Basic and Python

Journeys? And I'm sure we'd want to talk a lot more. Of course, of course, we'll come to that in a bit. Yeah. But programmatically, well, since I was in Texas Bay's, I'd always been playing around with Visual Basic. So that was part of the toolset. And Visual Basic actually can be almost copy-pasted into Python, which I thought was remarkable. Yeah, with win32com in Python, I've done this.

I've actually recorded macros in Excel and copy-pasted them, and with very few changes it actually runs in Python, if you import the win32com library. Wow. So I'm not that surprised, because there was news a few years back that Microsoft was planning to replace Visual Basic with visual python to allow you to manipulate Excel sheets using Python. So I guess they would have also seen this similarity in some sense. Quite possibly.

They've gone the JavaScript route now, which actually I'm very glad for, because for us, from a visualization perspective, Python was the de facto server-side language because it was really good with data, and any language could do visualizations. But the one disadvantage of any language except JavaScript is it doesn't run natively in the browser, and if you want interactive visualizations, it's got to work in JavaScript. Yeah. Over time, JavaScript became so good, and

good enough on the server side, that in 2015 or so we shifted our visualization tooling from Python to JavaScript. Right. But barring that, there hasn't been a major language change. On the library side, on the server side, Python always had the likes of matplotlib, and now Seaborn, Bokeh, etc., but I've never really used those. Those are for conventional charts. And the reason Gramener exists is because of unconventional visualizations.

It's about doing what the likes of these can't do. So we ended up creating SVGs using templating languages, and in my mind, still, over the last 10 years, that one technique of creating SVGs using templates is the one that has led to the proliferation of the kinds of visuals

we create today, and is still arguably the most powerful way of creating unconventional visual representations. So that we did in Python, and then shifted over to D3 initially, but the trouble with D3 is it requires too much programming, too much learning. And Vega was a good mid-level alternative, higher than that. And Vega-Lite is a good high-level alternative on top of that. I wouldn't say that the problem is solved.
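A minimal sketch of the "SVG from a template" idea described above. The conversation does not say which templating language was used, so Python's standard-library `string.Template` stands in for it here; the `bars_svg` function, the template strings and the sample numbers are all hypothetical and only illustrate the technique.

```python
from string import Template

# Hypothetical templates: one <rect> per data point inside a plain <svg> wrapper.
BAR = Template('<rect x="$x" y="$y" width="$w" height="$h" fill="$fill"/>')
SVG = Template(
    '<svg xmlns="http://www.w3.org/2000/svg" width="$width" height="$height">\n'
    '$body\n'
    '</svg>'
)


def bars_svg(values, width=200, height=100, fill="steelblue"):
    """Render a non-empty list of non-negative numbers as a bare-bones SVG bar chart."""
    top = max(values) or 1  # avoid dividing by zero when every value is 0
    bar_w = width / len(values)
    rects = []
    for i, v in enumerate(values):
        h = height * v / top  # bar height scaled to the tallest bar
        rects.append(BAR.substitute(
            x=round(i * bar_w, 2),
            y=round(height - h, 2),  # SVG's y axis points down
            w=round(bar_w - 2, 2),   # leave a 2-unit gap between bars
            h=round(h, 2),
            fill=fill,
        ))
    return SVG.substitute(width=width, height=height, body="\n".join(rects))


svg_text = bars_svg([3, 7, 5, 10])
```

Because the output is just text, any shape SVG can express (paths, circles, text) can be produced the same way, which is presumably what makes the approach suit unconventional charts.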

We still need a certain amount of tooling on top of Vega-Lite for more people to be able to create it. Basically, between Vega-Lite and Excel there is still that gap, but if that can be bridged and a little more flexibility were brought into these libraries, I think we could say that the visualization problem is solved. Probably. Every step of the way there. OK, actually, it's interesting that my journey has been like a briefly disjoint from this.

I moved directly from Excel to R, to ggplot, and I got repulsed by Python, because matplotlib is so difficult to sort of handle, and even with Seaborn, I just couldn't get the hang of it. So there was this one assignment I was doing a couple of years back where the client was like, can you please give us code in Python, because it will be easier for us to integrate. I did it for

six months and then I told them, no, I'm shifting to R, because I'm 10x faster in R than in Python. So fair, I guess it's personal preferences, and also I'm not a coder-coder anymore in some sense. So right. Here, we are really looking for the answer more than the repeatability of the answer. And what we're doing really is more helping someone else figure out the answer. I am less interested in telling

someone the answer than saying, here's something that will constantly keep telling you the answer. And the perspectives are very different. Luckily, for exploration, R is far, far better. It's a much better environment. And even after we factor in the likes of JupyterLab and whatever other advances have come into the ecosystem, it's a whole lot easier. It's flat. It is a whole bunch of these libraries.

Each of these libraries does something specific. You go look at the documentation, you try it, it works, finished, move on. Yep. Whereas in Python, you make sure that, well, it fits well with all of the other pieces, at the beginning and at the end, and it's repeatable. So it's engineered in a very different way. Yeah. Well, I think I have sacrificed on engineering, which is why I get a lot of benefits on the other side,

right? In terms of how quickly you can do things, how easily you can do that at how much programming you need. And all those things though. Like it's like like the tortoise and I am be a long time but everything is about it laid off anyway, so let's come to your favorite topic which is The so how do you feel having? I would ask you how you got to accept because it as a strategy consultant. I think that would have been your 99% of your time, kind of thing.

So and then, when you do start doing sort of protocol is

"Gymnastics in Excel"

gymnastics in Excel and, like, creating all those cool things? You know, I actually don't remember. I think it must have been during my IBM days. That is when I learned Visual Basic, or maybe even before that. Even during IIT, I was playing around a little bit with Excel and learning a little bit of Visual Basic. But the fact that you can press a button and get it to do something cool,

that was always interesting. The first really interesting piece that I actually remember doing was at Lehman Brothers. So there was this guy, Scott Tucker. He had the idea of how the markets were correlated. He basically had historical performance of the indices, all the major indices, some 40, 50 of them. So the first thing that I put together for them was actually a mini data visualization of sorts, a scatter plot matrix,

which of course didn't exist, even by name, at that point. This was in 2000. So I pretty much did all kinds of INDEX, MATCH and VLOOKUP jugglery to take all of those series and create that scatter plot matrix of the correlations between those. Another one was looking at the US Treasury yields across time. So take the US Treasury yield on a given date, and it's a line graph.

Yep. So what I did was I created a slider, which, As you move the slider, it changes the date and plots a graph on a different day. So as you drag the slider it smoothly animates. Yep, the USD and it was possible to see that the treasury yield was clearly flattening because visually really powerful. Okay. So among other things that got me placement of, okay, good. I deserve, I think back when the, it boom was still on - Ruth, summer of 2000.

Yeah, exactly. Because it was literally, I think a week before the.com bust. Yeah. I was in Tokyo at that time. And in fact, in the Night before it crashed and everyone's like partying. Things are going great. And so on wanted to keep your very worried about the bubble. I've actually everybody was called confident that it was a boo-boo. But question is, when is it going to burst? And then next day was oh my God, the dating field of a bloodbath. I can employ young exactly

around that. Yeah. Yeah, OK. And I think the thing with Excel is that, especially once you have learnt Visual Basic, the number of things that you can do with it multiplies by a large factor. Absolutely, absolutely. Arguably, even before that, in the Developer tab there is at least one feature that I think is as powerful as Visual Basic, which is simply the ability to connect a scroll bar to a cell. As you move

the scroll bar, the value in the cell changes. Yeah. Now that's effectively linking a visual element, a control, effectively a slider, to a cell. Once you've got something into a cell, then formulas can take care of all kinds of things. Yeah. So just with that, and of course similarly radio buttons and so on. So putting together an interface with these kinds of controls, sometimes even just drop-downs, for which you don't even need the Developer tab,

I felt that was remarkably powerful. Even more powerful than the likes of pivot tables and VLOOKUP, which in themselves are extraordinarily powerful. Yep. But yeah, Visual Basic takes it a whole realm beyond that, but so high that I very rarely needed to resort to that. So things like, for example, using an external service, connecting to a web application and doing some complex processing on the server, or getting some data from there. Yeah, that absolutely needs

the likes of Visual Basic, which is unfortunate, because on Google Sheets we can just import JSON or XML using a formula. So those formulas are built in, and it's a pity that Excel doesn't. Well, it kind of does. It did for a very long time have a formula that could fetch data. But yeah, outside of that, the kind of stuff that you can do... especially, Excel has shapes, and the fact that you can start coloring shapes based on data is really powerful.

Yes, so for me, the bulk of the Visual Basic that I have written is largely about mapping shapes to numbers. OK. Yep. OK, interesting. And I think one cool thing that you have done in Excel, I mean, which I remember from a very long time back, is that you can

Creating choropleths using Excel

create maps using Excel, colored maps. This was during some election or something. During some election you were talking about how you had created colored maps using Excel. I mean, that blew my mind. I hadn't thought that's possible in Excel. So, seriously... Actually, I don't remember how that came up. But the need was always there. No matter what kind of format a shapefile is in, it's just a nightmare, even today, to color a map. I can't

do it anymore. I have this with the same. It's not practical for me. It would take me an hour to color a map in any way other than on Excel. If it's there on Excel, it takes a minute. That's quite a difference. That need was always there, and it kept nagging me. So the first experiment was: if I change a number, can I change the color of a shape? And the answer is yes, with a little bit of Visual Basic. So once I cracked that, then it got pretty exciting, which is

now, can I figure out, can I take a color scale and interpolate based on that color scale? So interpolating colors was a little bit harder to get done in Visual Basic, and Visual Basic as a language just sucks compared to practically any other language today. Is there an array? I don't even know. By and large, my answer to the question "does such-and-such a feature exist in Visual Basic?" is: I don't know. I'm going to do a Google search. I'm going to copy-paste.

Try it out. If it works, great. So I really don't know Visual Basic at all. Okay. Yeah. Interpolating colors: I finally managed to find or write something that does it, which ended up being reasonably compact. So now we can interpolate colors. The toughest part was reading a shapefile and then drawing it. So again, at the time I wrote it, there wasn't any library good enough that could read a shapefile and tell me what the

points were. So the closest I could get to was using, I think, some JavaScript library that would render it as SVG, and then read the SVG and, for each point in the SVG, draw a freeform in Excel. So the good thing is, this is where Python's ability to work with Excel comes in. Literally anything that can be done in Visual Basic works in Python. So in Visual Basic, I basically record a macro where I'm drawing a shape, and it gives me the code for it.

Just clean Visual Basic code. Copy-paste that into Python, just change the points, put it into a loop, and that works. So we took a whole bunch of shapefiles, converted them into SVG, converted them into Excel, and put them online, which were basically, at that time of the election, the parliamentary constituencies and the assembly constituencies for the country, for each state. Exactly. Which makes it a whole lot easier for journalists to just fill in the numbers and publish. It made a huge Larry.
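As a rough illustration of the "read the SVG and, for each point, draw a freeform" step described above, here is a minimal Python sketch of the first half only: pulling outline points out of an SVG. It assumes every region is a `<path>` built purely from absolute move and line commands. The sample SVG, the region ids and the function names are hypothetical, made up for this example rather than taken from the episode.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def path_points(d):
    """Return [(x, y), ...] for a path that uses only absolute M/L/Z commands.

    Assumption: no curves and no relative commands, so every number in the
    path data belongs to an x, y pair in order.
    """
    numbers = [float(n) for n in re.findall(r"-?\d+(?:\.\d+)?", d)]
    return list(zip(numbers[0::2], numbers[1::2]))


def svg_regions(svg_text):
    """Map each <path id="..."> in the SVG to its list of outline points."""
    root = ET.fromstring(svg_text)
    return {
        path.get("id"): path_points(path.get("d", ""))
        for path in root.iter(SVG_NS + "path")
    }


# Hypothetical two-region map, standing in for an SVG rendered from a shapefile.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <path id="region-a" d="M 0,0 L 10,0 L 10,5 L 0,5 Z"/>
  <path id="region-b" d="M 10,0 L 20,0 L 20,5 Z"/>
</svg>"""

regions = svg_regions(SAMPLE_SVG)
for name, points in regions.items():
    print(name, points)
```

Each list of points could then be fed, in a loop, to whatever drawing code a recorded Excel macro produces. On Windows that would presumably go through COM automation and Excel's freeform-building calls, but that part is left out here because it cannot run without Excel installed.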

These are still being used. In fact, literally this morning I was talking to strike on that collapse, who was doing some support initiative for COVID, in terms of getting beds, and to get the donors to see what areas have fewer beds, more beds. He said, over a shapefile, I could go to that, an actual map, and it's a whole lot easier for their team: label them however they want, put in arrows however they want, export it. It works seamlessly.
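The color-scale interpolation mentioned earlier in this exchange can be sketched in a few lines of Python. This is a minimal sketch assuming a plain linear blend between two colors. The function names and the default white-to-red scale are illustrative assumptions. The version discussed in the episode was written in Visual Basic and is not shown there.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def interpolate_color(value, low, high, low_color="#ffffff", high_color="#ff0000"):
    """Linearly blend low_color -> high_color as value moves from low to high.

    Values outside [low, high] are clamped to the ends of the scale.
    """
    t = 0.0 if high == low else (value - low) / (high - low)
    t = max(0.0, min(1.0, t))
    start, end = hex_to_rgb(low_color), hex_to_rgb(high_color)
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def excel_color(rgb):
    """Pack (r, g, b) into one integer: red + 256 * green + 65536 * blue."""
    r, g, b = rgb
    return r + 256 * g + 65536 * b


# Hypothetical data: a number per region, scaled between 0 and 100.
turnout = {"region-a": 20, "region-b": 100}
fills = {name: interpolate_color(v, 0, 100) for name, v in turnout.items()}
print(fills)
```

The packed integer uses the same layout as the value returned by VBA's `RGB` function, so a macro or a Python script driving Excel could assign it as a shape's fill color. Multi-stop color scales would need one more step to pick the pair of colors to blend between.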

Yep. Yep. So why is it that, like, I mean, I have two questions. Why is it that, in general, data scientists don't like Excel? I think it demeans the nobility of their talent. I'm obviously being ridiculously sarcastic here, of course, but it'll probably take a few seconds, maybe even a minute, to work that out of my system. Yeah. Yeah. At some level, if I'm, I don't know, take something close enough to a

real-life problem. But if I am a surgeon and you come to me for advice, my advice is going to be, well, chop it off. Yeah. As opposed to a general physician, who's going to say, well, look, I don't know about surgery, but medicines might cure it. Now, what pays a data scientist more? Python or Excel? Let's be even more precise. What pays a data scientist more: PyTorch or pandas or Excel? And the answer is probably, not orders of magnitude, forgive me, multiples higher. So given that.

Why would I want to do something building my experience in a tool that pays me less? Yeah. And therefore, problems are scarce at some level, and time is scarce, certainly. So if I have a problem at hand, and I can choose to use my time building my skill in a particular tool that is going to pay me more, I'd rather do that. If I'm already really good at Python, PyTorch and the whole stack, let's say, yeah, then I'm going to look for the next one that's going to pay me more.

Not the one that I left behind. You're not going to be programming in Haskell. You need a suit, right? Even if, of course, you know Haskell. Nor am I. I think it's exactly the same thing. On the other hand, I think Excel is the better tool. Let's, you know, I read that. I think one shortcoming that I've noticed with Excel, especially, I mean, over the last few months that I've been extensively working on getting data out of databases, is that with Excel,

Like, automating tasks end to end seems a little more difficult compared to, like, with R. So, absolutely. And this is where I just, yeah, I just hate that Excel works only on Windows and sheets, because Excel is quite automatable if you have Python. And Windows servers aren't quite as good as Linux servers. See, on a Linux machine, practically anything is automatable.

Yeah. Heck, browsers are automatable too. I mean, with Chromium and specifically Puppeteer coming in, yeah, there's nothing that you can't do on the server side that you can do. The proxy literally mimics every action. That is still not true for Excel. So yeah, I was talking to the visit is be names deeper. She was talking about how she wanted to build simulations for classrooms. Okay. And could we build a tool that will allow students to do it? The use case is basically perfect.

For example, you have a set of formulas. You have linkages. You build a full-fledged model around it. The student tweaks a few things and gets to see the result. Now, if Excel could be exposed like a little mini web application, you send an input, it does all the calculations and sends it back. For her, it will be trivial to edit the model. And for the user, it will be just seamless, because hundreds of people can use it, exploring.

They were then, so I guess, yeah, I'm actually doing nothing more than repeating what you just said, in many more words. This literally is the lack of automatability in Excel, in a way. This is why, I mean, in some ways, I have not really used Excel for over a decade now. As in, I use it. I always make sure, even on a personal computer, I have a Microsoft 365 subscription just for Excel, because I can't do without it. But I haven't done any real work on Excel.

Well, I think where Excel really scores is in terms of interaction with business. Like, for example, the business guys are all very proficient at Excel. If you have to give them something like, okay, you tweak this, it need not even be a slider. They can change the cells, and you change the cells and this is what pops up.

Excel is so much superior for them than giving them a web app or any of the other things like that, crazy, two clicks. And not just for the spreadsheet, Excel itself, but one of my colleagues was telling me that we are giving Excel to one of our clients as the tool in which they edit stuff, dynamically generating fairly complex reports and, in fact, models and formulas in Excel.

So because the shortcut keys are so familiar, with identification, they could quickly jump around from section to section and do stuff. And it's so much more productive to do that. Yeah. So perhaps this isn't even a case of... this may also be a case of a tool that they are familiar with, yep, as much as it is the right tool for the problem. Forget.

It's a good combination of a tool that they are familiar with and which is reasonably flexible and powerful. So you bring those two together and, like, you can create a great interface with it. I was just talking with Simon Peyton Jones, who has, in fact, written a few papers talking about how Excel is actually a Turing-complete programming language. It meets all the criteria. Yep. Yep. I have probably come across the existence of the paper, but I

haven't really read it. We'll put a link here for people who are interested, by the way, they cup. So see the allocation high-risk. What's your view on Google

Google Sheets

Sheets, given that you are such a big fan of Excel? Oh, brilliant. So why would I not use Google Sheets? A, because it's not on my system. So sometimes when I'm traveling on a bus, a car, in a flight, whatever, I need Excel. Why else? Because the shortcut keys are not the same. Yep. But short of that, I use Google Sheets like crazy. A specific use: basically, our website gramener.com is powered by Google Sheets. Okay.

Okay. The content is actually all on Google Sheets. The marketing team goes to some spot, edits some text and clicks a submit button, and it refreshes our website. And not just that, a whole bunch of websites are powered by that. This is not even an uncommon thing. Hey, the New York Times has microsites powered by Google Sheets. Collaborative editing in a structured way is just remarkably powerful.

In fact, one of my side projects, which has been on hold for almost a year now, is to come up with... so Excel is great with table structures. It's naturally structured as a table. Is there a standard by which we can create a hierarchical structure? Okay. So I want, inside which I have a table, inside which I have a given potential. Yep. And the reason for that is one of the cells literally needs to be a table, that kind of a concept.

So I am still working on some kind of a standard by which we can represent this in spreadsheets in a way that's parseable by systems and highly readable by humans. But if that existed, right, managing our website would become a hell of a lot simpler. But Google Sheets is now the database for our content management system. Okay, why do I like it? One, it's collaborative. Office, by the way, now does provide that,

but super funky, very clunky. Yeah. Exactly. That's not quite the same. Yeah. Second, it's web native in more ways than Excel is. Specifically, I can, you know, with a formula, import from a web page, just pick chunks and read from it. Third, I can write macros in JavaScript, though I'm not entirely happy about how it's been done. No, in fact, I'm not at all happy about how it's been done. But that it's been done, I think, is a great thing.

But yeah, in descending order: the fact that it's collaborative, the fact that it's web native, and that I can write stuff in JavaScript, I can automate it in JavaScript. These are why I choose Google Sheets. And why I choose Excel over Google Sheets is familiarity and offline access. What about visualization on, I mean, on Google Sheets? I have never even really used it for making graphs and things like that.

Not right now. I use Excel for it. Even though if Arabic. But never Google Sheets, we are. I haven't really thought about why either, but you're right. I've never done that hike. Strings. Yeah, so maybe that's something that Google really... okay, maybe Google didn't really think about that as being a use case, because I think the collaborative part itself covered so much that, like, they didn't feel like they needed to offer proper powerful graphics or

something. True. So I'm doing a course right now with an organization called Landmark, a sort of leadership course. They run this course with a whole series of metrics, and

these metrics are tricky. Each person gets a really, really sophisticated Google Sheet in which data is drawn not just from across sheets but from across workbooks. Okay. And they've connected that into a master dashboard for the coaches to get to see the summary across all of their coachees, and at a higher level for the organization. It acts like a network of Google Sheets. Now, I don't think that can be done in Excel, not to that kind of finesse. Yeah. Yeah. Yeah. Let's get back to visualization.

Business Intelligence Tools, such as Tableau, Power BI, etc.

I think we have spoken enough about Excel, and, like, let's get back to visualization. So what is your opinion on stuff like, let's say, I mean, I use R. We both agreed that Python is not great for visualization. We both agree that, like, Excel is great for visualization and what we can do.

And so on. What about the other tools? You have all these dashboarding tools like Tableau and QlikView and so on. Have you used any of them? Does your company use them? And what is your view of the quality of the graphics element of that? So Tableau is very... so, okay. Firstly, all of them are a notch above Excel, I must say. Okay. And they come with the associated friction, partly of learning, yeah, partly of availability, which is a function of licensing.

Everybody has Excel. Everybody doesn't have Power BI or Qlik. So I can't really speak for Qlik, because I haven't really used it. Not, we have that all use Qlik. But I've been using Power BI and Tableau, Power BI more so than Tableau, just a bit more. All of them offer far more than Excel, and that's kind of like saying 90% of the stuff you can do in Excel.

Yeah. Nine percent of the stuff you can do in these tools that Excel can't do. There is one percent that neither of these can do, and you go to R, JavaScript, Python, whatever you need. For the 90% of it, still use Excel. Okay. It's really only for that 9%. What I've seen is that for the people that use these tools, the likes of Tableau, Power BI, etc., it's not as much for self-consumption. It's for productionization, which is a very big thing. These are supposed to be self-serve tools.

But the majority of the users are part of a reporting team, who create reports for others, who in theory are supposed to be able to click, explore and so on. And a small fraction of them do that, but the majority of them export it to Excel and play around with it there. Okay. For a couple of reasons: they have other data in Excel that they want to link with, or they have a workflow which fits with Excel, or because they're just more familiar with Excel's shortcuts

and that's the tooling system they prefer. So at some level, the promise of self-service BI has been realized: a lot more people can create reports. Yet, at the other end, it was promising a need that probably didn't exist, in that self-service BI has existed ever since Excel existed. So I believe these tools have a very important role in that nine percent range. But what about the quality of the graphics that they create? For example, if I see it, I don't know.

I mean, I'm telling my opinion here: if I see a Tableau table, I'm like, I just want to close my eyes. Somehow, just, like, Tableau graphics, I mean, somehow it just doesn't look good. Like, I don't know what it is about them. And, like, though, recently I read that Tableau completely subscribes to the... so, you know, ggplot in R stands for grammar of graphics. It was like, Tableau is also a direct descendant of the grammar of graphics.

That has its own. But if I look at the graphics in the summer, I don't know, by default it just doesn't look good. It doesn't look good to you. So think about it: remember the graphics we produced with Excel 97 and Excel 2001. Yeah. Yeah. The defaults were pretty terrible there. And I think it was Excel 2007, yeah, 2007, I think, where the quality of the defaults in Excel started improving dramatically. 2013 took it a notch higher.

Because of the power, I think in Tableau people create what they can and leave it at the default. I don't think the quality of people working on Excel versus Tableau, in terms of their design ability, is any significantly different. How come? Neither have any design ability. Yes, so they're just going with what's out there.

So at one level, you're right. Excel has managed to improve the quality of the default graphics to a level higher than what Tableau has been able to achieve. Arguably, Tableau faces a tougher challenge, in that there are more things that people can do. And Excel has some minor but important advantages. See, just aligning to the grid: in Excel, you're just an Alt-drag away. Yeah. So alignment, which is arguably the single most important design

aesthetic, if you ask me, is easier. Because somebody says, oh look, these aren't aligned, and it's okay. Worst case, they don't Alt-drag, they manually drag. But anyone who knows it, you just Alt-drag and snap it to another row and column boundary. Things are now automatically aligned, and if you resize columns, they still stay aligned. Tableau doesn't have a grid that people use by default. Yes, it supports a grid, but it's very different. You don't

think grid first when you think of Tableau, any more than you think grid first when you think of PowerPoint. So some of these small little things basically mean that Tableau designs are where, I think, Excel designs were in 2005. But I think another part of the problem is the rapid growth. Now that BI is big and people are getting in, the volume of work that we're seeing is done by people with less than, let's say, two years of experience in Tableau.

Is it honest of break somebody who's in 2005? We had people who had used Excel for a decade. Yep. They're veterans. Yeah. We haven't yet gotten to those Tableau veterans who aren't really concerned as much about how to do stuff. They know how to do stuff. The veteran knows it, and now he's looking at the same thing and saying, I don't like it. So they get bored of what they create, and then they start looking for aesthetics. That stage hasn't come yet in

bulk yet. Yeah. Yeah. Yeah, also, like, the other thing is the experience graph. If you look at it, like, it's very bottom-heavy in Tableau compared to Excel. At least, like, yeah, even 15 years back, it wasn't that bottom-heavy. So it was, like, people using it for various reasons. And so I took that out their own stuff, I guess. So, okay. So let's get to possibly the most controversial part of the thing. I mean, like, just a few quick ones. What do you think of pie charts?

Pie charts

I've probably gone the complete circle on this, from using them because they look nice, to hating them because Tufte said so, and I could see why, and going back to saying, yeah, they have a place in life. They are good. Granted, they have very limited data density, so the amount of information that you can communicate with those is pretty small. But the aesthetic behind them is reversed. So where do they have that place? Where would you use a pie chart? Are not useful.

Well, the classic case, which I guess even Tufte would agree with, is Harvey balls. So just a matrix of Harvey balls is actually a pretty efficient way of communicating, okay, this one is a low, medium, high kind of a thing. So, okay, that works, and that's effectively a pie chart with two variables. I would never recommend a pie chart with more than two variables.

So it's really only proportion. Yeah. The second use case is where you're showing a proportion and you want to get a better aesthetic, simply showing something that is different. It also lends itself to far more infographic variations. You can put a picture into a pie chart that looks somewhat different. You can shape the pie chart into a variety of circular shapes: a coin, or, like, sort of a pie out of which there is a slice of the pie.

So many other things. That also has value. And these things, at one level, are chart junk. At another level, chart junk has value. Don Norman has gone through this cycle as well, from, you know, functional design in The Design of Everyday Things to Emotional Design, where he has a differing point of view: aesthetics has its place. So I've gone through that kind of a journey.

So, short answer: pie charts, yeah, don't use them for anything more than one variable or one proportion, and use a matrix of features if you can. But for those cases, yeah, it works. Okay, I think this is going to be my last question to you. So, while talking about pie charts, you mentioned Tufte and data density. Can you quickly, in maybe five minutes or so... like, I think most of our listeners would not have really, unless they are

real visualization geeks, they would not have come across Tufte. So can we just talk about this? Perfect. So Edward Tufte is the godfather of modern data visualization. His book, The Visual Display of Quantitative Information, is still the seminal work in the field, and was created at a time when even charts weren't fairly popular, let alone data visualization, which goes beyond that. Pretty much all of his books are a visual delight. They share some fairly deep principles.

These are principles that we have, I believe, absorbed and gone beyond, in the sense that we know where they work and where they don't work. But it's kind of like Newton's laws. Like, you don't study quantum mechanics before you've figured out Newton's laws, really absorbed them, really applied them. And these principles lay out the equivalent for data visualization, one of which is data density. One way he puts it is data to

ink ratio. Yeah. Basically, use as little ink as possible to show something. And this is far better seen than explained, but I'll try and do what I can to explain it. Supposing you had the sales of beer shown as a bar chart with beer bottles on top. It looks nice, but the beer bottles are not adding anything to the information content. Knock them off. Yeah. Keep applying that principle: knock off whatever you can knock off

without making it unclear. And you may find that, in fact, knocking stuff off makes it clearer. So, just like they say less is more, use the fewest words to convey a point, Edward Tufte's data density principle is: use the fewest, the least amount of ink to convey a piece of data. Thank you for listening to Data Chatter. If you liked this show, please leave a comment, share and subscribe to the podcast.

You can find this podcast on. The podcast Spotify or wherever else, you go to get your podcasts. Once again, this is Karthik signing off. Thank you.

Transcript source: Provided by creator in RSS feed