Data Transformation Transcript
Disruption Happens With What You Don't Know You Don't Know

Read the transcript of my podcast with Shailendra Kumar, Vice President and Chief Evangelist at SAP and author of the acclaimed book, Making Money Out of Data: The Art and Science of Analytics. In this podcast we discuss how companies can create value from data, how much of the data they need is already in their systems, what people think data analytics is vs. what it really should be, and finally, how true disruption starts with what you don't know. Give it a read.

Peter Schooff: Welcome to another Data Decisioning podcast. Today we're going to dive head first into the latest on structured and unstructured data with one of the chief practitioners. That person is Shailendra Kumar, VP and Chief Evangelist of SAP. He's a thought leader and visionary in the cognitive and analytics space with the sole motto (and I love this) of "Making Money Out of Data," which is also the title of his book, available on Amazon. He has helped multiple organizations across the globe generate incremental revenue and optimize cost using machine learning and advanced analytics techniques. So, Shailendra, first of all, thank you so much for joining me on this podcast.

Shailendra Kumar: Thank you.

Peter Schooff: So leading up to this, we were talking about your book, and you had an interesting story about the title. So why don't you tell me that story?

Shailendra Kumar: Okay, so before the title, how the book came about, and that's an interesting story in itself. I'd done some very interesting work up until 2016. I was working for IBM, left the job, and I was looking for another job. I called one of my ex-colleagues from Accenture for a drink and I asked him, "Can you help me get another job?" He said, well, you're very senior, it's difficult, but we'll try to find a way. He gave me a couple of options to talk about stuff. We were catching up for a drink and that was 'round about 6:30, 7 o'clock, and we agreed he was going to help me find a job.

"Before jumping into that area of unknown, you still have a lot of data lying in your organization which you're not utilizing to create insights.."
Shailendra Kumar

Come 10 o'clock at night, we were both drunk, and he said, "Shaili, you always wanted to write a book." I said, "Yeah." He said, "This is the perfect time to write the book." I said, "Why do you say that?" "You don't have a job, there are no intellectual property issues, so just use this time to write a book." I agreed. Then in the morning I woke up and said, "What did I agree to last night? Now I have to write a book." And that's how I began writing the book. That was October of 2016, and the book was finished on May 31st, 2017.

Now, as for the title of the book, which was the question: Making Money Out of Data was the title given to me by one of my ex-bosses. In those days, I'm talking 2011, 2012, the titles Chief Data Officer and Chief Analytics Officer didn't exist. So my ex-boss had a very clear demarcation. He said, "This is the lady who spends money and this is the man who makes money." So you make money out of data and she spends money to create data.

There were two functions. It was very simply aligned: two functions in this organization, which is one of the largest retailers in Australia. And he said, "We have only two people working with data. You make money, you spend money." And then later, when I wanted to write the book, I turned around and said, why don't I, you know, just make the title Making Money Out of Data. For me, it's all about making money out of data, and that's what I've been doing.

Peter Schooff: That's cool, and I think that interests a lot of people. So the book you said took you six months and you finished in 2017. Now we've seen a sea change; like you said, even the Chief Analytics Officer didn't exist. So what has changed since you published the book? What's happening in the data world that's different in the three years since?

Shailendra Kumar: Things with data haven't changed that much. And why I say that is that even in those days people thought analytics was creating reports. And the reason why the book is a hit or a success and what I did was a success as well was because we didn't consider analytics as reporting. We considered analytics as solving a business problem using data and you would use statistical modeling in it and business integration and understanding and talking to the business. And that was the key core concept and the key differentiator between what people thought analytics was and what analytics is, in my view.

And even today, when you talk to people and you say, analytics, they would say, "I do a lot of reports". That has not changed even today. Today, people think analytics is reporting. Whereas reporting is just a byproduct of analytics, a report is just a byproduct of analytics. It is not reporting. That has not changed. Because people are more comfortable looking at reports and consuming those reports, I think they still call it analytics. That's one of the biggest issues I've seen. I've tried to educate people on that as well.

Peter Schooff: That makes sense. And I have to say one thing: even among all the people I know on LinkedIn, a lot of people's data jobs have started in the last few years, so data has certainly blown up. So everybody has this big idea of data. What are still some of the biggest myths about data that people get stuck on, would you say?

Shailendra Kumar: So, let's take it to the next level. People are still thinking about analytics as reporting. Now the type of data has changed over a period of time, and I think that is important to explain now. All this while we were talking about structured data, which is the RDBMS, the data lying inside, the CSV file, the Excel file, and so forth. We've been using that data effectively to create something. To be honest, not many organizations are actually using the structured data lying in the organization to solve business problems. Even today you will find lots of large organizations which are not doing much work with the already existing data. And this is structured data.

But then came the next level which people today unfortunately call unstructured data, which isn't unstructured data. It is semi-structured data and this is things like social media data, like data coming from Twitter and Facebook. The data coming from these social media platforms has got a structure. If you look at it, the data coming from Twitter, for example, it is a JSON file, it is an XML file. It has got structure. It has got the name of the person, the ID, what they've said, it has got a text component to it that may be unstructured, but it does have a structure. That is what I call semi-structured data. Most of the people think that is unstructured, but that is not unstructured.

When you talk about unstructured data, as the name indicates, it has got no structure. For example, a video; for example, a picture. The pixels of a picture do not have any structure. And this is what an unstructured data source is. Now, what is happening is that people are bringing all three of these together to make sense out of it. But having said that, I tell people, before jumping into that area of unknown, you still have a lot of data lying in your organization which you're not utilizing to make sense, to create insights, to solve problems. And that is what I see as a challenge at the moment.

Peter Schooff: That's a fantastic breakdown. So you've broken it down into structured and semi-structured data etc. Let's look at structured data — one of the first sets of problems is they're not utilizing their structured data fully. Would you say all structured data is created equal?

Shailendra Kumar: It depends. What data was created, how was it created, and what is its utilization? That is one thing. Because what you're seeing is that the data coming from RDBMSs was created by IT professionals who actually designed and articulated it very well, structured it very well, so it's structured. Whereas when you look at data coming in a comma-separated file, it has got a structure, but you cannot relate it back until you've got that structure aligned to the RDBMS data source. So they're not equal in one sense.

Similarly, the data coming from websites, right, that is in a structured format. The weblogs, they've got a structured format. But that's not equal to the data coming from a financial transactional system. Because that's very tight, very very structured, very very aligned, whereas the data coming from a weblog may not be that structured, may not be that aligned. I'm not saying it may not be that structured, but it's not aligned in such a fashion that it can be equally utilized by matching it to the financial data or the comma-separated file itself. So you need to think on those lines: how these three different sources of structured data work together, and that they're not equal.

"The problem is people start with the data. That's the wrong place to start. You start with the business problem."
Shailendra Kumar

Peter Schooff: So now would you say, in the future with competition, do you think every company has to do everything with all the three various forms of data? Or do you think one form of data and how they deal with it is going to determine how they survive in the marketplace?

Shailendra Kumar: I'd turn it around. The problem is people start with the data. That's the wrong place to start. You start with the business problem. Then you think, what data do you need? Whether structured, semi-structured, unstructured. What do you want? The problem is people say, "Give me data and I'll do something with it." That's a wrong statement in itself.

Because you need to figure out first what problem are you trying to solve. Every time I have a discussion with the client or with an individual or an organization, I ask them to write — and this is the most difficult part of the project — I ask them to write the business problem in plain English in one sentence on the white board. Don't worry about what data source. Don't worry about what models you're going to use. I want to know, what do you want to achieve? Define the problem. Then we will see what outcome you want to achieve. First we define the problem. Then we look at the outcome. Then we see whether you will be able to utilize the output to achieve that outcome. And once we do that, then we will go and find out what data is required and what model needs to be built.

They are parts two and three of the project, but the first part is looking at the business problem. And once you know the business problem, you know what outcome you're trying to achieve, then it becomes an easy matter of looking at whether I need the internal structured data, whether I need to know what customers are talking about on social media, do I need the social media information, do I need pictures and videos and facial-recognition information? What do I need? That only happens once you know where you're going.

Peter Schooff: Yeah, I love that. I love the idea of just defining the business problem down to one sentence, because of the simplicity, and just, let's all be on the same page before we start. So now I love this quote from your book, "There are times when the power of analytics will only tell you what you know. What is more important, though, is how you utilize the insights by taking the sort of actions that can quickly and exponentially make you more money." We're back to money again, which a lot of businesses are interested in. Would you elaborate on this a little bit?

Shailendra Kumar: That quote is, "There are things we know we don't know, and there are things we don't know we don't know." And what I'm trying to explain is the practitioner's point of view. Now, when I come to you and I tell you, you know what, "You don't know this, Peter. You don't know this. And if you do this, you would be making a lot of money." You will say, "Who are you to tell me that?" So I need to build confidence first. So the first part of the discussion starts from telling you what you already know. So when you do use the data, the idea is to create a report, and that's what reports are for. Look at how organizations make decisions: what they do is they get a report and they take a decision on that report. But 95% of the time, I know that the people who are making that decision or reading that report already know the answer in the report. That's why they're comfortable with the report, right?

So let's look at a board meeting where the board has a hunch that this quarter they're going to make a 25% increase in their sales. They have the hunch. Now, that is where they're going to get a report which will say 24% or 29%; it will be in the ballpark range. So there's no unknown. But if I'm only telling you what you already know, you're not going to make a change because what you're doing is you're only reaffirming what you already know. And then you will say, OK, I know what it is and then I'll solve the problem. I don't even know the problem. Analytics, data science, they've changed terms. I call it hardcore analytics, which I don't call reporting in the first place, and now they've started calling it machine learning, when they don't know what machine learning is in the first place. So people are writing predictive models and calling it machine learning, which is another problem.

So analytics helps you ask the question, the question which you were not asking in the first place. Because if you don't ask the question, then you won't get the answer. And because you haven't asked the question, it's not on the report. And it's not in the report because you don't know the question.

Peter Schooff: Well, that's great. I'd love to hear if you have a real world use case of this happening? It's disruptive, I can imagine.

Shailendra Kumar: I know, but that's how disruption actually happens. Otherwise, think about this. Someone imagined that there would be driverless cars, right? And the first person who actually talked about that, people would have laughed at him. And I usually tell this story on the stage every time I have to talk to people. I come from India. We have a great history and we've had kings like Shah Jahan and Akbar, you know, we've all heard about them. I've actually created a story, which is: Shah Jahan had two capitals, one in Agra and one in Delhi. And once he was sitting in his palace in Delhi and someone said to him, "You know what? If you want to talk to someone in Agra, what I'll do is I will put a metallic wire and put two plastic boxes on the other side. And you know, you can actually, physically, talk to them on the two sides." The king got very furious. He said, "What are you talking about?" The man said, "Yeah, I will just lay a cable and a piece of metal and put two plastic boxes on the other side and you can have a conversation." The king said this guy has gone mad.

Then someone came along like me and said, "I've got two wires too, and you can actually see the other person as well." He said, "Put them into jail, they have gone mad." But the reality is, we do that every five seconds of our life today. Yeah? So if you're not thinking outside of the box and if you're not thinking of something that is not normal, the chances are you will not achieve that. People will laugh at you and that's normal. And people will even come and say, "What has he done?" But that's how disruption happens. That's how we get disruption. That's how you make a change.

Peter Schooff: All right, so we've covered a few things, we've talked about your book. What is the key takeaway? I mean, we're in a sea change with so many things going on. What is the one thing you want to sit in people's brains when they come away from this podcast?

Shailendra Kumar: The key takeaway is, first of all, for large and small organizations, you've got a lot of data lying in your internal systems. Use that. Use that, but before that, try and articulate a problem you are trying to solve and then use the data that you've got. Now, everyone will tell you that you need more data because that's how they make money. They will sell you boards, they will sell you hardware, they will sell you software — of course you will need that. But the most important thing is define what you want to achieve. Ride that through and solve problems and then go in and add on stuff. Because if you do that, you will get quick success stories. You will get quick successes and you'll not have to wait for 10 to 12 years, or in some cases, even more, to actually get all the data in one place and then start solving problems. Start doing that now.

Peter Schooff: And as you were saying, use the data you have. That's fantastic advice.

Shailendra Kumar: That's the first thing you do.

Peter Schooff: This is Peter Schooff speaking with Shailendra Kumar of SAP. Make sure you check out his book, Making Money Out of Data. Thank you so much for joining me today, Shaili.

Shailendra Kumar: Thank you.

Listen to the podcast: SAP's Shailendra Kumar Details How to Make Money From All Kinds Of Data


Peter Schooff | July 21, 2020