When you think about starting your AI project, you’re likely feeling a combination of excitement and concern. Wow, this could be amazing. All the success stories, the sales increases, the revenue growth…so many opportunities! But on the other hand, what if it goes wrong? How can you mitigate the risk of wasting time and money on something that just isn’t viable? There are so many questions, there’s so much hope, and (hopefully) there’s a plan. Bright future ahead of us, am I right? Well, a recent white paper released by Pactera Technologies states that 85% of AI projects fail. Oops.
“But it won’t be me,” you might say. Maybe it won’t, maybe it will; there’s no way to tell now. You can hope for the best, but nothing exempts you from strategic thinking. Be well-informed, be prepared, be attentive.
There Are so Many Ways to Screw Up
And not a single reason to put it more mildly. AI offers some awesome possibilities and a plethora of things you can do wrong. You can go wrong with the data strategy, business/tech alignment, the human factor, and that’s still not all. I’m not trying to scare you away, though. It’s the spooky Halloween season, so we’ll be telling ghost stories — only with AI fails — so you can be more cautious and mindful in the future. You know, learn the lesson before it hurts you.
Why AI Projects Fail: Common Problems
Big Is Never Big Enough
Big data is a buzzword, but it’s also rather enigmatic. How big is “big”? How much data do you need? Yes, data is a problem. Not just because there may not be enough of it (though sometimes that is exactly the issue), but also because of problems with labeling, training data, and so on. An AI system can only be as good as the data it’s fed, so you can’t get any tangible results if there’s no data behind it. So what’s the problem with data? Well, where do we start…
First, there may not be enough of it. If the business you’re running is small and has a limited set of data, you have to carefully discuss your expectations and the current state of your data set with an experienced AI advisor or data scientist. How much data is enough? See, that’s a tricky question because it depends: on the use case, the type of data, and the result you expect. However, you will often hear “the more, the better”. It seems that in data science projects, more is more, period.
Do As I Do, Robot
We tend to expect that AI systems perform intellectual tasks as well as we do — or better. That’s a reasonable thing to expect since we all know that “AI is outperforming humans at more and more tasks.” It is. It even beat a Go champion. However, our minds are much more flexible than AI systems.
Think about recommendations: you meet an interesting person at a startup event. Let’s give him a name: it’s John. John enjoys talking to you and appreciates your knowledge of business and technology – he asks for a recommendation of a book that will help him gain more knowledge about these things too. You quickly run through all the titles in your head. There’s book A, B, C, D, E… OK, John, I’ve got it. You should read (insert title here). How did you know what you should recommend to John?
Your brain scanned the information you’ve gathered so far (what John knows, what he was interested in when talking to you, what his style is) to assess which book would be best for him, even though you have no idea about his actual taste in books. You had a feeling he’d like it, and you might be right.
Now, let’s look at an AI system that “meets” John. John enters the website of an online bookstore and he’s instantly welcomed with a list of bestselling books. Nothing interesting, so he keeps clicking “next”. The AI has no context about John: it’s in a “cold start” situation, where it can’t generate personalized recommendations because it has no information about him. But then John clicks the search bar and looks for “startup”. Oh, there’s the list. He browses and clicks through some titles. At this point, the AI figures out that startups are what John likes, and recommends content on this subject. It doesn’t know John very well, but it uses data about what other users who browsed (or bought) the book “Startup” also liked. But what will happen if nobody else has looked for startup books? John will not get relevant recommendations because the system has no data to learn from.
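To make the cold start concrete, here is a minimal sketch of a "people who viewed this also viewed" recommender with a bestseller fallback. It is only an illustration under simple assumptions: the `recommend` function, the `SESSIONS` data, and every book title other than “Startup” are invented for this example, and real recommender systems are considerably more sophisticated.

```python
from collections import Counter

# Hypothetical browsing sessions of other users (invented data).
SESSIONS = [
    ["Startup", "Lean Basics"],
    ["Startup", "Lean Basics", "Pitching 101"],
    ["Cookbook", "Garden Guide"],
    ["Cookbook"],
]


def recommend(viewed, sessions, top_n=3):
    """Suggest titles that other users viewed together with `viewed`.

    If no session overlaps with what the user viewed (cold start, or a
    topic nobody else browsed), fall back to the overall bestsellers.
    """
    viewed = set(viewed)
    co_counts = Counter()   # titles seen alongside the user's titles
    popularity = Counter()  # how many sessions contain each title
    for session in sessions:
        items = set(session)
        popularity.update(items)
        if viewed & items:
            co_counts.update(items - viewed)

    # No co-occurrence signal means no personalization is possible.
    source = co_counts if co_counts else popularity
    ranked = sorted(source.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked if title not in viewed][:top_n]


cold_start = recommend([], SESSIONS)            # John has just arrived
after_search = recommend(["Startup"], SESSIONS)  # John clicked "Startup"
no_data = recommend(["Poetry"], SESSIONS)        # nobody else viewed this
```

With this data, `after_search` contains the titles that co-occurred with “Startup”, while both `cold_start` and `no_data` are just the generic bestseller list: the system has nothing to go on, so it cannot improvise the way a person would.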
You and the AI may end up recommending different books to John. You both can be right, you both can be wrong, or one of you will be the winner. However, your brain never said “insufficient data”. It just improvised. Artificial intelligence cannot do that. And we, as AI’s “employers”, cannot expect it to perfectly reflect the operations and intricacies of the human brain.
I Thought Labeling Was Passé
Putting labels on people? Sure, that’s passé. Putting labels on data? Never. Data doesn’t just have to exist; it has to be labeled, so that it has a meaning, too. If data isn’t properly organized, humans have to devote their time to the tedious task of labeling it. Data labeling is troublesome, yet somehow many companies just don’t think about it at all. In an article published on the AWS blog, Jennifer Prendki writes:
There is a huge elephant in the room that even some of the savviest tech companies seem to have overlooked or chosen to ignore — the problem of data labeling.
For many machine learning models that are trained in a supervised way (supervised learning), data labeling is crucial. These models need labeled data; without labels, they can’t make sense of it. And because data labeling is such a huge issue, data scientists often choose to use data that has already been labeled. Let’s take the example of images. There is a whole variety of quality images available, yet many machine vision projects rely on ImageNet, one of the best-known labeled image datasets, which contains about 14 million images. Additionally, more and more data is created every day. About 50 terabytes of data is uploaded to Facebook every single day. And Facebook isn’t the only data-generating source. With all this data, we have actually reached a point where there aren’t enough people on the planet to label it all.
There’s So Much Data, It Can’t Be Right
And it might not be right. You may have this feeling that you have all the data you need, you’re just killing it! There might be a lot of data, but is it the right data? If you run an e-commerce business, you likely have a lot of information about your customers: their names, addresses, billing information, perhaps credit card information. You know what they buy and when they buy it. You know what they browse. You also know when they contacted you and via what channel.
Now, what data is necessary? You will look at different information when addressing different problems. So when you’re implementing a recommender system, you may not need all the demographic data, but the purchase history is a must. However, when you want to predict churn, different factors will come into play.
So you may have all the data in the world (no, actually, that’s impossible), but is it the data you need? It’s tempting to collect all the data you can, but it’s just not necessary. The key is to get the right data, not to collect it all. Data is not a collectible item.
The Algorithm vs Justice
In 2017, Joy Buolamwini, an MIT researcher and the founder of the Algorithmic Justice League, gave a TED talk about fighting algorithmic bias. Her presentation starts with her “experimenting” with the software:
“Hi, camera. I’ve got a face. Can you see my face? No-glasses face? You can see her face. What about my face? I’ve got a mask. Can you see my mask?”
So the camera doesn’t detect Joy’s face. It sees her colleague and it sees a white mask, but not Joy’s face. And it’s not the first time it’s happened. When Joy was an undergraduate student at Georgia Tech, she worked with social robots and had the task of teaching one to play peek-a-boo. The robot couldn’t see her. Joy “borrowed” her roommate’s face to get the job done and let the matter go. But it happened again during an entrepreneurship competition in Hong Kong, where one of the startups was presenting its social robot. It used the same generic facial recognition software, and it didn’t see Joy.
How did that happen? Joy goes on to explain:
“Computer vision uses machine learning techniques to do facial recognition. So how this works is, you create a training set with examples of faces. This is a face. This is a face. This is not a face. And over time, you can teach a computer how to recognize other faces. However, if the training sets aren’t really that diverse, any face that deviates too much from the established norm will be harder to detect, which is what was happening to me.”
But how’s that a problem, you might ask? Bias in algorithms spreads fast and wide, and it’s not just about face recognition. Sure, that’s an extreme and dangerous example — the misidentification of minorities due to faulty face recognition can lead to unfair arrests since US police are planning to use such software to identify suspects. What if the machine makes a mistake then?
Since we’re talking about the justice system, how about we bring up COMPAS again? I’ve already described COMPAS, an algorithm used in the US to guide sentencing by predicting the likelihood of reoffending, in an article about trust in AI. The algorithm, learning from historical data, decided that black defendants posed a higher risk of recidivism.
Oh, and there’s also that infamous Amazon AI recruiter that favored men – because most of the workforce was male, so it’s just logical…
What Is Bias in AI?
AI bias, or algorithmic bias, describes systematic and repeatable errors in a computer system that create unfair outcomes, e.g. exhibiting traits that appear to be sexist, racist, or otherwise discriminatory. Though the name suggests AI’s at fault, as described above, it really is all about people.
Cassie Kozyrkov, Chief Decision Scientist at Google, writes:
“No technology is free of its creators. Despite our fondest sci-fi wishes, there’s no such thing as ML/AI systems that are truly separate and autonomous…because they start with us. All technology is an echo of the wishes of whoever built it.”
Bias is generally bad for your business. Whether you’re working on machine vision, a recruitment tool, or anything else, it can make your operations unfair, unethical, or, in extreme cases, illegal. And the unfortunate thing is that it’s not AI’s fault; it’s ours. It’s people who carry prejudice, who spread stereotypes, who are afraid of what’s different. But to develop fair and responsible AI, you have to be able to look beyond your beliefs and opinions, and to make sure your training data set is diverse and fair. Sounds simple, but it’s not easy. It’s worth the effort, though.
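A first, very rough check on whether a training set is diverse is simply to count how each group is represented, and then to compare how well the system performs per group. The sketch below assumes each training example carries a group annotation. The function names, the group labels "A" and "B", the sample data, and the 20% threshold are all illustrative assumptions, not an established fairness standard.

```python
from collections import Counter


def group_shares(groups):
    """Return each group's share of the training examples."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def underrepresented(groups, min_share=0.2):
    """List groups whose share falls below `min_share` (arbitrary cutoff)."""
    shares = group_shares(groups)
    return sorted(g for g, share in shares.items() if share < min_share)


def detection_rate_by_group(records):
    """records: iterable of (group, detected) pairs from an evaluation run."""
    hits, totals = Counter(), Counter()
    for group, detected in records:
        totals[group] += 1
        hits[group] += int(detected)
    return {group: hits[group] / totals[group] for group in totals}


# Invented example: 9 training faces from group A, 1 from group B.
training_groups = ["A"] * 9 + ["B"]
flagged = underrepresented(training_groups)

# Invented evaluation results: (group, was the face detected?)
results = [("A", True)] * 4 + [("B", True), ("B", False)]
rates = detection_rate_by_group(results)
```

Here `flagged` names group B as underrepresented, and `rates` shows a lower detection rate for it. A gap like that is a signal to collect more varied data and investigate, not a complete audit.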
One of the challenges to AI implementation is the fact that senior management may not see value in emerging technologies or may not be willing to invest in them. Or the department you want to augment with AI is not all in. It’s understandable. AI is still seen as a risky business: an expensive tool, difficult to measure, hard to maintain. And it’s such a buzzword. However, the right approach addresses these concerns: start with a business problem that artificial intelligence can solve, design a data strategy, track the appropriate metrics and ROI, prepare your team to work with the system, and establish the success and failure criteria.
As you may have noticed, I use the term “augment” when referring to the task of AI. That’s because AI’s primary job is to augment human work and support data-driven decision-making, not to replace humans in the workplace. Of course, there are businesses aiming at automating as much as can be automated, but generally speaking, it’s really not AI’s cup of tea. It’s much more into teamwork. What’s more, it has been found that AI and humans joining forces give better results. In a Harvard Business Review article, authors H. James Wilson and Paul R. Daugherty write:
In our research involving 1,500 companies, we found that firms achieve the most significant performance improvements when humans and machines work together.
However, as a leader, your job in an AI project is to help your staff understand why you’re introducing artificial intelligence and how they should use the insights provided by the model. Without that, you just have fancy, but useless, analytics.
To illustrate why this matters, let’s look at an example described by CIO magazine. A company called Mr. Cooper introduced a recommender system for its customer service to suggest solutions to customer problems. Once the system was up and running, it took the company 9 months to realize that the staff was not using it, and another 6 months to understand why. It turned out that the recommendations weren’t relevant because the training data included internal documents describing the problems in technical terms, so the model couldn’t understand issues that customers described in their own words rather than in technical jargon.
This example shows two things: the importance of the staff understanding why and how they should work with AI (and knowing that they are allowed to question the system’s performance and report issues), and the significance of reliable training data.
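A mismatch like this can sometimes be spotted early by comparing the vocabulary of the training documents with the words customers actually use. The following is a rough sketch of that idea, not a description of what Mr. Cooper did. The `vocabulary_coverage` function and all sample texts are invented for illustration, and a real check would need proper text processing.

```python
import re


def tokens(text):
    """Lowercase a text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def vocabulary_coverage(training_docs, customer_messages):
    """Share of distinct customer words that also occur in the training docs."""
    train_vocab = set().union(*(tokens(doc) for doc in training_docs))
    customer_vocab = set().union(*(tokens(msg) for msg in customer_messages))
    if not customer_vocab:
        return 0.0
    return len(customer_vocab & train_vocab) / len(customer_vocab)


# Invented examples: jargon-heavy internal docs vs. plain customer wording.
internal_docs = ["Escrow disbursement shortage recalculation procedure"]
customer_words = ["Why did my monthly payment go up?"]
jargon_gap = vocabulary_coverage(internal_docs, customer_words)

# A pair that shares half of the customer's words.
partial = vocabulary_coverage(["reset the password"], ["please reset my password"])
```

A coverage near zero, as in `jargon_gap`, suggests the model was trained on language its users don’t speak. It is a crude signal, but it is cheap to compute before launch rather than 15 months after.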
You can even fail with AI before you start. Yeah, really. This happens when you jump in before having all the necessary resources — the data, the budget, the team, and the strategy. Without these elements, it’s only wishful thinking. That’s why we emphasize the importance of a strategic approach: making sure you are ready for artificial intelligence, identifying the appropriate business use case, outlining a decent data strategy, and establishing the goals. Starting without that strategy is difficult and risky.
You want your AI project, especially the first one, to go towards a bigger objective but also achieve some quick wins along the way. This way, it proves its viability and mitigates the risk of you wasting your company’s money on a useless tool. The first AI project should not be a company-wide AI implementation but a proof of concept that gets the entire organization accustomed to the new normal.
With time, both AI and your company will grow: your systems will keep getting better, and your team will become more data-driven and efficient. It can be a win for all, as long as you take it step by step and don’t lose sight of your objectives. AI is a tool that’s supposed to help you reach your goals, not a goal in itself.
How Not to Fail at AI
You don’t have to fail. The good thing is that with so many organizations having already failed at AI, you can learn from their mistakes and avoid making the same ones in your company. It’s a good practice to observe the market, not just your direct competitors, but also the wider tech world. This way, you will know what you can realistically expect, which use cases are promising, and what limitations you have to take into consideration. And if you want to learn how to prepare yourself and your organization for a well-planned AI adoption, read on: What are the things you must consider before implementing AI in your business?