In my most recent post, I covered some areas that I hope to see evolve in the next year and beyond. How we can do more with data across industries is, of course, an important consideration for data scientists, businesses, and society as a whole, as better models lead to improved products and services.
When machine learning models for cancer diagnoses show promise, we naturally rally around this positive step and rejoice in the vision of a brighter future because it’s a victory that touches us all in some way. But there are many other ways AI can and must be used for good in the world, and in my next few posts, I want to use a financial services example that affects all of us, to show how that can be achieved.
A Matter of Interpretation
Most of us have, at some point, needed a loan. Lending is an essential revenue generator for banks and loan providers. The importance of making the right decision when it comes to approving a loan cannot be overstated. The trouble with this statement, however, is that “the right decision” can be a matter of interpretation.
A small business owner who needs a loan to maintain operations until the next guaranteed order comes through is obviously going to question that interpretation if the application is unsuccessful. In the past, this application would have been subject to approval from a manager within a bank or loan company. Today, automated decision-making solutions that don’t rely on the judgment of individuals have been widely adopted. Increasingly, advanced models are being used to improve the process and provide a more accurate assessment of risk.
The demand for loans, through traditional lenders, challengers, or fintechs, is booming, particularly as point-of-sale loans such as buy-now-pay-later offer new and flexible ways to gain credit. The challenge for any lender, however, is ensuring that their loans are (and continue to be) universally accessible and fair.
A Data Science Solution?
As data scientists and machine learning engineers, we have a very specific pipeline we all know and love. Data is the heart of modeling, so we start by exploring our data sets and identifying relationships. We go through exploratory data analysis and turn the data into a usable form. We spend time wrangling, cleaning, and pre-processing our data, and then go through intense feature generation to create more useful descriptions of the data to solve the problem at hand. We experiment with different models, tune parameters and hyperparameters, validate our models, and repeat this cycle until we’ve met our desired performance metrics. We then focus on deploying and productizing our models, as well as maintaining them to ensure that they are running properly and are adaptable in our production environments.
In case you didn’t spot it, there is no point in this process at which we tend to think about or prioritize the fairness of the model, because the traditional cadence focuses on creating the most effective model. But this, I believe, has to change.
The lending industry is packed with data. Every day we are collecting more and more in structured and unstructured formats, and growing databases with billions of detailed records. By leveraging AI, we are revolutionizing traditional lending, as the wealth of data we can now surface and sort can be used to automate and create more accurate decisions, increase processing efficiency, reduce internal operational costs, and create better and more bespoke experiences for customers and clients.
We’ve been solving some amazing problems with machine learning. For example:
- We can now leverage optical character recognition and natural language processing to parse documents required in the lending process and extract information to pre-populate front-end point-of-sale (POS) systems.
- We’re using machine learning to help automate the mortgage decision process, and we have access to billions of data points to better understand our customer base, allowing us to create smarter marketing campaigns.
- We’re also challenging the traditional FICO score, using machine learning to better assess the credit-invisible.
Clearly, AI in lending has accomplished a lot, and machine learning is playing a significant role in changing and transforming the industry. However, there’s one thing a lot of us typically don’t think about when it comes to our ML pipelines, and that’s model fairness.
Fair Isn’t Foul, and Foul Isn’t Fair
Ensuring that the decisions of our models are fair to the populations we model, and do not discriminate against particular groups or individuals, is vital today. If not for their own sake, organizations across all industries must realize that their AI propositions will come under increasing scrutiny in the coming years, and they should be equipped to defend their records.
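One simple way to make this concrete is to compare approval rates across groups. The sketch below is only an illustration and not a method described in this post: the function names, the group labels "A" and "B", and the sample decisions are all hypothetical, and the ratio it computes (often called a disparate impact ratio) is just one of many possible fairness measures.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the share of approved applications per group.

    `decisions` is an iterable of (group, approved) pairs, where
    `approved` is True or False. The group labels are hypothetical.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Divide the lowest group approval rate by the highest.

    A value of 1.0 means all groups are approved at the same rate;
    values closer to 0 mean a larger gap between groups.
    """
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so the rates are equal
    return min(rates.values()) / highest


# Made-up model decisions for two hypothetical groups.
sample = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

rates = approval_rates(sample)
ratio = disparate_impact_ratio(rates)
print(rates)
print(round(ratio, 2))
```

In this made-up sample, group A is approved three times out of four and group B once out of four, so the ratio is about 0.33. A gap like that does not prove discrimination on its own, but it is the kind of signal that should prompt a closer look at the model and its data.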
If we return to the example of the small business owner I presented at the start of this article, we should consider the reasons why a “successful” algorithm might have rejected the loan application.
Algorithmic fairness is a hugely important topic, especially in the lending industry, and in my next post, I would like to delve deeper into how we build the tooling and practices to enable it.
Organizations and data scientists today must remember one thing, which is summed up in the following quote: “Algorithms don’t remember incidents of unfair bias, but customers do.”