I have been rolling out alternative data initiatives for financial inclusion across Latin America for several years. At some point, my clients ask: “is this going to work?” My usual answer is “I’ve failed enough times to have figured this out.”
This is a fairly new and not completely mature field. LenddoEFL has been doing this for over 10 years. While there is still a lot to learn, my team and I can share some wisdom.
In response to Accelerating Financial Inclusion with New Data, I recently wrote about the promise and challenge of using alternative data to bank the unbanked. We’ve learned a lot about applying alternative data and have identified five key success factors:
- Aim at the pain
- Build on top of your current business
- Leverage the best data source for you
- Partner with somebody that can handle multiple data sources
- Persevere. Capture low-hanging fruit without losing sight of the big prize
We will tackle one at a time.
1. Aim at the pain
Some financial institutions come to us interested in “trying out” alternative data. Our usual question is “what problem are you trying to solve?” Sometimes they are not clear about what they want to solve, and sometimes they want to fix too many things at the same time. The whole approach for the initiative will depend on this understanding. Choose one pain, focus on it, and build the KPIs to measure success according to this.
Keep repeating to everybody the pain you are attempting to solve to make sure everybody shares the same understanding.
These are some examples from our experience:
• An MFI wanted to increase productivity per loan officer while maintaining default rates: reduce turnaround time, workload in the field, and complexity. Its client base was made up of unbanked and thin-file customers, so automation based on traditional scores was not an option. Solution: collect psychometric information for credit scoring, which would allow a centralized, automated process.
• A non-traditional microlender wanted early warnings of clients likely to fall into arrears on their next installment so that it could better focus pre-emptive collections efforts. By combining traditional repayment data with Android phone data, we are able to “rank” clients by the probability of next payment default. Now they can focus on the one-third that will create 75 percent of the defaults.
• A traditional financial institution was turning down about one-third of applicants due to lack of credit history, and not belonging to the “right” demographics. They decided to invite “rejects” to re-apply by providing psychometric information, which allowed us to “rescue” about half of those prospects without increasing the default rate.
• A home appliances retailer providing $200 loans to consumers was losing clients due to the time required to verify their identity. By leveraging social network data, they have been able to reduce the approval turnaround time from two days to a few minutes in most cases. They have been able to approve more clients, reduce the cost of identity verification, and reduce cases of fraud.
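The early-warning example above comes down to a simple check: sort clients by predicted probability of default, then measure how many of the actual defaults fall in the riskiest slice. Below is a minimal sketch of that check, not our production method. The function name `capture_rate` and the toy scores and outcomes are hypothetical, made up purely for illustration.

```python
from typing import List


def capture_rate(scores: List[float], defaulted: List[bool], top_fraction: float) -> float:
    """Share of all observed defaults found in the riskiest `top_fraction` of clients.

    `scores` are predicted default probabilities (higher means riskier);
    `defaulted` holds the observed outcome for the same clients.
    """
    if len(scores) != len(defaulted):
        raise ValueError("scores and defaulted must have the same length")
    total_defaults = sum(defaulted)
    if total_defaults == 0:
        return 0.0
    # Rank clients from riskiest to safest according to the model score.
    ranked = sorted(zip(scores, defaulted), key=lambda pair: pair[0], reverse=True)
    cutoff = int(round(len(ranked) * top_fraction))
    captured = sum(outcome for _, outcome in ranked[:cutoff])
    return captured / total_defaults


# Hypothetical toy portfolio: nine clients, four of whom defaulted.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
toy_defaulted = [True, True, True, False, False, True, False, False, False]

# Fraction of defaults captured by working only the riskiest third.
print(capture_rate(toy_scores, toy_defaulted, 1 / 3))
```

Tracked over time, a number like this is a natural KPI for a collections pain point: it tells the team how much of the problem they cover by working only the top of the ranked list.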
2. Build on top of your current business
A good friend and a brilliant risk professional called me asking for help: “We are planning to launch a new product, for a new segment, in a new channel, so we need to use a new source of data to build an origination model.”
“Too many ‘news’ in the equation,” I told him. However, I joined his new venture.
You can guess how this adventure ended: slow volume uptake, lack of an actionable model after several months, and little enthusiasm to keep investing in order to capture value.
As we discussed in the first post, building models with alternative data is a numbers game. You need volume.
In the successful cases we mentioned before, we collected alternative data from a population that was already being served through an established channel, in support of a product that already had traction in the business. Innovation was concentrated in the data source and the methodology to assess risk.
3. Leverage the best data source for you
Each source of data has advantages and drawbacks. In the front end, some sources may create more or less friction on the client onboarding, depending on origination processes. On the backend, usually the “low-friction” data is not structured. Unstructured data is not organized in a predefined way, so using it to build a risk model is more challenging than using structured data.
Once you have identified the pain point, work through the tradeoffs with your partner or vendor, taking your population and channel into account. Note the following tips:
• Highly digital populations already served through an online channel may be approached using digital data, but you must make sure that you can get the volumes required to build a model based on unstructured data (unstructured data requires more volume to build a model).
• People with whom you already have an ongoing relationship may be a good population to leverage mobile phone data, as they may perceive a benefit to downloading and keeping your mobile application.
• Less digitized populations, served through traditional channels (branches or field loan officers) may be better suited for psychometrics.
Avoid the pitfall of falling in love with a specific data source and then trying to figure out a use case for it within your business. Go the other way around: “given my business need, which data source fits it best?”
4. Partner with somebody that can handle multiple data sources
“When you only have a hammer, all problems look like nails,” my first boss told me a long time ago. To avoid the pitfall described in recommendation three, you must partner with a vendor that can manage several data sources.
This will not only let you choose the right pain and business to focus on, but also give you flexibility as you roll out.
For example, while working with one client, we found that their customers would willingly share their email data. Unfortunately, they used their email so rarely that we couldn’t score many of them. Now we are working with psychometrics in this population.
In another situation, we started using psychometrics to approve more people at a Mexican e-lender. In the meantime—while they were approving more clients—we collected digital data from these same applicants. After several months, we have been able to combine both sources of data to approve even more people.
5. Persevere. Capture low-hanging fruit without losing sight of the big prize
If you are like most of us and work for an organization that needs results in a few quarters, structure your initiative to collect early results that give you momentum while you go for the long-term prize.
We work with an institution that provides big loans. They do not have much volume, but they invest heavily in each prospect. A high-stakes, low-volume business is the most challenging environment in which to build an alternative data-based score. It took us almost four years, but now they are harvesting the fruits of their perseverance.
To deal with this issue, you need to be creative to identify secondary pain points that may be addressed quickly along the way.
For example, we worked for a retailer that wanted to increase approvals while keeping defaults in line by approving new-to-credit consumers. Loans mostly had 24- to 36-month terms, and most 60-day defaulters tended to recover. That was a challenging situation: we would have to wait 12 months for vintages to mature, and look for 90 or 120 days in arrears to profile the “bads.” It looked like a two- to three-year project.
But we found a secondary pain: “straight rollers.” These were loan recipients who didn’t pay their first two or three installments and were eventually written off. We collected data on all their clients to quickly build a “straight rollers model.” We only needed 3 installments on each vintage to identify bads.
Along the way, we are collecting data that will be used to build an admission score to address the main pain.
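To see why the “straight rollers” target matures so quickly, here is a minimal sketch of how such a label could be built from the first few installments alone. It is a simplification under stated assumptions: it flags a loan when none of its first three due installments were paid, and it ignores the eventual write-off status. The function names and the toy payment histories are hypothetical.

```python
from typing import Dict, List


def is_straight_roller(paid_flags: List[bool], first_n: int = 3) -> bool:
    """True when at least `first_n` installments are due and none of them were paid.

    `paid_flags[i]` is True if installment i (in due-date order) was paid.
    Loans with fewer than `first_n` due installments are too young to label.
    """
    if len(paid_flags) < first_n:
        return False  # not enough installments observed yet
    return not any(paid_flags[:first_n])


def label_vintage(loans: Dict[str, List[bool]], first_n: int = 3) -> Dict[str, bool]:
    """Apply the straight-roller rule to every loan in a vintage."""
    return {loan_id: is_straight_roller(flags, first_n) for loan_id, flags in loans.items()}


# Hypothetical payment histories for three loans in one vintage.
toy_vintage = {
    "loan_a": [False, False, False],  # missed the first three installments
    "loan_b": [True, False, False],   # paid the first installment
    "loan_c": [False, False],         # only two installments due so far
}

print(label_vintage(toy_vintage))
```

Because the label depends only on the earliest installments, each vintage produces usable “bads” within a few months instead of waiting a year or more for 90 or 120 days in arrears.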
In summary, building credit policies based on alternative data is challenging. Fortunately, there is enough learning accumulated in our community to avoid some pitfalls and we hope you find these tips useful.