In this article, we explore AI distillation and the role it can play in transforming models that deliver financial services to low-income consumers. AI distillation holds great potential to democratize AI capabilities for underserved consumers globally through applications like financial coaching and personalized recommendations. But distilled models can also heighten risks of algorithmic bias and security threats, and they can exacerbate vulnerabilities by providing incorrect advice. Without the right design, data, and commercial incentives, we risk expanding access without improving consumer outcomes. We unpack how AI distillation works and examine some of these risks, highlighting areas where CFI is deepening research and insights.

In Greek mythology, Prometheus tricked the gods and gave fire to mortals — an act for which he was punished by Zeus. It is tempting to think of AI distillation, a process in which developers fit the powerful capabilities of larger generative AI models into small, cost-effective models, as a modern-day Prometheus in delivering more efficient models. But are there hidden costs and misaligned incentives we should consider?  

In January this year, researchers at UC Berkeley’s Sky Computing Lab used distillation to train a large language model (LLM) for only USD 450. The model performs comparably to OpenAI’s o1-preview model. A week later, DeepSeek, a well-funded Chinese AI research firm, released its DeepSeek-R1 model, with performance rivaling OpenAI’s state-of-the-art o1 model released only a month earlier. DeepSeek claims to have developed this new model for only a few million dollars, and it distilled R1 into much smaller models that can be run relatively cheaply while retaining much of the original’s capabilities.

However, efficiency cannot be the only metric for success. As distillation brings advanced capabilities to small, cost-effective models, generative AI — with all its potential benefits — may reach underserved consumers much sooner than previously thought, and that brings risks we need to consider and address urgently. To extend the Prometheus metaphor, fire might provide warmth, but it can also burn.  

But What Is Distillation in the Context of AI Models? 

Before we jump into discussing risks, we need to unpack a term that has appeared extensively in the news in recent months in the context of AI models: distillation. Like the English meaning of the term, which refers to extracting the essence, or most important aspects, of something, distillation in AI models is a process in which a relatively small and resource-efficient model (often called the “student model”) learns to mimic a much larger, more complex, and more powerful LLM (often called the “teacher model”) while running faster and at lower cost.

Distillation can produce resource-efficient AI with performance shockingly close to that of the teacher model. These models can be deployed at relatively low cost and in environments with limited computational power, which makes them well suited for use in low- and middle-income countries.  
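
For readers who want to see the mechanics, below is a minimal sketch of classic knowledge distillation on a toy classification task, assuming PyTorch. The tiny teacher and student networks, dummy data, and hyperparameters are illustrative placeholders only; distilling a production LLM is far more involved and often relies on training the student on text generated by the teacher rather than on raw output scores.

```python
# A minimal sketch of knowledge distillation (teacher -> student), assuming
# PyTorch. The architectures, data, and hyperparameters are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical "teacher" (large) and "student" (small) networks.
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 10))

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend of (a) matching the teacher's softened output distribution
    and (b) ordinary cross-entropy on the true labels."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence pulls the student toward the teacher's distribution.
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
x = torch.randn(64, 128)               # dummy input batch
labels = torch.randint(0, 10, (64,))   # dummy labels

with torch.no_grad():                  # the teacher is frozen
    teacher_logits = teacher(x)

optimizer.zero_grad()
loss = distillation_loss(student(x), teacher_logits, labels)
loss.backward()                        # only the student is updated
optimizer.step()
```

The key point is that the student learns from the teacher’s full, softened output distribution rather than only from correct answers, which carries far more signal per example and is what allows a much smaller model to recover much of the larger model’s behavior.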

What Does This Mean for Inclusive Finance? 

The potential of generative AI to have a major impact on the financial services industry has long been discussed, yet widespread adoption of these innovations has seemed far off for the inclusive finance sector. This is because of the relatively high cost of developing, fine-tuning, and running LLMs to serve consumers who have little purchasing power.  

Advancements in distillation suggest that the risks and benefits of generative AI will not remain confined for long to rich countries or to those with access to large cloud computing capacity. With generative AI expected to impact financial services more than any other sector aside from tech, inclusive finance is likely to be an early conduit through which low-income people experience this technology. The sector should be thinking critically about what we should ask of these models and how best to apply them in this context.

There are several ways in which generative AI may be especially useful for inclusive finance clients. For example, automating financial coaching with generative AI chatbots may help these services reach more people. Research suggests that some consumers may also prefer to discuss sensitive topics, like debt stress or gambling problems, with bots rather than with humans who they fear will judge them. However, financial coaching and the discussion of sensitive topics with generative AI raises serious concerns around both the reliability and privacy of this technology, and failures in these areas may impact consumer trust in the long run. Trust sentiments among vulnerable consumers using financial services are an area of forthcoming research undertaken by the Center for Financial Inclusion, which will include a deep dive on how consumer trust is impacted by technology design.  

Generative AI can also automate interactive and personalized customer service. These services, in theory, can be useful for consumers with low financial or technological literacy to assist them in understanding and accessing financial products that can meet their needs. Back-end uses such as analyzing unstructured data for credit or insurance underwriting and lowering the costs of software development can help financial service providers in low-income countries reach and design products for underserved consumers as well. 

In practice, we need to consider the unique vulnerabilities of underserved consumers and ensure they are not subject to greater harm. The idea of AI-based financial apps is not new; several exist today and target young people living paycheck to paycheck. But these apps can pose risks stemming from misaligned incentives to upsell costly services to vulnerable consumers. Similarly, commercial incentives may drive the deployment of cheap distillation models into low-regulation environments with little thought to consumers’ well-being. 

As the race to build the most powerful AI models intensifies, there are serious risks to consider with distillation models. These include their lower volume of training data, which can exacerbate existing biases; privacy and security risks from AI-powered cyberattacks, especially if training data is reverse engineered through student models; and the unauthorized distillation of proprietary models.

Missing Data Challenges 

One issue is the lack of data available to train LLMs to serve low-income consumers — especially in local languages. Many low-income consumers have limited access to digital infrastructure and do not generate large volumes of data with which to train LLMs. This sparseness of data may lead inclusive finance clients to encounter more inaccurate and biased AI or have to deal with models that are simply not trained to meet their needs. 

Relatedly, bias and discrimination are especially difficult to measure in LLMs serving low-income consumers. This is in part because research on identifying and correcting biases in LLMs has largely focused on English-language deployments, yet many low-income consumers reside in countries where English is not the first language. It is also important to remember that language is dynamic and evolves to reflect cultures and changing lives and experiences.

Developers typically test models on large, standardized datasets to measure and then address these issues in their models. However, the scarcity of data in local languages from low-income countries makes these datasets difficult to compile. For example, a study on AI safety in India found that the lack of benchmarking datasets for gender bias in Indian languages makes it challenging for developers or users to measure and address such biases.

Moreover, individuals with low financial or tech literacy are likely to have additional vulnerabilities when presented with incorrect information. With deceptive practices already a critical issue in inclusive finance, inaccurate LLMs may further undermine consumer trust in the sector and exclude people from access to financial services. 

Distillation raises further questions about transparency, auditability, and fairness. LLMs already struggle with traditional notions of transparency and auditability, and this may be exacerbated by the fact that distilled models are a step removed from the training data of their teacher models. The extent to which these models may amplify any flaws or biases from their teachers is also unclear. Since biases often arise from a lack of data on certain subgroups, distilled models trained on much less data than their teachers may have sharper biases. While there is not yet a significant body of research on this topic, the effects of distillation in other contexts suggest that this is a concern. As these models are deployed in low-income countries, vulnerable consumers may be interacting with LLMs that have less transparency, sharper biases, and less effective safeguards.

Even without these added difficulties, large and complex English-language LLMs exhibit clear biases that affect marginalized populations. They have, for example, perpetuated extreme racial biases, offered unsubstantiated, race-based medical advice, and given gendered financial advice based on unfounded assumptions. 

Privacy and Data Security Risks 

Reverse distillation is a process used by “student” AI models to reduce some of the biases that might exist in “teacher” AI models. However, it can also be used with malicious intent to reverse engineer sensitive training data. Security analysts will need to stay a step ahead of these security and privacy threats. As the demand for data access increases, we need rules and regulations that prevent illegal access to data, along with AI watermarking and model fingerprinting techniques that can identify copied models.

While generative AI-powered cyberattacks and fraud are already rampant, distillation may make generative AI more accessible for fraudsters as well. Generative AI can be used to create synthetic identities and bypass ID verification, automate sophisticated phishing attacks, and create convincing deepfakes. The spread of these techniques will harm more consumers and risks undermining consumer confidence in transacting digitally and using digital financial services more broadly. 

What Do We Recommend? 

With distillation potentially allowing more people to leverage generative AI in more diverse contexts, incorporating principles and tools for responsible and equitable deployment of AI is as critical as ever. The inclusive finance sector, researchers, and policymakers should be thinking about best practices for generative AI. These include impact assessments, how to maintain data trails to make distilled models auditable, and how to deploy LLMs to enhance trust in the sector rather than undermine it. Other key issues that should be top of mind include consumer disclosures, liability, and mechanisms to provide redress for consumers who are harmed by generative AI. 

To mitigate the risks faced by vulnerable consumers, we recommend taking three steps before releasing the models for wider public access:  

  1. Assess and address initial risks: One of the limitations of an LLM is that it learns on the basis of its training data. One option to address this is to augment the model with retrieval-augmented generation (RAG), i.e., giving it access to updatable databases to improve the accuracy of results (see the sketch after this list). This has to be done carefully to ensure we are not introducing new biases, but it is a potential area for exploration. Fine-tuning the LLM on a specific dataset designed to detect and correct biases after initial training can be another way to assess and address initial risks.
  2. Develop audit models: This means developing methods to assess the underlying code and reward algorithm of an LLM, i.e., identifying which responses get a high score based on user responses and input data, and how the algorithm learns. Developing audit models is crucial but needs to be done carefully, as recent research suggests that LLMs alter their behavior when they realize they are being audited.
  3. Understand the underlying language libraries to assess biases: Underlying language libraries are a critical piece of infrastructure that determines the accuracy of an LLM. It is crucial not only to understand the inherent biases that might be present in a library, but also to continue to invest in and enhance these libraries so they reflect society and the consumer bases we seek to include and serve.
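
As a rough illustration of the RAG option in the first step, the sketch below retrieves supporting passages with a naive keyword-overlap score and passes them, along with guardrail instructions, to a placeholder call_llm() function. The knowledge base, the scoring, and call_llm are hypothetical stand-ins; a real deployment would use embedding-based retrieval, a vetted and regularly updated knowledge base, and whichever (possibly distilled) model the provider operates.

```python
# A minimal sketch of retrieval-augmented generation (RAG). Everything here
# (documents, scoring, call_llm) is an illustrative placeholder.
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query.
    Real systems would use vector embeddings and a proper index."""
    query_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(query_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def call_llm(prompt: str) -> str:
    # Placeholder: substitute the provider's own (possibly distilled) model.
    raise NotImplementedError

def answer_with_rag(query: str, documents: list[str]) -> str:
    context = "\n".join(retrieve(query, documents))
    prompt = (
        "Answer using only the context below. If the context does not "
        "cover the question, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)

# Illustrative knowledge base a provider might curate and keep up to date.
knowledge_base = [
    "Savings accounts at licensed institutions are covered by deposit insurance up to a limit.",
    "Late loan repayments may incur penalty fees; check the loan agreement for exact terms.",
]
# answer_with_rag("What happens if I repay my loan late?", knowledge_base)
```

Grounding answers in a curated, updatable knowledge base narrows the space in which the model can give incorrect advice, but the quality and coverage of that knowledge base becomes its own potential source of bias and must be maintained with the same care.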

This is an initial analysis of a rapidly evolving space. As we work to understand the interplay of AI and inclusive finance and to build safeguards for better consumer outcomes, we welcome partnerships with organizations thinking about similar issues. Meanwhile, read about our work on equitable AI in financial inclusion here.

To get in touch with us and let us know how helpful you found this article, please drop us a line at center@accion.org.


Authors

Jayshree Venkatesan

Senior Director, Consumer Protection & Strategic Industry Engagement

As Senior Director, Consumer Protection & Strategic Industry Engagement, Jayshree leads the development and execution of the consumer protection research and influence strategy, contributing to CFI’s global portfolio. She also oversees the Responsible Finance Forum’s convening and influence model, building a community dedicated to addressing industry challenges in consumer protection and advancing the responsible finance agenda. 

With two decades of experience spanning structured finance, innovative business models, consumer research, and policy influence, Jayshree has worked to advance financial inclusion and economic development globally. Before joining CFI, she spent nearly a decade as an independent consultant with institutions like CGAP, the World Bank, JICA, and ITAD, focusing on customer-centric business models and challenges faced by low-income consumers in accessing and using formal financial services. From 2009 to 2013, she was part of the founding team at IFMR (now Dvara Trust) in India, where she led India’s first mezzanine fund for microfinance, which evolved into an alternative investment fund. Jayshree is also a senior policy fellow at the Leir Institute within the Fletcher School of Law and Diplomacy and has served as adjunct faculty at the Fletcher School of Law and Diplomacy, teaching decision analysis for business.

Jayshree is a recipient of the Chevening fellowship for leadership from the Foreign and Commonwealth Office, UK, and completed the program at King’s College London. She earned an MA in international relations from the Fletcher School of Law and Diplomacy, an MBA from the Management Development Institute in Gurgaon, and an undergraduate degree in mathematics from Mumbai University.

Gillous Harris

Research Specialist

As a research specialist, Gillous contributes to CFI’s research and policy work by generating insights, communicating them for impact, managing programs, and influencing organizational culture.


Gillous joins CFI from FinRegLab, where he conducted research to advance inclusion in lending, payments, and KYC practices. At FinRegLab, Gillous worked on and managed several research projects, drafted and reviewed policy reports and communications, and convened stakeholders to maximize the impact of policy research.

Gillous is from Blacksburg, Virginia, and he received his BA in International Relations and History from William & Mary in Williamsburg, Virginia. He completed his MA in International Relations at the Johns Hopkins School of Advanced International Studies, where he studied in both Washington, D.C. and Bologna, Italy.

Gillous enjoys spending free time with his dog Villa (a black and tan coonhound), hiking, and playing pickup basketball. He loves to try new things, whether it’s food, traveling to new places, or learning something new.

