
Developing Trust in AI for Financial Services: Current Progress and Future Directions

Updated: Jul 23

The rapid integration of artificial intelligence (AI) into financial services has reached a critical point. Recently, Sam Altman highlighted significant limitations of AI systems, including issues such as "hallucinations"—instances where AI models produce plausible but incorrect information—and opaque reasoning, which obscures how decisions are made. These concerns underline the necessity for rigorous measures to ensure that AI systems, particularly in finance, are transparent, reliable, and trustworthy. Given that financial decisions have profound impacts on individuals and economies, establishing trust in AI is both critical and urgent. This article explores various strategies currently employed to foster trust in financial AI, examines key regulatory frameworks, and highlights areas requiring future innovation.


Usage of AI in Financial Services


AI has a long history of use in financial services, extending beyond the recent surge in generative AI applications. Traditional uses include algorithmic trading, fraud detection, and credit risk modeling, employing techniques such as random forests, gradient boosting machines, and neural networks.


The advent of large language models (LLMs) and generative AI has expanded AI's application scope significantly. Financial institutions—including banks, insurance companies, and fintech startups—are adopting these advanced models for tasks such as document summarization, customer support chatbots, internal workflow automation, and even programming assistance. Recent surveys indicate that 75% of financial services firms are already using AI, with an additional 10% planning implementation within three years. The adoption spans fraud detection (60% of European institutions), credit scoring systems (63% of firms), and algorithmic trading (over 50% of trading firms). These newer applications, despite their promise, introduce substantial risks, including reputational damage and potential financial losses, thereby underscoring the importance of robust trust frameworks.


The Trust Imperative in Financial AI


Historically, financial institutions built trust through human interactions and their established reputations. AI, however, challenges this traditional model by introducing systems whose decision-making processes are complex and not inherently transparent. Consumers demand transparency from AI-driven financial services, and industry practitioners specifically prioritize "explainability"—the ability to clearly understand how decisions are reached. Concern is greatest when AI is used to make decisions with serious financial consequences, such as loan denials, trading actions, or fraud alerts.


Current Trust-Building Frameworks in Financial Services


1. Regulatory-Driven Governance


The 2008 financial crisis exposed severe shortcomings in risk modeling, prompting regulations such as SR 11-7 in the United States, which set comprehensive standards for model risk management (MRM). However, despite being comprehensive, SR 11-7 was not explicitly geared towards AI usage and left room for ambiguity, with the result that certain AI algorithms fell outside MRM's scope depending on the model definitions and standards followed at individual banks.


Recent regulatory evolutions have increasingly included explicit guidelines for AI and machine learning (ML) applications:

  • The EU AI Act, whose financial services provisions classify uses such as credit scoring as high-risk.

  • The Monetary Authority of Singapore's AI Model Risk Management Guidelines.

  • The UK PRA's SS 1/23 supervisory statement on model risk management.

  • NYDFS regulations on AI use in financial decision making.

  • The CCPA's automated decision-making transparency requirements.

  • Canada's Directive on Automated Decision-Making.

These evolving regulations emphasize the necessity of transparency, fairness, and accountability in AI systems within financial services.


2. Enhanced Model Risk Management (MRM)


Modern MRM practices now demand continuous, real-time monitoring instead of periodic checks. Regulatory frameworks like the UK's PRA SS 1/23 mandate:


  • Continuous drift detection in model performance

  • Ongoing bias assessments (such as demographic parity and equalized odds)

  • Comprehensive, transparent audit trails
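The two bias metrics named above can be computed directly. A minimal sketch in Python, assuming binary predictions and a binary protected group attribute (the function names are illustrative, not from any specific MRM platform):

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate_a = y_pred[group == 0].mean()
    rate_b = y_pred[group == 1].mean()
    return abs(rate_a - rate_b)

def equalized_odds_gap(y_true, y_pred, group):
    """Largest gap between groups in true-positive rate or false-positive rate."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    gaps = []
    for label in (1, 0):  # label=1 compares TPRs, label=0 compares FPRs
        mask = y_true == label
        rate_a = y_pred[mask & (group == 0)].mean()
        rate_b = y_pred[mask & (group == 1)].mean()
        gaps.append(abs(rate_a - rate_b))
    return max(gaps)
```

In an ongoing-assessment setting, these gaps would be recomputed on each scoring batch and alerted on when they exceed a policy threshold.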


Advanced platforms automate governance activities, facilitating dynamic oversight. These platforms use machine learning to monitor model performance continuously, detect anomalies, and rapidly assess impacts of drift and bias. Automated alerting ensures immediate response to issues, enhancing reliability and accountability.
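One common drift statistic in banking MRM (one choice among several, not mandated by the text above) is the population stability index, which compares a model's baseline score distribution against recent production scores. A minimal sketch:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline score distribution and a recent one.
    A common rule of thumb: PSI > 0.25 signals significant drift."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Bin edges are quantiles of the baseline distribution
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    e_pct = np.histogram(expected, edges)[0] / len(expected)
    a_pct = np.histogram(actual, edges)[0] / len(actual)
    # Small floor avoids log(0) for empty bins
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))
```

A monitoring platform would run this on a schedule and raise an automated alert when the index crosses the institution's tolerance for a given model tier.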


Additionally, advanced analytics tools enable detailed scenario analyses, aligning AI models with regulatory and ethical standards. Financial institutions are increasingly adopting specialized MRM tools that automate document generation, template population for model development and validation, and workflow configurations that integrate controls throughout the model lifecycle.


3. Mechanistic Interpretability


Explainability of machine learning models has long been a concern, and black-box evaluation using traditional attribution methods such as LIME and SHAP has well-known shortcomings. Mechanistic interpretability is a newer, more advanced approach that seeks to deeply understand the internal workings of complex models, especially neural networks.


  • Sparse Autoencoders (SAEs): Neural networks that isolate specific internal features (e.g., neurons linked explicitly to credit risk), providing clarity on model decisions. Recent research shows that SAEs can decompose complex model activations into interpretable features, making it possible to systematically map model behavior to human-understandable concepts. As highlighted in Neuronpedia, this approach helps address the challenge of polysemantic neurons—where a single neuron encodes multiple, unrelated features—by enabling more granular, monosemantic representations.


  • Logit Lens: Tracing predictions layer-by-layer, enhancing transparency. This method allows researchers to examine how information propagates through each stage of the model, revealing the intermediate computations that lead to the final output.


  • Activation Patching: Identifying specific components responsible for decisions, allowing targeted adjustments. By editing internal activations and observing changes in model outputs, practitioners can causally attribute certain behaviors to specific features or circuits within the network, which is crucial for debugging and safety validation.
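The SAE idea above can be shown at toy scale. The sketch below trains a small overcomplete autoencoder with a ReLU code layer and an L1 sparsity penalty on synthetic "activations" built from a few ground-truth feature directions; everything here (dimensions, learning rate, penalty weight) is an illustrative assumption, not a production recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "model activations": each sample mixes a few of 3 ground-truth
# feature directions in an 8-dim space (a stand-in for an LLM layer).
true_feats = rng.normal(size=(3, 8))
coeffs = rng.random((512, 3)) * (rng.random((512, 3)) < 0.3)
acts = coeffs @ true_feats

# Overcomplete SAE: 8 -> 12 ReLU codes -> 8, with an L1 penalty that pushes
# each sample to use only a few code units (toward monosemantic features).
W_enc = rng.normal(scale=0.1, size=(8, 12))
W_dec = rng.normal(scale=0.1, size=(12, 8))
lr, l1, n = 0.02, 1e-3, len(acts)

def forward():
    z = np.maximum(acts @ W_enc, 0.0)
    return z, z @ W_dec

_, recon0 = forward()
mse0 = float(((recon0 - acts) ** 2).mean())   # error before training

for _ in range(1000):
    z, recon = forward()
    err = recon - acts
    # Gradient of reconstruction error plus L1 sparsity term on active units
    grad_z = err @ W_dec.T * (z > 0) + l1 * (z > 0)
    W_dec -= lr * z.T @ err / n
    W_enc -= lr * acts.T @ grad_z / n

z, recon = forward()
mse = float(((recon - acts) ** 2).mean())
sparsity = float((z > 1e-6).mean())   # fraction of code units active per sample
```

After training, individual code units can be inspected against the known feature directions; at real scale, this inspection step is what maps SAE features to human-understandable concepts.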


4. Adversarial Protections


Adversarial testing systematically exposes and mitigates AI vulnerabilities by simulating realistic attacks. Community exercises such as DEF CON's red-team events evaluate models against threats such as data poisoning and model theft, complemented by NIST guidance involving reverse stress tests and synthetic fraud scenarios.


DEF CON is the world's largest annual cybersecurity convention, where hackers, security professionals, and researchers gather to share knowledge, techniques, and discoveries in digital security. The DEF CON AI Village hosts specialized events focused on AI security, including the Generative AI Red Team (GRT) challenges, where thousands of participants attempt to identify vulnerabilities in large language models from major AI companies. The GRT events have exposed significant security flaws, with the largest event involving 2,244 hackers evaluating 8 LLMs across 21 topics ranging from cybersecurity to misinformation. These events have contributed to improved AI safety measures across the industry.


The National Institute of Standards and Technology (NIST) is a U.S. federal agency that develops standards and guidelines to promote innovation and ensure public trust in technology. NIST's AI Risk Management Framework (AI RMF) provides voluntary guidance for organizations to identify, assess, and mitigate AI risks throughout the entire AI lifecycle. The framework includes four core functions—GOVERN, MAP, MEASURE, and MANAGE—and emphasizes trustworthy AI characteristics including validity, reliability, safety, security, accountability, explainability, and fairness. The framework includes standards for testing, evaluation, verification, and validation (TEVV) of AI systems, along with comprehensive documentation requirements for system transparency.
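A reverse stress test of the kind mentioned above asks how small a change to an input would flip a model's decision. The toy probe below uses a fixed linear fraud scorer as a stand-in for a deployed model (the weights, feature names, and search procedure are all illustrative assumptions):

```python
import numpy as np

# Toy fraud scorer with fixed, known weights, e.g. (amount, account_age, velocity).
WEIGHTS = np.array([0.8, -0.5, 1.2])
BIAS = -1.0

def fraud_score(x):
    """Sigmoid score in [0, 1]; >= 0.5 means the transaction is flagged."""
    return 1.0 / (1.0 + np.exp(-(x @ WEIGHTS + BIAS)))

def minimal_flip_perturbation(x, step=0.01, max_steps=1000, threshold=0.5):
    """Crude adversarial probe: walk the input along the score gradient
    direction until the fraud decision flips, and report the distance needed.
    A small distance means the decision is fragile near this input."""
    flagged = fraud_score(x) >= threshold
    direction = -WEIGHTS if flagged else WEIGHTS
    direction = direction / np.linalg.norm(direction)
    for i in range(1, max_steps + 1):
        x_adv = x + i * step * direction
        if (fraud_score(x_adv) >= threshold) != flagged:
            return i * step
    return None  # decision is robust within the search budget
```

Run across a portfolio of synthetic fraud scenarios, the distribution of flip distances gives a simple fragility profile that a red team or validator can report against.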


5. Evolving Testing Metrics


The landscape of AI testing metrics is rapidly advancing, driven by the need for more sophisticated evaluation approaches that can address the complex challenges posed by modern AI systems. Frameworks such as Sudjianto et al.'s HCAT (Human-Calibrated Automated Testing) and Liang et al.'s HELM (Holistic Evaluation of Language Models) propose advanced evaluation strategies emphasizing embedding-based metrics, robustness testing, and human calibration.


The HCAT framework introduces sophisticated embedding-based metrics that move beyond traditional n-gram approaches like BLEU and ROUGE. These metrics use contextual embeddings from models like BERT to measure semantic similarity, capturing paraphrases and contextual nuances that surface-level methods miss. HELM's comprehensive approach evaluates 30+ language models across 42 scenarios, measuring seven key metrics including accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency, ensuring that non-accuracy metrics receive proper attention.
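The core of an embedding-based metric is cosine similarity between text representations. The sketch below substitutes a bag-of-words vector for a real contextual encoder so it stays self-contained; a real HCAT-style metric would replace `embed` with BERT-style embeddings, which is precisely what lets it score paraphrases highly where word-overlap cannot:

```python
import numpy as np

def embed(text, vocab):
    """Toy stand-in for a contextual encoder: a bag-of-words count vector.
    Unlike BERT embeddings, this only captures literal word overlap."""
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

def semantic_similarity(a, b):
    """Cosine similarity between the two texts' embedding vectors."""
    vocab = sorted(set(a.lower().split()) | set(b.lower().split()))
    va, vb = embed(a, vocab), embed(b, vocab)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))
```

Swapping the toy `embed` for a sentence-encoder keeps the scoring logic unchanged, which is why embedding-based metrics slot cleanly into existing evaluation harnesses.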


Persistent Challenges


Despite significant advances, several challenges remain:


  • Generative AI Risks: AI-generated "hallucinations" can evade detection, representing a critical reliability issue.

  • Scalability Issues: Interpretability methods struggle to scale to models exceeding 10 billion parameters, as computational requirements grow rapidly with model size.

  • Bias Entrenchment: Racial disparities persist in lending decisions despite bias mitigation efforts.

  • Regulatory Fragmentation: Lack of interoperability among global regulatory frameworks creates compliance complexity for multinational institutions.


The Trust Roadmap: 2025 and Beyond


  1. Automated and ongoing AI Model Testing integrated into MRM


Adopting AI/ML means model parameters now change dynamically and continuously, which drives the need to automate comprehensive MRM activities through new-generation platforms. This includes real-time cataloguing, monitoring, and testing of models used within a bank's environment. Advanced platforms will maintain dynamic model inventories that automatically discover and catalog AI models, while intelligent workflow orchestration will provide configurable rule engines tailored to different risk tiers and regulatory jurisdictions. Models will undergo ongoing testing triggered automatically, with human oversight needed to study outliers, investigate unusual patterns, and analyze reports. This level of automation will help MRM teams expand the set of models under their purview to accommodate the ever-growing AI use cases within financial services.
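A configurable rule engine of the kind described above can be sketched simply. The tiers, thresholds, and field names below are hypothetical illustrations, not drawn from any specific platform or regulation:

```python
from dataclasses import dataclass

@dataclass
class ModelRecord:
    name: str
    risk_tier: str        # "high", "medium", or "low"
    days_since_test: int  # time since the last automated test run
    psi: float            # latest drift statistic for the model

# Illustrative policy: test frequency and drift tolerance vary by risk tier.
POLICY = {
    "high":   {"max_days": 1,  "max_psi": 0.10},
    "medium": {"max_days": 7,  "max_psi": 0.25},
    "low":    {"max_days": 30, "max_psi": 0.25},
}

def models_needing_retest(inventory):
    """Return names of models whose tier policy triggers an automated retest,
    flagging them for human review of outliers and unusual patterns."""
    due = []
    for m in inventory:
        rule = POLICY[m.risk_tier]
        if m.days_since_test > rule["max_days"] or m.psi > rule["max_psi"]:
            due.append(m.name)
    return due
```

In a production platform the inventory would be populated by automatic model discovery, and the retest queue would feed both the testing pipeline and the human-review workflow.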


  2. Embedded Cryptographic Trust: Zero-Knowledge Proofs


Zero-knowledge proofs (ZKPs) are advanced cryptographic techniques that allow one party to prove to another that a certain statement about an AI model is true—such as fairness, regulatory compliance, or correct execution—without revealing any sensitive details about the model itself or its underlying data. This is particularly important in financial services, where models often contain proprietary algorithms and use confidential customer data. ZKPs enable financial institutions to demonstrate to regulators and third parties that their AI models meet specific legal or ethical standards (e.g., absence of bias, adherence to lending rules) without exposing the model’s inner workings or sensitive training data. This preserves intellectual property and privacy while still ensuring accountability.
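Real ZKP systems require specialized proving libraries, so a faithful example is out of scope here. The sketch below is not a zero-knowledge proof; it illustrates only the simpler commit-then-verify pattern that ZKP pipelines build on, where an institution binds itself to a model without revealing it:

```python
import hashlib
import json

def commit(model_params, nonce):
    """Publish a binding commitment to model parameters without revealing them.
    A real ZKP goes further: it proves properties of the committed model
    (e.g. a fairness bound) without ever opening the commitment."""
    payload = json.dumps({"params": model_params, "nonce": nonce}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def verify_opening(commitment, model_params, nonce):
    """A regulator can later check a revealed model against the commitment."""
    return commit(model_params, nonce) == commitment
```

The commitment guarantees the institution cannot swap in a different model after the fact; the zero-knowledge machinery is what removes the need to ever reveal the parameters at all.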


  3. Cross-Industry Stress Testing


The Singapore AI Verify Foundation’s AI Assurance Pilot in 2025 brought together 17 organizations deploying GenAI applications across 10 industries—including banking, insurance, and technology—and paired them with 16 specialist AI testing firms from Singapore and eight other countries. These real-world pilots tested a diverse set of live applications, most with a human in the loop, and focused on surfacing and codifying emerging norms and best practices for technical evaluation. The pilot emphasized the importance of context-specific risk assessment, simulation testing for edge cases, and the value of independent, external evaluation to uncover systemic vulnerabilities.


In parallel, initiatives like the AI Safety Institute’s Turing Trials are advancing cross-industry stress testing by providing structured, multi-scenario evaluations that simulate adversarial threats and operational edge cases. Collectively, these efforts are shaping a more transparent, reliable, and collaborative approach to stress testing across the financial sector and beyond. 


  4. Future LLM Explainability Research


In their paper Beyond the Black Box: Interpretability of LLMs in Finance, Tatsat et al. (2025) note that while current techniques like sparse autoencoders and feature attribution can reveal some internal mechanisms, future research must address the challenge of polysemantic neurons and emergent behaviors in models with trillions of parameters. This requires scalable methods to map internal model components to human-understandable concepts, especially as models become more complex and dynamic. They also stress the importance of developing causal tracing tools that can establish clear, auditable links between model inputs, internal reasoning steps, and outputs. This is crucial for meeting regulatory requirements and providing actionable explanations in regulated industries.


Future explainability research should focus on building domain-aware interpretability frameworks that can translate LLM reasoning into financial concepts, such as risk factors or compliance criteria, making explanations more relevant and actionable for practitioners.


Conclusion


Trust in financial AI is foundational. Institutions must integrate rigorous regulatory compliance, advanced interpretability, sophisticated adversarial protections, and robust evaluation frameworks. Future success depends on proactive industry-wide collaboration, alignment with global regulatory standards, and sustained investment in AI innovations.



Further Reading and References
  1. European Commission. (2025). EU AI Act: Financial Services Annex.

  2. Monetary Authority of Singapore. (2025). AI Model Risk Management Guidelines.

  3. UK PRA. (2023). SS 1/23: Supervisory Statement on Model Risk Management.

  4. NYDFS. (2024). AI Use in Financial Decision Making Regulations.

  5. CCPA. (2020). Automated Decision-Making Transparency Requirements.

  6. Canada Treasury Board Secretariat. (2021). Directive on Automated Decision-Making.

  7. Sudjianto, A. et al. (2024). Human-Calibrated Automated Testing (HCAT).

  8. Liang, P. et al. (2023). Holistic Evaluation of Language Models (HELM).

  9. Tatsat, H. and Shater, A. (2025). Beyond the Black Box: Interpretability of LLMs in Finance.

  10. DEFCON AI Red Team. (2025). Financial AI Vulnerability Database.

  11. NIST (2024). Adversarial Testing Standards for AI Models.

  12. Neuronpedia. (2024). Sparse Autoencoders. Available at https://docs.neuronpedia.org/sparse-autoencoder

  13. Turing Trials. (2025). UK AI Safety Institute Cross-Industry Stress Testing. Available at https://www.turing.ac.uk/news/new-ai-security-initiative-set-boost-uks-resilience-against-hostile-threats

  14. Singapore AI Verify Foundation. (2025). Main Report on AI Assurance Pilot of Technical Testing of Generative AI Applications.

  15. Quantum Cryptography Report. (2024). Quantum Key Distribution and Post-Quantum Cryptography Standards.



