Mechanistic interpretability - Fluid-core iMIF
The Integrated Mechanistic Interpretability Framework (iMIF) is designed to make AI systems transparent, auditable, and trustworthy while fostering ethical and policy-aligned adoption. iMIF integrates three interdependent components into a single, cohesive framework:
1. Mechanistic Analysis of Neural Networks
iMIF employs methods such as the Information Bottleneck (IB), mutual information analysis, Neural Ordinary Differential Equations (Neural ODEs), and category-theoretic modeling to uncover a network's internal computations. These methods provide layer-wise transparency into how LLMs and generative models transform inputs into outputs, and help identify biases, unintended correlations, and failure points in high-stakes applications.
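As a rough illustration of what mutual-information-based layer analysis can look like in practice, the sketch below captures hidden activations from a toy network with forward hooks and estimates a per-layer proxy for the mutual information between each representation and the labels. The model, layer names, and the per-unit estimator are illustrative assumptions, not iMIF's actual implementation; a full Information Bottleneck analysis would use a joint MI estimator.

```python
# Minimal sketch (illustrative, not iMIF's actual code): a per-layer proxy
# for I(T; Y), the mutual information between a hidden representation T
# and the labels Y, in the spirit of Information Bottleneck analysis.
import torch
import torch.nn as nn
from sklearn.feature_selection import mutual_info_classif

torch.manual_seed(0)

# Toy classifier standing in for a model under audit.
model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),   # hidden representation T1
    nn.Linear(32, 16), nn.ReLU(),   # hidden representation T2
    nn.Linear(16, 2),               # logits
)

# Capture activations at each ReLU with forward hooks.
activations = {}
def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

model[1].register_forward_hook(save_activation("T1"))
model[3].register_forward_hook(save_activation("T2"))

# Synthetic data: inputs X and binary labels Y.
X = torch.randn(500, 10)
Y = (X[:, 0] + X[:, 1] > 0).long()
model(X)  # forward pass populates `activations`

# Per-unit MI with the label, summed as a rough layer-level proxy.
# (A real IB analysis would use a joint estimator, e.g. MINE or k-NN MI.)
for name, act in activations.items():
    mi = mutual_info_classif(act.numpy(), Y.numpy(), random_state=0)
    print(f"{name}: summed per-unit MI with Y = {mi.sum():.3f}")
```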
2. Regulatory Alignment and Auditing
Mechanistic insights are translated into quantitative, policy-aligned metrics that support compliance with Kenya’s Data Protection Act (2019), sectoral regulations, and international frameworks such as the EU AI Act and the US NIST AI Risk Management Framework. iMIF includes Python-based auditing scripts and dashboards that let government agencies trace, verify, and report AI decisions, supporting ethical alignment, accountability, and trustworthiness in AI-driven governance and economic operations.
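A minimal sketch of what such an auditing script might contain is shown below: a traceable decision record plus one example of a quantitative, policy-aligned metric (the demographic parity gap). The record fields, group names, and the 0.1 flagging threshold are illustrative assumptions, not requirements drawn from the Data Protection Act or the EU AI Act.

```python
# Minimal audit-trail sketch with one policy-aligned fairness metric.
# All field names and the threshold are illustrative assumptions.
from dataclasses import dataclass
from collections import defaultdict
import json

@dataclass
class DecisionRecord:
    record_id: str
    model_version: str
    inputs: dict          # features used, kept for traceability
    output: int           # model decision (e.g. 1 = approve, 0 = decline)
    protected_group: str  # attribute audited for disparate impact

def demographic_parity_gap(records):
    """Max difference in positive-decision rate across groups."""
    totals, positives = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r.protected_group] += 1
        positives[r.protected_group] += r.output
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates

# Example audit run over logged decisions.
log = [
    DecisionRecord("a1", "v1.2", {"income": 40}, 1, "group_a"),
    DecisionRecord("a2", "v1.2", {"income": 35}, 0, "group_a"),
    DecisionRecord("b1", "v1.2", {"income": 42}, 1, "group_b"),
    DecisionRecord("b2", "v1.2", {"income": 38}, 1, "group_b"),
]
gap, rates = demographic_parity_gap(log)
report = {"metric": "demographic_parity_gap", "value": gap,
          "rates": rates, "flagged": gap > 0.1}  # illustrative threshold
print(json.dumps(report, indent=2))
```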
3. AI Literacy and Demystification Tools
iMIF provides interactive dashboards and visualization interfaces that make model behavior understandable to non-technical stakeholders, including policymakers and citizens. These tools build AI literacy by letting users inspect individual decisions, compare outcomes, and engage critically with AI systems. By coupling demystification with the mechanistic and regulatory components, iMIF supports inclusive and accountable AI deployment.
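To make the dashboard idea concrete, here is a very small sketch assuming Streamlit as the front end (the source does not name a framework). It lets a user pick a logged decision and see which input features pushed it up or down; the decisions and per-feature attributions are hypothetical stand-ins for output from the auditing layer.

```python
# Minimal dashboard sketch, assuming Streamlit (run with `streamlit run app.py`).
# Decision data and attributions are illustrative stand-ins.
import streamlit as st
import pandas as pd

# Stand-in for decisions and per-feature attributions exported by the
# auditing layer (e.g. from the mechanistic analysis above).
decisions = {
    "Application a1 (approved)": {"income": 0.6, "debt": -0.2, "tenure": 0.1},
    "Application a2 (declined)": {"income": -0.4, "debt": -0.3, "tenure": 0.2},
}

st.title("iMIF decision explorer (sketch)")
choice = st.selectbox("Select a logged decision", list(decisions))
attributions = pd.Series(decisions[choice], name="contribution")

st.bar_chart(attributions)  # which features drove this decision, and by how much
st.caption("Positive bars pushed the decision toward approval; "
           "negative bars pushed it away.")
```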