WHALE – Web-Scale Hybrid Explainable Machine Learning

14.07.2025 | Heinz Nixdorf Institute, Data Science / Heinz Nixdorf Institute

The increasing integration of AI systems into everyday applications has made the explainability and transparency of machine learning models a central research challenge. Particularly on the Web – the largest information infrastructure in human history – decisions made by AI systems affect billions of users daily. Trust in these systems hinges on our ability to understand and explain their behavior.

WHALE (Web-Scale Hybrid Explainable Machine Learning) is a research project led by Prof. Dr. Axel-Cyrille Ngonga Ngomo and funded by the Ministry of Culture and Science of North Rhine-Westphalia (MKW NRW) as part of the Lamarr Fellow Network. Running from October 2023 to September 2027, the project addresses the development of scalable, explainable machine learning methods for Web-scale knowledge graphs.

WHALE focuses on class expression learning (CEL) as an ante-hoc, globally interpretable approach to concept induction over RDF knowledge bases. While CEL has shown promising results in constrained settings, existing algorithms do not scale well to the complexity and size of real-world, Web-scale data.
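
To give a rough intuition for what CEL does, the following minimal Python sketch learns a class expression from positive and negative example individuals. It is an illustration only and not WHALE's method: the knowledge base `KB`, the individuals, the class names, and the functions `instances`, `f1` and `learn` are all hypothetical, and the search is restricted to conjunctions of atomic classes, whereas real CEL systems work with expressive DLs, reasoners and far larger search spaces.

```python
from itertools import combinations

# Hypothetical toy knowledge base: each individual maps to the set of
# atomic classes it is asserted to belong to.
KB = {
    "alice": {"Person", "Researcher", "Professor"},
    "bob": {"Person", "Researcher"},
    "carol": {"Person", "Student"},
    "dave": {"Person", "Researcher", "Professor"},
    "whale": {"Animal"},
}


def instances(expression, kb):
    """Return the individuals that belong to every class in the conjunction."""
    return {ind for ind, classes in kb.items() if expression <= classes}


def f1(expression, positives, negatives, kb):
    """F1 score of a candidate expression on the labelled examples."""
    retrieved = instances(expression, kb)
    tp = len(retrieved & positives)
    fp = len(retrieved & negatives)
    fn = len(positives - retrieved)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def learn(positives, negatives, kb, max_length=2):
    """Exhaustively search conjunctions of up to max_length atomic classes."""
    atoms = sorted(set().union(*kb.values()))
    candidates = [
        frozenset(combo)
        for n in range(1, max_length + 1)
        for combo in combinations(atoms, n)
    ]
    # Prefer a higher F1 score, then the shorter (more readable) expression.
    return max(
        candidates,
        key=lambda e: (f1(e, positives, negatives, kb), -len(e)),
    )


best = learn({"alice", "dave"}, {"bob", "carol"}, KB)
print(" AND ".join(sorted(best)))
```

The learned expression is itself the explanation: a human-readable class description that separates the positive from the negative examples. The exhaustive enumeration used here grows combinatorially with the number of classes, which hints at why scaling CEL to Web-scale data is difficult.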

To overcome these limitations, WHALE focuses on time-efficient hybrid CEL approaches for expressive description logics (DLs) such as SROIQ(D). This focus is motivated by the fact that RDF knowledge bases are easily amenable to CEL, as their T-boxes are commonly described in a DL. By developing the equivalent of large-scale language models (LLMs) for knowledge graphs, WHALE aims to provide the first reusable and time-efficient Web-scale models for explainable CEL. The resulting artefacts will be designed to be reusable (e.g., in a fashion similar to LLMs like GPT-3 and T5) so as to (i) maximize the impact of WHALE on research and applications, (ii) abide by the premises of open and repeatable science, and (iii) foster further research on explainable AI on the Web for the good of humanity.