We con­grat­u­late Kadiray Karakaya on passing his doc­tor­al ex­am­in­a­tion

 |  Heinz Nixdorf InstituteSecure Software Engineering / Heinz Nixdorf Institut

Kadiray Karakaya successfully completed his doctorate on the topic of "Scalable Data-Flow Analysis through Sparsification and Precise Call Graphs" with Prof Dr Eric Bodden.

Static data-flow analysis aims to ensure bug-free, secure, and quality software by accounting for all possible executions of a target programme. Due to scalability constraints, this often entails sound over-approximations that compromise analysis precision. On the other hand, sparsification, an optimization technique that restricts data-flow analyses to analysis-relevant program statements, improves scalability while at the same time maintaining precision.

This thesis presents SPARSEIDE, a novel framework that realises (data-flow) factspecific sparsification for any data-flow analysis that fits the IDE (Interprocedural Distributive Environments) framework. Although IDE analyses can only be sparsified with respect to static symbols and not dynamic values, SPARSEIDE yields significantly lower runtimes and memory consumptions than the original IDE framework. Scalability-improving approaches from the literature, including SPARSEIDE, use a fixed call-graph algorithm, without considering its impact on the downstream data- flow analysis. Through extensive empirical evaluation, this thesis shows how precise context-sensitive call graphs significantly reduce data-flow analysis runtimes.

Precise data-flow analyses reason about the heap through pointer analyses, which are also hard to scale. This thesis also presents SPARSEBOOMERANG as an application of fact-specific sparsification to demand-driven pointer analysis. SPARSEBOOMERANG realises two different sparsification strategies that exploit the characteristics of the pointer analysis domain: a type-aware sparsification and an aliasaware sparsification. Interprocedural data-flow analyses comprise a data-flow solver, a call graph, and a pointer analysis. This thesis shows how to scale precise data-flow analyses by considering all three components from the perspective of sparsification.

Fact-specific sparsification reduces the data-flow solver's workload. As an orthogonal component, the choice of call graph significantly influences data-flow analysis scalability. Pointer analysis, which is known to be a non-distributive problem, also benefits from factspecific sparsification when formulated within a distributive data-flow analysis framework.