Configuring Taint Analysis for GDPR

Master thesis


The General Data Protection Regulation (GDPR) [3] has been enforced in the EU since May 2018. It regulates how organizations use personal data. The regulation has a broad scope; this thesis focuses on the use of personal data in software. To check automatically whether a piece of software is GDPR compliant, one can use taint analysis, a static code analysis that tracks data flow through the software from a source of personal data to a sink, a location where personal data should not appear. Taint analysis is widely used to detect security vulnerabilities. For example, FlowDroid [5] uses it to track data leaks in Android apps. Ferrara et al. [1][2] propose a taint analysis for GDPR compliance with manually selected sources, sinks, and categories of the GDPR policy. The goal of this thesis is to automate the GDPR specification of the taint analysis, for example with a machine learning approach such as SWAN [4].
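
To make the source/sink idea concrete, here is a minimal sketch of taint propagation over a toy straight-line program. It is an illustration only, not the approach of FlowDroid, SWAN, or Ferrara et al.: the class `TaintSketch`, the `Call` statement type, and the method names `getEmail`, `anonymize`, `concat`, and `writeLog` are all hypothetical. A real analysis would run on an intermediate representation such as the one provided by Soot and would also have to handle branches, aliasing, and calls across methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TaintSketch {
    /** One statement of a toy program: target = method(args). All names are hypothetical. */
    static final class Call {
        final String target;      // assigned variable, or null if the result is unused
        final String method;      // name of the called method
        final List<String> args;  // names of the argument variables

        Call(String target, String method, String... args) {
            this.target = target;
            this.method = method;
            this.args = Arrays.asList(args);
        }
    }

    /** Walks the statements in order and reports every sink call that receives tainted data. */
    static List<String> findLeaks(List<Call> program, Set<String> sources,
                                  Set<String> sinks, Set<String> sanitizers) {
        Set<String> tainted = new HashSet<>();
        List<String> leaks = new ArrayList<>();
        for (Call c : program) {
            boolean argTainted = false;
            for (String a : c.args) {
                if (tainted.contains(a)) {
                    argTainted = true;
                }
            }
            if (argTainted && sinks.contains(c.method)) {
                leaks.add(c.method + "(" + String.join(", ", c.args) + ")");
            }
            if (c.target != null) {
                if (sources.contains(c.method)) {
                    tainted.add(c.target);     // a source introduces personal data
                } else if (sanitizers.contains(c.method)) {
                    tainted.remove(c.target);  // a sanitizer yields clean data
                } else if (argTainted) {
                    tainted.add(c.target);     // any other call propagates taint
                } else {
                    tainted.remove(c.target);  // overwritten with untainted data
                }
            }
        }
        return leaks;
    }

    /** Runs the analysis on a small example with an assumed taint specification. */
    static List<String> demoLeaks() {
        List<Call> program = Arrays.asList(
            new Call("email", "getEmail"),         // source of personal data
            new Call("msg", "concat", "email"),    // taint flows into msg
            new Call(null, "writeLog", "msg"),     // tainted data reaches a sink
            new Call("safe", "anonymize", "email"),
            new Call(null, "writeLog", "safe"));   // sanitized, not reported
        Set<String> sources = new HashSet<>(Arrays.asList("getEmail"));
        Set<String> sinks = new HashSet<>(Arrays.asList("writeLog"));
        Set<String> sanitizers = new HashSet<>(Arrays.asList("anonymize"));
        return findLeaks(program, sources, sinks, sanitizers);
    }

    public static void main(String[] args) {
        System.out.println(demoLeaks()); // prints [writeLog(msg)]
    }
}
```

In this sketch the three sets passed to `findLeaks` play the role of the taint specification. Here they are written by hand; the thesis aims to infer such a specification automatically.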


Tasks:

  • Identify taint specifications for GDPR (sources, sinks, sanitizers, categories)
  • Propose an automated process for inferring taint specifications
  • Implement a taint analysis for GDPR compliance
  • Evaluate the approach

Requirements:

  • Programming experience (Java)
  • Prior knowledge of static analysis methods (e.g. data flow analysis)
  • Experience with the Soot framework (optional)
  • Experience with machine learning techniques (optional)

Learning outcomes: 

  • Work on your own research project (design, management, analysis, implementation, documentation)
  • Learn research methods in static code analysis
  • Learn how to apply your research results in an industrial context



Goran Piskachev (


[1] Static Analysis for GDPR Compliance, Pietro Ferrara and Fausto Spoto

[2] Tailoring Taint Analysis to GDPR, Pietro Ferrara, Luca Olivieri, and Fausto Spoto


[4] Codebase-adaptive Detection of Security-Relevant Methods, Goran Piskachev, Lisa Nguyen Quang Do and Eric Bodden

[5] FlowDroid -