Designing code analyses for large-scale software systems 2 (DECA 2) SS2024
Course number and language
L.079.05800
The teaching language will be English. Questions in German will be permitted.
Registering and communicating
To attend the course, you have to register in the PAUL system as a participant.
To ask questions, please use the discussion forum in PANDA, so that others can benefit from the answers as well.
If needed, we will also send updates through PANDA circulars.
Abstract
Static code analysis is used to detect bugs and security vulnerabilities, and it aids compiler optimization. It has been an active area of research for several decades. This course explains several novel, advanced concepts from cutting-edge research (such as weighted pushdown systems and demand-driven program analyses) and also introduces some interesting, recently developed tools. Most of these concepts are very recent and hence give an excellent overview of what static analysis researchers are currently working on. Example applications are drawn from the area of IT security.
Course structure
Each week, two hours will be dedicated to the lecture, and three hours will be dedicated to concrete exercise classes and programming labs.
In the exercise sessions, you will apply the concepts from the lecture to more concrete problems, which prepares you to present your knowledge in the final exam.
The goal of the programming labs is to introduce you to recent program analysis tools and to deepen your knowledge and understanding of the concepts seen in the lecture and exercise sessions. The lab assignments will mostly be done at home; the scheduled lab hours are used to answer questions about the ongoing lab.
If you have questions about the organisation of the course, the topic, the exercises, or the labs, or if you get stuck when solving the exercises or labs, please use the forum in PANDA. We try to answer regularly and as soon as possible.
Evaluation
Graded exercise sheets:
- During the semester, you will have to hand in six graded exercise sheets.
- Each sheet has to be handed in through PANDA before 8 am on its due date.
- Late submissions will not be accepted.
- Plagiarism will result in a grade of 0 for the exercise sheets and will be reported to the department. It can have severe consequences, such as a fine and expulsion from the university.
Graded labs:
- During the semester, you will have to hand in four labs.
- Each lab has to be handed in through PANDA before 8 am on its due date.
- Late submissions will not be accepted.
- Plagiarism will result in a grade of 0 for the labs and will be reported to the department. It can have severe consequences, such as a fine and expulsion from the university.
Labs are not required for course achievement. However, you will get the following bonus if you submit labs:
- If you score 70% or more on the labs, you will receive a bonus of 0.3 on your final grade.
- If you score 90% or more on the labs, you will receive a bonus of 0.7 on your final grade.
Final exam:
At the end of the course, you will have the opportunity to register for the oral exam, depending on your grade for the exercise sheets:
- If you scored below 50%, you cannot register for the exam.
- If you scored 50% or more, you can register for the exam.
Prerequisites
The course Designing code analyses for large-scale software systems (DECA) 1 is a required prerequisite. A mature understanding of the Java programming language and object-oriented programming will be helpful.
Syllabus
Topics covered include:
- Sparse IFDS
- SPLLIFT
- Pushdown Systems, WPDS Frameworks
- Demand-Driven Program Analysis
- Synchronised Pushdown Systems, Boomerang
- Introduction to CogniCrypt, FlowDroid
- Handling Reflection
- Hybrid Analysis
- Heapster
- SWAN/SWAN Assist
- Improved User Experience
Throughout the course and the exercise sessions, we will discuss applications to software security.
Learning outcomes
After having attended this course, students will have learned…
- how to make educated design decisions when designing automated code analysis for large-scale software systems,
- which properties the available algorithms have when used to implement static code analyses,
- how to design real-world code analyses for practical problem cases from the area of IT security,
- which current tools are used for program analysis, what their limitations are and where they can be applied.
Recommended reading material
We will not be able to provide lecture notes for this course. We will provide PowerPoint slides where available, but will also develop some concepts on the blackboard. Students are highly encouraged to take their own notes during the lecture.
A lot of the material is also covered in the following books and papers. However, those publications present the material in a more complex manner than the lectures do, so they are best used for deeper personal study.
- Dongjie He, Haofeng Li, Lei Wang, Haining Meng, Hengjie Zheng, Jie Liu, Shuangwei Hu, Lian Li, and Jingling Xue. 2019. Performance-boosting sparsification of the IFDS algorithm with applications to taint analysis. In Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE '19).
- Kadiray Karakaya and Eric Bodden. 2024. Symbol-Specific Sparsification of Interprocedural Distributive Environment Problems. In Proceedings of the 46th International Conference on Software Engineering (ICSE '24).
- Eric Bodden, Társis Tolêdo, Márcio Ribeiro, Claus Brabrand, Paulo Borba, and Mira Mezini. 2013. SPLLIFT: statically analyzing software product lines in minutes instead of years. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '13). Association for Computing Machinery, New York, NY, USA, 355–364.
- Akash Lal, Thomas Reps, and Gogul Balakrishnan. 2005. Extended weighted pushdown systems. In CAV 2005.
- Johannes Späth, Karim Ali, and Eric Bodden. 2019. Context-, Flow-, and Field-sensitive Data-flow Analysis Using Synchronized Pushdown Systems. In Proceedings of the ACM SIGPLAN Symposium on Principles of Programming Languages, 3(POPL), 48:1–48:29.
- Johannes Späth, Lisa Nguyen Quang Do, Karim Ali, and Eric Bodden. 2016. Boomerang: Demand-Driven Flow- and Context-Sensitive Pointer Analysis for Java (Artifact). DOI: 10.4230/DARTS.2.1.12.
- Kadiray Karakaya and Eric Bodden. 2023. Two Sparsification Strategies for Accelerating Demand-Driven Pointer Analysis. In Proceedings of the 16th IEEE International Conference on Software Testing, Verification and Validation (ICST '23).
- Stefan Krüger, Sarah Nadi, Michael Reif, Karim Ali, Mira Mezini, Eric Bodden, Florian Göpfert, Felix Günther, Christian Weinert, Daniel Demmler, and Ram Kamath. 2017. CogniCrypt: supporting developers in using cryptography. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2017). IEEE Press, 931–936.
- Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. FlowDroid: Precise Context, Flow, Field, Object-sensitive and Lifecycle-aware Taint Analysis for Android Apps. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '14). ACM, 259–269.
- Eric Bodden, Andreas Sewe, Jan Sinschek, Hela Oueslati, and Mira Mezini. 2011. Taming reflection: Aiding static analysis in the presence of reflection and custom class loaders. In Proceedings of the 33rd International Conference on Software Engineering (ICSE '11). Association for Computing Machinery, New York, NY, USA, 241–250.
- Goran Piskachev, Lisa Nguyen Quang Do, Oshando Johnson, and Eric Bodden. 2019. SWANAssist: semi-automated detection of code-specific, security-relevant methods. In Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE '19). IEEE Press, 1094–1097.
- Lisa Nguyen Quang Do, Karim Ali, Benjamin Livshits, Eric Bodden, Justin Smith, and Emerson Murphy-Hill. 2017. Just-in-time static analysis. In Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2017). Association for Computing Machinery, New York, NY, USA, 307–317.