# DisDaS: Distributed Data Streams

In our project group "DisDaS: Distributed Data Streams", we plan to lay the foundations for the design and analysis of distributed algorithms that continuously compute functions of streams of data which are observed by a multitude of data sources. The major challenge is to cope with the huge amount of data generated. Typically, the data streams are too big and arrive too fast to be completely stored, or sent to a server through a network, or processed in real time. Thus we have to find ways to extract useful information from the streams using only very limited resources, like memory, communication volume and computation time.

## The Scenario

The most commonly used model in distributed streaming algorithms is the continuous monitoring model due to Cormode & Muthukrishnan. In this model n nodes observe a (possibly infinite) data stream and are able to communicate to one central coordinator which computes a function based on the union of all data streams. The goal is to minimize the communication between the nodes and the coordinator. Various problems and results were investigated in this model during the past decades, e.g. the countdown or threshold monitoring problem or, in general, functional monitoring problems. In this project group, we want to generalize the model leading to new research questions.

While the continuous monitoring model focuses on the case with a single coordinator, this model connects n data sources to multiple coordinators by a potentially complex topology.

Also, new problems to investigate arise when the roles of nodes and the coordinator change in a natural way: the coordinator is no longer interested in data arriving at the nodes but instead, the nodes (clients) request data from the coordinator who is acting as a server.

## Results of the Seminar

Interval Selection in the streaming model | Pascal Bemmann | Seminar | Slides | |
---|---|---|---|---|

Multi-Dimensional Online Tracking | Steffen Knorr | Seminar | Slides | |

Frequency Moments - Approximations and Space Complexity | Felix Biermeier | Seminar | Slides | |

Estimating Simple Functions on the Union of Data Streams | Till Knollmann | Seminar | Slides | |

On Graph Problems in a Semi-Streaming Model | Arne Kemper | Seminar | Slides | |

The Count-Min Sketch and its Applications | Jannik Sundermeier | Seminar | Slides | |

Palindrome Recognition in The Streaming Model | Jan Bürmann | Seminar | Slides | |

Randomized Algorithms for Tracking Distributed Count, Frequencies, and Ranks | Nils Kothe | Seminar | Slides |