Theses
If you are interested in writing a Bachelor's or Master's thesis, please send me an e-mail (maximilian.schuele(at)uni-bamberg.de) with your transcript of records attached and your preferred topic. To qualify, I ask you to implement a B-tree and send me the result as specified here: db.in.tum.de/teaching/theses/hiwitest/
Topics requiring C++
SQL++: Extending Database Systems by Building Blocks
The idea is to expose building blocks of database systems, for example hash tables, to data mining and machine learning algorithms. The thesis should build upon an existing open-source database system, for example Hyrise, in which you first write selected algorithms in SQL. For example, you can perform gradient descent iteratively using recursive CTEs. Afterwards, you improve the performance of the algorithms by adding suboperators to the database system, for example an operator for iterations (such as a trampoline) that consumes less memory. The building blocks should be accessible through an extension of SQL by user-defined functions (UDFs), called SQL++.
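The memory argument can be illustrated outside a database system. The following is a minimal C++ sketch, not part of Hyrise or of any existing SQL++ implementation: all names (`Weights`, `step`, `descend_materialized`, `descend_iterative`) are hypothetical. It assumes a simple linear model y = a * x + b and contrasts a loop that keeps every intermediate result, roughly as a recursive CTE accumulates rows, with a loop that keeps only the current state, as a dedicated iteration operator could.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration: a model y = a * x + b and a sample.
struct Weights {
    double a;
    double b;
};

struct Point {
    double x;
    double y;
};

// One gradient descent step on the mean squared error.
Weights step(const Weights& w, const std::vector<Point>& data, double lr) {
    assert(!data.empty());
    double grad_a = 0.0;
    double grad_b = 0.0;
    for (const Point& p : data) {
        const double err = w.a * p.x + w.b - p.y;
        grad_a += 2.0 * err * p.x;
        grad_b += 2.0 * err;
    }
    const double n = static_cast<double>(data.size());
    return {w.a - lr * grad_a / n, w.b - lr * grad_b / n};
}

// Recursive-CTE style: the result of every iteration is appended,
// so memory grows linearly with the number of iterations.
std::vector<Weights> descend_materialized(const std::vector<Point>& data,
                                          Weights init, double lr,
                                          std::size_t iterations) {
    std::vector<Weights> history{init};
    for (std::size_t i = 0; i < iterations; ++i) {
        history.push_back(step(history.back(), data, lr));
    }
    return history;
}

// Iteration-operator style: only the current state is kept,
// so memory stays constant regardless of the number of iterations.
Weights descend_iterative(const std::vector<Point>& data, Weights init,
                          double lr, std::size_t iterations) {
    Weights current = init;
    for (std::size_t i = 0; i < iterations; ++i) {
        current = step(current, data, lr);
    }
    return current;
}
```

Both functions compute the same final weights; they differ only in how much intermediate state they retain, which is the property an iteration suboperator would exploit.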
SQL Compiler for LeanStore
LeanStore is an open-source system for OLTP and OLAP workloads but lacks an SQL interface. The goal of this thesis is to write a query compiler in C++.
Implementation of Higher-Order SQL Lambda Functions
Instead of extracting runnable code and data out of a database system, we propose higher-order SQL lambda functions for in-database execution.
SQL lambda expressions have been introduced to let the user customise otherwise hard-coded data mining operators such as the distance function for k-means clustering.
However, database systems parse lambda expressions during the semantic analysis, which does not allow for functions as arguments.
- Prototype in C++/Java
- Integration into PostgreSQL/DuckDB
- Integration into LingoDB
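What "functions as arguments" means for a customisable operator can be sketched in plain C++. The example below is an illustration only and does not reflect the API of PostgreSQL, DuckDB or LingoDB: `Point2D`, `make_assigner`, `squared_euclidean` and `manhattan` are hypothetical names. It shows a higher-order function that takes a distance function and returns the assignment step of k-means specialised for that distance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical two-dimensional tuple for illustration.
struct Point2D {
    double x;
    double y;
};

using Distance = std::function<double(const Point2D&, const Point2D&)>;
using Assigner = std::function<std::size_t(const Point2D&)>;

// Higher-order function: takes a distance function as an argument and
// returns a function that maps a point to the index of its nearest centre.
Assigner make_assigner(std::vector<Point2D> centers, Distance distance) {
    assert(!centers.empty());
    return [centers = std::move(centers),
            distance = std::move(distance)](const Point2D& p) {
        std::size_t best = 0;
        for (std::size_t i = 1; i < centers.size(); ++i) {
            if (distance(p, centers[i]) < distance(p, centers[best])) {
                best = i;
            }
        }
        return best;
    };
}

// Two interchangeable distance functions a user might supply.
double squared_euclidean(const Point2D& a, const Point2D& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    return dx * dx + dy * dy;
}

double manhattan(const Point2D& a, const Point2D& b) {
    return std::abs(a.x - b.x) + std::abs(a.y - b.y);
}
```

With centres (3, 0) and (2, 2), the point (0, 0) is assigned to the second centre under the squared Euclidean distance and to the first under the Manhattan distance, so the operator's behaviour is determined entirely by the function passed in.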
Code-Generation for GPU Database Systems
Modern database systems generate code for a query instead of interpreting the operator tree through function calls. In this thesis, you generate code that runs on GPUs and investigate how SIMT (single instruction, multiple threads) can accelerate query processing.
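The difference between interpretation and code generation can be sketched on a tiny expression tree. The following C++ example is a simplified illustration under stated assumptions, not the design of any particular system: the classes `Expr`, `Column`, `Constant` and `Binary` and the function `generate_function` are hypothetical, the generator emits plain C-like source text rather than GPU code, and the step of compiling and launching the emitted code is omitted.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical expression tree over a single input column x.
struct Expr {
    virtual ~Expr() = default;
    // Interpretation: one virtual call per tree node for every tuple.
    virtual double evaluate(double x) const = 0;
    // Code generation: source text is emitted once per query.
    virtual std::string generate() const = 0;
};

struct Column : Expr {
    double evaluate(double x) const override { return x; }
    std::string generate() const override { return "x"; }
};

struct Constant : Expr {
    int value;
    explicit Constant(int v) : value(v) {}
    double evaluate(double) const override { return value; }
    std::string generate() const override { return std::to_string(value); }
};

struct Binary : Expr {
    char op;  // '+' or '*'
    std::unique_ptr<Expr> lhs;
    std::unique_ptr<Expr> rhs;

    Binary(char o, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        if (op != '+' && op != '*') {
            throw std::invalid_argument("unsupported operator");
        }
    }

    double evaluate(double x) const override {
        const double l = lhs->evaluate(x);
        const double r = rhs->evaluate(x);
        return op == '+' ? l + r : l * r;
    }

    std::string generate() const override {
        return "(" + lhs->generate() + " " + op + " " + rhs->generate() + ")";
    }
};

// Wraps the generated expression into a function definition that a
// compiler could turn into straight-line code without virtual calls.
std::string generate_function(const Expr& e) {
    return "double eval(double x) { return " + e.generate() + "; }";
}
```

For the expression x * 2 + 1, `evaluate` walks the tree for each input value, whereas `generate_function` produces a single fused function body. On a GPU, such a generated body could be executed by many threads in lockstep, each on a different tuple, which is the SIMT aspect the thesis would investigate.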
Versioning based on the ARIES protocol
Let's adapt the ARIES recovery protocol to make versions of tuples visible.