A Graph Query Processor for Queries of Class CRPQagg

Description

Analyzing social networks, querying the web of data, managing knowledge networks, studying the interaction of proteins, planning transportation grids, and routing network traffic are all problems that work with graph data. Since graphs with millions or even billions of nodes and edges are not uncommon, many applications require data management and processing techniques to handle these graph data instances.

Graph data applications present new database research challenges that arise from the fact that the graph application domain is more heterogeneous than, for example, the automated business domain for which most databases management systems are traditionally used. First, graph processing tasks may vary between application domains as, for example, determining the standing of a user in a social network is different from planning a route in a transportation grid and analyzing the interaction of proteins. Second, different graph topologies can have an impact on how a graph is best processed. Even in simplified examples, we can observe that, in a social network, the average node degree—the average number of incident edges over all nodes—is typically high and the diameter—the longest shortest path in the graph—is small, while in a transportation network the opposite is true.

As a consequence, an algorithm that computes the closest connection between two users might fail to efficiently compute a shortest route between two cities.

In this project, we will address these requirements in the tradition of database research by designing and developing a general query processor for graph databases. The research conducted in this project will be structured into two streams of work. The first stream of work will approach the problem “top-down” by starting from concrete graph query languages that belong to a well-defined class of query languages. The functionality of popular query languages in this class will be analyzed in order to formally define a common graph data model and a unified algebra of operators. The second stream of work will address the problem “bottom-up” by starting from existing algorithms for graph operations and index structures. The dependencies between the characteristics of a graph and the possible algorithms to accomplish a certain processing task will be studied systematically. Based on this empirical study, an analytical cost model will be derived that will tie the results of both streams of work together in order to build a query processor that translates logical graph processing tasks into optimized physical execution plans.

Institutions
  • WG Grossniklaus (Database and Information Systems)
Funding sources
Name Finanzierungstyp Kategorie Project no.
Sachbeihilfe/Normalverfahren third-party funds research funding program 422/15
Further information
Period: 09.01.2015 – 30.11.2019