SPP Coordination Project: Scalable Visual Analytics
In research and development, as well as in numerous application areas, data sets are growing rapidly while becoming ever more complex and dynamic. A central challenge is to filter out the essential information and communicate it to humans in an appropriate way. Interactive visual data analysis techniques extend human perceptual and cognitive abilities with automatic data analysis techniques. Only the combination of data analysis (data mining) and visualization techniques makes effective access to otherwise unmanageably complex data sets possible. Visual analysis techniques make the unexpected easier to discover and help analysts gain new knowledge and insights.
Visual data analysis techniques are needed in the natural, environmental, geo-, social, engineering and economic sciences to gain understanding and to optimize and steer complicated processes. Approaches that work on either a purely analytical or a purely visual level fall short because of the dynamics and complexity of the underlying processes or because of intelligent adversaries. Examples are network traffic analysis (Internet, e-mail), the management of environmental and health data, and web dynamics and social networks. The multitude of research fields and applications in which visual analysis techniques are urgently needed demonstrates their scientific relevance and potential economic utility. One example is the purely automatic, unsupervised classification of Internet and e-mail traffic, which works only for a short time in each case, because senders immediately adapt their messages to the filter algorithms in order to bypass them. In this case, as in numerous others, combined automatic and visual analysis is the only chance to capture the complex, changing characteristics of the data and to take suitable measures.
In the future, visual analysis techniques must satisfy a multitude of new requirements that result from rapid technological development in hardware, software and network infrastructure. Besides high-dimensional data, continuous data streams are emerging that must be evaluated immediately or within given time frames, which places high demands on data analysis and visualization. The context-dependent graphical representation of relevant information from a large, fast-growing and increasingly complex total volume places new demands on the scalability of the techniques in particular. One goal of the priority program is scalable visual analysis systems that connect automatic data analysis methods with interactive visualization techniques and can be integrated smoothly into custom-designed processes for the exploration and analysis of complex information spaces.
Automatic data analysis is an important component of visual analysis. Fully automatic techniques, however, work only if the problem is narrowly delimited and can be clearly specified. In situations with high dynamics, or where an intelligent adversary tries to circumvent the data analysis methods, automatic procedures usually do not solve the problem. In such situations the analyst is required to adapt the data analysis methods or to develop new procedures. Here the visualization of the data plays an important role, since it gives a better understanding of the data and makes it possible to gain new insights and to interact effectively with the data analysis methods. Interaction lets users bring their expert knowledge into the data analysis process. However, visualization techniques alone do not solve the data analysis problem either, since visual techniques can represent only data of limited volume and moderate complexity.
A central challenge is scalability, which refers not only to the quantity of the input data but also to important characteristics of the data such as dimensionality, production rate, homogeneity, timeliness, precision and completeness. Beyond that, the visual analysis techniques themselves should be scalable: they should not only be interactive and easy to use, but should also visually convey the quality and relevance of the data and in this way safeguard the quality of the knowledge gained. Since visual data analysis is an interactive process, methods must be developed that not only produce meaningful, illustrative representations but also allow a high degree of interaction. This applies both to controlling the visual representation and the visualization process and to interacting with the data.
In the proposed priority program, new concepts and methods for visual analysis techniques are to be developed that fulfill present and future requirements. This requires close cooperation between scientists from different fields of computer science: visual analysis methods can only be developed in close coordination between the fields of visualization, data analysis and interaction. Beyond that, the priority program is open to all fields that can contribute innovatively to the overall topic, for example statistical analysis, geodata analysis and perceptual psychology. Scientists from these and other fields are to be included in the priority program insofar as their research contributes to the development of new visual analysis methods.
Scientific Goals of the Priority Program
The primary goal in the field of scalable visual data analysis is to represent real or abstract data graphically in such a way that structural relationships and relevant characteristics of the data can be grasped easily. This is intended to support the exploration of unknown data. In the proposed priority program, the focus is on the conceptual development, the software implementation, and the custom-designed integration and evaluation of scalable visual analysis techniques.
- AG Keim (Data Analysis and Visualization)
Period: 22.08.2008 – 21.08.2011