BISON, RTD research project
- Department of Computer and Information Science
(2012): Konzepterkennung in Informationsnetzwerken [Concept Detection in Information Networks] |
Concepts play an important role in our daily lives, even if only unconsciously. |
|
(2012): Bisociative Knowledge Discovery |
Modern knowledge discovery methods enable users to discover complex patterns of various types in large information repositories. However, the underlying assumption has always been that the data to which the methods are applied originates from one domain. |
|
(2012): Towards Discovery of Subgraph Bisociations. In: Bisociative Knowledge Discovery / Berthold, Michael R. (ed.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2012. - (Lecture Notes in Computer Science ; 7250). - pp. 263-284. - ISBN 978-3-642-31829-0 |
The discovery of surprising relations in large, heterogeneous information repositories is gaining increasing importance in real-world data analysis. If these repositories come from diverse origins, forming different domains, domain-bridging associations between otherwise weakly connected domains can provide insights into the data that are not accomplished by aggregative approaches. In this paper, we propose a first formalization for the detection of such potentially interesting, domain-crossing relations based purely on structural properties of a relational knowledge description. |
|
(2012): Node Similarities from Spreading Activation. In: Bisociative Knowledge Discovery / Berthold, Michael R. (ed.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2012. - (Lecture Notes in Computer Science ; 7250). - pp. 246-262. - ISBN 978-3-642-31829-0 |
In this paper we propose two methods to derive different kinds of node-neighborhood-based similarities in a network. The first similarity measure focuses on the overlap of direct and indirect neighbors. The second similarity compares nodes based on the structure of their possibly also very distant neighborhoods. Both similarities are derived from spreading activation patterns over time. Whereas in the first method the activation patterns are directly compared, in the second method the relative change of activation over time is compared. We applied both methods to a real-world graph dataset and discuss some of the results in more detail. |
|
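The first idea in the abstract above, comparing activation patterns directly, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the formulation used in the chapter: the linear spreading rule, the per-step normalization, the `decay` weighting, the cosine comparison, and the toy graph are all hypothetical choices.

```python
import numpy as np

def activation_pattern(adj, start, steps=3, decay=0.5):
    """Accumulate decayed, normalized activation vectors spread from `start`.

    Assumed scheme: activation is pushed along edges (adj @ a), rescaled to
    unit length after every step, and summed with weight decay**t.
    """
    n = adj.shape[0]
    a = np.zeros(n)
    a[start] = 1.0
    total = np.zeros(n)
    for t in range(1, steps + 1):
        a = adj @ a                      # spread activation to neighbors
        norm = np.linalg.norm(a)
        if norm == 0:                    # isolated node: nothing left to spread
            break
        a = a / norm
        total += (decay ** t) * a
    return total

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Toy undirected graph: nodes 0 and 1 both link to 2 and 3; node 4 hangs off 3.
adj = np.array([
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

patterns = [activation_pattern(adj, v) for v in range(adj.shape[0])]
sim_01 = cosine(patterns[0], patterns[1])   # identical neighborhoods
sim_04 = cosine(patterns[0], patterns[4])   # overlapping but different neighborhoods
```

Nodes 0 and 1 are not adjacent but share exactly the same neighbors, so their accumulated patterns coincide; node 4 reaches the same region of the graph along a different path and scores lower.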
(2012): From Information Networks to Bisociative Information Networks. In: Bisociative Knowledge Discovery / Berthold, Michael R. (ed.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2012. - (Lecture Notes in Computer Science ; 7250). - pp. 33-50. - ISBN 978-3-642-31829-0 |
The integration of heterogeneous data from various domains without the need for prefiltering prepares the ground for bisociative knowledge discoveries where attempts are made to find unexpected relations across seemingly unrelated domains. Information networks, due to their flexible data structure, lend themselves perfectly to the integration of these heterogeneous data sources. This chapter provides an overview of different types of information networks and categorizes them by identifying several key properties of information units and relations which reflect the expressiveness and thus ability of an information network to model heterogeneous data from diverse domains. The chapter progresses by describing a new type of information network known as bisociative information networks. This kind of network combines the key properties of existing networks in order to provide the foundation for bisociative knowledge discoveries. Finally, based on this data structure, three different patterns are described that fulfill the requirements of a bisociation by connecting concepts from seemingly unrelated domains. |
|
(2012): On the Integration of Graph Exploration and Data Analysis : the Creative Exploration Toolkit. In: Bisociative Knowledge Discovery / Berthold, Michael R. (ed.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2012. - (Lecture Notes in Computer Science ; 7250). - pp. 301-312. - ISBN 978-3-642-31829-0 |
To enable discovery in large, heterogeneous information networks, a tool is needed that allows exploration in changing graph structures and integrates advanced graph mining methods in an interactive visualization framework. We present the Creative Exploration Toolkit (CET), which consists of a state-of-the-art user interface for graph visualization designed for explorative tasks and support tools for integration and communication with external data sources and mining tools, especially the data-mining platform KNIME. All parts of the interface can be customized to fit the requirements of special tasks, including the use of node-type-dependent icons and the highlighting of nodes and clusters. Through an evaluation we have shown the applicability of CET for structure-based analysis tasks. |
|
(2012): (Missing) Concept Discovery in Heterogeneous Information Networks. In: Bisociative Knowledge Discovery / Berthold, Michael R. (ed.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2012. - (Lecture Notes in Computer Science ; 7250). - pp. 230-245. - ISBN 978-3-642-31829-0 |
This article proposes a new approach to extract existing (or detect missing) concepts from a loosely integrated collection of information units by means of concept graph detection. A concept graph defines a concept by a quasi-bipartite subgraph of a bigger network, with the members of the concept as the first vertex partition and their shared aspects as the second vertex partition. Once the concepts have been extracted, they can be used to create higher-level representations of the data. Concept graphs further allow the discovery of missing concepts, which could lead to new insights by connecting seemingly unrelated information units. |
|
(2012): Towards Creative Information Exploration Based on Koestler's Concept of Bisociation. In: Bisociative Knowledge Discovery / Berthold, Michael R. (ed.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2012. - (Lecture Notes in Computer Science ; 7250). - pp. 11-32. - ISBN 978-3-642-31829-0 |
Creative information exploration refers to a novel framework for exploring large volumes of heterogeneous information. In particular, creative information exploration seeks to discover new, surprising and valuable relationships in data that would not be revealed by conventional information retrieval, data mining and data analysis technologies. While our approach is inspired by work in the field of computational creativity, we are particularly interested in a model of creativity proposed by Arthur Koestler in the 1960s. Koestler’s model of creativity rests on the concept of bisociation. Bisociative thinking occurs when a problem, idea, event or situation is perceived simultaneously in two or more “matrices of thought” or domains. When two matrices of thought interact with each other, the result is either their fusion in a novel intellectual synthesis or their confrontation in a new aesthetic experience. This article discusses some of the foundational issues of computational creativity and bisociation in the context of creative information exploration. |
|
(2012): Towards Bisociative Knowledge Discovery. In: Bisociative Knowledge Discovery / Berthold, Michael R. (ed.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2012. - (Lecture Notes in Computer Science ; 7250). - pp. 1-10. - ISBN 978-3-642-31829-0 |
Knowledge discovery generally focuses on finding patterns within a reasonably well-connected domain of interest. In this article we outline a framework for the discovery of new connections between domains (so-called bisociations), supporting the creative discovery process in a more powerful way. We motivate this approach, show how it differs from classical data analysis, and conclude by describing a number of different types of domain-crossing connections. |
|
(2011): Mining fault-tolerant item sets using subset size occurrence distributions. In: Advances in Intelligent Data Analysis X / Gama, João; Bradley, Elizabeth; Hollmén, Jaakko (eds.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2011. - (Lecture Notes in Computer Science ; 7014). - pp. 43-54. - ISBN 978-3-642-24799-6 |
Mining fault-tolerant (or approximate or fuzzy) item sets means allowing for errors in the underlying transaction data, in the sense that items that are actually present may not be recorded due to noise or measurement errors. In order to cope with such missing items, transactions that do not contain all items of a given set are still allowed to support it. However, either the number of missing items must be limited, or the transaction's contribution to the item set's support is reduced in proportion to the number of missing items, or both. In this paper we present an algorithm that efficiently computes the subset size occurrence distribution of item sets, evaluates this distribution to find fault-tolerant item sets, and exploits intermediate data to remove pseudo (or spurious) item sets. We demonstrate the usefulness of our algorithm by applying it to a concept detection task on the 2008/2009 Wikipedia Selection for schools. |
|
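The relaxed support notion described in the abstract above can be made concrete with a small, deliberately naive sketch. It is not the efficient algorithm from the paper: it scans every transaction for a single item set. The function names, the geometric `penalty` weighting, and the toy transactions are assumptions chosen for illustration.

```python
from collections import Counter

def subset_size_distribution(transactions, item_set):
    """For k = 0..|item_set|, count transactions containing exactly k of its items."""
    items = frozenset(item_set)
    counts = Counter(len(items & set(t)) for t in transactions)
    return [counts.get(k, 0) for k in range(len(items) + 1)]

def fault_tolerant_support(dist, max_missing=1, penalty=0.5):
    """Evaluate a subset size occurrence distribution.

    Transactions missing more than `max_missing` items are ignored; the rest
    contribute penalty**missing (an assumed weighting scheme).
    """
    size = len(dist) - 1
    support = 0.0
    for k, n in enumerate(dist):
        missing = size - k
        if missing <= max_missing:
            support += n * (penalty ** missing)
    return support

transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b", "d"},
    {"a", "b", "c", "d"},
]
dist = subset_size_distribution(transactions, {"a", "b", "c"})
support = fault_tolerant_support(dist, max_missing=1, penalty=0.5)
print(dist, support)   # [0, 1, 2, 2] 3.0
```

Two transactions contain the full set and count fully, two miss one item and count half each, and the one that misses two items is excluded, which gives a support of 3.0 where the standard definition would give 2.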
(2011): New algorithms for finding approximate frequent item sets. In: Soft Computing ; 16 (2011), 5. - pp. 903-917. - ISSN 1432-7643. - eISSN 1433-7479 |
In standard frequent item set mining a transaction supports an item set only if all items in the set are present. However, in many cases this is too strict a requirement that can render it impossible to find certain relevant groups of items. By relaxing the support definition, allowing for some items of a given set to be missing from a transaction, this drawback can be amended. The resulting item sets have been called approximate, fault-tolerant or fuzzy item sets. In this paper we present two new algorithms to find such item sets. The first is an extension of item set mining based on cover similarities; it computes and evaluates the subset size occurrence distribution with a scheme that is related to the Eclat algorithm. The second employs a clustering-like approach, in which the distances are derived from the item covers with distance measures for sets or binary vectors and which is initialized with a one-dimensional Sammon projection of the distance matrix. We demonstrate the benefits of our algorithms by applying them to a concept detection task on the 2008/2009 Wikipedia Selection for schools and to the neurobiological task of detecting neuron ensembles in (simulated) parallel spike trains. |
|
(2010): Network ensemble clustering using latent roles. In: Advances in Data Analysis and Classification ; 5 (2010), 2. - pp. 81-94. - ISSN 1862-5347 |
We present a clustering method for collections of graphs based on the assumptions that graphs in the same cluster have a similar role structure and that the respective roles can be founded on implicit vertex types. Given a network ensemble (a collection of attributed graphs with some substantive commonality), we start by partitioning the set of all vertices based on attribute similarity. Projection of each graph onto the resulting vertex types yields feature vectors of equal dimensionality, irrespective of the original graph sizes. These feature vectors are then subjected to standard clustering methods. This approach is motivated by social network concepts, and we demonstrate its utility on an ensemble of personal networks of migrants, where we extract structurally similar groups and show their resemblance to predicted acculturation strategies. |
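The projection step in the abstract above, which turns graphs of different sizes into feature vectors of equal dimensionality, might look roughly like the following sketch. It assumes the vertex types are already given (the paper derives them by partitioning vertices on attribute similarity) and uses the share of edges per type pair as the feature; both the helper name `role_feature_vector` and that feature choice are hypothetical.

```python
import numpy as np

def role_feature_vector(edges, vertex_type, n_types):
    """Project one undirected graph onto vertex types.

    Returns the share of edges falling between each unordered pair of types,
    so the vector length depends only on n_types, not on the graph size.
    """
    counts = np.zeros((n_types, n_types))
    for u, v in edges:
        a, b = sorted((vertex_type[u], vertex_type[v]))
        counts[a, b] += 1
    upper = counts[np.triu_indices(n_types)]   # one entry per unordered type pair
    total = upper.sum()
    return upper / total if total > 0 else upper

# Two toy graphs of different size, each with vertices labeled by type 0 or 1.
small_edges = [(0, 1), (1, 2)]
small_types = {0: 0, 1: 1, 2: 1}
large_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
large_types = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1}

f_small = role_feature_vector(small_edges, small_types, n_types=2)
f_large = role_feature_vector(large_edges, large_types, n_types=2)
print(f_small, f_large)   # [0.  0.5 0.5] [0.2 0.4 0.4]
```

Both vectors have three entries (type pairs 0-0, 0-1, 1-1), so they can be passed to any standard clustering method regardless of how many vertices each graph has.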
Name | Project no. | Description | Period |
---|---|---|---|
Cooperation | 757/08 | | 01.06.2008 – 31.05.2011 |