DOnEReCA

Data-driven ontology engineering with Relational Concept Analysis

Abstract

Data can successfully support ontology engineering tasks such as design or maintenance, provided it has been properly analyzed to discover possible trends and/or groups. For instance, when an ontology is designed from a relational database (RDB), a first (rough) ontology can be enhanced by the result of a conceptual clustering that reveals missing classes, and even properties, in that ontology. Similarly, when populating an existing ontology with independently created data, one might want to determine how well the data fit the ontology w.r.t. the mapping of resources to ontology classes. This warrants an analysis of data descriptions to detect characteristic associations between ontology types, on the one hand, and the resources' own descriptions in terms of properties, on the other hand, which might reveal anomalous configurations.

Formal Concept Analysis (FCA) provides a knowledge discovery framework enabling both (1) conceptual clustering of data objects and (2) pattern/association discovery. It was conceived as a mathematical approach to the design of concept hierarchies (called concept lattices) from sets of observations (introduced as object x attribute tables, called formal contexts). FCA, like most data mining approaches, focuses on a single data table, whereas Linked Data are inherently multi-table, a.k.a. multi-relational. Relational Concept Analysis (RCA) is a multi-relational data mining (MRDM) method extending FCA.

Relational Concept Analysis

To bring the mathematical strength of FCA to the realm of multi-relational data, and hence RDF and Linked Data, RCA admits a set of contexts, i.e. multiple object sorts, as well as binary relations between object sorts. To discover plausible concepts from such datasets, a propositionalization mechanism called scaling is used to refine object descriptions as per the input contexts: Description Logic-inspired relational scaling operators replace inter-object links with restriction-like attributes, called relational attributes, that refer to concepts from the range context. Potential cycles in the data are dealt with by an iterative fix-point computation that gradually expands the ordinary concept lattices with relational attributes. As RCA fix-point lattices reflect the refined contexts much in the same way as with FCA, clusters and patterns are drawn from them by existing FCA methodologies. Cycles are, in turn, resolved by expanding concept descriptions in a minimal fashion.
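A single scaling step can be illustrated with a minimal sketch. It assumes the existential quantifier as the scaling operator and represents each concept of the range context only by its extent; the relation `eats`, the concept names and the data are all hypothetical and serve only as an illustration.

```python
# Hypothetical range-context concepts, each given by its extent (set of objects).
RANGE_CONCEPTS = {
    "C_all": {"salad", "soup", "steak"},
    "C_veg": {"salad", "soup"},
    "C_meat": {"steak"},
}

# Hypothetical relation "eats": domain object -> linked range objects.
EATS = {
    "ann": {"salad"},
    "bob": {"steak", "soup"},
    "eve": set(),
}


def existential_scaling(relation, range_concepts, relation_name):
    """Give each domain object the attribute 'exists r.C' whenever at least
    one of its r-links points into the extent of range concept C."""
    scaled = {}
    for obj, targets in relation.items():
        scaled[obj] = {
            f"exists {relation_name}.{concept_id}"
            for concept_id, extent in range_concepts.items()
            if targets & extent  # non-empty intersection: some link lands in C
        }
    return scaled
```

With this data, `existential_scaling(EATS, RANGE_CONCEPTS, "eats")` gives `ann` the attributes `exists eats.C_all` and `exists eats.C_veg`, while `eve`, who has no links, receives no relational attribute.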

RCA has been applied to practical problems from a wide range of fields such as software engineering, hydroecology, neurology, data interlinking, and linguistics. In this tutorial, we will focus on the way RCA can support various ontology engineering tasks. First, we give the audience an understanding of the mathematical foundations of the RCA method and of the algorithms used in the iterative lattice construction. We present existing tools as well as examples of RCA applications from the literature. In the second part, the focus will be on the intricate links between RCA and ontologies. We present a small number of ontology engineering scenarios and show how RCA-based tools support them through proper analysis of the data.

Outline Of The Tutorial

Part 1 : Theory

Formal Concept Analysis

After providing some background on lattices, we show how FCA organizes objects into a lattice of conceptually described clusters. We also present the way the lattice yields patterns and association rules, and introduce interest metrics to score these.
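The notions above can be made concrete with a small sketch. It enumerates the formal concepts of a toy context by brute force (closing every subset of objects), then scores association rules with support and confidence, which are assumed here as the interest metrics. The context and all function names are hypothetical, and the enumeration is only practical for tiny examples.

```python
from itertools import combinations

# Hypothetical formal context: object -> set of attributes it has.
CONTEXT = {
    "duck": {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "trout": {"swims", "lays_eggs"},
    "bat": {"flies"},
}


def intent_of(objects, context):
    """Attributes shared by all given objects (all attributes if none given)."""
    attrs = set().union(*context.values())
    for obj in objects:
        attrs &= context[obj]
    return attrs


def extent_of(attributes, context):
    """Objects that have every one of the given attributes."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def concepts(context):
    """All (extent, intent) pairs, found by closing every subset of objects."""
    found = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = intent_of(subset, context)
            found.add((frozenset(extent_of(intent, context)), frozenset(intent)))
    return found


def support(attributes, context):
    """Fraction of objects that have all the given attributes."""
    return len(extent_of(set(attributes), context)) / len(context)


def confidence(premise, conclusion, context):
    """Among objects satisfying the premise, the fraction also satisfying the conclusion."""
    covered = extent_of(set(premise), context)
    if not covered:
        raise ValueError("premise holds for no object")
    both = extent_of(set(premise) | set(conclusion), context)
    return len(both) / len(covered)
```

On this context the lattice has six concepts. The rule `swims -> lays_eggs` has confidence 1.0, i.e. it is an implication, whereas `flies -> lays_eggs` holds for only two of the three flying objects.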

Relational Concept Analysis

We show how RCA encodes Linked Data into a family of contexts and object-to-object binary relations. We then explain the bootstrapping step of the iterative RCA process, i.e. the building of the initial lattices, and clarify the way the evolving lattices on individual contexts interact with each other. In particular, we illustrate the relational scaling mechanism, i.e. how, given a relation between two contexts and a scaling schema (roughly, a logical quantifier), concepts on the range context are turned into predicate-like attributes of the domain context. Finally, we discuss various ways to extract patterns and associations from class definitions while avoiding potential cycles.
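The iterative process described above can be sketched as follows. This is a simplified illustration, not the tutorial's algorithm: it assumes existential scaling only, identifies each concept by its extent so that attribute names stay stable across iterations, and recomputes lattices by brute force. The contexts, relations and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical family of contexts: context name -> {object -> attributes}.
CONTEXTS = {
    "person": {"ann": {"driver"}, "bob": {"driver"}},
    "car": {"tesla": {"electric"}, "golf": {"petrol"}},
}

# Hypothetical relations: (name, domain context, range context) -> links.
# "owns" and "ownedBy" form a cycle between the two contexts.
RELATIONS = {
    ("owns", "person", "car"): {"ann": {"tesla"}, "bob": {"golf"}},
    ("ownedBy", "car", "person"): {"tesla": {"ann"}, "golf": {"bob"}},
}


def extents(context):
    """All concept extents of one context, by closing every subset of objects."""
    objects = sorted(context)
    all_attrs = set().union(*context.values())
    found = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = set(all_attrs)
            for obj in subset:
                intent &= context[obj]
            found.add(frozenset(o for o in objects if intent <= context[o]))
    return found


def rca_fixpoint(contexts, relations):
    """Alternate relational scaling and lattice construction until stable."""
    working = {n: {o: set(a) for o, a in ctx.items()} for n, ctx in contexts.items()}
    # Bootstrapping step: initial lattices from the plain contexts.
    lattices = {name: extents(ctx) for name, ctx in working.items()}
    while True:
        for (rel, source, target), links in relations.items():
            for extent in lattices[target]:
                if not extent:
                    continue  # the empty extent can never be reached by a link
                members = ",".join(sorted(extent))
                label = f"exists {rel}.{{{members}}}"
                for obj in working[source]:
                    if links.get(obj, set()) & extent:
                        working[source][obj].add(label)
        updated = {name: extents(ctx) for name, ctx in working.items()}
        if updated == lattices:  # no new concept anywhere: fix-point reached
            return working, lattices
        lattices = updated
```

In this toy data, `ann` and `bob` have identical descriptions at bootstrap, so the person lattice has a single concept. After scaling `owns` against the car lattice they are separated, and the new person concepts in turn refine the car descriptions through `ownedBy` on the next iteration. The loop terminates because attributes are only ever added and the set of possible extents is finite.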

Part 2 : Applications

Ontology design and refinement

We show how, given a dataset and an ontology, the latter can be refined with RCA output, which can suggest new classes that specialize existing ontology classes.
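One way to read such a suggestion mechanism is sketched below, under strong simplifying assumptions: concepts are given only by their extents, and a concept is proposed as a candidate subclass when its extent is a non-empty strict subset of a class's instances and matches no existing class. The function name and the data are hypothetical, and the tutorial's actual method may differ.

```python
def suggest_subclasses(class_instances, concept_extents):
    """For each ontology class, list concept extents that could be new subclasses.

    class_instances: class name -> set of instance identifiers.
    concept_extents: iterable of frozensets (extents of discovered concepts).
    """
    known = {frozenset(members) for members in class_instances.values()}
    suggestions = {}
    for cls, members in class_instances.items():
        members = frozenset(members)
        candidates = [
            extent
            for extent in concept_extents
            # non-empty, strictly smaller than the class, and not already a class
            if extent and extent < members and extent not in known
        ]
        if candidates:
            suggestions[cls] = sorted(candidates, key=lambda e: (len(e), sorted(e)))
    return suggestions


# Hypothetical ontology population and concept extents.
CLASSES = {"Vehicle": {"tesla", "golf", "bike1"}, "Bicycle": {"bike1"}}
EXTENTS = [
    frozenset({"tesla", "golf"}),
    frozenset({"bike1"}),
    frozenset({"tesla", "golf", "bike1"}),
]
```

Here only the extent `{tesla, golf}` is proposed, as a specialization of `Vehicle`: `{bike1}` already corresponds to `Bicycle`, and the full extent coincides with `Vehicle` itself.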

Static Analysis

We show how RCA detects discrepancies between data and the ontology classes these data are assigned to. By pointing out such discrepancies, RCA acts as a recommendation mechanism for the ontologist, suggesting, among others, missing properties and/or property restrictions in a class, potential assignments of data to a more specialized subclass, and potential missing specializations of an existing class.
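The flavor of such a recommendation can be conveyed with a deliberately flat sketch that does not use RCA itself: for each class, it looks for properties carried by most but not all instances, and reports the instances that lack them. The threshold, the function name and the data are hypothetical assumptions for illustration.

```python
def missing_property_report(typed_data, threshold=0.75):
    """Report properties that are frequent in a class but absent from some instances.

    typed_data: resource -> (class name, set of properties the resource uses).
    Returns tuples (class, property, ratio of instances having it, instances lacking it).
    """
    by_class = {}
    for resource, (cls, props) in typed_data.items():
        by_class.setdefault(cls, {})[resource] = props
    report = []
    for cls, members in sorted(by_class.items()):
        all_props = set().union(*members.values())
        for prop in sorted(all_props):
            holders = {r for r, props in members.items() if prop in props}
            ratio = len(holders) / len(members)
            # frequent enough to be characteristic, yet violated by some instance
            if threshold <= ratio < 1.0:
                report.append((cls, prop, ratio, sorted(set(members) - holders)))
    return report


# Hypothetical typed resources.
DATA = {
    "p1": ("Person", {"name", "birthDate"}),
    "p2": ("Person", {"name", "birthDate"}),
    "p3": ("Person", {"name", "birthDate"}),
    "p4": ("Person", {"name"}),
}
```

On this data the report contains a single entry: `birthDate` is used by three of the four `Person` instances, and `p4` is flagged as the resource lacking it. An ontologist could read this either as incomplete data or as a hint that a more specialized class is missing.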

Ontology Restructuring

We present an RCA-based method to improve the quality of an ontology by reorganizing the specialization between classes as well as between properties, while discovering some potentially missing abstract classes and properties. The method requires no data, as it operates on the ontological schema as meta-data instead.

About The Authors

Dr. Petko Valtchev

Petko Valtchev is Associate Professor with the Computer Science department of University of Quebec at Montreal (UQAM). His Ph.D. was awarded in 1999 by J. Fourier University, Grenoble. He is a member of the Editorial Board of the International Conference on Formal Concept Analysis (FCA) and has served as a member of the program committees of top-tier conferences (AAAI, IJCAI, ISWC). His research addresses knowledge discovery and data mining with/from ontologies and knowledge bases. In this context, he has designed a number of methods and practical tools exploiting concept analysis.

Mickael Wajnberg

Mickael Wajnberg is a PhD student at University of Quebec at Montreal (Québec, Canada) and at Université de Lorraine (France). He currently works on RCA and knowledge extraction. He completed a Math and Physics Prépa before earning an Engineering Degree (M.Sc. equivalent) at Telecom Nancy (France) and an M.Sc. in Computer Science at University of Quebec at Chicoutimi (Québec, Canada). He specialized in algorithms and theory for computer science.

References

  1. Ganter, B. & Wille, R. Formal concept analysis: mathematical foundations (Springer Science & Business Media, 2012).
  2. Rouane-Hacene, M., Huchard, M., Napoli, A. & Valtchev, P. Soundness and completeness of relational concept analysis. In International Conference on Formal Concept Analysis, 228–243 (Springer, 2013).
  3. Rouane-Hacene, M., Huchard, M., Napoli, A. & Valtchev, P. Relational concept analysis: mining concept lattices from multi-relational data. Annals Math. Artif. Intell. 67, 81–108 (2013).
  4. Džeroski, S. Multi-relational data mining: an introduction. ACM SIGKDD Explor. Newsl. 5, 1–16 (2003).
  5. Rouane, M. H., Huchard, M., Napoli, A. & Valtchev, P. A proposal for combining formal concept analysis and description logics for mining relational data. In International Conference on Formal Concept Analysis, 51–65 (Springer, 2007).
  6. Nica, C., Braud, A., Dolques, X., Huchard, M. & Le Ber, F. Exploring temporal data using relational concept analysis: An application to hydroecology. In 13th International Conference on Concept Lattices and Their Applications (CLA 2016), vol. 1624, 299–311 (2016).
  7. Wajnberg, M. et al. Semantic interoperability of large systems through a formal method: Relational concept analysis. IFAC-PapersOnLine 51, 1397–1402 (2018).

Part of the Joint Ontology Workshops 2019