Ontologies: the new ally for your data with MyDataCatalogue

Blog article: data and process management

Advanced data analysis approaches increasingly rely on ontologies, taxonomies, knowledge graphs and thesauri to organize and make sense of the vast quantities of data that organizations process. These tools provide a framework for creating a common vocabulary and structure that can be used to describe and link different data elements, and to discover patterns and insights in the data.

At Blueway, these concepts are at the heart of the Phoenix platform’s MyDataCatalogue module, which enables organizations to structure, classify and explore their data intelligently and automatically.

In this article, we will explore:

  • The key concepts of ontology, taxonomy and thesaurus
  • The role of ontologies in data security and compliance
  • The integration of ontologies in MyDataCatalogue, the cataloguing module of the Phoenix platform, for optimized management of repositories and business knowledge

Ontology to facilitate information retrieval, understanding and analysis

An ontology is a formal definition of a set of terms used to describe and represent a domain. It contains terms, the relationships between these terms, and properties that describe the characteristics and attributes of the concepts. An example of an ontology is Gene Ontology (GO), which is used in biological research to describe genes and their functions. GO contains terms such as “cellular component”, “molecular function” and “biological process”, as well as relationships between these terms, such as “is_a” and “part_of”.
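To make the idea of terms and relationships concrete, here is a minimal sketch that stores an ontology as (subject, relation, object) triples and follows a relation transitively. The three triples are simplified illustrations in the spirit of GO, not an extract of the real ontology, and the function names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical, simplified triples in the spirit of Gene Ontology.
# Real GO terms carry identifiers and a richer set of relations.
TRIPLES = [
    ("nucleus", "is_a", "organelle"),
    ("organelle", "is_a", "cellular component"),
    ("nucleolus", "part_of", "nucleus"),
]

def build_index(triples):
    """Index triples as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index

def ancestors(index, term, relation="is_a"):
    """Return every term reachable from `term` by following `relation` transitively."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in index.get(current, {}).get(relation, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

index = build_index(TRIPLES)
print(sorted(ancestors(index, "nucleus")))
print(sorted(ancestors(index, "nucleolus", relation="part_of")))
```

Keeping the relation name as data rather than hard-coding it is what separates this from a plain hierarchy: the same store can answer "what kind of thing is this?" and "what is this a part of?".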

Taxonomy, or how to bring order to data chaos

Taxonomy is the science of classification, used to organise concepts in a hierarchical structure. A taxonomy can be domain-specific or general, and can be used to classify a variety of things, including organisms, documents or data elements. An example of a taxonomy is the Dewey Decimal Classification System, which is used to classify books in libraries. It contains broad categories such as ‘000 – Computer science, information and general works’, which are then divided into sub-categories such as ‘020 – Library and information sciences’.
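A taxonomy of this kind can be sketched as a table in which every category points to its parent. The example below uses a handful of Dewey classes for illustration; the dictionary layout and the `path_to_root` helper are assumptions made for this sketch, not part of any library.

```python
# Each code maps to (label, parent code). A parent of None marks a top-level class.
TAXONOMY = {
    "000": ("Computer science, information and general works", None),
    "020": ("Library and information sciences", "000"),
    "500": ("Science", None),
    "510": ("Mathematics", "500"),
}

def path_to_root(code, taxonomy=TAXONOMY):
    """Return the chain of categories from the top-level class down to `code`."""
    path = []
    while code is not None:
        label, parent = taxonomy[code]
        path.append(f"{code} {label}")
        code = parent
    return list(reversed(path))

for step in path_to_root("020"):
    print(step)
```

Because each category has exactly one parent, every item has a single, unambiguous place in the hierarchy, which is the defining property of a taxonomy.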

Let’s take the classification of animals as an example

The relationship between ontology, taxonomy and thesauri can be understood through an example in the field of biology. As a reminder, a thesaurus is an organised list of related terms used to index and retrieve information in a coherent way in a specific domain.

Let’s assume that we are building a system that aims to classify and organise different species of animals. We can start by creating a taxonomy of animals, which involves grouping animals according to their physical characteristics and evolutionary relationships. For example, we can group mammals, birds, reptiles, fish and insects into distinct categories based on their unique characteristics. This taxonomy provides a basic structure for organising the different species of animals into a hierarchy.

Next, we can create a thesaurus, which can be seen as an extension of taxonomy. The thesaurus allows more detailed descriptions of each species, including their behavioural traits, habitats and geographical locations. For example, under the category of mammals, we can include various sub-categories such as carnivores, herbivores and omnivores. Each of these sub-categories can be further subdivided into more specific groups such as primates, rodents and carnivorous mammals. This allows us to describe and categorise each animal species more precisely.
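As a rough illustration, a thesaurus entry can be modelled as a preferred term with synonyms, broader terms and related terms, which is enough to expand a search query consistently. The entries and the `expand_query` helper below are hypothetical and only meant to show the mechanism.

```python
from dataclasses import dataclass, field

@dataclass
class ThesaurusEntry:
    preferred: str                               # the term used for indexing
    synonyms: set = field(default_factory=set)   # alternative wordings
    broader: set = field(default_factory=set)    # more general terms
    related: set = field(default_factory=set)    # associated terms

# Hypothetical entries for the animal example.
ENTRIES = [
    ThesaurusEntry("mammal", synonyms={"mammalia"}, broader={"animal"}),
    ThesaurusEntry("carnivore", synonyms={"meat-eater"}, broader={"mammal"}, related={"predator"}),
    ThesaurusEntry("herbivore", synonyms={"plant-eater"}, broader={"mammal"}),
]

def build_lookup(entries):
    """Map every preferred term and synonym to its entry."""
    lookup = {}
    for entry in entries:
        lookup[entry.preferred] = entry
        for synonym in entry.synonyms:
            lookup[synonym] = entry
    return lookup

def expand_query(term, lookup):
    """Return the set of terms to search for when a user types `term`."""
    entry = lookup.get(term.lower())
    if entry is None:
        return {term.lower()}
    return {entry.preferred} | entry.synonyms | entry.related

lookup = build_lookup(ENTRIES)
print(sorted(expand_query("Meat-eater", lookup)))
```

Whatever wording the user chooses, the search lands on the same preferred term, which is how a thesaurus keeps indexing and retrieval coherent.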

Finally, we can use an ontology to formally define the concepts and relationships in the field of animal species classification. The ontology provides a standardised vocabulary and structure for describing the different concepts and relationships involved, enabling a more precise and accurate representation of domain knowledge. For example, we can define the term ‘mammal’ as a class with certain characteristics such as hair, milk production and live birth, and we can define the relationships between different classes such as ‘carnivorous mammals’ and ‘herbivorous mammals’. This allows us to reason more easily and more precisely about the classification of animal species. By combining taxonomy, thesaurus and ontology in the same way, the MyDataCatalogue module on the Phoenix platform organises and classifies data more effectively.
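The kind of reasoning this enables can be sketched in a few lines: each class declares its own characteristics and its parent class, and a subclass inherits everything defined above it. The class names and characteristics come from the example above; the data layout, the helper functions and the extra characteristics "eats meat" and "eats plants" are assumptions made for illustration.

```python
# Each class lists its parent and the characteristics it adds.
ONTOLOGY = {
    "animal": {"subclass_of": None, "characteristics": set()},
    "mammal": {
        "subclass_of": "animal",
        "characteristics": {"hair", "milk production", "live birth"},
    },
    "carnivorous mammal": {"subclass_of": "mammal", "characteristics": {"eats meat"}},
    "herbivorous mammal": {"subclass_of": "mammal", "characteristics": {"eats plants"}},
}

def lineage(name, ontology=ONTOLOGY):
    """Return the class itself followed by all of its ancestors."""
    chain = []
    while name is not None:
        chain.append(name)
        name = ontology[name]["subclass_of"]
    return chain

def characteristics(name, ontology=ONTOLOGY):
    """Collect the characteristics a class declares or inherits."""
    result = set()
    for cls in lineage(name, ontology):
        result |= ontology[cls]["characteristics"]
    return result

def is_a(name, ancestor, ontology=ONTOLOGY):
    """True if `name` is `ancestor` or one of its subclasses."""
    return ancestor in lineage(name, ontology)

print(sorted(characteristics("carnivorous mammal")))
print(is_a("herbivorous mammal", "mammal"))
```

Nothing states directly that a carnivorous mammal has hair; the fact is derived from the class definitions, which is the practical difference between an ontology and a simple list of categories.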

The key to effective and secure data governance

Ontologies can be a powerful tool for managing data and information, particularly in complex environments. By creating formal models of concepts and relationships, ontologies can help organisations identify and organise data more effectively, improve data analysis and decision-making, and ensure compliance with legal and ethical standards. But did you know that ontologies can also be a key tool for improving data security and confidentiality?

The importance of ontologies for security

By creating ontologies, organisations can tag data sources and servers with themes, which highlights sensitive sources and gives an overview of the servers that host them. Understanding how these sources and servers relate makes it possible to take appropriate action where necessary.
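A minimal sketch of this idea, assuming each source has already been tagged with themes: intersect the themes of every source with a list of sensitive themes, then group the hits by server. The source names, server names and themes are invented for the example.

```python
from collections import defaultdict

# Hypothetical list of themes considered sensitive.
SENSITIVE_THEMES = {"health", "personal identity", "banking"}

# Hypothetical catalogue of sources, each tagged with themes and hosted on a server.
SOURCES = [
    {"name": "hr_database", "server": "srv-01", "themes": {"personal identity", "payroll"}},
    {"name": "patient_files", "server": "srv-01", "themes": {"health"}},
    {"name": "product_catalogue", "server": "srv-02", "themes": {"products"}},
    {"name": "billing_exports", "server": "srv-03", "themes": {"banking", "invoicing"}},
]

def sensitive_overview(sources, sensitive_themes=SENSITIVE_THEMES):
    """Group sensitive sources by server, with the sensitive themes found in each."""
    overview = defaultdict(dict)
    for source in sources:
        hits = source["themes"] & sensitive_themes
        if hits:
            overview[source["server"]][source["name"]] = sorted(hits)
    return dict(overview)

for server, found in sensitive_overview(SOURCES).items():
    print(server, found)
```

Servers that host no sensitive source simply do not appear in the overview, so the result doubles as a shortlist of machines to review first.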

Understanding the interrelationships and connections between entities in a data environment can lead to intelligent exploration and analysis. By using an ontology to map these connections, data analysts can gain insights into complex systems and discover new patterns and relationships.

Ontologies can help establish connections between entities and concepts, which can lead to the generation of rules and policies for later verification. By creating rules based on ontological relationships, data management and analysis can become more accurate and efficient.
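As an illustration of a rule derived from ontological relationships, the sketch below checks that every column whose concept falls under "personal data" is masked. The concept hierarchy, the column names and the `masked` flag are hypothetical; a real catalogue would supply them.

```python
# Hypothetical concept hierarchy: child -> parent ("is a kind of").
IS_A = {
    "email address": "contact detail",
    "contact detail": "personal data",
    "social security number": "personal data",
    "product code": "business reference",
}

# Hypothetical catalogue entries: column, detected concept and protection in place.
COLUMNS = [
    {"column": "crm.customers.email", "concept": "email address", "masked": True},
    {"column": "hr.staff.ssn", "concept": "social security number", "masked": False},
    {"column": "erp.items.sku", "concept": "product code", "masked": False},
]

def falls_under(concept, ancestor, is_a=IS_A):
    """True if `concept` equals `ancestor` or reaches it by following is_a links."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = is_a.get(concept)
    return False

def violations(columns, protected_concept="personal data"):
    """Rule: every column whose concept falls under the protected concept must be masked."""
    return [
        c["column"]
        for c in columns
        if falls_under(c["concept"], protected_concept) and not c["masked"]
    ]

print(violations(COLUMNS))
```

The rule is written once against the general concept, and the hierarchy takes care of applying it to every more specific concept added later.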

By using ontologies to map the data environment and establish taxonomies, organisations can verify their compliance with regulations such as the GDPR. This helps ensure that data management practices meet legal and ethical standards, reducing the risk of regulatory sanctions and other legal issues.

Ontologies are therefore not just for data management and analysis: they can also play a crucial role in improving data security, anonymisation and confidentiality. By using ontologies to classify and protect sensitive data, identify and mitigate security risks, and ensure compliance with legal and ethical standards, organisations can build more secure and resilient data environments.

MyDataCatalogue incorporates an innovative approach to data cataloguing using ontologies

Blueway is a key player in the field of data management and governance. Through its Phoenix platform and MyDataCatalogue module, Blueway enables organisations to map, classify and structure their data using ontologies and thesauri. This approach checks the compliance of data environments against general thesauri covering areas such as GDPR and cybersecurity, and classifies data from other business viewpoints via custom thesauri that customers create and modify themselves. The result is not only compliance, but also better interoperability and accessibility of information.

Together with our research and innovation team, we have developed a new approach to intelligent data cataloguing, incorporating automatic concept extraction and the generation of dynamic taxonomies. We know that ontologies can be a powerful asset, but that they also pose challenges when it comes to maintaining and managing the relationships between concepts. That’s why MyDataCatalogue automates the detection and organisation of data using a progressive hierarchical approach: identifying key concepts, structuring them into taxonomies, enriching them using thesauri, and then modelling them into usable ontologies.
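The four stages described above can be pictured with a deliberately naive sketch. It is not MyDataCatalogue's implementation: the keyword matching, the lookup tables, the relation names and every function name are assumptions chosen only to show how each stage feeds the next.

```python
# Hypothetical inputs: column names found in a source, plus small reference tables.
COLUMN_NAMES = ["cust_email", "customer_phone", "iban_number", "order_total"]
KEYWORDS = {"email": "email address", "phone": "phone number", "iban": "bank account"}
PARENTS = {
    "email address": "contact detail",
    "phone number": "contact detail",
    "bank account": "financial data",
}
SYNONYMS = {"email address": {"e-mail", "mail"}, "bank account": {"IBAN"}}

def extract_concepts(columns):
    """Stage 1: map a column to a concept when a known keyword appears in its name."""
    found = {}
    for column in columns:
        for keyword, concept in KEYWORDS.items():
            if keyword in column.lower():
                found[column] = concept
                break
    return found

def build_taxonomy(concepts):
    """Stage 2: group the detected concepts under their parent category."""
    taxonomy = {}
    for concept in sorted(set(concepts.values())):
        taxonomy.setdefault(PARENTS[concept], []).append(concept)
    return taxonomy

def enrich(taxonomy):
    """Stage 3: attach synonyms to each concept, thesaurus-style."""
    return {
        concept: sorted(SYNONYMS.get(concept, set()))
        for children in taxonomy.values()
        for concept in children
    }

def to_triples(concepts, taxonomy, synonyms):
    """Stage 4: express the result as (subject, relation, object) triples."""
    triples = [(child, "is_a", parent) for parent, children in taxonomy.items() for child in children]
    triples += [(column, "describes", concept) for column, concept in concepts.items()]
    triples += [(concept, "has_synonym", s) for concept, syns in synonyms.items() for s in syns]
    return triples

concepts = extract_concepts(COLUMN_NAMES)
taxonomy = build_taxonomy(concepts)
synonyms = enrich(taxonomy)
triples = to_triples(concepts, taxonomy, synonyms)
for triple in triples:
    print(triple)
```

Columns that match no keyword, such as `order_total` here, stay out of the model, which reflects the progressive nature of the approach: only what has been identified is structured, enriched and modelled.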

The Phoenix platform is not limited to cataloguing data. It can also integrate, automate and expose this data via inter-application flows and orchestrated business processes. By combining data management, workflow automation, application bus and API management, Phoenix helps organisations to structure and streamline their information exchanges, ensuring an efficient and resilient digital transformation.

Stéphane Le Lionnais
A passionate and versatile entrepreneur, Stéphane is the co-founder of Dawizz, the company behind MyDataCatalogue, the Data Catalog module integrated into Blueway's Phoenix platform. Drawing on his field expertise and close attention to customer needs, he combines practical know-how with strategic vision.