OMWG Roadmap

Introduction

The Ontology Management Working Group (OMWG) was founded in September 2004 to develop an ontology management system for use by several DERI projects and working groups. The group is coordinated by DERI and also includes Ontotext and other partners. This document outlines the steps needed to reach this ambitious target.

Section 2 describes the required functionalities, which Section 3 breaks down to the component level. Section 4 describes the implementation decisions made so far, Section 5 proposes a schedule for the individual tasks, and Section 6 concludes the document.

Functionalities

In this section we describe the functionalities that will be provided by our Ontology Management System (OMS). We will consider versioning, merging/aligning, editing/browsing, storing and querying.

Versioning

The versioning function handles different versions of an ontology. It covers versioning of ontologies together with instance updates and backward-compatibility support. This functionality is linked with mediation in order to maintain consistent versions.

Because the tool is used by multiple users, versioning functionality is required. Each time an ontology is modified, the old version is stored with a unique version number, and the user can undo or redo her changes. These changes are formally defined; they can be either basic, for example a class modification, or complex, for example moving a subtree of the ontology. Complex changes are simply compositions of basic changes.
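
The relationship between basic and complex changes, and its use for undo and redo, can be illustrated with a minimal Java sketch. All names here (Change, AddClass, CompositeChange, ChangeHistory) are hypothetical and not part of the OMS design, and the ontology is reduced to a set of class names purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** A formally defined change that can be applied and reverted (hypothetical interface). */
interface Change {
    void apply(Set<String> classes);
    void revert(Set<String> classes);
}

/** Basic change: adds a single class. Assumes the class did not exist before. */
final class AddClass implements Change {
    private final String name;
    AddClass(String name) { this.name = name; }
    public void apply(Set<String> classes) { classes.add(name); }
    public void revert(Set<String> classes) { classes.remove(name); }
}

/** Complex change: nothing more than an ordered composition of other changes. */
final class CompositeChange implements Change {
    private final List<Change> parts;
    CompositeChange(List<Change> parts) { this.parts = new ArrayList<>(parts); }
    public void apply(Set<String> classes) {
        for (Change part : parts) part.apply(classes);
    }
    public void revert(Set<String> classes) {
        // Revert in reverse order so that the last applied part is undone first.
        for (int i = parts.size() - 1; i >= 0; i--) parts.get(i).revert(classes);
    }
}

/** Applies changes to a toy ontology and supports undo and redo. */
final class ChangeHistory {
    private final Set<String> classes = new TreeSet<>();
    private final Deque<Change> undoStack = new ArrayDeque<>();
    private final Deque<Change> redoStack = new ArrayDeque<>();

    void commit(Change change) {
        change.apply(classes);
        undoStack.push(change);
        redoStack.clear(); // a new change invalidates the redo history
    }

    boolean undo() {
        if (undoStack.isEmpty()) return false;
        Change change = undoStack.pop();
        change.revert(classes);
        redoStack.push(change);
        return true;
    }

    boolean redo() {
        if (redoStack.isEmpty()) return false;
        Change change = redoStack.pop();
        change.apply(classes);
        undoStack.push(change);
        return true;
    }

    Set<String> classes() { return new TreeSet<>(classes); }
}
```

Because a complex change implements the same interface as a basic one, the history treats both uniformly: a single undo reverts a whole composite in one step.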

Because changes can affect the consistency of the ontology, the versioning part of the OMS includes consistency checking (see also Editing and Browsing). All users must be informed when a change has occurred, and their local ontologies are synchronized automatically. Modifications to an ontology can create inconsistencies with the instance bases; this is the main reason for keeping older versions of a modified ontology accessible and for making clear to the user which version she is working on.

The OMS provides security checking to restrict access to certain profiles and certain parts of the ontology.

Merging and Aligning

An important set of OMS functionalities concerns mediation. The term mediation covers all functions that help to merge and map ontologies and their instances. Mapping is semi-automatic and relies on mapping patterns or mapping rules expressed in a mapping language. The patterns are stored in a pattern library, which the user can browse with the help of a search tool to find relevant patterns. Mappings must also be creatable through a user-friendly interface that offers suggested mappings and drag-and-drop editing. The mappings are validated by consistency checking.

The OMS is able to mediate between different ontologies to answer the user queries; the system is able to execute queries on mapped ontologies. It mediates between them using query rewriting and unification of the resulting instances. It also uses the mappings to merge ontologies.

Editing and Browsing

The OMS allows the user to browse and edit ontologies via a user-friendly graphical interface. To this end, the interface includes a visualization tool capable of representing even very large sets of ontologies and their instances. The user can create ontologies and instance sets, import them from and export them to ontologies coded in different languages, and edit them. Creating and editing can introduce inconsistencies into the ontologies; this is why the OMS editor includes an inference engine for consistency checking.

Once the classes and instances are in the system, the user needs to manage them. The OMS provides tools to manage this library of large-scale ontologies and instance sets. These tools allow the user to search through the different ontologies as well as through their instance sets. Search is carried out via conventional browsing, a query language or an easy-to-use graphical query interface, and relies on an internal indexing system. The search tool is supported by suitable documentation of the ontologies, covering authors, keywords and searchable natural-language descriptions.

As the ontologies may be geographically dispersed and used by different users at the same time, the OMS is able to handle concurrent access and geographically dispersed versions. Concurrent access is detailed in the Versioning section of this document.

In order to align with the WSMO editor, the ontology schema and the instances are stored in separate repositories. The software architecture is modular; it can interoperate with existing ontology management tools and, because it uses standard languages, is designed to interoperate with future tools.

Storage and Representation

The ontologies and the sets of instances are stored in a repository which can efficiently deal with large ontologies. The ontologies are tightly connected to the other layers of the system, and the provenance of their different parts is known thanks to the tracking process. The OMS repository supports different ontology languages and semantics; it is based on a semi-structured, graph-based data model.

The repository is accessed through a query language supporting conjunctive queries; these queries can also include data types, aggregation functions or range queries. Literals can be searched by keyword. The query results are structured and can be presented like search engine result pages. The query language provides closure, that is, query results are expressed in the same data model as the repository content and can themselves be queried.

The OMS repository interface adheres to a standard; it integrates multiple ontology repositories into a single logical one. The system is able to deal with multiple users (see Versioning) and adheres to the ACID properties; it includes transactions, logging and locking.

An important feature of the storage and representation model is that it allows for integration of (non-ontological) data sets such as databases. Specific wrappers make such data available in a form compliant with the OMS data model.
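
As a rough illustration of such a wrapper, the following Java sketch exposes the rows of a relational table as labelled edges of a graph. It is only a sketch under stated assumptions: the names Edge, SourceWrapper and TableWrapper, as well as the identifier scheme ("table/key" for subjects, "table#column" for properties), are hypothetical and not taken from the OMS or ORDI specifications.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** One labelled edge of a graph-based model: subject --property--> value. */
record Edge(String subject, String property, String value) {}

/** Hypothetical wrapper contract: any data source is exposed as a list of edges. */
interface SourceWrapper {
    List<Edge> export();
}

/** Wraps one relational table, held in memory as rows of column name -> value. */
final class TableWrapper implements SourceWrapper {
    private final String table;
    private final String keyColumn;
    private final List<Map<String, String>> rows;

    TableWrapper(String table, String keyColumn, List<Map<String, String>> rows) {
        this.table = table;
        this.keyColumn = keyColumn;
        this.rows = List.copyOf(rows);
    }

    @Override
    public List<Edge> export() {
        List<Edge> edges = new ArrayList<>();
        for (Map<String, String> row : rows) {
            // The primary key identifies the node that all other cells hang off.
            String subject = table + "/" + row.get(keyColumn);
            for (Map.Entry<String, String> cell : row.entrySet()) {
                if (cell.getKey().equals(keyColumn) || cell.getValue() == null) continue;
                edges.add(new Edge(subject, table + "#" + cell.getKey(), cell.getValue()));
            }
        }
        return edges;
    }
}
```

Under these assumptions, a row with id 7 and name Ada in a table called person yields the single edge (person/7, person#name, Ada); the key column itself only contributes to the subject identifier.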

Further, it allows modularization of data and knowledge, making it possible to partition the ontologies into multiple data sets, each of which can be described with non-functional properties and managed separately. There is support for both explicitly defined data sets and views (which, in a nutshell, represent a restriction over larger data sets).

Querying

The components of the ontology management system need a query language between the user interfaces and the repository in order to execute queries on the ontologies. This language allows the user to manipulate the structure of the ontologies: classes, functions, relations and axioms. It also includes commands to manage users, the connection to the repository and views of the ontologies. It is SQL-like, with CREATE, DROP and ALTER commands.

Components

In this section we describe the components necessary to realize the functionalities introduced in the last section.

Versioning layer

The ontology versioning layer will have the following components:

Version space component

This component will handle the version space of ontologies and elements stored in ORDI. It will also provide an explicit API for the retrieval and creation of versions.

ORDI facade

The versioning layer will act as a facade over the ORDI ontology manipulation and storage API so that versioning is handled properly.
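
The facade idea can be sketched in Java as follows. Since the ORDI API is not specified in this document, OntologyStore is a hypothetical stand-in for it, and VersioningFacade, getVersion and latestVersion are illustrative names only. The sketch shows the intent: the facade exposes the same operations as the underlying store while recording a new version on every write.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the underlying storage API; not the real ORDI interface. */
interface OntologyStore {
    void put(String ontologyId, String content);
    String get(String ontologyId);
}

/** Trivial in-memory store used only to make the sketch runnable. */
final class InMemoryStore implements OntologyStore {
    private final Map<String, String> data = new HashMap<>();
    @Override public void put(String ontologyId, String content) { data.put(ontologyId, content); }
    @Override public String get(String ontologyId) { return data.get(ontologyId); }
}

/** Facade with the same interface as the store; every write also creates a version. */
final class VersioningFacade implements OntologyStore {
    private final OntologyStore delegate;
    private final Map<String, List<String>> history = new HashMap<>();

    VersioningFacade(OntologyStore delegate) { this.delegate = delegate; }

    @Override
    public void put(String ontologyId, String content) {
        history.computeIfAbsent(ontologyId, id -> new ArrayList<>()).add(content);
        delegate.put(ontologyId, content);
    }

    @Override
    public String get(String ontologyId) { return delegate.get(ontologyId); }

    /** Number of the most recent version, or 0 if the ontology was never written. */
    int latestVersion(String ontologyId) {
        List<String> versions = history.get(ontologyId);
        return versions == null ? 0 : versions.size();
    }

    /** Retrieves an older version; versions are numbered from 1 in write order. */
    String getVersion(String ontologyId, int version) {
        if (version < 1 || version > latestVersion(ontologyId)) {
            throw new IllegalArgumentException("No version " + version + " of " + ontologyId);
        }
        return history.get(ontologyId).get(version - 1);
    }
}
```

Client code written against OntologyStore keeps working unchanged when handed the facade, which is the property that eases later integration.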

Authorization component

Every version in ORDI must have an author associated with it; the authorization component will therefore make sure that the user's identity is available to the other components.

Validation, Diff, Impact analysis and Change propagation components

These components will provide support for ontology/instance validation, computing the differences between versions, impact analysis and change propagation tasks. These components will be specified in further detail after the first prototype is ready in December 2004.

Merger

The alignment tool will be developed following the solution defined in the first version of this deliverable, extended with recent advances from our research.

Ontology merging and ontology aligning tasks both require the use of mappings: between the two source ontologies and the newly merged one for the former, and between the two aligned ontologies for the latter. Mapping specification is currently a semi-automatic task for which many algorithms exist. In the first version of this deliverable we present one based on PROMPT (see section 2.6) and suggest using it in our system. As new algorithms are likely to emerge from the research community, the alignment tool should be able to include them and let the user choose her preferred one. To this end we will develop a general alignment API on which different algorithms can be implemented.

The alignment tool will satisfy all the requirements raised in the first version of this working draft. We will next present its architecture.

The alignment tool contains two components: The mapping module helps the user to create mappings and construct merged ontologies. The runtime module uses the created mappings to perform the tasks required by the external components. We will next detail the composition of each module.

Mapping Module

• Mapping language As seen in section 4.1, the mappings are based on a general mapping language.

• Patterns Patterns are templates that match the most common mismatches between two ontologies. The use of predefined patterns considerably reduces the work of the mapping designer. In this solution we propose a pattern language to define them and a pattern library to store and retrieve them efficiently.

• Mapping algorithms interface The architecture of the module allows the use of different mapping algorithms. These algorithms are stored and can be combined to create efficient mappings. The interface specifies the ontology language as input and the mapping language as output.

• Graphical user interface This interface plays the main role in the mapping module. It allows the user to create or modify mappings graphically by linking similar entities. Mapping proposals produced by the mapping algorithms are also integrated in this part of the component.

Runtime module

This module is used by the reasoning part of the ontology management system. It can also be implemented as a web service, but we will not discuss this here. This module uses the mappings to perform the following tasks:

• Query rewriting Used to rewrite a query written for an ontology into one for another ontology. This process uses the mapping between the two ontologies or proposes to create one using the mapping module.

• Instance transformation Used to transform instances from one ontology to another. This process also uses the mapping between the two ontologies.
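
Query rewriting can be illustrated with a deliberately simplified Java sketch. It assumes that a mapping is a plain one-to-one renaming of terms and that a query is a list of triple patterns whose variables start with "?". Real mappings expressed in the mapping language are expected to be richer than this, and the names TriplePattern and QueryRewriter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** One pattern of a conjunctive query; terms starting with '?' are variables. */
record TriplePattern(String subject, String predicate, String object) {}

/** Rewrites queries from a source ontology to a target ontology using a term mapping. */
final class QueryRewriter {
    private final Map<String, String> mapping; // source term -> target term

    QueryRewriter(Map<String, String> mapping) { this.mapping = Map.copyOf(mapping); }

    /** Variables and unmapped terms are kept as they are. */
    private String translate(String term) {
        if (term.startsWith("?")) return term;
        return mapping.getOrDefault(term, term);
    }

    List<TriplePattern> rewrite(List<TriplePattern> query) {
        List<TriplePattern> rewritten = new ArrayList<>();
        for (TriplePattern pattern : query) {
            rewritten.add(new TriplePattern(
                translate(pattern.subject()),
                translate(pattern.predicate()),
                translate(pattern.object())));
        }
        return rewritten;
    }
}
```

With a mapping from a:Person to b:Human, the pattern (?x, rdf:type, a:Person) is rewritten to (?x, rdf:type, b:Human). Under the same simplifying assumption, instance transformation could apply the identical term translation to instance statements instead of query patterns.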

Explorer

The Ontology explorer will contain the following components:

Class browser/editor

The class browser/editor is a UI component that will show the list or hierarchy of available ontologies and their contents. Upon selection of an element in this hierarchy, the properties and sub-elements of that element will be shown and made editable. The list will allow ontologies and elements to be removed or created. The component will focus on showing the class/subclass hierarchies, the attribute definitions on classes, the relations between classes and the axioms/rules as the basic ontology modeling blocks.

Instance browser/editor

The instance browser/editor is a UI component similar to the Ontology browser/editor but dealing with independent collections of instances. The focus of this component will be to show the instances with their attribute values and class memberships, the relation instances and possibly axioms as the basic building blocks for data representation.

Versioning UI support

This UI component will help the user manage versions by providing access to the functionality of the high-level versioning component (see Versioning layer). It will display the list of versions for an ontology or an element within an ontology, and likewise for the instances, possibly in connection with the ontology browser/editor and instance browser/editor components. This component will also allow the user to create a new version or retrieve previous versions.

Merging/alignment/mapping/factoring support

These UI components will provide assistance to the user with the tasks of merging, alignment, mapping and factoring of ontologies. These components will be specified in further detail after the first prototype is ready in December 2004.

DDL interpreter UI support

The DDL (data definition language) interpreter will be an independent component that translates DDL into ORDI invocations, thus providing batch processing of change descriptions. The editor tool will provide basic support for running a DDL file through the interpreter and viewing the results.
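
A minimal Java sketch of such an interpreter is given below. The DDL syntax has not been defined yet, so the statement forms "CREATE CONCEPT name" and "DROP CONCEPT name" are assumptions made for illustration, as is the class name DdlInterpreter. An in-memory set of concept names stands in for the ORDI invocations.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Interprets a batch of hypothetical DDL statements against an in-memory concept set. */
final class DdlInterpreter {
    // Stand-in for the repository that real statements would be translated to.
    private final Set<String> concepts = new LinkedHashSet<>();

    /** Processes the statements in order and returns one result line per statement. */
    List<String> run(List<String> statements) {
        List<String> results = new ArrayList<>();
        for (String statement : statements) {
            results.add(execute(statement));
        }
        return results;
    }

    private String execute(String statement) {
        String[] tokens = statement.trim().split("\\s+");
        if (tokens.length != 3 || !tokens[1].equalsIgnoreCase("CONCEPT")) {
            return "ERROR: unsupported statement: " + statement;
        }
        String name = tokens[2];
        switch (tokens[0].toUpperCase(Locale.ROOT)) {
            case "CREATE":
                return concepts.add(name)
                    ? "OK: created " + name
                    : "ERROR: " + name + " already exists";
            case "DROP":
                return concepts.remove(name)
                    ? "OK: dropped " + name
                    : "ERROR: " + name + " does not exist";
            default:
                return "ERROR: unsupported command: " + tokens[0];
        }
    }

    Set<String> concepts() { return Set.copyOf(concepts); }
}
```

Returning one result line per statement mirrors the requirement that the user can view the outcome of a batch run; a failing statement is reported without aborting the rest of the batch, which is one possible design choice among several.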

ORDI

The ontology representation and data integration functionality will be realized using the ORDI framework [Kiryakov et al., 2004]. The included data and ontology models are described below.

The overall scheme is that ORDI will act as middleware, providing the OM tools and other applications with uniform access to various reasoners, repositories and other data sources. This strategy will be implemented through wrappers.

Data model

We ground our data representation on the RDF data model ([Klyne and Carroll, 2004]), since it is well-founded and detached from the semantics of the various knowledge representation, ontology and semantic web languages used today. Another argument is that there have been no major changes in its specification recently, which indicates that it has reached a certain degree of maturity. Finally, most of the formalisms used today to define the formal semantics of these languages can easily deal with raw RDF data. To state it more explicitly, we see the data that will be used or manipulated through ORDI as an RDF graph, defined as a set of RDF statements (triples).

Structured bodies of data represented in this model are called data graphs (in order to avoid the term RDF graph and its inappropriate connotations of RDFS semantics).
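
To make the notion of a data graph concrete, the following Java sketch models it as a set of statements with simple pattern matching. The names Statement, DataGraph and match are hypothetical, terms are reduced to plain strings, and nothing here reflects the actual ORDI API.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

/** One statement (triple) of a data graph; terms are simplified to strings. */
record Statement(String subject, String predicate, String object) {}

/** A data graph: a set of statements, so adding a duplicate has no effect. */
final class DataGraph {
    private final Set<Statement> statements = new HashSet<>();

    /** Returns true if the statement was not yet part of the graph. */
    boolean add(String subject, String predicate, String object) {
        return statements.add(new Statement(subject, predicate, object));
    }

    int size() { return statements.size(); }

    /** Returns all statements matching the pattern; null acts as a wildcard. */
    Set<Statement> match(String subject, String predicate, String object) {
        return statements.stream()
            .filter(s -> (subject == null || subject.equals(s.subject()))
                      && (predicate == null || predicate.equals(s.predicate()))
                      && (object == null || object.equals(s.object())))
            .collect(Collectors.toSet());
    }
}
```

The graph attaches no meaning to predicates such as rdf:type; it only stores and retrieves statements, which reflects the separation between the data model and the semantics of any particular ontology language.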

Ontology model

The ontology model in ORDI will be based on the one defined in WSMO (see Listing 1).

Listing 1. Ontology definition
entity ontology
      nonFunctionalProperties ofType nonFunctionalProperties
      importedOntologies ofTypeSet ontology
      usedMediators ofTypeSet ooMediator
      concepts ofTypeSet concept
      relations ofTypeSet relation
      functions ofTypeSet function
      instances ofTypeSet instance
      axioms ofTypeSet axiom

This model is defined on a conceptual (epistemological) level, which means that it is formal enough to allow conceptualization while making only minimal commitments to the semantics of the ontologies.

Details on the formal representation of such a conceptual model and its mapping to data graphs are given in [Kiryakov et al., 2004].

Ontology Query Language

In order to realize the querying functionality we will develop a query language and a corresponding interpreter.

Implementation decisions

In this section we describe the implementation decisions that have been made so far.

Versioning

The versioning tool is currently in the architecture and design phase; few implementation choices have therefore been made, and few implementation options (choices we will have to make) are apparent.

The agreement is to implement versioning support as a layer above ORDI, with the UI as part of the editor/browser tool. It has been decided that versioning will be done on the conceptual (semantic) level, as opposed to other popular approaches such as storing the ontologies in syntactic form and using existing syntactic versioning systems like CVS.

The version space component faces the choice of where and how the version space will actually be represented and what will be stored in ORDI. The options are currently unclear and under active discussion.

The ORDI facade component will resemble the underlying ORDI API as much as possible to ease the inclusion of the versioning component in the rest of the system during integration tasks.

Due to time limitations, the authorization component is likely to be very simple and insecure, simply asking users for their name and relying on their honesty. This will be fully sufficient for the purposes of the initial prototype and demo.

Merging & Alignment

No decisions have been made so far about the implementation of the merging and alignment component.

Editing & Browsing

The editing and browsing tool is currently in the architecture and design phase; few implementation choices have therefore been made, and few implementation options (choices we will have to make) are apparent.

Both the ontology and instance editors/browsers will be based on the Eclipse platform which has been evaluated as the best basis for the tool. The UI widgets provided by Eclipse will be reused as much as possible.

We will attempt to provide a graphical representation of ontologies and instances in the editor/browser components. The options currently under consideration are trees, hyperbolic-plane trees and unconstrained graphs with automatic or user-driven layout. These options need to be evaluated with regard to two goals: handling large-scale ontologies and instance collections efficiently, and supporting the layout of automatically generated ontologies or instance collections, which tends to become confusing as the number of components grows.

Finally, it is not yet clear how closely the versioning UI support will be integrated with the two editor/browser components; the options depend on the unresolved question of how the version space is represented in the versioning layer.

Ontology representation and data integration

The core ontology representation API will be an extension of the WSMO Ontology API. In addition, OMS will also specify a set of functional interfaces (for storage and retrieval, for versioning, etc.).

Programming language

Based on the agreement reached in the DIP project, the ontology representation and data integration framework will be developed in Java.

Tool selection

The ontology representation and data integration API shall be based on an existing repository tool. Two widespread alternatives are the Jena and Sesame systems. The table below summarizes the features available in each tool.

Feature           Jena  Sesame
Database storage  yes   yes
OWL support       yes   no
Querying (RDQL)   yes   yes
Querying (RQL)    no    yes
Querying (SeRQL)  no    yes
Reasoning         yes   yes

As a general approach, ORDI will be coupled with multiple wrappers for repositories and ontology servers. A wrapper for KAON 2 is a high priority.

Ontology Query Language

No decisions have been made so far about the implementation of the query language component.

Efforts / Timetable

The necessary efforts and an appropriate schedule still have to be discussed. A first draft is summarized in Table 1 and is described in more detail in the following sections.

                Versioning        Merging & Alignment  Editing & Browsing  Representation & Repository  DDL
Requirements    27.09.2004 Jacek  31.12.2004 Francois  27.09.2004 Jacek    04.10.2004 Naso              31.12.2004 ???
Design          04.10.2004 Jacek  30.04.2005 ???       04.10.2004 Jan      04.10.2004 Naso              30.04.2005 ???
Implementation  31.12.2004 Jacek  30.06.2005 ???       31.12.2004 Jan      31.12.2004 Naso              30.06.2005 ???

Table 1: Deadlines and responsibilities

Versioning Tool

The versioning tool as described in 2.1 shall be realized according to the following sections.

Requirements

The requirements of the versioning tool will be analyzed by DERI Innsbruck. The responsible person is Jacek Kopecký. The document will be finished by October 4th 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d6/d6.1 .

Design

The architecture of the versioning tool will be designed by DERI Innsbruck. The responsible person is Jacek Kopecký. The document will be finished by October 4th 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d6/d6.2 .

Implementation

The versioning tool will be implemented by DERI Innsbruck. The responsible person is Jacek Kopecký. The prototype will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d6/d6.3 .

Merging and Alignment Tool

The merging and alignment tool as described in 2.2 shall be realized according to the following sections.

Requirements

The requirements of the merging and alignment tool will be analyzed by DERI Innsbruck. The responsible person is Francois Scharffe. The document will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d7/d7.1 .

Design

The architecture of the merging and alignment tool will be designed by DERI Innsbruck. The responsible person has not yet been assigned. The document will be finished by April 30th 2005. The latest version can always be found at http://www.omwg.org/TR/2004/d7/d7.2 .

Implementation

The merging and alignment tool will be implemented by DERI Innsbruck. The responsible person has not yet been assigned. The prototype will be finished by June 30th 2005. The latest version can always be found at http://www.omwg.org/TR/2004/d7/d7.3 .

Editing and Browsing Tool

The editing and browsing tool as described in 2.3 shall be realized according to the following sections.

Requirements

The requirements of the editing and browsing tool will be analyzed by DERI Innsbruck. The responsible person is Jacek Kopecký. The document will be finished by October 4th 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d8/d8.1 .

Design

The architecture of the editing and browsing tool will be designed by DERI Innsbruck. The responsible person is Jan Henke. The document will be finished by October 4th 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d8/d8.2 .

Implementation

The editing and browsing tool will be implemented by DERI Innsbruck. The responsible person is Jan Henke. The prototype will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d8/d8.3 .

Representation and Repository

The repository as described in 2.4 shall be realized according to the following sections.

Requirements

The requirements of the repository are managed by Ontotext. The responsible person is Atanas Kiryakov. They are already available in [Kiryakov et al., 2004].

Design

The architecture of the representation API and the approach for wrapping existing repositories and data sources will be delivered by Ontotext. The responsible person is Atanas Kiryakov. The most important design questions are already covered in [Kiryakov et al., 2004]. The next version will be presented after the implementation of the first phase.

Implementation

The ontology representation API will be implemented by Ontotext. It is currently being implemented as an extension of the WSMO Ontology API. The first version will be available by October 18th 2004.

The repository will be implemented by Ontotext. The responsible person is Atanas Kiryakov. The prototype will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d9/d9.3 .

DDL for Ontologies

The DDL as described in 2.5 shall be realized according to the following sections.

Requirements

The requirements of the DDL still have to be analyzed. The responsible person has not yet been assigned. The document will be finished by December 31st 2004. The latest version can always be found at http://www.omwg.org/TR/2004/d10/d10.1 .

Design

The architecture of the DDL still has to be designed. The responsible person has not yet been assigned. The document will be finished by April 30th 2005. The latest version can always be found at http://www.omwg.org/TR/2004/d10/d10.2 .

Implementation

The DDL still has to be implemented. The responsible person has not yet been assigned. The prototype will be finished by June 30th 2005. The latest version can always be found at http://www.omwg.org/TR/2004/d10/d10.3 .

Conclusions

In this document we outlined the way to go in order to reach the ambitious target of creating a general ontology management system to be used by several DERI projects and working groups.

Acknowledgement

The work is funded by the European Commission under the projects DIP, Knowledge Web, Ontoweb, SEKT, SWWS, Esperonto and h-TechSight; by Science Foundation Ireland under the DERI-Lion project; and by the Vienna city government under the CoOperate programme.