Ontology Representation and Data Integration (ORDI) Framework

Availability and Contacts

Version: 0.21, 26 June 2005.

Download: http://www.omwg.org/tools/ordi/v0.21/ordi.zip

Source control: To be made available from CVS of the DOME SourceForge project.

Contact person: Damyan Ognyanov, damyan@sirma.bg

Purpose and Functionality

The Ontology Representation and Data Integration (ORDI) Framework was developed following the analysis and design guidelines of [ORDI-Design], a conceptual framework presented in deliverable D2.2 of the DIP project. The major objectives of ORDI are:

  • Ontology language neutrality;
  • Integration of databases and other structured data-sources;
  • Ontology and data modularization;
  • Support for heterogeneous reasoners and data-sources.

Instead of developing a new language-independent representation, the implementation of ORDI adopts WSML Core ([wsml0.2]) as a formal data and knowledge representation model. This decision was taken for the following reasons:

  • WSML Core is rather close to the model defined in [ORDI-Design]. WSML Core was crafted with the same objectives: to be a minimal but sufficient basic model, which can serve as a foundation for, and/or be aligned to, the models used under the most popular knowledge representation and conceptual modelling paradigms.
  • WSML provides a model also for web services (WS), which allows for smooth integration between ontology management (OM) and WS software infrastructure.

ORDI, as a package, contains the following modules:

  • ORDI API - all the APIs necessary to work with ORDI. At present it is a tiny extension of the WSMO API, see below.
  • ORDI Implementation - an implementation of the interfaces with the following major parts:
    • A default repository implementation, based on Sesame. It uses the WSMO Tripliser, see below;
    • An RDF/XML Parser implementation for WSMO-RDF, see below;
    • An RDF/XML Parser implementation for import of OWL (RDF/XML syntax). It allows import of OWL through parsing of the most popular RDF/XML syntax and transformation into WSMO-Triples format (see below).

Some sample usage code is also included in the package, see the Usage section.

wsmo4j and ORDI

ORDI and wsmo4j were designed to complement each other in the following way:

  • wsmo4j includes a WSMO representation and management API coupled with a reference implementation. wsmo4j defines APIs for management of the WSMO elements as well as basic in-memory implementations. It also defines a few functional interfaces: Parser (parsing and serialization), DataStore (storage and retrieval), Locator (mapping of logical to physical addresses). wsmo4j includes a WSML-HR parser.
  • ORDI defines interfaces for more advanced repository (storage, query, maintenance) functionality. It is meant to serve as ontology middleware which mediates between a wide range of ontology management tools and applications on the one hand and different sorts of ontology servers and reasoners on the other. ORDI also takes care of interoperability with other (non-WSMx) representation formats and syntaxes.
  • ORDI is an extension of wsmo4j, while the latter is self-sufficient. The major dependency between the two is the org.omwg.ontology package, which defines the ontology primitives of WSMO. Figure 1 depicts the major relationships between wsmo4j and ORDI and their positioning with respect to WS and OM tools. (Those should be further elaborated.)

Figure 1: wsmo4j and ORDI
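
The role of a locator ("mapping of logical to physical addresses") can be pictured with a toy model. The sketch below is not the wsmo4j API: the names SimpleLocator and MapLocator, the register method and the file address test/example.wsml are hypothetical, and the code is written for a current JDK rather than for JDK 1.4.2/1.5. It only shows the idea that a component resolves a logical IRI to the place where the definition physically lives.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class LocatorSketch {

    /** Hypothetical stand-in for a locator: resolves a logical IRI to a physical address. */
    interface SimpleLocator {
        Optional<URI> locate(String logicalIri);
    }

    /** Table-driven locator; a repository could offer the same interface over its own store. */
    static final class MapLocator implements SimpleLocator {
        private final Map<String, URI> table = new LinkedHashMap<>();

        /** Records where the entity with the given logical IRI can be found. */
        MapLocator register(String logicalIri, URI physicalAddress) {
            table.put(logicalIri, physicalAddress);
            return this;
        }

        @Override
        public Optional<URI> locate(String logicalIri) {
            // Unknown IRIs yield an empty result instead of null.
            return Optional.ofNullable(table.get(logicalIri));
        }
    }

    public static void main(String[] args) {
        SimpleLocator locator = new MapLocator()
            .register("http://www.example.org/ontologies/example",
                      URI.create("file:test/example.wsml")); // hypothetical location

        System.out.println(locator.locate("http://www.example.org/ontologies/example"));
        System.out.println(locator.locate("http://www.example.org/ontologies/unknown"));
    }
}
```

A repository that implements such an interface itself can answer look-ups by IRI directly from its own storage.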

Related Syntaxes

There are a number of file formats related to ORDI. They are introduced here; the specific tasks related to them are discussed in a later sub-section.

  • A WSML document in either WSML XML syntax (abbreviated as WSML-XML) or WSML Human Readable syntax (abbreviated as WSML-HR);
  • OWL-RDF: the standard RDF/XML syntax [RDF/XML]. RDF syntaxes other than XML (e.g. N-Triples and N3) will also be supported (through the existing RDF parsers). An RDFS subset, which is a proper sub-language of OWL DLP, is considered as an import format.
  • WSMO-RDF: a WSMO/WSML document serialized according to the WSMO RDF Schema. The latter is an ORDI-specific RDFS/OWL ontology (meta-schema) derived from the WSML mapping to OWL [wsml0.2]. In a way, it has the same role as the RDFS schema for OWL.

Note that the immediate plans do not foresee export of WSML into OWL-RDF. The main WSML format compliant with the Semantic Web standards is WSMO-RDF.

Related Data-models and Representations

There are a couple of data models (with corresponding Java interfaces and implementations) relevant to ORDI.

  • WSMO-In-Memory: a WSMO-API/wsmo4j compliant model (e.g. the reference implementation within wsmo4j). This is an object-oriented representation, which is not specific to ORDI;
  • WSMO-Triples: a representation of WSMO elements as RDF triples according to the WSMO RDF Schema. This is an internal representation allowing ORDI to store WSMO entities (and other data) into an RDF triple repository for the sake of efficient query and management of huge amounts of data. The WSMO-RDF syntax is a serialization of this representation.
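
To make the difference between the two representations concrete, here is a minimal sketch of how an object-oriented concept definition might be flattened into RDF triples. It does not use the wsmo4j or ORDI classes, and the vocabulary is an assumption: the namespace http://www.example.org/wsmo-rdf-sketch# and the terms Concept, subConceptOf and hasAttribute are invented for illustration and are not the terms of the actual WSMO RDF Schema. The code targets a current JDK.

```java
import java.util.ArrayList;
import java.util.List;

final class TripleSketch {

    // Hypothetical vocabulary; the real WSMO RDF Schema terms are not reproduced here.
    static final String NS = "http://www.example.org/wsmo-rdf-sketch#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** Toy in-memory concept: an IRI plus the IRIs of its super-concepts and attributes. */
    static final class Concept {
        final String iri;
        final List<String> superConcepts = new ArrayList<>();
        final List<String> attributes = new ArrayList<>();

        Concept(String iri) { this.iri = iri; }
    }

    /** One RDF statement; objects are always resources in this sketch (no literals). */
    static final class Triple {
        final String s, p, o;

        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }

        /** Renders the statement as one N-Triples-style line. */
        String toNTriples() { return "<" + s + "> <" + p + "> <" + o + "> ."; }
    }

    /** Flattens the object view into triples: one type statement, then one per relation. */
    static List<Triple> toTriples(Concept c) {
        List<Triple> out = new ArrayList<>();
        out.add(new Triple(c.iri, RDF_TYPE, NS + "Concept"));
        for (String sup : c.superConcepts) out.add(new Triple(c.iri, NS + "subConceptOf", sup));
        for (String att : c.attributes) out.add(new Triple(c.iri, NS + "hasAttribute", att));
        return out;
    }

    public static void main(String[] args) {
        String base = "http://www.example.org/ontologies/example#";
        Concept wine = new Concept(base + "Wine");
        wine.superConcepts.add(base + "Drink");
        wine.attributes.add(base + "hasColor");
        for (Triple t : toTriples(wine)) System.out.println(t.toNTriples());
    }
}
```

Once definitions are in this flat form, storing and querying them becomes a matter of adding and matching statements in a triple repository.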

The following diagram shows the transformations between the different formats and models. The modules that perform each transformation are shown next to the arrows.

Figure 2: ORDI-related Formats and Representations

The Current Version

The current version 0.21 of ORDI is a pre-release with limited functionality. Its main purpose is to provide early access to the APIs, the examples and the overall architecture, this way facilitating the integration with other tools. The source code still requires further documentation to meet the minimal requirements for a distributed open-source development process.

The most interesting new feature in v. 0.21, as compared to 0.2, is the import of OWL (RDF/XML syntax). It allows ontologies encoded in the W3C Semantic Web standards to be re-used in WSMO environments, a highly desired capability.

The major functionality of ORDI (as added value on top of wsmo4j) is:

  • the scalable repository implementation;
  • the WSMO-RDF parser. It currently supports only serialization (i.e. export); the parsing (import) functionality is straightforward to implement, but still missing in this version. Importantly, this is the first utility (to the best of our knowledge) which allows for serialization of WSML into an XML-based format, which is a requirement for a large number of industrial applications.
  • the OWL-RDF import. It allows for import and usage, within WSMO-compliant applications, of OWL (and RDFS) ontologies. The current "translation" strategy is based on the WSML mapping to OWL given in [wsml0.2]. However, it is still incomplete and requires further tuning, documentation and evaluation of the semantic correctness of the transformation. Early notes can be seen in owl2wsmo.txt.

Probably the best way to understand what is missing in the current version is to check the Future Plans section below.

Requirements

Nature: A Java library without user interface.

Interfaces (API, Web Services): a Java API.

Platform: JDK 1.4.2 and 1.5.

Supported standards:

  • ORDI's native data-model is the one of wsmo4j, which means [wsmo1.2] as a conceptual model and [wsml0.2] as knowledge representation language.
  • ORDI supports export in [rdf], more precisely the [RDF/XML] syntax, which is supported through Sesame.

Required Libraries (OMWG, SDK Cluster, WSMO-related):

  • wsmo4j is an API and a reference implementation for building Semantic Web Services applications compliant with the Web Service Modeling Ontology (WSMO). The version of wsmo4j used in the current version of ORDI is 0.4.0 - a release candidate compliant with [wsmo1.2] and [wsml0.2], from 26/06/2005 or newer.

Required Libraries (others):

  • Sesame: Sesame is an open source Java framework for storing, querying and reasoning with RDF and RDF Schema. One of its functionalities is that it can serve as a scalable high-performance semantic repository - this is its major role within ORDI. The version of Sesame used in the current version of ORDI is 1.1.3.

Licensing

ORDI License Agreement

Copyright (c) 2005, Ontotext Lab, Sirma.

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

Licensing of Third Party Libraries

Licensing of third party libraries and components required for ORDI:

  • wsmo4j - (c) Copyright Ontotext Lab, Sirma. It is an open-source library, available under the same LGPL conditions.
  • Sesame - (c) Copyright Aduna b.v. It is an open-source library, available under the same LGPL conditions.

Installation and Usage

Installation of ORDI

ORDI is distributed as a ZIP archive, which should be extracted in a separate folder. The archive file is originally named ordi.zip and has the following contents:

  • FactSheet.html - this document;
  • doc folder - contains Javadoc documentation;
  • lib folder - contains all the required libraries (jar files);
  • src folder - contains all source files;
  • conf folder - contains all necessary configuration files (at present only the one for Sesame, which is used for the implementation of the default DataStore);
  • test folder - contains sample data, at present a few ontologies in WSML-HR and OWL format;
  • ordiapi.jar - the ORDI API provided as a Java library. It extends the WSMO API which is part of wsmo4j.
  • ordiimpl.jar - the ORDI default implementation provided as a Java library. It includes a WSMO-RDF parser and a Sesame-based default Repository (an extension of the WSMO API DataStore interface).

To use ORDI as a library (e.g. in embedded mode) from a Java program, one needs the two ORDI jars (ordiapi.jar and ordiimpl.jar) plus the ones in the lib folder to be included in the CLASSPATH.
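
As an illustration of the classpath requirement above, the following sketch assembles a classpath string from the two ORDI jars and every jar found in the lib folder. The class and method names (ClasspathSketch, joinClasspath, listLibJars) are hypothetical helpers, not part of ORDI, and the installation folder is whatever directory the archive was extracted into. The code targets a current JDK.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ClasspathSketch {

    /** Joins ordiapi.jar, ordiimpl.jar and the named jars under lib/ into one classpath string. */
    static String joinClasspath(Path ordiHome, List<String> libJarNames) {
        List<String> entries = new ArrayList<>();
        entries.add(ordiHome.resolve("ordiapi.jar").toString());
        entries.add(ordiHome.resolve("ordiimpl.jar").toString());

        List<String> sorted = new ArrayList<>(libJarNames);
        Collections.sort(sorted); // directory listing order is unspecified; sort for a stable result
        for (String name : sorted) {
            entries.add(ordiHome.resolve("lib").resolve(name).toString());
        }
        return String.join(File.pathSeparator, entries);
    }

    /** Lists the file names of all *.jar files in the lib folder (empty if the folder is absent). */
    static List<String> listLibJars(Path ordiHome) throws IOException {
        List<String> names = new ArrayList<>();
        Path lib = ordiHome.resolve("lib");
        if (Files.isDirectory(lib)) {
            try (DirectoryStream<Path> jars = Files.newDirectoryStream(lib, "*.jar")) {
                for (Path jar : jars) names.add(jar.getFileName().toString());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // First argument: the folder ordi.zip was extracted into (defaults to the current folder).
        Path home = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(joinClasspath(home, listLibJars(home)));
    }
}
```

The printed string can be passed to the -cp option of java and javac, or assigned to the CLASSPATH environment variable.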

Usage Examples

Several simple scenarios are provided as an illustration of the functionality of ORDI. Those are available as Java sources in the src\ordiexamples folder.

  • StoreOntologyExample - parses a WSML-HR ontology and stores it in the default repository. In addition it looks up a concept by IRI, which indirectly loads the concept definition from the repository, because the repository has registered itself as a locator. The concept is then stored in the repository again, unchanged, to demonstrate that the basic store/load operation of the repository is definition-preserving, although the definition is transformed from WSMO-in-memory to WSMO-Triples and back.
  • ModifyConceptExample - loads a concept definition from the default repository, then modifies it and stores it back; finally it loads the concept again to demonstrate that the definition has changed.
  • ExportToWSMORDFExample - loads an ontology from the repository and exports (serializes) it into WSMO-RDF format.
  • ExportToWSMORDFExample - loads an OWL ontology (test/food.owl, one of the pair of well-known Wine and Food sample ontologies), converts it into WSMO-in-memory format, then dumps the WSML-HR serialization to the console and stores it in a file (test/food.wsml).

A pre-condition for the second and the third examples is that the http://www.example.org/ontologies/example ontology is already stored in the default ORDI repository (which is the effect of the first example: StoreOntologyExample).

Future Plans

The major driving forces for the future development of ORDI:

  • support for the evolution of the related standards and tools (wsmo4j, WSMO, WSML);
  • developments related to integration of ontology back-end infrastructure, i.e. reasoners and repositories;
  • improvements and fixes required by applications using ORDI.

Below follows a non-exhaustive list of tasks, which fit into the short-term development plans:

  • Import of files in WSMO-RDF syntax (now it only supports serialization).
  • Improvements to the import of files in OWL-RDF. One of the specific fixes is to better handle the XML namespaces from the OWL files.
  • Extensions to the org.omwg.ordi.repository.Repository interface, in particular those related to reasoner integration.
  • Provide WSML Core reasoning. Sesame's architecture allows for different implementations of the storage and inference layer (SAIL); the one bundled with ORDI is called OWLIM. OWLIM supports partial, forward-chaining reasoning over a fragment of OWL Lite. Although the reasoning is handled in memory, OWLIM offers a comprehensive persistence and backup strategy, implemented through N-Triples files. OWLIM will be used to support WSML Core semantics for huge data-sets.
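
Forward chaining, as mentioned for OWLIM above, means applying inference rules to the stored statements until no new statement can be derived, so that queries later see explicit and inferred facts alike. The sketch below is a naive illustration of that idea with two RDFS-style rules (transitivity of rdfs:subClassOf and propagation of rdf:type along it). It is not OWLIM's rule set or implementation; the class name, the string encoding of triples and the sample data are assumptions made for the example, and the code targets a current JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ForwardChainingSketch {

    // Prefixed names are used as plain strings to keep the example short.
    static final String TYPE = "rdf:type";
    static final String SUB = "rdfs:subClassOf";

    /** A triple is a 3-element list (subject, predicate, object), so Set equality works. */
    static List<String> t(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** Applies both rules repeatedly until a pass adds nothing new (a fixpoint). */
    static Set<List<String>> closure(Set<List<String>> explicit) {
        Set<List<String>> all = new LinkedHashSet<>(explicit);
        boolean changed = true;
        while (changed) {
            List<List<String>> fresh = new ArrayList<>();
            for (List<String> a : all) {
                for (List<String> b : all) {
                    // Join: b must be (B subClassOf C) where B is the object of a.
                    if (!b.get(1).equals(SUB) || !a.get(2).equals(b.get(0))) continue;
                    // Rule 1: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
                    // Rule 2: (x type B),       (B subClassOf C) => (x type C)
                    if (a.get(1).equals(SUB) || a.get(1).equals(TYPE)) {
                        fresh.add(t(a.get(0), a.get(1), b.get(2)));
                    }
                }
            }
            changed = all.addAll(fresh); // true only if at least one triple was new
        }
        return all;
    }

    public static void main(String[] args) {
        Set<List<String>> explicit = new LinkedHashSet<>();
        explicit.add(t("ex:Merlot", SUB, "ex:RedWine"));
        explicit.add(t("ex:RedWine", SUB, "ex:Wine"));
        explicit.add(t("ex:bottle1", TYPE, "ex:Merlot"));
        for (List<String> triple : closure(explicit)) System.out.println(triple);
    }
}
```

The nested loop is quadratic per pass and is only meant to show the principle; a production reasoner would index statements and process only newly derived ones.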

Mid-term plans include provision of a client/server version and database integration.

Appendix A. References

[ORDI-Design] A. Kiryakov, D. Ognyanov, and V. Kirov: A Framework for Representing Ontologies Consisting of Several Thousand Concepts Definitions. DIP Project Deliverable D2.2, June 2004. http://dip.semanticweb.org/deliverables/D22ORDIv1.0.pdf

[RDF] G. Klyne, J. J. Carroll (eds): Resource Description Framework (RDF): Concepts and Abstract Syntax. W3C Recommendation 10 February 2004. http://www.w3.org/TR/rdf-concepts/

[RDF/XML] Dave Beckett (editor): RDF/XML Syntax Specification (Revised). W3C Recommendation 10 February 2004. http://www.w3.org/TR/rdf-syntax-grammar/

[WSML0.2] J. de Bruijn, H. Lausen, R. Krummenacher, A. Polleres, L. Predoiu, M. Kifer, D. Fensel: The Web Service Modeling Language WSML. Deliverable d16.1v0.2, WSML, 2005. http://www.wsmo.org/TR/d16/d16.1/v0.2/

[WSMO1.2] D. Roman, H. Lausen, U. Keller (eds); J. de Bruijn, Ch. Bussler, J. Domingue, D. Fensel, M. Hepp, M. Kifer, B. Konig-Ries, J. Kopecky, R. Lara, E. Oren, A. Polleres, J. Scicluna, M. Stollberg: Web Service Modeling Ontology (WSMO). Deliverable d2v1.2, WSMO, 2005. http://www.wsmo.org/TR/d2/v1.2/