DIG 2.0 Proposal for Accessing Told Data

DIG Working Group Note May 2006

This version:
http://www.informatik.uni-ulm.de/ki/Liebig/told-access.html
Authors:
Thorsten Liebig, Ulm University
Anni-Yasmin Turhan, Dresden University of Technology
Olaf Noppens, Ulm University
Timo Weithöner, Ulm University

Abstract

This note proposes an interface for retrieving told KB information within DIG 2.0. The interface is an official extension of DIG and comes in two levels: a set of canned queries at level 1 and a freely definable query using XPath expressions at level 2. Level 1 offers the essential functionality of the told interface, a basic set of queries sufficient for most told-aware applications. Level 2 is a superset of level 1 that provides unrestricted access to any portion of the previously told KB data for applications which need more flexibility than level 1 offers.


Motivation

The specification of DIG identifies the DL reasoner as the server with respect to the stored knowledge. However, a mechanism to retrieve KB definitions from the server, or to distinguish between explicitly given and inferred information, is missing. As a result, every DIG-enabled application is forced to hold and maintain its own model of the KB in parallel. If the next release of DIG provides a rollback/checkpoint mechanism (e.g. as an extension), this is no longer an adequate solution. When rolling back to a previous state of the KB, a client application should at least be able to synchronize its own model by retrieving the explicitly given definitions from the DIG server.

In addition, access to previously told definitions is important for applications that perform some kind of non-standard reasoning on the client side. Current examples are black-box debugging of KBs, explanation of standard reasoning services, computation of the least common subsumer, and various services for concept approximation. Most of these services call the standard reasoning services of the server very frequently in order to compute some non-standard client service. Such a service is significantly easier to implement with access to the previously told KB definitions on the server side than with representations extracted from client models, which are typically very application specific. In other words, many reasoning-intensive tasks on the client side would benefit from direct access to the given KB axioms of the reasoning server, bypassing any client model.

Levels of Representation of Told Data

In the following we distinguish three levels of representation for KB definitions in order to state precisely which level this told data proposal refers to.

  1. The explicitly told KB axioms consist only of previously read data. Note that the implicitly or explicitly introduced primitive definition of each class (or property or individual) is not part of the told information. More precisely, a <defClass URI="A"> does not create a told-relevant <subClass> <class URI="A"/> <class URI="http://dl.kr.org/dig/2.0#top"/> </subClass> axiom.
    The order of the axioms messages, and of the (class, property, or individual) axioms within a message, is irrelevant. As a result, retrieved definitions come in arbitrary order, and that order may change after further inputs.
    There are at least two alternative ways of returning explicitly told axioms:
    i) Axiom Preservation
    Single axioms are conserved as given and must not be rewritten (neither combined nor split up). An axiom is returned exactly as it was told. Axioms must not be pooled even if the combination is equivalence-preserving.
    ii) Axiom Normal Form (Axiom Compression / Axiom Simplification)
    Here the syntax of the original axioms is not considered as important. Instead a certain DL normal form of the given axioms is maintained and returned.
  2. In optimized reasoning systems the told KB axioms usually are pre-processed in order to increase performance. A common pre-processing step is a technique which tries to encode certain general class inclusion axioms (GCIs) into a given class or property definition by using syntactical transformations. This is called absorption and results in mutated class or property definitions (with respect to the explicitly told ones).
  3. The inferred axioms finally are those which have been derived by the reasoning system. Typically they are made explicitly available on demand, i.e. when triggered by some query of the provided ask language of DIG.

This interface proposal for retrieving KB axioms refers to explicitly told axioms under the axiom preservation strategy, i.e. item 1.i) above. None of the other representations is covered by told data retrieval.

Other Aspects

Syntax and Semantics for Accessing Told Data

A told data query is a DIG query within an <asks> root element. A told query can therefore be combined with other asks statements. Each told query results in exactly one <toldResponse> element. Like any other ask response, this tag has an attribute (named id) that uniquely relates queries to results. If a query has no results, the toldResponse element is empty. Otherwise it contains a set of previously told class, property, or individual axioms as specified below.
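
As a rough illustration of how a client might pair told queries with their responses, the following Python sketch builds an <asks> message and indexes the <toldResponse> elements of a reply by their id attribute. The <responses> wrapper element, the omission of XML namespaces, and the helper names build_asks and index_responses are assumptions made for this sketch; only <asks>, the told query elements, <toldResponse>, and the id attribute are taken from this proposal.

```python
import xml.etree.ElementTree as ET


def build_asks(queries):
    """Build an <asks> message from (tag, attributes) pairs.

    Each query receives a unique id so its response can be found later.
    """
    asks = ET.Element("asks")
    for number, (tag, attributes) in enumerate(queries, start=1):
        ET.SubElement(asks, tag, dict(attributes, id=f"tq{number}"))
    return asks


def index_responses(reply_xml):
    """Map each query id to the list of told axioms in its <toldResponse>."""
    reply = ET.fromstring(reply_xml)
    return {resp.get("id"): list(resp) for resp in reply.iter("toldResponse")}


asks = build_asks([
    ("toldNamedClassAxioms", {"URI": "Parent"}),
    ("toldGeneralClassAxioms", {}),
])
request_text = ET.tostring(asks, encoding="unicode")

# Hypothetical reply; the <responses> wrapper is an assumption of this sketch.
reply_text = """
<responses>
  <toldResponse id="tq1">
    <subClass told-ref="ca01">
      <class URI="Parent"/>
      <class URI="PersonWchild"/>
    </subClass>
  </toldResponse>
  <toldResponse id="tq2"/>
</responses>
"""
answers = index_responses(reply_text)
```

An empty <toldResponse> shows up as an empty list, so a client can tell "no told axioms" apart from a missing response.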

The told interface consists of two independent levels. Level one contains a basic set of specific queries which are assumed to occur frequently. Level two consists of a single freely definable query for any kind of told data access. The characterization of the return sets (query semantics) is defined in terms of XML Path Language (XPath) [XPath 1.0] expressions.

The role of XPath is twofold here. First, XPath is used to formally specify the query semantics: XPath expressions address the corresponding query results within the collection of previously told axioms. Since told data access is a purely syntactical retrieval task, this can be done for each query with a union of one or more XPath location paths. Second, within level two XPath itself serves as the specification language for arbitrary, client-definable retrieval queries.

Using XPath to specify the semantics of level one told queries does not imply that an XPath query engine is required for told data access. Any appropriate mechanism for told data retrieval may be used as long as it returns the correct results. However, the freely definable told query of level two obviously requires an XPath-aware DIG repository because it uses plain XPath expressions directly.

The returned axioms are structurally exactly as given in previous axioms messages due to axiom preservation (apart from structurally irrelevant variations such as whitespace and indentation). The only exception is an additional key attribute needed for potential later retraction: each class, property, or individual axiom carries an extra told-ref attribute containing a server-given id. With the help of this id, axioms can be retracted or replaced unambiguously using the retraction extension of DIG/2.0. There is no relationship between the query id and the told id: the same query posed repeatedly typically carries different query ids, but the axioms it returns keep a constant told id. Note that it follows from the above that the basic set of told queries restricts its answers to axiom-level granularity, in contrast to the freely definable queries.

The following section introduces the syntax of all told data queries as well as their semantics. The characterization of the query result sets assumes a told data repository of axioms which complies with the DIG/2.0 XML Schema.
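
The behaviour described above can be pictured with a minimal Python sketch of a server-side told repository: axioms are stored verbatim, and a told-ref attribute is attached only to the copies that are returned. The class name ToldRepository, the id scheme ("t1", "t2", ...), and the retract method (a stand-in for the separate retraction extension, whose syntax is not defined here) are hypothetical.

```python
import copy
import itertools
import xml.etree.ElementTree as ET


class ToldRepository:
    """Keeps told axioms exactly as given and tags answers with a told-ref."""

    def __init__(self):
        self._axioms = {}                   # told-ref -> stored axiom element
        self._counter = itertools.count(1)

    def tell(self, axiom_xml):
        """Store one axiom unchanged and return its server-given told-ref."""
        ref = f"t{next(self._counter)}"     # hypothetical id scheme
        self._axioms[ref] = ET.fromstring(axiom_xml)
        return ref

    def retrieve(self, matches):
        """Return copies of all matching axioms, each carrying its told-ref."""
        result = []
        for ref, axiom in self._axioms.items():
            if matches(axiom):
                answer = copy.deepcopy(axiom)
                answer.set("told-ref", ref)
                result.append(answer)
        return result

    def retract(self, ref):
        """Remove the axiom with the given told-ref."""
        del self._axioms[ref]


repo = ToldRepository()
ref = repo.tell('<transitive><objectProperty URI="hasDescendant"/></transitive>')

# The same query posed twice returns the axiom with the same told-ref.
first = repo.retrieve(lambda axiom: axiom.tag == "transitive")
second = repo.retrieve(lambda axiom: axiom.tag == "transitive")
```

Because answers are deep copies, adding the told-ref attribute never alters the stored axiom itself.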

Told Access Level 1

Class Queries

The following two queries provide access to told class axioms and general class inclusion axioms.

<toldNamedClassAxioms URI="CN" id="id"/>                     (1)
<toldGeneralClassAxioms id="id"/>                            (2)

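Told query (1) returns all known class axioms for the named class whose URI matches the string CN. In particular, these are all those classAxioms of the told repository which meet one of the following conditions:

    i) the axiom is a subClass axiom whose first class argument is the named class CN, or
    ii) the axiom is an equivalentClasses or disjoint axiom with the named class CN among its direct arguments.

In terms of XPath, this informal characterization of query (1) can be phrased as the following expression: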

/axioms/subClass/class[@URI="CN" and position()=1]/..| \
/axioms/*[self::equivalentClasses or self::disjoint]/class[@URI="CN"]/..
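
Since an XPath engine is not required for level one, the selection for query (1) can also be written as a plain tree traversal. The following Python sketch mirrors the XPath expression step by step; the function name told_named_class_axioms and the omission of XML namespaces are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def told_named_class_axioms(axioms, cn):
    """Select the class axioms addressed by query (1) for class URI `cn`."""
    selected = []
    for axiom in axioms:
        classes = axiom.findall("class")    # direct <class> children only
        if axiom.tag == "subClass":
            # class[@URI="CN" and position()=1]: the first <class> child is CN
            hit = bool(classes) and classes[0].get("URI") == cn
        elif axiom.tag in ("equivalentClasses", "disjoint"):
            # class[@URI="CN"]: CN occurs among the direct arguments
            hit = any(c.get("URI") == cn for c in classes)
        else:
            hit = False
        if hit:
            selected.append(axiom)
    return selected


axioms = ET.fromstring("""
<axioms>
  <subClass told-ref="ca01">
    <class URI="Parent"/><class URI="PersonWchild"/>
  </subClass>
  <subClass told-ref="ca02">
    <class URI="PersonWchild"/><class URI="Parent"/>
  </subClass>
  <disjoint told-ref="ca05">
    <class URI="NonParent"/><class URI="Parent"/>
    <complementOf><class URI="LivingBeing"/></complementOf>
  </disjoint>
  <equivalentClasses told-ref="ca07">
    <someValuesFrom><objectProperty URI="child"/><class URI="Person"/></someValuesFrom>
    <intersectionOf><class URI="Parent"/><class URI="Person"/></intersectionOf>
  </equivalentClasses>
</axioms>
""")

parent_refs = [a.get("told-ref") for a in told_named_class_axioms(axioms, "Parent")]
```

Here parent_refs contains ca01 and ca05: ca02 is skipped because Parent is not its first class argument, and ca07 because Parent only occurs nested inside an intersectionOf.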

Told query (2) returns all general class inclusion axioms (GCIs) of the KB. These are all classAxioms that have no named class at the first level of their arguments. Syntactically, the set of all class axioms consists of the set of all named class axioms (of all named class URIs) and the set of general class axioms.
Technically, the answer set of query (2) is characterized as follows:

/axioms/subClass/*[not(self::class) and position()=1]/..| \
/axioms/*[(self::equivalentClasses or self::disjoint) and count(class)=0]
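
The same selection can be sketched in Python without an XPath engine. The function name told_general_class_axioms is hypothetical, XML namespaces are left out for brevity, and the axiom with told-ref gx01 is an invented sample that does not belong to the example of this note.

```python
import xml.etree.ElementTree as ET


def told_general_class_axioms(axioms):
    """Select the general class inclusion axioms addressed by query (2)."""
    selected = []
    for axiom in axioms:
        children = list(axiom)
        if axiom.tag == "subClass":
            # *[not(self::class) and position()=1]: first argument is complex
            hit = bool(children) and children[0].tag != "class"
        elif axiom.tag in ("equivalentClasses", "disjoint"):
            # count(class)=0: no named class among the direct arguments
            hit = axiom.find("class") is None
        else:
            hit = False
        if hit:
            selected.append(axiom)
    return selected


axioms = ET.fromstring("""
<axioms>
  <subClass told-ref="ca01">
    <class URI="Parent"/><class URI="PersonWchild"/>
  </subClass>
  <equivalentClasses told-ref="ca06">
    <intersectionOf><class URI="Person"/><class URI="LivingBeing"/></intersectionOf>
    <class URI="Human"/>
  </equivalentClasses>
  <equivalentClasses told-ref="ca07">
    <someValuesFrom><objectProperty URI="child"/><class URI="Person"/></someValuesFrom>
    <intersectionOf><class URI="Parent"/><class URI="Person"/></intersectionOf>
  </equivalentClasses>
  <subClass told-ref="gx01">
    <someValuesFrom><objectProperty URI="child"/><class URI="Person"/></someValuesFrom>
    <class URI="Parent"/>
  </subClass>
</axioms>
""")

gci_refs = [a.get("told-ref") for a in told_general_class_axioms(axioms)]
```

With this input gci_refs holds ca07 and gx01, while ca06 is excluded because it names the class Human directly.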

To sum up, the response to each of the two queries consists of exactly one <toldResponse> element, which is either an empty tag or contains one or more subClass, equivalentClasses, and disjoint elements addressed by the XPath expressions above. Note that the elements within a told response may occur in arbitrary order.

Property Queries

We propose to distinguish between named property definitions and general property inclusion axioms (queries for general property inclusion axioms would suffice but make parsing more complex). The next two queries are needed for object as well as datatype property axioms of named property definitions.

<toldNamedObjectPropertyAxioms URI="RN" id="id"/>            (3)
<toldNamedDataPropertyAxioms URI="RN" id="id"/>              (4)

The named property axioms correspond to the propertyAxioms of the DIG/2.0 Schema specification where the first property reference (the left-hand side of the axiom) is a named property.
In terms of an XPath expression, query (3) is characterized by the following:

/axioms/subProperty/objectProperty[@URI="RN" and position()=1]/..| \
/axioms/*[self::equivalentProperties or \
          self::disjointProperties]/objectProperty[@URI="RN"]/..| \
/axioms/*[self::domain or self::range or self::transitive or \
          self::inverseFunctional or self::symmetric or \
          self::antiSymmetric or self::reflexive or \
          self::irreflexive or self::functional or \
          self::leftIdentity or self::rightIdentity]/ \
                objectProperty[@URI="RN" and position()=1]/..

The corresponding XPath expression for named data properties, query (4), is:

/axioms/subProperty/dataProperty[@URI="RN" and position()=1]/..| \
/axioms/*[self::equivalentProperties or \
          self::disjointProperties]/dataProperty[@URI="RN"]/..| \
/axioms/*[self::domain or self::range or self::functional]/ \
                dataProperty[@URI="RN" and position()=1]/..
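
Queries (3) and (4) differ only in the property kind and in the list of admissible axiom elements, so a traversal-based implementation can share one routine. The Python sketch below assumes un-namespaced element names; the function name told_named_property_axioms and the table LHS_AXIOMS are illustrative and not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Axiom elements whose first property argument must be the named property,
# copied from the XPath expressions for queries (3) and (4).
LHS_AXIOMS = {
    "objectProperty": {"subProperty", "domain", "range", "transitive",
                       "inverseFunctional", "symmetric", "antiSymmetric",
                       "reflexive", "irreflexive", "functional",
                       "leftIdentity", "rightIdentity"},
    "dataProperty": {"subProperty", "domain", "range", "functional"},
}


def told_named_property_axioms(axioms, rn, kind):
    """`kind` is "objectProperty" for query (3), "dataProperty" for query (4)."""
    selected = []
    for axiom in axioms:
        props = axiom.findall(kind)         # direct children of that kind
        if axiom.tag in LHS_AXIOMS[kind]:
            hit = bool(props) and props[0].get("URI") == rn
        elif axiom.tag in ("equivalentProperties", "disjointProperties"):
            hit = any(p.get("URI") == rn for p in props)
        else:
            hit = False
        if hit:
            selected.append(axiom)
    return selected


axioms = ET.fromstring("""
<axioms>
  <functional told-ref="pa01"><dataProperty URI="age"/></functional>
  <transitive told-ref="pa02"><objectProperty URI="hasDescendant"/></transitive>
  <domain told-ref="pa03">
    <objectProperty URI="child"/>
    <unionOf><class URI="Human"/><class URI="Person"/></unionOf>
  </domain>
  <subProperty told-ref="pa04">
    <objectProperty URI="child"/><objectProperty URI="hasDescendant"/>
  </subProperty>
  <equivalentProperties told-ref="pa05">
    <inverse><objectProperty URI="hasGrandparent"/></inverse>
    <propertyComposition>
      <objectProperty URI="child"/><objectProperty URI="child"/>
    </propertyComposition>
  </equivalentProperties>
</axioms>
""")


def refs(hits):
    return [a.get("told-ref") for a in hits]


child_refs = refs(told_named_property_axioms(axioms, "child", "objectProperty"))
age_refs = refs(told_named_property_axioms(axioms, "age", "dataProperty"))
```

For the property axioms of the example in this note, child_refs is pa03 and pa04, and age_refs is pa01.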

The following query retrieves general property inclusion axioms, i.e. axioms over complex property expressions such as inverse or composition instead of property names:

<toldGeneralPropertyAxioms id="id"/>                         (5)

The response to query (5) is the told data addressed by the following:

/axioms/*[self::subProperty or self::domain or self::range or \
          self::functional or self::inverseFunctional or self::symmetric \
          or self::antiSymmetric or self::reflexive or self::irreflexive]/ \
          *[position()=1 and (self::inverse or self::propertyComposition \
                              or self::chain)]/..| \
/axioms/*[(self::equivalentProperties or self::disjointProperties) and \
          count(dataProperty)=0 and count(objectProperty)=0]

Individual Queries

The next query is needed to retrieve axioms about individuals.

<toldIndividualAxioms URI="IN" id="id"/>                     (6)

The result set contains all explicit class (or class expression) as well as property (or property expression) instantiations. The corresponding XPath characterization of the return set is the following:

/axioms/*[self::instanceOf or self::value or \
          self::notValue]/individual[@URI="IN" and position()=1]/..| \
/axioms/*[self::same or self::different]/*[@URI="IN"]/..
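
A traversal-based reading of query (6) could look like the following Python sketch. The function name told_individual_axioms and the told-ref values i1 to i4 are made up for illustration, element names are used without namespaces, and the sample data follows the element names of the XPath expression above.

```python
import xml.etree.ElementTree as ET


def told_individual_axioms(axioms, iname):
    """Select the individual axioms addressed by query (6)."""
    selected = []
    for axiom in axioms:
        if axiom.tag in ("instanceOf", "value", "notValue"):
            # individual[@URI="IN" and position()=1]: IN is the first individual
            individuals = axiom.findall("individual")
            hit = bool(individuals) and individuals[0].get("URI") == iname
        elif axiom.tag in ("same", "different"):
            # *[@URI="IN"]: IN occurs anywhere among the arguments
            hit = any(child.get("URI") == iname for child in axiom)
        else:
            hit = False
        if hit:
            selected.append(axiom)
    return selected


axioms = ET.fromstring("""
<axioms>
  <instanceOf told-ref="i1">
    <individual URI="John"/><class URI="Human"/>
  </instanceOf>
  <value told-ref="i2">
    <individual URI="John"/><objectProperty URI="child"/><individual URI="Sue"/>
  </value>
  <value told-ref="i3">
    <individual URI="Betty"/><objectProperty URI="child"/><individual URI="Sue"/>
  </value>
  <different told-ref="i4">
    <individual URI="Sue"/><individual URI="John"/>
  </different>
</axioms>
""")

john_refs = [a.get("told-ref") for a in told_individual_axioms(axioms, "John")]
sue_refs = [a.get("told-ref") for a in told_individual_axioms(axioms, "Sue")]
```

Note that sue_refs only contains i4: in i2 and i3 Sue is the filler, not the first individual, so those assertions are not part of her result set.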

Additionally, as a counterpart to the individual queries of the inferred data interface, we propose the following queries, whose results would otherwise have to be extracted from the result sets of the queries above:

<toldInstances URI="CN" id="id"/>                            (7)
<toldIndividualInstance URI="IN" id="id"/>                   (8)
<toldIndividualFiller I-URI="IN" R-URI="RN" id="id"/>        (9)
<toldIndividualsRelated I1-URI="IN1" I2-URI="IN2" id="id"/> (10)
<toldIndividualFillerOf URI="IN" id="id"/>                  (11)
<toldIndividualFillerOfProperty I-URI="IN" R-URI="RN" id="id"/>   (12)
<toldRelated URI="RN" id="id"/>                             (13)

Query (7) returns all individual axioms which state that an individual is told to be an instance of the class named CN:

/axioms/instanceOf/class[@URI="CN"]/..

The result of (8) is the sub-set of query (6) about class instantiations:

/axioms/instanceOf/individual[@URI="IN"]/..

Told query (9) is a selector which returns those relation instantiations that have been told about an individual with respect to a particular relation (object as well as datatype):

/axioms/*[self::value or self::notValue]/individual[@URI="IN" and \
            position()=1 and (following-sibling::objectProperty[@URI="RN"] or \
                              following-sibling::dataProperty[@URI="RN"])]/..

All told relation instantiations between individual IN1 (origin) and IN2 (filler) will be returned as result of query (10):

/axioms/*[self::value or self::notValue]/individual[@URI="IN1" and \
            position()=1 and following-sibling::individual[@URI="IN2"]]/..

Query (11) selects all those relation instantiations where an individual is told as a filler of some property:

/axioms/*[self::value or self::notValue]/individual[position()=2 and @URI="IN"]/..

Query (12) is a specialization of (11) due to a restriction on a particular property:

/axioms/*[self::value or self::notValue]/objectProperty[@URI="RN" and \
                               following-sibling::individual[@URI="IN"]]/..

Finally, query (13) returns all told relation instantiations of a given property:

/axioms/*[self::value or self::notValue]/*[self::objectProperty[@URI="RN"] \
                                           or self::dataProperty[@URI="RN"]]/..
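
To make the selectors for relation instantiations more tangible, the following Python sketch implements queries (9), (10), and (11) as tree traversals. It reads "filler" in query (11) as the second individual of an assertion. All function names and the told-ref values x1 to x3 are hypothetical, and namespaces are omitted.

```python
import xml.etree.ElementTree as ET

ASSERTIONS = ("value", "notValue")


def split_at_subject(axiom):
    """Return the first <individual> child and the siblings that follow it."""
    children = list(axiom)
    for index, child in enumerate(children):
        if child.tag == "individual":
            return child, children[index + 1:]
    return None, []


def told_individual_filler(axioms, iname, rname):
    """Query (9): assertions about `iname` that use property `rname`."""
    hits = []
    for axiom in axioms:
        subject, rest = split_at_subject(axiom)
        if (axiom.tag in ASSERTIONS and subject is not None
                and subject.get("URI") == iname
                and any(s.tag in ("objectProperty", "dataProperty")
                        and s.get("URI") == rname for s in rest)):
            hits.append(axiom)
    return hits


def told_individuals_related(axioms, origin, filler):
    """Query (10): assertions relating `origin` to the individual `filler`."""
    hits = []
    for axiom in axioms:
        subject, rest = split_at_subject(axiom)
        if (axiom.tag in ASSERTIONS and subject is not None
                and subject.get("URI") == origin
                and any(s.tag == "individual" and s.get("URI") == filler
                        for s in rest)):
            hits.append(axiom)
    return hits


def told_individual_filler_of(axioms, iname):
    """Query (11): assertions in which `iname` is the second <individual>."""
    hits = []
    for axiom in axioms:
        individuals = axiom.findall("individual")
        if (axiom.tag in ASSERTIONS and len(individuals) > 1
                and individuals[1].get("URI") == iname):
            hits.append(axiom)
    return hits


axioms = ET.fromstring("""
<axioms>
  <value told-ref="x1">
    <individual URI="John"/><objectProperty URI="child"/><individual URI="Sue"/>
  </value>
  <value told-ref="x2">
    <individual URI="Betty"/><objectProperty URI="child"/><individual URI="Sue"/>
  </value>
  <value told-ref="x3">
    <individual URI="John"/><dataProperty URI="age"/><dataLiteral>40</dataLiteral>
  </value>
</axioms>
""")


def refs(hits):
    return [a.get("told-ref") for a in hits]


john_child = refs(told_individual_filler(axioms, "John", "child"))
john_sue = refs(told_individuals_related(axioms, "John", "Sue"))
sue_filler = refs(told_individual_filler_of(axioms, "Sue"))
```

For this sample, john_child and john_sue both contain only x1, whereas sue_filler contains x1 and x2.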

Told Access Level 2

Freely Definable XPath Query

The queries above suffice to retrieve all previously told data. For example, one could first retrieve all named classes, properties, and individuals (with the help of the standard queries), then their given definitions, and finally all general class and property axioms.
However, for large KBs with a huge number of axioms this approach may be too coarse and time-consuming. For example, in order to retrieve all transitive properties one needs to pose a query for each property individually. Level two consists of only one told query, which allows the DIG user to freely define the result set in terms of an XPath expression:

<toldQuery id="id">
  XPath query expression
</toldQuery>                                                 (14)

The result set simply consists of the answer of the XPath engine and is not restricted to answers of axiom granularity.

The task of retrieving the URIs of all transitive properties (not the corresponding axioms) could then be specified efficiently with the following query:

<toldQuery id="tq1">
  /axioms/transitive/objectProperty/@URI
</toldQuery>
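
To see what this query addresses, the result can be emulated in Python. The standard library module xml.etree.ElementTree supports only a subset of XPath and cannot return attribute nodes, so the sketch below selects the elements with a path and applies the final @URI step in Python; a real level two repository would need a full XPath 1.0 engine. The sample data is taken from the property axioms of the example in this note, without namespaces.

```python
import xml.etree.ElementTree as ET

axioms = ET.fromstring("""
<axioms>
  <transitive told-ref="pa02">
    <objectProperty URI="hasDescendant"/>
  </transitive>
  <subProperty told-ref="pa04">
    <objectProperty URI="child"/>
    <objectProperty URI="hasDescendant"/>
  </subProperty>
</axioms>
""")

# Emulates /axioms/transitive/objectProperty/@URI: the element path is
# evaluated relative to the <axioms> root, the attribute step is done by hand.
transitive_uris = [prop.get("URI")
                   for prop in axioms.findall("./transitive/objectProperty")]
```

The answer consists of attribute values rather than whole axioms, which illustrates that level two results are not restricted to axiom granularity.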

Open Issues

Prototype tests revealed the following problems:

RACER
Problem 1: accessing told data without an explicit prior call of (classify-tbox) yields an undefined concept/role error.
Problem 2: if a KB violates constraints such as at-most restrictions on transitive roles (or on roles with transitive sub-roles), Racer reports an error when told information is retrieved. The told information then remains inaccessible until the KB is reset.

Example

Consider the following DIG/2.0 "axioms" (for the sake of identifying the result sets we have added server given told keys to all class, property and individual axioms):

<axioms xmlns="http://dl.kr.org/dig/lang/schema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://dl.kr.org/dig/lang/schema http://homepages.cs.manchester.ac.uk/~seanb/dig/schema.xsd"
        uri="http://www.informatik.uni-ulm.de/ki/Liebig/told-example.dig">

  <defClass URI="Human"/>
  <defClass URI="Man"/>
  <defClass URI="Parent"/>
  <defClass URI="PersonWchild"/>
  <defClass URI="LivingBeing"/>
    
  <defDataProperty URI="age"/>
  <defObjectProperty URI="child"/>
  <defObjectProperty URI="hasDescendant"/>
  <defObjectProperty URI="hasGrandparent"/>

  <functional told-ref="pa01">
      <dataProperty URI="age"/>
  </functional>
    
  <transitive told-ref="pa02">
      <objectProperty URI="hasDescendant"/>
  </transitive>
    
  <domain told-ref="pa03">
      <objectProperty URI="child"/>
      <unionOf>
          <class URI="Human"/>
          <class URI="Person"/>
      </unionOf>
  </domain>
    
  <subProperty told-ref="pa04">
      <objectProperty URI="child"/>
      <objectProperty URI="hasDescendant"/>
  </subProperty>

  <equivalentProperties told-ref="pa05">
      <inverse>
          <objectProperty URI="hasGrandparent"/>
      </inverse>
      <propertyComposition>
          <objectProperty URI="child"/>
          <objectProperty URI="child"/>
      </propertyComposition>
  </equivalentProperties>
    
  <subClass told-ref="ca01">
      <class URI="Parent"/>
      <class URI="PersonWchild"/>
  </subClass>
    
  <subClass told-ref="ca02">
      <class URI="PersonWchild"/>
      <class URI="Parent"/>
  </subClass>
    
  <equivalentClasses told-ref="ca03">
      <class URI="Human"/>
      <class URI="Man"/>
  </equivalentClasses>
    
  <equivalentClasses told-ref="ca04">
      <class URI="Parent"/>
      <someValuesFrom>
          <objectProperty URI="child"/>
          <class URI="Person"/>
      </someValuesFrom>
  </equivalentClasses>
    
  <disjoint told-ref="ca05">
      <class URI="NonParent"/>
      <class URI="Parent"/>
      <complementOf>
          <class URI="LivingBeing"/>
      </complementOf>
  </disjoint>
    
  <equivalentClasses told-ref="ca06">
      <intersectionOf>
          <class URI="Person"/>
          <class URI="LivingBeing"/>
      </intersectionOf>
      <class URI="Human"/>
  </equivalentClasses>
    
  <equivalentClasses told-ref="ca07">
      <someValuesFrom>
          <objectProperty URI="child"/>
          <class URI="Person"/>
      </someValuesFrom>
      <intersectionOf>
          <class URI="Parent"/>
          <class URI="Person"/>
      </intersectionOf>
  </equivalentClasses>
    
  <defIndividual URI="John"/>
  <defIndividual URI="Sue"/>
  <defIndividual URI="Betty"/>
    
  <instanceOf told-ref="ia01">
      <individual URI="John"/>
      <class URI="Human"/>
  </instanceOf>
    
  <related told-ref="ia02">
      <individual URI="John"/>
      <objectProperty URI="child"/>
      <individual URI="Sue"/>
  </related>
    
  <related told-ref="ia03">
      <individual URI="Betty"/>
      <objectProperty URI="child"/>
      <individual URI="Sue"/>
  </related>
    
  <different told-ref="ia04">
      <individual URI="Sue"/>
      <individual URI="John"/>
      <individual URI="Betty"/>
  </different>
    
  <value told-ref="ia05">
      <individual URI="John"/>
      <dataProperty URI="age"/>
      <dataLiteral>40</dataLiteral>
  </value>
    
  <value told-ref="ia06">
      <individual URI="John"/>
      <dataProperty URI="age"/>
      <dataLiteral>35</dataLiteral>
  </value>
    
</axioms>

The following table lists the 13 proposed told queries together with their corresponding result sets in terms of told-ref identifiers:

Told query                                                             Axiom return set
<toldNamedClassAxioms URI="Parent" id="tq1"/>                          ca01, ca04, ca05
<toldGeneralClassAxioms id="tq2"/>                                     ca07
<toldNamedObjectPropertyAxioms URI="child" id="tq3"/>                  pa03, pa04
<toldNamedDataPropertyAxioms URI="age" id="tq4"/>                      pa01
<toldGeneralPropertyAxioms id="tq5"/>                                  pa05
<toldIndividualAxioms URI="John" id="tq6"/>                            ia01, ia02, ia04, ia05, ia06
<toldInstances URI="Human" id="tq7"/>                                  ia01
<toldIndividualInstance URI="John" id="tq8"/>                          ia01
<toldIndividualFiller I-URI="John" R-URI="child" id="tq9"/>            ia02
<toldIndividualsRelated I1-URI="John" I2-URI="Sue" id="tq10"/>         ia02
<toldIndividualFillerOf URI="Sue" id="tq11"/>                          ia02, ia03
<toldIndividualFillerOfProperty I-URI="Sue" R-URI="child" id="tq12"/>  ia02, ia03
<toldRelated URI="child" id="tq13"/>                                   ia02, ia03

References

[DIG1.1]
Sean Bechhofer. The DIG Description Logic Interface: DIG/1.1. http://dl-web.man.ac.uk/dig/2003/02/interface.pdf
[Dickinson]
Ian Dickinson. Implementation Experience with the DIG 1.1 Specification. http://www.hpl.hp.com/techreports/2004/HPL-2004-85.html
[XPath 1.0]
XML Path Language (XPath) Version 1.0 http://www.w3.org/TR/xpath
(or see http://en.wikipedia.org/wiki/XPath)