Towards a Web Object Model
Frank Manola
Object Services and Consulting, Inc. (OBJS)
fmanola@objs.com
10 February 1998
Abstract
Today, the World Wide Web is a global information
repository of resources primarily consisting of syntactically-structured
HTML documents and MIME-typed files. These relatively unstructured data
models do not provide the foundation for command and control situation
modeling or enterprise computing, or for a new generation of tools to operate
on a more semantically-structured, knowledge-based web. Richer base data
model(s) are needed that converge the benefits of emerging Web structuring
mechanisms and distributed object service architectures.
A number of ongoing activities are attempting to merge aspects of object
models with those of the World Wide Web. This paper describes a number
of these activities, with particular emphasis on those which focus on providing
enhanced facilities for representing metadata for describing Web (and other)
resources. The intent of this paper is to:
- describe key examples of existing work from the Web, database, and
OMG communities that contribute both ideas and technology toward providing
the components of a Web object model
- identify some key underlying principles behind this work
- identify a framework which allows this work to be unified and extended
to support the requirements of advanced Web applications for object technology
Contents
- 1. Introduction
- 1.1 Background
- 1.2 Capabilities Provided by an Object Service Architecture
- 1.3 Increasing the Structuring Power of the Web
- 2. Relevant Work
- 2.1 Structured Data Representations and "Lightweight Object Models"
- 2.1.1 Summary Object Interchange Format (SOIF)
- 2.1.2 Object Exchange Model (OEM)
- 2.1.3 Knowledge Interchange Format (KIF)
- 2.1.4 Extensible Markup Language (XML)
- 2.2 Higher-Level Models and Metadata
- 2.2.1 Dublin Core
- 2.2.2 Warwick Framework
- 2.2.3 PICS and PICS-NG
- 2.2.4 XML-Data
- 2.2.5 Meta Content Framework (MCF)
- 2.2.6 Resource Description Framework (RDF)
- 2.3 Adding Behavior to Web Pages
- 2.3.1 Document Object Model (DOM)
- 2.3.2 Embedded Objects
- 2.3.3 Web Interface Definition Language
- 2.4 Related OMG Technologies
- 2.4.1 OMG Property Service
- 2.4.2 Tagged Data Facility
- 3. Building a Web Object Model
- 3.1 Integration Approach
- 3.2 Discussion
- 3.3 Formal Principles
- 3.3.1 Logic Basis
- 3.3.2 Representation of Higher Level Semantics
- 3.3.3 Object Logics
- 4. Conclusions
- References
1. Introduction
1.1 Background
Many business and governmental organizations are planning or developing
enterprise-wide, open distributed computing architectures to support their
operational information processing requirements. Such architectures generally
employ distributed object middleware technology, such as the Object Management
Group's (OMG's) Common Object Request Broker Architecture (CORBA) [OMG95],
as a basic infrastructure.
The use of objects in such architectures reflects the fact that advanced
software development increasingly involves the use of object technology.
This includes the use of object-oriented programming languages, class libraries
and application development frameworks, application integration technology
such as Microsoft's OLE, as well as distributed object middleware such
as CORBA. It also involves the use of object analysis and design methodologies
and associated tools.
This use of object technology is driven by a number of factors, including:
- the desire to build software from reusable components
- the desire for software to more directly and more completely reflect
enterprise concepts, rather than information technology concepts
- the need to support enterprise processes that involve legacy information
systems
- the inclusion of object concepts and facilities in key software products
by major software vendors
The first two factors reflect requirements for business systems to be
rapidly and cheaply developed or adapted to reflect changes in the enterprise
environment, such as new services, altered internal processes, or altered
customer, supplier, or other partner relationships. Object technology provides
mechanisms, such as encapsulation and inheritance, that have the potential
to support more rapid and flexible software development, higher levels
of reuse, and the definition of software artifacts that more directly model
enterprise concepts.
The third factor reflects a situation faced by many large organizations,
in which a key issue is not just the development of new software, but the
coordination of existing software that supports key internal processes
and human activities. Mechanisms provided by object technology can help
encapsulate existing systems, and unify them into higher-level processes.
The fourth factor is particularly important. It reflects the fact that,
as commercial software vendors incorporate object concepts in key products,
it will become more and more difficult to avoid using object technology.
This is illustrated by the rapid pace at which object technology is being
included in software such as DBMSs (including relational DBMSs) and other
middleware, and client/server development environments. Due to this factor,
organizations may be influenced to begin adopting object technology before
they would ordinarily consider doing so.
At the same time, the Internet is becoming an increasingly important factor
in planning for enterprise distributed computing environments. For example,
companies are providing information via World Wide Web pages, as well as
customer access via the Internet to such enterprise computing services
as on-line ordering or order/service tracking facilities. Companies are
also using Internet technology to create private Intranets, providing
access to enterprise data (and, potentially, services) from throughout
the enterprise in a way that is convenient and avoids proprietary network
technology. Following this trend, software vendors are developing software
to allow Web browsers to act as user interfaces to enterprise computing
systems, e.g., to act as clients in workflow or general client/server systems.
Products have also been developed that link mainframes to Web pages (e.g.,
translating conventional terminal sessions into HTML pages).
Organizations perceive a number of advantages in using the Web in enterprise
computing. For example, Web browser software is widely available for most
client platforms, and is cheaper than most alternative client applications.
Web pages generally work reasonably well with a variety of browsers, and
maintenance is simpler since the browser and associated software can reduce
the amount of distributed software to be managed. In addition, the Web
provides a representation for information which
- supports interlinking of all kinds of content (text, voice, video,
etc.)
- is easy for end-users to access
- is easy to create content for using widely-available tools
However, as organizations have attempted to employ the Web in increasingly sophisticated
applications, these applications have begun to overlap in complexity the
sorts of distributed applications for which architectures such as OMG's
CORBA, and its surrounding Object Management Architecture (OMA) [OMG97]
were originally intended. Since the Web was not originally designed to
support such applications, Web application development efforts increasingly
run into limitations of the basic Web infrastructure. As a result, numerous
efforts are being made to enhance Web capabilities, to enable them to support
these more complex applications. In order to understand the missing elements,
it is useful to look at the components of OMG's OMA.
1.2 Capabilities Provided by an Object Service Architecture
There is increasing agreement that modeling a distributed system as
a distributed collection of interacting objects provides the appropriate
framework for use in integrating heterogeneous, autonomous, and distributed
(HAD) computing resources. Objects form a natural model for a distributed
system because, like objects, distributed components can only communicate
with each other using messages addressed to well-defined interfaces, and
components are assumed to have their own locally-defined procedures enabling
them to respond to messages sent to them. Objects accommodate the heterogeneous
aspects of such systems because messages sent to distributed components
depend only on the component interfaces, not on the internals of the components.
Objects accommodate the autonomous aspects of such systems because components
may change independently and transparently, provided their interfaces are
maintained. These characteristics allow objects to be used both in the
development of new components, and for encapsulating access to legacy components.
In addition, because object-oriented implementations bundle data with related
operations in modular units, the use of objects provides the possibility
of fine-grained tuning in the computing architecture by moving or copying
objects to appropriate nodes of the network (this is becoming increasingly
feasible with the development of technology such as Sun's Java).
OMG's Object Management Architecture (OMA) is an example of a distributed
object architecture intended to support distributed enterprise computing
applications. The OMA includes the following components:
- A global object model to define how the heterogeneous resources
that make up the system can be modeled as objects. In the OMA, this global
object model is defined by the CORBA Interface Definition Language
(IDL).
- The Object Request Broker (ORB), an object messaging backplane
that enables distributed objects to transparently send and receive requests
and responses.
- Object Services, which support basic functions for using and
implementing objects, and are likely to be used in any object-based program.
Examples include support for queries, transactions, and event notification.
- Common Facilities, which provide end-user oriented capabilities
useful across multiple application domains, such as compound document and
workflow facilities.
- Domain Objects, which are likely to be used only in specific
vertical application domains, such as telecommunications or manufacturing.
- Application Objects, which are built specifically for a particular
application.
These components provide multiple levels of capabilities in support
of developing complex distributed applications.
The ORB in the OMA is defined by the CORBA specifications. An ORB does
not require that the objects it supports be implemented in an object-oriented
programming language. The CORBA architecture defines interfaces for connecting
code and data to form object implementations, with interfaces defined
by IDL, that are managed by the ORB and its supporting object services.
It is this flexibility that enables ORBs to be used in connecting legacy
systems and data together as components in enterprise computing architectures.
A distributed enterprise object system must provide functionality beyond
that of simply delivering messages between objects. OMG's Object Services
have been defined to address some of these requirements. Object Services
provide the next level of structure above the basic object messaging support
provided by CORBA. The services define specific types of objects (or interfaces)
and relationships between them in order to support higher-level capabilities.
Object Services currently defined by OMG include, among others:
- Concurrency Control Service
- Life Cycle Services
- Event Notification Service
- Query Service
- Persistent Object Service
- Relationship Service
- Naming Service
- Transaction Service
Taken together, OMG Object Services provide services for ORB-accessible
objects similar to those that an Object DBMS (ODBMS) provides for objects
in an object database (queries, transactions, etc.). The Object Services,
together with the basic connectivity provided by the ORB, turn the collection
of network-accessible objects into a unified shared object space,
accessible by any ORB client application. Managing the collection of ORB-accessible
objects thus becomes a generalized form of "object database management",
with the ORB being part of the internal implementation of what is effectively
an ODBMS. Viewed in this way, the OMA provides a powerful object-oriented
infrastructure for the development of general-purpose applications, just
as an enterprise database and its associated DBMS provide such an infrastructure
for the development of general-purpose enterprise applications. Additional
levels of organization are also needed. These additional levels are where
OMG's Common Facilities, Application, and Domain Objects, as well as still
higher level concepts, come into play [MGHH+97].
If the Web is to be used as the basis of complex enterprise applications,
it must provide generic capabilities similar to those provided by the OMA,
although these may need to be adapted to the more open, flexible nature
of the Web. Providing these capabilities involves addressing not only the
provision of higher level services and facilities for the Web, but also
the suitability of the basic data structuring capabilities provided by
the Web (its "object model"). For example, in the case of services,
search engines (a form of query service) are becoming indispensable tools,
and agent technology can add additional intelligence to the searching process.
Similarly, extended facilities to support transactions over the Web are
being investigated. However, the ability to define and apply powerful generic
services in the Web, and the ability to generally use the Web to support
complex applications, depends crucially on the ability of the Web's underlying
data structure to support these complex applications and services.
1.3 Increasing the Structuring Power of the Web
The basic data structure of the Web consists of hyperlinked HTML documents.
It is generally recognized that HTML is too simple a data structure to
support complex enterprise applications. For example, Jon Bosak's XML,
Java, and the Future of the Web [Bos97] identifies a number of key
limitations of HTML:
- Extensibility: HTML does not allow users to specify their own
tags or attributes in order to help identify the semantic significance
of data (e.g., to identify that a particular text string represents the
title of a document, or the customer placing an order).
- Structure: HTML does not support the specification of deep structures
needed to represent, e.g., database schemas or object-oriented hierarchies.
- Validation: HTML does not support the kind of language specification
that allows client applications to check data for structural validity on
loading the data, e.g., data that represents fixed structured forms or
database records
These limitations severely affect the ability to develop advanced applications
using HTML, including:
- applications that require the Web client to function as the front-end
to enterprise applications or mediate between multiple heterogeneous databases,
- applications that require more flexibility in distributing processing
load between Web servers and clients, and
- applications that require the Web client to present different views
of the same data to different users, or in which intelligent Web agents
need to tailor information discovery to the needs of individual users.
Proprietary HTML extensions have been developed to address some of these
problems, but none deals with all of them, and together they create barriers
to interoperability. The same is true of the proprietary data formats used
by particular applications. Their use requires specialized helper applications,
plug-ins, or Java applets, creating interoperability problems, and difficulty
in reusing that data in different applications for new purposes. While
use of some specialized formats is necessary in particular applications
(e.g., multimedia), in many cases these formats are just used to address
the deficiencies of HTML for generalized document and data processing.
A more fundamental direction of efforts to address HTML limitations
has been attempts to integrate aspects of object technology with the basic
infrastructure of the Web. There are a number of reasons for the interest
in integrating Web and object technologies:
- The Web, even in its current form, can be viewed as a simple form of
distributed object system, with a particularly simple object model. In
this model, HTML pages are considered as objects (actually, object state),
having identity provided by URLs, and methods defined by, or that are invoked
via, HTTP servers. The methods supported by HTTP servers are extensible,
and HTTP supports negotiation to find out what they are (even though GET,
PUT, and POST are the only methods generally used). The basic resemblance
of the Web to a simple object system has created a natural interest in
seeing how far the resemblance can be further developed. The World Wide
Web Consortium (W3C) HTTP-NG
activity <http://www.w3.org/Protocols/HTTP-NG/> is attempting to
do this at the protocol level by developing a new architecture for the
HTTP protocol based on a simple, extensible, distributed object-oriented
model.
- Object technology is seen as a particularly convenient way of adding
functionality (e.g., behavior) to the Web, both by adding the behavior
provided by objects to the static content of HTML, and by allowing Web
clients and servers, through distributed object technology, to access other
computing resources. For example:
- Web pages can be used as convenient carriers or containers
for objects in various models, e.g., Java or ActiveX objects. In this approach,
objects are added to the conventional static content of Web pages. The
pages provide a vehicle for transmitting the objects between server and
client. Once on the client, the objects can then execute. In some cases,
the client objects then interact with server objects, possibly using a
different protocol, e.g., OMG's IIOP or Java's RMI. While this was originally
supported by proprietary extensions, HTML specifications now include support
for the <APPLET> tag, and the recently-adopted HTML 4.0 specification
includes a more general <OBJECT> tag (see Section 2.3).
- Web pages can be treated as objects with methods that execute on HTTP
clients. Dynamic HTML developments by Microsoft
<http://www.microsoft.com/sitebuilder/workshop/author/dhtml/> and
Netscape are examples of this approach. Current work by the W3C on a Document
Object Model <http://www.w3.org/TR/WD-DOM/> is attempting to
extend these ideas to include even more powerful facilities (see Section
2.3). What is being proposed is an object model that allows the HTML document,
together with its contents (its collection of elements and attributes),
to be treated as a collection of programmable objects. Client-side code
(scripts or code contained in the document, or plug-ins or other code which
accesses the document through the client) will be allowed to access these
objects, and manipulate them dynamically (e.g., causing immediate changes
in the document displayed to the user).
Such efforts all contribute toward giving the Web a richer structural
base, capable of directly supporting a wider variety of activities, in
more flexible and extensible ways. However, up until recently these efforts
have still been based on HTML, with its basic structuring limitations,
and have generally been pursued as separate, non-integrated activities.
There is much other ongoing work within both the Web and database communities
on data structure developments to address Web-related enhancements. Work
on similar issues is ongoing within the Object Management Group as well
(see Section 2.4). This work has contributed valuable ideas, and the various
proposals illustrate similar basic concepts (generally, movement toward
some form of simple object model). However, these similarities are often
obscured by detailed representational differences, and the work is fragmented
and lacks a unifying framework. As a result, individual proposals often
lack key capabilities that are in some cases contained in other proposals.
Moreover, in many cases these proposals are not well-integrated with key
areas of emerging industry consensus on Web data structuring technologies.
If the Internet is to develop to support advanced application requirements,
there is a need for both richer individual data structuring mechanisms,
and a unifying overall framework which supports heterogeneous representations
and extensibility, and provides metalevel concepts for describing and integrating
them.
The intent of this paper is to describe how a number of (in some respects)
separate "threads" of Web-related development can be combined
to form the basis of a Web object model to address these requirements.
This combination is based on the observation that the fundamental components
of any object model are:
- data structures that can represent object state
- ways to associate behavior (object methods) with the object
state
- ways for the object methods to access and operate on that state
As a result, what is needed to progress toward a Web object model is:
- a richer base representation than HTML, in order to better represent
"object state" (in particular, better support for semantic identification
of fields, rather than simply supporting presentation aspects of data)
- an API to this state, so that programs can readily access it (without
complex parsing)
- an enhanced ability to define relationships between this state and
specified pieces of code that can serve as object methods
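The three ingredients listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the "order" type, and the helper names get_field, method, and invoke are hypothetical and are not drawn from any of the technologies surveyed in this paper.

```python
from typing import Any, Callable, Dict, Tuple

# Object state: semantically tagged fields rather than presentation markup.
# The field names and values here are illustrative only.
state: Dict[str, Any] = {
    "type": "order",
    "customer": "A. Customer",
    "quantity": 3,
    "unit_price": 4.0,
}


def get_field(obj_state: Dict[str, Any], name: str) -> Any:
    """API to the state: look a field up by name, with no document parsing."""
    return obj_state[name]


# Association between state and behavior: code registered per (type, method name).
METHODS: Dict[Tuple[str, str], Callable[[Dict[str, Any]], Any]] = {}


def method(type_name: str, method_name: str):
    """Register a function as a method for states of the given type."""
    def register(func):
        METHODS[(type_name, method_name)] = func
        return func
    return register


@method("order", "total")
def order_total(obj_state: Dict[str, Any]) -> float:
    return get_field(obj_state, "quantity") * get_field(obj_state, "unit_price")


def invoke(obj_state: Dict[str, Any], method_name: str) -> Any:
    """Find the method registered for this state's type and apply it."""
    return METHODS[(get_field(obj_state, "type"), method_name)](obj_state)


print(invoke(state, "total"))  # 12.0
```

The registry is kept separate from the state on purpose: the state can travel as plain tagged data, and the code that acts as its methods can be associated with it later.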
At the same time, the openness of the Web compared to conventional object
models needs to be preserved, due to the distinct requirements of the Web
environment for openness and scalability.
In the following sections, this paper will:
- describe key examples of existing work from the Web, database, and
OMG communities that contribute both ideas and technology toward providing
the components of a Web object model identified above
- identify some key underlying principles behind this work
- identify a framework which allows this work to be unified and extended
to support the requirements of advanced Web applications for object technology
2. Relevant Work
As noted in the Introduction, there has been much ongoing work on enhancements
to address Web limitations in supporting richer data structures, and integrating
object technology. For example, the Internet and Web communities have developed
both additional representations, and a number of "object models"
or data structuring principles, to represent richer data structures. The
database community has also developed proposals for "lightweight object
models," partly driven by attempts to represent the structure of Web
resources. All this work has contributed valuable ideas and, taken as a
whole, exhibits important common underlying principles. What is required
is that this work be integrated, and the best ideas merged.
The Introduction specifically noted that what is needed to progress
toward a Web object model is:
- a richer base representation than HTML, in order to better represent
"object state" (in particular, better support for semantic identification
of fields, rather than simply supporting presentation aspects of data)
- an API to this state, so that programs can readily access it (without
complex parsing)
- an enhanced ability to define relationships between this state and
specified pieces of code that can serve as object methods
This section describes a number of the key technologies that attempt
to address parts of these problems. Several of these technologies will
be used as the basis of an approach, described in Section 3, which integrates
them to support a Web object model.
Caveats
The following subsections describing the various technologies are in
some cases rather long, and include a great deal of text and specific examples
taken from the cited references. The purpose in doing this is to provide
enough detail in one place to illustrate key concepts and the roles they
might play in supporting a Web object model, and to give the reader a feel
for how generalizations of the concepts might be developed. Hence, this
report makes no claims of originality for most of this material (and readers
should refer to the cited sources for further details). The subsections
also include some additional commentary highlighting key points, and establishing
"forward references" to later material.
Several of the sections describe ongoing activities of the World Wide
Web Consortium (W3C), particularly:
- the Extensible Markup Language (XML)
- the Resource Description Framework (RDF)
- the Document Object Model (DOM)
The reader should be aware that in many cases these specifications are
works in progress. As a result, some of the details described in this report,
as well as the source references, may no longer be completely accurate
(or accessible due to changed URLs) by the time the report is read. The
latest information on these activities can be obtained through the main
W3C Web page <http://www.w3.org/>
or W3C's technical report page <http://www.w3.org/TR/>.
2.1 Structured Data Representations and "Lightweight Object Models"
The Introduction briefly described HTML's limitations in supporting
the data structure requirements of more complex Web applications. HTML
was adequate as long as applications generally did no more than
display pages to users. However, more complex applications require programs
to be able to recognize and process parts of Web pages that have specific
semantic meaning within the application. In some cases, applications require
data that has a well-defined, fixed format (such as an invoice or other
form). Even if applications don't require such fully regular structures,
they often need the ability to identify specific pieces of a page's contents.
For example, a document may not have a fixed number of authors, but it
is still important to be able to identify the strings of text that correspond
to authors' names. In some cases, these "pieces" would correspond
to specific fields in records, such as "author". In other cases,
they would correspond to specific relationships (e.g., a "citation"
link to a related paper).
These are the same structuring requirements that apply to object state
in object models; i.e., an object's state must be structured in such a
way that the object methods can find the parts of the state that they need
in order to execute properly. As compared with HTML, whose tags are primarily
concerned with how the tagged information is to be presented, satisfying
this structuring requirement involves some form of semantic markup,
i.e., the ability to tag items with names that can be used to identify
items based (at least to some extent) on their semantics.
This section describes a number of developments directed at dealing
with the problems of providing richer data structuring capabilities for
Web data.
2.1.1 Summary Object Interchange Format (SOIF)
Harvest's Summary Object Interchange Format (SOIF) is a syntax for representing
and transmitting descriptions of (metadata about) Internet resources such
as files, sites, Web pages, etc., as well as other kinds of structured
objects (see Internet
draft: CIP Index Object Format for SOIF Objects <http://www.globecom.net/(eng)/ietf/draft/draft-ietf-find-cip-soif-01.shtml>).
SOIF is based on a combination of the Internet Anonymous FTP Archives (IAFA)
IETF Working Group templates and BibTeX. Each resource description
is represented in SOIF as a list of attribute-value pairs (e.g., Company
= 'Netscape'). SOIF handles both textual and binary data as values, and,
with some minor extensions, multivalued attributes. SOIF also allows bulk
transfer of many resource descriptions in a single, efficient stream. A
SOIF stream contains one or more SOIF objects, each of which contains the
structured content of a resource description. An example SOIF object might
be:
@DOCUMENT { http://www.netscape.com:80/
Title{20}: Welcome to Netscape!
Last-Modified{29}: Thu, 16 May 1996 11:45:39 GMT }
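To make the structure of the example concrete, the following Python sketch parses one object written in the form shown above. It is a minimal illustration, not a SOIF implementation: it assumes a header of the form "@TYPE { URL", attributes of the form "Name{N}: value" with a single space or tab after the colon, and that N counts the characters of the value (equivalent to bytes for ASCII data). The function name parse_soif_object is hypothetical.

```python
import re
from typing import Dict, Tuple

# Attribute header: optional whitespace, a name, {size}, a colon, one separator.
ATTR_HEADER = re.compile(r"\s*([^\s{}]+)\{(\d+)\}:[ \t]")


def parse_soif_object(text: str) -> Tuple[str, str, Dict[str, str]]:
    """Parse one SOIF-style object into (template type, URL, attributes)."""
    head = re.match(r"\s*@(\S+)\s*\{\s*(\S+)[^\n]*\n", text)
    if head is None:
        raise ValueError("missing '@TYPE { URL' header")
    template, url = head.group(1), head.group(2)
    pos = head.end()
    attrs: Dict[str, str] = {}
    while True:
        m = ATTR_HEADER.match(text, pos)
        if m is None:
            break
        size = int(m.group(2))
        # The declared size, not a delimiter, marks where the value ends,
        # which is what lets values contain arbitrary text.
        attrs[m.group(1)] = text[m.end():m.end() + size]
        pos = m.end() + size
    if not text[pos:].lstrip().startswith("}"):
        raise ValueError("missing closing '}'")
    return template, url, attrs


example = (
    "@DOCUMENT { http://www.netscape.com:80/\n"
    "Title{20}: Welcome to Netscape!\n"
    "Last-Modified{29}: Thu, 16 May 1996 11:45:39 GMT }\n"
)
template, url, attrs = parse_soif_object(example)
print(template, url)
print(attrs["Title"])
```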
Resource Description Messages
(RDM) <http://www.w3.org/TR/NOTE-rdm>, 24 July 1996, by Darren
Hardy (Netscape), is a technical specification of RDM, which is used
in Netscape's Catalog Server. RDM is a mechanism
to discover and retrieve metadata about network-accessible resources, known
as Resource Descriptions (RDs). A Resource Description consists of a list
of attribute-value pairs (e.g., Author = Darren Hardy, Title = RDM)
and is associated with a resource via a URL. Agents can generate RDs automatically
(e.g., a WWW robot), or people can write RDs manually (e.g., a librarian
or author). Once a repository of Resource Descriptions is assembled, the
server can export it via RDM as a programmatic way for WWW agents to discover
and retrieve the RDs.
RDM uses Harvest's SOIF format to encode the RDs. The data model that
SOIF provides is a flat name space for the attributes, and treats all values
as blobs. The RDM schema definition language extends the SOIF data model
by providing:
- Data type and format information for the values (e.g., varchar and
application/rfc822-address, or blob and text/html).
- Hints to the RDM client as to which attributes should be surfaced at
the user level, and which are included in the default view.
- Hints to an indexer as to which attributes should be indexed, and which
should be used to suppress duplicates.
- A mapping between attribute names and (table name, column name) tuples,
which helps an RDM client to place this data into the relational data model
to support RDBMS backends.
- Other semantic information, such as indexable columns and foreign keys,
which helps in mapping the SOIF objects into the relational data model.
SOIF illustrates a theme that will be repeated in other Web-related
structured data representations discussed here: the representation of data
as semantically tagged data items (attribute/value pairs),
where the tags or attribute names convey something of the meaning of the
associated data value. A key advantage of an approach based on individual
attribute/value pairs is that, unlike a database-like "typed record"
approach, it is arbitrarily extensible in a federated environment like
the Web (without a centralized collection of types or schema). Anyone can
record any attributes they feel are necessary, without going through the
"overhead" of defining a new type (and, in particular, possibly
having to define it as a subtype of an existing type), and distributing
that type definition throughout a distributed network.
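The contrast between the two approaches can be sketched in Python as follows. The attribute values are the ones used as examples earlier in this section; the DocumentRecord type and the titles function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class DocumentRecord:
    """A 'typed record': the set of fields is fixed by the shared type."""
    title: str
    author: str


# Attribute/value descriptions: each publisher records the attributes it needs.
description_a = {"Title": "RDM", "Author": "Darren Hardy"}
description_b = {"Title": "RDM", "Author": "Darren Hardy", "Company": "Netscape"}


def titles(descriptions):
    """A consumer that understands only 'Title' and ignores everything else."""
    return [d["Title"] for d in descriptions if "Title" in d]


# The extra attribute in description_b does not disturb existing consumers.
print(titles([description_a, description_b]))  # ['RDM', 'RDM']

# The typed record cannot accept the new attribute until the type is changed
# and the new definition is distributed to everyone who uses it.
try:
    DocumentRecord(title="RDM", author="Darren Hardy", company="Netscape")
except TypeError as exc:
    print("typed record rejected the extra attribute:", exc)
```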
However, while SOIF supports attribute/value pairs, its structuring
capabilities are not sufficiently rich to support the full structuring
requirements of the Web. For example, it lacks support for nested structures,
and cannot support the functionality of HTML, let alone extensions to it.
It is also not well integrated with more advanced developments in Web data
representation, such as XML, RDF, and DOM, described later.
2.1.2 Object Exchange Model (OEM)
Stanford's Object Exchange Model (OEM) [PGW95, AQMW+96] is a "lightweight
object model" developed to act as a general model capable of representing
both database and Web data structures. A similar model, developed at the
University of Pennsylvania, is described in [BDHS96, BDFS97]. OEM was introduced
in TSIMMIS (The Stanford-IBM Manager of Multiple Information Sources) as
a self-describing way of representing metadata. OEM was later modified
for use in the Lore (Lightweight Object Repository) system. OEM exists
in two main variants. In the original (TSIMMIS) version, OEM defines a
set of labeled nodes. Each node has an object identifier (oid),
a label, a type, and a value (the type defines the type of the value).
The types include primitive types, such as integer, and the type set. If the
type is set, the value consists of a set of oids of other nodes.
This allows aggregate structures to be defined. These structures are shown
in the figure below.
original (TSIMMIS) OEM:
+-----+-------+------+-------+
| oid | label | type | value | type includes "set"
+-----+-------+------+-------+
+-----+-------+------+-----------------+
| oid | label | set | {oid, oid, ...} |
+-----+-------+------+-----------------+
In the newer (Lore) version of OEM, the structures have been modified
so that edges are labeled rather than nodes. In this scheme, a complex
object consists of a set of (label,oid) pairs. These effectively represent
relationships between the containing object and the target object. That
is, a given (label,targetoid) pair contained in object sourceobject represents
the relationship
label(sourceobject, targetobject)
This revised structure thus more closely resembles a first order logic
(FOL) structuring of data. These structures are shown in the figure below.
new (Lore) OEM:
atomic object
+-----+------+-------+
| oid | type | value |
+-----+------+-------+
complex object
+-----+---------+-------------------------------------------+
| oid | complex | value = {(label, oid), (label, oid), ...} |
+-----+---------+-------------------------------------------+
Since individual objects do not have labels in this scheme, additional
labels are introduced so that top-level objects can also have names.
As an example, a simple structure for information on books in a library
might have the following structure in the TSIMMIS OEM:
+----+---------+------+---------------+
| &1 | library | set | {&2, &5, ...} |
+----+---------+------+---------------+
+----+------+------+----------+
| &2 | book | set | {&3, &4} |
+----+------+------+----------+
+----+--------+--------+-----+
| &3 | author | string | Aho |
+----+--------+--------+-----+
+----+-------+---------+-----------+
| &4 | title | string | Compilers |
+----+-------+---------+-----------+
Linearly, this might be represented as:
<&1, library, set, {&2,&5,...} >
<&2, book, set, {&3,&4} >
<&3, author, string, Aho >
<&4, title, string, Compilers >
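As a rough illustration (not taken from the TSIMMIS papers), the linear form above can be held in a small Python table keyed by oid and walked recursively. The dictionary layout and the helper name to_nested are assumptions made for this sketch, and node &5 is omitted because the example does not define it.

```python
# TSIMMIS-style OEM: every node is <oid, label, type, value>.
# oids and values mirror the library example; the dict layout and the
# helper name are illustrative assumptions, not part of OEM itself.
nodes = {
    "&1": ("library", "set", ["&2"]),        # &5 omitted: undefined in the example
    "&2": ("book", "set", ["&3", "&4"]),
    "&3": ("author", "string", "Aho"),
    "&4": ("title", "string", "Compilers"),
}


def to_nested(oid):
    """Follow set-valued nodes recursively and return (label, value)."""
    label, type_, value = nodes[oid]
    if type_ == "set":
        # A set value holds oids of other nodes; resolve each one.
        return (label, [to_nested(child) for child in value])
    return (label, value)


library = to_nested("&1")
print(library)
```

Because each node carries its own label, the nested result can be rebuilt from the oid table alone.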
In the Lore OEM, the same structure would be:
+----+------+-----------------------------+
library: | &1 | set | {(book,&2), (book,&5), ...} |
+----+------+-----------------------------+
+----+------+---------------------------+
| &2 | set | {(author,&3), (title,&4)} |
+----+------+---------------------------+
+----+--------+-----+
| &3 | string | Aho |
+----+--------+-----+
+----+---------+-----------+
| &4 | string | Compilers |
+----+---------+-----------+
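The relationship between the two variants can be sketched as a conversion that moves each label from a node onto the edges that point to it. This is only an illustration under assumed data layouts (Python dictionaries keyed by oid); the function name to_lore is hypothetical, and node &5 is again left out because the example does not define it.

```python
# Node-labeled (TSIMMIS) form: oid -> (label, type, value).
tsimmis = {
    "&1": ("library", "set", ["&2"]),
    "&2": ("book", "set", ["&3", "&4"]),
    "&3": ("author", "string", "Aho"),
    "&4": ("title", "string", "Compilers"),
}


def to_lore(nodes, root_oid):
    """Return (names, objects) for the edge-labeled form.

    objects maps oid -> (type, value); a set value becomes a list of
    (label, oid) pairs. names gives the top-level object its name,
    since objects no longer carry labels themselves.
    """
    objects = {}
    for oid, (_label, type_, value) in nodes.items():
        if type_ == "set":
            # The child's former node label becomes the edge label.
            objects[oid] = ("set", [(nodes[child][0], child) for child in value])
        else:
            objects[oid] = (type_, value)
    names = {nodes[root_oid][0]: root_oid}
    return names, objects


names, objects = to_lore(tsimmis, "&1")
print(names)
print(objects["&2"])
```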
OEM can represent complex graph structures, similar to those that exist
in the Web. It is a "lightweight" object model in the sense that:
- it does not require the definition of classes or types; arbitrary structures
with arbitrary attribute names can be included in OEM structures; this
enables it to more directly represent the irregular structures found within
and among Web resources
- it does not support encapsulation; applications can directly access
the OEM structures
- it does not support object behavior (there are no object methods defined
for OEM nodes)
OEM and related models effectively define global models for a federated
database system, where the federated components include unstructured or
semistructured data sources such as the Web (unlike the more conventional
structured database sources usually considered in federated database systems).
These models provide a valuable core of ideas for applying database concepts
to Web data. As the examples illustrate, OEM is based on the use of attribute/value
pairs. This is important in allowing the individual components of Web resources
to be recognized and accessed in a meaningful way by applications. In addition,
OEM extends the basic attribute/value pair model by providing each pair
with its own identifier. This is important in allowing complex nested and
graph structures to be defined. It is also potentially important in allowing
additional descriptive information (metadata) to be directly associated
with the pairs (e.g., to describe an attribute's meaning more fully). However,
this latter idea has not been directly followed up in the OEM-related papers
reviewed.
While these models are intended to represent data in (or extracted from)
Web and other resources, and hence constitute a form of metadata, the capabilities
of these models for representing metadata that might already exist about
a resource, and for representing their own metadata, are somewhat undeveloped.
They do not explicitly consider capturing type and schema information where
it exists, or linking that type information to the structures it describes.
For example, when OEM is used to capture a database structure, a schema
actually exists for this data, unlike Web resources. It should be possible
to capture both the data and the schema in OEM, and link them together.
This is not really followed up in existing OEM work (although it could
be). Related work has been done on a concept called DataGuides [GW97,
NUWC97]. A DataGuide resembles a schema, but is derived dynamically as
a summary of the structures that have been encountered, and only approximately
describes the structures that may actually be encountered. This is appropriate
for unstructured and semistructured data, but does not fully represent
the semantics of an actual schema.
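The idea of a dynamically derived summary can be illustrated by collecting the distinct label paths that occur in an edge-labeled OEM graph. This is a deliberately simplified sketch, not the DataGuide algorithm of [GW97]; the function name label_paths is hypothetical, and object &5 and its title are invented for illustration.

```python
# Edge-labeled OEM data: oid -> (type, value).
lore = {
    "&1": ("set", [("book", "&2"), ("book", "&5")]),
    "&2": ("set", [("author", "&3"), ("title", "&4")]),
    "&3": ("string", "Aho"),
    "&4": ("string", "Compilers"),
    "&5": ("set", [("title", "&6")]),      # invented: a book with no author
    "&6": ("string", "Databases"),         # invented value
}


def label_paths(objects, oid, prefix=(), on_path=frozenset()):
    """Collect every label path reachable from oid, each recorded once."""
    type_, value = objects[oid]
    if type_ != "set" or oid in on_path:   # atomic object, or a cycle
        return set()
    paths = set()
    for label, child in value:
        path = prefix + (label,)
        paths.add(path)
        paths |= label_paths(objects, child, path, on_path | {oid})
    return paths


summary = sorted(label_paths(lore, "&1"))
print(summary)
```

The summary lists book/author even though the second book has no author, which shows why such a summary only approximately describes the structures that may be encountered.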
These models as currently implemented are also not well integrated with
emerging Web technologies, such as the XML, DOM, and RDF work described
below, that are likely to change the basic nature of the Web's representation.
The approach taken in work such as OEM has so far assumed that the Web
will continue to be largely unstructured or semistructured, based on HTML,
and that data from the Web will need to be extracted into separate OEM
structures (or interpreted as if it had been) in order to perform database-like
manipulations on it. On the other hand, the new Web technologies provide
a higher level, more semantic representational structure, which can start
with the assumption that information authors themselves have support to
provide more semantic structural information. Our work on a Web object
model is based on the idea that, with this additional representation support,
it makes sense to investigate building more database-like capabilities
within the Web infrastructure itself, rather than assuming that almost
all of these database capabilities need to be added externally. Since Web
structures are unlikely to become as regular as conventional databases,
some of the principles developed by work such as OEM will continue to be
important (and, in fact, as a model, OEM has many similarities with
work such as RDF described later in this report). However, it seems likely
that these principles will need to be applied in the context of representations
such as XML and DOM, used directly as the basis of an enhanced Web infrastructure.
2.1.3 Knowledge Interchange Format (KIF)
The Knowledge Interchange
Format <http://logic.stanford.edu/kif/kif.html> provides a common
means of exchanging knowledge between programs with differing internal
knowledge representation techniques. It is human-readable, with declarative
semantics. It can express first-order logic (FOL) sentences, with some
second-order capabilities. Translators exist to convert various knowledge
representation languages to and from KIF. A simple example of KIF in representing
information about an ontology (from [BBBC+97]) is:
ontology(o_857)
ontology_name(o_857,'healthcare')
ontology_frame(o_857,f_123)
frame(f_123)
frame_name(f_123,'encounter_drg')
slot(s_345)
frame_slot(f_123,s_345)
slot_name(s_345,'patient_age')
constraint(c_674)
slot_constraint(s_345,c_674)
constraint_expression(c_674,[[gt,'patient_age',43]
[lt,'patient_age',75]])
The example illustrates that the KIF representation of data is based
on the use of attribute/value pairs; in fact, this is a direct representation
of the way this information might be expressed in first-order logic. This
also illustrates the fact that a FOL representation necessarily introduces
a number of "intermediate" object identifiers (like o_857
and f_123), in order to assert the identity of distinct concepts,
and to represent relationships among the various parts of the description.
This is similar to the way that OEM introduces identifiers for the individual
parts of a resource description. The KIF example particularly illustrates
the use of such identifiers in defining namespaces like frames or ontologies,
which qualify contained information.
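To make the role of the intermediate identifiers concrete, the assertions above can be treated as a set of tuples and joined on those identifiers. This is a sketch in Python rather than in KIF; the tuple encoding and the function names are assumptions, and the constraint assertions are left out for brevity.

```python
# Each assertion is (predicate, argument, ...), copied from the example.
facts = {
    ("ontology", "o_857"),
    ("ontology_name", "o_857", "healthcare"),
    ("ontology_frame", "o_857", "f_123"),
    ("frame", "f_123"),
    ("frame_name", "f_123", "encounter_drg"),
    ("slot", "s_345"),
    ("frame_slot", "f_123", "s_345"),
    ("slot_name", "s_345", "patient_age"),
}


def objects_of(predicate, subject):
    """Return all y such that predicate(subject, y) has been asserted."""
    return sorted(f[2] for f in facts
                  if len(f) == 3 and f[0] == predicate and f[1] == subject)


def slot_names_in_ontology(name):
    """Follow ontology -> frame -> slot through the intermediate ids."""
    found = []
    ontologies = sorted(f[1] for f in facts
                        if f[0] == "ontology_name" and f[2] == name)
    for ontology in ontologies:
        for frame in objects_of("ontology_frame", ontology):
            for slot in objects_of("frame_slot", frame):
                found.extend(objects_of("slot_name", slot))
    return found


print(slot_names_in_ontology("healthcare"))
```

The identifiers o_857, f_123, and s_345 never appear in the answer; they exist only to connect the assertions.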
Like OEM, KIF is capable of representing arbitrary graph structures.
Moreover, KIF illustrates the importance of identifying parts of a data
structure representation with logical assertions in conveying semantics
between applications. Section 3 will describe how this principle serves
as the basis of a formal Web object model definition. However, while KIF is
widely used for knowledge interchange, it, like OEM, is not well integrated
with emerging Web infrastructure technologies.
2.1.4 Extensible Markup Language (XML)
The Extensible Markup Language
(XML) <http://www.w3.org/XML/> is an ongoing effort within the World
Wide Web Consortium (W3C). XML is a data format for structured document
interchange on the Web. More specifically, XML defines a simple subset
of SGML (the Standard Generalized Markup Language [ISO86]; see also, e.g.,
[DeR97]), and is intended to make it easy to use SGML on the Web. XML is
extensible because unlike HTML, which defines a fixed set of tags, XML
allows the definition of customized markup languages with application-specific
tags, e.g., <AUTHOR> or <QTY-ON-HAND>, for
exchanging information in particular application domains such as chemistry,
electronics, or general business. Hence, XML is really a metalanguage
(a language for describing languages).
Because authors and providers can design their own document types using
XML, browsers can benefit from improved facilities, and applications can
use tailored markup to process data. As a result, XML provides direct support
for using application-specific tagged data items (attribute/value pairs)
in Web resources, as opposed to the current need to use ad hoc encodings
of data items in terms of HTML tags. [KR97] provides a useful overview
of the potential benefits of using XML in Web-related applications.
Although XML could eventually completely replace HTML, XML and HTML
are expected to coexist for some time. In some cases, applications may
wish to define entirely separate XML documents for their own processing,
and convert the XML to HTML for display purposes. Alternatively, applications
may wish to continue using HTML pages as their primary document format,
embedding XML within the HTML for application-specific purposes. For example,
[Hop97] describes the use of blocks of XML markup enclosed by <XML>
and </XML> tags within an HTML document for this purpose.
XML has considerable industry support, e.g., from Netscape, Microsoft,
and Sun. For example, Microsoft has built an XML parser into Internet Explorer
4.0 (which uses XML for several applications), has made available XML parsers
in Java and C++, together with links to other XML tools (see http://www.microsoft.com/xml/),
and has indicated that it will use XML in future versions of Microsoft
Office products. Microsoft has also contributed to a number of proposals
to W3C on the use of XML as a base for various purposes (some of which
will be discussed in later sections). Netscape has said it will support
XML via the Meta Content Framework (described in Section 2.2) in a future
version of its Communicator product. Work is also underway on tying XML
to Java in a number of ways. Other commercial vendors are also developing
XML-related software tools. In addition, a number of XML tools are available
for free non-commercial use. A list of some of these tools is available
at the W3C XML Web page identified
above.
A number of industry groups have defined SGML Document Type Definitions
(DTDs) for their documents (e.g., the U.S. Defense Department, which requires
much of its documentation to be submitted according to defined SGML DTDs);
in many cases these could either be used with XML directly, or converted
in a straightforward fashion. Work is already underway to define XML-based
data exchange formats in both the chemical and healthcare communities.
Work has also been done on other applications of XML, e.g., an Ontology
Markup Language (OML) <http://wave.eecs.wsu.edu/WAVE/Ontologies/OML/OML-DTD.html>
for representing ontologies in XML.
The W3C XML specification has several parts:
- XML (language): specifications
for XML documents and Document Type Definitions (DTDs) <http://www.w3.org/TR/REC-xml>;
these specifications have the status of a W3C Recommendation, and hence
are stable
- XLL (XML-Link): draft
specifications of constructs that can be inserted in XML documents to describe
links between objects and addressing into the internal structures of XML
documents <http://www.w3.org/TR/WD-xml-link>
- XSL (XML-Style): a submission
defining presentation styles for XML documents <http://www.w3.org/TR/NOTE-XSL>
A DTD is usually a file (or several files together) which contains a
formal definition of a particular type of document. This acts like a database
schema, and defines what names can be used for elements, where they may
occur (e.g., <ITEM> might only be meaningful inside <LIST>),
and how they all fit together. The DTD lets processors parse a document
and identify where each element belongs, so that stylesheets, browsers,
search engines, and other applications can be used. The linking of resources
with the DTDs that describe them is similar to the association of a database
record with its schema type, and to the association of an object with its
type or class definition.
An XML document may be valid or merely well-formed. A
valid XML document is well-formed, and also conforms to a DTD. The document begins
with a declaration of its DTD. This may include a pointer to an external
document (a local file or the URL of a DTD that can be retrieved over the
network) that contains a subset of the required markup declarations (called
the external subset), and may also include an internal subset
of markup declarations contained directly within the document. The external
and internal subsets, taken together, constitute the complete DTD of the
document. The DTD effectively defines a grammar which defines a class of
documents. Normally, the bulk of the markup declarations appear in the
external subset, which is referred to by all documents of the same class.
If both external and internal subsets are used, the XML processor must
read the internal subset first, then the external subset. This allows the
entity and attribute declarations in the internal subset to take precedence
over those in the external subset (thus allowing local variants in documents
of the same class). XML DTDs can also be composed, so that new document
types can be created from existing ones.
A well-formed XML document can be used without a DTD, but must
follow a number of simple rules to ensure that it can be parsed correctly.
These rules require, among other things, that:
- all tags must be balanced (elements must have both start and end tags
present)
- all attribute values must be in quotes
- elements must nest inside each other properly (no overlapping markup)
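These rules can be checked mechanically by any XML parser. The following sketch uses the parser in Python's standard library, which implements the XML 1.0 Recommendation rather than the drafts current when this report was written; the sample strings and the function name is_well_formed are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# One sample per rule listed above (sample markup is hypothetical).
samples = {
    "balanced": "<LIST><ITEM>one</ITEM></LIST>",
    "missing end tag": "<LIST><ITEM>one</LIST>",
    "unquoted attribute": '<ITEM id=x7>one</ITEM>',
    "overlapping markup": "<B><I>text</B></I>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(name, ok)
```

No DTD is consulted here: the check concerns well-formedness only, not validity.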
The general characteristics of XML can be illustrated using an example
of a document that maintains a list of people's electronic business cards
(this example is modified from one in [KR97], and is not necessarily consistent
with the details of the latest XML specification). Each business card contains
the person's first name, last name, company, email address, and Web page
address. There is more than one way to represent attribute-value style
data in XML. One approach is to specify the attributes as the "attributes"
of XML tags. In this case, the document contains only tags annotated with
attribute-value pairs, and there is no content in the document other than
the tags themselves (which can be parsed and processed by applications).
Using this approach, an example document would be:
<!DOCTYPE bCard "http://www.objs.com/schemas/bCard">
<bCard>
<?xml default bCard
firstname = ""
lastname = ""
company = ""
email = ""
webpage = ""
?>
<bCard
firstname = "Frank"
lastname = "Manola"
company = "Object Services and Consulting"
email = "fmanola@objs.com"
webpage = "http://www.objs.com/manola.htm"
/>
<bCard
firstname = "Craig"
lastname = "Thompson"
company = "Object Services and Consulting"
email = "thompson@objs.com"
webpage = "http://www.objs.com/thompson.htm"
/>
</bCard>
The default specification ensures that every tag has the same number
of attribute-value pairs.
An alternative representation uses different tags, rather than XML attributes,
to identify the meaning of the content. Using this approach, the same content
would be represented as:
<bCard>
<FIRSTNAME>Frank</FIRSTNAME>
<LASTNAME>Manola</LASTNAME>
<COMPANY>Object Services and Consulting</COMPANY>
<EMAIL>fmanola@objs.com</EMAIL>
<WEBPAGE>http://www.objs.com/manola.htm</WEBPAGE>
</bCard>
<bCard>
<FIRSTNAME>Craig</FIRSTNAME>
<LASTNAME>Thompson</LASTNAME>
<COMPANY>Object Services and Consulting</COMPANY>
<EMAIL>thompson@objs.com</EMAIL>
<WEBPAGE>http://www.objs.com/thompson.htm</WEBPAGE>
</bCard>
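From an application's point of view the two representations carry the same attribute/value pairs. The sketch below reads one card in each style with Python's standard XML parser and compares the results. The enclosing bCards element is an assumption added so that each sample has a single root element, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

ATTRIBUTE_STYLE = """<bCards>
  <bCard firstname="Frank" lastname="Manola"
         company="Object Services and Consulting"
         email="fmanola@objs.com"
         webpage="http://www.objs.com/manola.htm"/>
</bCards>"""

ELEMENT_STYLE = """<bCards>
  <bCard>
    <FIRSTNAME>Frank</FIRSTNAME>
    <LASTNAME>Manola</LASTNAME>
    <COMPANY>Object Services and Consulting</COMPANY>
    <EMAIL>fmanola@objs.com</EMAIL>
    <WEBPAGE>http://www.objs.com/manola.htm</WEBPAGE>
  </bCard>
</bCards>"""


def cards_from_attributes(text):
    """Each bCard's XML attributes become one dictionary."""
    return [dict(card.attrib) for card in ET.fromstring(text).iter("bCard")]


def cards_from_elements(text):
    """Each bCard's child elements become one dictionary (lower-cased tags)."""
    return [
        {field.tag.lower(): (field.text or "").strip() for field in card}
        for card in ET.fromstring(text).iter("bCard")
    ]


same = cards_from_attributes(ATTRIBUTE_STYLE) == cards_from_elements(ELEMENT_STYLE)
print(same)
```

The element style can additionally hold nested or repeated content, which the attribute style cannot.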
The paper XML representation
of a relational database <http://www.w3.org/XML/RDB.html> uses
a relational database as a simple example of how to represent more complex
structured information in XML. A relational database consists of a set
of tables, where each table is a set of records. A record
in turn is a set of fields and each field is a pair field-name/field-value.
All records in a particular table have the same number of fields with the
same field-names. This description suggests that a database could be represented
as a hierarchy of depth four: the database consists of a set of tables,
which in turn consist of rows, which in turn consist of fields.
The following example, taken from the cited paper, describes a possible
XML representation of a single database with two tables:
<!doctype mydata "http://www.w3.org/mydata">
<mydata>
<authors>
<author>
<name>Robert Roberts</name>
<address>10 Tenth St, Decapolis</address>
<editor>Ella Ellis</editor>
<ms type="blob">ftp://docs/rr-10</ms>
<born>1960/05/26</born>
</author>
<author>
<name>Tom Thomas</name>
<address>2 Second Av, Duo-Duo</address>
<editor>Ella Ellis</editor>
<ms type="blob">ftp://docs/tt-2</ms>
</author>
<author>
<name>Mark Marks</name>
<address>1 Premier, Maintown</address>
<editor>Ella Ellis</editor>
<ms type="blob">ftp://docs/mm-1</ms>
</author>
</authors>
<editors>
<editor>
<name>Ella Ellis</name>
<telephone>7356</telephone>
</editor>
</editors>
</mydata>
The representation is human-readable, but fairly verbose (since XML
in general is verbose). However, it compresses well with standard compression
tools. It is also easy to print the database (or a part of it) with standard
XML browsers and a simple style sheet.
The database is modeled with an XML document node and its associated
element node:
<!doctype name "url">
<name>
table1
table2
...
tablen
</name>
The name is arbitrary. The url is optional, but can be
used to point to information about the database. The order of the tables
is also arbitrary, since a relational database defines no ordering on them.
Each table of the database is represented by an XML element node with the
records as its children:
<name>
record1
record2
...
recordm
</name>
The name is the name of the table. The order of the records is
arbitrary, since the relational data model defines no ordering on them.
A row is also represented by an element node, with its fields as children:
<name>
field1
field2
...
fieldm
</name>
The name is the name of the row type (this was not required in
the original relational model, but the current specification allows definition
of row types); the name is required in XML anyway. The order of the fields
is arbitrary. A field is represented as an element node with a data node
as its only child:
<name type="t">
d
</name>
If d is omitted, it means the value of the field is the empty
string. The value of t indicates the type of the value (such as
string, number, boolean, date). If the type attribute is omitted, the type
can be assumed to be "string".
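The four-level convention (database, table, record, field) can be read back into ordinary data structures with a few nested loops. The following sketch uses Python's standard XML parser on an abbreviated version of the example; the function name load_database and the choice to keep each field as a (type, value) pair are assumptions, not part of the cited paper.

```python
import xml.etree.ElementTree as ET

# Abbreviated from the example above; the DOCTYPE line is left out.
DOC = """<mydata>
  <authors>
    <author>
      <name>Robert Roberts</name>
      <ms type="blob">ftp://docs/rr-10</ms>
      <born>1960/05/26</born>
    </author>
  </authors>
  <editors>
    <editor>
      <name>Ella Ellis</name>
      <telephone>7356</telephone>
    </editor>
  </editors>
</mydata>"""


def load_database(text):
    """Return {table name: [record, ...]}, each record {field: (type, value)}."""
    database = {}
    for table in ET.fromstring(text):          # level 2: tables
        records = []
        for record in table:                   # level 3: records (rows)
            fields = {}
            for field in record:               # level 4: fields
                # A missing type attribute defaults to "string";
                # missing data means the empty string.
                fields[field.tag] = (field.get("type", "string"), field.text or "")
            records.append(fields)
        database[table.tag] = records
    return database


db = load_database(DOC)
print(db["authors"][0]["ms"])
print(db["editors"][0]["telephone"])
```

Nothing in the document itself says which elements are tables and which are records; the loader relies purely on nesting depth.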
This example illustrates that XML tags can (and will) represent concepts
at multiple levels of abstraction. The example defines a specific four-level
hierarchy, but does not explicitly define the relational model and indicate
the hierarchical relationships among the various relational constructs.
In order to do this in a generic way for all relational databases, there
would need to be explicit tags such as <SCHEMA>, <TABLE>,
<ROW>, etc., and a specification of how they should be nested.
This is metalevel information as far as the XML representation is concerned,
and could be specified in the DTD. The definition of models, such as the
relational model, for organizing data for specific purposes, is independent
of XML, and needs to be done separately. The definition of such models
(in some cases using XML as their representation) is discussed in the next
section.
An XML document consists of text, and is basically a linearization of
a tree structure. At every node in the tree there are several character
strings. The tree structure and character strings together form the information
content of an XML document. Some of the character strings serve to define
the tree structure; others are there to define content. In addition to
the basic tree structure, there are mechanisms to define connections between
arbitrary nodes in the tree. For example, in the following document there
is a root node with three children, with one of the children containing
a link to one of the other children:
<p>
<q id="x7">The first child of type q</q>
<q id="x8">The second child of type q</q>
<q href="#x7">The third child of type q</q>
</p>
In this case, the third child contains an href attribute which
points to the first child, using its id attribute as an identifier.
The XML linking model is described in the XLL
draft <http://www.w3.org/TR/WD-xml-link>. The full hypertext linking
capabilities of XML are much more powerful than those of HTML, and are
based on more powerful hypertext technology such as described in HyTime
[ISO92] <http://www.hytime.org/> and the Text
Encoding Initiative (TEI) <http://www.uic.edu/orgs/tei/>. The
current specification supports both conventional URLs, and TEI extended
pointers. The latter provide support for bidirectional and multi-way links,
as well as links to a span of text (i.e., a subset of the document) within
the same or other documents.
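The simple id/href connection in the three-child example above can be resolved by indexing elements on their id attribute. This sketch covers only that same-document case, not the extended pointers of the XLL draft; the function name resolve is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<p>
  <q id="x7">The first child of type q</q>
  <q id="x8">The second child of type q</q>
  <q href="#x7">The third child of type q</q>
</p>"""

root = ET.fromstring(DOC)

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def resolve(element):
    """Return the element that a same-document href ("#id") points to."""
    href = element.get("href")
    if href is None or not href.startswith("#"):
        return None
    return by_id.get(href[1:])


third = root[2]
target = resolve(third)
print(target.text)
```

With such links the document is no longer a pure tree: the third child refers to a sibling, giving a graph structure.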
XSL <http://www.w3.org/TR/NOTE-XSL>
is a submission defining stylesheet capabilities for XML documents. XML
stylesheets enable formatting information to be associated with elements
in a source document to produce formatted output. XML stylesheet capabilities
are based on a subset of those defined in the ISO standard Document Style
Semantics and Specification Language (DSSSL) [ISO96] used in formatting
SGML documents. The formatted output is created by formatting a tree of
flow objects. A flow object has a class, which represents a kind
of formatting task, together with a set of named characteristics, which
further specify the formatting. The association of elements in the source
document tree to flow objects is defined using construction rules.
A construction rule contains a pattern to identify specific elements
in the source tree, and an action to specify a resulting subtree
of flow objects. The stylesheet processor recursively processes source
elements to produce a complete flow object tree which defines how the document
is to be presented.
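The recursive application of construction rules can be pictured with the following sketch. It is written in Python, not in XSL syntax, and the rule table, the flow object class names, and the characteristics shown are illustrative assumptions rather than definitions from the XSL submission.

```python
import xml.etree.ElementTree as ET

# Hypothetical construction rules. Pattern: an element name.
# Action: a flow object class plus named characteristics.
RULES = {
    "bCard": ("paragraph", {"space-before": "12pt"}),
    "FIRSTNAME": ("sequence", {"font-weight": "bold"}),
}
DEFAULT_RULE = ("sequence", {})


def build_flow_tree(element):
    """Apply the matching rule to element, then recurse into its children."""
    flow_class, characteristics = RULES.get(element.tag, DEFAULT_RULE)
    return {
        "class": flow_class,
        "characteristics": dict(characteristics),
        "text": (element.text or "").strip(),
        "children": [build_flow_tree(child) for child in element],
    }


source = ET.fromstring(
    "<bCard><FIRSTNAME>Frank</FIRSTNAME><EMAIL>fmanola@objs.com</EMAIL></bCard>"
)
flow_tree = build_flow_tree(source)
print(flow_tree["class"], [child["class"] for child in flow_tree["children"]])
```

A real stylesheet processor supports much richer patterns (for example, matching on ancestors or attributes); matching on the element name alone is the simplest case.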
The XML working group is also currently developing a Namespace
facility <http://www.w3.org/TR/1998/NOTE-xml-names> that will
allow Generic Identifiers (tag names) to have a prefix which will make
them unique and will prevent name clashes when developing documents that
mix elements from different schemas. This facility allows a document's
prolog to contain a set of Processing Instructions (an SGML concept) of
the form:
<?xml:namespace name="some-uri" as="some-abbreviation"?>
for example
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<?xml:namespace name="http://www.purl.org/DublinCore/schema" as="DC"?>
Elements in the document may then use generic identifiers of the form
<RDF:assertions> or <DC:Title>. Those element names would expand
to URIs such as http://www.w3.org/schemas/rdf-schema#assertions. This work
is still under development, and the details of the final specification
may differ from those described here.
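The expansion of a prefixed name into a URI, as described in this draft, amounts to a table lookup. The sketch below follows the processing-instruction form shown above, which may not match the final specification; the regular expression and the function names are assumptions.

```python
import re

PROLOG = """<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<?xml:namespace name="http://www.purl.org/DublinCore/schema" as="DC"?>"""

# Matches one namespace declaration of the draft form shown above.
DECLARATION = re.compile(
    r'<\?xml:namespace\s+name="([^"]+)"\s+as="([^"]+)"\s*\?>'
)


def read_namespaces(prolog):
    """Map each abbreviation to the URI it was declared for."""
    return {abbrev: uri for uri, abbrev in DECLARATION.findall(prolog)}


def expand(generic_identifier, namespaces):
    """Expand prefix:name to uri#name; leave other names unchanged."""
    prefix, separator, local = generic_identifier.partition(":")
    if not separator or prefix not in namespaces:
        return generic_identifier
    return namespaces[prefix] + "#" + local


namespaces = read_namespaces(PROLOG)
print(expand("RDF:assertions", namespaces))
print(expand("DC:Title", namespaces))
```

Two schemas that both define a Title element therefore yield different expanded names, which is what prevents clashes.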
XML provides basic tagged value support, as well as support for nesting,
and enhanced link capabilities. Because the Web community is increasingly
targeting XML as its "next generation Web representation", the
Web object model described in Section 3 uses XML as its basic representation
of object state. However, additional concepts must also be defined to apply
XML to extended data and metadata structuring requirements, and particularly
the requirements for a Web object model that go beyond a richer state representation.
Some of these requirements are illustrated both by the relational database
example above, and by the RDF and related efforts described in the next
section. These efforts generally involve defining data model concepts for
representing specific kinds of data (as the relational model does for database
data), and then using the tagged value structures supported by XML as their
representation. These models support various ways of using identifier concepts
(URLs plus other identifier concepts) to provide support for graph structured
data. An additional general requirement, not generally addressed by Web-related
activities, is the definition of structured database capabilities (e.g.,
an algebra or calculus to serve as the basis for database-like query and
view facilities for XML data).
2.2 Higher-Level Models and Metadata
Richer representation techniques for Web information, such as XML, are
an important component in making the Web an improved basis for enhanced
applications of all kinds. However, additional structure must also be defined.
For example, XML provides support for the representation of data in terms
of attribute/value pairs, with user-defined tags. However, this alone will
not provide for easy interchange of information, and interoperability among
components since, using XML, different users could define their own ways
of using attribute-value pairs to represent the same (or the same type
of) information. Thus, there is also a need to define additional characteristics
of what to represent using representations such as XML.
A data model defines one level of "what to represent". For
example, the relational data model defines structuring concepts such as
rows and tables, and provides one basic organizational framework for representing
data. The example from the previous section of how to represent relational
data in XML illustrated how using the relational model imposed additional
structure on the XML representation. Defining a data model for data represented
in XML both suggests specific structuring concepts for using XML to organize
data, and may also involve the specification of certain standard tags or
attributes (like <TABLE>) to reflect those concepts. Use
of particular data models (represented using techniques such as XML) regularizes
the structures that may be encountered, and potentially simplifies the
task of applications that process those structures.
An additional level of "what to represent" is provided by
standardizing the use of domain-specific attribute/value pairs and document
structures (e.g., standards for specific kinds of reports or forms). SGML
and XML DTDs constitute one way to specify such standards, and there are
already numerous SGML DTDs in use for this purpose (these could, in most
cases, be easily adapted for use with XML).
An important source of efforts to develop such higher-level model specifications
for use on the Web has been work on developing representation techniques
for Web metadata, i.e., data available via the Web that describes
or helps interpret either other Web resources or non-Web resources. This
metadata is used both to facilitate Web searches for relevant data, and
to either provide direct access to it (if it is Web-accessible) or at least
indicate its existence and possibly describe how to obtain it. The reason
why the development of metadata representations has driven the development
of higher-level models is that the metadata is intended to support indexing,
searching, and other automated processes that require more structure than
may be present in the data itself. Metadata requirements have also driven
the development of structured representations themselves. For example,
the SOIF format described in Section 2.1.1 was developed to represent Web
metadata.
Efforts to develop enhanced metadata capabilities have involved several
types of activity (a given effort may bundle more than one of them):
- The definition of an abstract metadata model, i.e., the definition
of the basic constructs and operations of the model, and their semantics,
or the definition of the principles behind such models. Specific models
may add additional mechanisms, such as predefined attributes and types,
and inheritance. Among the work described later in this section, the Dublin
Core and Warwick Framework are examples of work on the basic principles
of metadata models. The Resource Description Framework (RDF) and Meta Content
Framework (MCF) are examples of specific metadata models.
- The definition of one or more representations of these models in terms
of specific syntactic formats such as HTML or XML (equivalently, in some
cases, the definitions of how various popular representations, such as
HTML pages, are to be viewed in these models). Examples of such definitions
are described in subsequent sections, e.g., the representation of RDF in
XML.
- The definition of requirements, and specific sets of attributes and
their associated value types, for defining specific types of metadata for
specific application areas. The Dublin Core is an example of work on metadata
intended to be descriptive of resources of all types. Other examples of
metadata definitions for specific types of resources (or data which could
be used as such metadata) include:
- Federal metadata
standards for geospatial data <http://www.fgdc.gov/Metadata/Metadata.html>
- work on metadata to support searching for software resources [IK96]
- the PICS specifications for describing ratings information (see Section
2.2.3)
Web data/metadata models defined "on top of" representations
such as XML are relevant to the development of a Web object model in helping
to further define an adequate basis for representing object state. In addition,
these models are also relevant in helping to identify ways to establish
relationships between the object state and the specified pieces of code
that serve as object methods. This is based on the idea that an "object"
is basically a piece of state with some attached (or associated) programs
(methods). For example, a Smalltalk object consists of a set of state variables
(data), together with a pointer (link) to a class object which contains
its methods. The link between an object and its class is essentially a
metadata link, since the class methods are used to help interpret the data.
In the Web environment, the idea is that objects can be constructed by
enhancing Web resources with additional metadata that allows the resources
to be considered as objects in some object model. This concept will be
developed further in Section 3, but is mentioned here to further explain
the role that metadata structuring principles will play in the development
of a Web object model.
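The notion of an object as state plus a metadata link to its methods can be pictured with a small sketch. It is not the model developed in Section 3; the class URL, the registry, and the function names are all hypothetical.

```python
def display_name(state):
    """A method: interprets the state of a business-card resource."""
    return state["firstname"] + " " + state["lastname"]


# Hypothetical registry standing in for retrievable class definitions.
CLASSES = {
    "http://www.objs.com/classes/bCard": {"display_name": display_name},
}

# A resource: its state, plus metadata linking it to a class.
resource = {
    "state": {"firstname": "Frank", "lastname": "Manola"},
    "metadata": {"class": "http://www.objs.com/classes/bCard"},
}


def invoke(resource, method_name, *args):
    """Follow the metadata link to the class, then apply the named method."""
    methods = CLASSES[resource["metadata"]["class"]]
    return methods[method_name](resource["state"], *args)


result = invoke(resource, "display_name")
print(result)
```

The state alone is plain data; only the metadata link makes it behave as an object.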
2.2.1 Dublin Core
The Dublin Core
<http://purl.oclc.org/metadata/dublin_core/> is a set of specific
metadata attributes originally developed at the March 1995 Metadata Workshop
in Dublin, Ohio. The set has subsequently been modified on the basis of
later Dublin Core Metadata Workshops. The goal of the Dublin Core is to
define a minimal set of descriptive elements that facilitate the description
and the automated indexing of document-like networked objects. The Core
metadata set is intended to be suitable for use by resource discovery tools
on the Internet, such as the "WebCrawlers" employed by popular
World Wide Web search engines (e.g., Lycos and Alta Vista). In addition,
the core is meant to be sufficiently simple to be understood and used by
the wide range of authors and casual publishers who contribute information
to the Internet. The Dublin Core reflects input from a wide range of communities
interested in metadata, including both the Internet and Digital Library
communities. The elements of the Dublin Core (as of November 1997) are
given below. The Dublin
Core Reference Description <http://purl.org/metadata/dublin_core_elements>
contains the current definition.
- TITLE: The name given to the resource by the CREATOR or PUBLISHER.
- CREATOR: The person(s) or organization(s) primarily responsible
for the intellectual content of the resource.
- SUBJECT: Keywords or phrases that describe the subject or content
of the resource. The intent is to use controlled vocabularies and keywords,
so the element might include scheme-qualified classification data (for
example, Library of Congress Classification Numbers) or scheme-qualified
controlled vocabularies (such as Medical Subject Headings, MeSH).
- DESCRIPTION: A textual description of the content of the resource,
such as document abstracts or content descriptions of visual resources.
This could be extended to include computational content description (e.g.,
spectral analysis of a visual resource). In this case this field might
contain a link to the description rather than the description itself.
- PUBLISHER: The entity responsible for making the resource available
in its present form.
- CONTRIBUTORS: Person(s) or organization(s) in addition to those
specified in the CREATOR element who have made significant intellectual
contributions to the resource.
- DATE: The date the resource was made available in its present
form.
- TYPE: The category of the resource, such as home page, novel,
poem, working paper, etc. It is expected that TYPE values will be chosen
from an enumerated list of types that is under development. See http://sunsite.berkeley.edu/Metadata/types.html
for current thinking on the application of this element.
- FORMAT: The data representation of the resource, such as text/html,
ASCII, Postscript file, executable application, or JPEG image (as well
as non-electronic media). FORMAT will be assigned from an enumerated list
that is under development.
- IDENTIFIER: String or number used to uniquely identify the resource.
Examples for networked resources include URLs and URNs (when implemented).
Other globally-unique identifiers, such as International Standard Book Numbers
(ISBN) or other formal names would also be candidates for this element.
- SOURCE: The work, either print or electronic, from which this
resource is derived, if applicable.
- LANGUAGE: Language(s) of the intellectual content of the resource.
Where practical, the content of this field should coincide with RFC 1766.
See: http://ds.internic.net/rfc/rfc1766.txt.
- RELATION: Relationship to other resources, for example, images
in a document, chapters in a book, or items in a collection. A formal specification
of RELATION is currently under development.
- COVERAGE: The spatial locations and temporal durations characteristic
of the resource. Formal specification of COVERAGE is currently under development.
- RIGHTS: A link (e.g., a URL or other suitable URI as appropriate)
to terms and conditions, copyright statements, or similar information.
A formal specification is currently under development.
In addition to enumerating these data elements, the Dublin Workshop
report specified a number of underlying principles that apply to the entire
core metadata set.
- The core metadata set should be extensible to permit site-specific
or domain-specific data elements.
- All elements in the Core metadata set should be optional.
- All elements should be repeatable, allowing, for example, multiple author
elements.
- The semantics of each element should be modifiable by either:
- the use of qualifiers, borrowed from other existing metadata schemes,
which allow the use of more detailed or specific semantics from those schemes.
For example, a Subject element might be specified as Subject
(scheme=LCSH), indicating that the subject terms are taken from the
Library of Congress Subject Headings.
- ad-hoc specializations and extensions developed specifically for use
with the Core so as to refine the normal meanings of the core data elements.
These principles illustrate a number of requirements in a general metadata
model, including:
- the need for structural flexibility, e.g., for repeating or
missing elements, and for adding local extensions
- the need to be able to refer to additional levels of metadata.
This is illustrated here by the example Subject (scheme=LCSH), which
identifies the source definition of the Subject attribute name,
and constitutes metadata about the metadata (the subject information) being
recorded for a particular document. In this case, the reference to the
additional level of metadata is by name (LCSH). Later in the paper,
there will be examples where these references are via explicit metalevel
pointers (e.g., URLs) that link a metadata element directly to its
definition. These capabilities allow instances of metadata to refer to
specific ontologies, where these are defined.
- the need to be able to provide metadata about metadata at multiple
levels of granularity. For example, Subject (scheme=LCSH) illustrates
the need to associate additional metadata with individual attribute
names. This permits, for example, the use of attribute names whose
definitions come from different sources.
These same principles are illustrated in a number of the specific metadata
models described later in this section, such as MCF and RDF.
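The structural requirements above (optional elements, repeatable elements, qualifiers, and local extensions) can be made concrete with a small data structure. The following Python sketch is an assumption-laden illustration, not a Dublin Core encoding: the class names `Element` and `Record`, the sample values, and the extension name `X-LOCAL-SHELF` are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    """One metadata statement: an element name, a value, optional qualifiers."""
    name: str
    value: str
    qualifiers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Record:
    """A list of elements: none is required and any may repeat."""
    elements: List[Element] = field(default_factory=list)

    def add(self, name, value, **qualifiers):
        self.elements.append(Element(name, value, dict(qualifiers)))

    def values(self, name, scheme=None):
        """Return values for an element name, optionally filtered by scheme."""
        return [
            e.value
            for e in self.elements
            if e.name == name
            and (scheme is None or e.qualifiers.get("scheme") == scheme)
        ]


record = Record()
record.add("CREATOR", "A. Author")                # repeated element
record.add("CREATOR", "B. Author")
record.add("SUBJECT", "Metadata", scheme="LCSH")  # qualified element
record.add("X-LOCAL-SHELF", "12B")                # local extension (made-up name)
```

Because the record is just a list of statements, a missing TITLE is not an error, two CREATOR entries coexist, and the `scheme` qualifier travels with the individual statement it describes, which is the per-attribute granularity noted above.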
2.2.2 Warwick Framework
The Warwick
Framework <http://cs-tr.cs.cornell.edu:80/Dienst/UI/2.0/Describe/ncstrl.cornell/TR96-1593>
defines a container architecture that builds on the Dublin Core results.
It is a mechanism for aggregating distinct packages of metadata,
allowing multiple, separately-managed metadata sets to be defined, managed,
and associated with the resources they describe. The report also describes
proposals for representing Warwick Framework structures using HTML, MIME,
SGML, and a distributed object architecture. (See also the overview papers
at http://www.dlib.org/dlib/july96/07weibel.html
and http://www.dlib.org/dlib/july96/lagoze/07lagoze.html.)
The Warwick Framework has two fundamental components: packages,
which are typed metadata sets, and containers, which are the units
for aggregating packages.
A container may be either transient or persistent. In its transient
form, it exists as a transport object between and among repositories, clients,
and agents. In its persistent form, it exists as a first-class object in
the information infrastructure. That is, it is stored on one or more servers
and is accessible from these servers using a globally accessible identifier
(URI). A container may also be wrapped within another object (i.e., one
that is a wrapper for both data and metadata). In this case the "wrapper"
object will have a URI rather than, or in addition to, the metadata container
itself.
Independent of the implementation, the only operation defined for a
container is one that returns a sequence of packages in the container.
There is no provision in this operation for ordering the members of this
sequence and thus no way for a client to assume that one package is more
significant or "better" than another.
Each package is a typed object; its type may be determined after access
by a client or agent. Packages are of three types:
- metadata set - These are packages that contain actual metadata.
Examples are packages that are MARC records, Dublin Core records, and encoded
terms and conditions (MARC stands for MAchine Readable Catalog, a standard
for the representation and communication of bibliographic and related information
in machine-readable form, used extensively in the Library community--see,
e.g., http://lcweb.loc.gov/marc/).
A potential problem is the ability of clients and agents to recognize and
process the semantics of the many metadata sets. In addition, clients and
agents will need to adapt to new metadata types as they are introduced,
at least to the extent of ignoring them gracefully, or perhaps copying
them for downstream applications that may know how to process them. Initial
implementations of the Warwick Framework will probably include a set of
well known metadata sets, in the same manner that most Web browsers have
native handlers for a set of well-known MIME types. Extending the Framework
implementations to handle an extensible collection of metadata sets is expected to rely
on a type registry scheme.
- indirect - This is a package that is an indirect reference to
another object in the information infrastructure. While the indirection
could be done using URLs, the existence of a reliable URN implementation
is necessary to avoid the problems of dangling references that currently
exist in the Web. It is important to note that the target of the indirect
package is a first-class object, and thus may have its own metadata and,
significantly, its own terms and conditions for access. Further, the target
of the indirect package may also be indirectly referenced by other containers
(i.e., sharing of metadata objects). Finally, the target of the
indirection may be in a different repository or server than the container
that references it.
- container - This is a package that is itself a container. There
is no defined limit for this recursion.
+--------------------+
|     container      |
|                    |
| +---------------+  |
| |    package    |  |
| | (Dublin Core) |  |
| +---------------+  |
| +---------------+  |
| |    package    |  |
| | (MARC Record) |  |
| +---------------+  |       +------------------------+
| +---------------+  |  URI  |        package         |
| |    package    |--+------>| (terms and conditions) |
| |  (indirect)   |  |       +------------------------+
| +---------------+  |
+--------------------+
Figure 1 - Metadata container with three packages (one indirect)
Figure 1 illustrates a simple example of a Warwick Framework container.
The container in this example contains three logical packages of metadata.
The first two, a Dublin Core record and a MARC record, are contained within
the container as a pair of packages. The third metadata set, which defines
the terms and conditions for access to the content object, is referenced
indirectly via a URI in the container (the syntax for terms and conditions
metadata and administrative metadata is not yet defined).
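The container and package structure just described can be sketched as a small set of Python types. This is a minimal sketch under stated assumptions, not one of the implementations proposed in the Warwick Framework report: the class names, the `resolve` helper, the `fetch` callback, and the example URI are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class MetadataSet:
    """A package holding actual metadata of some named type."""
    set_type: str            # e.g. "Dublin Core", "MARC"
    content: Dict[str, str]


@dataclass
class Indirect:
    """A package that refers to another object by URI."""
    uri: str


@dataclass
class Container:
    """A container is itself a package, so containers can nest."""
    _packages: List["Package"] = field(default_factory=list)

    def packages(self):
        """The single defined operation: return the contained packages."""
        return list(self._packages)


Package = Union[MetadataSet, Indirect, Container]


def resolve(package, fetch):
    """Flatten a package into metadata sets, following indirect references.

    `fetch` is a caller-supplied function mapping a URI to a package; it
    stands in for whatever retrieval mechanism an implementation uses.
    """
    if isinstance(package, MetadataSet):
        return [package]
    if isinstance(package, Indirect):
        return resolve(fetch(package.uri), fetch)
    found = []
    for member in package.packages():
        found.extend(resolve(member, fetch))
    return found


# The container of Figure 1, with a made-up URI for the indirect package.
TERMS_URI = "http://example.org/terms/42"
remote = {TERMS_URI: MetadataSet("terms and conditions", {"access": "public"})}
figure1 = Container([
    MetadataSet("Dublin Core", {"TITLE": "Example"}),
    MetadataSet("MARC", {"245": "Example"}),
    Indirect(TERMS_URI),
])
```

Calling `resolve(figure1, remote.__getitem__)` yields the three metadata sets of Figure 1. The list order is an artifact of the sketch; as noted above, the Framework assigns no significance to the order in which packages are returned.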
The mechanisms for associating a Warwick Framework container with a
content object (i.e., a document) depend on the implementation of
the Framework. The proposed implementations discussed in the cited reference
illustrate some of the options. For example, a simple Warwick Framework
container may be embedded in a document, as illustrated in the HTML implementation
proposal; or an HTML document can include a link to a container stored
as a separate file. On the other hand, as illustrated in the distributed
object proposal, a container may be a logical component of a so-called
digital object, which is a data structure for representing networked objects.
The reverse linkage, which ties a container to a piece of intellectual
content, is also relevant, since anyone can create descriptive data for
a networked resource, without permission or knowledge of the owner or manager
of that resource. This metadata is fundamentally different from the metadata
that the owner of a resource chooses to link to or embed with the resource.
As a result, an informal distinction is made between two categories of
metadata containers, which both have the same implementation:
- An internally-referenced metadata container is the metadata
that the author or maintainer of a content object has selected as describing
the object. This metadata is associated with the content by either embedding
it as part of the structure that holds the content or referencing it via
a URI. An internally-referenced metadata container referenced via a URI
is, by nature, a first-class networked object, and may have its own metadata
container associated with it. In addition, an internally-referenced metadata
container may back-reference the content that it describes via a linkage
metadata element within the container.
- An externally-referenced metadata container is metadata that
may be created and maintained by an authority separate from the creator
or maintainer of the content object. In fact, the creator of the object
may not even be aware of this metadata. There may be an unlimited number of
such externally-referenced metadata containers. For example, libraries,
indexing services, ratings services, and the like may compose sets of metadata
for content objects that exist on the net. As stated earlier, these externally-referenced
metadata containers are themselves first-class network objects, accessible
through a URI and having some associated metadata. The linkage to the content
that one of these externally-referenced containers purports to describe
will be via a linkage metadata element within the container. There is no
requirement, nor is it expected, that the content object will reference
these externally-referenced containers in any way.
One of the motivations for the development of the Warwick Framework
was a recognition that, even if attention is restricted to metadata for
descriptive cataloging (the subject of the Dublin Core), many different
formats for such metadata have been defined (including specialized forms
for particular kinds of data, such as geospatial data), and techniques
must be defined for organizing the metadata about an object that may appear
in these multiple forms.
Another motivation was the recognition that there are many other kinds
of metadata besides that used for descriptive cataloging that may need
to be recorded and organized. These kinds of metadata include, among others:
- terms and conditions - metadata that describes the rules for use of
an object, such as an access list of who can view the object, a set of
prices and fees for use of the object, or a definition of permitted uses
of an object (viewing, printing, copying, etc.).
- administrative data - metadata that relates to the management of an
object in a particular server or repository, such as date of last modification,
or the administrator's identity.
- content ratings - a description of attributes of an object within a
multidimensional scaled rating scheme as assigned by some rating authority,
e.g., using the PICS mechanism.
- provenance - data defining the source or origin of some content object,
for example a description of the physical artifact from which the content was
scanned, or a summary of algorithmic transformations that have been applied
to the object (filtering, decimation, etc.).
- linkage or relationship data - data describing relationships to other
objects, such as the relationship of a set of journal articles to the containing
journal, the relationship of a translation to the work in its original
language, or the relationships among the components of a multimedia work
(including synchronization information between images and a soundtrack,
for example). Relationships should be defined using some unique persistent
identifier such as an ISBN, ISSN, or URN.
- structural data - data defining the logical components of complex or
compound objects and how to access those components, such as a table of
contents, or the definition of the different source files, subroutines,
and data definitions in a software suite.
The Warwick Framework illustrates a number of very basic structural
requirements and options that must be supported in representing metadata,
and linking it with the data it describes. Like the principles reflected
in the Dublin Core, the Warwick Framework principles are illustrated in
a number of the specific metadata models described later in this section,
such as RDF. For example, RDF assertions (see below) correspond
closely to Warwick Framework packages, and the various means provided for
associating RDF assertions with the resources they describe support options
identified in the Warwick Framework.
2.2.3 PICS and PICS-NG
PICS (Platform for Internet Content
Selection) <http://www.w3.org/PICS/> is an infrastructure for
associating labels (metadata) with Internet content. It was originally
designed to help parents and teachers control what children access on the
Internet, but it also facilitates other uses for labels, including code
signing, privacy, and intellectual property rights management. PICS currently
defines the following recommendations:
- Rating Services and
Rating Systems (and Their Machine Readable Descriptions) <http://www.w3.org/TR/REC-PICS-services>
(earlier version in World Wide Web Journal, 1(4), Fall 1996, 23-43) defines
a language for describing rating services and systems. Software programs
will read service descriptions written in this language, in order to interpret
content labels and assist end-users in configuring selection software.
- PICS Label Distribution
-- Label Syntax and Communication Protocols <http://www.w3.org/TR/REC-PICS-labels>
(earlier version in World Wide Web Journal, 1(4), Fall 1996, 45-69) specifies
the syntax and semantics of content labels and HTTP-related protocol(s)
for distributing labels as part of PICS.
- PICSRules 1.1 <http://www.w3.org/TR/REC-PICSRules>
defines a language for writing profiles, which are filtering rules that
allow or block access to URLs based on PICS labels that describe those
URLs.
In PICS, a rating service is an individual or organization that
provides content labels for resources on the Internet. The labels it provides
are based on a rating system. Each rating service must describe
itself using a PICS-defined MIME type application/pics-service.
Selection software that relies on ratings from a PICS rating service can
first load the application/pics-service description. This description
allows the software to tailor its user interface to reflect the details
of a particular rating service.
Each rating service picks a URL as its unique identifier, and includes
this unique identifier in all content labels the service produces. It is
intended that the URL, in addition to being a unique identifier,
also refer to an HTML document that describes both the rating service
and the rating system used by the service (possibly via a link to
a separate document).
A rating system specifies the dimensions used for labeling, the
scale of allowable values for each dimension, and a description of the
criteria used in assigning values. For example, the MPAA rates movies in
the U.S. based on a single dimension with allowable values G, PG, PG-13,
R, and NC-17. The current PICS specification allows only floating point
values, so symbolic ratings such as these must be encoded as numbers.
Each rating system is identified by a URL. This allows multiple services
to use the same rating system, and refer to it by its identifier. The URL
identifying a rating system can be accessed to obtain a human-readable
description of the rating system.
A content label, or rating, contains information about
a document. The format of a content label is defined in the Label Format
document referenced above, and has three parts:
- the URL identifying the rating service that produced the label
- a set of PICS-defined (and extensible) attribute-value pairs which
provide information about the rating, such as the date the rating was assigned.
- a set of rating-system-defined attribute-value pairs that actually
rate the document along various dimensions or categories (chosen by the
rating system)
A new MIME type application/pics-labels is also defined for transmitting
one or more content labels.
When an end-user attempts to access a particular URL, a software filter
built into the Web client (browser) fetches the document. The client also
accesses the document's content label(s) based on rating systems that the
client has been told to pay attention to. The client then compares the
content label to the rating-system-specified values that the client has
been told to base access decisions on, and either allows or denies access
to the document.
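The access decision described above can be sketched as a simple comparison of label values against user-configured limits. The following Python fragment is a deliberately reduced illustration and does not implement PICSRules; the function name `allow_access`, the "maximum allowed value" form of policy, the rule that a missing label blocks access, and the service URL are all assumptions made for the example.

```python
def allow_access(labels, policy):
    """Decide whether a document may be shown.

    labels: maps a rating-service URL to {category: value} ratings.
    policy: maps a rating-service URL to {category: maximum allowed value}.
    A document is blocked if a required rating is missing or exceeds its
    limit. Treating "missing" as "blocked" is a policy choice assumed here.
    """
    for service, limits in policy.items():
        ratings = labels.get(service)
        if ratings is None:
            return False
        for category, maximum in limits.items():
            if category not in ratings or ratings[category] > maximum:
                return False
    return True


SERVICE = "http://ratings.example.org/v1"   # made-up rating service URL
policy = {SERVICE: {"violence": 2}}
```

With this policy, a document labeled `{SERVICE: {"violence": 1}}` is allowed, while one labeled `{SERVICE: {"violence": 3}}`, or one with no label from that service, is denied.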
Content labels may be:
- embedded in the document (using a PICS-specified mechanism based on
the HTML META tag)
- located separately from the document on the same server, and retrieved
along with the document via a protocol that uses RFC-822 headers
- retrieved separately from the document from a "label bureau"
The following application/pics-service document (taken from the PICS
specification) describes a simple rating service.
((PICS-version 1.1)
 (rating-system "http://www.gcf.org/ratings")
 (rating-service "http://www.gcf.org/v1.0/")
 (icon "icons/gcf.gif")
 (name "The Good Clean Fun Rating System")
 (description "Everything you ever wanted to know about soap,
 cleaners, and related products")
 (category
  (transmit-as "suds")
  (name "Soapsuds Index")
  (min 0.0)
  (max 1.0))
 (category
  (transmit-as "density")
  (name "suds density")
  (label (name "none") (value 0) (icon "icons/none.gif"))
  (label (name "lots") (value 1) (icon "icons/lots.gif")))
 (category
  (transmit-as "subject")
  (name "document subject")
  (multivalue true)
  (unordered true)
  (label (name "soap") (value 0))
  (label (name "water") (value 1))
  (label (name "soapdish") (value 2))
  (label-only))
 (category
  (transmit-as "color")
  (name "picture color")
  (integer)
  (category
   (transmit-as "hue")
   (label (name "blue") (value 0))
   (label (name "red") (value 1))
   (label (name "green") (value 2)))
  (category
   (transmit-as "intensity")
   (min 0)
   (max 255))))
There are four top-level categories in this rating system. Each category
has a short transmission name to be used in labels (e.g., "suds");
some also have longer names that are more easily understood (e.g., "Soapsuds
Index"). The "Soapsuds Index" category rates soapsuds on
a scale between 0.0 and 1.0 inclusive. The "suds density" category
can have ratings from negative to positive infinity, but there are two
values that have names and icons associated with them. The name "none"
is the same as 0, and "lots" is the same as 1. The "document
subject" category only allows the values 0, 1, and 2, but a single
document can have any combination of these values. The "picture color"
category has two sub-categories.
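The constraints that this service description places on rating values can be expressed as a small validation routine. The Python sketch below models only the attributes discussed in the preceding paragraph (`min`, `max`, named labels, `label-only`, and `multivalue`); it ignores icons, `unordered`, and `integer`, and the names `Category`, `CATEGORIES`, and `is_valid` are hypothetical. Sub-categories are written in the `color/hue` form used when labels are transmitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Category:
    """A subset of the category attributes used in the example above."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    labels: Dict[str, float] = field(default_factory=dict)
    label_only: bool = False
    multivalue: bool = False


CATEGORIES = {
    "suds": Category(minimum=0.0, maximum=1.0),
    "density": Category(labels={"none": 0, "lots": 1}),
    "subject": Category(labels={"soap": 0, "water": 1, "soapdish": 2},
                        label_only=True, multivalue=True),
    "color/hue": Category(labels={"blue": 0, "red": 1, "green": 2}),
    "color/intensity": Category(minimum=0, maximum=255),
}


def is_valid(name, value):
    """Check one rating (a number, or a list of numbers) against its category."""
    category = CATEGORIES[name]
    values = value if isinstance(value, list) else [value]
    if len(values) > 1 and not category.multivalue:
        return False
    for v in values:
        if category.minimum is not None and v < category.minimum:
            return False
        if category.maximum is not None and v > category.maximum:
            return False
        if category.label_only and v not in category.labels.values():
            return False
    return True
```

Under these rules `is_valid("density", -7)` holds, since "suds density" has named values but no bounds, whereas `is_valid("subject", 3)` fails because that category is restricted to its labeled values.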
A label list is used to transmit a set of PICS labels. The following
is a label list for two documents rated using the above rating system.
(PICS-1.1 "http://www.gcf.org/v2.5"
by "John Doe"
labels on "1994.11.05T08:15-0500"
until "1995.12.31T23:59-0000"
for "http://www.w3.org/PICS/Overview.html"
ratings (suds 0.5 density 0 color/hue 1)
for "http://www.w3.org/PICS/Underview.html"
by "Jane Doe"
ratings (subject 2 density 1 color/hue 1))
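Because label lists use a parenthesized syntax, a few lines of code suffice to read one into ordinary data structures. The following Python sketch handles only the subset of the syntax that appears in the example above (parentheses, quoted strings, numbers, and bare words); it is not a complete PICS label parser, and the function names `tokenize`, `parse`, and `ratings_by_url` are invented for the illustration.

```python
import re

TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def tokenize(text):
    """Split PICS-style text into parentheses, quoted strings, and atoms."""
    return TOKEN.findall(text)


def parse(tokens):
    """Turn a token list into nested Python lists (consumes the list)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing parenthesis
        return items
    if token.startswith('"'):
        return token[1:-1]
    try:
        return float(token)
    except ValueError:
        return token


def ratings_by_url(label_list):
    """Map each `for` URL to the {category: value} dict that follows it."""
    result, url = {}, None
    for i, item in enumerate(label_list):
        if item == "for":
            url = label_list[i + 1]
        elif item == "ratings":
            pairs = label_list[i + 1]
            result[url] = dict(zip(pairs[0::2], pairs[1::2]))
    return result


TEXT = '''(PICS-1.1 "http://www.gcf.org/v2.5"
  by "John Doe"
  labels on "1994.11.05T08:15-0500"
  until "1995.12.31T23:59-0000"
  for "http://www.w3.org/PICS/Overview.html"
  ratings (suds 0.5 density 0 color/hue 1)
  for "http://www.w3.org/PICS/Underview.html"
  by "Jane Doe"
  ratings (subject 2 density 1 color/hue 1))'''

parsed = ratings_by_url(parse(tokenize(TEXT)))
```

The result maps each rated URL to its category values, all read as floating point numbers in keeping with the PICS restriction on value types. Options such as `by` and `until` are parsed but not interpreted by this sketch.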
PICS-NG (Next Generation) was a W3C effort based on the observation
that the PICS infrastructure could be generalized to support arbitrary
Web metadata, with PICS categories serving as metadata attributes, having
meanings defined by the rating system. The W3C paper Catalogs:
Resource Description and Discovery <http://www.w3.org/pub/WWW/Search/catalogs.html>
also observes that the structure of a PICS label is similar to:
- a row in a relational database table (a rating system is analogous
to the schema)
- the set of header names and values in an email message (the rating
system in this case is RFC822)
- an SOIF record
- a BibTeX bibliography entry
- an HTML form data set
The PICS-NG effort defined a Metadata Object Model, and its encodings
in XML and as S-expressions, in the note PICS-NG
Metadata Model and Label Syntax <http://www.w3.org/TR/NOTE-pics-ng-metadata>.
This model includes a number of extensions to the original PICS representation
scheme, in order to support more general forms of metadata. These extensions
include such things as:
- additional primitive types such as strings, URLs, lists, and symbols
(sequences of characters acting as unique identifiers)
- the ability for attributes to describe objects, things referred to
by objects, and other attributes
- an inheritance mechanism to allow collections of metadata to be shared
and reused
- the ability to define schemata that define the meanings and other metadata
about attributes (the draft notes that the mechanism for doing this may
take on features of metaobject protocols)
- a metamodel for the model itself, including a small set of attributes
that are available in any label
The PICS-NG effort has been merged with other work to become W3C's Resource
Description Framework activity (see Section 2.2.6).
PICS illustrates a number of important ideas in data modeling and metadata
representation. One such idea is the definition of specific required
data items (e.g., category, label) having predefined
meanings in the model. Such specifications are important in supporting
interoperability among applications that use PICS ratings. PICS also illustrates
the use of metalevel pointers. The URLs that identify rating services
and rating systems in PICS point to information that describes PICS metadata
(i.e., to metametadata). These illustrate the idea that a given piece of
data on the Web, no matter what its intended purpose (e.g., whether it
is intended to represent data or metadata), can itself point to (or be
related in some other way to) data that can be used to help interpret it.
Finally, PICS illustrates the use of a metalevel (or reflective)
architecture. PICS requires that ordinary requests for data on the
Web be interrupted or intercepted, so that rating information about the
requested resource can be retrieved, and a decision made about whether
to return the requested data or not. This same basic idea can be used to
enhance individual requests with other types of additional processing,
often transparently to users. For example, such processing could be used
to bracket a collection of individual requests to form a database-like
transaction, by adding interactions with a transaction processor to these
requests. Examples of such processing are described in [CM93, Man93, SW96].
These same ideas are the basis for current OBJS work on an Intermediary
Architecture <http://www.objs.com/workshops/ws9801/papers/paper103.html>
for the Web.
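The interception idea behind a metalevel architecture can be illustrated by wrapping an ordinary fetch operation with an additional processing step. The Python sketch below is a generic illustration and does not describe the OBJS Intermediary Architecture or any PICS client; the function `with_rating_check`, the stand-in dictionaries, and the example URLs are assumptions.

```python
def with_rating_check(fetch, get_ratings, is_acceptable):
    """Wrap a fetch function so every request is checked before data returns.

    fetch(url) returns content; get_ratings(url) returns rating metadata;
    is_acceptable(ratings) returns True or False. All three are supplied
    by the caller and are placeholders for real mechanisms.
    """
    def intercepted_fetch(url):
        ratings = get_ratings(url)          # metalevel step: consult metadata
        if not is_acceptable(ratings):
            raise PermissionError("blocked by rating policy: " + url)
        return fetch(url)                   # base-level step: the real request
    return intercepted_fetch


# Stand-in data so the sketch runs without a network.
PAGES = {"http://example.org/a": "page A", "http://example.org/b": "page B"}
RATINGS = {"http://example.org/a": {"suds": 0.2},
           "http://example.org/b": {"suds": 0.9}}

safe_fetch = with_rating_check(
    PAGES.__getitem__,
    RATINGS.get,
    lambda ratings: ratings is not None and ratings["suds"] <= 0.5,
)
```

Code that calls `safe_fetch` is written exactly as if it called the original fetch function, which is the transparency referred to above. The same wrapping pattern could add other processing, such as transaction bracketing, by substituting a different metalevel step.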
As illustrated by the existence of a PICS-NG effort, PICS itself requires
extensions to deal with more general metadata requirements. Some of these
are described further in the discussion of the Resource Description Framework
(Section 2.2.6). In addition, in order to provide a complete Web object
model, PICS and similar ideas must be augmented with an API providing applications
with easy access to the state, and with mechanisms to link code to the
state represented using models such as PICS. These aspects will be discussed
in subsequent sections.
2.2.4 XML-Data
XML-Data <http://www.w3.org/TR/1998/NOTE-XML-data/>
is a submission to W3C by Microsoft, ArborText, DataChannel, and INSO.
XML-Data defines an XML vocabulary for schemas, that is, for defining
and documenting object classes. It can be used either for classes which
are strictly syntactic (for example, XML), or which indicate concepts and
relations among concepts (as used in relational databases, knowledge representation
graphs, and RDF). The former are called "syntactic schemas;"
the latter "conceptual schemas."
For example, an XML document might contain a "book" element
which lexically contains an "author" element and a "title"
element. An XML-Data schema can describe such syntax. However, in another
context, it may simply be necessary to represent more abstractly that books
have titles and authors, irrespective of any syntax. XML-Data schemas can
also describe such conceptual relationships. Further, the information about
books, titles and authors might be stored in a relational database, in
which case XML-Data schemas can describe the database row types and key
relationships.
One immediate implication of the ideas in XML-Data is that, using XML-Data,
XML document types can be described using XML itself, rather than DTD syntax.
Another is that XML-Data schemas provide a common vocabulary for ideas
which overlap between syntactic, database and conceptual schemas. All features
can be used together as appropriate.
Schemas in XML-Data are composed principally of declarations for:
- Concepts
- Classes of objects
- Class hierarchies
- Properties
- Constraints
- Relationships
- Indicated by primary key to foreign key matching
- Indicated by URI
- XML DTD Grammars and Compatibility
- grammatical rules governing the valid nesting of the elements and attributes
- attributes of elements
- internal and external entities
- notations
- Data types giving parsing rules and implementation formats.
- Mapping rules allowing abbreviated grammars to map to a conceptual
data model.
The following simple example taken from the XML-Data submission shows
some data about books and authors, and the XML-Data schema which describes
it (note the use of the XML Namespace facility, described in Section 2.1.4,
for qualifying names).
Some data:
<?xml:namespace name="http://company.com/schemas/books/" as="bk"?>
<?xml:namespace name="http://www.ecom.org/schemas/dc/" as="ecom" ?>
<bk:booksAndAuthors>
    <Person>
        <name>Henry Ford</name>
        <birthday>1863</birthday>
    </Person>
    <Person>
        <name>Harvey S. Firestone</name>
    </Person>
    <Person>
        <name>Samuel Crowther</name>
    </Person>
    <Book>
        <author>Henry Ford</author>
        <author>Samuel Crowther</author>
        <title>My Life and Work</title>
    </Book>
    <Book>
        <author>Harvey S. Firestone</author>
        <author>Samuel Crowther</author>
        <title>Men and Rubber</title>
        <ecom:price>23.95</ecom:price>
    </Book>
</bk:booksAndAuthors>
The schema for http://company.com/schemas/books:
<?xml:namespace name="urn:uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882/" as="s"?>
<?xml:namespace href="http://www.ecom.org/schemas/ecom/" as="ecom" ?>
<s:schema>
    <elementType id="name">
        <string/>
    </elementType>
    <elementType id="birthday">
        <string/>
        <dataType dt="date.ISO8601"/>
    </elementType>
    <elementType id="Person">
        <element type="#name" id="p1"/>
        <element type="#birthday" occurs="OPTIONAL">
            <min>1700-01-01</min><max>2100-01-01</max>
        </element>
        <key id="k1"><keyPart href="#p1" /></key>
    </elementType>
    <elementType id="author">
        <string/>
        <domain type="#Book"/>
        <foreignKey range="#Person" key="#k1"/>
    </elementType>
    <elementType id="writtenWork">
        <element type="#author" occurs="ONEORMORE"/>
    </elementType>
    <elementType id="Book" >
        <genus type="#writtenWork"/>
        <superType href="http://www.ecom.org/schemas/ecom/commercialItem"/>
        <superType href="http://www.ecom.org/schemas/ecom/inventoryItem"/>
        <group groupOrder="SEQ" occurs="OPTIONAL">
            <element type="#preface"/>
            <element type="#introduction"/>
        </group>
        <element href="http://www.ecom.org/schemas/ecom/price"/>
        <element href="ecom:quantityOnHand"/>
    </elementType>
    <elementTypeEquivalent id="livre" type="#Book"/>
    <elementTypeEquivalent id="auteur" type="#author"/>
</s:schema>
While this example does not illustrate all of the capabilities of XML-Data,
it does illustrate the capabilities of declaring such things as:
- the names and data types of data elements and groups
- required or optional data elements
- constraints on values (e.g., minimum and maximum)
- data elements which act as keys
- referential integrity constraints between keys in one group and foreign
keys in another
- class hierarchies (supertype relationships)
- mixing declarations from multiple schemas
The submission
should be consulted for further details and additional examples.
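To show what a processor might do with declarations of this kind, the following Python sketch checks two of the constraints listed above against a simplified copy of the example data: the key on a Person's name, and the foreign key from each author to a Person. It does not read or interpret an XML-Data schema; the constraints are hard-coded, the namespace declarations and prefixed elements are omitted so that a standard XML parser accepts the input, and the function name `check` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the example data, without namespace prefixes.
DATA = """
<booksAndAuthors>
  <Person><name>Henry Ford</name><birthday>1863</birthday></Person>
  <Person><name>Harvey S. Firestone</name></Person>
  <Person><name>Samuel Crowther</name></Person>
  <Book>
    <author>Henry Ford</author>
    <author>Samuel Crowther</author>
    <title>My Life and Work</title>
  </Book>
  <Book>
    <author>Harvey S. Firestone</author>
    <author>Samuel Crowther</author>
    <title>Men and Rubber</title>
  </Book>
</booksAndAuthors>
"""


def check(root):
    """Return a list of constraint violations found in the document."""
    problems = []
    # Key: every Person has a name, and names are unique.
    names = [p.findtext("name") for p in root.findall("Person")]
    if None in names or len(set(names)) != len(names):
        problems.append("Person names must be present and unique")
    for book in root.findall("Book"):
        authors = [a.text for a in book.findall("author")]
        # occurs="ONEORMORE": a Book needs at least one author.
        if not authors:
            problems.append("Book without author")
        # Foreign key: each author must match some Person's name.
        for author in authors:
            if author not in names:
                problems.append("Unknown author: " + author)
    return problems


root = ET.fromstring(DATA)
```

For the example data `check(root)` finds no violations. A schema-driven processor would derive the same checks from the `key` and `foreignKey` declarations rather than from hand-written code, which is the advantage of expressing such constraints in the schema.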
XML-Data is another example of a higher-level model built using XML
as its representation. It is not yet clear how the overlap in metadata
capabilities between such representations as DTDs, RDF, and XML-Data will
work out. The XML-Data approach may prove to be better than DTDs in supporting
some types of processing, such as database-like operations, since it makes
no distinctions between data and metadata representations. Like the other
data models described in this section, XML-Data is not sufficient to form
a complete Web object model. In particular, it requires integration with
an API facility and a mechanism to access associated code.
2.2.5 Meta Content Framework (MCF)
Netscape's Meta Content
Framework (MCF) <http://www.w3.org/TR/NOTE-MCF-XML/> [GB97] is
a proposal for a metadata model based on the increasing need for machine-readable
descriptions of distributed information. MCF is based on the following
principles:
- There is no useful distinction between data and metadata; these are
simply roles that data may play with respect to some application or requirement,
and hence there should be no special syntax reserved just for "metadata"
- For interoperability and efficiency, descriptive information should
share a common data model and vocabulary (e.g., attribute set) as much
as possible
The latter point is particularly important. If all applications saved
their data in XML format, this would be somewhat more open than the use
of proprietary formats, since any application could access the resulting
documents. However, in order for applications to meaningfully process
those documents, it would be necessary for the applications to recognize
the various labels and structures used in those documents, and their associated
semantics. Agreements on data models and vocabularies allow this sort of
mutual recognition of labels and structure among applications, thus supporting
interoperability.
MCF is essentially a structure description language. The basic information
structure used is the Directed Labeled Graph (DLG). An MCF database is
a set of DLGs, consisting of:
- a set of labels (property types)
- a set of nodes
- a set of arcs, where each arc is a triple consisting of two nodes (the
origin and destination) and a label. Arcs are also referred to as properties.
Nodes represent things like web pages, images, subject categories, and
sites. The labels are nodes that correspond to properties such as size
or lastRevisionDate used to describe web pages, subject categories,
etc., and also to define relations such as hyperlinks, authorship, or parenthood,
between these things.
Each label/property type, such as pageSize, is a node (but not
all nodes are property types). Since labels are nodes, they can participate
in relationships that, e.g., define their semantics. For example, a pageSize
node could have properties that specify its domain (e.g., Document),
its range (sizeInBytes), that a Document has only one pageSize,
and that provide human-readable documentation of the intended semantics.
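The structure described above can be sketched in a few lines of Python. This is only an illustration of the directed labeled graph idea, not MCF's actual representation: the arc layout (origin, label, destination), the example URL, and the property names domain and range are assumptions made for the sketch.

```python
# An MCF-style database sketched as a set of (origin, label, destination) arcs.
ARCS = {
    ("http://www.example.org/index.html", "typeOf", "Document"),
    ("http://www.example.org/index.html", "pageSize", 2048),
    # Labels are themselves nodes, so they can be the origin of arcs
    # that describe their own semantics:
    ("pageSize", "typeOf", "PropertyType"),
    ("pageSize", "domain", "Document"),
    ("pageSize", "range", "sizeInBytes"),
}


def properties_of(node, arcs):
    """Collect the label -> destination pairs for arcs leaving `node`."""
    return {label: dest for origin, label, dest in arcs if origin == node}


page_properties = properties_of("http://www.example.org/index.html", ARCS)
label_properties = properties_of("pageSize", ARCS)
```

The same lookup serves both a Web page and the pageSize label, which is the practical consequence of treating labels as ordinary nodes.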
An MCF node can be either a primitive data type or a "Unit".
The primitive data types are the same as the Java primitive types. In addition,
a DATE type should be supported by the low-level MCF machinery. The concept
of a "Unit" corresponds loosely to the Java concept of "Object".
MCF defines a small set of units with predefined semantics in order
to "bootstrap" the type system. These include, among others:
- typeOf: the PropertyType used to specify that a given object is of
a certain type. Every unit has a typeOf property
- Category: corresponds to an object Class. The destination of a typeOf
property has a typeOf property which ends at Category
- Unit: the most general Category
- PropertyType: the typeOf all properties
- superType: a PropertyType used to indicate that one Category is the
superType of another (i.e., MCF supports type/subtype hierarchies as found
in many object models).
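A minimal sketch of how typeOf and superType arcs might be used together follows. It assumes that a superType arc runs from the more specific Category to the more general one, which the description above does not settle; the WebPage category and the function names are hypothetical.

```python
# Arcs are (origin, label, destination); superType is assumed to point
# from the subtype to its supertype.
ARCS = {
    ("Unit", "typeOf", "Category"),
    ("Document", "typeOf", "Category"),
    ("WebPage", "typeOf", "Category"),
    ("Document", "superType", "Unit"),
    ("WebPage", "superType", "Document"),
    ("http://www.example.org/index.html", "typeOf", "WebPage"),
}


def supertypes(category, arcs):
    """Return every Category reachable from `category` via superType arcs."""
    seen = set()
    frontier = [category]
    while frontier:
        current = frontier.pop()
        for origin, label, dest in arcs:
            if origin == current and label == "superType" and dest not in seen:
                seen.add(dest)
                frontier.append(dest)
    return seen


def is_a(node, category, arcs):
    """True if `node` has typeOf `category` or typeOf one of its subtypes."""
    direct = {dest for origin, label, dest in arcs
              if origin == node and label == "typeOf"}
    return any(c == category or category in supertypes(c, arcs) for c in direct)
```

Under these assumptions the example page is a WebPage, a Document, and a Unit, but not a PropertyType.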
MCF recognizes that, for purposes of interoperability, it would be good
to standardize the vocabulary for commonly-used terms. [GB97] proposes
some items for this vocabulary (largely derived from existing standards
such as the Dublin Core) for describing Web content. [GB97] also defines
an XML-based syntax for representing MCF. This essentially defines a type
system for XML.
Like PICS, MCF illustrates a number of important ideas in data modeling
and metadata representation. For example, MCF illustrates both the use
of specific required data items having predefined meanings in the model,
and metalevel pointers. Unlike PICS, MCF represents a data model that can
be used for more general purposes than content labeling. For example, it
includes a type hierarchy, a richer set of base types, and other aspects
of a full data model. In addition to required data items representing aspects
of the model structure, the MCF reference identifies a list of suggestions
for standard application-specific item names borrowed from the Dublin Core
and elsewhere. MCF "units" are similar to the individual elements
of the OEM model. Many MCF concepts have been incorporated into W3C's RDF
(described in the next section). However, as noted in connection with other
models in this section, these concepts must be combined with an API and
a mechanism for integrating behavior to provide full object model support.
2.2.6 Resource Description Framework (RDF)
The World Wide Web Consortium's Resource
Description Framework (RDF) effort <http://www.w3.org/Metadata/RDF/>
is currently developing a mechanism designed for exchanging machine-understandable
metadata describing Web resources. This type of metadata can be used, e.g.:
- in resource discovery, to provide better search engine capabilities
- for cataloging, in describing the content and content relationships
available at a particular Web site, page, or digital library
- by intelligent software agents, to facilitate knowledge sharing and
exchange
- in content rating
- in describing collections of resources that represent a single logical
"document"
- in describing intellectual property rights
- combined with digital signatures, in electronic commerce, collaboration,
and similar applications
The work combines extensions of the PICS technology to support more
general metadata requirements with work on metadata models such as Netscape's
Meta Content Framework (MCF) and Microsoft's Web
Collections [Hop97]. The current
RDF draft specification <http://www.w3.org/TR/WD-rdf-syntax/>
defines both a data model for representing RDF metadata, and an XML-based
syntax for expressing and transporting the metadata.
The basis of RDF is a model for representing named properties
and their values. These properties serve both to represent attributes of
resources (and in this sense correspond to attribute/value pairs) and to
represent relationships between resources. The RDF data model is a
syntax-independent way of representing RDF statements.
The core RDF data model is defined in terms of:
- a set of Nodes (N)
- a set of PropertyTypes (P), a subset of N
- a set of 3-tuples T, whose elements are informally known as properties.
The first item of each tuple is an element of P, the second item is an
element of N and the third item is either an element of N or an atomic
value (e.g. a Unicode string).
(thus resembling MCF).
In this data model both the resources being described and the values
describing them are nodes in a directed labeled graph (values may
themselves be resources). The arcs connecting pairs of nodes correspond
to the names of the property types. This is represented pictorially as:
[resource R] ---propertyType P---> [value V]
and can be read "V is the value of the property P for resource
R", or, left-to-right, "R has property P with value V". For
example, the statement "John Smith is the Author of the Web page http://www.bar.com/some.doc"
would be represented as:
[http://www.bar.com/some.doc] ---author---> "John Smith"
where the notation [URI] denotes the instance of the resource identified
by URI and "..." denotes a simple Unicode string.
According to the above definition, the property "author",
i.e. the arc labeled "author" plus its source and target nodes
is the triple (3-tuple):
{author, [http://www.bar.com/some.doc], "John Smith"}
where "author" denotes a node used for labeling this arc.
The triple composed of a resource, a property type, and a value is an RDF statement.
A collection of these triples with the same second item is called an
assertions. Assertions are particularly useful when describing a
number of properties of the same resource. Assertions are diagramed as
follows:
[resource R]-+---property P1----> [value Vp1]
|
+---property P2----> [value Vp2]
An RDF assertions can be a resource itself and can therefore
be described by properties; that is, an assertions can itself be
used as the source node of an arc. The name assertions is suggestive of
the fact that the properties specified in it are effectively (logical)
assertions about the resource being described. This establishes a relationship
between RDF and a logic-based interpretation of the data structure which
will be further developed in Section 3.
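The triple model and the grouping of triples into assertions can be sketched as follows. The tuple ordering (property type, resource, value) follows the definition given above; the second document, the title property, and the function name are illustrative assumptions rather than anything defined by RDF.

```python
from collections import defaultdict

# Each triple is (property type, resource, value), as in the set T above.
TRIPLES = [
    ("author", "http://www.bar.com/some.doc", "John Smith"),
    ("title", "http://www.bar.com/some.doc", "Some Document"),      # hypothetical
    ("author", "http://www.bar.com/other.doc", "Jane Doe"),         # hypothetical
]


def group_assertions(triples):
    """Group triples by their second item (the resource being described)."""
    grouped = defaultdict(dict)
    for prop, resource, value in triples:
        # A plain dict keeps one value per property; multi-valued
        # properties would need an RDF collection node instead.
        grouped[resource][prop] = value
    return dict(grouped)


assertions = group_assertions(TRIPLES)
```

Each entry of the result corresponds to one of the fan-shaped diagrams shown above: a single resource with one outgoing arc per property.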
Assertions may be associated with the resource they describe in one
of four ways:
- the assertions may be contained within the resource (embedded)
- the assertions may be external to the resource but supplied by the
transfer mechanism in the same retrieval transaction as that which returns
the resource (along-with)
- the assertions may be retrieved independently from the resource, including
from a different source (service bureau)
- the assertions may contain the resource (wrapped)
Not all resources will support all association methods (e.g., many resource
types will not support embedding).
The set of properties in a given assertions, as well as any characteristics
or restrictions of the property values themselves, are defined by one or
more schemas. Schemas are identified by a URL. An assertions
may contain properties from more than one schema. RDF uses the XML namespace
mechanism to associate the schema with the properties in the assertions.
The schema URL may be treated merely as an identifier, or it may refer
to a machine-readable description of the schema. By definition, an application
that understands a particular schema used by an assertions understands
the semantics of each of the contained properties. An application that
has no knowledge of a particular schema will minimally be able to parse
the assertions into the property and property value components, and will
be able to transport the assertions intact (e.g., to a cache or to another
application).
A human- or machine-readable description of an RDF schema may be accessed
through content negotiation by dereferencing the schema URL. If the schema
is machine-readable, it may be possible for an application to dynamically
learn some of the semantics of the properties named in the schema.
An RDF statement can itself be the target node of an arc (i.e. the value
of some other property) or the source node of an arc (i.e. it can have
properties). In these cases, the original property (i.e., the statement)
must be reified; that is, converted into nodes and arcs. RDF defines
a reification mechanism for doing this. Reified properties are drawn
as a single node with several arcs emanating from it representing the resource,
property name, and value:
[property P1]-+---PropName---> ["name"]
|
+---PropObj----> [resource R]
|
+---PropValue--> [value Vp1]
This allows RDF to be used to make statements about other statements;
for example, the statement "Joe believes that the document 'The Origin
of Species' was authored by Charles Darwin" would be diagramed as:
[Joe]--believes-->[stmnt1]+--InstanceOf-> RDF:Property
|
+--PropName->"author"
|
+--PropObj->[http://loc.gov/Books/Species]
|
+--PropValue->"Charles Darwin"
To help in reifying properties, RDF defines the InstanceOf relation
(property) to provide primitive typing, as shown in the example.
To reify a property, all that is done is to add to the data model an
additional node (with a generated label) and the three triples with first
items (or arcs with labels) using the predefined names RDF:PropName,
RDF:PropObj, and RDF:PropValue respectively, second item
the generated node label, and third item the corresponding property type,
resource node, and value node respectively. In the above example, the three
added triples would be:
{PropName, stmnt1, "author"}
{PropObj, stmnt1, [http://loc.gov/Books/Species]}
{PropValue, stmnt1, "Charles Darwin"}
(The use of the "RDF:" prefix in names illustrates
the use of the XML namespace mechanism to qualify names to indicate the
schema in which they are defined.)
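The reification step can be sketched as a small Python function. This is an illustration under stated assumptions, not RDF's defined procedure: the function name, the generated label format, and the decision to emit the RDF:InstanceOf triple alongside the three described above (as the diagram does) are choices made for the sketch.

```python
import itertools

_labels = itertools.count(1)


def reify(triple, node_id=None):
    """Turn one (property type, resource, value) triple into a node plus
    the triples that describe it."""
    prop, resource, value = triple
    node = node_id or f"stmnt{next(_labels)}"   # generated label
    return node, [
        ("RDF:InstanceOf", node, "RDF:Property"),
        ("RDF:PropName", node, prop),
        ("RDF:PropObj", node, resource),
        ("RDF:PropValue", node, value),
    ]


statement = ("author", "http://loc.gov/Books/Species", "Charles Darwin")
node, described = reify(statement, node_id="stmnt1")

# The reified statement can now be the value of another property.
belief = ("believes", "Joe", node)
```

Because the reified statement is an ordinary node, the belief triple needs no special machinery; it simply names stmnt1 as its value.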
Frequently it is necessary to create a collection of nodes; e.g. to
state that a property has multiple values. RDF defines three kinds
of collections: ordered lists of nodes, called sequences, unordered
lists of nodes, called bags, and lists that represent alternatives
for the (single) value of a property, called alternatives. To create
collections of nodes, a new node is created that is an RDF:InstanceOf
one of the three node types RDF:Seq, RDF:Bag, or RDF:Alternatives.
The remaining arcs from that new node point to each of the members of the
collection and are uniquely labeled using the elements from Ord, the set of
ordinal labels RDF:1, RDF:2, RDF:3, and so on.
For the RDF:Alternatives, there must be at least one member whose
arc label is RDF:1, and that is the default value for the Alternatives
node.
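A collection node of this kind might be built as in the sketch below. The function name and the node label are hypothetical, and the triples use the (property type, resource, value) ordering from the data model; only the three collection types and the RDF:1, RDF:2, ... arc labels come from the description above.

```python
COLLECTION_KINDS = ("RDF:Seq", "RDF:Bag", "RDF:Alternatives")


def make_collection(node, kind, members):
    """Return the triples describing a collection node and its members."""
    if kind not in COLLECTION_KINDS:
        raise ValueError(f"unknown collection kind: {kind}")
    if kind == "RDF:Alternatives" and not members:
        # An Alternatives node needs an RDF:1 member as its default value.
        raise ValueError("RDF:Alternatives requires at least one member")
    triples = [("RDF:InstanceOf", node, kind)]
    for position, member in enumerate(members, start=1):
        triples.append((f"RDF:{position}", node, member))
    return triples


authors = make_collection(
    "authors_001", "RDF:Seq", ["Harvey S. Firestone", "Samuel Crowther"]
)
```

The property whose value has several members then points at authors_001 rather than at any single member.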
The RDF data model provides an abstract, conceptual framework for defining
and using metadata. A concrete syntax is also needed for the purpose of
authoring and exchanging this metadata. The syntax does not add to the
model, and APIs could be provided to manipulate RDF metadata without reference
to a concrete syntax. RDF uses XML encoding as its syntax. However, RDF
does not require an XML DTD for the contents of assertion blocks (and RDF
schemas are not required to be XML DTDs). In this respect, RDF requires
at most that its XML representations be well-formed.
RDF defines several XML elements for its XML encoding. The RDF:serialization
element is a simple wrapper that marks the boundaries in an XML document,
where the content is explicitly intended to be mappable into an RDF data
model instance. RDF:assertions and RDF:resource contain
the remaining elements that instantiate properties in the model instance.
Each XML element E contained by an RDF:assertions or an RDF:resource
results in the creation of a property (a triple that is an element of the
formal set T defined earlier).
With these basic principles defined, directed graph models of arbitrary
complexity can be constructed and exchanged. A simple example would be
"John Smith is the Author of the document whose URL is http://www.bar.com/some.doc"
(all these examples are taken from the RDF paper cited above, but updated
to use more recent XML namespace syntax). This assertion can be modeled
with the directed graph:
[http://www.bar.com/some.doc] ---bib:author---> "John Smith"
(This report uses a notation where Nodes are represented by items in
square brackets, arcs are represented as arrows, and strings are represented
by quoted items.) This small graph can be exchanged in the serialization
syntax as:
<?xml:namespace name="http://docs.r.us.com/bibliography-info" as="bib"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
<RDF:assertions href="http://www.bar.com/some.doc">
<bib:author>John Smith</bib:author>
</RDF:assertions>
</RDF:serialization>
This example illustrates how the resource, property name, and value
are translated into XML.
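The translation can be mimicked with a short Python sketch that builds the serialization as text. It simply reproduces the draft syntax shown above by string formatting; the function name and its parameters are hypothetical, and the output is not meant to be accepted by a namespace-aware XML parser, since the xml:namespace processing instruction is a draft convention.

```python
from xml.sax.saxutils import escape, quoteattr


def serialize(resource, properties, namespaces):
    """Render one assertions block in the draft serialization syntax.

    properties: list of (qualified property name, string value) pairs
    namespaces: mapping of prefix -> schema URI
    """
    lines = [
        f"<?xml:namespace name={quoteattr(uri)} as={quoteattr(prefix)}?>"
        for prefix, uri in namespaces.items()
    ]
    lines.append("<RDF:serialization>")
    lines.append(f"  <RDF:assertions href={quoteattr(resource)}>")
    for name, value in properties:
        lines.append(f"    <{name}>{escape(value)}</{name}>")
    lines.append("  </RDF:assertions>")
    lines.append("</RDF:serialization>")
    return "\n".join(lines)


text = serialize(
    "http://www.bar.com/some.doc",
    [("bib:author", "John Smith")],
    {
        "bib": "http://docs.r.us.com/bibliography-info",
        "RDF": "http://www.w3.org/schemas/rdf-schema",
    },
)
```

The resource becomes the href attribute of RDF:assertions, the property name becomes an element name, and the value becomes that element's content.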
A more elaborate model could be created in order to say additional things
about John Smith, such as his contact information, as in the model:
[http://www.bar.com/some.doc]
|
bib:author
|
V
[John Smith]-+---bib:name----> "John Smith"
|
+---bib:email----> "john@smith.com"
|
+---bib:phone----> "+1 (555) 123-4567"
which could be exchanged using the XML serialization representation:
<?xml:namespace name="http://docs.r.us.com/bibliography-info" as="bib"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
<RDF:assertions href="http://www.bar.com/some.doc">
<bib:author>
<RDF:resource>
<bib:name>John Smith</bib:name>
<bib:email>john@smith.com</bib:email>
<bib:phone>+1 (555) 123-4567</bib:phone>
</RDF:resource>
</bib:author>
</RDF:assertions>
</RDF:serialization>
The serialization above is equivalent to this second serialization:
<?xml:namespace name="http://docs.r.us.com/bibliography-info" as="bib"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
<RDF:assertions href="http://www.bar.com/some.doc">
<bib:author href="#John_Smith"/>
</RDF:assertions>
</RDF:serialization>
<RDF:resource id="John_Smith">
<bib:name>John Smith</bib:name>
<bib:email>john@smith.com</bib:email>
<bib:phone>+1 (555) 123-4567</bib:phone>
</RDF:resource>
In these representations, the RDF:resource element creates
an in-line resource. Typically such a resource will be a surrogate, or
proxy, for some other real resource that does not have a recognizable URI.
The id= attribute in the second representation provides a name
for the resource element so that the resource may be referred to elsewhere.
As an example of making a statement about a statement, consider the
case of computing a digital signature on an RDF assertion. (It is assumed
that the signature is computed over a concrete XML representation of the
assertion rather than over an internal representation. The figure below
shows a box containing a small graph. This is a convention to indicate
that the XML content whose ID is foo is a concrete representation of the
graph it contains.) What is to be specified in the model is expressed by
the pair of graphs below - that there is an XML encoding of some assertion,
and that there is some other XML content that is a digital signature over
that encoding.
+---------------------------------------------------------------+
| ID=foo |
| |
| [http://www.bar.com/some.doc] ---DC:creator---> "John Smith" |
| |
+---------------------------------------------------------------+
[foo]------DSIG:Signature------>"AKGJOERGHJWEJ348GH4GHEIGH4ROI4"
The details could be expressed in the model below:
"AKGJOERGHJWEJ348GH4GHEIGH4ROI4"<--RDF:PropValue----+
|
[DSIG:Signature]<----RDF:PropName-----+
|
+--RDF:InstanceOf-->[RDF:Property]<--RDF:InstanceOf--+
| |
| |
[foo]<----------------RDF:PropObj-----------------[prop-001]
|
|
+---------------------------------------------+
| |
+-----------------------------+ |
| | |
RDF:PropObj RDF:PropName RDF:PropValue
| | |
V V V
[http://www.bar.com/some.doc] ---DC:creator---> "John Smith"
These models could also be expressed as:
<?xml:namespace name="http://purl.org/DublinCore/RDFschema" as="DC"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<?xml:namespace name="http://www.w3.org/schemas/DSig-schema" as="DSIG"?>
<RDF:serialization>
<RDF:assertions href="http://www.bar.com/some.doc" id="foo">
<DC:creator>John Smith</DC:creator>
</RDF:assertions>
<RDF:assertions href="#foo">
<DSIG:Signature>AKGJOERGHJWEJ348GH4GHEIGH4ROI4</DSIG:Signature>
</RDF:assertions>
</RDF:serialization>
(Note that node labels such as "RDF:Property" are shorthand
for a full URI such as "http://www.w3.org/schemas/rdf-schema#Property").
The RDF data model intrinsically only supports binary relations. However,
higher arity relations can also be represented, using just binary relations.
As an example, consider the subject of one of John Smith's recent articles
- library science. The Dewey Decimal Code for library science could be
used to categorize that article. While the numeric code is the true Dewey
value, few people can understand those codes. Therefore, the description
of the Dewey categories has been translated into several different languages.
In fact, Dewey Decimal codes are far from the only subject categorization
scheme. So, it might be desirable to define a "Subject" node
that not only specified the subject of a paper, but also indicated the
language and categorization scheme it came from. That might look like:
[http://www.webnuts.net/Jan97.html]
|
DC:subject
|
V
[subject_001]-+---DC:scheme----> "Dewey Decimal Code"
|
+---DC:lang----> "English"
|
+---RDF:PropValue----> "020 - Library Science"
which could be exchanged as:
<?xml:namespace name="http://purl.org/DublinCore/RDFschema" as="DC"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
<RDF:assertions href="http://www.webnuts.net/Jan97.html">
<DC:subject>
<RDF:resource id="subject_001">
<DC:scheme>Dewey Decimal Code</DC:scheme>
<DC:lang>English</DC:lang>
<RDF:PropValue>020 - Library Science</RDF:PropValue>
</RDF:resource>
</DC:subject>
</RDF:assertions>
</RDF:serialization>
A common use of this higher-arity capability is when dealing with units
of measure. A person's weight is not just a number like 94; it also requires
specification of the units on that number. In this case either pounds or
kilograms might be used. A relationship with an additional arc might be
used to record the fact that John Smith is a rather strapping gentleman:
+--NIST:units--> "pounds"
|
[John Smith]--NIST:weight-->[weight_001]-+
|
+--RDF:PropValue--> "200"
which can be exchanged as:
<?xml:namespace name="http://www.nist.gov/RDFschema" as="NIST"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
<RDF:assertions href="John_Smith">
<NIST:weight>
<RDF:resource id="weight_001">
<NIST:units href="#pounds"/>
<RDF:PropValue>200</RDF:PropValue>
</RDF:resource>
</NIST:weight>
</RDF:assertions>
</RDF:serialization>
assuming the node "pounds" was defined elsewhere.
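Reading such a qualified value back out of the graph takes two steps: follow the property to the intermediate node, then separate the RDF:PropValue arc from the qualifying arcs. The Python sketch below illustrates this using the (property type, resource, value) triple ordering; the function name is hypothetical.

```python
TRIPLES = [
    ("NIST:weight", "John_Smith", "weight_001"),
    ("NIST:units", "weight_001", "pounds"),
    ("RDF:PropValue", "weight_001", "200"),
]


def qualified_value(resource, prop, triples):
    """Return (value, qualifiers) for a property whose value is an
    intermediate node, or None if the resource lacks the property."""
    for p, r, v in triples:
        if p == prop and r == resource:
            node = v
            break
    else:
        return None
    details = {p: v for p, r, v in triples if r == node}
    value = details.pop("RDF:PropValue", None)
    return value, details


weight = qualified_value("John_Smith", "NIST:weight", TRIPLES)
```

Here the three-place relation (person, amount, units) is recovered from three binary arcs.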
The RDF effort is attempting to define a very general abstract metadata
architecture and associated support facilities. RDF, like MCF, illustrates
how a higher level model can be used together with XML to support specific
types of application requirements, and illustrates a number of the same
metadata modeling ideas as MCF. The RDF examples above specifically illustrate
a requirement for metalevel pointers to explicitly link tags to attribute
definitions (by an explicit pointer, not by looking up the name in a dictionary).
The more powerful facilities of XML for defining hyperlinks will improve
the ability to define very general relationships between data and metadata
that describes (and can help interpret) it. For example, the advanced XML
linking facilities defined in XLL would allow assertions to refer to parts
of referenced documents. It seems likely that RDF will also investigate
mechanisms to automatically provide access to RDF metadata at runtime (implementing
the various association modes such as along-with), similar to the
mechanisms provided by PICS for content labels. In implementing a Web object
model, these techniques will be required to gain access to the object methods
(which may be either embedded in the Web page, or located as separate resources).
Because of its generality in representing metadata, and the likelihood
that it will be the basis of future Web developments in representing metadata,
the Web object model described in Section 3 uses RDF (and its XML representation)
as part of its structural base (although RDF is currently incomplete, and
will be developed further). Additional aspects of MCF may be used as well,
depending on more detailed analysis to be performed later. Section 3 will
describe further decisions about the nature of the object model, based
on RDF as a starting point.
However, RDF and MCF themselves are not sufficient to support all requirements
of a Web object model. For example, the object model requires an API to
its state representation, and thus RDF and MCF must be integrated with
parallel work on a Document Object Model (see below), which is not currently
the case. Also, mechanisms for linking code to RDF and MCF structures must
be further developed. Finally, structured database capabilities do not
exist for these structures, and must be worked out.
2.3 Adding Behavior to Web Pages
Previous sections have noted that what is needed to progress toward
a Web object model is:
- a richer base representation than HTML, in order to better represent
"object state" (in particular, better support for semantic identification
of fields, rather than simply supporting presentation aspects of data)
- an API to this state, so that programs can readily access it (without
complex parsing)
- an enhanced ability to define relationships between this state and
specified pieces of code that can serve as object methods
Section 2.1 described work toward providing the Web with a richer base
representation (e.g., XML). The metadata and model work described in Section
2.2 described approaches for adding additional structure to this representation.
In addition, as noted in the introduction to Section 2.2, these techniques
for representing metadata and linking it to Web resources provide a conceptual
framework for linking behavior to Web resources, by treating the
code implementing that behavior as a form of metadata. Code resources are
already being stored on the Web, e.g., in program libraries supporting
reuse, and it is already possible to create links between Web documents
and such resources. However, in using code resources to create objects,
it is necessary to reflect the special semantics associated with these
links. These semantics somewhat resemble those of metadata such as content
labels, in the sense that rather than the user explicitly following the
links to retrieve the associated "metadata", some of the "metadata"
is automatically retrieved during access to the original resource, in order
to support some special processing. In the case of content labels, the
special processing involves checking the content labels against user-specified
requirements in order to determine whether to allow access to the original
resource. In the case of object methods, the special processing involves
invoking the retrieved code in order to perform some operation. This particular
approach to representing and invoking object methods will be discussed
further in Section 3.
This section describes several mechanisms developed within the Web community
for defining relationships between state and code, and for providing an
API to state (the second and third bullets above). Specifically, techniques
developed for embedding objects and scripts in Web documents represent
one way of associating behavior with the state represented by a Web document.
The W3C's Document Object Model (DOM) effort represents another way of
addressing this issue, as well as the issue of providing an API to this
state. These two issues are closely related.
A program must gain access to data in order to process it, and so an
object method must have access to the object's state. It is always possible
to pass data as a value to a program. However, the program must understand
the structure of this data in order to access it efficiently. Conventional
object models provide what is in effect a special API for object methods
to use when accessing state for this purpose. This is also necessary
in a Web object model. However, the need for such an API becomes especially
important when the state has a rich, complex structure, such as an XML
document. Without an API to this state (and its implementation), each program
would have to implement a considerable amount of code simply to parse the
structure, in order to locate the parts of the document required for specific
purposes. An API providing access to the various parts of a document, together
with an implementation of this API as part of the general representation
of this state's "data type", provides this code as a pre-existing
component, allowing the program to concentrate on application-related processing.
The DOM provides such an API. At the same time, it provides part of a general
mechanism (albeit a very unconstrained one) for linking code and state,
since it provides a straightforward mechanism for code (currently, programs
such as plug-ins or external applications) to access the state it needs.
Finally, the Web Interface Definition Language (described in Section
2.3.3) is commercial technology that represents another mechanism for providing
an API to state (as well as to Web-based services).
2.3.1 Document Object Model (DOM)
W3C's Document Object Model (DOM)
<http://www.w3.org/DOM/> effort provides a mechanism for scripts
or programs to access and manipulate parsed HTML and XML content (including
all markup and any Document Type Definitions) as a collection of objects.
Specifically, DOM defines an object-oriented API for an HTML or XML document
that a Web client can present to programs (applications or scripts) that
need to process the document. The client (at least conceptually) operates
off this collection of objects in displaying the document. Thus, by operating
on the collection of objects representing a Web page, scripts or programs
can change styles and attributes of page elements, or even replace existing
elements with new ones, resulting in an immediate change to the data displayed
to the user. As a result, DOM makes it easy to implement dynamic content
on the client, rather than forcing all such content to be implemented on
the server, and provides a basic way to integrate a document's data with
code. For example, a client might implement a JavaScript DOM interface,
so that scripts in this language could be used within the page itself to
manipulate the page. The client could also provide a DOM interface to external
applications such as plug-ins allowing them to access the document via
the client. Similarly, an editor might implement a Java DOM interface to
allow programs written in Java to interact with the editor to manipulate
the page.
DOM is a generalization of Dynamic HTML facilities defined by Microsoft
and Netscape. Functionality equivalent to the Dynamic HTML support provided
by Netscape Navigator 3.0 and Microsoft Internet Explorer 3.0 is referred
to as "DOM level 0". DOM level 1 extends these capabilities to,
for example, allow creation "from scratch" of entire Web documents
in memory by creating the appropriate objects. The DOM
Working Draft specification <http://www.w3.org/TR/WD-DOM> includes
level 1 Core specifications which apply to both HTML and XML documents,
and level 1 specializations for HTML and XML documents. The DOM object
class definitions in these specifications have their interfaces defined
using OMG IDL. Java interface specifications are also defined (see the
specifications for details).
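Creating a document "from scratch" might look like the following Python sketch. It uses Python's xml.dom.minidom, which implements the later W3C DOM Recommendation rather than the working draft described here, so the method names are an assumption relative to the draft; the element names are taken from the book example in Section 2.2.4.

```python
from xml.dom.minidom import getDOMImplementation

# Build a small document entirely in memory, with no source text to parse.
impl = getDOMImplementation()
doc = impl.createDocument(None, "Book", None)   # no namespace, no doctype
root = doc.documentElement

title = doc.createElement("title")
title.appendChild(doc.createTextNode("Men and Rubber"))
root.appendChild(title)

# Attributes are set on the Element object itself.
root.setAttribute("lang", "en")

xml_text = root.toxml()
```

The document exists only as objects until toxml() is called, which is the sense in which the DOM separates the object model from any concrete markup.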
DOM represents a document as a hierarchy of objects, called nodes,
which are derived (by parsing) from a source representation of the document
(HTML or XML). The DOM object classes represent generic components
of a document, and hence define a document object metamodel. The DOM Level
1 working draft defines a set of object classes (and their inheritance
relationships) for representing documents. The major classes are:
Node
|
+--Document
| |
| +--HTMLDocument
|
+--Element
| |
| +--HTMLElement
| |
| +--specific HTML elements
|
+--Attribute
|
+--Text
|
+--PI [Processing Instruction, an XML concept from SGML]
|
+--Comment
The Node object is the base type for all objects in the DOM.
It may have an arbitrary number (including zero) of sequentially-ordered
child nodes. It usually has a parent Node, the exception being that the
root Node in a document tree has no parent.
Element objects represent the elements in HTML and XML documents.
Elements contain, as child nodes, all of the content between the start
tag and the corresponding end tag of an element. Aside from Text
nodes, the vast majority of node types that applications will encounter
when traversing a document structure will be Element nodes. Element
objects also have a list of Attribute objects which represent
the set of attributes explicitly defined as part of the element, and those
defined in the DTD that have default values.
Text objects are used to represent any non-markup values, whether
the values are intended to represent an integer, date, or some other type
of value. For XML documents, all whitespace between markup results in Text
objects being created.
The Document object is the root node of a document object tree,
and represents the entire HTML or XML document. The HTMLDocument
subtype represents a specialization of the generic Document type
for the specific requirements of HTML documents.
Additional object classes are defined in the working draft for representing
XML Document Type Definitions, and auxiliary data structures (e.g., lists
of nodes).
Normally, a DOM-compliant implementation will make the main Document
instance available to the application through some implementation-defined
mechanism. For example, a typical implementation would give the application
a reference to a DocumentContext object. This object describes
the source of the document, as well as related information such as the
date and time the document was last changed. From the DocumentContext,
the application may access the Document object, which is the root
of the document object hierarchy. From the Document object, the
application can use the methods provided for accessing individual nodes,
selection of specific node types (such as all images), and so on. For XML
documents, the DTD is available through the documentType method
(which returns null for HTML documents and XML documents without
DTDs). Document also defines a getElementsByTagName method.
This produces an enumerator that iterates over all Element nodes
within the document whose tagName matches the input name provided.
(The DOM working draft indicates that a future version of the DOM will
provide a more generalized querying mechanism for nodes).
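As a rough illustration of this access path, the following sketch uses Python's standard xml.dom.minidom module. That module implements the later DOM Level 1 Recommendation rather than the working draft discussed here, so details differ (e.g., getElementsByTagName returns a node list rather than an enumerator, and the DTD is exposed as a doctype attribute rather than a documentType method). The sample data and the helper names_under are assumptions made for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical sample data; only "Robert Roberts" appears in the text.
SOURCE = """<mydata>
  <authors>
    <author><name>Robert Roberts</name></author>
    <author><name>Tom Thomas</name></author>
  </authors>
  <editors>
    <editor><name>Mark Marks</name></editor>
  </editors>
</mydata>"""


def names_under(document, tag):
    """Return the text of every <name> found inside elements with the given tag."""
    result = []
    for element in document.getElementsByTagName(tag):
        for name in element.getElementsByTagName("name"):
            # Each <name> element here has a single Text child holding the value.
            result.append(name.firstChild.data)
    return result


document = parseString(SOURCE)
print(document.doctype)                       # None: this document has no DTD
print(len(document.getElementsByTagName("name")))
print(names_under(document, "author"))
```

The search is by tag name only; anything more selective (e.g., authors with a particular name) has to be coded by the application, which is the gap a generalized query mechanism would fill.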
As an example generally illustrating how an XML document might be presented
to an application in the DOM, consider the example described in Section
2.1.4 of a simple relational database represented in XML. The DOM for XML
would present the XML document to an application as a collection (actually,
a tree) of objects. Most of these objects would be of type Node,
and specifically of its subtypes Element (representing the individual
elements) and Text (representing the content). More precisely:
<!doctype mydata "http://www.w3.org/mydata">
<mydata>
...
</mydata>
(the outer markup) would be presented as an object of type Document
(a subtype of Node). The children of this node would be objects
representing the Table elements (and, indirectly, their contained rows
and fields). Type Node provides a method getChildren()
to access the children. The table delimited by
<authors>
...
</authors>
would be presented as an object of type Element (another subtype
of Node) representing the Authors table. Type Element
provides a method getTagName() to provide access to the actual
tag name (authors in this case). The children of this node would
be objects representing Row elements of type Author (and, indirectly, the
contained fields). Similarly,
<editors>
...
</editors>
would be presented as another object of type Element representing
the Editors table.
Each element delimited by
<author>
...
</author>
would be presented as an object of type Element representing
a particular Author row. The children of this node would be objects representing
the fields contained in the row. Elements delimited by
<editor>
...
</editor>
would similarly be presented as objects of type Element representing
Editor rows.
Fields would similarly be presented as Element objects. For
example, each element delimited by
<name>
...
</name>
would be presented as an object of type Element representing
that particular field. Each of these elements would have a child node of
type Text (Text is not a subtype of Element)
representing the text value of the field (e.g., "Robert Roberts").
The data() method of the Text object type returns the
actual string representation. In this case, this would end the nesting.
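The nesting just described can be made visible with a short traversal. The sketch below again uses Python's xml.dom.minidom as a stand-in for a DOM implementation; it follows the later DOM Level 1 Recommendation, so the working draft's getChildren(), getTagName(), and data() appear as the childNodes, tagName, and data attributes. The one-row document and the function outline are assumptions made for the example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# A one-row version of the example, written without whitespace between
# tags so that no whitespace-only Text nodes are created.
SOURCE = ("<mydata><authors><author><name>Robert Roberts</name>"
          "</author></authors></mydata>")


def outline(node, depth=0):
    """Yield (depth, kind, label) for a node and all of its descendants."""
    if node.nodeType == Node.DOCUMENT_NODE:
        yield depth, "Document", ""
    elif node.nodeType == Node.ELEMENT_NODE:
        yield depth, "Element", node.tagName
    elif node.nodeType == Node.TEXT_NODE:
        yield depth, "Text", node.data
    for child in node.childNodes:
        yield from outline(child, depth + 1)


document = parseString(SOURCE)
for depth, kind, label in outline(document):
    print("  " * depth + kind, label)
```

The Text node is the only leaf, and every level above it is an Element except the Document at the root, matching the description above.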
The representation of a Web page in terms of objects makes it easy to
associate code with the various subcomponents of the page. The DOM requirements
also identify the need for an event model, to provide a way to schedule
the execution of the code associated with particular parts of a Web page
at appropriate times. This event model (not yet specified) would extend
the current event capabilities provided by most Web clients. The requirements
specify that:
- all elements will be capable of generating events
- there will be interaction events, update events, and change events
- the event model will allow responses to user interactions
- events will bubble through the structural hierarchy of the document
- events are synchronous
- events will be defined in a platform independent and language neutral
way
- there will be an interface for binding to events
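Since the event model had not yet been specified, the following is only a toy sketch of what synchronous events bubbling through the document hierarchy might look like. The class EventNode and its bind and fire methods are hypothetical and are not taken from any DOM draft.

```python
class EventNode:
    """A hypothetical document node to which event handlers can be bound."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.handlers = {}          # event type -> list of callables

    def bind(self, event_type, handler):
        """Register a handler for one type of event on this node."""
        self.handlers.setdefault(event_type, []).append(handler)

    def fire(self, event_type):
        """Deliver the event to this node, then to each ancestor in turn.

        Handlers run synchronously: fire() returns only after all of
        them have completed.
        """
        node = self
        while node is not None:
            for handler in node.handlers.get(event_type, []):
                handler(self, node)   # (original target, node now handling)
            node = node.parent


log = []
document = EventNode("document")
table = EventNode("authors", parent=document)
row = EventNode("author", parent=table)


def record(target, current):
    log.append((target.name, current.name))


document.bind("change", record)
row.bind("change", record)

row.fire("change")
print(log)
```

The handler bound to the row runs first, and the same event then reaches the handler bound to the document, even though the intermediate table node has no handler of its own.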
As noted at the beginning of Section 2.3, the development of the DOM
recognizes the fact that, in enhancing the data structuring capabilities
of the Web, more is needed than just more complex representations. There
also must be built-in (and widely-available) capabilities for processing
these representations. The DOM interface (and its implementation by clients
and other tools) provides a general means for applications to access and
traverse these representations without having themselves to perform complex
parsing. The more complex the representation can become, the more important
this capability becomes (and, hence, it is particularly important if XML
is the representation). DOM's support for dynamic documents (documents
mutable on the client) also causes these documents to more closely resemble
the state of general objects. The integration of DOM and XML will provide
a powerful basis for enriched Web applications.
The DOM remains under development, and further work is required to integrate
it both with other Web technology developments, and with capabilities required
to provide full Web object model support. For example, SGML's DSSSL (described
briefly in the XML section) defines a very general object model for SGML
documents, called groves, which resembles the DOM to some extent.
Groves are intended to provide a runtime object model for use while processing
SGML documents. However, it is not clear to what extent DOM and grove capabilities
will be integrated. Groves are extremely general (e.g., using groves it
is possible to define each character in a document as a separate element),
and it is not clear that the same level of generality is required for DOM.
Moreover, groves define an object model for static documents. DOM,
on the other hand, is designed to deal with dynamic documents, which
can be modified by processing applications (via the DOM interface) at runtime.
However, the XML stylesheet proposals are based to some extent on DSSSL
(and hence presumably on the use of some aspects of groves). Another interesting
aspect of this integration is that DSSSL defines a query language called
SDQL for accessing parts of SGML documents for use in stylesheet processing.
The provision of a query language (or aspects of one) for XML would provide
an important base for the development of full-fledged database-like processing
capabilities for Web documents represented in XML. This issue is being
explored further in a companion OBJS technical report in progress.
The DOM defines its API at a generic level, i.e., at the level of components
of a document metamodel. Additional work would be required to define "application
level" object interfaces. For example, in the relational database
example defined above, DOM provides objects of types node, element,
and so on, rather than objects of type author or editor
(or even objects of type table or row). Using DOM, an
application could effectively create such types from the information given,
but it would have to "know what to look for", and would have
to traverse the various element objects to find that information.
It would be desirable to have a capability for creating DOM-like, but application-oriented,
APIs. This could involve using additional metadata (e.g., the DTD, or an
XML-Data-like schema) to generate a default API automatically (which the
document's author could then customize). It might then be possible to attach
specific methods to this API to define application-specific object behavior.
An integration of DOM and the embedded OBJECT elements described below
would be one way to support this. This would effectively permit the creation
of objects in the classic object-oriented programming sense.
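A minimal sketch of such an application-oriented layer is shown below, using Python's xml.dom.minidom as the underlying DOM. The Row wrapper and the rows function are hypothetical; they derive the "API" from the tag names found in the document rather than from a DTD or schema, and the address field is invented for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical sample data in the style of the relational example.
SOURCE = ("<authors><author><name>Robert Roberts</name>"
          "<address>10 Tenth St</address></author></authors>")


class Row:
    """Wraps an Element so that its field children read like attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, field):
        # Called only for names not found by normal lookup, i.e. field names.
        for child in self._element.childNodes:
            if child.nodeType == child.ELEMENT_NODE and child.tagName == field:
                return "".join(text.data for text in child.childNodes
                               if text.nodeType == text.TEXT_NODE)
        raise AttributeError(field)


def rows(document, row_tag):
    """Return one Row object per element with the given tag name."""
    return [Row(element) for element in document.getElementsByTagName(row_tag)]


authors = rows(parseString(SOURCE), "author")
print(authors[0].name, "/", authors[0].address)
```

The application now works with author.name instead of traversing Element and Text nodes itself, but it still has to "know what to look for": nothing here checks that a field is actually declared for authors, which is what schema-driven generation would add.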
The DOM work also needs to be integrated with the work on higher-level
models described in Section 2.2. One effect of this would be to provide
a way to add object behavior to documents without the need for references
to the associated programs to be embedded in the page, as with OBJECT elements.
These models might also provide additional support for generating application-specific
object APIs.
2.3.2 Embedded Objects
Web clients generally contain mechanisms for rendering common data types
such as text, GIF images, colors, fonts, and some graphic elements. To
render data types that do not have built-in support, clients generally
run external applications (plug-ins or helpers). In addition, Web clients
currently support mechanisms for including specialized types of "objects"
in the rendering process that are not physically located in the document,
e.g.:
- the <IMG> tag is used to specify a reference to an image located
in a separate file that is to be included as part of the rendering of the
page
- the <APPLET> tag is used to specify a reference to a Java (or
other) applet that is to be executed as part of the rendering of the page
The recently-adopted HTML
4.0 Specification <http://www.w3.org/TR/REC-html40/> defines
an OBJECT element (and an associated <OBJECT> tag) which subsumes
these specialized tags (the <OBJECT> tag is already supported in
some Web clients). In general, its purpose is to define an inserted
rendering mechanism, in order to allow authors to control whether included
objects are handled by Web clients internally or externally.
In the most general case, an inserted rendering mechanism specifies
three types of information (although in specific cases not all this information
may need to be explicitly specified):
- the rendering mechanism's implementation
- the data to be rendered
- additional values required by the rendering mechanisms at run-time
(Not surprisingly, this is a variant of the information needed for an
object invocation in an object-oriented programming language).
In HTML 4.0, the OBJECT element specifies the location of a rendering
mechanism and the location of data required by the rendering mechanism.
This information is specified by the attributes of the OBJECT element.
The PARAM element specifies a set of run-time values.
A client interprets an OBJECT element by first trying to render the
mechanism specified by the element's attributes. If this cannot be done
for some reason (e.g., the client is configured not to, or the client platform
cannot support that mechanism), the client must try to render the element's
contents. This provides a way to specify alternate object renderings, since
the contents of an OBJECT element can be another OBJECT element specifying
an alternative mechanism. The contents of the most deeply embedded element
should be text. Data to be rendered can be supplied either inline, or from
an external resource. An HTML document can be included in another document
by using an OBJECT element with the data attribute specifying the file
to be included.
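The fallback rule can be sketched as a simple loop over nested elements. In the sketch below, ObjectElement is a deliberately simplified stand-in for an OBJECT element (it keeps only the type and data attributes), and choose_rendering, the file names, and the media types are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ObjectElement:
    """Simplified stand-in for an OBJECT element and its contents."""
    type: str                               # Internet Media Type of the data
    data: str                               # URL of the data to be rendered
    content: Union["ObjectElement", str]    # nested alternative, or final text


def choose_rendering(element, supported_types):
    """Return what a client supporting the given media types would render."""
    while isinstance(element, ObjectElement):
        if element.type in supported_types:
            return ("object", element.data)
        # This mechanism is unavailable: fall back to the element's contents.
        element = element.content
    # The most deeply embedded content is plain text.
    return ("text", element)


earth = ObjectElement(
    type="video/mpeg", data="earth.mpeg",
    content=ObjectElement(
        type="image/gif", data="earth.gif",
        content="The Earth as seen from space."))

print(choose_rendering(earth, {"video/mpeg", "image/gif"}))
print(choose_rendering(earth, {"image/gif"}))
print(choose_rendering(earth, set()))
```

A client that supports both types renders the outermost alternative, a client that supports only GIF images renders the nested one, and a client that supports neither renders the text.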
The following simple Java applet:
<APPLET code="AudioItem" width="15" height="15">
<PARAM name="snd" value="Hello.au|Welcome.au">
Java applet that plays a welcoming sound.
</APPLET>
may be rewritten as follows using OBJECT:
<OBJECT codetype="application/octet-stream"
classid="java:AudioItem"
width="15" height="15">
<PARAM name="snd" value="Hello.au|Welcome.au">
Java applet that plays a welcoming sound.
</OBJECT>
The OBJECT element includes, among others, the following attributes:
- codebase: the path used to resolve relative URLs specified
by classid, specified as a URL
- classid: the location of a rendering mechanism, specified
as a URL
- codetype: the Internet Media Type of data expected by the
rendering mechanism specified by classid
- data: the location of the data to be rendered, specified as
a URL
- type: the Internet Media Type for the data specified by data
The HTML OBJECT element is an example of a capability for Web
clients to automatically invoke behavior associated with a document when
a reference to that behavior is encountered. The approach to a Web object model described
in Section 3 must both generalize this capability, and integrate it with
the XML, RDF, and DOM technologies described earlier. In particular, the
OBJECT element only deals with references to external code that have been
embedded in the document (i.e., the relationship between the code and the
document is represented physically in the document). A generalization of
this capability (and an integration of it with PICS/RDF metadata access
concepts) would allow relationships between code and documents to be specified
separately from the code and documents that are interrelated (just as PICS
content ratings may be specified separately from the content they rate),
and accessed automatically during the processing of the document. This
would permit a more flexible integration of data and code to form Web objects.
The OBJECT element is also the basis of current capabilities that link
Web pages into CORBA distributed object architectures. This is done by
using Java applets (referenced from OBJECT elements on Web pages) which
define CORBA objects, and can interact with other CORBA objects (not necessarily
written in Java) via CORBA's Internet Inter-ORB Protocol (IIOP), using
an ORB contained in the Web client (Netscape Communicator supports such
an ORB). This is an important capability in merging Web and object technologies,
particularly the object service capabilities provided by CORBA architectures.
Combining this capability with the facilities of our Web object model would
provide a deeper integration of Web and object technology, and an improved
ability to apply object services to Web resources. This is discussed further
in Section 3.
2.3.3 Web Interface Definition Language
The Web Interface Definition
Language (WIDL) <http://www.w3.org/TR/NOTE-widl> is commercial
technology from webMethods, Inc.
(information on WIDL is made available at W3C's Web site as a service by
W3C, but WIDL is not W3C technology; WIDL is also described in [KR97]).
WIDL is an application of XML which allows interactions with Web servers
to be defined as functional interfaces. These interfaces can be accessed
by remote systems using standard Web protocols, and provide the structure
necessary for generating client code in languages such as Java, C/C++,
COBOL, and Visual Basic.
A central feature of WIDL is that programmatic interfaces can be defined
and managed for Web resources such as:
- Static documents (HTML, XML, and plain text files)
- Dynamically generated documents (HTML, XML, and plain text files)
- HTML forms
- URL directory structures
These resources need not be under the direct control of programs that require
such access. WIDL definitions can be co-located with client programs, centrally
managed in a client/server architecture, or referenced directly from HTML/XML
documents.
WIDL definitions provide a mapping between such Web resources and applications
written in conventional programming languages such as C/C++, COBOL, Visual
Basic, Java, JavaScript, etc., enabling automatic and structured Web access
by compatible client programs, including mainstream business applications,
desktop applications, applets, Web agents, and server-side Web programs
(CGI, etc.). Using WIDL, programs can request Web data and services by
making local calls to functions which encapsulate standard Web access protocols
and utilize WIDL definitions to provide naming services, change management,
error handling, condition processing and intelligent data binding. A browser
is not required to drive Web applications. WIDL requires only that target
systems be Web-enabled (there are numerous commercial products which allow
existing systems to be Web-enabled).
A service defined by WIDL is equivalent to a function call in standard
programming languages. At the highest level, WIDL files describe the locations
(URLs) of services, input parameters to be submitted (via Get or Post methods)
to each service, conditions for successful processing, and output parameters
to be returned by each service. In much the same way that DCE or CORBA
IDL is used to generate code fragments, or 'stubs', to be included in application
development projects, WIDL provides the structure necessary for generating
client code in languages such as C/C++, Java, COBOL, and Visual Basic.
Many of the features of WIDL require a capability to reliably identify
and extract specific data elements from Web documents. Various mechanisms
for accessing elements of HTML and/or XML documents have been defined,
such as the JavaScript Page Object Model, the Document Object Model, and
XML-Link. The following capabilities are desirable for accessing elements
of Web documents:
- HTML Parsing
- XML Parsing
- Text Pattern Matching
Object referencing mechanisms would ideally support both parsing and
pattern matching. Pattern matching extracts data based on regular expressions,
and is well suited to raw text files and poorly constructed HTML documents.
Parsing, on the other hand, recovers document structure and exposes relationships
between document objects, enabling elements of a document to be accessed
with an object model. WIDL does not define or determine a mechanism for
accessing document data, but rather allows an object model referencing
mechanism to be specified on a per-interface basis.
The following example (from the cited reference) illustrates the use
of WIDL to define a package tracking service for generic Shipping. By allowing
a WIDL definition to reference a 'Template' WIDL definition, a general
class of shipping services can be defined. 'FoobarShipping' is one implementation
of the 'Shipping' interface.
<WIDL NAME="genericShipping" TEMPLATE="Shipping"
BASEURL="http://www.shipping.com" VERSION="2.0">
<SERVICE NAME="TrackPackage" METHOD="Get"
URL="/cgi-bin/track_package"
INPUT="TrackInput" OUTPUT="TrackOutput" />
<BINDING NAME="TrackInput" TYPE="INPUT">
<VARIABLE NAME="TrackingNum" TYPE="String" FORMNAME="trk_num" />
<VARIABLE NAME="DestCountry" TYPE="String" FORMNAME="dest_cntry" />
<VARIABLE NAME="ShipDate" TYPE="String" FORMNAME="ship_date" />
</BINDING>
<BINDING NAME="TrackOutput" TYPE="OUTPUT">
<CONDITION TYPE="Failure" REFERENCE="doc.title[0].text"
MATCH="Warning Form" REASONREF="doc.p[0].text" />
<CONDITION TYPE="Success" REFERENCE="doc.title[0].text"
MATCH="Foobar Airbill:*" REASONREF="doc.p[1].value" />
<VARIABLE NAME="disposition" TYPE="String" REFERENCE="doc.h[3].value" />
<VARIABLE NAME="deliveredOn" TYPE="String" REFERENCE="doc.h[5].value" />
<VARIABLE NAME="deliveredTo" TYPE="String" REFERENCE="doc.h[7].value" />
</BINDING>
</WIDL>
In this example, the values defined in the 'TrackInput' binding get
passed via HTTP Get as name-value pairs to a service residing at 'http://www.shipping.com/cgi-bin/track_package'.
Object References are used in the 'TrackOutput' binding to a) check for
successful completion of the service, and b) extract data elements from
the document returned by the HTTP request.
'Input' and 'Output' bindings specify the input and output variables
of a particular service. Input bindings define the name-value pairs to
be passed via Get or Post methods to a Web-based application. Output bindings
use object references to identify and extract data elements from documents
returned by HTTP requests.
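To make the role of an input binding concrete, the following sketch reads the service and input binding from the example above and builds the URL that an HTTP Get request would use. It is not how webMethods' tools work internally; build_request_url is a hypothetical helper, the argument values are invented, and the standard Python modules xml.etree.ElementTree and urllib.parse do the work.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urljoin

# The service and input binding from the example above (output binding omitted).
WIDL = """<WIDL NAME="genericShipping" TEMPLATE="Shipping"
      BASEURL="http://www.shipping.com" VERSION="2.0">
  <SERVICE NAME="TrackPackage" METHOD="Get"
           URL="/cgi-bin/track_package"
           INPUT="TrackInput" OUTPUT="TrackOutput" />
  <BINDING NAME="TrackInput" TYPE="INPUT">
    <VARIABLE NAME="TrackingNum" TYPE="String" FORMNAME="trk_num" />
    <VARIABLE NAME="DestCountry" TYPE="String" FORMNAME="dest_cntry" />
    <VARIABLE NAME="ShipDate" TYPE="String" FORMNAME="ship_date" />
  </BINDING>
</WIDL>"""


def build_request_url(widl_text, service_name, **arguments):
    """Map named arguments to form fields and return the full Get URL."""
    root = ET.fromstring(widl_text)
    service = root.find(f"SERVICE[@NAME='{service_name}']")
    binding = root.find(f"BINDING[@NAME='{service.get('INPUT')}']")
    # Each VARIABLE maps a program-level name to the form field name.
    pairs = [(variable.get("FORMNAME"), arguments[variable.get("NAME")])
             for variable in binding.findall("VARIABLE")]
    return urljoin(root.get("BASEURL"), service.get("URL")) + "?" + urlencode(pairs)


url = build_request_url(WIDL, "TrackPackage",
                        TrackingNum="12345", DestCountry="US",
                        ShipDate="1998-02-10")
print(url)
```

The caller deals only in the names TrackingNum, DestCountry, and ShipDate; the form field names and the service location stay in the WIDL definition, so they can change without changing the calling program.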
Conditions define 'success' and 'failure' states for output bindings,
and determine whether a binding attempt should be retried in the case of
a 'server busy' error. Conditions can apply to a binding as a whole, or
to a specific object reference. Conditions can define error messages to
be returned as the value of the service; error messages can be a literal,
or can be extracted from the returned document.
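The sketch below shows one way an output binding's conditions might be evaluated against a returned HTML document. It rests on several assumptions: only object references of the form doc.tag[n].text are handled, the MATCH pattern is treated as a shell-style wildcard, and the names TextByTag, resolve, and outcome, as well as the two sample pages, are invented for the example. WIDL itself leaves the object referencing mechanism open.

```python
import re
from fnmatch import fnmatchcase
from html.parser import HTMLParser


class TextByTag(HTMLParser):
    """Collects the text of each element, grouped by tag in document order."""

    def __init__(self):
        super().__init__()
        self.texts = {}             # tag -> list of strings, one per occurrence
        self.open_elements = []     # stack of (tag, index) currently open

    def handle_starttag(self, tag, attrs):
        occurrences = self.texts.setdefault(tag, [])
        occurrences.append("")
        self.open_elements.append((tag, len(occurrences) - 1))

    def handle_endtag(self, tag):
        # Close the innermost open element with this tag (and anything inside it).
        for i in range(len(self.open_elements) - 1, -1, -1):
            if self.open_elements[i][0] == tag:
                del self.open_elements[i:]
                break

    def handle_data(self, data):
        for tag, index in self.open_elements:
            self.texts[tag][index] += data


def resolve(reference, html):
    """Resolve a reference such as doc.title[0].text against an HTML page."""
    match = re.fullmatch(r"doc\.(\w+)\[(\d+)\]\.text", reference)
    if match is None:
        raise ValueError(f"unsupported object reference: {reference}")
    parser = TextByTag()
    parser.feed(html)
    parser.close()
    return parser.texts[match.group(1)][int(match.group(2))]


def outcome(html, conditions):
    """Return the type of the first condition whose pattern matches, else None."""
    for condition_type, reference, pattern in conditions:
        if fnmatchcase(resolve(reference, html), pattern):
            return condition_type
    return None


CONDITIONS = [
    ("Failure", "doc.title[0].text", "Warning Form"),
    ("Success", "doc.title[0].text", "Foobar Airbill:*"),
]
DELIVERED = ("<html><head><title>Foobar Airbill: 12345</title></head>"
             "<body><p>Delivered</p></body></html>")
WARNING = ("<html><head><title>Warning Form</title></head>"
           "<body><p>Unknown tracking number</p></body></html>")

print(outcome(DELIVERED, CONDITIONS))
print(outcome(WARNING, CONDITIONS), "-", resolve("doc.p[0].text", WARNING))
```

In the failure case the same referencing mechanism extracts the error message from the returned page, corresponding to the REASONREF attribute in the example.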
WIDL is another example of technology that provides an API (an object
interface) to state. In addition, it supports the definition of similar
interfaces to Web-based services. Facilities for defining such interfaces
are helpful tools in integrating Web-based state and behavior.
2.4 Related OMG Technologies
Section 1 briefly described OMG's activities in developing an infrastructure
for distributed object computing. Section 1 also noted the resemblance
of the Web to a simple distributed object system. Given that commonality,
practically any of OMG's work could be considered "relevant"
to the creation of a Web Object Model. Information on the wide range of
OMG's activities is available at the OMG
Web site <http://www.omg.org/>. This activity includes both platform-related
work on infrastructure components, and work related to specific vertical
industry application domains. While much of this OMG activity is proceeding
independently of Internet-related activities, one OMG activity which is
directly addressing the integration of Internet and distributed object
technology is OMG's Internet
Special Interest Group <http://www.objs.com/isig/home.htm>.
While a complete description of OMG activities is outside the scope
of this report, several OMG technologies address structured data representation
capabilities similar to others described in Section 2, and hence are of
direct interest here. Specifically, the OMG has been considering a Tagged
Data Facility, and a Mediated Exchange Facility based on it, as part of
its Common Facilities Architecture. The Tagged Data Facility involves the
use of tagged data items to support semantics-based information exchange
between applications, and also supports nesting and the ability to locate
objects via tags through layers of nesting. The Mediated Exchange Facility
is built on the Tagged Data Facility by adding mediator components and
related services. Several submissions to OMG's Business Object Facility
RFP describe such capabilities. In addition, the already-approved OMG Property
Service provides similar capabilities. These OMG technologies are of interest
in showing that there is a recognized need for tagged "data"
representations to pass semantically-rich data structures between clients
and servers within OMG's distributed object architecture, just as the representations
described in Section 2.1 illustrated the need to do the same thing in the
Web. However, there is not yet any coordination between these two communities
in developing these facilities.
2.4.1 OMG Property Service
The OMG Property Service defines PropertySet objects that act as containers
for sets of properties (name/value pairs). Each property has a different
name. All property values are defined (and represented) as type any.
PropertySet objects provide operations for finding the value of a property
given its name, adding and deleting properties, modifying the value of
an existing property, and determining whether the object has a property
with a given name. PropertySet objects are intended to be a dynamic equivalent
of CORBA attributes. When an application finds it necessary to add an attribute
to an object, and cannot do so by using the IDL interface of the object
(either using an existing attribute, or modifying the interface to add
a new one), it can create a PropertySet object with the necessary attribute(s)
and associate it with the object. A given object may have zero or more
PropertySet objects associated with it. The Property Service does not define
how this association is established. It could be done, for example:
- by having attributes of type PropertySet in the object interface
- by having the object interface inherit from PropertySet
- by using the Relationship Service to define associations between the
object and PropertySet objects
PropertySet objects do not have "schemas" as such; that is,
there is no declaration that restricts a PropertySet to only contain properties
with specific names. Nor is there a declaration that specifies that a property
with a given name must only have values of a specific type. As a result,
in the general case a property with any name/value combination can be contained
in a given PropertySet (and there is no guarantee that a given name won't
be used inconsistently by multiple applications in different PropertySets
those applications might define). However, such constraints can be (at least
partially) defined operationally through the PropertySetFactory object
used to create PropertySet objects (by implementing the appropriate PropertySetFactories
to enforce the required constraints).
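The behavior described above can be approximated in a few lines of Python. This is an analogue, not the service's IDL: the method names echo the operations described (define, get, delete, test for existence), but the signatures, the exceptions, and the single-function factory are assumptions made for the sketch.

```python
class PropertySet:
    """Container for uniquely named properties whose values may be of any type."""

    def __init__(self, allowed_types=None):
        self._values = {}
        # Optional constraint: property name -> required Python type.
        self._allowed_types = allowed_types

    def define_property(self, name, value):
        """Add a property, or modify the value of an existing one."""
        if self._allowed_types is not None:
            if name not in self._allowed_types:
                raise KeyError(f"property {name!r} is not allowed here")
            if not isinstance(value, self._allowed_types[name]):
                raise TypeError(f"property {name!r} has the wrong type")
        self._values[name] = value

    def get_property_value(self, name):
        return self._values[name]

    def delete_property(self, name):
        del self._values[name]

    def is_property_defined(self, name):
        return name in self._values


def create_constrained_propertyset(allowed_types):
    """Factory that enforces a simple 'schema' operationally."""
    return PropertySet(allowed_types=dict(allowed_types))


# Unconstrained: nothing prevents inconsistent use of the same name.
free = PropertySet()
free.define_property("rating", "high")
free.define_property("rating", 3)

# Constrained through the factory: only an integer "rating" is accepted.
rated = create_constrained_propertyset({"rating": int})
rated.define_property("rating", 3)
print(free.get_property_value("rating"), rated.is_property_defined("rating"))
```

The constraint lives entirely in the factory-created object, not in any declaration, which is the sense in which such constraints are defined "operationally".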
The OMG Property Service essentially provides a simple, dynamic, object-oriented
interface to relatively unstructured property/value pairs. Object models
(including OMG's) are generally static, in that they require an object
class to have a fixed number of attributes and methods. The OMG Property
Service addresses this restriction, and thus adds value to the object model.
It does not specify an actual representation (this would presumably be
specified using object externalization capabilities currently being developed
by OMG), is not as rich as XML, and does not provide higher-level
modeling capabilities such as those described in Section 2.2. However,
in some respects it resembles a very simple DOM, in that it does provide
an object interface to an (unspecified) representation.
2.4.2 Tagged Data Facility
The OMG has been considering release of an RFP (Request for Proposal)
for a Tagged Data Facility (TDF). The TDF is intended to provide a facility
for defining semantically-tagged objects that can be passed as parameters
between ordinary CORBA objects. In particular, the TDF is intended to:
- support tagged (named) data values of all types
- support nesting (of data objects within other data objects), without
requiring a preplanned sequence or order
- allow the value of a given tag to evolve from being a single value
to being nested tagged data
- support methods for locating, creating, updating, deleting, etc., contained
data objects (in particular, by name)
- support a capability to apply name spacing and synonyms to the tag
within a data object
- support automatic type-conversions of data values on retrieval
A tagged data object is intended to be an object; unlike a PropertySet
object, its interface is not intended to be part of another object. Moreover,
TDF objects are not intended to be "network-visible" objects.
They are intended to be passed by value when used as information exchange
between CORBA objects.
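Because the TDF was still only a prospective RFP, there is no interface to show; the following is a purely hypothetical sketch of an object meeting the requirements listed above (nesting, location by tag through layers of nesting, synonyms, conversion on retrieval, and passing by value). The class TaggedData and all of its methods are invented for the example.

```python
import copy


class TaggedData:
    """Hypothetical tagged data object: tag -> value or nested TaggedData."""

    def __init__(self, synonyms=None):
        self._items = {}
        self._synonyms = dict(synonyms or {})   # alternative tag -> canonical tag

    def set(self, tag, value):
        """Create or update the value stored under a tag (or its synonym)."""
        self._items[self._synonyms.get(tag, tag)] = value

    def locate(self, tag):
        """Find the first value with this tag, searching nested objects too."""
        tag = self._synonyms.get(tag, tag)
        if tag in self._items:
            return self._items[tag]
        for value in self._items.values():
            if isinstance(value, TaggedData):
                try:
                    return value.locate(tag)
                except KeyError:
                    continue
        raise KeyError(tag)

    def get_as(self, tag, target_type):
        """Retrieve a value, converting it to the requested type."""
        return target_type(self.locate(tag))

    def by_value(self):
        """Return an independent copy, as when the object is passed by value."""
        return copy.deepcopy(self)


order = TaggedData(synonyms={"qty": "quantity"})
order.set("qty", "12")
order.set("customer", "Robert Roberts")        # a single value at first

customer = TaggedData()                        # later evolves into nested data
customer.set("name", "Robert Roberts")
customer.set("city", "Dallas")
order.set("customer", customer)

print(order.get_as("quantity", int), order.locate("city"))
```

A receiver that only asks for "city" need not know that it is now nested inside "customer", which is the point of locating data by tag rather than by position.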
The TDF requirements seem to fit the basic structural capabilities of
OEM and MCF to some extent (the draft TDF RFP explicitly references OEM),
in the sense that they seem to call for the ability to construct complex
graph structures of relatively simple labeled nodes. However, MCF in particular
goes much further than TDF in defining the basis of a rather complete object
model (which is unnecessary in TDF since TDF objects are already CORBA
objects). TDF also specifies some metadata-related requirements, such as
dealing with namespace issues and synonyms. However, like the Property
Service, TDF is not well-integrated with related Web developments. Of course,
as an RFP, the TDF leaves a great deal of detail, both of technology and
usage scenarios, to be supplied by specific technology proposals submitted
in response. As a result, it is possible that some technology integrating
OMG and Web technology, e.g., combining XML and DOM, could be adopted in
response to the TDF RFP, once it is issued.
3. Building a Web Object Model
Section 2 has described a number of the key technologies that address
issues in creating a Web object model. In this section, we describe a general
approach to integrating these technologies to support a Web object model.
Specifically, the key component technologies we propose to integrate are:
- XML. XML provides a richer representation for object state than HTML,
including:
- application-specific tagged data elements and nested structures
- more powerful linking facilities
In supporting an object model, XML pages (like HTML pages) can also
be used as containers for embedded objects and object methods (e.g., Java
applets)
- the Document Object Model (DOM). The DOM provides an API for XML documents
used as object state, and provides a mechanism for integrating object state
and associated code
- the OBJECT element from HTML 4.0 for representing and implementing
embedded object methods
- concepts from the PICS, RDF, and MCF Web data/metadata models to provide
standardized attributes, data structures, and infrastructure for representing
and implementing basic aspects of the object model, including:
- relationships between documents containing state and documents containing
metadata (including type information and code implementing object methods)
- the framework for accessing and invoking object methods contained in
separate Web resources when documents requiring those methods are accessed
- type definitions and relationships between types (e.g., inheritance),
depending on the details of the object model chosen
In addition to using these emerging Web technologies, we also take advantage
of other existing aspects of the Web, e.g.:
- Web clients already have the capability to invoke some forms of code
associated with Web pages (e.g., Java applets and plug-ins)
- Web clients will soon provide support for PICS. This establishes the
principle of intercepting requests for Web pages in order to perform intermediate
processing on the request (in the case of PICS, checking content ratings
prior to displaying the page).
- Code libraries currently exist on the Web, with metadata describing
them. Thus, it is not a major extension (at least in principle) to allow
this code to be associated with "data pages" in order to form
objects.
3.1 Integration Approach
The idea behind integrating these technologies to form a Web object
model is that an "object" in a conventional object model is basically
a piece of state with some attached (or associated) programs (methods).
In many object model implementations, this idea is exactly reflected in
the physical structure of the objects. For example, a Smalltalk object
consists of a set of state variables (data), together with a pointer (link)
to a class object which contains the object's methods. The structure is
roughly:
 Object (state)                 Class object
+---------------+              +-------------+
| class pointer |------------->| Class data  |
+---------------+              +-------------+
| variable 1    |              | method 1    |
| variable 2    |              | method 2    |
| ...           |              | ...         |
| variable n    |              | method m    |
+---------------+              +-------------+
C++ implementations use similar structures. The state is a collection
of programming language variables, which (usually) are not visible to anything
but the methods (this is referred to as encapsulation). A typical
object model has a tight coupling between the methods and state. All the
structures (class objects, internal representation of methods and state,
etc.) are determined by the programming language implementation, and are
created together as necessary. The class (in particular, the methods it
defines) defines the way the state should (and will) be interpreted within
the system, and hence is a form of metadata for the state. As a
result, the link between an object and its class is essentially a metadata
link.
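The same state/class split can be observed directly in Python, used here only as a convenient stand-in for Smalltalk or C++. The Account class is invented for the example, and, unlike the languages discussed above, Python does not hide the state from other code.

```python
class Account:
    """A class object: holds the methods shared by all of its instances."""

    def __init__(self, balance):
        self.balance = balance          # per-object state

    def deposit(self, amount):
        self.balance += amount


account = Account(100)
account.deposit(50)

print(vars(account))                    # the object's own state only
print(type(account) is Account)         # the "class pointer"
print("deposit" in vars(Account))       # methods live in the class object
```

The object itself carries nothing but its variables and a link to its class; how those variables are to be interpreted is determined by the methods reached through that link.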
Extending this idea to the Web environment, Web pages can be considered
as state, and objects can be constructed by enhancing
those pages with additional metadata that allows the pages to be considered
as objects in some object model. In particular, we want to enhance Web
pages with metadata consisting of programs that act as object methods with
respect to the "state" represented by the Web page. The resulting
structure would, at a minimum, conceptually be something like:
                       +----------+
           +---------->| method 1 |
+-------+  |           +----------+
|  Web  |--+               ...
|  page |--+
+-------+  |           +----------+
           +---------->| method n |
                       +----------+
The NCITS Object Model
Features Matrix [Man97] identifies many different object models, with
widely differing characteristics. Different object models could also be
defined for the Web. The details of the structures to be supported in a
Web object model depend on the details of the object model we choose to
define. For example, many object models are class-based, such as
the Smalltalk and C++ models mentioned above. Choosing a class-based model
for the Web would require defining separate class objects to define the
various classes. Other object models are prototype-based, and do
not require a class object (each object essentially defines itself). Either
of these forms (plus others) could be supported by the basic mechanism
we propose.
In a Web object model, some of the tight coupling that exists in programming
language object models would probably be relaxed, and the connection between
the state and code would be somewhat "looser". This would allow
more flexibility in defining associations between programs and Web pages
in the model. For example, unless special constraints prohibited such access,
a user would probably be able to directly access the state (and manipulate
it as well) using standard Web document viewing and creation tools, without
necessarily using any associated methods (just as users today can often
usefully access pages containing Java applets even when Java is inactive
or unsupported on their browsers). In these cases, encapsulation would
be relaxed and access to any methods related to the state would be optional.
Constructing these object model structures requires a number of "pieces"
of technology, as we have already observed several times. These pieces
are:
- a representation for object state; this role is played by XML pages
- an API to this state, so that programs can readily access it; this
role is played by the DOM
- pieces of code to serve as object methods; this role can be played
by
- OBJECT elements embedded in the state
- other pieces of code defined as or within Web resources separate from
the state, identified by URLs, that are designed to access the state via
its DOM-based interface, and that are associated with the state via the
relationship mechanism in the next bullet
- a way to define the relationships between the state and the methods
(the linkages in the above diagrams)
- a way to access and invoke the code as necessary when the state (Web
document) is accessed
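The way these pieces fit together can be sketched as follows. This is only an illustration under stated assumptions: Python's xml.dom.minidom stands in for the DOM, the page URL and its content are invented, the relationship mechanism is reduced to a dictionary keyed by URL, and the WebObject class is hypothetical rather than part of any of the technologies discussed.

```python
from xml.dom.minidom import parseString

# State: an XML "page" (hypothetical URL and content).
PAGE_URL = "http://example.org/orders/42.xml"
PAGE_XML = "<order><item price='3.50'/><item price='1.25'/></order>"


# Methods: ordinary functions that reach the state only through the DOM API.
def total(doc):
    """Sum the price attribute of every item element."""
    return sum(float(e.getAttribute("price"))
               for e in doc.getElementsByTagName("item"))


def item_count(doc):
    """Count the item elements in the document."""
    return len(doc.getElementsByTagName("item"))


# Relationship: metadata held apart from the page, keyed by the page's URL.
METHOD_LINKS = {PAGE_URL: {"total": total, "item_count": item_count}}


class WebObject:
    """Combines a page (state) with whatever methods are linked to its URL."""

    def __init__(self, url, xml_text, links):
        self.url = url
        self.doc = parseString(xml_text)      # DOM view of the state
        self._methods = links.get(url, {})    # methods found via the links

    def invoke(self, name, *args):
        # Invocation: look the method up and hand it the DOM document.
        return self._methods[name](self.doc, *args)


order = WebObject(PAGE_URL, PAGE_XML, METHOD_LINKS)
print(order.invoke("total"))       # 4.75
print(order.invoke("item_count"))  # 2
```

Note that the page text itself contains no reference to the methods; the association lives entirely in the separate link table.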
Code resources are already being stored on the Web, e.g., in program
libraries supporting reuse, and it is already possible to create relationships
(links) between Web documents and such resources. However, in using code
resources to create objects, it is necessary to not only define the links
between the code and its associated state, but also to reflect the special
semantics associated with these links. These semantics somewhat resemble
those of metadata such as PICS content labels, in the sense that instead
of the user explicitly following the links to retrieve the associated "metadata",
some of the "metadata" is automatically retrieved during access
to the original resource, in order to support some special processing.
This processing involves a form of what is variously called a metalevel,
reflective, or intermediary architecture, in the sense that
the processing requires that ordinary requests for data on the Web be interrupted
or intercepted, so that the necessary special processing can be performed.
In the case of content labels, the special processing involves checking
the content labels against user-specified requirements in order to determine
whether to allow access to the original resource. In the case of object
methods, the special processing involves accessing the code, and invoking
that code, in order to perform some operation.
In the approach we propose, relationships between the state and the
methods will be defined in either of two ways:
- OBJECT elements referring to the programs can be embedded in the Web
pages. This requires that the pages be created with (or modified to contain)
the necessary OBJECT elements. However, this capability is available now,
and hence requires no enhancements.
- The programs can be identified in metadata defined as either embedded
or separate RDF resources. For example, an RDF resource associated with
a given Web page might contain OBJECT elements that identify the programs
that act as the page's methods. Alternatively, the RDF resource might refer
to programs defined as Web resources using some mechanism other than the
OBJECT element, and also include a reference (possibly as an OBJECT element)
to a "loader" mechanism capable of accessing those programs and
providing them to the client on request. RDF resources contain explicit
references to the Web pages for which they define metadata. However, they
do not require that the Web pages they describe themselves be aware of
the existence of this metadata, and hence do not require that the pages
be created with (or modified to contain) references to the metadata. Thus,
using an RDF-based approach would allow Web pages to be associated with
object methods without the pages themselves having to contain references
to the methods.
In order to define relationships between Web pages and methods without
these relationships being explicitly contained in the Web pages, it is
necessary to have a way to determine the existence of these relationships
at runtime, so that the client can download those methods, and invoke them
to provide object behavior. PICS provides a mechanism for doing this. PICS
defines metadata (content labels) that need not be embedded in the page
described by that metadata. In PICS, the client specifies the sources and
types of content labels it wants to use to evaluate the Web pages it accesses.
Whenever an attempt is made to access a page, content labels from those
sources are implicitly accessed (either from the site supplying the page,
or from a separate rating service), and evaluated to determine whether
access to the page should be allowed. It seems likely that RDF will define
a similar (but possibly more general) mechanism for transparently accessing
metadata about a given page when the page is accessed, and providing that
metadata to the Web client. This mechanism would provide the basis of our
metadata access mechanism as well. (If such a mechanism is not defined
in RDF, we would define one as an extension. This would probably be relatively
straightforward, given the existence of the PICS mechanism already mentioned).
In our case, however, the metadata will contain methods that can operate
on the data in the page, and perform various functions based on that data.
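The following sketch illustrates the kind of interception described above. It does not implement PICS or RDF; in-memory dictionaries stand in for the Web and for two metadata services, and all URLs, field names, and the Client class are assumptions made for illustration.

```python
# Hypothetical stand-ins for Web pages and for two separate metadata services.
PAGES = {
    "http://example.org/a": "<doc>hello</doc>",
    "http://example.org/b": "<doc>other</doc>",
}
RATING_SERVICE = {
    "http://example.org/a": {"rating": 1},
    "http://example.org/b": {"rating": 5},
}
METHOD_SERVICE = {
    "http://example.org/a": {"methods": ["http://example.org/code/wordcount"]},
}


class Client:
    """A client that consults its chosen metadata sources on every access."""

    def __init__(self, sources, max_rating):
        self.sources = sources        # metadata sources selected by the client
        self.max_rating = max_rating  # user-specified requirement

    def access(self, url):
        # Intercept the request: gather metadata about url from every source.
        metadata = {}
        for source in self.sources:
            metadata.update(source.get(url, {}))
        # Content-label style processing: decide whether access is allowed.
        if metadata.get("rating", 0) > self.max_rating:
            return None
        # Method style processing: return the page with its method references.
        return PAGES[url], metadata.get("methods", [])


client = Client([RATING_SERVICE, METHOD_SERVICE], max_rating=2)
print(client.access("http://example.org/a"))
print(client.access("http://example.org/b"))  # None: blocked by its label
```

The pages themselves carry no reference to either service; the client's configuration alone determines which metadata is retrieved.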
The two mechanisms identified above (embedded OBJECT elements and RDF
resources associated with the page) potentially provide a way to access
the methods when the state is accessed. In addition, a mechanism is required
to invoke the code as it is needed. The OBJECT element already provides
such a mechanism which can be used in some cases (for example, this is
used to invoke Java applets embedded in pages). A more general mechanism
would be necessary for methods defined in RDF resources. There may be a way
to do this provided within a general RDF-supported metadata access mechanism
(this is currently not clear, since RDF is still under development). Alternatively,
it may be necessary to define this as an extension. Again, this would probably
be relatively straightforward.
Many details of this technology integration must still be worked out
(partially because some of the key technologies we have identified are
still under development). Nevertheless, we feel that the capabilities inherent
in these technologies provide the necessary support for the object model
integration we propose.
3.2 Discussion
A number of projects have investigated developing object capabilities
for the Web, e.g., the Harvest Object System [CHHM+94], W3Objects
[ILCS95], and ANSAWeb
[REMB+95]. A thorough review of such projects has been undertaken, and
the descriptions of W3Objects and ANSAWeb below are taken from a forthcoming
technical report "Web + Object Integration", by Gil Hansen (OBJS),
resulting from that review.
The Harvest Object System (HOS) [CHHM+94] modified the Mosaic
browser to include a Harvest Object Broker, allowing users to interact
with remote objects via a special Harvest Object Protocol (HOP). HOS defines
objects from existing files and programs by recording metadata roughly
of the form:
user-defined type name
URL --> file data
URL --> method (program)
URL --> method
URL --> method
...
URL --> method
using SOIF to hold that metadata. The HOP is used for retrieving IDL
information, moving object code and data, and invoking objects. A command
such as GETOBJS hop://URL/some.obj (where URL/some.obj
designates a file) returns the object data for some.obj along
with its metadata, including a set of methods.
ANSAWeb <http://www.ansa.co.uk/ANSA/ISF/overview.html>
provides a strategy for interoperability between the Web and CORBA using
HTTP-IIOP gateways -- the I2H gateway converts IIOP requests to HTTP, and
H2I converts HTTP requests to IIOP. The H2I gateway allows WWW clients
to access CORBA services; the I2H gateway allows CORBA clients to access
Web resources. The pair of gateways together behave like an HTTP proxy
to the client and server. A CORBA IDL mapping of HTTP represents HTTP operations
as methods and headers as parameters. An IDL compiler generates client
stubs and server skeletons for the gateways. H2I is both a gateway to IIOP
and a full HTTP proxy so a client can access resources from a server that
does not have an I2H gateway. A locator service decides when to use IIOP
or HTTP. If the locator can find an interface reference to an I2H server-side
gateway, IIOP is used; otherwise, the H2I gateway passes the request via
HTTP.
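The locator's decision can be sketched in a few lines. This is a simplification under stated assumptions: the locator service is reduced to a dictionary lookup, the function name route_request and the sample server names and reference string are invented, and no actual IIOP or HTTP traffic is modeled.

```python
def route_request(server, gateway_refs):
    """Pick the transport an H2I gateway would use to reach server.

    gateway_refs maps a server name to the interface reference of its
    I2H gateway, if one is known (a stand-in for the locator service).
    """
    ref = gateway_refs.get(server)
    if ref is not None:
        return ("IIOP", ref)   # an I2H gateway exists on the server side
    return ("HTTP", None)      # fall back to acting as a plain HTTP proxy


# Hypothetical locator contents.
known_gateways = {"objects.example.org": "IOR:hypothetical-reference"}

print(route_request("objects.example.org", known_gateways))
print(route_request("plain.example.org", known_gateways))
```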
The W3Objects <http://arjuna.ncl.ac.uk/w3objects/>
project at the University of Newcastle upon Tyne provides facilities for
transforming standard Web resources (HTML documents, GIF images, PostScript
files, audio files, and the like) from file-based resources into objects
called W3Objects, i.e., encapsulated resources possessing internal state
and well-defined behaviors. The motivating notion is that the current Web
can be viewed as an object-based system with a single class of object --
all objects are accessed via an HTTP daemon. W3Objects are responsible
for managing their own security, persistence, and concurrency control.
These common capabilities are made available to derived application classes
from system base classes. A W3Objects server supports multiple protocols
by which client objects can access server objects. When using HTTP, the
URL binds to the server object and the permitted object operations are
defined by the HTTP protocol. Alternatively, the RPC protocol can be used to pass
operation invocations to a client-stub generated from a description of
the server object interface. W3Objects uses C++ as the interface definition
language, although CORBA IDL and ILU ISL can be used. W3Objects can also
be accessed through a gateway, implemented as a plug-in module for an extensible
Web server, such as Apache <http://www.apache.org/>.
URLs beginning with /w3o/ are passed by the server to the gateway;
the remainder of the URL identifies the requested service and its parameters.
Using a Name Server, the appropriate HTTP method is invoked on the requested
service.
These projects have identified a number of important ideas in supporting
objects on the Web (in particular, objects constructed in the HOS resemble
in many respects those that would be constructed using the approach described
in Section 3.1). However, they based their attempts to develop object capabilities
for the Web on the existing Web infrastructure. As a result, they
had to use a number of non-standard Web extensions (e.g., special protocols
referenced in URLs to trigger the loading of object methods), which limit
their widespread usability. Dependence on the existing Web infrastructure
also limits the ability of the resulting objects to support more complex
Web applications. Our work, on the other hand, is based on what will likely
be the next-generation Web infrastructure. This infrastructure is
still evolving, and hence some extensions to it may yet be necessary. However,
based on our analysis, these new Web technologies seem likely to provide
a much better basis for providing powerful Web object facilities that
are, at the same time, based on standard (hence, widely accessible) Web protocols
and components.
An approach similar to that provided by ANSAWeb is becoming increasingly
popular, and is potentially very powerful. This involves placing Java applets
on Web pages (using the APPLET or OBJECT elements in HTML). Once on the
Web client, these objects then communicate with other objects on remote
servers using various protocols. A particularly important variant of this
approach is to use it to combine Java and CORBA. In this variant, Java
applets downloaded to the client communicate with other CORBA objects over
the Internet via CORBA's IIOP (Internet Inter-ORB Protocol), which is supported
by all CORBA Object Request Brokers. This approach is, for example, supported
by Netscape Communicator, which includes Visigenic's Java ORB. Using this
approach, the advantages of CORBA's object services are potentially available
to Internet objects. This also allows non-Java objects to be integrated
into the Internet, since CORBA objects can be written in many languages.
Java has also been the basis of proposals to improve Web capabilities by
representing more and more Web content directly as Java objects, using
the existing Web largely as a transport mechanism for these objects.
Such approaches provide important new mechanisms for supporting more
powerful Web capabilities, and integrating enterprise distributed object
systems (which are likely to be CORBA-based) with the Internet. However,
these approaches suffer from a number of disadvantages when used by themselves,
e.g.:
- much of the Web remains outside the scope of CORBA-based services unless
additional facilities are provided to wrap Web resources as objects
- the current easy construction of Web content is replaced by the need
to use Java objects
- there is no smooth integration of Web page content with Java
- Java programs must still grapple with the problem of processing syntactically
tagged HTML content
What we are proposing is a general way to merge objects and the
Web. Our approach subsumes these Java-based approaches, since all these
mechanisms for integrating Java (and CORBA) objects with Web pages are
still available. However, our approach goes beyond these approaches in
providing richer Web content that is more amenable to application processing
(XML pages accessible via DOM), together with a more general way to link
non-embedded methods with that Web content.
There are a number of potential ways to use the "objects"
constructed using the mechanism we are proposing. One approach would be
to use the methods associated with a document in the same way that Java
applets are used now. The difference would be that the code would not need
to be embedded in the document. (In fact, depending on the exact details
of the DOM, if the methods were separately-located OBJECT elements, they
could presumably be embedded dynamically in the document at the client
using the DOM interface, and act just the way embedded OBJECTs would act).
A more conventional "object-like" use would be to allow the associated
methods to be invoked via an enhanced DOM interface by programs acting
through the client. That is, the DOM effectively implements a generic interface
of a type something like XML-document (for XML documents). Application-specific
subtypes of this generic type could be created which included the application-specific
methods associated with the document as parts of the interfaces defined
for those subtypes. Programs acting through the client could then invoke
these methods through the new interfaces just as they invoke the methods
of other objects.
The mechanism defined here provides a form of "component-oriented"
development, in that it allows the arbitrary composition of objects from
data and code resources found on the Internet. Using this approach, a client
could have multiple "object views" of the same base data (e.g.,
access the same data resources using different classes), by simply changing
the collection of methods it uses when accessing the data (this would be
like using different annotation sets or PICS-like labels in accessing a
document).
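The idea of multiple "object views" over the same base data can be sketched as follows. The document content, the two method collections, and the object_view helper are all assumptions made for illustration; Python's xml.dom.minidom again stands in for the DOM.

```python
from functools import partial
from types import SimpleNamespace
from xml.dom.minidom import parseString

# The same base data is shared by both views (hypothetical content).
STAFF = parseString(
    "<staff>"
    "<person name='Ann' salary='100'/>"
    "<person name='Bob' salary='300'/>"
    "</staff>"
)


def names(doc):
    return [p.getAttribute("name") for p in doc.getElementsByTagName("person")]


def payroll(doc):
    return sum(int(p.getAttribute("salary"))
               for p in doc.getElementsByTagName("person"))


# Two collections of methods, i.e. two "classes" for the same state.
DIRECTORY_METHODS = {"names": names}
PAYROLL_METHODS = {"names": names, "payroll": payroll}


def object_view(doc, methods):
    """Bind each method to the document, yielding one view of the data."""
    bound = {name: partial(fn, doc) for name, fn in methods.items()}
    return SimpleNamespace(**bound)


directory = object_view(STAFF, DIRECTORY_METHODS)
accounts = object_view(STAFF, PAYROLL_METHODS)
print(directory.names())   # ['Ann', 'Bob']
print(accounts.payroll())  # 400
```

The directory view simply has no payroll operation, although it is built over exactly the same document as the accounts view.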
The approach may appear somewhat "heavyweight", in the sense
that it involves additional mechanism, and may involve delays in accessing
the code that implements object methods. However:
- if object methods are implemented as embedded OBJECT elements, there
is no change from the current way that Java applets are downloaded with
pages
- if object methods are implemented as separate resources but are co-located
with the documents they operate on (at the same server), they can potentially
be downloaded with the document (within the same response by the server)
using PICS/RDF-defined mechanisms
- using the additional methods is (at least potentially) a choice that
the user can make
- it is important to provide the basic capability for doing this,
at which point the efficiency issues can be addressed. For example, these
could be addressed by caching frequently used methods at the client, or
by other mechanisms.
In this connection, it is useful to compare the architecture that results
from using this approach to that of an Object DBMS (ODBMS). In most current
ODBMS client/server architectures, methods typically reside in class libraries
located on the client, rather than being stored as complete objects on
the server. Only object state resides on the server. When objects are needed
by the client, the state is accessed from the server, moved to the client,
and complete objects are created locally using the client-based class libraries.
In our approach, both the methods and the state (at least conceptually)
reside remotely; the client only contains references to the objects. The
Web delivers the state to the client just the way an ODBMS server does,
and delivers the methods as well.
So far, our work has focused on identifying new Web technologies to
serve as a base, analyzing their capabilities, and developing the basic
principles for integrating them. Further work needs to be done to work
out the additional details required to build a prototype implementation.
For example, we have already noted that there are many object models that
could be supported using the principles we have identified. It will be
necessary to choose a particular object model (or possibly more than one)
to use for our Web object model. This, in turn, will affect the structure
of the metadata that must be supported. For example, if a class-based model
is chosen, additional metadata will need to be defined to support the class
objects (these could be recorded as Web objects too, using RDF, possibly
together with techniques from MCF or XML-Data). Further work will be necessary
to determine an appropriate type of object model for use on the Web.
Additional work is also required to define the mechanism that invokes
the object methods once they are returned to the client. This will depend
on the details of how the RDF standard evolves. As noted at the end of
Section 3.1, the general RDF-supported metadata access mechanism may provide
a way to insert this method invocation mechanism. Alternatively, it may
be necessary to define this as an extension to the RDF mechanism.
Finally, as noted already, the DOM currently defines its API at a generic
level, i.e., at the level of components of a document metamodel. Additional
work is required to define "application level" object interfaces
which include interfaces to the methods associated with the objects. For
example, in the relational database example described in Section 2.3.1,
DOM provides objects of types node, element, and so on,
rather than objects of type author or editor (or even
objects of type table or row). Using DOM, an application
could effectively create such interfaces from the information given, but
it would have to "know what to look for", and would have to traverse
the various element objects to find that information. It would
be desirable to have a capability for creating DOM-like, but application-oriented,
APIs. This could involve using additional metadata (e.g., the DTD, or an
XML-Data-like schema) to generate a default API automatically (it might
then be possible for the document's author to customize this API or, alternatively,
define the API explicitly). It might then be possible to attach specific
methods to this API to define application-specific object behavior. An
integration of DOM and embedded OBJECT elements would be one way to support
this. This would effectively permit the creation of objects in the classic
object-oriented programming sense.
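One way such an application-oriented API might be layered over the generic DOM is sketched below. The sketch uses a made-up document rather than the example of Section 2.3.1, the ElementView class is hypothetical, and it derives its interface directly from the element names found in the document rather than from a DTD or schema.

```python
from xml.dom.minidom import parseString


class ElementView:
    """Wraps a DOM element so child elements can be reached by tag name."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, tag):
        # Called only for names not found normally, e.g. view.author.
        if tag.startswith("_"):
            raise AttributeError(tag)
        children = [c for c in self._element.childNodes
                    if c.nodeType == c.ELEMENT_NODE and c.tagName == tag]
        if not children:
            raise AttributeError(tag)
        return [ElementView(c) for c in children]

    @property
    def text(self):
        """Concatenate the text nodes directly under this element."""
        return "".join(c.data for c in self._element.childNodes
                       if c.nodeType == c.TEXT_NODE)


doc = parseString(
    "<book><title>Hamlet</title>"
    "<author>Shakespeare</author><author>Anon</author></book>"
)
book = ElementView(doc.documentElement)
print(book.title[0].text)                # Hamlet
print([a.text for a in book.author])     # ['Shakespeare', 'Anon']
```

With this wrapper the application asks for authors and titles instead of traversing nodes and elements, although it still has to know which names to ask for.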
3.3 Formal Principles
The approach to creating a Web object model described in the previous
sections provides the basis for creating genuine objects, having both state
and behavior, on the Web. This would greatly increase the structuring power
of the Web, enabling it to support increasingly complex applications. However,
as noted in Section 1, it is also important to have higher level object
services available for these objects, such as those provided for CORBA
objects in OMG's Object Management Architecture. In providing this additional
support, it is important to have a formal foundation for the object model
and its operations. For example, such a formal foundation is essential
as a basis for defining query processing and view facilities (just as the
formal foundation of the relational database model is essential for defining
query processing and view facilities for relational databases). A formal
foundation is also helpful as a basis for defining extensions to the model,
and generally understanding its capabilities.
In this section, we describe some basic ideas behind work on a formal
definition for our Web object model. The ideas are derived from work on
the foundations of Web metadata concepts, work on object-oriented logics,
and our own prior work on object model formalization. Many of these same
ideas are currently being reflected in W3C's ongoing RDF activity.
3.3.1 Logic Basis
Section 2 described a number of different representation techniques
and models for Web-related data. While these models have individual variations,
in most cases these models are basically the same model: graphs, with labeled
edges (although some models are based on a tree structure, they generally
provide graph capabilities through the use of pointers of one form or another,
usually URLs). This is essentially a model of the Web itself: Web resources,
identified by URLs, which point to each other by including the URLs of
related resources as hyperlinks. Papers describing these models often acknowledge
their similarity to each other.
Common features of these representational models are:
- the basic "objects" consist of either individual fields (attribute/value
pairs, tagged values), or simple aggregates of these fields (e.g., sets
of fields, nested fields)
- support for some form of identity (such as URLs, or specifically generated
object identifiers or identifier fields); fields can have as values either
individual identifiers, or sets of identifiers; this allows tree and graph
structures to be defined
- no encapsulation; applications accessing the objects obtain direct
access to the field values
- no behavior in the form of object methods
- loose typing (although generally a set of primitive base types is defined
for field values, e.g., integer, character, etc.)--there is often no schema;
in some models an arbitrarily-defined field can appear in any object, and
can appear an arbitrary number of times (or not at all); sometimes the
value of a field may be an actual value in one instance, and a reference
to a complex object in another (e.g., an address field may be a string
in one object, and a reference to a collection of street, city, and state
fields in another). (Languages for querying these models must thus support
pattern-matching and various forms of implicit path traversal to deal with
these types of irregularities.)
- no inheritance
There are a number of reasons for adopting this form of model to deal
with Web data:
- An approach based on attribute/value pairs, unlike a database-like
"typed record" approach, is arbitrarily extensible in a federated
environment (without a centralized collection of types or schema). Anyone
can record any attributes they feel are necessary, without going through
the "overhead" of defining a new type (and, in particular, possibly
having to define it as a subtype of an existing type), and distributing
that type definition throughout a distributed network. Also, anyone can
appropriately use those attributes, provided that they understand the intended
semantics of the attributes.
- The basic structure of the Web is built up from individual tagged items
(currently, in HTML), i.e., it identifies individual things down to what
is essentially the attribute level. As a result, an "object model"
sufficient to describe Web information must contain individual identifiable
constructs down to this level (with higher-level groupings, like pages,
being built up as aggregates of these primitive units).
- It is still possible to combine the representation flexibility of these
models with the efficiency and other benefits of typed models by using
the "less typeful" model as a base, and adding type-like structuring
as additional constraints. This is consistent with the idea of "retrofitting"
type and other information to resources that are "discovered"
dynamically. In this approach, attribute names are explicitly defined in
the base model; then, "smarter components" (e.g., knowledge-based
"mediators" in OEM) add more complex structures or semantics,
using the attribute names to identify the relevant material.
- Describing a resource by a set of attribute/value pairs is equivalent
to describing the resource by a set of assertions (essentially binary relationships)
in predicate logic. The resource is assigned an identity (say, a URL),
and the attribute name serves as the name of the predicate/relationship,
as in author(url1,"Oscar Wilde") or title(url1,
"The Importance of Being Earnest"). This basis in predicate
logic, like the logical basis of the relational data model, provides a
solid foundation for these models, guaranteeing the generality of the model,
and that familiar formal mechanisms can be employed in connection with
it.
The relationship identified in the last bullet between these representational
models and logic-based formalisms is very important, and is explicitly
called out in a number of papers introducing or analyzing these models.
The relationship is, as noted above, important in establishing a formal
framework in which to understand these models, as well as in suggesting
possible extensions. The relationship is also important in establishing
a way to add more "intelligence", through the use of knowledge-based
components such as "mediators" or "intelligent agents".
Such components, for example, will need to have a formal way of interpreting
the data they will be dealing with. The ability to understand Web representations
in terms of logic provides a basis for applying KIF-based technologies,
for example. In addition, the fact that these models can be understood
in a common way (have a common semantics expressible in terms of logic)
is important in providing a basis for defining translations/conversions
(in terms of logic-based rules) between apparently different representations.
This is similar to the use of logic-based formalisms to define translations
in federated database systems (see, e.g., [FR97]).
As an example of work within the W3C addressing the relationship of
logic and metadata, Describing
and Linking Web Resources is an early W3C note which discusses general
ideas and issues for describing and linking Web resources. It references
work such as PICS, SOIF, and MCF, and notes that, though these different
formats exhibit a range of syntactic variations, semantically they attempt
to convey similar information. The architectural model that is common to
them is the basic structure of the web: a directed graph with labeled arcs.
The nodes (or points, or vertices) of the graph are URLs--anchor or resource
addresses. The arcs are links. The labels are link relationships. Associated
with each node is a set of attributes, or slots, or fields. Each attribute
has a name and a value. Values are defined in a media-type specific manner.
The note also identifies the relationship of these attribute/value-based
schemes to basic concepts in propositional logic. This allows the identification
of the basic principles of the model independently of particular representations.
R(S, T) can be used to denote a link from S to T with relationship R. The
same notation can be used for attributes, writing N(S, V) for an attribute
named N on an anchor at S with value V. For example, both the SOIF description
@FILE {"http://www.shoes.com"
Author{4}: Fred
Supersedes{30}: http://www.provider.com/shoes }
and the HTML
<about href="http://www.shoes.com">
<meta name=author content="Fred">
<link rel=Supersedes href="http://www.provider.com/shoes">
</about>
can be interpreted as:
Author(http://www.shoes.com, "Fred")
Supersedes(http://www.shoes.com, http://www.provider.com/shoes)
Link semantics can be modeled by observing that anything can be considered
a point in the web--including people, organizations, dates, and subject
categories--by giving it a URL. A link or attribute in the web can be interpreted
as an assertion, given an understanding of the semantics of the link relationship
or attribute name. For example, given the definitions:
- Author(S, V) means "The Author of S is V"
- Supersedes(S, T) means "S supersedes T"
the HTML or SOIF data above can be interpreted as the assertions:
- The Author of http://www.shoes.com is Fred.
- http://www.shoes.com supersedes http://www.provider.com/shoes.
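The interpretation just described can be sketched in code. The dictionary layout used for the description, and the function names to_assertions and read_out, are assumptions made for illustration; only the resource, attribute names, values, and meanings are taken from the example above.

```python
# The description of the resource, independent of SOIF or HTML syntax.
description = {
    "resource": "http://www.shoes.com",
    "fields": [
        ("Author", "Fred"),
        ("Supersedes", "http://www.provider.com/shoes"),
    ],
}


def to_assertions(description):
    """Turn each attribute/value pair into a binary assertion N(S, V)."""
    subject = description["resource"]
    return [(name, subject, value) for name, value in description["fields"]]


# The agreed meaning of each attribute name or link relationship.
MEANINGS = {
    "Author": "The Author of {s} is {v}.",
    "Supersedes": "{s} supersedes {v}.",
}


def read_out(assertions):
    """Render each assertion using the meaning of its predicate."""
    return [MEANINGS[p].format(s=s, v=v) for p, s, v in assertions]


assertions = to_assertions(description)
for sentence in read_out(assertions):
    print(sentence)
```

An attribute whose name has no entry in MEANINGS could still be stored as an assertion, but it could not be read out, which reflects the point that the assertion is only interpretable given an understanding of the attribute's semantics.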
A straightforward application of this approach permits the description
of a set of assertions about an individual concept, identified by an identifier.
Tim Berners-Lee's paper Metadata
Architecture [Ber97] carries these ideas further, and this approach
is being reflected in the W3C's RDF specifications.
In addition to the description of simple, flat sets of attribute/value
pairs describing individual entities, it is necessary for these structural
models to be able to handle more complex structures, such as trees (e.g.,
repeating groups) and networks (directed graphs). In defining these more
complex structures, the ability to assign identifiers both to resources
and to individual (or groups of) attribute/value pairs is important. This
allows a given (sub)structure to be assigned an identity, and then referenced
from multiple places within a data structure. In actual representations,
such substructures are indicated not by assigning them separate identifiers,
but by some distinct representation technique (e.g., by nesting them within
a larger tag). Such substructures need to be understood as being "flattened",
with separate identifiers defined, in interpreting them within a logic-based
framework (just as, in the relational data model, data must at least be
represented in unnested "first normal form"). Techniques for
factoring nested parts of a hierarchical structure into a "flat"
logical form, and the need for both AND and OR logical operators, are illustrated
and discussed in On
Information Factoring in Dublin Metadata Records <http://www.uic.edu/~cmsmcq/tech/metadata.factoring.html>.
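The flattening step can be sketched as follows. This is an illustrative Python fragment, not the technique of the cited paper: the `flatten` function, the `#n1`-style generated identifiers, and the sample record (its Title, Name, and Affiliation attributes and their values) are all assumptions made for the example.

```python
from itertools import count


def flatten(resource_id, record):
    """Turn a nested record into flat (attribute, subject, value) triples.

    Every nested group receives a generated identifier so that it can be
    the subject of its own assertions and can be referenced from elsewhere.
    """
    fresh = count(1)
    triples = []

    def walk(subject, attrs):
        for name, value in attrs.items():
            # A list models a repeating group; a single value is a list of one.
            values = value if isinstance(value, list) else [value]
            for v in values:
                if isinstance(v, dict):
                    node = f"{resource_id}#n{next(fresh)}"
                    triples.append((name, subject, node))
                    walk(node, v)
                else:
                    triples.append((name, subject, v))

    walk(resource_id, record)
    return triples


# Hypothetical nested description with a repeating Author group.
record = {
    "Title": "Shoes",
    "Author": [
        {"Name": "Fred", "Affiliation": "Provider"},
        {"Name": "Ann"},
    ],
}
flat = flatten("http://www.shoes.com", record)
for triple in flat:
    print(triple)
```

Each nested Author group becomes a node with its own identifier, so the result is a flat set of assertions analogous to first normal form.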
Various specific representation techniques for metadata, such as RDF, MCF,
SOIF, OEM, etc., can be understood in the context of these observations
as simply involving different encodings of the basic logic-based structures.
Each encoding selects specific attributes, identifiers, etc. to cluster
together in specific data representations, and selects others to represent
as separate entities. Also, they select some relationships to represent
explicitly by using identifiers as pointers, and some to represent implicitly
by grouping related constructs in the same data structure. This interpretation
of attribute/value pairs (and associated structures) as logical assertions
is a key element in the development of a formal basis for our Web object
model, and is explicitly reflected in RDF as well.
3.3.2 Representation of Higher Level Semantics
What is metadata to one application is often data to another, and vice-versa.
Hence, it is often important to be able to define metadata which describes
other metadata descriptions, or parts of them. For example, it is important
to be able to define the semantics of the individual attributes used in
metadata descriptions, and to define the characteristics of the values
that may be assigned to them (e.g., their types, their units, what they
signify). Discussions of structural or "lightweight" models often
refer to tagged values as "self-describing", as allowing arbitrary
attribute names to be introduced, and as not requiring the use of centralized
attribute or type registration. However, this is only true to a certain
extent. These representations are really "self-describing" in
a truly useful way only if there is a common understanding of the meaning
of the attribute names (and their associated values) by accessing applications.
To support general interoperability, the definitions of attribute names
and types must either be actually distributed, or distributed access must
be provided to them.
A number of abstract models for Web metadata describe the ability to
link metadata individually to tagged items (attributes). For example, the
Dublin Core describes the ability to access the definition of an individual
attribute. This, for example, allows the attributes used in a particular
description to be linked to an ontology that defines the attributes, and
the set of concepts used in the context that the attributes are intended
to describe. (A resource pointing to its ontology is, in a sense, similar
to an object pointing to its methods: each provides an interpretation. The
methods are a "procedural specification" of the meaning/behavior
appropriate to the data, while an ontology is human-readable. Work by
groups such as the Stanford knowledge group is intended to merge these
ideas and make the ontology readable/usable by knowledge-based software,
the idea being that one could have a logic-based or other semantic specification
which is declarative and machine-interpretable.) The relationship between
attribute/value pairs and formal logic described above also provides a
basis for representing these additional kinds of links.
Describing and Linking
Web Resources discusses how higher level information (such as beliefs),
and information about the attributes or relationships themselves, can also
be encoded using predicate logic. The basic approach is to assign each
relationship (or attribute) its own URL (object identity), thus reifying
the relationship (or attribute). Once a relationship has a URL (or other
unique identifier), it can have its own metadata, by recording additional
assertions about that identifier. If the relationship is identified with
a URL, dereferencing the URL should access a definition of the link relationship,
in either human-readable or machine-readable form. In addition, information
about the association between a given attribute or assertion and a given
resource can also be recorded. For example, in addition to recording an
assertion like cost(o1, $26.95), information as to who made that
assertion, and when, can also be recorded, e.g.:
who( (o1,cost), "fred")
when( (o1,cost), "04/07/97")
In this case, (o1,cost) acts as a new unique identifier which
is the identity of the use within (or for) o1 of the attribute "cost"
(this is a form of identifier construction mechanism supported by
object logics, such as F-logic, described below).
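A small Python sketch of this idea follows, assuming assertions are stored as (attribute, subject, value) triples. The `about` helper is a hypothetical name; the pair `("o1", "cost")` plays the role of the constructed identifier (o1,cost) from the example above.

```python
# Base-level assertion: cost(o1, $26.95).
facts = {("cost", "o1", "$26.95")}

# Assertions about that assertion. The pair (subject, attribute) acts as
# the identifier of "the use of attribute cost for o1".
meta = {
    ("who", ("o1", "cost"), "fred"),
    ("when", ("o1", "cost"), "04/07/97"),
}


def about(identifier, assertions):
    """Return {attribute: value} for every assertion made about identifier."""
    return {attr: val for attr, subj, val in assertions if subj == identifier}


everything = facts | meta
provenance = about(("o1", "cost"), everything)
description = about("o1", everything)
print(provenance)
print(description)
```

Because the constructed identifier is an ordinary value, the same lookup serves both the resource o1 and the assertion made about it.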
Metadata Architecture
[Ber97] observes that the URL space is an appropriate space for the definition
of attribute names in the Web because it effectively provides for a federated
name space, within which users can freely define attribute names without
necessarily "registering" them with a central authority. However,
the URLs that identify relationships or attributes need not necessarily
be used locally (within a given resource). Instead, local names from a
namespace defined by the resource can be used as abbreviations. However,
it should always be possible to translate from a local name to the global
URL that represents the actual definition of the relationship or attribute.
Relationships such as the following could be defined to represent these
concepts:
- global(S, T)--The anchor S, which represents a link relationship
locally to a resource, is defined globally at T.
- implies(S, T)--S implies T; that is, from any link/assertion
S(X, Y), deduce T(X, Y) [this could be used as the basis for defining a
subtype relationship for the base level relationships.]
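The two relationships can be sketched in Python as follows. The function names and the `http://example.org/...` URLs are hypothetical, and "Contributor" is an invented supertype of "Author" used only to exercise the rule; none of these appear in the cited work.

```python
def globalize(assertions, global_names):
    """Apply global(S, T): replace each local name S with its global URL T."""
    return {(global_names.get(rel, rel), x, y) for rel, x, y in assertions}


def close_under_implies(assertions, implies):
    """Apply implies(S, T) until nothing new follows: S(X, Y) yields T(X, Y)."""
    derived = set(assertions)
    changed = True
    while changed:
        changed = False
        for rel, x, y in list(derived):
            for s, t in implies:
                if rel == s and (t, x, y) not in derived:
                    derived.add((t, x, y))
                    changed = True
    return derived


# Assertion written with a local name inside some resource.
local = {("Author", "http://www.shoes.com", "Fred")}

# Hypothetical global definitions and a hypothetical subtype rule.
AUTHOR = "http://example.org/rel/Author"
CONTRIBUTOR = "http://example.org/rel/Contributor"
global_names = {"Author": AUTHOR}
implies = {(AUTHOR, CONTRIBUTOR)}

result = close_under_implies(globalize(local, global_names), implies)
for assertion in sorted(result):
    print(assertion)
```

Translating local names to global URLs first means that the deduction rule only ever has to be stated once, against the globally defined relationship.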
These ideas are being reflected in the RDF, XML, and other W3C specifications.
Such reification of attributes and relationships (and also of types and
methods) is also a key element in the development of a formal basis for
our Web object model.
3.3.3 Object Logics
Along with the development of object technology, a number of attempts
have been made to extend logical formalisms to represent the specific characteristics
of objects. A particular goal in the development of object logics
has been to provide the same type of solid theoretical foundation for object-oriented
database systems that the relational model provides for relational database
systems. The foundation of the relational model (specifically relational
calculus) is a restricted subset of conventional predicate logic. The reasoning
was thus that, in order to have the same sort of theoretical foundations
for object-oriented database systems, it would be necessary to have a logic
analogous to predicate calculus, but one that would incorporate object
concepts such as objects, classes, methods, inheritance, etc. A number
of object logics have been introduced, one of the more thoroughly-developed
of which is F-logic
(Frame Logic) [KL89, KLW95].
A full exposition of F-logic is outside the scope of this paper (and
in any case can be obtained from the cited references). However, F-logic
includes a number of capabilities that are relevant to this discussion.
For example, F-logic supports operations on both flat data structures (along
the lines of the conventional relational model) and nested data structures
(path traversal). F-logic also supports id-terms representing object
identities. These are logical terms which use object constructor
functions that can be interpreted as constructing object identities
that are functionally dependent on their arguments. These terms are used
to represent derived objects (e.g., objects to be constructed on the left-hand
sides of rules), with the arguments of the function indicating the base
objects from which the new objects were derived (effectively, the derived
identity can be considered as the labeled tuple of the base identities).
The ability to construct derived objects is crucial in describing the semantics
of queries which produce new objects from existing ones (as a relational
join operation does) and of views.
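The role of constructed identities in a join-like view can be illustrated with a short Python sketch. It is not F-logic syntax; the `oid` function, the `works_in` constructor label, and the employee and department data are assumptions made for the example.

```python
def oid(constructor, *base_ids):
    """Build a derived identity as a labeled tuple of the base identities.

    The result depends only on the arguments, so deriving the same object
    twice always yields the same identity.
    """
    return (constructor, *base_ids)


# Hypothetical base objects, keyed by their identities.
employees = {
    "e1": {"name": "Fred", "dept": "d1"},
    "e2": {"name": "Ann", "dept": "d1"},
}
departments = {"d1": {"title": "Sales"}}

# A join-like view: one derived object per matching (employee, department)
# pair, identified by the identities of the objects it was derived from.
view = {}
for e_id, emp in employees.items():
    for d_id, dept in departments.items():
        if emp["dept"] == d_id:
            view[oid("works_in", e_id, d_id)] = {
                "name": emp["name"],
                "title": dept["title"],
            }

for identity, obj in view.items():
    print(identity, obj)
```

Since each derived identity records its base identities, a derived object can always be traced back to the objects that produced it.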
Finally, F-logic introduces higher-order capabilities in order to describe
inheritance and operations on metadata (e.g., database schemas) effectively.
This is done, as suggested in the previous section, by reifying concepts such
as predicates, functions, and atomic formulas, allowing them to be manipulated
as first-class objects. This reification allows the use of higher-order syntax
while retaining first-order semantics. Under first-order semantics, predicates
and functions have associated objects, called intensions, which can be
manipulated directly. Depending on the context in which they appear, these intensions
may assume different roles, acting as relations, functions, or propositions.
For example, in F-logic, id-terms are handled as individuals when they
occur as object identities, viewed as functions when they appear as object
labels (attributes), and as sets when representing classes of objects.
When functions or predicates are treated as objects, they are manipulated
as terms through their intensions; when being applied to arguments, they
are evaluated as functions or relations through their extensions.
The use of F-logic concepts in helping define query language concepts
for object-oriented databases is described in [KKS92], including query
language support for:
- object method invocation
- derived objects (for constructing views)
- querying both the database and its metadata
In addition, the higher-order capabilities of F-logic are those needed
to formally define the use of mixtures of data and metadata within the
Web. For example, in dealing with an RDF description of a Web resource,
in some cases we may want to treat one of the RDF properties as simply
a property of the described resource. In other cases, we may want to treat
the property as an object in its own right (by following its URL), with
properties of its own (e.g., its definition, or the ontology it is a part
of). RDF explicitly allows this, using the sort of reification we have
already described. Using F-logic (or possibly a variant), we hope to provide
a formal basis for describing such operations, and for the development
of both our Web object model, and query languages and other services based
on it.
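The dual use of a property can be sketched in Python as follows. This is only an illustration of the idea, not RDF syntax or an RDF API: the `http://example.org/...` URLs, the `definition` and `partOf` attribute names, and the helper functions are all hypothetical.

```python
# Hypothetical URL naming the "cost" property.
COST = "http://example.org/props/cost"

# Assertions as (property, subject, value). The same identifier COST appears
# once in the property position and twice in the subject position.
triples = {
    (COST, "o1", "$26.95"),
    ("definition", COST, "Retail price of the item"),
    ("partOf", COST, "http://example.org/ontologies/commerce"),
}


def values(prop, subject):
    """Use prop as a relation: look up its values for the given subject."""
    return {v for p, s, v in triples if p == prop and s == subject}


def describe(identifier):
    """Use identifier as an object: collect the assertions made about it."""
    return {p: v for p, s, v in triples if s == identifier}


cost_of_o1 = values(COST, "o1")
cost_metadata = describe(COST)
print(cost_of_o1)
print(cost_metadata)
```

Which role the identifier plays is determined purely by the position in which it is used, which is the behavior a formal basis for the model would need to account for.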
4. Conclusions
In this paper, we have:
- described key examples of existing work from the Web, database, and
OMG communities that contribute both ideas and technology toward providing
the components of a Web object model
- identified some key underlying principles behind this work
- identified a framework which allows this work to be unified and extended
to support the requirements of advanced Web applications for object technology
At the moment, we have only identified an approach toward integrating
these technologies. Many details of this technology integration must still
be worked out (partially because some of the key technologies we have identified
are still under development). Nevertheless, we feel that the capabilities
inherent in these technologies provide the necessary support for the object
model integration we propose.
We feel that a particularly important aspect of this work is the attempt
to rely to the greatest possible extent on standards (commonly-accepted
or likely-to-be-accepted Web technology) in developing our integration
approach, and on working within standards-developing organizations such
as W3C and OMG in further refining it and developing additional capabilities.
This both takes maximum advantage of existing work, and improves the chances
that the technology that is developed will become widely available (albeit
possibly in some modified form) in commercial software products.
Further work on this project will include:
- tracking the development of the key technologies (XML, RDF, and DOM)
- defining one or more specific object models as the basis of further
development (an obvious candidate would be one that maps easily to OMG
IDL or Java)
- developing a detailed integration plan
- working with the relevant standards groups to define any necessary
enhancements to support object model requirements
- prototype development, using tools that are already available for some
of these technologies (e.g., XML and DOM)
- defining query support facilities for objects created using the object
model(s) we define
References
[AQMW+96] S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. L. Wiener,
"The Lorel Query Language for Semistructured Data", http://www-db.stanford.edu/pub/papers/lorel96.ps.
See also the other papers available at the Stanford
DB group Publications page <http://www-db.stanford.edu/pub/>.
[BBBC+97] R. Bayardo, Jr., W. Bohrer, R. Brice, A. Cichocki, J. Fowler,
A. Helal, V. Kashyap, T. Ksiezyk, G. Martin, M. Nodine, M. Rashid, M. Rusinkiewicz,
R. Shea, C. Unnikrishnan, A. Unruh, and D. Woelk, "InfoSleuth: Agent-Based
Semantic Integration of Information in Open and Dynamic Environments",
Proc. 1997 ACM SIGMOD Conf., SIGMOD Record, 26(2), June 1997.
[BDHS96] P. Buneman, S. Davidson, G. Hillebrand, and D. Suciu, "A
Query Language and Optimization Technique for Unstructured Data",
Proc. SIGMOD'96, 505-516.
[BDFS97] P. Buneman, S. Davidson, M. Fernandez, and D. Suciu, "Adding
Structure to Unstructured Data", Proc. ICDT, 1997.
[Ber97] T. Berners-Lee, Metadata
Architecture, <http://www.w3.org/DesignIssues/Metadata>.
[Bor95] A. Borgida, "Description Logics in Data Management",
IEEE Trans. on Knowledge and Data Engineering, 7(5), October 1995,
671-682.
[Bos97] J. Bosak, XML,
Java, and the Future of the Web, <http://sunsite.unc.edu/pub/sun-info/standards/xml/why/xmlapps.htm>,
1997.
[CHHM+94] B. Chhabra, D. Hardy, A. Hundhausen, D. Merkel, J. Noble,
M. Schwartz, "Integrating Complex Data Access Methods into the Mosaic/WWW
Environment", Proc. Second Intl. World Wide Web Conf., Oct.
1994, 909-919.
[CM93] S. Chiba and T. Masuda, "Designing an Extensible Distributed
Language with a Meta-Level Architecture", Proc. ECOOP '93,
LNCS 707, Springer-Verlag, July 1993, 482-501.
[DeR97] S. DeRose, The SGML FAQ Book, Kluwer, 1997.
[FR97] G. Fahl and T. Risch, "Query Processing over Object Views
of Relational Data", VLDB Journal 6(1997) 4, 261-281.
[GB97] R. Guha and T. Bray, Meta
Content Framework Using XML, <http://www.w3.org/TR/NOTE-MCF-XML/>,
June 6, 1997.
[GW97] R. Goldman and J. Widom, "DataGuides: Enabling Query Formulation
and Optimization in Semistructured Databases", Technical Report, Stanford
University, 1997, http://www-db.stanford.edu/pub/papers/dataguide.ps.
[Hop97] A. Hopmann, et al., Web
Collections using XML, 1997 <http://www.w3.org/TR/NOTE-XMLsubmit.html>.
[IK96] T. Isakowitz and R. J. Kauffman, "Supporting Search for
Reusable Software Objects", IEEE Trans. Software Engrg. 22(6),
June 1996, 407-423.
[ILCS95] D. Ingham, M. Little, S. Caughey, S. Shrivastava, "W3Objects:
Bringing Object-Oriented Technology to the Web", Proc. Fourth Intl.
World Wide Web Conf., World Wide Web Journal, December, 1995,
89-105.
[ISO86] International Standard ISO 8879:1986(E), Information Processing
- Text and Office Systems - Standard Generalized Markup Language (SGML),
International Organization for Standardization, 1986.
[ISO92] International Standard ISO/IEC 10744:1992, Information Technology
- Hypermedia/Time-based Structuring Language (HyTime), International
Organization for Standardization, 1992.
[ISO96] International Standard ISO/IEC 10179:1996(E), Information Technology
- Processing languages - Document Style Semantics and Specification
Language (DSSSL), International Organization for Standardization, 1996.
[KKS92] M. Kifer, W. Kim, and Y. Sagiv, "Querying Object-Oriented
Databases", Proc. ACM SIGMOD Conf., 1992, 393-402.
[KL89] M. Kifer and G. Lausen, "F-Logic: A Higher-Order Language
for Reasoning about Objects, Inheritance, and Scheme", Proc. 1989
ACM-SIGMOD Intl. Conf. on Management of Data, 1989. See also other
papers on F-logic and related
formalisms <http://www.cs.sunysb.edu/~kifer/dood/>.
[KLW95] M. Kifer, G. Lausen, and J. Wu, "Logical Foundations of
Object-Oriented and Frame-Based Languages", Journal of the ACM,
July 1995, 741-843.
[KR97] R. Khare and A. Rifkin, "XML: A Door to Automated Web Applications",
IEEE Internet Computing, 1(4), July-August 1997, 78-87.
[Man93] F. Manola, "MetaObject Protocol Concepts for a 'RISC' Object
Model", TR-0244-12-93-165, GTE Laboratories Incorporated, 1993 <ftp.gte.com,
directory pub/dom>.
[Man97] F. Manola (ed.), "NCITS Technical Committee H7 Object Model
Features Matrix", X3H7-93-007v12b, May 25, 1997, http://www.objs.com/x3h7/h7home.htm.
[MGHH+97] F. Manola, D. Georgakopoulos, S. Heiler, B. Hurwitz, G. Mitchell,
F. Nayeri, "Supporting Cooperation in Enterprise-Scale Distributed
Object Systems", in M. Papazoglou and G. Schlageter, eds., Cooperative
Information Systems, Academic Press, 1997.
[NUWC97] S. Nestorov, J. Ullman, J. Wiener, and S. Chawathe, "Representative
Objects: Concise Representations of Semistructured Hierarchical Data",
in Proc. Thirteenth Intl. Conf. on Data Engineering, Birmingham,
U.K., April 1997.
[OMG95] Object Management Group, The Common Object Request Broker:
Architecture and Specification, Revision 2, July, 1995.
[OMG97] Object Management Group, A Discussion of the Object Management
Architecture, June, 1997, http://www.omg.org/library/omaindx.htm.
[PGW95] Y. Papakonstantinou, H. Garcia-Molina, and J. Widom, "Object
Exchange Across Heterogeneous Information Sources", IEEE Intl. Conf.
on Data Engineering, 251-260, Taipei, March 1995. See also the other papers
available at the TSIMMIS
Publications page <http://www-db.stanford.edu/tsimmis/publications.html>.
[REMB+95] O. Rees, N. Edwards, M. Madsen, M. Beasley, A. McClenaghan,
"A Web of Distributed Objects", Proc. Fourth Intl. World Wide
Web Conf., World Wide Web Journal, December, 1995, 75-87.
[SG95] N. Singh and M. Gisi, "Coordinating Distributed Objects
with Declarative Interfaces", http://logic.stanford.edu/sharing/papers/oopsla.ps.
[SW96] R. Stroud and Z. Wu, "Using Metaobject Protocols to Satisfy
Non-Functional Requirements", in C. Zimmermann (ed.), Advances
in Object-Oriented Metalevel Architectures and Reflection, CRC Press,
Boca Raton, 1996, 31-52.
This research is sponsored by the Defense Advanced Research
Projects Agency and managed by the U.S. Army Research Laboratory under
contract DAAL01-95-C-0112. The views and conclusions contained in this
document are those of the authors and should not be interpreted as necessarily
representing the official policies, either expressed or implied, of the
Defense Advanced Research Projects Agency, U.S. Army Research Laboratory,
or the United States Government.
© Copyright 1997, 1998 Object Services and Consulting,
Inc. Permission is granted to copy this document provided this copyright
statement is retained in all copies. Disclaimer: OBJS does not warrant
the accuracy or completeness of the information in this survey.
This page was written by Frank Manola. Send questions
and comments about it to fmanola@objs.com.
Last updated: 2/10/98 fam