Upper Mapping and Binding Exchange Layer
A lightweight, subject concept reference structure for the Web

Why OpenCyc?

The representativeness of UMBEL's subject concepts (the scope of the ontology), combined with their relationships (the structure of the backbone), is fundamental. These factors in turn determine the functional capabilities of the system, and the use of OpenCyc as the source basis for UMBEL is central to those capabilities.

First Things First: The Importance of Context

A reference structure of almost any nature has value. It provides context, which in turn provides fixed points in the information space for relating distributed datasets to one another. A reference structure of concepts has the added benefit of providing a logical reference structure for instances as well.

While Wikipedia is perhaps the most comprehensive collection of well-known instances, no single source can or will be complete in scope. Thus, many public and private sources of entities will emerge as reference hubs.

How does each of these rich instance sources relate to the others? What is the subject concept or topical basis on which they overlap or complement one another? What framework and graph structure of knowledge gives this information context?

These are the benefits brought by a structure of reference concepts, independent of the specifics of the reference structure itself.

Over time, it is likely that a few Web-based reference structures will emerge, compete, and be supplemented by still further structures. This evolution is expected, natural, and desirable in that it provides choice and options.

Alternative Approaches

Since the Web's inception, various alternatives for organizing and bringing structure to Web content have been tried or are in ascendance. Some of these may be too static and inflexible, others perhaps too arbitrary or parochial. To date, all of these approaches have had little collective success.

Here is a summary of some of these alternate approaches:

Since inception, the stated intent of the UMBEL project was to base its subject structure on extant systems. To minimize development time, the structure needed to be drawn from one of the categories above. Possible development of a de novo structure was rejected because of development time and the low probability of gaining acceptance in the face of so many competing alternatives.

Rationale for OpenCyc

The granddaddy of knowledge bases suitable to all human content and knowledge is Cyc. Because of its more than 20-year history, Cyc brings with it considerable strengths and some weaknesses.

Amongst all alternatives, Cyc rapidly emerged as the leading candidate. While its strengths warranted close attention, its weaknesses suggested that considerable effort would be needed to overcome them. This combination called for significant investigation and due diligence.

First, here are OpenCyc's strengths:

As first encountered, one impression of OpenCyc was that of a very solid structure, but somewhat obscured and deserving of a fresh cleaning.

The Decision and Implementation

Nearly five full months of due diligence were devoted to the question of the suitability of OpenCyc as the conceptual and relationship grounding for UMBEL.

On balance, OpenCyc’s benefits significantly outweighed its weaknesses at the time. This balance was also considerably superior to that of all potential alternatives. An important factor throughout this deliberation was the commitment of Cycorp and The Cyc Foundation to the aims of UMBEL, and the willingness of those organizations to lend time and effort.

The decision was thus made in October 2007 to base UMBEL on OpenCyc and to undertake the (eventual) two person-years of effort to clean and vet the OpenCyc knowledge base for UMBEL’s purposes.

As discussed in the accompanying piece on UMBEL's role, the project has also made two pivotal decisions with respect to OpenCyc and its use:
  1. All UMBEL subject concepts are based on existing concepts in OpenCyc. This means UMBEL inherits the proven structure and relationships extant in OpenCyc.
  2. No new subject concepts will be added to UMBEL that are not included in OpenCyc. This means that UMBEL's structure will not diverge from the structural relations already in OpenCyc. This decision preserves the use of UMBEL as a sort of contextual middleware between unstructured Web content and the inferential and tools infrastructure within OpenCyc (and beyond into ResearchCyc and Cyc for commercial purposes) and back again to the Web. We term this "round-tripping" and the capability is available for any of the 20,000 subject concepts vetted from OpenCyc within UMBEL.
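The two decisions above can be read as a subset constraint plus a mapping in both directions. The following is a minimal sketch of that reading, not UMBEL's or OpenCyc's actual tooling: the concept names, the `OPENCYC_BROADER` links, and the functions `is_valid_umbel_set` and `round_trip` are all hypothetical and chosen only for illustration.

```python
# Hypothetical concept identifiers; NOT taken from OpenCyc or UMBEL.
OPENCYC_CONCEPTS = {"Dog", "Mammal", "Animal", "Automobile", "Vehicle"}

# Hypothetical "broader concept" links, standing in for OpenCyc's structure.
OPENCYC_BROADER = {"Dog": "Mammal", "Mammal": "Animal", "Automobile": "Vehicle"}

# Hypothetical vetted subset playing the role of UMBEL's subject concepts.
UMBEL_CONCEPTS = {"Dog", "Animal", "Vehicle"}


def is_valid_umbel_set(umbel, opencyc):
    """Decisions 1 and 2: every UMBEL concept must already exist in OpenCyc."""
    return umbel <= opencyc


def round_trip(concept):
    """Go from a UMBEL concept into the OpenCyc structure and back.

    The concept is looked up in OpenCyc (always possible under the subset
    constraint), its chain of broader concepts is followed there, and only
    the results that are themselves UMBEL concepts are returned.
    """
    if concept not in UMBEL_CONCEPTS:
        raise KeyError(f"{concept!r} is not a UMBEL subject concept")
    results = []
    current = concept
    while current in OPENCYC_BROADER:
        current = OPENCYC_BROADER[current]
        if current in UMBEL_CONCEPTS:
            results.append(current)
    return results
```

In this sketch, `round_trip("Dog")` walks through "Mammal" inside the OpenCyc structure but reports only "Animal", because "Mammal" is not in the illustrative UMBEL subset. Adding a concept to `UMBEL_CONCEPTS` that OpenCyc lacks would make `is_valid_umbel_set` return `False`, which is the divergence the second decision rules out.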

Fortunately, in the intervening months, Cycorp has been responsive and has made changes to the OpenCyc concept structure and its conversion to OWL in support of needs and observations brought forth by the UMBEL project. These changes provide comfort and a solid, adaptable structure for UMBEL moving forward.


Copyright © 2007-08. UMBEL.org. Some rights reserved. Creative Commons License