Upper Mapping and Binding Exchange Layer
A lightweight, subject concept reference structure for the Web

Key Terminology

UMBEL's main classes are categorized as subject concepts; their notable instances are specifically termed named entities.

UMBEL defines subject concepts as a distinct subset of the broader notion of a concept, as used in the SKOS RDFS controlled vocabulary, conceptual graphs, formal concept analysis, or the very general concepts common to many upper ontologies. UMBEL contrasts subject concepts with abstract concepts and with named entities.

Subject Concepts

Subject concepts are a special kind of concept: namely, ones that are concrete, subject-related and non-abstract. Note that in other systems or ontologies, similar constructs may be called topics, subjects, concepts or perhaps interests. UMBEL has adopted the term subject concept to distinguish it from these uses, which differ in nuance of meaning and use, and to highlight the subject or topic nature of UMBEL's concrete concepts.

Each subject concept is a class. While each subject concept has a preferred label (using SKOS terminology), that label is only a representative of, or proxy for, the concept and is not to be confused with the thing itself. Every UMBEL subject concept can be expressed and referred to by a different preferred label in other languages. Indeed, within a given language, preferred labels may be swapped out without affecting the identity or use of the subject concept itself. The name for a subject concept is therefore merely a handle.
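The separation between a concept's identity and its labels can be sketched in code. The following is a minimal illustration in Python, not UMBEL's actual data model: the `SubjectConcept` class, its method names, and the `ex:Dog` identifier are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SubjectConcept:
    """Hypothetical model: identity lives in `uri`; labels are only handles."""

    uri: str
    # Language tag -> preferred label. Excluded from equality on purpose.
    pref_labels: Dict[str, str] = field(default_factory=dict, compare=False)

    def set_pref_label(self, lang: str, label: str) -> None:
        """Set or replace the preferred label for one language."""
        self.pref_labels[lang] = label

    def label(self, lang: str, fallback: str = "en") -> str:
        """Return the label for `lang`, else the fallback language, else the URI."""
        if lang in self.pref_labels:
            return self.pref_labels[lang]
        return self.pref_labels.get(fallback, self.uri)


dog = SubjectConcept("ex:Dog")  # "ex:Dog" is a made-up identifier
dog.set_pref_label("en", "dog")
dog.set_pref_label("fr", "chien")

# Swapping the English label changes the handle, not the concept.
relabeled = SubjectConcept("ex:Dog", {"en": "domestic dog"})
assert dog == relabeled
```

Comparing on the identifier alone is the design choice that mirrors the text: two records with different preferred labels still denote the same subject concept.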

Subject concepts are the core constituents of the UMBEL framework. All subject concepts are based on existing concepts in OpenCyc, the open source version of the Cyc knowledge base (see related article). About 20,000 of them have been distilled into the UMBEL backbone.

Semsets

Semsets are sets of semantically close terms or phrases that are synonymous, or nearly so, with the meaning of a subject concept or a named entity. Semsets are akin to WordNet synsets or Cyc aliases, but can also include acronyms or more contemporary jargon or slang as may be drawn from Web tagging or folksonomies. The term semset has been chosen to distinguish this consolidated meaning.

Semsets may apply to either subject concepts or named entities. In the latter case, their use is closer to the sense of an alias (such as nicknames, or 'great satan' or 'uncle sam' for the United States).
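One way to picture a semset is as a lookup from many surface terms to the single concept or entity they refer to. The sketch below assumes a simple in-memory index; `SemsetIndex`, `normalize`, and the `ex:United_States` identifier are hypothetical names chosen for illustration, and the normalization rule (lowercasing and collapsing whitespace) is an assumption rather than anything UMBEL specifies.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so near-identical spellings match."""
    return " ".join(term.lower().split())


class SemsetIndex:
    """Hypothetical index from semset terms to concept or entity identifiers."""

    def __init__(self) -> None:
        self._targets_by_term: Dict[str, Set[str]] = defaultdict(set)

    def add(self, target: str, terms: Iterable[str]) -> None:
        """Attach every term in a semset to the given target identifier."""
        for term in terms:
            self._targets_by_term[normalize(term)].add(target)

    def lookup(self, term: str) -> Set[str]:
        """Return all targets whose semset contains the term (possibly none)."""
        return set(self._targets_by_term.get(normalize(term), set()))


index = SemsetIndex()
# Aliases taken from the example in the text; the identifier is made up.
index.add("ex:United_States", ["United States", "Uncle Sam", "Great Satan"])
```

A lookup returns a set because an ambiguous term, such as a slang word or an acronym, may legitimately belong to more than one semset.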

Abstract Concepts

Abstract concepts represent abstract or ephemeral notions such as truth, beauty, evil or justice, or are thought constructs useful to organizing or categorizing things but are not readily seen in the experiential world. They are included in the UMBEL specification because they help maintain the integrity of the UMBEL subject concept graph.

Like subject concepts, abstract concepts are based strictly on those already in OpenCyc. Abstract concepts may be viewed in the UMBEL graph, and may be used for ontology mapping, but are not generally displayed when doing standard content mapping or concept look-ups via Web services. For various domain extraction or relatedness determinations, abstract concepts may be excluded from UMBEL's internal processing.
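The visibility rule described above can be illustrated with a small filter. This is only a sketch under assumed names: the `Concept` record, its `is_abstract` flag, and the `visible_for_lookup` function are hypothetical and do not describe how UMBEL's Web services are implemented.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Concept:
    """Hypothetical graph node; `is_abstract` marks an abstract concept."""

    name: str
    is_abstract: bool = False


def visible_for_lookup(concepts: Iterable[Concept]) -> List[Concept]:
    """Keep only non-abstract concepts, as a standard look-up would show them."""
    return [c for c in concepts if not c.is_abstract]


graph = [
    Concept("PartiallyTangibleThing", is_abstract=True),  # stays in the graph
    Concept("Dog"),  # illustrative subject concept
]
shown = visible_for_lookup(graph)
```

The full `graph` list is left unchanged, which reflects the point that abstract concepts remain available for graph viewing and ontology mapping even when hidden from look-ups.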

Named Entities

Named entities are the real things or instances in the world that are themselves natural and notable class members of subject concepts. The initial named entities are drawn from Wikipedia, as processed via YAGO, and from other online fact-based repositories. Named entities are the instances of the subject concepts in the standard definition of the term.

Named entities and the sources for them are also a major avenue for growth and expansion of UMBEL moving forward. Named entities are more contemporary and changing, while the reference subject concept backbone is more fixed and stable.

Each named entity is mapped to a governing subject concept for ontology purposes. There are no relations between named entities except as mediated through one or more subject concepts. As noted, named entities may also have semset aliases.
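The rule that named entities relate to one another only through subject concepts can be sketched as a registry with no direct entity-to-entity links. Everything named here is hypothetical: the `EntityRegistry` class, its methods, and the `ex:` identifiers are illustrative assumptions, not part of UMBEL.

```python
from typing import Dict, Iterable, Set


class EntityRegistry:
    """Hypothetical registry mapping named entities to subject concepts."""

    def __init__(self) -> None:
        self._concepts_of: Dict[str, Set[str]] = {}

    def register(self, entity: str, concepts: Iterable[str]) -> None:
        """Map an entity to one or more governing subject concepts."""
        concept_set = set(concepts)
        if not concept_set:
            # Every named entity must belong to at least one subject concept.
            raise ValueError("a named entity needs at least one subject concept")
        self._concepts_of.setdefault(entity, set()).update(concept_set)

    def shared_concepts(self, a: str, b: str) -> Set[str]:
        """Two entities are related only via the subject concepts they share."""
        return self._concepts_of.get(a, set()) & self._concepts_of.get(b, set())


registry = EntityRegistry()
registry.register("ex:Lassie", ["ex:Dog", "ex:FictionalCharacter"])
registry.register("ex:Snoopy", ["ex:Dog", "ex:FictionalCharacter"])
registry.register("ex:Paris", ["ex:City"])
```

Because the registry stores no links between entities, any relatedness between `ex:Lassie` and `ex:Snoopy` has to be derived from `shared_concepts`, which is empty for entities such as `ex:Lassie` and `ex:Paris` that share no subject concept.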

Subject Concepts v. Abstract Concepts

The following table helps draw the distinction between subject concepts and abstract concepts. Inspect the UMBEL documentation to see the 740 or so abstract concepts presently within UMBEL. Looking at those can help draw the distinction.

Subject Concepts
  • Nouns or noun phrases
  • Concrete kinds of things or ideas in the real world
  • Broad, collective, reference concepts, often hierarchically related
  • Similar to topics or subjects, though those terms are used in somewhat different ways in alternative schemas
  • Collections or classes of like kinds of items
  • Quite stable in scope, breadth and structure
  • Grounded in the OpenCyc knowledge base, which is the source of their relationships and graph structure
  • Named entities are members of subject concepts

Abstract Concepts
  • Either 1) abstract concepts (truth, beauty, evil), or 2) artificial thought constructs for organizing things that are not encountered as standalone concepts in their own right (e.g., PartiallyTangibleThing)
  • Collections or classes of like kinds of items
  • Class members may be either other abstract concepts or subject concepts
  • Class members are never named entities
  • Tend to reside higher in the subsumption structure
  • Generally hidden from the UMBEL subject concept reference backbone structure
  • May be used for ontology mapping purposes
  • Grounded in the OpenCyc knowledge base, which is the source of their relationships and graph structure

Subject Concepts v. Named Entities

The following table helps draw the distinction between subject concepts and named entities. More explanation is also provided in the documentation.

The distinction between these two categories is not always clear. For example, most geographical places clearly belong to the named entity category. But, on somewhat arbitrary grounds, all nations, countries, states and provinces were assigned as subject concepts so that they would act as classes with other entities mapped to them. It should also be noted that entities or concepts in the gray zone may be treated as both a named entity and a subject concept.

Subject Concepts
  • Broad, collective, reference concepts. In a hierarchical category structure, subject concepts represent the root or branch nodes
  • Nouns or noun phrases
  • Called subject concepts (or sometimes, as a shorthand, concepts). These are similar to topics or subjects, but those terms are used in somewhat different ways in alternative schemas and are therefore not used interchangeably here
  • Not abstract concepts (truth, beauty, evil), but concrete kinds of things or ideas in the real world; abstract concepts are often properly part of what are known as upper ontologies, but they are not applicable for UMBEL's purposes
  • Collections or classes of like kinds of items
  • Quite stable in scope, breadth and structure
  • Grounded in the OpenCyc knowledge base, which is the source of their relationships and graph structure
  • Basis for the UMBEL subject concept reference backbone structure
  • Named entities are members of subject concepts

Named Entities
  • Atomic, specific objects, often famous or well-known, that belong to reference types such as persons, places, organizations, events, products, time intervals, etc. In a hierarchical category structure, named entities represent the leaves
  • Nouns or noun phrases
  • Called named entities, not entities alone, to prevent confusion with other general senses of the term entity and in keeping with named entity recognition (NER)
  • Very concrete, atomic entities
  • The number and scope is fluid and growing, and potentially of huge size as specific objects are named
  • Often expressed as a proper noun (with some capitalization), but not necessarily so. Common animal, plant, object and substance names can also be named entities
  • Major sources are Wikipedia (via YAGO) and similar sources such as Wikinvest, Wikicompanies, etc.
  • Named entities are maintained and treated separately from the UMBEL subject concept ontology
  • Every named entity belongs to at least one subject concept

Though there are shades of gray between subject concepts and named entities, we have generally found this distinction to be a powerful means for gaining clarity in UMBEL's design. It provides a clean path for keeping an ontology lightweight while in essence providing infinite extensibility for all manner of named entities and the datasources that contain them. Moreover, the ability to classify named entities into types orthogonal to subject concepts also provides useful guidance for presentation templates that may be automatically invoked in data meshups.

Copyright © 2007-08 UMBEL.org. Some rights reserved. Creative Commons License.