UMBEL's main classes categorize subject concepts; notable instances are specifically termed named entities.
UMBEL defines subject concepts as a distinct subset of the more broadly understood notion of a concept, such as that used in the SKOS RDFS controlled vocabulary, conceptual graphs, formal concept analysis or the very general concepts common to many upper ontologies. UMBEL contrasts subject concepts with abstract concepts and with named entities.
Subject concepts are a special kind of concept: namely, ones that are concrete, subject-related and non-abstract. Note that in other systems or ontologies, similar constructs may alternatively be called topics, subjects, concepts or perhaps interests. UMBEL has adopted the term subject concept to distinguish it from these uses, which have different nuances of meaning and use, as well as to highlight the subject or topic nature of UMBEL's concrete concepts.
Each subject concept is a class. While each subject concept has a preferred label (using SKOS terminology), that label is only representative of, or a proxy for, the concept, and is not to be confused with the thing itself. Every UMBEL subject concept can be expressed and referred to by a different preferred label in alternate languages. Indeed, within a given language, one preferred label may be swapped for another without affecting the identity or use of the subject concept itself. The name for a subject concept is therefore merely a handle.
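The idea that a label is only a handle can be illustrated with a minimal Python sketch. It is not UMBEL's actual data model: the `SubjectConcept` class, its `uri` and `pref_labels` fields, and the `example.org` identifier are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SubjectConcept:
    """Hypothetical model: identity lives in the URI, never in a label."""
    uri: str  # stable identifier (illustrative, not a real UMBEL URI)
    pref_labels: Dict[str, str] = field(default_factory=dict)  # language tag -> preferred label

    def label(self, lang: str = "en") -> str:
        # Fall back to the URI when no preferred label exists for the language.
        return self.pref_labels.get(lang, self.uri)


concept = SubjectConcept(
    uri="http://example.org/sc/Dog",
    pref_labels={"en": "dog", "fr": "chien"},
)

uri_before = concept.uri
concept.pref_labels["en"] = "domestic dog"  # swap the English handle
uri_after = concept.uri                     # identity is unaffected by the swap
```

Because lookups and mappings would key on `uri` in such a design, changing or translating a label leaves every reference to the concept intact.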
Subject concepts are the core constituents of the UMBEL framework. All subject concepts are based on existing concepts in OpenCyc, the open source version of the Cyc knowledge base (see related article). About 20,000 of them have been distilled and are part of the UMBEL backbone.
Semsets are semantically close terms or phrases that are synonymous, or nearly so, with the meaning of a subject concept or a named entity. Semsets are akin to WordNet synsets or Cyc aliases, but can also include acronyms or more contemporary jargon or slang as may be drawn from Web tagging or folksonomies. The term semset has been chosen to distinguish this consolidated meaning.
Semsets may apply to either subject concepts or named entities. In the latter case, their use is closer to the sense of an alias (such as a nickname, for example 'Great Satan' or 'Uncle Sam' for the United States).
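As a rough illustration of how semsets could support term matching, the following Python sketch builds a reverse index from terms to the identifiers whose semset contains them. The `build_semset_index` function and the `sc:` and `ne:` identifiers are hypothetical, and the sample terms are assumptions rather than actual UMBEL semset contents.

```python
from collections import defaultdict


def build_semset_index(semsets):
    """Map each case-folded term to the identifiers whose semset contains it."""
    index = defaultdict(set)
    for identifier, terms in semsets.items():
        for term in terms:
            index[term.casefold()].add(identifier)
    return index


# Illustrative semsets: one for a named entity, one for a subject concept.
semsets = {
    "ne:United_States": {"United States", "USA", "Uncle Sam", "Great Satan"},
    "sc:Country": {"country", "nation"},
}

index = build_semset_index(semsets)
alias_matches = index.get("uncle sam", set())
concept_matches = index.get("Nation".casefold(), set())
```

Each term maps to a set rather than a single identifier because the same phrase may plausibly appear in more than one semset.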
Abstract concepts represent abstract or ephemeral notions such as truth, beauty, evil or justice, or are thought constructs that are useful for organizing or categorizing things but are not readily seen in the experiential world. They are included in the UMBEL specification because they help maintain the integrity of the UMBEL subject concept graph.
Like subject concepts, abstract concepts are based strictly on those already in OpenCyc. Abstract concepts may be viewed in the UMBEL graph, and may be used for ontology mapping, but are not generally displayed when doing standard content mapping or concept look-ups via Web services. For various domain extraction or relatedness determinations, abstract concepts may be excluded from UMBEL's internal processing.
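One way to picture this behavior is a look-up that hides abstract concepts unless they are explicitly requested. The sketch below is an assumption about how such filtering might be written, not a description of UMBEL's Web services; `ConceptKind`, `lookup_results` and the identifiers are hypothetical.

```python
from enum import Enum


class ConceptKind(Enum):
    SUBJECT = "subject"
    ABSTRACT = "abstract"


def lookup_results(candidates, include_abstract=False):
    """Return identifiers from (identifier, kind) pairs, hiding abstract concepts by default."""
    return [
        identifier
        for identifier, kind in candidates
        if include_abstract or kind is ConceptKind.SUBJECT
    ]


# Illustrative candidates, as a graph traversal might return them.
candidates = [
    ("sc:Dog", ConceptKind.SUBJECT),
    ("ac:Truth", ConceptKind.ABSTRACT),
    ("sc:Automobile", ConceptKind.SUBJECT),
]

content_mapping_view = lookup_results(candidates)
ontology_mapping_view = lookup_results(candidates, include_abstract=True)
```

Keeping the abstract concepts in the data and filtering only at display time mirrors the text: they remain available for ontology mapping while staying out of standard look-ups.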
Named entities are the real things or instances in the world that are themselves natural and notable class members of subject concepts. The initial named entities are drawn from Wikipedia as processed via YAGO, and other online fact-based repositories. Named entities are the instances of the subject concepts in the standard definition of the term.
Named entities and the sources for them are also a major avenue for growth and expansion of UMBEL moving forward. Named entities are more contemporary and changing, while the reference subject concept backbone is more fixed and stable.
Each named entity is mapped to a governing subject concept for ontology purposes. There are no relations between named entities except as mediated through one or more subject concepts. As noted, named entities may also have semset aliases.
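The mediation of relations through subject concepts can be sketched in Python as follows. The mapping, the `related_entities` helper and all identifiers are hypothetical and serve only to illustrate the principle that two named entities are related solely by sharing a subject concept.

```python
def related_entities(entity, entity_to_concepts):
    """Return the other named entities that share at least one subject concept with `entity`."""
    own_concepts = entity_to_concepts.get(entity, set())
    return {
        other
        for other, concepts in entity_to_concepts.items()
        if other != entity and own_concepts & concepts
    }


# Illustrative mapping of named entities to their governing subject concepts.
entity_to_concepts = {
    "ne:Eiffel_Tower": {"sc:Tower"},
    "ne:CN_Tower": {"sc:Tower"},
    "ne:Mona_Lisa": {"sc:Painting"},
}

towers_like_eiffel = related_entities("ne:Eiffel_Tower", entity_to_concepts)
paintings_like_mona_lisa = related_entities("ne:Mona_Lisa", entity_to_concepts)
```

No entity-to-entity links are stored anywhere in this sketch; relatedness is derived entirely from the shared concept, which is what allows new named entities to be added without touching the concept backbone.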
The following table helps draw the distinction between subject concepts and abstract concepts. Inspect the UMBEL documentation to see the 740 or so abstract concepts presently within UMBEL. Looking at those can help draw the distinction.
| Subject Concepts | Abstract Concepts |
The following table helps draw the distinction between subject concepts and named entities. More explanation is also provided in the documentation.
The distinction between these two categories is not always clear. For example, most geographical places clearly belong to the named entity category. On somewhat arbitrary grounds, however, all nations, countries, states and provinces were assigned as subject concepts so that they would act as classes with other entities mapped to them. It should also be noted that entities or concepts in the gray zone may be treated both as a named entity and as a subject concept.
| Subject Concepts | Named Entities |
Though there are shades of gray between subject concepts and named entities, we have generally found this distinction to be a powerful means for gaining clarity in UMBEL's design. It provides a clean path for keeping an ontology lightweight while in essence providing infinite extensibility for all manner of named entities and the datasources that contain them. Moreover, the ability to classify named entities into types orthogonal to subject concepts also provides useful guidance for presentation templates that may be automatically invoked in data meshups.