Wednesday, July 22, 2009

Semantic precision and documents

Modelling semantics provides all of the following: glossary, controlled vocabulary, data dictionary (objects, properties, visualisations), data model (relationships, cardinality, layouts), taxonomy (inheritance hierarchy, and domains), ontology (where URLs relate data).

When determining semantic precision one would assess whether the representation is
  • correct: consistency of syntax and semantics
  • translatable: can it be transformed while ensuring semantic equivalence
  • analysable: can it be searched, queried and reasoned on. Can logic be applied, can it be "measured" directly or does the representation require interpretation.
  • integratable: can it be related to other knowledge in other forms
  • complete: adequacy, expressivity, scope
  • extendable: is it difficult to define new concepts
  • concise: does it record things as efficiently as possible, with no unintended duplication.
These definitions concern what a system or application can do, not what a person can do through interpretation or inference. Someone looking at a picture may be able to analyse it and integrate it with concepts they already hold, but a system or application can't.

When we examine the utility of documents (e.g. Word, PowerPoint, Visio diagrams) we can see that they are not a very good way of ensuring semantic precision. They don't ensure correctness, they are not easily translatable or integratable, and they cannot be analysed or reasoned on. They can be complete, though they seldom are, and the form doesn't require completeness (e.g. there is no validation of cardinality). Likewise they are seldom concise. They are extendable.
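To illustrate what "validation of cardinality" could look like once the information is held in a model rather than a document, here is a minimal sketch. The Relationship and Entity classes, the "owned_by" rule and the example process are all hypothetical, invented for illustration; they are not taken from any particular modelling tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Relationship:
    """Hypothetical relationship type with cardinality bounds."""
    name: str
    min_targets: int
    max_targets: Optional[int] = None  # None means no upper bound


@dataclass
class Entity:
    """Hypothetical modelled object with named links to other objects."""
    name: str
    links: Dict[str, List[str]] = field(default_factory=dict)


def cardinality_errors(entity: Entity, rules: List[Relationship]) -> List[str]:
    """Return one message per relationship whose bounds the entity violates."""
    errors = []
    for rule in rules:
        count = len(entity.links.get(rule.name, []))
        if count < rule.min_targets:
            errors.append(
                f"{entity.name}: '{rule.name}' has {count} target(s), "
                f"expected at least {rule.min_targets}"
            )
        elif rule.max_targets is not None and count > rule.max_targets:
            errors.append(
                f"{entity.name}: '{rule.name}' has {count} target(s), "
                f"expected at most {rule.max_targets}"
            )
    return errors


# Assumed rule for illustration: every process has exactly one owner.
rules = [Relationship("owned_by", min_targets=1, max_targets=1)]

process = Entity("Approve invoice")          # no owner recorded yet
missing_owner = cardinality_errors(process, rules)

process.links["owned_by"] = ["Finance"]      # owner added
after_fix = cardinality_errors(process, rules)
```

A Word document or Visio diagram describing the same process has no equivalent check: an omitted owner is only found if a reader happens to notice it.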

Diagrams pose special problems. In some mature domains semiotics and iconography have matured so that images can convey semantics with some precision. Sadly, when one looks at many Visio diagrams one finds that they don't follow any precise semantics: they don't have a controlled vocabulary, and spatial concepts (e.g. alignment, containment, proximity and connecting lines) don't have explicit meaning and are used inconsistently. This lack of precision means they cannot easily be analysed, integrated or, often most importantly, translated.

People often seek to reuse the information in documents. This means they need to be able to be translated and integrated. It is often suggested that with some remediation (i.e. by a person) these documents can be made "correct", "complete" and "concise" so that this is viable. In practice it is almost always more efficient for a person to examine them and model them directly in some form that ensures semantic precision - rather than attempt to correct them, then translate them and then deal with any translation errors (or remaining incorrectness in the source). The very act of fixing them is more effort than the modelling.

The metamodel provides the modelling semantics in a central repository. A metamodel can consist of a set of metamodels each covering a different domain of interest. Processes are required to manage change (approval processes associated with the creation or changes) and ensure duplicate or redundant data elements are not introduced. This is more difficult than it appears as often multiple mechanisms are available for recording information e.g. a property of a person could be their last name, or their last name could be recorded as a relationship to a family.
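The last-name example can be sketched as a registry that refuses a second recording mechanism for a concept it already holds. This is only a sketch under assumptions: the Metamodel class, the concept name "family name" and the definition strings are hypothetical. It also assumes someone has already recognised that both mechanisms describe the same concept, which in practice is the hard part and is what the approval process is for.

```python
class Metamodel:
    """Hypothetical central registry: one recording mechanism per concept."""

    def __init__(self):
        self._elements = {}  # concept name -> (kind, definition)

    def register(self, concept, kind, definition):
        """Record how a concept is modelled, rejecting a second mechanism."""
        if concept in self._elements:
            existing_kind, existing_def = self._elements[concept]
            raise ValueError(
                f"'{concept}' is already recorded as a {existing_kind} "
                f"({existing_def})"
            )
        self._elements[concept] = (kind, definition)


metamodel = Metamodel()

# First mechanism: the last name is a property of the person.
metamodel.register("family name", "property", "Person.last_name")

# Second mechanism for the same concept: a relationship to a family.
try:
    metamodel.register(
        "family name", "relationship", "Person -member_of-> Family"
    )
    duplicate_rejected = False
    message = ""
except ValueError as err:
    duplicate_rejected = True
    message = str(err)
```

Either mechanism is defensible on its own; the point of the check is that the repository holds only one of them.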

I encounter these issues every day when looking at business or technology strategy, enterprise architecture, business analysis etc. The authors of the artefacts, and the small clique they have usually communicated with directly to explain things, don't see the problem. Everyone else who encounters the documents and tries to work with them finds problems resulting from semantic imprecision, i.e. the documents are incorrect, incomplete, untranslatable, unanalysable, unintegratable and not concise.

I would argue that this is a major problem with current approaches to the adoption of business plans and strategies in particular, i.e. when the foundation artefacts are semantically imprecise.
