Sunday, August 30, 2009

What price SITP data quality?

More lessons from CRM - CRM systems, like SITP systems, are largely oriented at making decisions (rather than supporting transactions) i.e. more shameless plagiarism (or learning from more mature disciplines, as I prefer to think of it).

See the source reference here: http://www.computerworld.com/s/article/9135688/What_Price_CRM_Data_Quality_?source=CTWNLE_nlt_entsoft_2009-07-23

In an SITP system the data does not have to be perfect to be useful. Many things can be approximations or can be missing altogether. So how do you decide where to make your data quality investment?

The first step is to separate the data elements (objects and relationships and their properties) into three categories: those that:
(1) must be there and must be correct to prevent corruption in external systems or misrepresentation of the business;
(2) should be correct for the SITP to work at all;
(3) people have asked for in order to make decisions or record the state of the enterprise.

A data quality analysis on each data element in the three categories can be scored based on questions such as:
- Ownership - does it have an undisputed owner, is it updated by specific roles as part of a formal business process, or can nearly anyone update it in an ad-hoc fashion?
- Validity and auditability - does it have validation on entry, and does an audit trail track changes?
- Completeness and correctness - how complete is the source data, and how much is missing, clearly incorrect, or duplicated?
Developing an understanding of this metadata is a first step that needs to be taken when looking at data preparation and loading.
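As a sketch of how the categories and scoring questions above might be recorded, consider the following (the element names, category assignments, scales and scores are all hypothetical illustrations, not a prescribed scheme):

```python
# Hypothetical sketch: recording SITP data elements against the three
# categories and scoring them on the three quality dimensions above.
# All names, scales and values here are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class DataElement:
    name: str
    category: int        # 1 = must be correct, 2 = needed for SITP to work, 3 = asked for
    ownership: int       # 0 = ad-hoc updates, 1 = role/process controlled, 2 = undisputed owner
    validation: int      # 0 = none, 1 = validated on entry, 2 = validated plus audit trail
    completeness: float  # fraction of records present and plausibly correct (0.0 - 1.0)

    def quality_score(self) -> float:
        # Simple illustrative score: average the three dimensions, each normalised to 0-1.
        return (self.ownership / 2 + self.validation / 2 + self.completeness) / 3

elements = [
    DataElement("application.owner", category=1, ownership=2, validation=2, completeness=0.95),
    DataElement("application.annual_cost", category=2, ownership=1, validation=1, completeness=0.60),
    DataElement("standard.lifecycle_stage", category=3, ownership=0, validation=0, completeness=0.30),
]

# Category-1 elements with low scores are the first remediation candidates.
for e in sorted(elements, key=lambda e: (e.category, e.quality_score())):
    print(f"cat {e.category}  {e.name:30s} score={e.quality_score():.2f}")
```

Even a crude register like this makes the remediation conversation concrete: it surfaces which high-stakes (category 1) elements score worst.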

A data element in an SITP system is there because it was required to answer a specific business question and/or because the SITP system is the source of record for that data.

Analysis will discover which data elements are often missing or wrong. Which data is most often found missing depends on the concepts being dealt with, e.g.
- Business goals and strategies - will often lack: weighting and explicit relationships
- Applications - will often lack: a range of cost information, related standards and business processes
- Standards - will often lack: lifecycles, basis of current preferences, current and planned usage

Historically, where there is no SITP solution, it is difficult to collect some of the data in the first place, there are many ways for the meaning of the data to be misinterpreted or misrepresented, and there is often no easy way of usefully applying it (i.e. it is recorded as an academic exercise and not tested through the fire of use).

In many cases, you can't afford to spend too much on data quality before the system is implemented. What you need to understand is the metadata. If the business question needing this data is materially affected by its quality, you will need to carefully weigh the cost of remediating the data against the impact the improvement in its quality will have on the decisions being made based on it.

You can set business rules that allow you to determine when it's worth chasing a data element's quality and when it isn't.

It gets exponentially more expensive to improve data quality. If it costs $X to get solid data quality on 2/3rds of your records, it will probably cost $2X to get the data quality right on half of the remaining records, and $4X to get right half of what is then left.
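The doubling described above can be sketched as a simple geometric model (a hypothetical illustration of the cost curve, with X as an arbitrary cost unit and the 2/3-then-halving pattern taken from the paragraph above):

```python
# Hypothetical cost model for the doubling described above: the first step
# fixes 2/3 of records for cost X; each later step fixes half of what
# remains at double the previous step's cost. X is an arbitrary cost unit.

def remediation_steps(x: float, steps: int):
    """Yield (cumulative_coverage, cumulative_cost) after each remediation step."""
    coverage, cost, step_cost = 2 / 3, x, x
    yield coverage, cost
    for _ in range(steps - 1):
        step_cost *= 2                    # each step costs double the last
        cost += step_cost
        coverage += (1 - coverage) / 2    # and fixes only half of what remains
        yield coverage, cost

for cov, cost in remediation_steps(1.0, 4):
    print(f"coverage {cov:6.1%}  cumulative cost {cost:4.1f}X")
```

The diminishing returns show up quickly: by the fourth step the cumulative cost has grown 15-fold while coverage has improved by less than a third, which is why chasing the last few percent of historical data quality is rarely worth it.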

For most purposes it is hard to justify a lot of remediation on historical data; one is better off investing the time in improving the way data is captured on an ongoing basis. In many cases it is sufficient to start capturing data going forward on a JIT basis, or as data is used. Over a short space of time, i.e. a year or two, most of the data needed for SITP will accrete as a natural by-product of improved processes (of course, adapting the processes to capitalise on the SITP is a key step to enabling this).

Wednesday, August 26, 2009

EA Frameworks

Every time one looks at implementing solutions to support strategic IT planning, transformation, optimisation and governance, one is asked whether one supports one or more of the common EA frameworks.

The use of the word "frameworks" gives a good insight into EA as a practice. The major EA frameworks (FEAF, DODAF, TOGAF, Zachman) mean different things by the term (taxonomy, method, reference models, architectural styles) and play different roles.

In my experience people who ask about this seldom have a good grasp of the real questions - so I usually ask them to describe what they are trying to achieve. This usually provokes frustrated responses, because they don't know what they want to achieve; they just want to follow the framework du jour.

It would perhaps be more acceptable to blithely follow these frameworks if, over more than a decade, they had proved widely successful.

[WIP]

Tuesday, August 11, 2009

EA-Emperor's new clothes - from Gartner

According to this emergent architecture item, Gartner now claims EAs must adopt a new style of enterprise architecture - 'emergent architecture', also known as middle-out EA and light EA - and sets out definitions of the new approach. I see this as Gartner starting to slowly realise that they have long confused Enterprise and Solutions Architecture (doing neither constituency any good).

I would argue that what is now being advocated has always been best practice i.e.
  • Architect the lines, not the boxes i.e. the connections between different parts of the business rather than the actual parts of the business themselves. [this is why the focus on SW-development-oriented modelling and complex BP modelling/simulation has seldom made sense]
  • Model all relationships as interactions via some set of interfaces [obviously - which is why I have argued for a decade that the venerated Zachman model needs another column, i.e. one on presentation/interfaces (user, system)]

And regarding the 7 "new" principles:
  1. Non-deterministic - "they instead must decentralise decision-making". The term non-deterministic is a misnomer. What you really want is deterministic decision making carried out in a decentralised, objective and transparent way.
  2. Autonomous actors - "EAs must now recognise the broader business ecosystem and devolve control to constituents". EAs have NEVER controlled all aspects of architecture. One is tempted to suggest EAs should also consider how they work with the IT ecosystem.
  3. Rule-bound actors - "EAs must now define a minimal set of rules and enable choice". EAs have NEVER provided detailed design specifications for all aspects of the EA (this is confusing Enterprise and Solution Architecture).
  4. Goal-oriented actors - "Each constituent acting in their own best interests". It is just silly to suggest anything else has ever really happened in the real world.
  5. Local Influences: "People are influenced by local interactions and limited information. Feedback within their sphere of communication alters the behaviour of individuals. No one has data about all of an emergent system. EA must increasingly coordinate". [Putting aside the absurd use of the word "Actor" in the original.] Yawn - nothing new here. However, one wonders if the use of the word "system" belies a recidivist tendency towards again confusing Enterprise and Solutions Architecture.
  6. Dynamic or Adaptive Systems: "System changes over time. EA must design emergent systems [that] sense and respond to changes in their environment". Again, nothing new.
  7. Resource-Constrained Environment: "emergence; rather, the scarcity of resources drives emergence". Necessity is the mother of invention [an observation over two thousand years old].
Nevertheless it is helpful to recognise that five of Gartner's "new" points, if restated, are worth reiterating:
  1. Decentralised decision making needs to be facilitated (and therefore access to information) - so we need to make information available to people in many places (in a way that suits them)
  2. Engaging with all constituencies is key - including the broader business ecosystem, and devolving control to constituents (and of course the IT ecosystem)
  3. Publishing rules (standards, patterns etc.) is critical - so we can let solutions architects, and specialised technical architects do their job.
  4. Making decisions objectively and transparently is critical - as constituencies will always act in their own best interest - so we need to understand, and be able to examine/question, why different conclusions are reached.
  5. Facilitating KM and collaboration is fundamental - rather than trying to control all the information, and create all the artefacts (or pretending that the knowledge can "live" in documentary artefacts).

Sunday, August 9, 2009

Reference models - what examples exist in mature domains

To gain an insight into reference models we can examine how more mature industries deal with them. I am looking at this through a very narrow lens, focusing on patterns for the built environment.

In Alexander's seminal work on patterns he describes a set of patterns, some of which relate to each other. The patterns are ordered by the physical size of what they apply to, i.e. first come those applying to large areas (e.g. towns), then come patterns relating to buildings, then to spaces in buildings (e.g. rooms), then patterns relating to elements within buildings (e.g. column details).

We could regard these as a reference model (RM) for design.

These patterns as a whole sit beside an extant set of reference models, some of which are so well known that they hardly need articulation, e.g. lists of:
  • types of space - rooms - a dwelling could have;
  • functions a dwelling could facilitate;
  • measures: structure, construction, lighting, acoustics etc.;
As well as beside reference models that are encapsulated within various regulations and laws, or by technologies
  • building best practice (codes)
  • products
  • safety and efficiency
  • etc.
Lastly we could consider the set of psychological and physiological requirements people have for a building:
  • warmth, quiet
  • peaceful, welcoming
The original work, while seminal, has several limitations resulting from a lack of semantic precision and the form of the work (a document). These are that:
  • it does not relate the patterns RM to the other RMs (requirements, space-types, functions, products etc.), which as a set are the constraints that reflect requirements and technologies;
  • the mode of presentation does not allow one to automate the analysis of the use of patterns for referential or inferential integrity.
These limitations can now be easily addressed with modern methods of managing reference models.

Perhaps in the architectural domain these can be effectively dealt with by people (though looking at some buildings one sometimes wonders). But I don't think this is true in some other, increasingly complex, domains where technology and expectations change too rapidly to allow best practice to be encapsulated in documents, taught in a professional school, and reflected in regulations.