Tuesday, September 18, 2012

Unstructured vs Structured

There are many terms I dislike, from the misuse of the term innovation to the whole computer utility vs cloud debate. Another example is the whole unstructured vs structured data argument.

The terms unstructured vs structured imply there is a difference in the data sets themselves, i.e. that unstructured data is by its very nature unstructured and unlike structured data. The framing implies this is a permanent state of affairs: a set of data which has no structure and therefore cannot be modeled.

However, our entire history of scientific endeavour can be broken down into the discovery of data we didn't understand, the creation of models to explain that data and, finally, data we now understand. In other words, we constantly move from unstructured to structured via the creation of a model.

Hence I prefer the terms un-modeled data vs modeled data. Inherently this implies that the difference lies not in the data sets but simply in our ability to model them. It also implies that there will be movement from one to the other.
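
As a rough illustration of that movement, here is a minimal Python sketch. Everything in it is an assumption made up for the example: the sample lines, the `MODEL` pattern and the `apply_model` helper are hypothetical and not drawn from any real system. The point is only that the data never changes; what changes is whether we have a model that explains it.

```python
import re

# Hypothetical raw data. Nothing about these lines is inherently
# "unstructured"; we simply have not applied a model to them yet.
raw_lines = [
    "2012-09-18 10:02:11 login user=alice",
    "2012-09-18 10:05:42 logout user=alice",
    "the cat sat on the mat",
]

# The model: an assumed pattern describing what we currently understand.
MODEL = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<event>\w+) user=(?P<user>\w+)"
)

def apply_model(lines, model):
    """Split lines into those the model explains and those it does not."""
    modeled, unmodeled = [], []
    for line in lines:
        match = model.fullmatch(line)
        if match:
            # The same data, now described by named fields.
            modeled.append(match.groupdict())
        else:
            # Still un-modeled: a gap in our model, not a property of the data.
            unmodeled.append(line)
    return modeled, unmodeled

modeled, unmodeled = apply_model(raw_lines, MODEL)
```

Whatever lands in `unmodeled` stays there only until someone writes a model that covers it, at which point it moves across without the data itself having changed.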

What is your view? Am I the only person who dislikes this framing of unstructured vs structured?