What is Content?
Several internet articles and blogs address the meaning of content from an internet perspective. From this perspective, content is the (meaningful) stuff on a page, the presentation of information to the seeker.
But content from an operations-centric perspective is entirely different. Here, the databases and operational tools must hold content: data that reflects the information being sought in the pursuit of knowledge. Thus, paraphrasing Scottie Claiborne (http://www.successful-sites.com/articles/content-claiborne-content1.php), “content is the stuff in your operations system; good content is useful information”.
Therefore, content is the meaningful data and the presentation of this data as information.
Content can, and should, be redundant. Not redundant from a back-up perspective, but redundant from an information theory perspective: data that is inter-related and inter-correlated. (Data that is directly calculated need not be stored; however, the method of calculation may change, and therefore the original calculation may prove useful.) Inter-correlated data may be thought of in terms of weather: wind speed, temperature, pressure, humidity, etc. are individual, measurable values, but they inter-relate, and perfectly valid inferences may be made in the absence of one or more of these values. When the historical (temporal) and adjacent (geospatial) values are brought into the content, then, according to information theory, more and more redundancy exists within the dataset.
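One way to put a number on this kind of redundancy is sketched below. It is only an illustration under stated assumptions: the readings are invented and coarsely binned, and the measure used (the sum of the individual entropies minus the joint entropy, sometimes called total correlation) is one possible choice among several. The function names and sample values are hypothetical.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(columns):
    """Sum of per-column entropies minus the entropy of the joint rows.

    Zero means the columns look independent in this sample; larger values
    mean one column can be partly inferred from the others.
    """
    joint = list(zip(*columns))
    return sum(entropy(col) for col in columns) - entropy(joint)


# Invented, coarsely binned weather readings (assumption, not real data).
pressure = ["low", "low", "high", "high", "low", "high", "high", "low"]
humidity = ["wet", "wet", "dry", "dry", "wet", "dry", "dry", "wet"]
wind = ["calm", "gusty", "calm", "gusty", "calm", "gusty", "calm", "gusty"]

# Humidity tracks pressure exactly in this sample, so the pair is redundant.
print(redundancy([pressure, humidity]))

# Wind is unrelated to pressure in this sample, so the pair is not.
print(redundancy([pressure, wind]))
```

In this toy sample, humidity could be dropped and recovered from pressure, while wind could not. Real measurements would need sensible binning and far more rows before such a figure means much.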
Having identified the basis of content, the operations system designer should perform content analysis. Content analysis is both qualitative and quantitative, but careful attention to systems design and systems management will permit increased quantification of the results. In its most basic form, content analysis is the designer asking the questions: “What is the purpose of the data? What outcomes are expected from the data? How will the data be imparted to produce the desired behavior?”
So how do we quantify the importance of specific data / content? How do we choose which data / content to retain? These questions are so difficult to answer that the normal response is to save everything, forever. And since data not retained is data lost, and lost forever, this approach seems reasonable in a world of diminishing data storage costs. But then the cost and complexity of information retrieval increase.
The concept and complexity of data retrieval are left for another day…