Many years ago, brilliant innovators at Statistics Canada,
Letting the files out was step one, but the challenge was finding exactly where data on specific topics resided. Searching was like looking for a needle in a haystack until a standard called the Data Documentation Initiative (DDI) permitted indexing at the variable level, using XML structures combined with powerful search functionality to find buried data treasures and display them in context; see http://search1.odesi.ca/ as an example. The investment in the indexing was required before the full value of the access could be realized.
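To make the idea of variable-level indexing concrete, here is a minimal sketch in Python. It assumes a heavily simplified codebook that only borrows a few element names from DDI Codebook XML (`codeBook`, `dataDscr`, `var`, `labl`); the sample survey, the variable names, and the helper functions are all hypothetical, and a real DDI file is far richer than this. The point is only that once each variable carries a descriptive label, a searcher can land on the variable itself rather than on the file that contains it.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified codebook; not a complete or valid DDI document.
SAMPLE = """
<codeBook>
  <stdyDscr><titl>Household Spending Survey</titl></stdyDscr>
  <dataDscr>
    <var name="HHINC"><labl>Total household income</labl></var>
    <var name="TOBEXP"><labl>Household spending on tobacco</labl></var>
  </dataDscr>
</codeBook>
"""


def build_variable_index(xml_text):
    """Map each word in a variable label to (study, variable, label) entries."""
    root = ET.fromstring(xml_text)
    study = root.findtext(".//titl", default="(untitled study)")
    index = defaultdict(list)
    for var in root.iter("var"):
        label = var.findtext("labl", default="")
        # A set keeps one entry per variable even if a word repeats in its label.
        for word in set(re.findall(r"[a-z0-9]+", label.lower())):
            index[word].append((study, var.get("name"), label))
    return index


def search(index, term):
    """Return every variable whose label contains the term as a whole word."""
    return index.get(term.lower(), [])


index = build_variable_index(SAMPLE)
for study, name, label in search(index, "tobacco"):
    print(f"{study} / {name}: {label}")
```

The search result names the study as well as the variable, which is a small-scale version of displaying a hit in context.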
DLI and DDI came to mind recently during a discussion about the challenges of finding buried text information: information contained, for example, in printed or imaged reports but not detectable by a search engine or catalog search function because the catalog records or metadata for the reports aren't sufficiently detailed. The question was raised: "Could we go back and include in the catalog record the table of contents of a report and abstracts of chapters?" In other words, could we "liberate" the details held in the body of each report? The answer: Certainly we could, if we could afford it. In-depth indexing of information objects to provide intellectually value-added precision and granularity in searching is very expensive because it requires time-consuming work by professionals familiar with the subject matter at hand.
Digitization of print materials provides a major boost to information discoverability, but we are all familiar with the avalanche syndrome: what's the use of retrieving a huge list of documents containing a search term any old place in the text? The one paragraph we need right now to solve a particular problem is in there … but it is beyond our practical reach. It's ironic that without indexing, printed and electronic objects present similar retrieval challenges when it comes to unearthing the nugget buried on page 78.
The case for investing in indexing to assure future retrievability is obvious in concept. Funding the activity is another story. The takeaway for today's information object creators is simple: Whenever we consign an object to a repository, no matter how informal, let's be mindful not just of attaching metadata where possible but also of placing in prominent positions the vocabulary that might help future searchers dig it up.