When viewing detailed catalog records, I often notice that catalogers use controlled vocabulary for their subject terms. This is commonplace for bibliographic records, because such records belong to large catalogs built around controlled linking, collocation, and browsing. But subject terms, whether single-word or compound, rarely give users enough information to determine the aboutness of a resource.
Oksana L. Zavalina’s article “Complementarity in Subject Metadata in Large-Scale Digital Libraries: A Comparative Analysis” shows how important the Description metadata element can be when catalogers use it. Zavalina indicated that free-text descriptive metadata in the Description element of an item’s record will invariably complement the topical subject values supplied by controlled vocabularies. Further, complementary relationships exist not only between the Description and Subject element sets, but also between Description and Object Type, as well as the Temporal elements. The only element set that does not enjoy significant augmentation from free-text description is the Spatial refinement. It makes sense that geographic data is complemented the least, given the specific nature of geographic locations and the availability of geographic designators such as the ISO 3166 country codes.
There is some concern that high levels of complementarity could be viewed as a sign of insufficient granularity in metadata design. The contention is that if one record represents a single resource, as in the 1:1 principle, then one metadata statement should represent one discrete property of an object. In other words, there should be no redundant data values in a catalog record. The number one rule in relational database design is to avoid redundant data, and I’m sure the same principle applies to information retrieval system design. Still, I do not see the harm in having redundant data values within a single item-level record (beyond the practical concern of digital storage space, which is hardly an issue with today’s server capacities). Granted, you do not want to violate the 1:1 principle by creating multiple item-level records that stand in as representational surrogates for the same resource. But having some redundant data values in the same item-level record, while unnecessary, does not strike me as harmful.
In any case, complementarity does not imply redundancy. The Description element is meant to give users a fuller picture of the intellectual character of a resource. The example (Figure 4) in the Zavalina article provided several new concepts that were not apparent among the other elements, such as “children’s lore,” “foodways,” and “Native American culture.” Without reading the Description element, it would not have been clear that the resource covered these topics, which could be of tremendous value to children’s historians, dietitians, or Native American scholars.
Perhaps descriptive statements should be shortened to include only the new information supplied by the cataloger’s observations. Any data that merely reiterates values already present in other elements could be eliminated from the final, uploaded version of the statement. This would be one way of getting rid of redundant data. But truthfully, much of this seems like hair-splitting to me. The important point is having a Dublin Core (DC) element that provides intellectual contextualization of a resource.
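To make the trimming idea concrete, here is a minimal sketch in Python of how reiterated values might be filtered out of a draft description. Everything in it is an assumption made for illustration: the function names, the idea of working from a list of candidate phrases, and the sample record are all hypothetical, and only the three "new" concepts echo the Zavalina example. The Subject values are invented, not taken from the article.

```python
import re


def normalize(term):
    """Lowercase a term and collapse whitespace so comparisons are case-insensitive."""
    return re.sub(r"\s+", " ", term.strip().lower())


def new_concepts(description_phrases, subject_terms):
    """Return the description phrases that do not repeat an existing subject term.

    This is a naive exact-match comparison after normalization; it does not
    detect synonyms or broader/narrower relationships.
    """
    known = {normalize(term) for term in subject_terms}
    return [p for p in description_phrases if normalize(p) not in known]


# Hypothetical record: the Subject values below are invented for illustration.
record = {
    "subject": ["Folklore", "Oral history"],
    "description_phrases": [
        "folklore",
        "children's lore",
        "foodways",
        "Native American culture",
    ],
}

complementary = new_concepts(record["description_phrases"], record["subject"])
print(complementary)
```

In this toy record, "folklore" is dropped because it repeats a Subject value, while the remaining phrases survive as complementary information. A real workflow would still depend on the cataloger's judgment, since exact matching cannot tell when a free-text phrase and a controlled term mean the same thing.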
As important as free-text description is, controlled vocabulary sometimes provides a proper term that escapes the cataloger during free-text entry. This simply demonstrates the vagaries of subject analysis, and it reminds me of Miller’s observation that the practice of subject analysis is “inherently mired in the subjectivity and ambiguity of human thought processes; personal, cultural, social, linguistic, and subject knowledge limitations and potential biases, whether conscious or unconscious, and the ambiguity of human language itself” (Miller, 99).
Clearly, descriptive metadata is helpful for contextualizing records and giving users an intelligible framework for understanding the resource on an intellectual level. A short description of the resource is now often required of catalogers, because the description is what gives users a sense of how relevant the resource is to their queries or interests. Whether or not full-text indexing is available through the content management system seems irrelevant to me because, fundamentally, the user should be able to understand intellectually what the item-level record signifies.