The Organization of Information

Taylor and Joudrey (2012) concluded their book, The Organization of Information, by stating that there is much work to be done in both information organization and the development of retrieval systems. With the diffusion of information in today’s world, the effort to analyze, arrange, classify, and make readily available millions of resources is a task that requires sophisticated programming of bibliographic networks, as well as endless hours of critical and analytical work from trained catalogers or indexers. Taylor and Joudrey showed that, despite advances in technology, the human mind is still needed to interpret a myriad of information resources by providing subject analysis, controlled vocabulary, and classification schemes to the descriptive practice of knowledge management.

We have now witnessed almost two centuries of bibliographic control, with many of the foundational principles of cataloging and description still in use today. For example, collocation – the intellectual and proximal ordering of bibliographic materials – was an invention of nineteenth-century founders such as Anthony Panizzi, Charles Ammi Cutter, and Melvil Dewey. These individuals saw the importance of creating subject headings and classification rules, which libraries adopted shortly thereafter in the form of dictionary catalogues, indexes, thesauri, and subject lists. The goal of these systems was to classify the entirety of knowledge. This all started with the Dewey Decimal Classification system, which had ten main discipline classes with 10,000 subdivisions in which books could be classified. Cutter built on this approach in his Expansive Classification system, which used letters to represent subject classes. Cutter’s system ultimately found its way into the Library of Congress Classification system, rather to the chagrin of Dewey.

The development of computerized systems to aid in the structuring and retrieval of knowledge occurred in the late 1960s. Machine-Readable Cataloging (MARC) was introduced in 1968. MARC formatting allowed computers to read and encode bibliographic records by utilizing a numeric coding system that corresponded to the areas of description in a written catalog record. These codes contained “variable fields” for areas of variable length (such as a book title or author name); “control fields” for numeric data points (call numbers, ISBNs, LCCNs, etc.); and “fixed fields” for bibliographic data of a predetermined length and format, such as a three-letter language abbreviation.
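To make the three kinds of fields more concrete, here is a minimal sketch in Python. It is an illustration only, not a MARC parser: the grouping follows the description above rather than the full MARC 21 specification, the record identifier is hypothetical, and the helper names are my own. The tag numbers (100 for the author, 245 for the title, 008 for fixed-length data with the language code at positions 35 to 37) follow common MARC 21 usage.

```python
# Simplified, illustrative model of a MARC-style record (not a real MARC parser).
record = {
    # Control field: a record identifier (the value here is hypothetical).
    "control": {"001": "demo-0001"},
    # Fixed field: 40 characters of positional data; language sits at positions 35-37.
    "fixed": {"008": " " * 35 + "eng" + " " * 2},
    # Variable fields: free-length text stored in lettered subfields.
    "variable": {
        "100": {"a": "Melville, Herman"},  # main entry, personal name
        "245": {"a": "Moby Dick"},         # title statement
    },
}


def language_code(rec):
    """Read the three-letter language code by position from the fixed field."""
    return rec["fixed"]["008"][35:38]


def title(rec):
    """Read the title from subfield a of the variable-length 245 field."""
    return rec["variable"]["245"]["a"]


print(language_code(record), "|", title(record))
```

The point of the sketch is the contrast: the language code is found by counting character positions, while the title is found by its tag and subfield, however long it happens to be.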

Bibliographic networks were built to accommodate the MARC format. The first major network to emerge was the Ohio College Library Center, which became OCLC (Online Computer Library Center) and is still in use today. OCLC allows catalogers to import bibliographic records from a shared network of libraries and information resource centers. This importing is referred to as copy cataloging. A cataloger adds an already-cataloged record to their own system, engaging in authority work by ensuring that the record was copied from a reliable source like the Library of Congress authority files. Almost all public and academic libraries use OCLC, and this system has streamlined the work of cataloging in technical service departments. But it is important to note that this technology is almost fifty years old now. There are nascent trends in the world of information science that go beyond the reach of time-honored bibliographic networks.

The classical arrangement of knowledge mentioned above was based on a narrow set of information resources, primarily books. But not all resources that users need to be able to search and retrieve are biblio-centric. For example, an information seeker may need to find an artifact. Knowledge artifacts are as varied as the name implies. They can include sound recordings, historical objects, websites, performance art pieces, even concepts. This last example of “concepts” perhaps best illustrates the point. Indeed, a knowledge artifact can be purely conceptual or abstract in nature. Yet, as an artifact, it still needs to be described and collocated for information retrieval. This is done through a “technical reading” of the artifact, a process of critical analysis whereby the cataloger or indexer attempts to define the aboutness of a work.

The process of defining aboutness, referred to as subject analysis by Taylor and Joudrey, is at the heart of information organization. Subject analysis is arguably the most important part of cataloging work, and it is certainly the trickiest. In order to determine the aboutness of a work, the cataloger must be able to accurately represent a knowledge artifact. But the artifact in question might not contain any lexical content. In other words, it may be a nontextual information resource, and thus difficult to grasp intellectually without the creator’s original insight. Yet, as a cultural information resource, the knowledge artifact still has meaning, which requires it to be abstracted and indexed. How is this to be done? There is still debate among LIS professionals regarding the best practices for subject analysis. The common practice is to isolate subject keywords in an aboutness statement. However, aboutness statements impose the cataloger’s perceptions onto a work, classifying the artifact in a hierarchical manner which may not be culturally precise. Herein lies the danger of subject analysis.

This creates a dilemma for classification of knowledge artifacts. For instance, in order to make an information resource readily retrievable, controlled vocabulary is required. A controlled vocabulary is a set of specific terms used for describing all “like” resources. But, as we have seen, describing knowledge artifacts can be difficult. Indeed, sometimes during subject analysis, the cataloger can only describe the of-ness of an artifact (Taylor & Joudrey, 309). As a general rule, controlled vocabulary makes it easier to find resources in an information system. But if an original cataloger incorrectly represents a knowledge artifact, every surrogate record copied from that description will carry the same misrepresentation. Surrogate records can number into the hundreds of thousands. So if the goal of bibliographic networks is to create standardized subject headings in an interoperable system, then hundreds of thousands of inaccurate records could be created. Conversely, if controlled vocabulary is not used in the representation of a knowledge artifact, then that artifact will be all but impossible to retrieve in an information system. This is the dilemma of subject analysis.
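Both sides of this dilemma can be seen in a small Python sketch. It is a toy illustration under stated assumptions: the vocabulary, the record identifiers, and the function names are all invented here, and the headings only imitate the style of real subject headings.

```python
# Hypothetical controlled vocabulary: variant terms map to one authorized heading.
AUTHORIZED = {
    "whaling": "Whaling",
    "whale hunting": "Whaling",
    "whale fishery": "Whaling",
    "whales": "Whales",
    "cetaceans": "Whales",
}


def normalize(term):
    """Return the authorized heading for a term, or None if it is uncontrolled."""
    return AUTHORIZED.get(term.strip().lower())


def build_index(records):
    """Group record ids under authorized headings; uncontrolled terms are dropped."""
    index = {}
    for rec_id, terms in records.items():
        for term in terms:
            heading = normalize(term)
            if heading is not None:
                index.setdefault(heading, set()).add(rec_id)
    return index


records = {
    "r1": ["whale hunting"],
    "r2": ["Whaling"],
    "r3": ["cetaceans"],
    "r4": ["leviathan lore"],  # not in the vocabulary
}
index = build_index(records)
print(sorted(index["Whaling"]))
```

Records r1 and r2 are collocated under one heading even though their catalogers chose different words, which is the benefit. Record r4, described with an uncontrolled term, cannot be found through the index at all, and any record assigned the wrong heading would be retrieved confidently and wrongly.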

Another argument against classification schemes of the past is that they contain restrictive rules which hinder knowledge discovery. Knowledge discovery is the ability to make connections between wide-ranging subjects that otherwise would not be related in a traditional classification system. For example, we have entered an era where almost all data can be linked together in novel and entertaining ways. This is the basis for the Semantic Web. Internet users can link and categorize anything they want by creating tags or folksonomies that showcase niche interests and new subject matter. By analyzing the content of the Semantic Web, information scientists are working to harness these folksonomies to improve search engine functionality and retrieval tools. It is an exciting time, but it is also a daunting time. Intellectual mastery of the Semantic Web is necessary to preserve entrenched disciplines that contain thousands of years of knowledge.

In the future, newer forms of information systems will be tried and tested. These will include natural language processors and artificial intelligence systems. But bibliographic data will still be entered by humans through the process of cataloging and resource description. This task may become easier for catalogers and indexers as information systems improve their ability to offer suggestions or provide prepopulated subject headings. But just the same, the work will continue. Taylor and Joudrey illustrated that knowledge management is not perfect. There are flaws and implicit biases in subject analysis. But where data integrity for abstract and philosophical content is concerned, human intervention is still required. Indeed, knowledge is still the province of human beings, not machines.

The notorious case of self-censorship in the Fiske Report

Self-censorship in libraries occurred in the 1950s due to the fear of being “blacklisted,” an outcome produced by the McCarthy era and the House Un-American Activities Committee in their maniacal efforts to root out Soviet conspiratorial activity. On the heels of this shameful period in American history, libraries transitioned from a period of careful, “patriotic” book selection to the more enlightened practice of collecting materials on diffuse and even controversial subject matters.

The “Fiske Report,” conducted by Marjorie Fiske between 1956 and 1958, is a 1500-page study which focused on book selection and censorship practices in California libraries. The damning conclusion of that report was that librarians censored themselves, often shamelessly and habitually. Curiously, however, the Intellectual Freedom Committee of the California Library Association (CLA IFC) had already secured victory over McCarthyism, opposing blacklisting and loyalty oaths. Yet, self-censorship was still a reality in library selection processes. For the CLA IFC, the question became: Why is censorship continuing unabated after the pressure of McCarthyism has subsided? Indeed, after these late victories, the CLA IFC attempted to unravel the mystery. There was a search on for a new and unmasked “enemy,” as it were, of intellectual freedom who threatened the newly minted freedom-model of California libraries (if not libraries across the nation). This was the basis of the Fiske Report.

In outlining the goals of the report, intellectual freedom was established as a “sacred” principle that librarians were exhorted to uphold going forward in defiance of what happened during the McCarthy period. Why, then, did this professional call to action not permeate the institutional practice of librarianship?

Until recently, the research data from the Fiske Report went unquestioned. But Latham (2014) points out that there were many problems in Fiske’s original research strategy, and in Fiske’s assumptions about the leverage librarians had to effect real change. Latham has reinterpreted this data using a feminist approach. This makes perfect sense considering the nature of the research data. For instance, the entire report was predicated on female service-oriented librarians in the 1950s, when women were considered “timid” and “mousy.” Indeed, the notion of “women’s work” was still prevalent, a concept that goes back to the Cult of Domesticity and beyond.

The original report consisted of interview material with California public and school librarians. The gender ratio of the respondents was very unbalanced, with 87% of those interviewed being female librarians. Curiously, interview respondents occupying higher, more “elite” positions, like school administrators, were predominantly male: 47 out of 48 individuals, in fact (Latham, 58). Therefore, what we have here in the Fiske Report is not just a random gender ratio imbalance, but – given the social context of the day – a deeply gendered and sexist politics. This structural feature of 1950s librarianship went overlooked, and this is what Latham addresses, informed by the evidence of earlier studies from Serebnick (1979) and Stivers (2002).

Gender norms of the 1950s suggested that men had more authority than women in matters of social importance (religion, morality, politics, etc.). This is reflected in the statement made by Max Lerner, speaker at the UC Berkeley School of Librarianship symposium cited in the article. Lerner said, “Having only petticoats among teachers and, perhaps, among librarians, too is not entirely healthy…” (66). This unfair and glaringly sexist statement reflects the consciousness of the day, with which the symposium was rife. Indeed, the pre-eminent sociologist Talcott Parsons concluded from the Fiske Report that self-censorship was still an issue in libraries because female librarians could not handle the intellectual rigors of reestablishing the authority of the library’s intellectual freedom.

If librarian autonomy and the role of the board or school administrators were compared in the report, why did the reality of the situation escape Fiske, a woman herself? Admittedly, she was a bit of a high-brow. But her attitudes toward women should have been gentler than those of Lerner or Parsons. Moreover, bias should have been tempered by her research support, as Katherine G. Thayer joined the research team to provide a perspective on librarianship. Thayer was the head of the library school at UC Berkeley, and she surely would have had a more intimate understanding of the field at the time. But the sad reality is that the librarians who participated in the study derived little support from the male administrative hierarchy when it came to figuring out the best practices for reversing restrictive collection policies.

Finally, as was already mentioned, Fiske’s research was deeply flawed. Latham writes, “None of [Fiske’s] interviews were taped, and notes were handwritten. When interviewees objected to handwritten notes, the interviewer used memory to reconstruct the data after leaving the interview” (64). This is a big red flag. One does not simply – and certainly does not ethically – “fill in the blanks” when doing ethnographic research. When in doubt, clarification from the respondents should have been sought and obtained, with careful attention to the re-recording of participant perspectives. Therefore, the report suffers from a short-sightedness both in the integrity of its data and in its understanding of the cultural milieu of a male-dominated society.

The Organization of Knowledge: Then and Now

There are two traditional classification schemes for the organization of information. These we know fairly well. They are the Dewey Decimal System and the Library of Congress Classification System (LCCS). They are still used where the physical organization of library materials is concerned. But these systems and the logic they are based on have become problematized in our Internet age.

It is the growth of the Internet, and the ever-increasing diversity of electronic resources, that have spurred the need for change in organizing knowledge resources. Traditional catalogs, like the LCCS, relied on Library of Congress Subject Headings (LCSH) that were confined to subject disciplines. LCSH were relatively static, and they greatly restricted the number of access points that subject searches would yield. Thus, LCSH do not facilitate the greater resource or knowledge discovery that our twenty-first century explosion of information demands.

Information science professionals now see a need to base the creation of bibliographic records on an entity relationship model. By grouping all like-resources together in terms of a derivative concept or title, the process of resource discovery can be greatly enhanced.

New organizational methods and standards put in place by the Functional Requirements for Bibliographic Records (FRBR) and Resource Description and Access (RDA) are meant to streamline the process of resource discovery and make the bibliographic universe more easily navigable. The new languages designed by these frameworks embody a larger network of resources, not just traditional analog materials like books. By cataloging the “work, expression, manifestation, or item,” bibliographic subjects take on a whole new meaning and gain interrelationships with other format-specific materials.

FRBR and RDA are quite ingenious. For example, databases that have these four types of entities cataloged can yield search results that cross institutional boundaries. Indeed, if a user is looking for something related to, let us say, “Moby Dick,” they will encounter an entire family of works containing that title or subject when further narrowing results. This could lead the user not only to the original print work from Herman Melville, but also to other related items in the library like audiobooks, motion pictures, etc. Or perhaps other resources will be found in digital libraries, consortia, museums, or archives, such as first editions, manuscripts, artworks, plays, and other ephemera relating to Moby Dick, the historical enterprise of “whaling,” illegal whaling activity in contemporary society, or the biology of these ocean majesties.
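The four entity levels can be sketched as nested data structures in Python. This is a rough illustration of the hierarchy only, assuming invented class fields, sample data, and a helper named `formats_for`; it is not how FRBR or RDA records are actually encoded.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single physical or digital copy."""
    location: str


@dataclass
class Manifestation:
    """A published embodiment, such as a print book or an audiobook."""
    form: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of the work, such as a text or a spoken reading."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual creation."""
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)


def formats_for(work):
    """Collect every format in which the work is available, across expressions."""
    return sorted({m.form for e in work.expressions for m in e.manifestations})


# Illustrative sample data only.
moby = Work("Moby Dick", "Herman Melville", [
    Expression("English text", [
        Manifestation("print book", [Item("Main stacks")]),
        Manifestation("e-book"),
    ]),
    Expression("spoken-word reading", [Manifestation("audiobook")]),
])
print(formats_for(moby))  # ['audiobook', 'e-book', 'print book']
```

Because every copy hangs off a single work, a search that lands on the work can fan out to all of its formats, which is the collocation effect described above.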

Electronic retrieval systems have made it possible to retrieve bibliographic records from anywhere around the globe. As today’s information resources are shared, and indeed, born on the World Wide Web, the work of today’s catalogers and subject analysts is of a global scale. This is why it is so important to create and maintain systems like Resource Description Framework (RDF) and XML applications that can aggregate and logically order web resources. This is also why linked data and metadata are so important. These technologies illustrate how far the library profession has come in cataloging and making available bibliographic resources.
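As a rough illustration of how linked data orders web resources, here is a small Python sketch that stores statements as subject, predicate, object triples and queries them by pattern. It uses plain tuples rather than an RDF library. The `http://example.org/` identifiers are hypothetical and the `match` helper is my own; the predicates are taken from the Dublin Core terms vocabulary.

```python
DCT = "http://purl.org/dc/terms/"  # Dublin Core terms namespace
EX = "http://example.org/"         # hypothetical namespace for this sketch

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "work/moby-dick", DCT + "title", "Moby Dick"),
    (EX + "work/moby-dick", DCT + "creator", EX + "person/melville"),
    (EX + "audio/moby-dick", DCT + "isVersionOf", EX + "work/moby-dick"),
}


def match(s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which resources are versions of the novel?
for subject, _, _ in match(p=DCT + "isVersionOf", o=EX + "work/moby-dick"):
    print(subject)
```

Because the audiobook points at the same identifier as the novel, a system at another institution could add its own triples about that identifier and have them line up with these without any central catalog.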

John Y. Cole and the Library of Congress

In The Library of Congress and the Democratic Spirit, John Y. Cole – the current Library of Congress Historian – explores the history of the Library of Congress (LOC) and what he believes to be the democratic roots of America’s largest national library. Cole’s chapter was written on the coattails of the LOC’s bicentennial in 2000. His attitude toward the Library is considerably exalted, and as then-co-chair of the Bicentennial Steering Committee, it seems that Cole sought to extend the “romantic” narrative of the Library’s history, started by the historian James Truslow Adams.

While Cole puts emphasis on the expansion of the Library and its efforts to include more cultural artifacts (or Americana), he seems to implicitly believe that this expansion and inclusion is based on the Library’s historical principles laid down from the personal desires of Thomas Jefferson. This is a big declaration of faith in Jefferson’s intentions. Cole seems to believe that there were no prejudices in the founding father’s formation of the LOC. Perhaps it is because of Jefferson’s presumably “selfless” gesture of donating his entire library at Monticello after the fire in 1814. But Jefferson was going broke, and he sold his library to the LOC. This is an example of what can go wrong when we assume things about historical “great men.” Indeed, Jefferson may be one of those men who thought better than he actually lived, for he was not the man who turned the LOC into an institution of national significance.

It was Ainsworth Rand Spofford, the sixth Librarian of Congress, who emphasized the addition of non-legislative library materials, and in particular a growing collection of Americana. Cole writes that Spofford “had the vision, skill, and perseverance to capitalize on the Library’s claim to a national role” (Cole, 172). Indeed, the most important provision that Spofford introduced to the LOC was the copyright law of 1870, which “ensured the continuing development of the Library’s Americana collections, for it stipulated that two copies of every book, pamphlet, map, print, photograph, and piece of music registered for copyright be deposited in the Library” (Cole, 173). We can understand how much of a game-changer this decision was, because it greatly added to the wealth and democratic inclusion of American cultural history. This, however, was not a decision having anything to do with Thomas Jefferson. So the democratic threads going back farther than Spofford are, I think, questionable.

The succeeding Librarian of Congress, John Russell Young, further integrated multiculturalism into the LOC during the melting pot years. Young reached out to U.S. diplomatic and consular representatives and had them send cultural works belonging to the populations that were immigrating to the United States (Cole, 175). This was another democratizing move, again having nothing to do with Thomas Jefferson…

Cole ends his article with a summary of the last Librarian of Congress, James H. Billington, and his major achievements. Billington oversaw the Library between 1987 and 2015. During his tenure, the LOC established a presence on the World Wide Web, and much of the Americana collections were made available online. This greatly leveled access to the national Library. Cole says that this ushered in a new era of service and accessibility (Cole, 179).

So while I feel that Cole aptly highlighted the truly democratic changes that occurred during the centennial of the LOC, it remains dubious as to how these events were rooted in the “cradle of Jeffersonian democracy.”

LIS Evolves

In Chapter 5 of Foundations of Library and Information Science, Richard Rubin looks at the library profession and the history of its educational moorings. In the last quarter of the nineteenth century, library students trained under Melvil Dewey or his students from the Columbia School of Library Economy, the first library school, which opened on January 1, 1887. Course requirements were brief and narrowly focused on clerical work, taking as little as six months to complete for a certificate. The Carnegie Corporation was concerned that library-related professional competencies were dwarfed by those of other professions. To address this issue, C. C. Williamson conducted the Williamson Report, which redefined the nature and scope of library teaching. A pedagogy emerged with greater emphasis on applied theory rather than work routines or “task-specific rules” (Rubin, 249).

Rubin goes on to explain the contemporary trends in the library profession, trends which have been so disruptive that they have changed the ways we even talk about the profession. I liked the metaphor from Van House and Sutton (2000), suggesting that the library profession has moved from “the Ptolemaic information universe with the library at its center, to a dynamic, Copernican universe with information at its center…” (Rubin, 256). Indeed, the influence of the rapid growth of information in defining this field has been stark. Due to the natural evolution of information technologies, the traditional library profession has been forced to reassess its pedagogy.

It was interesting to learn that just fifteen years ago, critics declared library education to be in a state of crisis because there was no academic consensus on what Library schools should focus on. For instance, should iSchools focus on Library Service or Information Science? Or, as Rubin frames the issue: the library-service paradigm or the information science paradigm? Rubin concludes that these two perspectives are not incompatible. In fact, understanding socially-critical cultural values and the forms/functions of information technology are equally important for today’s library professional.

Rubin offers up three models that distinguish the LIS professional from other professions. He discusses the Trait Model, the Control Model, and the Values Model. The Trait Model identifies LIS professionals as being service-oriented, theoretically informed, competent in library clerical duties, active in professional associations, and cognizant of the ALA core values. The Control Model, borrowed from Winter (1988), emphasizes a stricter and more hierarchical mindset. The Control Model, as I understand it, has more to do with information-based library professionals who catalog and organize knowledge: in other words, technical service professionals who classify, index, and determine library collections. This is regarded as a kind of “power” that is imposed on outside users from within the library as a centralized institution. Finally, the Values Model identifies LIS professionals who focus on fundamental library values such as intellectual freedom and democratic access.

Rubin’s section on the future of LIS professionals is appreciated for its optimistic outlook. For years, library professionals have been predicting a doomsday scenario for their careers. But I feel that there is a lot more certainty nowadays when it comes to staying employed in this field. Indeed, the outlook has changed significantly in the past 10, or even 5 years. Today, there is a big professional push toward e-resources and digitization. In terms of archives or special collections, these activities are admittedly finite. Given enough time, collections will have been digitized, and the amount of work and employment available in this professional aspect will fade. But there is still an ongoing need for classifying and making available e-resources, as well as born-digital content. There is also an ongoing need for traditional librarians and clerical staff. True, digital architecture and digital assets may be ephemeral jobs on a long-term occupational time-scale, but there will still be a need for reference and user services in various fields. As well, I see librarians taking on – as Rubin mentioned via Lankes (2011) – a more “participatory” role in society. Think tanks and the blogosphere – or biblioblogosphere – are the ready province of 21st-century librarians. Research is also an inexhaustible area. With time, there is an ever-expanding amount of informational resources that need to be preserved, collocated, and brought to mind instead of being left to deteriorate or obsolesce in some forgotten space. Moreover, future generations will need education in literacy, reading, and lifelong learning. Indeed, as long as there is a society with human problems to be solved, there exists a need for an LIS professional.