Information Retrieval in Digital Environments

Searching for information in digital environments can be a difficult task. So much information is available today that information overload is a pervasive problem in society. This overload may be more related to cultural attachment to Internet technologies and multimedia information, but even work-related information tasks are characterized by overload to the degree that feelings of anxiety and uncertainty are endemic to the information search process. Choosing where to even begin a search online relies on careful evaluation of information retrieval (IR) systems. Evaluation is required in order to determine what an IR system’s functionalities are, and whether or not the system can provide relevant results to the user. Given that there are hundreds of IR systems available, the feeling of being overwhelmed can arise at the beginning of the search process and persist throughout the experience of using an IR system.

In general, there are four different types of IR systems. These are online databases, web search engines, online public access catalogs (OPACs) and digital libraries. Each of these systems is designed to facilitate a user’s information requirements. The ultimate goal of each system is to satisfy a user’s request for information without the presence of an intermediary or help from a system consultant. The IR system is designed to be a standalone interface that can be used by individuals who have unique and specific information needs. This gives the user a fair amount of power to control their informational environment and find information that is not influenced or biased by the selection procedures of another person, namely a librarian.

Looking at each IR system in turn, it becomes evident that system design is a complex issue. Because users are interacting with a system instead of another human being, there is no way to readily assess users’ level of expertise or aptitude for information retrieval. Therefore, unlike a reference librarian, an IR system cannot gauge a user’s skill set when they approach the system interface. This means that IR system designers need to consider a plethora of user competencies, search styles and search strategies. Digital literacy is an important consideration, because users need to understand how to use digital tools for information access and retrieval. Search bars, fields, limiters and sorting mechanisms may all seem simple to a digitally fluent person, but not every user is fluent, and the use of these tools is essential in the online information environment. Moreover, there is an underlying logic to these features. For instance, users need to be able to understand Boolean logic, truncation, wildcards, and phrase searching in order to narrow their results and improve precision. Being able to specify or assess format, document type, publication, and scholarliness is also a necessary skill for users in the information environment.
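To make the logic behind these features concrete, here is a minimal sketch in Python of how Boolean operators, truncation and phrase searching narrow a result set. It is a toy matcher over three made-up records, not how any particular database implements searching; the function names (tokenize, matches_term, search) and the sample records are invented for illustration.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_term(term, tokens):
    """Match one search term against a tokenized record.

    "quoted words"  -> phrase search (words must be adjacent, in order)
    stem*           -> truncation (any word beginning with the stem)
    word            -> exact word match
    """
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        phrase = tokenize(term[1:-1])
        n = len(phrase)
        return n > 0 and any(
            tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)
        )
    if term.endswith("*"):
        stem = term[:-1].lower()
        return any(tok.startswith(stem) for tok in tokens)
    return term.lower() in tokens

def search(docs, all_of=(), any_of=(), none_of=()):
    """Return ids of records matching every all_of term (AND),
    at least one any_of term (OR), and no none_of term (NOT)."""
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        if not all(matches_term(t, tokens) for t in all_of):
            continue
        if any_of and not any(matches_term(t, tokens) for t in any_of):
            continue
        if any(matches_term(t, tokens) for t in none_of):
            continue
        hits.append(doc_id)
    return hits

# Made-up sample records, for illustration only.
DOCS = {
    1: "Library catalogs and information retrieval",
    2: "Retrieval of multimedia information on the web",
    3: "Librarians and the reference interview",
}

print(search(DOCS, all_of=["librar*"]))                      # truncation
print(search(DOCS, all_of=['"information retrieval"']))      # phrase search
print(search(DOCS, all_of=["information"], none_of=["web"])) # AND ... NOT
```

The truncated term librar* picks up both "library" and "librarians", while the quoted phrase only matches the record in which the two words sit next to each other. That is the kind of narrowing a user has to understand in order to trade a large result set for a more precise one.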

Interaction design has occupied a large part of the information retrieval literature. The features mentioned above are usually indicative of online database design. But the other information retrieval environments (web search engines, OPACs and digital libraries) have begun to integrate sophisticated digital tools like these as well. The result is that the lines are beginning to blur between the four types of IR systems. This could be problematic, as standardized interface design could encourage searching habits that are not appropriate for all systems and information needs. For example, if all systems were based on the Z39.50 protocol, user searches would be limited to the Bib-1 Attribute Set. This would be adequate for known-item searching, like a bibliographic search, but the syntax is poorly suited to multimedia information searching. In other words, depending on the underlying database structure of an IR system, certain queries will work better than others. Identifying the right strategies for information searching is complex, and novice users will not understand the nuances of system mapping and indexing.
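As a rough illustration of what attribute-based searching looks like, the sketch below builds query strings in Prefix Query Format (PQF), a notation used by the YAZ toolkit for writing Z39.50 queries. The Use attribute numbers (4 for title, 7 for ISBN, 21 for subject heading, 1003 for author, 1016 for any field) come from the Bib-1 attribute set; the helper names pqf_term and pqf_and are hypothetical, and the code only assembles strings rather than talking to a server.

```python
# A few Bib-1 "Use" attributes (attribute type 1). The field labels on
# the left are this sketch's own; the numbers are Bib-1 values.
BIB1_USE = {
    "title": 4,
    "isbn": 7,
    "subject": 21,
    "author": 1003,
    "any": 1016,
}

def pqf_term(field, value):
    """Build one PQF clause such as: @attr 1=4 "Hamlet"."""
    if field not in BIB1_USE:
        raise ValueError(f"no Bib-1 Use attribute mapped for field: {field}")
    if '"' in value:
        # Quoting rules are deliberately not handled in this sketch.
        raise ValueError("double quotes are not supported in this sketch")
    return f'@attr 1={BIB1_USE[field]} "{value}"'

def pqf_and(*clauses):
    """Combine clauses with the prefix @and operator, left to right."""
    if not clauses:
        raise ValueError("at least one clause is required")
    query = clauses[0]
    for clause in clauses[1:]:
        query = f"@and {query} {clause}"
    return query

# A known-item search: the user already knows the author and the title.
print(pqf_and(pqf_term("author", "Shakespeare"), pqf_term("title", "Hamlet")))
```

Every clause has to name a bibliographic access point such as author, title or subject. That suits known-item searching, but it offers no obvious way to express a need like "footage of a storm at sea", which is why a bibliographic attribute set is an awkward fit for multimedia retrieval.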

The complexity of information retrieval is one reason why finding information online can be so difficult. Finding relevant documents is a skill that is seldom taught to students with any real exactitude. Each system also has limitations, which makes finding information difficult. For example, performing systematic searches for research is an exercise more suited to subject databases than web-based search engines. Finding information for research purposes is easier in an online database, because online databases will yield a higher proportion of relevant documents than web search engines. Indeed, there are plenty of problems with web search engines. Bates et al. (2017) demonstrated that the basic principles of Boolean logic might not apply in web search engines, as the order of concept groups is altered (p. 10). In other words, relevant documents will not be clustered with other potentially relevant documents, as most web search engines are based on ranked results and search engine optimization, especially in the case of Google or Google Scholar. Because of these more restrictive algorithms, many advanced search functions, such as filtering and automatic term mapping, are simply not available in web search engines.
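The proportion of relevant documents in a result set is usually called precision, and it is typically reported alongside recall, the share of all relevant documents that the search managed to find. The short Python sketch below shows how the two measures are calculated. The document ids and counts are hypothetical numbers chosen for illustration, not measurements of any real database or search engine.

```python
def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved, relevant):
    """Fraction of all relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)

# Hypothetical scenario: 40 documents (ids 0-39) are relevant to the need.
relevant_ids = range(40)

# A subject database returns 20 records, 15 of which are relevant.
database_results = list(range(15)) + list(range(100, 105))

# A web search returns 200 results, 30 of which are relevant.
web_results = list(range(30)) + list(range(200, 370))

print(precision(database_results, relevant_ids))  # 15 / 20
print(precision(web_results, relevant_ids))       # 30 / 200
print(recall(database_results, relevant_ids))     # 15 / 40
print(recall(web_results, relevant_ids))          # 30 / 40
```

In this invented scenario the web search finds more of the relevant documents overall, but the searcher has to wade through far more irrelevant results to reach them. That trade-off is what makes a focused database more comfortable for systematic research searching.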

There are limitations with the other IR systems as well. Kumar and Singh (2014) identified a number of problems with OPACs. As bibliographic databases, OPACs generally do not provide users with adequate help for query formulation; they do not convert keywords to terms used in the catalog; and users are usually left trying to determine subject headings and call numbers based on their inquiry terms. Moreover, as OPACs serve as resource guides, they are more likely to provide geo-mapped content than intellectual content, providing information on where to find collections, but not more in-depth information such as tables of contents, abstracts, or book reviews (p. 42). However, these features are becoming more common in WorldCat.

Digital libraries are often built for browsing collections. For digital libraries, browsing is a more important function, because digital libraries are often created by individual libraries, or consortia, and are not vast collections based on a content management system. In other words, the digital library is an IR system that is suitable for very specific information requests. But it is important to have an understanding of the institution and its collections before using this type of IR system. Digital libraries generally do not have robust search functions, if they have search functionality at all.

This short essay has tried to demonstrate the complexity of searching for information in online environments. Each environment is unique and provides challenges to end-users. People do not always know how to conduct searches in IR systems. In fact, most searches for information are performed in culturally normalized ways. People will opt for the principle of least effort when seeking information, which often leads to natural language searches and perfunctory searches on the Web. Database documentation and help literature are often too dry or boring for the average user to consider. But different IR environments have different search functions and features. Matching a user’s query to the right environment, choosing the right search strategy, and using the appropriate IR tools can admittedly be very difficult. As with all good results, some determination is required in order to navigate the difficult terrain.

