Design, query, and evaluate information retrieval systems

Introduction

In a world overflowing with information, information retrieval systems (IRS) play an important role in providing access to LIS collections. Whether through individual websites, search engines, intranets, email, social media platforms, OPACs, or other databases, IRSs support the storage, organization, and retrieval of content (Chowdhury, 2017). For LIS professionals, understanding how to design, query, and evaluate these systems is a core duty, as these tasks help meet the information needs of their patrons. 

Design

The design of an IRS begins with understanding both the collection and the users it will serve (Tucker, 2024; Weedman, 2018). A well-designed system considers several foundational questions:

  • What problems is the system intended to solve? 
  • Who are the users, and what are their information needs, behaviors, and search capabilities? 
  • What types of documents or records will the system manage, and what characteristics must be considered to ensure they are findable and usable?

IRS design requires consideration of technical infrastructure with the specific context of the collection and its users in mind. The process can involve iterative stages such as conducting preliminary research, planning the design, implementing it by creating or importing records, developing prototypes, and conducting user testing. Often, designers will need to revisit earlier stages of development after user testing in order to incorporate users’ feedback. These stages help designers fully consider user needs and collection characteristics during development.

A number of design principles reinforce an IRS. Controlled vocabularies, for example, can enable consistent and precise searching. These vocabularies include subject headings, descriptors, and indexing terms, and they ensure that similar concepts are grouped together even when different terms are used during searches. The way descriptors are applied also matters. According to Tucker (2024), in pre-coordinated systems, such as the Art & Architecture Thesaurus (AAT), complex terms are established in advance and structured hierarchically. In contrast, post-coordinated systems rely on searchers to combine individual terms (e.g., using Boolean operators such as AND), allowing more flexibility but requiring greater search literacy (Tucker, 2024).
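The relationship between a controlled vocabulary and post-coordinated searching can be illustrated with a minimal Python sketch. The synonym table, record identifiers, and descriptors are hypothetical and are not drawn from the AAT or any real thesaurus; the function names are likewise invented for illustration.

```python
# Hypothetical controlled vocabulary: variant terms map to one preferred descriptor.
PREFERRED_TERMS = {
    "movies": "motion pictures",
    "films": "motion pictures",
    "motion pictures": "motion pictures",
    "kids": "children",
    "children": "children",
}

# Hypothetical records, each indexed with preferred descriptors only.
RECORDS = {
    "rec1": {"motion pictures", "children"},
    "rec2": {"motion pictures"},
    "rec3": {"children"},
}


def normalize(term):
    """Return the preferred descriptor for a searcher's term, if one exists."""
    cleaned = term.strip().lower()
    return PREFERRED_TERMS.get(cleaned, cleaned)


def search_all(*terms):
    """Post-coordinated search: the searcher combines single terms with AND."""
    wanted = {normalize(term) for term in terms}
    return sorted(
        rec_id
        for rec_id, descriptors in RECORDS.items()
        if wanted <= descriptors  # every requested descriptor must be present
    )
```

In this sketch, search_all("films", "kids") returns ["rec1"] even though neither variant term appears in the record, because both are mapped to preferred descriptors before matching. A pre-coordinated system would instead store a combined heading in advance, so the searcher would not need to build the combination at search time.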

Design also involves decisions about metadata, indexing, and data structures. Metadata supports discoverability by supplying consistent descriptors and context for documents, while indexing enables efficient searching by organizing the underlying data for quick retrieval (Chowdhury, 2017). Ultimately, thoughtful IRS design is about creating a system that balances technical functionality with user needs, all while keeping in mind the context of the collection.
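One common way indexing organizes data for quick retrieval is an inverted index, which maps each term to the records that contain it. The following Python sketch assumes a tiny hypothetical set of records with two metadata fields; it is an illustration of the general idea, not a description of how any particular system cited here is built.

```python
from collections import defaultdict

# Hypothetical records with two metadata fields.
RECORDS = {
    1: {"title": "Soy candles for beginners", "subject": "candle making"},
    2: {"title": "Beeswax and soy compared", "subject": "waxes"},
    3: {"title": "History of lighting", "subject": "candle making"},
}


def build_index(records):
    """Map each (field, word) pair to the set of record IDs containing it."""
    index = defaultdict(set)
    for rec_id, fields in records.items():
        for field, value in fields.items():
            for word in value.lower().split():
                index[(field, word)].add(rec_id)
    return index


INDEX = build_index(RECORDS)


def lookup(field, word):
    """Return matching record IDs from the index without scanning every record."""
    return sorted(INDEX.get((field, word.lower()), set()))
```

Here lookup("title", "soy") returns [1, 2], while lookup("title", "candle") returns an empty list because the title contains "candles" rather than "candle". That gap is one reason consistent descriptors in the metadata matter for discoverability.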

Query

While not all LIS professionals will design databases, they will need to use them effectively. By understanding a system’s underlying design, individuals can significantly improve the construction of their searches. Knowing how information is structured, indexed, and described can help inform how to approach complex queries.

Developing strong search strategies is key to navigating an IRS. At the most basic level, Boolean operators (AND, OR, NOT) help refine or expand results. Using AND narrows a search by requiring both terms to be present, while OR broadens it by allowing either. Finally, NOT excludes unwanted results. Beyond Boolean logic, more advanced query techniques include wildcards and truncation, which add flexibility when searching for word variants (Tunon, n.d.). Proximity operators offer another way to make a search more specific: they retrieve documents in which the specified terms appear within a certain distance of each other, which can be useful when context and phrasing matter (Tunon, n.d.). Similarly, field restrictions let users target specific metadata fields, such as author, title, or subject, which can dramatically increase search efficiency in structured databases or library catalogs.
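The techniques described above can be sketched in Python using set operations. The records, field names, and helper functions are hypothetical, and the trailing asterisk for truncation is only one common convention; real databases each define their own operators and syntax.

```python
import re

# Hypothetical records; real databases define their own fields and syntax.
RECORDS = {
    1: {"author": "Lee", "title": "Library instruction for teens"},
    2: {"author": "Cho", "title": "Teen librarians and public libraries"},
    3: {"author": "Lee", "title": "Archives and public memory"},
}


def words(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def matching(term, field="title"):
    """IDs of records whose field contains the term; a trailing * truncates."""
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        return {rid for rid, rec in RECORDS.items()
                if any(word.startswith(stem) for word in words(rec[field]))}
    return {rid for rid, rec in RECORDS.items() if term in words(rec[field])}


def near(term_a, term_b, distance, field="title"):
    """IDs where the two terms occur within `distance` words of each other."""
    hits = set()
    for rid, rec in RECORDS.items():
        tokens = words(rec[field])
        pos_a = [i for i, word in enumerate(tokens) if word == term_a.lower()]
        pos_b = [i for i, word in enumerate(tokens) if word == term_b.lower()]
        if any(abs(a - b) <= distance for a in pos_a for b in pos_b):
            hits.add(rid)
    return hits


both = matching("public") & matching("libraries")     # AND narrows: {2}
either = matching("teens") | matching("archives")     # OR broadens: {1, 3}
excluded = matching("public") - matching("archives")  # NOT excludes: {2}
truncated = matching("librar*")                       # truncation: {1, 2}
by_author = matching("lee", field="author")           # field restriction: {1, 3}
close = near("public", "libraries", 1)                # proximity: {2}
```

Treating each result as a set makes the effect of each operator visible: AND is an intersection, OR is a union, and NOT is a difference, while truncation, field restriction, and proximity change which records enter those sets in the first place.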

All in all, querying is a dynamic and iterative process. It requires critical thinking, adaptability, and a clear understanding of the tools at hand. For LIS professionals, mastering these techniques not only supports their own research but allows them to guide others in their reference journeys by developing search literacy.

Evaluate

Evaluation allows the LIS professional to assess whether the system fulfills users’ needs by providing accurate and appropriate information in response to their queries. The degree to which results meet those needs is known as relevance (Tucker, 2024; Weedman, 2018). According to Weedman (2018), this comes down to recall and precision, where “recall [is] how close the system gets to retrieving all of the relevant documents… [and] precision [looks at] how close the system get to retrieving only the relevant documents” (p. 182). A system with high recall may return a broader set of relevant results, but this can come at the cost of including more irrelevant ones; thus there is a balance to be struck between the two. Relevance itself can be subjective and context-dependent (Tucker, 2024). What is relevant for one user might not be for another, depending on their information needs, background, or search intent. Additionally, other areas of the system can be evaluated, such as completeness of the metadata records or richness of controlled vocabularies (Weedman, 2018).
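Recall and precision are conventionally computed as ratios: recall is the number of relevant documents retrieved divided by all relevant documents in the collection, and precision is the number of relevant documents retrieved divided by all documents retrieved. The Python sketch below uses hypothetical document identifiers and assumes the relevance judgments are already known, which in practice is the subjective part.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, given known relevance judgments."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection in which four documents are judged relevant.
relevant_docs = {"d1", "d2", "d3", "d4"}

narrow_search = ["d1", "d2"]
broad_search = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]

print(precision_recall(narrow_search, relevant_docs))  # (1.0, 0.5)
print(precision_recall(broad_search, relevant_docs))   # (0.5, 1.0)
```

The narrow search retrieves only relevant documents but misses half of them, while the broad search finds all four at the cost of four irrelevant results, which is the trade-off between the two measures.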

Evidence

INFO 202: Information Retrieval System Design – Database Design

The first piece of evidence to show my mastery of designing information retrieval systems is the database design project for INFO 202 (compE_candleDatabaseDesign.pdf & databaseScreencast.mp4).

I worked with three other students to design, establish rules for, and implement a database of candles in WebDatabasePro. This project required us to consider user needs when selecting the attributes that described our candles. Some of the needs we considered included preferences for aesthetics and smell, allergies, budgets, and how the candle would be used. These attributes became fields within our database, and each required its own set of indexing rules. The indexing rules give individuals inputting new candles into the database a way to standardize their inputs. After setting up the database, we also needed to input each candle and make sure that our varied search options worked. Once the database was created, it went through phases of feedback from another student team and fine-tuning.

In this group project, I took on the tech role, which meant I set up the database in WebDatabasePro and helped team members who had trouble inputting candles into the database. In terms of the indexing rules, I was in charge of writing the rules for the fields Candle is Unscented (p. 6), Wax Type (p. 8), Burn Time (pp. 8-9), and Made In (p. 11). For the introduction, I wrote the last paragraph, which covered user needs addressed by our database (pp. 2-3).

This project helped me understand how basic databases work from a technical standpoint and stepped our team through the creation process, from idea to preparation and eventually implementation. Creating a good database is a team effort; while someone may be able to create a database alone, they’d probably miss key elements because they have only one perspective to work with. In this vein, having individuals outside of your project test the system is helpful in finding ways to improve your design. Lastly, detailed, clear rules are an absolute must for indexing, as users need reliable results.

INFO 210: Reference and Information Services – Search Activity 15

The next piece of evidence that illustrates my ability to query and evaluate IRS is a Search Activity from INFO 210 (compE&J_searchActivity14&15.docx). Throughout the Reference and Information Services class, we were asked to perform multiple search activities to expand our knowledge about and abilities to query varied information retrieval systems. Each search activity focused on different types of reference questions that covered multiple subject areas and patron demographics. As such, I explored a number of different open access and subscription databases; this rotation between database types helped me understand the varied purposes and rules of these databases.

For this competency, I chose the search activity that looked at Biog.Info, Maps, Gov Info, & Web 2.0. In this assignment, I explored ways of assessing IRSs as well as the Getty Thesaurus of Geographic Names (a gazetteer that’s sometimes used as a controlled vocabulary for metadata). I was required to engage directly with IRSs as a user to find information for different potential reference questions; on question two, I compared two sites from a user’s perspective and tested multiple systems, which allowed me to evaluate when one was more appropriate to use than another.

INFO 202: Information Retrieval System Design – Website Redesign

My last piece of evidence that shows mastery of competency E is a website design report for INFO 202 (compE_siteRedesign.pdf). This is a product from the same group as the candle database project described above. For this project our team worked on understanding, evaluating, and redesigning the layout of the iSchool website; the redesign required analyzing the current website design, generating a current sitemap, recommending changes via a new sitemap, and providing justification of said changes.

We each took on a section of the site to analyze; my sections included Programs, Student Resource, and About. Additionally, I created all of the images of the current sitemap and changes to the sitemap (pp. 5, 6, 8, & 9). During the writing stage, I consolidated the team’s ideas regarding the redesign by writing the main section of Part 4: Proposed Redesigned Site and Discussion (pp. 8-10; subsections were written by a different team member) and all of Part 5: Recommendations (pp. 12-14).

Conclusion

Understanding how information retrieval systems are designed, queried, and evaluated is essential for all LIS professionals. Whether I pursue a career as a digital and data librarian, where I might guide students through complex database searches, or as a processing archivist, where I could be involved in selecting or assessing systems for specific collections, this knowledge can support instructional, reference, or technical responsibilities.

Staying up to date with the systems I work with will also be critical. Regularly engaging with the systems directly, and perhaps following updates from system developers through tools like RSS feeds, will help ensure my search skills remain sharp and relevant in an evolving information landscape.

References

Chowdhury, G. G. (2017). Introduction to Modern Information Retrieval (3rd ed.). Facet Editions.

Tucker, V. M. (2024). Design Concepts in Information Retrieval: Creating User-centered Systems, Search Engines, and Sites. Expert peer-reviewed OER book. https://ischoolblogs.sjsu.edu/202

Tunon, J. (n.d.). Crash Course on Search Strategies — Advanced — LIBR 210 [Video]. San Jose State University iSchool Panopto.

Weedman, J. (2018). Information retrieval: Designing, querying, and evaluating information systems. In K. Haycock & M. J. Romaniuk (Eds.), The Portable MLIS (2nd ed., pp. 171-185). Libraries Unlimited.