How You Can Use AI to Accelerate Library Research

Finding research literature has come a long way. Once upon a time, finding papers for a literature review, dissertation, grant proposal, patent, or any other research purpose meant digging through stacks of magazines, hunting for the issue with the index, and then hoping the right issues were in the library. Depending on the college, you might end up walking back and forth between libraries, which, in some odd twist of space-time, are always farther apart than the physical size of the campus. At least it was a good way to get some exercise. The biggest challenges were finding the relevant papers, having enough nickels to photocopy them all, and not getting blisters on your feet.

Today, things are very different. Online databases let your fingers do the walking. Rather than combing the stacks at the library, researchers can search online collections that collectively contain virtually everything published in their field. As touched on in the article “The Truth is Out There,” solving one problem opened up a host of different problems.

Where Do I Search?

Libraries have access to multiple article databases, some containing abstracts only, others containing full-text articles. For someone starting out, finding relevant articles can be a daunting task. Figuring out which databases to look in is not as obvious as it seems: papers are not always categorized in ways that make sense to mortal minds. Even once you have an idea of where to look, enter too narrow a set of search parameters and you won’t find enough material for your needs. Enter too wide a set of search parameters and you can easily end up with more results than a person could read through in several lifetimes. Playing guess-the-keywords is a great way to spend many an evening communing with your computer, at least if you have a high frustration tolerance.

Sure, there are tricks to narrowing down the search space - for example, searching for the names of researchers referenced in a textbook or starting with the citation index of an existing paper. Many library search engines will let you walk the citation chain up and down, so once you have a starting point, you can find the papers that fed into it and the papers that later built on it. Even so, you still have a lot of abstracts to skim, and you are still likely to miss papers by less well-known researchers.
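For the curious, walking a citation chain is just a graph traversal. Here is a minimal sketch in Python, assuming a tiny hand-made citation map; the paper IDs and the `CITES` table are made up for illustration and do not come from any real library system.

```python
from collections import deque

# Hypothetical citation data: each paper maps to the papers it cites.
CITES = {
    "C": ["A", "B"],
    "D": ["C"],
    "E": ["C", "B"],
}


def walk_back(paper, cites):
    """Collect every paper reachable by following references backward,
    i.e. the papers that directly or indirectly fed into `paper`."""
    seen = set()
    queue = deque([paper])
    while queue:
        current = queue.popleft()
        for ref in cites.get(current, []):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return seen


def walk_forward(paper, cites):
    """Collect every paper that directly or indirectly cites `paper`."""
    # Invert the map so each paper points at the papers that cite it,
    # then reuse the same breadth-first walk.
    cited_by = {}
    for source, refs in cites.items():
        for ref in refs:
            cited_by.setdefault(ref, []).append(source)
    return walk_back(paper, cited_by)


print(sorted(walk_back("D", CITES)))     # papers that fed into D
print(sorted(walk_forward("B", CITES)))  # papers that built on B
```

Even on this toy graph, one starting point fans out quickly in both directions, which is exactly why the pile of abstracts grows so fast.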

An under-appreciated benefit of the old system was that when you did find the right issue of the magazine you were looking for, it would often contain other related papers that you might never have thought to look for. Although some library systems will suggest related papers, the connection between those papers and what you are looking for is frequently just a bit obscure.

Professors might be able to pass some of this work off to their graduate students, which is one way of motivating them to finish their degrees.

What Can I Use?

In the old days, the struggle was finding enough information. Today, the problem isn’t lack of information - it’s figuring out what is actually relevant.

On the one hand, there’s something to be said for skimming through dozens or hundreds of papers because you never know what you’ll find. Exactly what is said won’t be repeated here. The real problem is that human decision-making is a finite resource. We can make only so many good decisions in a day before things start to blur together. Is this what I’m looking for? Maybe this one is useful? What was my topic again?

As you find relevant and useful papers, finding more like those can again be tricky. Finding papers by the same authors isn’t that tough. But finding papers by other researchers that are conceptually similar is that tough. We’re back to trying out keywords and wading through a lot of hits.

It’s around this point that many researchers start questioning their life decisions.

Okay, in all seriousness, things are a lot better today than back when finding papers meant digging through the stacks at the libraries. However, the vast explosion of knowledge and ability to instantly access almost anything written in the past several decades has resulted in a massive information tsunami. Our ability to create information far outstrips our ability to find the most relevant information.

AI or The Oracle of Delphi?

King Croesus of Lydia famously asked the Oracle of Delphi (the ancient Greek version of an LLM) what would happen if he went to war against the Persian Empire. The Oracle replied, “If you cross the river Halys, you will destroy a great empire.”

Croesus neglected to ask which empire and went down in defeat.

Without the proper AI infrastructure software in place, it is all too easy for AI-enhanced academic search to become the Oracle of Delphi.

Training an AI on the papers for each field of study would be a nightmarish task - there are many fields of study, with new ones appearing all the time. Similarly, within each area new papers are being written all the time. There are theses, dissertations, and journal articles. Journals have differing degrees of quality and selectivity. Sometimes papers need to be retracted, and there’s really no good way to make an AI forget. Results would become a hopeless mishmash of valid and invalid information which could only be remedied through never-ending retraining.

Even if the Oracle of Delphi scenario didn’t come to pass, training an AI on a library’s research papers opens up a Pandora’s Box of copyright and plagiarism issues, risks of incorrect citations, and more.

However, there is an alternative to bringing the data to the AI: bringing the AI to the data.

Bringing the AI to the Data

The distinction between training an AI on your data and allowing an AI access to your data may seem minor, but in that distinction is a world of difference. In the former case, the data becomes part of the AI’s knowledge base. In the latter case, the AI can be used to analyze data and determine relevancy, but it doesn’t remember what it has seen. This second approach requires much less data movement, avoids the problem of keeping an AI up-to-date, and prevents your AI from turning into the Delphic Oracle. It also lets you leverage the power of AI to identify relevant data in any location, so you can easily add new databases to the search space.

Bringing the AI to the data requires AI infrastructure software that provides several key capabilities.

Universal Connectivity

The AI infrastructure software must connect with numerous applications and data sources regardless of format or application type, including structured data from databases, content from email, office documents, and collaboration apps such as Teams, Webex, Confluence, and Slack. That way, users can find information in library data sources as well as information they may have in local storage or buried in their email or other apps.

AI Flexibility

AI infrastructure software is not the One True AI that can solve everything. Rather, it is a framework that lets you connect an arbitrary number of LLMs and AI models. You can take advantage of the power of any AI without designing your system around that AI. You can swap models in and out as the situation requires, such as using specialized models to handle specific, domain-relevant papers.
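To make the idea of swapping models concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `RelevanceModel` interface, both toy scoring classes, and the sample titles are hypothetical stand-ins, not part of any real product or model API. The point is only that the ranking code depends on an interface rather than on a particular model.

```python
from typing import Protocol


class RelevanceModel(Protocol):
    """Hypothetical interface: anything with a score() method can be plugged in."""

    def score(self, query: str, text: str) -> float: ...


class KeywordOverlapModel:
    """Toy general-purpose model: fraction of query words found in the text."""

    def score(self, query: str, text: str) -> float:
        q = set(query.lower().split())
        t = set(text.lower().split())
        return len(q & t) / len(q) if q else 0.0


class DomainTermModel:
    """Toy specialized model: rewards text that uses a field's vocabulary."""

    def __init__(self, domain_terms):
        self.domain_terms = {term.lower() for term in domain_terms}

    def score(self, query: str, text: str) -> float:
        if not self.domain_terms:
            return 0.0
        words = set(text.lower().split())
        return len(words & self.domain_terms) / len(self.domain_terms)


def rank(query: str, papers: list[str], model: RelevanceModel) -> list[str]:
    """Rank papers with whichever model is passed in."""
    return sorted(papers, key=lambda p: model.score(query, p), reverse=True)


papers = ["Enzyme kinetics in yeast", "History of medieval trade"]
print(rank("medieval trade", papers, KeywordOverlapModel()))
print(rank("medieval trade", papers, DomainTermModel(["enzyme", "yeast"])))
```

Because `rank` never names a concrete model, replacing the general model with a specialized one is a one-argument change rather than a redesign.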

Powerful, Unified Search

AI infrastructure software should keep track of the data sources that a given user is permitted to access, so that you don’t have to keep track manually. It should enable searching of all permitted data sources and then use AI to help manage the results, pruning and ranking by relevancy so that you are able to quickly find the information that you are looking for. It should enable dynamic, interactive refinement of the search results, essentially enabling you to have a conversation with your data.
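As a rough illustration of permission-aware unified search, here is a minimal sketch in Python. The source names, the `PERMISSIONS` table, and the word-overlap `relevance` function are all hypothetical placeholders; a real system would query live data sources and use an AI model for scoring.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source: str
    title: str
    score: float


# Hypothetical data sources and per-user permissions, for illustration only.
SOURCES = {
    "library_catalog": [
        "Deep learning for protein folding",
        "Protein structure primer",
    ],
    "restricted_archive": ["Unpublished protein folding dataset notes"],
    "local_files": ["Meeting notes on grant budget"],
}
PERMISSIONS = {"alice": {"library_catalog", "local_files"}}


def relevance(query: str, title: str) -> float:
    """Stand-in for an AI relevance model: fraction of query words in the title."""
    q = set(query.lower().split())
    t = set(title.lower().split())
    return len(q & t) / len(q) if q else 0.0


def unified_search(user: str, query: str, min_score: float = 0.5) -> list[Hit]:
    """Search only the sources the user may access, prune weak hits, rank the rest."""
    allowed = PERMISSIONS.get(user, set())
    hits = []
    for source, titles in SOURCES.items():
        if source not in allowed:
            continue  # never query sources the user is not permitted to see
        for title in titles:
            score = relevance(query, title)
            if score >= min_score:  # prune results below the threshold
                hits.append(Hit(source, title, score))
    return sorted(hits, key=lambda h: h.score, reverse=True)


for hit in unified_search("alice", "protein folding"):
    print(f"{hit.score:.2f}  {hit.source}: {hit.title}")
```

Note that the permission check happens before any source is searched, so restricted material never enters the result set and never needs to be filtered out afterward.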

Prompt Enhancement

Because an AI is not a Generalized Omniscient Device, an AI can return incorrect, inaccurate, or hallucinatory results. AI infrastructure software should incorporate prompt enhancement and Retrieval Augmented Generation (RAG) in order to minimize these problems. Combining prompt enhancement, RAG, and other techniques for managing search increases the validity and trustworthiness of results. It also frees the user from the laborious process of attempting to take these steps manually and reduces the number of irrelevant papers you need to wade through.
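The core of Retrieval Augmented Generation can be sketched in a few lines: retrieve the most relevant passages first, then build a prompt that tells the model to answer only from them. The Python below is a minimal sketch under stated assumptions: the passages are invented, the word-overlap retrieval stands in for a real search step, and no actual LLM is called.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, passages: list[dict], top_k: int = 2) -> list[dict]:
    """Stand-in retrieval step: keep the passages sharing the most words with the question."""
    q = tokens(question)
    ranked = sorted(passages, key=lambda p: len(q & tokens(p["text"])), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, passages: list[dict], top_k: int = 2) -> str:
    """Enhance the user's question with retrieved sources and grounding instructions."""
    context = retrieve(question, passages, top_k)
    sources = [f"[{i}] {p['title']}: {p['text']}" for i, p in enumerate(context, start=1)]
    return (
        "Answer using only the numbered sources below. Cite sources by number. "
        "If the sources do not contain the answer, say so.\n\n"
        + "\n".join(sources)
        + f"\n\nQuestion: {question}"
    )


# Hypothetical passages, invented for illustration.
PASSAGES = [
    {"title": "Battery aging", "text": "Lithium plating limits battery capacity over time"},
    {"title": "Cathode design", "text": "Cathode thickness affects battery energy density"},
    {"title": "Medieval trade", "text": "Wool exports shaped medieval trade routes"},
]

print(build_prompt("What limits battery capacity?", PASSAGES))
```

The resulting prompt would then be sent to whichever model is in use. Because the model is asked to cite numbered sources, a reader can check each claim against the retrieved text instead of taking the answer on faith.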

Ease of Use

One of the biggest limitations on adopting AI is the lack of trained experts. AI infrastructure software should be simple to install. It should be low-code and configuration-based and shouldn’t require an army of computer science PhDs to make it work. A major benefit of AI infrastructure software is to make it easier for everyone to benefit from AI, not just a few experts.

AI Infrastructure Software Makes AI Work For You

AI infrastructure software may sound like wishful thinking, but it exists today and is available right now. SWIRL is an innovative approach to AI infrastructure that provides a powerful, efficient, and flexible framework for incorporating AI into your business. SWIRL evolves and scales with technological advancements so you can avoid being locked to a specific AI model or vendor.

Capabilities

SWIRL provides key capabilities that help maximize the return on your AI investments:

Powerful Search: SWIRL combines unified search, prompt enhancement, and Retrieval Augmented Generation to provide more accurate and contextually relevant results than generative AI models can provide on their own. SWIRL enables you to have a conversation with your data - it’s like having your own personal research assistant or expert librarian available to guide you.

Connects to Everything: SWIRL connects with over a hundred (and counting) apps and databases, including email, Slack, Webex, Confluence, Teams, and other collaboration apps, so you can access data no matter the location or format.

AI Agnostic: Avoid vendor lock-in and ensure flexibility by seamlessly swapping AI models to suit your evolving needs.

Specialized Models: Leverage state-of-the-art Large Language Models (LLMs) and easily integrate new, specialized models for specific use cases.

Low-Code and Configuration-Based: Accelerate AI-powered app development with minimal code, freeing you to focus on results rather than complex infrastructure.

Granular Data Access and Firewall Protection: Maintain tight control over sensitive data and ensure top-level security. Deploy AI models within your firewall and keep data inside your trust boundaries.

Streamlined AI Operations: SWIRL is the middle layer between applications and AI models, helping to manage complexity, optimize resource usage, and ensure the infrastructure adapts to the rapidly changing demands of AI workloads.

Scalability: SWIRL enables AI systems to handle complex and variable demands, such as large-scale model training and deployment, without compromising performance.

Benefits

SWIRL provides significant benefits:

Fast Access to Relevant Research: SWIRL helps you find the papers you need faster and with less effort than ever before. RAG-enhanced, context-aware AI responses improve the quality of results, decrease the time spent wading through irrelevant results, and help you find better quality information.

Better Quality Research: Enhance research quality by finding better, more relevant papers.

Increased Efficiency: Decrease the time spent hunting for and identifying the most useful and relevant papers and other data.

Improved Search: Break down silos and enable users to search databases they may not know about. Easily find data stored locally, in email, and in collaboration apps.

Easier Deployments: Simplify enterprise-wide deployments and easily scale AI initiatives.

With SWIRL’s AI infrastructure software, you spend less time hunting for papers and more time actually conducting research.

Let Your AI Do the Walking

While the search experience has certainly improved from the days when finding the right papers meant walking back and forth across campus, the experience is still far from pleasant. Even when our fingers do the walking, finding relevant research papers is hard and existing tools are limited. Our ability to create new information has continually outstripped our ability to find relevant information… until now.

SWIRL’s AI infrastructure software changes the game. SWIRL’s combination of source-agnostic metasearch, AI flexibility, and Retrieval Augmented Generation enables you to have a conversation with your data. Rather than spend countless hours manually wading through hundreds of hits, SWIRL lets you interactively refine your search results, yielding a highly enriched set sorted by relevance. With SWIRL, the AI does the walking and our ability to find relevant information is, at long last, visibly overtaking our ability to create new information.

To find out more about becoming a leader in bringing the power of AI infrastructure software to your organization, contact SWIRL today.