Research

OpenScholar

AI search for scientific papers.


About OpenScholar

OpenScholar is a retrieval-augmented language model (LM) designed to answer complex scientific queries by retrieving relevant passages from more than 45 million open-access academic papers. Developed by the Allen Institute for AI (Ai2) and the University of Washington, it uses a retrieval-augmented self-feedback mechanism to synthesize findings and generate answers grounded in verifiable sources. Its pipeline consists of a bi-encoder retriever trained on millions of passage embeddings, a cross-encoder reranker that prioritizes the most contextually relevant passages, and an iterative inference process that refines responses. On ScholarQABench, a large-scale multi-domain benchmark for literature search, OpenScholar outperforms larger proprietary models such as GPT-4o on factuality and citation accuracy. Unlike many other AI assistants, OpenScholar is fully open source: its team has released the code for the language model, the retrieval pipeline, a specialized 8-billion-parameter model, and the entire datastore of scientific papers. The project aims to accelerate scientific discovery by helping researchers synthesize knowledge faster and with greater confidence, particularly through reliable attribution and fewer "hallucinations." Although it currently relies on open-access research from Semantic Scholar, it is a significant step toward making the body of scientific knowledge more navigable and accessible to the global research community.
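The three-stage pipeline described above (retrieve, rerank, refine with feedback) can be illustrated with a minimal Python sketch. This is not OpenScholar's code: the function names (`embed`, `retrieve`, `rerank`, `answer_with_feedback`) are hypothetical, bag-of-words cosine similarity stands in for the trained bi-encoder, Jaccard overlap stands in for the cross-encoder, and a simple "which query terms are still unsupported" check stands in for the model's self-feedback.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a bi-encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=3):
    """Stage 1: embed query and passages independently, keep the top k matches."""
    q = embed(query)
    scored = [(cosine(q, embed(p)), p) for p in passages]
    scored = [item for item in scored if item[0] > 0]  # drop non-matching passages
    scored.sort(key=lambda item: item[0], reverse=True)
    return [p for _, p in scored[:k]]


def rerank(query, candidates, k=1):
    """Stage 2: toy stand-in for a cross-encoder that scores each pair jointly."""
    q_terms = set(query.lower().split())

    def pair_score(passage):
        p_terms = set(passage.lower().split())
        return len(q_terms & p_terms) / len(q_terms | p_terms)

    return sorted(candidates, key=pair_score, reverse=True)[:k]


def answer_with_feedback(query, passages, max_rounds=2):
    """Stage 3: find query terms the cited passages do not cover, then retrieve again."""
    cited = rerank(query, retrieve(query, passages))
    for _ in range(max_rounds):
        covered = set(" ".join(cited).lower().split())
        missing = [t for t in query.lower().split() if t not in covered]
        if not missing:
            break
        remaining = [p for p in passages if p not in cited]
        cited.extend(retrieve(" ".join(missing), remaining, k=1))
    return cited


if __name__ == "__main__":
    corpus = [
        "retrieval augmented models cite sources",
        "cross encoder rerankers score query passage pairs",
        "protein folding is predicted with deep learning",
        "citation accuracy reduces hallucinations",
    ]
    for passage in answer_with_feedback("retrieval citation accuracy", corpus):
        print(passage)
```

The split between stages reflects the usual trade-off: independent embeddings are cheap enough to compare against a whole corpus, while pairwise scoring is more expensive and is therefore applied only to the short candidate list.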

Tags

#Scientific Research #Open Source #Literature Synthesis #Academic Search #AI2

Added November 21, 2024

allenai.org