Paperclip: the command-line interface for scientific literature
Last month, we introduced Sy, the literature search agent that navigates biomedical preprints as a filesystem. Scientists have loved using it, and the natural question has been: can my own agent use this filesystem too?
Today we're releasing Paperclip, the agent-native counterpart to Sy. Whereas humans use a chat-based UI, agents work best within the rich text environment of a command-line interface. Paperclip gives your agent direct CLI access to 8M+ papers—standard search and retrieval functions, plus several powerful tools that, when used together, let agents actually explore, deep-dive, and synthesize.
What's in Paperclip?
We've thought a lot about what the “principal components” of good literature search are. These components shouldn't just be individually useful for specific queries, but should be synergistic and composable. We're excited to introduce six commands: search, grep, map, ask-image, sql, and from.
search
As a starting point, we've implemented hybrid search, combining BM25 and embedding-based retrieval. The agent can also select a specific ranking mechanism more suited for its queries. For token efficiency, rather than return the entire abstract of each search result, we return a 1–2 sentence TL;DR summary.
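Paperclip's exact fusion scheme isn't documented here; as a sketch, one common way to combine BM25 and embedding rankings is reciprocal rank fusion (RRF), where each ranked list contributes 1/(k + rank) per document:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: merge several ranked lists of doc ids.
    A document's score is the sum of 1/(k + rank) over every list it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

bm25_top = ["paper_12", "paper_7", "paper_33"]   # lexical ranking
embed_top = ["paper_7", "paper_45", "paper_12"]  # embedding ranking
print(rrf_fuse([bm25_top, embed_top]))
# → ['paper_7', 'paper_12', 'paper_45', 'paper_33']
```

Documents ranked well by both retrievers (paper_7, paper_12) float to the top, which is the behavior hybrid search is after.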
grep
While grep is a preferred search tool for coding agents, it's not available in literature search APIs—it's hard to do across millions of papers. We've spent a lot of time optimizing our indices to make this happen in milliseconds. We're very excited for this feature, and we think your agents will agree.
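We haven't described the index internals, but one standard way to make grep fast over a large corpus is a trigram index: the query's trigrams prune the candidate set before an exact scan. A minimal in-memory sketch (illustrative only, not Paperclip's actual index):

```python
import re
from collections import defaultdict

def trigrams(text):
    # All 3-character windows of the text.
    return {text[i:i + 3] for i in range(len(text) - 2)}

class TrigramIndex:
    """Map each trigram to the set of doc ids containing it."""
    def __init__(self, docs):
        self.docs = docs
        self.index = defaultdict(set)
        for doc_id, text in docs.items():
            for tri in trigrams(text.lower()):
                self.index[tri].add(doc_id)

    def grep(self, literal):
        # A match must contain every trigram of the pattern, so intersect
        # posting sets to get a small candidate set.
        cands = None
        for tri in trigrams(literal.lower()):
            cands = self.index[tri] if cands is None else cands & self.index[tri]
        if cands is None:  # pattern shorter than 3 chars: scan everything
            cands = set(self.docs)
        # Exact (case-insensitive) scan only over the candidates.
        return sorted(d for d in cands
                      if re.search(re.escape(literal), self.docs[d], re.I))

docs = {"p1": "KRAS G12C resistance via CIC loss",
        "p2": "AlphaFold fold switching in XCL1",
        "p3": "CIC (Capicua) and NF-kB reactivation"}
idx = TrigramIndex(docs)
print(idx.grep("CIC"))  # → ['p1', 'p3']
```

At millions of papers the posting sets live in an optimized on-disk structure rather than a Python dict, but the prune-then-scan shape is the same.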
map
A common motif in literature search involves asking the same question across many papers. Rather than have the agent do this sequentially, we provide map, which performs this in parallel, yielding a structured result for the agent to read.
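Conceptually, map is a parallel fan-out of one question over many papers. A minimal sketch using Python threads, where ask_paper is a stand-in for the real cloud-side per-paper QA call:

```python
from concurrent.futures import ThreadPoolExecutor

def ask_paper(paper_id, question):
    # Stand-in for the per-paper QA backend; returns a structured row.
    return {"paper": paper_id, "question": question,
            "answer": f"(answer from {paper_id})"}

def map_papers(paper_ids, question, max_workers=8):
    """Ask the same question over many papers concurrently.
    pool.map preserves input order, so rows line up with paper_ids."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: ask_paper(p, question), paper_ids))

rows = map_papers(["p1", "p2", "p3"], "What resistance mechanism is reported?")
print([r["paper"] for r in rows])  # → ['p1', 'p2', 'p3']
```

Returning structured rows rather than free text is the point: the agent can scan the whole table at once instead of reading N sequential answers.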
ask-image
Papers are inherently multimodal, and we don't want you to have to download every image just to figure out what's in each paper. We provide ask-image, which allows your agent to ask arbitrary questions over figures via a VLM, all in the cloud—no heavy lifting agent-side. Everything stays in the CLI.

Panel B — Dose-response: Sotorasib sensitivity in KRASG12C(red, n=6) vs KRASWT (black, n=12) organoids. G12C lines show significantly lower IC50 values.
Panel D — IC50 violin plots: Distribution of IC50 for AZD4625 vs Sotorasib. Wider spread for Sotorasib; XDO344 shows high IC50 indicating resistance.
Panel E — Tumor growth: PHLC207 in vivo. Vehicle arms show exponential growth. Both Sotorasib and AZD4625 (100 mg/kg) achieve sustained tumor regression.
ask-image on a multi-panel figure from a KRAS G12C resistance paper. The model parses each panel—oncoprint, dose-response curves, violin plots, tumor growth—and extracts specific data points and findings.
sql
Another major motif in literature search is aggressive filtering. This can be hard to do over a filesystem, but it's something SQL was designed for. Common APIs often wrap these functions into pre-set queries. We figured it would be best for agents to just have direct access to the underlying metadata table.
SELECT *
FROM documents
WHERE authors ILIKE '%Doudna%'
ORDER BY pub_date DESC
LIMIT 5
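To make the shape of that query concrete, here's a local sketch using Python's sqlite3, with an assumed minimal documents schema and SQLite's LIKE (case-insensitive for ASCII) standing in for Postgres-style ILIKE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema for illustration; the real metadata table is richer.
conn.execute("CREATE TABLE documents (title TEXT, authors TEXT, pub_date TEXT)")
conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", [
    ("CRISPR base editing", "Jennifer Doudna; et al.", "2024-03-01"),
    ("KRAS G12C resistance", "A. Smith", "2025-01-15"),
    ("Guide RNA design", "J. Doudna", "2023-07-20"),
])
rows = conn.execute("""
    SELECT title FROM documents
    WHERE authors LIKE '%doudna%'   -- SQLite LIKE is case-insensitive for ASCII
    ORDER BY pub_date DESC
    LIMIT 5
""").fetchall()
print(rows)  # → [('CRISPR base editing',), ('Guide RNA design',)]
```

Author filters, date sorts, and limits compose freely this way, which is exactly what pre-set API query endpoints tend to take away.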
from
We've read thousands of agent rollouts for literature search APIs. Probably the biggest limitation is that most search APIs are stateless—each function exists independent of the agent's context. This stunts agentic exploration and tunneling.
To address this, we store every intermediate search result in our cloud-hosted database. You can reference them easily using --from, which performs the next action only on the subset of previously retrieved papers.
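The mechanics can be sketched as a small result store: each search result set is saved under an id, and later operations intersect their candidates with that set. The class below is illustrative, not Paperclip's implementation:

```python
import uuid

class ResultStore:
    """Sketch of the statefulness behind --from: every result set gets an
    id, and later operations can scope themselves to that set."""
    def __init__(self):
        self._sets = {}

    def save(self, paper_ids):
        rid = str(uuid.uuid4())[:8]
        self._sets[rid] = list(paper_ids)
        return rid

    def scoped(self, rid, candidate_ids):
        # Keep only candidates that were in the earlier result set.
        prior = set(self._sets[rid])
        return [p for p in candidate_ids if p in prior]

store = ResultStore()
rid = store.save(["p1", "p2", "p3"])          # broad search returns a handle
hits = store.scoped(rid, ["p2", "p9", "p3"])  # e.g. grep ... --from <rid>
print(hits)  # → ['p2', 'p3']
```

Because the store lives in the cloud, the handle stays valid across calls without the agent carrying the paper list in its context window.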
--from operates on the same 30 papers.
We've been using Paperclip internally for weeks now, and agents take to it immediately. We think the best way to experience it is to try it yourself, but here are a couple of fun examples:
Example 1: KRAS G12C Inhibitors
KRAS G12C inhibitors (sotorasib, adagrasib) were approved for non-small cell lung cancer, but most patients develop resistance. Map out the specific mutations and pathway alterations driving resistance, how frequently each appears, and whether there's a dominant mechanism.
The canonical mechanisms—secondary KRAS mutations (Y96D, H95D, Q99L), RTK bypass via MET/EGFR, PI3K/mTOR reactivation—show up in every review. These tend to be well-covered in most agent outputs. The harder question is whether an agent can surface findings that are published but not prominent. One such finding: CIC (Capicua), a tumor suppressor whose loss-of-function drives resistance through NFκB reactivation. It's described in a 2025 CRISPR screen paper, present in 24 of 50 corpus papers, but absent from most review-level summaries.
We gave the same agent the same question with two different tool configurations: a standard 3-tool search API (paper_search, read_paper, full_text_search) and Paperclip. Both used the same model, backend, and corpus. Here's the shape of each exploration:
--from keeps the result set.
The key difference is statefulness. The MCP agent's two searches return independent result sets with no connection between them. There's no way to say "search for CIC within papers I already found about KRAS resistance." The Paperclip agent holds a single results_id throughout the session. Every grep operates on the same 27 papers. When it notices CIC while reading, it tunnels back into that same set—grep "CIC" --from <previous search>—and confirms the entity appears in 15 of those 27 papers. That's the loop that produces discovery: broad search, scan, read, notice, narrow within the same context, confirm, read the source.
Example 2: AlphaFold Failure Modes
Map out where AlphaFold systematically fails. Which specific proteins or structural features cause failures? What are the root causes—training data, architecture, or something else? I want concrete examples with protein names, not just broad categories.
The canonical failures are well-known: intrinsically disordered regions, misleading pLDDT confidence scores, MSA depth dependence. These are well-covered in the literature. The deeper findings are harder: XCL1 (lymphotactin), a fold-switching protein that adopts two completely different stable structures and AlphaFold confidently predicts only one; β-solenoid hallucination, where AF2 produces confident but unrealistic repeat structures; and adversarial invariance, where AF3 predictions remain unchanged despite destabilizing mutations. All are published findings buried in full-text papers, absent from review abstracts.
Same setup: the same agent with a standard API vs Paperclip.
--from keeps the result set.
Same pattern as KRAS, but with three deep findings instead of one. The API agent ran 5 independent searches and read 2 papers, extracting canonical findings (KaiB, RfaH, disordered regions). It searched for "fold switching p53 KaiB RfaH" and "BCCIP RfaH KaiB p53 GA95"—protein names from training data, not from anything it read. The Paperclip agent anchored 50 papers to a single handle and ran 7 greps against it. Grepping "fold switching" narrowed to 8 papers; reading one revealed XCL1. Grepping "intrinsically disordered" surfaced a paper on β-solenoid hallucination. Grepping "training data|MSA" led to the adversarial invariance finding. Both agents used comparable resources (14 vs 8 calls), but the exploration graphs tell different stories—and led to different discoveries.
Use Paperclip today
Add it as an MCP server directly—no local install needed:
