
Paperclip: the command-line interface for scientific literature

Last month, we introduced Sy, the literature search agent that navigates biomedical preprints as a filesystem. Scientists have loved using it, and the natural question has been: can my own agent use this filesystem too?

Today we're releasing Paperclip, the agent-native counterpart to Sy. Whereas humans use a chat-based UI, agents work best within the rich text environment of a command-line interface. Paperclip gives your agent direct CLI access to 8M+ papers—standard search and retrieval functions, plus several powerful tools that, when used together, let agents actually explore, deep-dive, and synthesize.


What's in Paperclip?

We've thought a lot about what the “principal components” of good literature search are. These components shouldn't just be individually useful for specific queries; they should also be synergistic and composable. We're excited to introduce several powerful commands: search, grep, map, ask-image, sql, and from.

As a starting point, we've implemented hybrid search, combining BM25 and embedding-based retrieval. The agent can also select a specific ranking mechanism better suited to its queries. For token efficiency, rather than returning the entire abstract of each search result, we return a 1–2 sentence TL;DR summary.

$ paperclip search "KRAS G12C resistance mechanisms" -n 5

Found 5 papers  [s_74db6679]

1. M1C is a druggable target for NSCLC KRAS G12C mutant tumors resistant to KRAS inhibitors
     bio_5c6e4b117ab6 · bioRxiv · 2025-12-02
     “M1C protein expression drives resistance to sotorasib by promoting EMT, and targeting M1C reverses it.”

2. Modeling response to AZD4625 in KRAS G12C NSCLC patient-derived xenografts
     PMC12765001 · British Journal of Cancer · 2025-10-11
     “mTOR signaling was identified as a potential mechanism of primary resistance to the drug.”

3. Genetic mechanisms of resistance to targeted KRAS inhibition
     bio_d7a242096fc0 · bioRxiv · 2025-08-04
     “CRISPR screens identified resistance mutations, with CIC mutations being a notable example.”

4. Combining EGFR and KRAS G12C Inhibitors for Advanced Colorectal Cancer
     PMC11340593 · J Cancer Immunol · 2024-08-07
     “EGFR + KRAS G12C inhibitor combinations show improved efficacy vs monotherapy.”

5. Inhibition of ULK1/2 and KRAS G12C controls tumor growth in lung cancer
     bio_f7c3187b7275 · bioRxiv · 2024-02-06
     “Combining KRAS G12C and ULK1/2 inhibition synergistically reduces tumor growth.”

[6.3s, saved to s_74db6679]
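
The post does not say how the BM25 and embedding rankings are merged. One common way to fuse two ranked lists is reciprocal rank fusion (RRF), and the sketch below assumes that approach purely for illustration. The function name, the constant k = 60, and the placeholder paper IDs are all hypothetical and are not taken from Paperclip.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. A higher fused score ranks first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rankings from two retrievers (placeholder IDs, not real papers).
bm25_ranking = ["paper_a", "paper_b", "paper_c"]
embedding_ranking = ["paper_c", "paper_a", "paper_d"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)  # ['paper_a', 'paper_c', 'paper_b', 'paper_d']
```

Papers that both retrievers rank highly rise to the top, while a paper found by only one retriever still appears further down the list.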

grep

While grep is a preferred search tool for coding agents, it's not available in literature search APIs—it's hard to do across millions of papers. We've spent a lot of time optimizing our indices to make this happen in milliseconds. We're very excited for this feature, and we think your agents will agree.

Corpus-wide regex: same results, different speeds

Pattern                        native    paperclip    Speedup
intracellular tachyzoites      26 s      720 ms       36× faster
binding affinity.*nanomolar    70 s      240 ms       294× faster
CRISPR (case-insensitive)      55 s      230 ms       236× faster

We've optimized grep over 8M+ papers to run 36–294× faster than native grep, measured on an 8-core machine with an NVMe SSD and the full corpus warm in page cache. Cold-cache reads would take 2–7 minutes.
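
The post says only that the indices were optimized, not how. A well-known technique for avoiding a full scan is a trigram index: record which documents contain each 3-character substring, intersect those sets to get a small candidate list, then run the real pattern only on the candidates. The sketch below assumes that technique and handles literal queries only; `TrigramIndex` and its methods are hypothetical names, and the sample texts are made up.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy inverted index from trigram to IDs of documents containing it."""

    def __init__(self):
        self.docs = {}
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def candidates(self, literal):
        """Documents containing every trigram of `literal` (a superset of true matches)."""
        grams = trigrams(literal)
        if not grams:  # literal shorter than 3 characters: the index cannot prune
            return set(self.docs)
        return set.intersection(*(self.postings.get(g, set()) for g in grams))

    def grep(self, literal):
        """Verify candidates with a real scan, so results equal a full-corpus grep."""
        pattern = re.compile(re.escape(literal))
        return sorted(d for d in self.candidates(literal) if pattern.search(self.docs[d]))


index = TrigramIndex()
index.add("doc1", "intracellular tachyzoites replicate in the host cell")
index.add("doc2", "extracellular vesicles and binding affinity")
index.add("doc3", "tachyzoites convert to bradyzoites")

print(index.grep("tachyzoites"))  # ['doc1', 'doc3']
```

The verification step matters: sharing all trigrams does not guarantee a match, so the index only narrows the search and the final answer still comes from running the pattern.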

map

A common motif in literature search involves asking the same question across many papers. Rather than have the agent do this sequentially, we provide map, which performs this in parallel, yielding a structured result for the agent to read.

map --from <prev> --query "What resistance mechanism was reported?"

Paper 1 (PMC7795113): Y96D, H95D secondary KRAS mutations. RTK bypass via MET amplification.
Paper 2 (PMC9399772): Q99L switch-II pocket mutation. Compound-specific: sotorasib only.
Paper 3 (PMC11364849): CIC loss-of-function via NFκB. PI3K/mTOR reactivation.
Paper N (PMC8843735): KRAS amplification (18%), RAS isoform switching, KEAP1 co-mutation.

Returned to agent: structured summary table

Paper          Mechanism                                            Type
PMC7795113     Y96D, H95D secondary mutations; MET amplification    on-target, RTK bypass
PMC9399772     Q99L switch-II pocket mutation                       on-target
PMC11364849    CIC loss-of-function via NFκB reactivation           tumor suppressor
PMC8843735     KRAS amplification (18%), RAS switching, KEAP1       amplification, co-mutation

One tool call, N papers read in parallel. Each paper is read in full by an LLM that extracts a structured answer. The agent gets back a summary table it can reason over—without N separate read–extract–respond loops.

Wall-clock time for 20 papers, same query

Sequential    ~120 s
map           ~15 s    (~8× faster)

Sequential: 20 agent turns × ~6s each (LLM round-trip + paper read). Map: 20 sub-agents in parallel, bounded by the slowest paper (~15s).
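
The fan-out pattern can be sketched with a thread pool. This is a minimal illustration, not Paperclip's implementation: `answer_question` is a hypothetical stand-in for a sub-agent's LLM call, the 50 ms sleep simulates its latency, and the paper IDs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def answer_question(paper_id, query, latency=0.05):
    """Hypothetical stand-in for one sub-agent reading one paper.

    A real implementation would call an LLM; here we sleep to simulate latency.
    """
    time.sleep(latency)
    return {"paper": paper_id, "query": query, "answer": f"(answer for {paper_id})"}


def map_sequential(paper_ids, query):
    """One read-extract-respond loop per paper, one after another."""
    return [answer_question(p, query) for p in paper_ids]


def map_parallel(paper_ids, query):
    """Fan out one worker per paper; wall-clock time is bounded by the slowest paper."""
    with ThreadPoolExecutor(max_workers=max(1, len(paper_ids))) as pool:
        # pool.map preserves input order, so rows line up with paper_ids.
        return list(pool.map(lambda p: answer_question(p, query), paper_ids))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


paper_ids = [f"paper_{i}" for i in range(20)]  # placeholder IDs
query = "What resistance mechanism was reported?"

seq_rows, seq_s = timed(map_sequential, paper_ids, query)
par_rows, par_s = timed(map_parallel, paper_ids, query)
print(f"sequential: {seq_s:.2f}s, parallel: {par_s:.2f}s")
```

The same arithmetic underlies the figures above: 20 turns × ~6 s gives ~120 s sequentially, while the parallel run costs roughly the ~15 s of the slowest paper, hence ~8×.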

ask-image

Papers are inherently multimodal, and we don't want you to have to download every image just to figure out what's in each paper. We provide ask-image, which allows your agent to ask arbitrary questions over figures via a VLM, all in the cloud—no heavy lifting agent-side. Everything stays in the CLI.

$ cd /papers/PMC12765001 && ask_image 41416_2025_3216_Fig1_HTML.jpg "Describe each panel and the key findings"
KRAS G12C PDX characterization: oncoprint, dose-response curves, IC50 violin plots, tumor growth
PMC12765001, Rosen et al., British Journal of Cancer (2025)

Panel A, Oncoprint: Genetic alterations across KRAS G12C PDX models. KRAS mutated in 100% of samples; co-mutations in TP53, KEAP1, CDKN2A. Color indicates alteration type (missense, amplification, frameshift). Top bars annotate XDO establishment success and AZD4625 sensitivity.

Panel B, Dose-response: Sotorasib sensitivity in KRAS G12C (red, n=6) vs KRAS WT (black, n=12) organoids. G12C lines show significantly lower IC50 values.

Panel D, IC50 violin plots: Distribution of IC50 for AZD4625 vs Sotorasib. Wider spread for Sotorasib; XDO344 shows high IC50 indicating resistance.

Panel E, Tumor growth: PHLC207 in vivo. Vehicle arms show exponential growth. Both Sotorasib and AZD4625 (100 mg/kg) achieve sustained tumor regression.

[6.6s, Vision LM]

Real output from ask_image on a multi-panel figure from a KRAS G12C resistance paper. The model parses each panel—oncoprint, dose-response curves, violin plots, tumor growth—and extracts specific data points and findings.

sql

Another major motif in literature search is aggressive filtering. This can be hard to do over a filesystem, but it's something SQL was designed for. Common APIs often wrap these functions into pre-set queries. We figured it would be best for agents to just have direct access to the underlying metadata table.

SELECT title, doi, source
FROM documents
WHERE authors ILIKE '%Doudna%'
ORDER BY pub_date DESC
LIMIT 5

title                           source
Programmable RNA editing…       pmc
CRISPR-Cas9 gene drives…        biorxiv
Base editing in human cells…    pmc

Read-only SQL against the full metadata table. 15-second timeout, 200-row limit. Filter by author, journal, date, source, keywords—anything semantic search isn't precise enough for.
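
As a rough illustration of a read-only query path with a row cap, the sketch below uses Python's built-in sqlite3 module with an in-memory table. The backend Paperclip actually uses is not stated, so everything here is an assumption: `run_readonly_query` is a hypothetical helper, the rows are made up, and because SQLite has no ILIKE the query uses LIKE, which SQLite treats as case-insensitive for ASCII by default.

```python
import sqlite3

ROW_LIMIT = 200  # mirrors the 200-row limit described above


def run_readonly_query(conn, sql, params=()):
    """Run a SELECT and return at most ROW_LIMIT rows.

    Rejects anything that is not a SELECT, as a crude read-only guard.
    """
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are allowed")
    cursor = conn.execute(sql, params)
    return cursor.fetchmany(ROW_LIMIT)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (title TEXT, doi TEXT, source TEXT, authors TEXT, pub_date TEXT)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?, ?)",
    [
        ("Example paper A", "10.0000/a", "pmc", "Smith J; Doe A", "2024-01-15"),
        ("Example paper B", "10.0000/b", "biorxiv", "Doe A; Lee K", "2025-03-02"),
        ("Example paper C", "10.0000/c", "pmc", "Lee K", "2023-07-30"),
    ],
)

rows = run_readonly_query(
    conn,
    "SELECT title, doi, source FROM documents "
    "WHERE authors LIKE ? ORDER BY pub_date DESC LIMIT 5",
    ("%doe%",),
)
print(rows)
```

Passing the author pattern as a bound parameter rather than formatting it into the SQL string keeps the example safe against injection, which matters when an agent composes the query text.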

from

We've read thousands of agent rollouts for literature search APIs. Probably the biggest limitation is that most search APIs are stateless—each function exists independent of the agent's context. This stunts agentic exploration and tunneling.

To address this, we store every intermediate search result in our cloud-hosted database. You can reference them easily using --from, which performs the next action only on the subset of previously retrieved papers.

search "KRAS G12C" -n 30          → 30 papers, stored server-side
grep "resistance" --from prev     → 17 match
grep "CIC|NFκB" --from prev       → 4 match
map --from prev "mechanism?"      → 30 answers

The result set is stored server-side. Every --from operates on the same 30 papers.
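
A toy model of this behavior, assuming nothing about the real server: `ResultStore`, its methods, and the tiny corpus below are hypothetical, and the handle format merely imitates the `s_…` IDs shown in the search output.

```python
import re
import uuid


class ResultStore:
    """Toy server-side store mapping a handle to a saved list of paper IDs."""

    def __init__(self, corpus):
        self.corpus = corpus  # paper_id -> full text
        self.saved = {}       # handle -> list of paper IDs

    def _save(self, paper_ids):
        handle = "s_" + uuid.uuid4().hex[:8]
        self.saved[handle] = list(paper_ids)
        return handle

    def search(self, term):
        """Naive substring search over the whole corpus; returns a new handle."""
        hits = [p for p, text in self.corpus.items() if term.lower() in text.lower()]
        return self._save(hits)

    def grep(self, pattern, from_handle):
        """Regex filter restricted to the papers saved under from_handle."""
        regex = re.compile(pattern)
        return [p for p in self.saved[from_handle] if regex.search(self.corpus[p])]


# Made-up corpus for illustration only.
corpus = {
    "paper_1": "KRAS G12C resistance via secondary mutations",
    "paper_2": "KRAS G12C resistance and CIC loss",
    "paper_3": "CIC in neural development",
    "paper_4": "KRAS G12C inhibitor pharmacokinetics",
}
store = ResultStore(corpus)
handle = store.search("KRAS G12C")

print(store.grep("resistance", handle))  # ['paper_1', 'paper_2']
print(store.grep("CIC", handle))         # ['paper_2']
```

Note that paper_3 mentions CIC but never comes back: each grep is confined to the saved set, and the set itself is unchanged by the greps, so later calls still see all three saved papers.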

We've been using Paperclip internally for weeks now, and agents take to it immediately. The best way to experience it is to try it yourself, but here are a couple of fun examples:

Example 1: KRAS G12C Inhibitors

KRAS G12C inhibitors (sotorasib, adagrasib) were approved for non-small cell lung cancer, but most patients develop resistance. Map out the specific mutations and pathway alterations driving resistance, how frequently each appears, and whether there's a dominant mechanism.

The canonical mechanisms show up in every review: secondary KRAS mutations (Y96D, H95D, Q99L), RTK bypass via MET/EGFR, and PI3K/mTOR reactivation. These tend to be well covered in most agent outputs. The harder question is whether an agent can surface findings that are published but not prominent. One such finding: CIC (Capicua), a tumor suppressor whose loss-of-function drives resistance through NFκB reactivation. It's described in a 2025 CRISPR screen paper and present in 24 of 50 corpus papers, but absent from most review-level summaries.

We gave the same agent the same question with two different tool configurations: a standard 3-tool search API (paper_search, read_paper, full_text_search) and Paperclip. Both used the same model, backend, and corpus. Here's the shape of each exploration:

Standard API (stateless: each call is independent)

  27 papers returned
  Reads 5 papers from top of results
  Extracted (canonical): Y96D, H95D, Q99L, MET amp, RTK bypass, PI3K/mTOR
  paper_search("Y96D H95D Q99L")    → new set, old results gone: 24 different papers, CIC not here
  Final: Y96D, H95D, Q99L, KEAP1, RTK bypass
  No CIC. Key paper never found.

Paperclip (stateful: --from keeps the result set)

  27 papers returned, handle persists
  grep "resistance" --from prev      → 17 match ("CIC mentioned, new to me")
  grep "CIC" --from prev             → 15 match
  grep "CIC mutation" --from prev    → 1 match: bio_d7a242096fc0
  CIC mutations → resistance via NFκB
  Key paper found.

The key difference is statefulness. The standard API agent's two searches return independent result sets with no connection between them. There's no way to say "search for CIC within papers I already found about KRAS resistance." The Paperclip agent holds a single results_id throughout the session. Every grep operates on the same 27 papers. When it notices CIC while reading, it tunnels back into that same set with grep "CIC" --from <previous search> and confirms the entity appears in 15 of those 27 papers. That's the loop that produces discovery: broad search, scan, read, notice, narrow within the same context, confirm, read the source.

Example 2: AlphaFold Failure Modes

Map out where AlphaFold systematically fails. Which specific proteins or structural features cause failures? What are the root causes—training data, architecture, or something else? I want concrete examples with protein names, not just broad categories.

The canonical failures are well known and well covered in the literature: intrinsically disordered regions, misleading pLDDT confidence scores, MSA depth dependence. The deeper findings are harder to reach: XCL1 (lymphotactin), a fold-switching protein that adopts two completely different stable structures, of which AlphaFold confidently predicts only one; β-solenoid hallucination, where AF2 produces confident but unrealistic repeat structures; and adversarial invariance, where AF3 predictions remain unchanged despite destabilizing mutations. All are published findings buried in full-text papers, absent from review abstracts.

Same setup: the same agent with a standard API vs Paperclip.

Standard API (stateless: each search is independent)

  50 papers, 2 read
  6 searches, no shared context:
    paper_search("AF limitations failures accuracy")
    paper_search("disordered membrane dynamics")
    paper_search("fold switching p53 KaiB RfaH")        ← training data
    paper_search("membrane proteins GPCRs rhodopsin")
    full_text_search("BCCIP RfaH KaiB p53")             ← training data
    paper_search("training data bias memorization")
  Reads 2 papers
  Extracted (canonical): disordered regions, pLDDT, MSA, KaiB, RfaH
  Protein names from training data
  Final: KaiB, RfaH, pLDDT, MSA, disordered
  Canonical only. 8 calls, 2 papers read.
  No XCL1. No β-solenoid. No adversarial invariance.
  All 3 deep findings missed.

Paperclip (stateful: --from keeps the result set)

  50 papers, handle persists
  7 greps + 4 reads, 14 calls total:
    grep "fold switching" --from prev              → 8 match; reads → discovers XCL1 (lymphotactin)
    grep "intrinsically disordered" --from prev    → 12 match; reads → discovers β-solenoid hallucination
    grep "training data|MSA" --from prev           → 10 match; reads → discovers adversarial invariance
  Discovered from full text: XCL1, β-solenoid, adversarial invariance
  None of these appear in standard reviews
  All 3 deep findings found. 14 calls · 4 papers read · 7 greps on same set
  Stateful grep → XCL1, β-solenoid, adversarial invariance.

Same pattern as KRAS, but with three deep findings instead of one. The API agent ran 6 independent searches and read 2 papers, extracting canonical findings (KaiB, RfaH, disordered regions). It searched for "fold switching p53 KaiB RfaH" and "BCCIP RfaH KaiB p53 GA95", protein names that came from training data, not from anything it read. The Paperclip agent anchored 50 papers to a single handle and ran 7 greps against it. Grepping "fold switching" narrowed to 8 papers; reading one revealed XCL1. Grepping "intrinsically disordered" surfaced a paper on β-solenoid hallucination. Grepping "training data|MSA" led to the adversarial invariance finding. Both agents used comparable resources (14 vs 8 calls), but the exploration graphs tell different stories and led to different discoveries.

Use Paperclip today

# Install
curl -fsSL https://paperclip.gxl.ai/install.sh | bash

# Authenticate
paperclip login

# Verify
paperclip config

Or add it as an MCP server directly—no local install needed:

# Claude Code
claude mcp add --transport http paperclip https://paperclip.gxl.ai/mcp

Full documentation →