Case Study

Build a Literature Monitor

Transform passive searching into active intelligence.
Detect new breakthroughs the moment they appear and keep your users ahead of the curve.

The Discovery

Imagine needing to stay on top of a fast-moving field like "graphene". Checking every journal by hand is impractical, but you can automate the task by calling ScholarAPI's /search endpoint periodically.

This creates a self-updating feed. Use the q parameter to specify your topic (q=graphene) and append indexed_after with the timestamp of your last check to retrieve only the fresh papers added since then.

The response is a JSON object with a list of metadata records ranked by relevance. Capped at 50 best matches, every response is a compact delta you can push straight into user alerts or internal data stores.

GET Request

/api/v1/search?q=graphene&indexed_after=2025-01-15T10:30:05Z

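import requests
from datetime import datetime, timedelta, timezone

# 1. Calculate the check time in UTC, formatted like the sample request above
yesterday = (datetime.now(timezone.utc) - timedelta(days=1)).strftime("%Y-%m-%dT%H:%M:%SZ")

# 2. Query for items indexed since then
resp = requests.get(
    "https://scholarapi.net/api/v1/search",
    params={
        "q": "graphene",
        "indexed_after": yesterday,
        "limit": 20
    },
    headers={"X-API-Key": "YOUR_KEY"},
    timeout=30
)
resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body

print(f"Found {len(resp.json()['results'])} new papers")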
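The snippet above looks back a fixed one day, whereas the text describes passing the timestamp of your last check. Below is a minimal sketch of remembering that timestamp between runs. The file name last_check.txt and the helper names are assumptions for illustration and are not part of ScholarAPI; the timestamp format mirrors the sample request.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical location for the saved timestamp (not defined by ScholarAPI).
STATE_FILE = Path("last_check.txt")


def format_timestamp(moment: datetime) -> str:
    """Render a datetime in UTC, e.g. 2025-01-15T10:30:05Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def load_last_check(path: Path, default: datetime) -> str:
    """Return the stored timestamp, or the formatted default on the first run."""
    if path.exists():
        stored = path.read_text().strip()
        if stored:
            return stored
    return format_timestamp(default)


def save_last_check(path: Path, moment: datetime) -> None:
    """Persist the time of the check that just completed."""
    path.write_text(format_timestamp(moment))
```

A cautious pattern is to note the current time before sending the request and save it only after the response has been handled, so papers indexed while the call is in flight are picked up by the next run.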

Continuous Ingestion

For critical applications, like competitive intelligence or systematic reviews, missing even a single paper can be costly. The sequential /list feed is designed for exhaustive coverage, sorting results by indexed_at rather than relevance.

By calling /list and using the newest indexed_at value as a cursor for the indexed_after parameter in your next request, you can pick up exactly where you left off. As long as you keep that cursor, the result is a gap-free ingestion pipeline that catches every match, even if your process restarts.

import requests

def process(records):
    # Placeholder: replace with your own handling (store, alert, etc.)
    for record in records:
        print(record["id"], record["title"])

last = None  # no cursor yet: start from the beginning of the feed

while True:
    resp = requests.get(
        "https://scholarapi.net/api/v1/list",
        params={"q": "graphene", "indexed_after": last},  # requests omits None values
        headers={"X-API-Key": "YOUR_KEY"},
        timeout=30
    )
    resp.raise_for_status()

    results = resp.json()["results"]
    if not results:
        break  # caught up: nothing newer than the cursor

    process(results)
    last = results[-1]["indexed_at"]  # newest record becomes the next cursor
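The /list loop in this guide keeps its cursor in memory, so a restart would begin from scratch. The sketch below shows one way to make the cursor survive restarts by writing it to disk after each processed batch. The function name ingest, the cursor file layout, and the injected fetch_page callable are assumptions for illustration. It also assumes that /list returns records oldest first and that indexed_after excludes the cursor value itself, which the guide's loop implies but does not state.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def ingest(fetch_page: Callable[[Optional[str]], list],
           handle: Callable[[list], None],
           cursor_file: Path) -> Optional[str]:
    """Drain the feed, saving the cursor after every processed batch.

    fetch_page(cursor) is expected to call /list with indexed_after=cursor
    and return the "results" list (hypothetical wrapper, supplied by you).
    """
    cursor = None
    if cursor_file.exists():
        # Resume from the last batch that was fully handled.
        cursor = json.loads(cursor_file.read_text())["indexed_after"]

    while True:
        results = fetch_page(cursor)
        if not results:
            return cursor  # caught up

        next_cursor = results[-1]["indexed_at"]
        if next_cursor == cursor:
            return cursor  # no progress: stop rather than loop forever

        handle(results)
        cursor = next_cursor
        # Save only after the batch was handled, so a crash replays it.
        cursor_file.write_text(json.dumps({"indexed_after": cursor}))
```

Because the cursor is written only after handle returns, a crash in the middle of a batch leads to that batch being fetched again, so handle should tolerate seeing the same record twice (for example by keying on the record id).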

Anatomy of a Record

Reliable automation requires structured data. ScholarAPI normalizes the diverse metadata formats used by different journals into one common structure.

Publication Record

{
  "id": "36f5a2",
  "title": "Spin-valley protected Kramers pair in bilayer graphene",
  "authors": ["Denisov, A.", "Reckova, V.", "Cances, S."],
  "abstract": "The intrinsic valley degree of freedom makes graphene...",
  "journal": "Nature Nanotechnology",
  "journal_issue": "20",
  "journal_pages": "494-499",
  ...

The returned records contain comprehensive metadata, typically including the publication title, the list of authors, the abstract, and the publishing journal. ScholarAPI aims to collect information that is as complete and detailed as possible. It also assigns each record a stable internal identifier (id).

Secondary Fields

  ...
  "doi": "10.1038/s41586-023-01234-x",
  "url": "https://www.nature.com/articles/s41565-025-01858-8",
  "has_text": true,
  "published_date": "2025-02-10",
  "indexed_at": "2025-03-25T08:30:00.000Z"
}

Additional fields like doi, url, published_date, indexed_at, and has_text indicate where the document was found, its freshness, and whether the full text is available via the /text and /pdf endpoints.
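To work with these records in typed code, the fields shown in the samples above can be mapped onto a small data class. This is a sketch only: the class name Publication is hypothetical, and which fields may be absent is an assumption, since the guide does not say which ones are guaranteed.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Publication:
    # Field names follow the sample record; optionality is an assumption.
    id: str
    title: str
    authors: list[str] = field(default_factory=list)
    abstract: Optional[str] = None
    journal: Optional[str] = None
    doi: Optional[str] = None
    url: Optional[str] = None
    has_text: bool = False
    published_date: Optional[str] = None
    indexed_at: Optional[str] = None

    @classmethod
    def from_record(cls, record: dict) -> "Publication":
        """Build a Publication, ignoring keys this class does not model."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in record.items() if k in known})
```

Ignoring unknown keys keeps the parser from breaking if a response carries fields the class does not model, such as journal_issue or journal_pages in the sample.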

The Full Text

ScholarAPI goes beyond abstracts. The /pdf endpoint provides the full-text PDF, which you can deliver to end users or feed into your own processing pipelines.

Crucially, ScholarAPI also extracts clean plain text and makes it available via the /text endpoint. This unlocks powerful capabilities for your tool, such as automatically detecting key facts, tracking citations, running analytics, and summarizing findings with LLMs, all without a custom PDF conversion pipeline.

Both endpoints accept the record ID, so you can call them immediately after receiving a search hit.

Plain Text (NLP ready)

GET /api/v1/text/{id}

Original PDF (full binary)

GET /api/v1/pdf/{id}

Command-line sample

$ curl -s https://scholarapi.net/api/v1/text/36f5a2 -H "X-API-Key: ..."

  Spin-valley protected Kramers pair in bilayer graphene
  Artem O. Denisov, Veronika Reckova, Solenn Cances
  
  The intrinsic valley degree of freedom makes bilayer graphene (BLG) a
  unique platform for semiconductor qubits. The single-carrier quantum dot
  (QD) ground state exhibits a twofold degeneracy, where the two states that
  constitute a Kramers pair have opposite spin and valley quantum numbers
  ......
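The same call can be made from Python right after a search hit arrives. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes the /text endpoint returns UTF-8 plain text, which the sample output suggests but the guide does not state.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://scholarapi.net/api/v1"


def text_url(record_id: str) -> str:
    """URL of the plain-text version of a record."""
    return f"{BASE_URL}/text/{quote(record_id, safe='')}"


def pdf_url(record_id: str) -> str:
    """URL of the original PDF of a record."""
    return f"{BASE_URL}/pdf/{quote(record_id, safe='')}"


def records_with_text(results: list) -> list:
    """Keep only the hits whose has_text flag is true."""
    return [record for record in results if record.get("has_text")]


def fetch_text(record_id: str, api_key: str) -> str:
    """Download the extracted text (assumes a UTF-8 response body)."""
    request = urllib.request.Request(
        text_url(record_id), headers={"X-API-Key": api_key}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Filtering on has_text first avoids requesting text for abstract-only records.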

Advanced Filtering

As your monitoring scales up, precision becomes paramount. Refine your feed to capture exactly what matters: use quoted phrases "..." for exact matches, or multiple q parameters in a single API call for broad OR logic.

If your use case requires analyzing the complete document content, set has_text to true. This ensures your feed only includes records where the full text is available, skipping abstract-only entries that ScholarAPI found while indexing hybrid (partly open access) journals.

Exact Phrase

Find specific multi-word concepts.

/search?q="large language models"

Compound Logic

Combine distinct topics (OR logic).

/list?q=graphene&q="carbon nanotubes"

Full Text Only

Only include records with full text.

/list?q="quantum dots"&has_text=true
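In code, quoted phrases and repeated q parameters need proper URL encoding. The sketch below builds such a query string with the standard library. The helper names phrase and build_query are hypothetical, and sending has_text as the literal string true follows the examples in this guide.

```python
from typing import Optional
from urllib.parse import urlencode


def phrase(words: str) -> str:
    """Wrap a multi-word concept in double quotes for exact matching."""
    return f'"{words}"'


def build_query(topics: list,
                full_text_only: bool = False,
                indexed_after: Optional[str] = None) -> str:
    """Encode one q parameter per topic (OR logic) plus optional filters."""
    params = [("q", topic) for topic in topics]
    if full_text_only:
        params.append(("has_text", "true"))
    if indexed_after:
        params.append(("indexed_after", indexed_after))
    return urlencode(params)


# Example: graphene OR the exact phrase "carbon nanotubes", full text only.
query = build_query(["graphene", phrase("carbon nanotubes")], full_text_only=True)
```

If you use requests, as the other samples in this guide do, the same list of (key, value) tuples can be passed directly as params, and requests will emit one q parameter per tuple.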
