Self-hosted academic publication tracker.
Track Google Scholar profiles, discover new papers automatically, resolve open-access PDFs, and stay on top of the literature you care about.
Most researchers track new papers by manually checking Google Scholar, setting up email alerts, or juggling RSS feeds. Scholarr replaces all of that with a single self-hosted service:
- Add scholars once -- by profile URL, Scholar ID, or name search
- Publications appear automatically -- a background scheduler scrapes profiles on a configurable interval
- Open-access PDFs are resolved for you -- Unpaywall and arXiv are queried automatically when a DOI is found
- Everything is deduplicated -- each publication is stored once as a global record, even when several tracked scholars share it
- Your data stays yours -- fully self-hosted, export/import your entire library at any time
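
The deduplication point above can be illustrated with a minimal sketch. It assumes a publication is keyed by its DOI when one is known, and otherwise by a normalized title plus year. The names `fingerprint`, `Library`, and `ingest` are hypothetical and are not taken from Scholarr's codebase; the real matching rules may differ.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


def fingerprint(title: str, year: Optional[int], doi: Optional[str] = None) -> str:
    """Return a key that is identical for two listings of the same paper."""
    if doi:
        # DOIs are case-insensitive, so lowercase before comparing.
        return "doi:" + doi.strip().lower()
    # Fall back to a normalized title plus year when no DOI is known.
    normalized = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return f"title:{normalized}|{year}"


@dataclass
class Library:
    """Hypothetical in-memory stand-in for the publication tables."""

    publications: dict = field(default_factory=dict)  # key -> global record
    links: set = field(default_factory=set)           # (scholar_id, key) pairs

    def ingest(self, scholar_id: str, title: str, year: Optional[int],
               doi: Optional[str] = None) -> bool:
        """Store the paper once; return True only if the record is new."""
        key = fingerprint(title, year, doi)
        is_new = key not in self.publications
        if is_new:
            self.publications[key] = {"title": title, "year": year, "doi": doi}
        # The scholar-to-publication link is recorded either way.
        self.links.add((scholar_id, key))
        return is_new


lib = Library()
first = lib.ingest("scholar_a", "Attention Is All You Need", 2017)
second = lib.ingest("scholar_b", "Attention is all you need.", 2017)
```

Here the second call finds the existing record, so two co-authors end up linked to one publication rather than to two copies.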
| Feature | Description |
|---|---|
| Automated Ingestion | Background scheduler with configurable intervals, continuation queue, and multi-page pagination |
| Identifier Resolution | Cross-references arXiv, Crossref, and OpenAlex to gather DOIs, arXiv IDs, PMIDs |
| PDF Discovery | Resolves open-access PDFs via Unpaywall API and arXiv, with automatic retry queue |
| Scrape Safety | Rate limiting, cooldowns, and backoff strategies that reduce the risk of IP bans -- these are enforced minimums, not optional settings |
| Multi-User | Session-based auth, admin user management, user-scoped scholar tracking |
| Theming | 7 color presets with light/dark mode, tokenized component system |
| Import / Export | Portable scholar data with full publication and read-state preservation |
| Single Container | FastAPI backend + Vue 3 frontend ship as one Docker image |
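
To make the "Scrape Safety" row concrete, here is a rough sketch of a delay floor combined with capped exponential backoff. It assumes a configured delay is clamped to a hard minimum and that repeated failures widen the wait up to a cap. The constants and function names are illustrative assumptions, not Scholarr's actual settings or API.

```python
import random

MIN_DELAY_SECONDS = 2.0       # hypothetical floor; the real value may differ
MAX_BACKOFF_SECONDS = 3600.0  # hypothetical cap on any single wait


def effective_delay(configured: float, floor: float = MIN_DELAY_SECONDS) -> float:
    """Clamp a user-configured delay so it can never go below the floor."""
    return max(configured, floor)


def backoff_delay(base: float, consecutive_failures: int,
                  cap: float = MAX_BACKOFF_SECONDS, rng=None) -> float:
    """Pick a wait between the floored base and an exponentially growing ceiling."""
    base = effective_delay(base)
    # Double the ceiling per failure, but never exceed the cap or drop below base.
    ceiling = max(base, min(cap, base * 2 ** consecutive_failures))
    rng = rng or random
    # Random jitter keeps retries from landing on a fixed, predictable schedule.
    return rng.uniform(base, ceiling)


wait_after_three_failures = backoff_delay(5.0, 3)  # somewhere in [5.0, 40.0]
```

The design point is that a user can lengthen the delays freely, but a value below the floor is silently raised rather than honored.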
# 1. Clone and configure
git clone https://github.com/JustinZeus/scholarr.git
cd scholarr
cp .env.example .env
# 2. Set required secrets in .env
# POSTGRES_PASSWORD=<secure-password>
# SESSION_SECRET_KEY=<random-32-char-string>
# 3. Start
docker compose up -d
# 4. Open http://localhost:8000

To bootstrap an admin account on first run, add to .env:
BOOTSTRAP_ADMIN_ON_START=1
BOOTSTRAP_ADMIN_EMAIL=admin@example.com
BOOTSTRAP_ADMIN_PASSWORD=<secure-password>

graph LR
UI[Vue 3 Dashboard] <-->|REST + SSE| API[FastAPI]
API --> Scheduler[Scheduler]
Scheduler -->|Scrape HTML| Scholar[Google Scholar]
Scholar -->|Parse & Deduplicate| DB[(PostgreSQL)]
Scholar -.->|Identify| Ext[arXiv / Crossref / OpenAlex]
Ext --> DB
DB -->|DOIs| PDF[PDF Resolution]
PDF -->|Unpaywall / arXiv| DB
API <--> DB
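
The PDF resolution step in the diagram can be sketched as follows. This is a minimal example, assuming the public Unpaywall v2 endpoint (`https://api.unpaywall.org/v2/<doi>?email=...`) and its `best_oa_location` / `oa_locations` response fields. The helper names are hypothetical, and Scholarr's own resolver, including its retry queue, is likely more involved.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import urlopen

UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL; Unpaywall asks for a contact email on each request."""
    return UNPAYWALL_BASE + quote(doi.strip(), safe="/") + "?" + urlencode({"email": email})


def pick_pdf_url(record: dict) -> Optional[str]:
    """Prefer the best OA location, then fall back to any location with a PDF."""
    best = record.get("best_oa_location") or {}
    if best.get("url_for_pdf"):
        return best["url_for_pdf"]
    for location in record.get("oa_locations") or []:
        if location.get("url_for_pdf"):
            return location["url_for_pdf"]
    return None


def arxiv_pdf_url(arxiv_id: str) -> str:
    """arXiv serves PDFs at a predictable path derived from the identifier."""
    return f"https://arxiv.org/pdf/{arxiv_id}"


def resolve_pdf(doi: str, email: str, timeout: float = 10.0) -> Optional[str]:
    """Query Unpaywall over the network and return a PDF URL, if any."""
    with urlopen(unpaywall_url(doi, email), timeout=timeout) as response:
        return pick_pdf_url(json.load(response))


sample = {"best_oa_location": None,
          "oa_locations": [{"url_for_pdf": None},
                           {"url_for_pdf": "https://example.org/paper.pdf"}]}
found = pick_pdf_url(sample)
```

Keeping `pick_pdf_url` separate from the network call makes the selection logic easy to test against saved responses without touching the API.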
| Layer | Technology |
|---|---|
| Backend | Python 3.12, FastAPI, SQLAlchemy 2.0 (async), Alembic |
| Frontend | TypeScript, Vue 3, Vite, Tailwind CSS |
| Database | PostgreSQL 15 |
| Infrastructure | Multi-stage Docker, Docker Compose |
Full documentation: justinzeus.github.io/scholarr
| Section | Covers |
|---|---|
| User Guide | Installation, configuration, all environment variables |
| Developer Guide | Architecture, local dev, contributing, testing |
| Operations | Deployment, database runbook, scrape safety |
| API Reference | Envelope spec, all endpoints, DTO contracts |
Scholarr uses conventional commits and semantic versioning. See the contributing guide for PR process and code standards.
# Dev environment
docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build
# Run tests (always in containers)
docker compose -f docker-compose.yml -f docker-compose.dev.yml run --rm app \
  python -m pytest

See LICENSE for details.
