I read a lot of research papers for work. My workflow evolved around an ever-growing inbox of bookmarked papers from arXiv et al. That's great for exploration, but it makes it hard to keep track of what I've actually read.
Distillate bridges the tools I already use: Zotero (literature management), reMarkable (reader + highlighter), and Obsidian (notes). It automates the whole pipeline:
  $ distillate
  save to Zotero ──> auto-syncs to reMarkable
                                │
                   read & highlight on tablet
                  just move to Read/ when done
                                │
                                V
                  auto-saves notes + highlights
It polls Zotero for new papers, uploads PDFs to the reMarkable via rmapi, then watches for papers you've finished reading in your Read folder. When it finds one, it:
- Parses .rm files using rmscene to extract highlighted text (GlyphRange items)
- Searches for that text in the original PDF using PyMuPDF and adds highlight annotations
- Enriches metadata from Semantic Scholar (publication date, venue, citations)
- Creates a structured markdown note with metadata, highlights grouped by page, and the annotated PDF (I keep mine in an Obsidian vault)
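For readers curious about the PDF annotation step, here is a minimal sketch of what it might look like with PyMuPDF. It assumes the highlights have already been extracted as `(page_index, text)` pairs and that an exact text search is good enough; the names `group_by_page` and `annotate_pdf` are hypothetical and are not taken from Distillate's code, whose real matching is fuzzier, as described further down.

```python
from collections import defaultdict


def group_by_page(highlights):
    """Group (page_index, text) pairs into {page_index: [text, ...]}."""
    grouped = defaultdict(list)
    for page_index, text in highlights:
        grouped[page_index].append(text)
    return dict(grouped)


def annotate_pdf(src_path, dst_path, highlights):
    """Highlight each snippet on its page and save a copy of the PDF.

    Returns the (page_index, text) pairs that exact search could not find.
    """
    import fitz  # PyMuPDF; imported lazily so group_by_page works without it

    missing = []
    doc = fitz.open(src_path)
    try:
        for page_index, texts in group_by_page(highlights).items():
            page = doc[page_index]
            for text in texts:
                # quads=True returns one quadrilateral per matched line,
                # which suits text that wraps across lines.
                quads = page.search_for(text, quads=True)
                if quads:
                    page.add_highlight_annot(quads)
                else:
                    missing.append((page_index, text))
        doc.save(dst_path)
    finally:
        doc.close()
    return missing
```

A call such as `annotate_pdf("paper.pdf", "paper-annotated.pdf", [(0, "some highlighted sentence")])` would write the annotated copy and return whatever it failed to locate, which is where a fuzzier fallback would take over.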
The core workflow just needs Zotero and a reMarkable — no paid APIs, no cloud backend, your notes stay on your machine. Optional extras if you plug them in:
- AI summaries via Claude (one-liner + key learnings from your highlights)
- Daily reading suggestions from your queue
- Weekly email digest via Resend
- Obsidian Bases database for tracking your reading
Stack: rmapi for reMarkable Cloud, rmscene for .rm parsing, PyMuPDF for PDF annotation. Python 3.10+, pip installable.
The trickiest part was highlight extraction: reMarkable stores highlighted text as GlyphRange items in a scene tree, and matching that text back to positions in the original PDF required fuzzy search with OCR cleanup, plus special merging logic for cases like highlights that span a page break. Happy to say it works well ~99% of the time now.
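To make the matching problem concrete, here is a rough sketch of one way to locate a highlight in page text despite ligatures, hyphenated line breaks, and small character errors, using only the standard library. This is not Distillate's actual algorithm: `normalize`, `fuzzy_find`, `merge_cross_page`, the 0.85 threshold, and the "no sentence-ending punctuation" heuristic are all assumptions made for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Fold ligatures, rejoin hyphenated line breaks, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)  # folds ligatures such as U+FB01
    text = re.sub(r"-\s*\n\s*", "", text)       # rejoin words split at a line end
    return re.sub(r"\s+", " ", text).strip().lower()


def fuzzy_find(needle, haystack, threshold=0.85):
    """Return (start, end, score) for the best approximate match, or None.

    Offsets index into the normalized haystack, not the original string.
    """
    needle, haystack = normalize(needle), normalize(haystack)
    if not needle or len(haystack) < len(needle):
        return None
    best = None
    # Brute-force sliding window: simple, but quadratic in the text length.
    for start in range(len(haystack) - len(needle) + 1):
        window = haystack[start:start + len(needle)]
        score = SequenceMatcher(None, needle, window, autojunk=False).ratio()
        if best is None or score > best[2]:
            best = (start, start + len(needle), score)
    return best if best[2] >= threshold else None


def merge_cross_page(highlights):
    """Join a highlight that ends mid-sentence with the first one on the next page.

    `highlights` is a list of (page, text) pairs in reading order.
    """
    merged = []
    last_page = None
    for page, text in highlights:
        first_on_next_page = last_page is not None and page == last_page + 1
        if first_on_next_page and not merged[-1][1].rstrip().endswith((".", "!", "?")):
            start_page, prev_text = merged[-1]
            merged[-1] = (start_page, prev_text.rstrip() + " " + text.lstrip())
        else:
            merged.append((page, text))
        last_page = page
    return merged
```

The sliding window is deliberately naive. A real implementation would likely narrow the search to candidate positions first, and would need to map normalized offsets back to coordinates in the PDF.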
Install: pip install distillate && distillate --init
Code: https://github.com/rlacombe/distillate
Site: https://distillate.dev
I built this for myself but would love feedback, especially from other reMarkable + Zotero users. What's missing from your workflow? What else should I add?
Breakdown of traffic from my Sol LeWitt Show HN post last week. 69 points, ~60k visitors, 10-hour spike, referral spread analysis.
by cdelmonte ·
Short comparative note on Dijkstra’s Shunting Yard (1961) and Pratt’s Top-Down Operator Precedence (1973), walking through both on the same expression.
Link: https://cdelmonte.medium.com/two-algorithms-one-intuition-shunting-yard-and-pratt-parsing-f561a1dad6ac