by galfrevn ·
Hey HN,
I've been building Kraken for the past month — an open source autonomous dev agent that runs entirely in your terminal.
The architecture is a three-process system: a Rust scheduler (cron + file watchers), a Go LLM gateway (supports OpenAI, Anthropic and OpenRouter), and a TypeScript/React TUI built with OpenTUI. All three communicate over ConnectRPC on localhost.
A few things I wanted to get right from the start:
- Model-agnostic: uses a custom XML-based tool-calling protocol instead of native provider APIs, so it works the same regardless of the LLM
- Plugin system: plugins implement a KrakenPlugin interface from the SDK, can register tools, hook into the agent lifecycle and extend the system prompt
- No cloud: all state lives in a local SQLite file
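For illustration, parsing such an XML envelope out of a raw completion might look like the sketch below. The tag names (`<tool_call>`, `<name>`, `<arg>`) are my own guesses, not Kraken's actual wire format:

```typescript
// Hypothetical sketch: extract an XML-style tool call from a model's
// plain-text completion. Because the envelope is just text the model is
// asked to emit, the same parser works against any provider, with no
// dependence on native function-calling APIs.
interface ToolCall {
  name: string;
  args: Record<string, string>;
}

function parseToolCall(completion: string): ToolCall | null {
  const block = completion.match(/<tool_call>([\s\S]*?)<\/tool_call>/);
  if (!block) return null; // model replied with plain prose

  const name = block[1].match(/<name>(.*?)<\/name>/)?.[1];
  if (!name) return null;

  // Collect every <arg key="...">value</arg> pair inside the envelope.
  const args: Record<string, string> = {};
  for (const m of block[1].matchAll(/<arg key="(.*?)">([\s\S]*?)<\/arg>/g)) {
    args[m[1]] = m[2];
  }
  return { name, args };
}

const reply = `Sure, let me check that file.
<tool_call><name>read_file</name><arg key="path">src/main.rs</arg></tool_call>`;

const parsed = parseToolCall(reply);
// parsed.name === "read_file", parsed.args.path === "src/main.rs"
```

The gateway would then describe this format in the system prompt and route the parsed call to the matching registered tool.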
It's early and a lot is still missing (docs, tests, CI). I'm sharing it now because I'd rather get feedback from people who know what they're doing than polish it in private.
Repo: github.com/galfrevn/kraken
Happy to answer questions about the architecture or design decisions.
by CostsentryAI ·
Anyone else currently having this problem, or at least run into it before?
I wanted a way to preserve Claude Code sessions. Once a session ends, the conversation is gone — no searchable history, no way to trace back why a decision was made in a specific PR.
The idea is simple: one GitHub Issue per session, automatically linked to a GitHub Projects board. Every prompt and response gets logged as issue comments with timestamps. Since issues can reference PRs, you get full context tracing between conversations and actual code changes.
npx claude-session-tracker
The installer handles everything: creates a private repo, sets up a Projects board with status fields, and installs Claude Code hooks globally. It requires the gh CLI; if it's missing, the installer detects that and walks you through setup.
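A rough sketch of how that install pass could be planned as a pure decision, with made-up step names rather than the tool's actual internals:

```typescript
// Hypothetical sketch of an idempotent installer plan: observe what
// already exists, return only the missing steps. Step names are
// illustrative, not claude-session-tracker's real code.
interface InstallState {
  ghInstalled: boolean;    // would come from trying `gh --version`
  repoExists: boolean;     // e.g. `gh repo view` on the tracking repo
  hooksInstalled: boolean; // Claude Code hook config already present
}

function planInstall(state: InstallState): string[] {
  if (!state.ghInstalled) {
    // Stop here and walk the user through gh setup before anything else.
    return ["prompt-gh-setup"];
  }
  const steps: string[] = [];
  if (!state.repoExists) {
    steps.push("create-private-repo", "create-projects-board");
  }
  if (!state.hooksInstalled) steps.push("install-claude-hooks");
  // Everything present already: re-running is a no-op, no duplicates.
  return steps;
}
```

Separating "observe" from "act" like this is what makes re-runs safe: a second run sees everything in place and returns an empty plan.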
Why GitHub, not Notion/Linear/Plane?
I actually built integrations for all three first. Linking sessions back to PRs was never smooth on any of them, but the real dealbreaker was API rate limits. This fires on every single prompt and response — essentially a timeline — so rate limits meant silently dropped entries. I shipped all three, hit the same wall each time, and ended up ripping them all out. GitHub Issues via the gh CLI against your own repo has no practical rate limit for this workload. (GitLab would be interesting to support eventually.)
Design decisions
- No MCP: I didn't want to consume context-window tokens for session tracking, so everything runs through Claude Code's native hook system.
- Fully async: all hooks fire asynchronously, with zero impact on Claude's response latency.
- Idempotent installer: re-running just reuses existing config. No duplicates.
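In Node, the fire-and-forget shape behind "fully async" could look roughly like this; the gh invocation and function names are illustrative, not the project's actual code:

```typescript
import { spawn } from "node:child_process";

// Illustrative only: log one prompt/response as a comment on the
// session's issue via the gh CLI, without blocking the hook process.
function buildCommentArgs(issueNumber: number, body: string): string[] {
  return ["issue", "comment", String(issueNumber), "--body", body];
}

function logToIssue(issueNumber: number, body: string): void {
  // Detached + unref'd: the hook can exit immediately while the gh call
  // finishes (or fails) in the background, off Claude's critical path.
  const child = spawn("gh", buildCommentArgs(issueNumber, body), {
    detached: true,
    stdio: "ignore",
  });
  child.unref();
}
```

The trade-off of detaching is that a failed write is silent, which is acceptable here because a dropped comment only loses one timeline entry, not code.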
What it tracks
- Creates an issue per session, linked to your Projects board
- Logs every prompt/response with timestamps
- Auto-updates issue title with latest prompt for easy scanning
- `claude --resume` reuses the same issue
- Auto-closes idle sessions (30 min default)
- Pause/resume for sensitive work
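The idle auto-close above boils down to a timestamp comparison swept periodically; a sketch with illustrative names:

```typescript
// Sketch of the idle-timeout decision (30 min default). The session
// shape and sweep are my own illustration, not the tool's internals.
const IDLE_MS = 30 * 60 * 1000;

function isIdle(lastActivityMs: number, nowMs: number, idleMs = IDLE_MS): boolean {
  return nowMs - lastActivityMs >= idleMs;
}

const now = Date.now();
const sessions = [
  { issue: 12, lastActivity: now - 45 * 60 * 1000 }, // 45 min ago: idle
  { issue: 13, lastActivity: now - 5 * 60 * 1000 },  // 5 min ago: active
];

// A periodic sweep would close the issue for each idle session.
const toClose = sessions.filter((s) => isIdle(s.lastActivity, now));
// toClose contains only the session for issue 12
```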
I've been experimenting with running an LLM not as a chatbot but as the core runtime of a business system, and I'm curious how others approach this.
The idea is that the model doesn't just answer questions but orchestrates tools and interacts with real application logic.
The architecture I'm currently testing includes:
Runtime
- tool orchestration
- parallel tool execution
- loop detection
- circuit breaker / timeout guards
- token budgeting

Context
- context compression
- dynamic token ceiling

Caching
- deterministic LLM response cache
- semantic cache using pgvector

Memory
- short-term session memory
- longer-term semantic memory

Evaluation
- prompt evaluation set to test tool reasoning and failures

I'm trying to figure out which parts are actually necessary in production and which ones are over-engineering.
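To make the caching layer concrete: a deterministic response cache can hash a canonical form of the request, assuming sampling is pinned (temperature 0). Names here are hypothetical, not from the poster's system:

```typescript
import { createHash } from "node:crypto";

// Sketch of a deterministic response cache: identical (model, params,
// messages) → identical key → cache hit, no provider call. Only safe
// when sampling is pinned; otherwise a hit replays one sample forever.
interface LlmRequest {
  model: string;
  temperature: number;
  messages: { role: string; content: string }[];
}

function cacheKey(req: LlmRequest): string {
  // JSON.stringify is field-order-sensitive, so build the key from a
  // canonical ordering rather than hashing the raw object.
  const canonical = JSON.stringify([
    req.model,
    req.temperature,
    req.messages.map((m) => [m.role, m.content]),
  ]);
  return createHash("sha256").update(canonical).digest("hex");
}

const cache = new Map<string, string>();

function cachedComplete(req: LlmRequest, call: (r: LlmRequest) => string): string {
  const key = cacheKey(req);
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // exact repeat: skip the API
  const out = call(req); // stand-in for the real provider call
  cache.set(key, out);
  return out;
}
```

A semantic cache (the pgvector layer) swaps the exact hash lookup for a nearest-neighbor search over request embeddings with a distance threshold, trading exactness for hits on paraphrased queries.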
For people building LLM systems beyond simple chat interfaces:
- How do you handle tool orchestration?
- Do you implement memory layers, or just rely on context?
- Are semantic caches worth it in practice?

Curious to hear how others structure this.
Hey HN, I built this because I kept opening Instagram to reply to a DM and resurfacing 30–40 minutes later deep in Reels. The problem wasn't willpower, it was that Instagram puts the addictive stuff between you and the useful stuff.
Mindful Instagram replaces the Instagram home page with a quiet overlay: an inspirational quote, your DM notifications (fetched from Instagram's own API), a daily open counter, and a 1-minute breathing exercise. Messages, Create, and Profile are always accessible. Everything else — Reels, Explore, Search, Suggested Posts — is permanently blocked.
The core mechanic is intentional friction:
- Home Feed, Stories, and Notifications can each be toggled on, but only 3 times per day. After that, they stay off until tomorrow.
- Every toggle-on has a 5-second cooldown before it activates. Just enough time to ask yourself "do I actually need this?"
- A 10-minute reminder and 20-minute auto-off prevent runaway sessions.
- Focus Lock disables all toggles for 24 hours. Irreversible. For the days you really need it.

Technical notes for the curious:
- Manifest V3, vanilla JS, zero dependencies. The entire extension is ~300 lines of content script + a popup.
- Zero data collection. No analytics, no telemetry, no external API calls beyond Instagram's own endpoints. Settings sync via chrome.storage.sync. That's it.
- The UI is inspired by Teenage Engineering's hardware design language: beige chassis, monospace type, orange accents, hardware-switch-style toggles. I wanted it to feel like a physical device, not another app competing for your attention.
- DM notifications are fetched from Instagram's inbox API endpoint and displayed as names on the overlay, so you can see who messaged you without touching the feed.

This is v3.3. I'm a solo builder and this started as a tool for myself. It's free, no premium tier, no plans to monetize. Would love feedback from this community — especially on the 3-use-per-day limit (too strict? not strict enough?) and whether the Focus Lock UX makes sense.
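For what it's worth, the 3-per-day budget reduces to a small pure function. The state shape below is my guess at what could live in chrome.storage.sync, not the extension's actual storage format:

```typescript
// Hypothetical sketch of the daily toggle budget: 3 uses per day,
// resetting when the stored day no longer matches today.
interface ToggleState {
  day: string;      // e.g. "2024-05-01"
  usesToday: number;
}

const MAX_USES_PER_DAY = 3;

function tryToggleOn(
  state: ToggleState,
  today: string,
): { allowed: boolean; state: ToggleState } {
  // New day: the budget resets automatically.
  const fresh = state.day === today ? state : { day: today, usesToday: 0 };
  if (fresh.usesToday >= MAX_USES_PER_DAY) {
    return { allowed: false, state: fresh }; // stays off until tomorrow
  }
  // The caller still waits out the 5-second cooldown before flipping the UI.
  return { allowed: true, state: { day: today, usesToday: fresh.usesToday + 1 } };
}
```

Keeping the rule pure like this makes the reset-at-midnight behavior trivial to verify, independent of the storage layer.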