Modern News

Show

    Show HN: MCP App to play backgammon with your LLM

    by sam256 · 20 minutes ago

    I learned to play backgammon the same week the MCP App Extension standard came out. So...

    You can now play backgammon with any AI client that supports MCP App (which these days includes Claude Desktop, VS Code, and ChatGPT). Actually you can do more than that -- you can play against a friend and have the AI comment; you can watch the AI play itself (boring); you can play against the LLM and have it teach you the finer points of the game (my use case!).

    The one problem--and this is a limitation of the current spec--is that the board gets redrawn each time the AI takes a turn. That's actually not that bad, but when the spec adds persistent/reusable views it will be even cooler.

    2|github.com|0 comments

    Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

    by isitcontent · about 16 hours ago

    Example repo: https://github.com/valdanylchuk/breezydemo

    The underlying ESP-IDF component: https://github.com/valdanylchuk/breezybox

    It is something like a Raspberry Pi, but without the overhead of a full server-grade OS.

    It captures a lot of the old-school DOS-era coding experience. I created a custom fast text-mode driver and plan to add VGA-like graphics next. ANSI text demos run smoothly, as you can see in the demo video featured in the Readme.

    App installs also work smoothly. The first time it installed 6 apps from my git repo with one command, I felt like, "OMG, I got homebrew to run on a toaster!" And best of all, it can install from any repo, no approvals or waiting; you just publish a compatible ELF file in your release.

    Coverage:

    Hackaday: https://hackaday.com/2026/02/06/breezybox-a-busybox-like-she...

    Hackster.io: https://www.hackster.io/news/valentyn-danylchuk-s-breezybox-...

    Reddit: https://www.reddit.com/r/esp32/comments/1qq503c/i_made_an_in...

    241|github.com|26 comments

    Show HN: I spent 4 years building a UI design tool with only the features I use

    by vecti · about 18 hours ago

    Hello everyone!

    I'm a solo developer who's been doing UI/UX work since 2007. Over the years, I watched design tools evolve from lightweight products into bloated, feature-heavy platforms. I kept finding myself using a small fraction of the features while the rest mostly got in the way.

    So a few years ago I set out to build the design tool I actually wanted. Vecti has just what I need: pixel-perfect grid snapping, a performant canvas renderer, shared asset libraries, and export/presentation features. No collaborative whiteboarding. No plugin ecosystem. No enterprise features. Just the design loop.

    Four years later, I can proudly show it off. Built and hosted in the EU, in compliance with European privacy regulations. Free tier available (no credit card, one editor forever).

    On privacy: I use some basic analytics (page views, referrers) but zero tracking inside the app itself. No session recordings, no behavior analytics, no third-party scripts beyond the essentials.

    If you're a solo designer or small team who wants a tool that stays out of your way, I'd genuinely appreciate your feedback: https://vecti.com

    Happy to answer questions about the tech stack, architecture decisions, why certain features didn't make the cut, or what's next.

    342|vecti.com|153 comments

    Show HN: If you lose your memory, how to regain access to your computer?

    by eljojo · about 19 hours ago

    Due to bike-induced concussions, I've been worried for a while about losing my memory and not being able to log back in.

    I combined Shamir secret sharing (HashiCorp Vault's implementation) with age encryption, and packaged it using WASM for a neat in-browser offline UX.

    The idea is that if something happens to me, my friends and family would help me get back access to the data that matters most to me. 5 out of 7 friends need to agree for the vault to unlock.
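    For intuition, here is a minimal 5-of-7 Shamir split over a prime field in pure Python. It is only a sketch: the real setup uses HashiCorp Vault's implementation together with age encryption, and this toy version shares a single field element rather than an actual key.

```python
# Toy 5-of-7 Shamir secret sharing over a prime field.
# Illustrative only; not the HashiCorp Vault implementation the post uses.
import secrets

PRIME = 2**127 - 1  # a Mersenne prime, large enough for a 16-byte secret

def split(secret: int, threshold: int, shares: int):
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, shares + 1)]

def recover(points):
    # Lagrange interpolation at x = 0 recovers the constant term (the secret).
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total

secret = secrets.randbelow(PRIME)
shares = split(secret, threshold=5, shares=7)
assert recover(shares[:5]) == secret   # any 5 friends unlock the vault
assert recover(shares[2:7]) == secret  # a different 5 works just as well
```

    Any 5 of the 7 points determine the degree-4 polynomial and hence its constant term; any 4 or fewer reveal nothing about it.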

    Try out the demo on the website; it runs entirely in your browser!

    307|eljojo.github.io|189 comments

    Show HN: I'm 75, building an OSS Virtual Protest Protocol for digital activism

    by sakanakana00 · about 1 hour ago

    Hi HN,

    I’m a 75-year-old former fishmonger from Japan, currently working on compensation claims for victims of the Fukushima nuclear disaster. Witnessing social divisions and bureaucratic limitations firsthand, I realized we need a new way for people to express their will without being “disposable.”

    To address this, I designed the Virtual Protest Protocol (VPP), an open-source framework for large-scale, 2D avatar-based digital demonstrations.

    Key Features:

    Beyond Yes/No: Adds an "Observe" option for the silent majority

    Economic Sustainability: Funds global activism through U.S. commercial operations and avatar creator royalties

    AI Moderation: LLMs maintain civil discourse in real-time

    Privacy First: Minimal data retention (only anonymous attributes, no personal IDs after the event)

    I shared this with the Open Technology Fund (OTF) and received positive feedback. Now, I’m looking for software engineers, designers, and OSS collaborators to help implement this as a robust project. I am not seeking personal gain; my goal is to leave this infrastructure for the next generation.

    Links:

    GitHub: https://github.com/voice-of-japan/Virtual-Protest-Protocol/b...

    Project Site: https://voice-of-japan.net

    Technical Notes:

    Scalable 2D Rendering: 3–4 static frames per avatar, looped for movement

    Cell-Based Grid System: Manages thousands of avatars efficiently, instantiates new cells as participation grows

    Low Barrier to Entry: Accessible on low-spec smartphones and low-bandwidth environments
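    As a rough illustration of the cell-based grid idea, here is a sketch in which a cell holds a fixed number of avatars and a new cell is instantiated once the last one fills up. All names and the capacity constant are assumptions for illustration, not taken from the VPP codebase.

```python
# Hypothetical sketch of the cell-based grid: avatars fill a cell until it
# reaches capacity, then a new cell is instantiated on demand.
class Cell:
    def __init__(self, cell_id, capacity):
        self.cell_id = cell_id
        self.capacity = capacity
        self.avatars = []

    def is_full(self):
        return len(self.avatars) >= self.capacity

class Grid:
    def __init__(self, capacity_per_cell=500):
        self.capacity = capacity_per_cell
        self.cells = [Cell(0, capacity_per_cell)]

    def join(self, avatar_id):
        # Place the newcomer in the first open cell, growing the grid as needed.
        cell = next((c for c in self.cells if not c.is_full()), None)
        if cell is None:
            cell = Cell(len(self.cells), self.capacity)
            self.cells.append(cell)
        cell.avatars.append(avatar_id)
        return cell.cell_id

grid = Grid(capacity_per_cell=2)
placements = [grid.join(f"avatar{i}") for i in range(5)]
assert placements == [0, 0, 1, 1, 2]  # 5 avatars at capacity 2 -> 3 cells
assert len(grid.cells) == 3
```

    Each cell can then be rendered and synchronized independently, which is what keeps thousands of avatars manageable.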

    We are looking for collaborators with expertise in:

    Backend/Real-Time Architecture: Node.js, Go, etc.

    Frontend/Canvas Rendering: Handling thousands of avatars

    AI Moderation / LLM Integration

    OSS Governance & Project Management

    If you’re interested, have technical advice, or want to join the build, please check the GitHub link and reach out. Your feedback and contribution can help make this infrastructure real and sustainable.

    5|github.com|1 comment

    Show HN: I built Divvy to split restaurant bills from a photo

    by pieterdy · about 1 hour ago

    I built Divvy to make splitting restaurant bills less annoying. You take a photo of the bill, tap which items belong to each person, and it calculates totals including tax and tips.
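    The arithmetic can be sketched in a few lines. The function below is hypothetical (Divvy's actual logic isn't shown); it assumes shared items are split evenly and that tax and tip are allocated proportionally to each person's share of the subtotal.

```python
# Hypothetical bill-split arithmetic: per-person item totals, with tax and
# tip allocated proportionally to each person's share of the subtotal.
def split_bill(items, tax, tip):
    # items: list of (price, [people sharing the item])
    subtotal = sum(price for price, _ in items)
    totals = {}
    for price, people in items:
        for person in people:
            totals[person] = totals.get(person, 0.0) + price / len(people)
    extras = tax + tip
    return {p: round(amount + extras * amount / subtotal, 2)
            for p, amount in totals.items()}

bill = [(12.00, ["ana"]), (8.00, ["ben"]), (20.00, ["ana", "ben"])]
result = split_bill(bill, tax=4.00, tip=6.00)
assert result == {"ana": 27.5, "ben": 22.5}  # sums to the 50.00 grand total
```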

    3|divvyai.app|0 comments

    Show HN: R3forth, a ColorForth-inspired language with a tiny VM

    by phreda4 · about 16 hours ago

    77|github.com|14 comments

    Show HN: Smooth CLI – Token-efficient browser for AI agents

    by antves · 2 days ago

    Hi HN! Smooth CLI (https://www.smooth.sh) is a browser that agents like Claude Code can use to navigate the web reliably, quickly, and affordably. It lets agents specify tasks using natural language, hiding UI complexity, and allowing them to focus on higher-level intents to carry out complex web tasks. It can also use your IP address while running browsers in the cloud, which helps a lot with roadblocks like captchas (https://docs.smooth.sh/features/use-my-ip).

    Here’s a demo: https://www.youtube.com/watch?v=62jthcU705k Docs start at https://docs.smooth.sh.

    Agents like Claude Code are amazing but mostly confined to the CLI, while a ton of valuable work needs a browser. This is a fundamental limitation on what these agents can do.

    So far, attempts to add browsers to these agents (Claude’s built-in --chrome, Playwright MCP, agent-browser, etc.) all have interfaces that are unnatural for browsing. They expose hundreds of tools - e.g. click, type, select, etc - and the action space is too complex. (For an example, see the low-level details listed at https://github.com/vercel-labs/agent-browser). Also, they don’t handle the billion edge cases of the internet like iframes nested in iframes nested in shadow-doms and so on. The internet is super messy! Tools that rely on the accessibility tree, in particular, unfortunately do not work for a lot of websites.

    We believe that these tools are at the wrong level of abstraction: they make the agent focus on UI details instead of the task to be accomplished.

    Using a giant general-purpose model like Opus to click on buttons and fill out forms ends up being slow and expensive. The context window gets bogged down with details like clicks and keystrokes, and the model has to figure out how to do browser navigation each time. A smaller model in a system specifically designed for browsing can actually do this much better and at a fraction of the cost and latency.

    Security matters too - probably more than people realize. When you run an agent on the web, you should treat it like an untrusted actor. It should access the web using a sandboxed machine and have minimal permissions by default. Virtual browsers are the perfect environment for that. There’s a good write up by Paul Kinlan that explains this very well (see https://aifoc.us/the-browser-is-the-sandbox and https://news.ycombinator.com/item?id=46762150). Browsers were built to interact with untrusted software safely. They’re an isolation boundary that already works.

    Smooth CLI is a browser designed for agents based on what they’re good at. We expose a higher-level interface to let the agent think in terms of goals and tasks, not low-level details.

    For example, instead of this:

      click(x=342, y=128)
      type("search query")
      click(x=401, y=130)
      scroll(down=500)
      click(x=220, y=340)
      ...50 more steps
    
    Your agent just says:

      Search for flights from NYC to LA and find the cheapest option
    
    Agents like Claude Code can use the Smooth CLI to extract hard-to-reach data, fill in forms, download files, interact with dynamic content, handle authentication, vibe-test apps, and a lot more.

    Smooth enables agents to launch as many browsers and tasks as they want, autonomously, and on-demand. If the agent is carrying out work on someone’s behalf, the agent’s browser presents itself to the web as a device on the user’s network. The need for this feature may diminish over time, but for now it’s a necessary primitive. To support this, Smooth offers a “self” proxy that creates a secure tunnel and routes all browser traffic through your machine’s IP address (https://docs.smooth.sh/features/use-my-ip). This is one of our favorite features because it makes the agent look like it’s running on your machine, while keeping all the benefits of running in the cloud.

    We also take away as much security responsibility from the agent as possible. The agent should not be aware of authentication details or be responsible for handling malicious behavior such as prompt injections. While some security responsibility will always remain with the agent, the browser should minimize this burden as much as possible.

    We’re biased of course, but in our tests, running Claude with Smooth CLI has been 20x faster and 5x cheaper than Claude Code with the --chrome flag (https://www.smooth.sh/images/comparison.gif). Happy to explain further how we’ve tested this and to answer any questions about it!

    Instructions to install: https://docs.smooth.sh/cli. Plans and pricing: https://docs.smooth.sh/pricing.

    It’s free to try, and we'd love to get feedback/ideas if you give it a go :)

    We’d love to hear what you think, especially if you’ve tried using browsers with AI agents. Happy to answer questions, dig into tradeoffs, or explain any part of the design and implementation!

    93|docs.smooth.sh|67 comments

    Show HN: ARM64 Android Dev Kit

    by denuoweb · 2 days ago

    GUI-first, multi-service gRPC scaffold for an Android Development Kit style workflow on an AArch64 system.

    17|github.com|2 comments

    Show HN: BioTradingArena – Benchmark for LLMs to predict biotech stock movements

    by dchu17 · about 21 hours ago

    Hi HN,

    My friend and I have been experimenting with using LLMs to reason about biotech stocks. Unlike many other sectors, biotech trading is largely event-driven: FDA decisions, clinical trial readouts, safety updates, or changes in trial design can cause a stock to 3x in a single day (https://www.biotradingarena.com/cases/MDGL_2023-12-14_Resmet...).

    Interpreting these ‘catalysts,’ which come in the form of press releases, usually requires analysts with prior expertise in biology or medicine. A catalyst that sounds “positive” can still lead to a selloff if, for example:

    - the effect size is weaker than expected

    - results apply only to a narrow subgroup

    - endpoints don’t meaningfully de-risk later phases

    - the readout doesn’t materially change approval odds

    To explore this, we built BioTradingArena, a benchmark for evaluating how well LLMs can interpret biotech catalysts and predict stock reactions. Given only the catalyst and the information available before the date of the press release (trial design, prior data, PubMed articles, and market expectations), the benchmark tests how accurately the model predicts the stock's movement when the catalyst is released.

    The benchmark currently includes 317 historical catalysts. We also created subsets for specific indications (the largest being Oncology), as different indications often show different patterns. We plan to add more catalysts to the public dataset over the next few weeks. The dataset spans companies of different sizes and uses an adjusted score, since large-cap biotech tends to exhibit much lower volatility than small- and mid-cap names.

    Each row of data includes:

    - Real historical biotech catalysts (Phase 1–3 readouts, FDA actions, etc.) and pricing data from the day before, and the day of the catalyst

    - Linked Clinical Trial data, and PubMed pdfs

    Note: there are some fairly obvious problems with our approach. First, many clinical trial press releases are likely already included in the LLMs’ pretraining data. While we try to reduce this by de-identifying each press release and providing the LLM only the data available up to the date of the catalyst, there is obviously some uncertainty about whether this is sufficient.

    We’ve been using this benchmark to test prompting strategies and model families. Results so far are mixed but interesting: the most reliable approach we found was to use LLMs to quantify qualitative features and then fit a linear regression on those features, rather than predicting prices directly.
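    As an illustration of that two-stage idea, the sketch below fits an ordinary least-squares model on hypothetical LLM-scored features. The feature names and numbers are made up; only the shape of the pipeline matches the description above.

```python
# Sketch of the two-stage approach: an LLM scores qualitative features of a
# catalyst (mocked here as fixed numbers), then a plain least-squares fit
# maps those scores to next-day returns. All data below is illustrative.
def fit_ols(X, y):
    # Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination.
    n = len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(len(X))) for j in range(n)]
         for i in range(n)]
    b = [sum(X[r][i] * y[r] for r in range(len(X))) for i in range(n)]
    for col in range(n):
        pivot = A[col][col]
        for j in range(n):
            A[col][j] /= pivot
        b[col] /= pivot
        for row in range(n):
            if row != col:
                factor = A[row][col]
                for j in range(n):
                    A[row][j] -= factor * A[col][j]
                b[row] -= factor * b[col]
    return b

# Rows: [1 (intercept), effect_size_score, subgroup_breadth_score]
X = [[1, 0.9, 0.8], [1, 0.2, 0.9], [1, 0.7, 0.3], [1, 0.1, 0.2]]
y = [0.35, -0.10, 0.12, -0.25]  # hypothetical next-day returns
coefs = fit_ols(X, y)
# Predict the return for a new catalyst's LLM-scored features:
pred = sum(c * f for c, f in zip(coefs, [1, 0.8, 0.5]))
```

    The point of the two-stage split is that the LLM only does what it is good at (reading a press release and scoring effect size, subgroup breadth, and so on), while the numeric mapping to returns stays simple and auditable.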

    Just wanted to share this with HN. I built a playground for those of you who would like to try it out in a sandbox. Would love to hear your ideas, and I hope people enjoy playing around with it!

    26|www.biotradingarena.com|12 comments