by Hey1-Arthur
# I'm 11 years old and I trained my own LLM from scratch. 50 people downloaded it in 24 hours.
Hey r/LocalLLaMA,
I'm Arthur, I'm 11 years old, and I just released *Wind Arc 1.6* — a custom architecture LLM I built and trained myself.
## What it is
Wind Arc 1.6 is a 3.6B parameter model with a custom architecture I designed:
- *Mixture of Experts FFN* — 4 routed experts + 1 shared expert per layer (replaces the standard MLP)
- *YaRN RoPE* — extends context from 8k → 32k tokens
- *Hybrid Attention* — full attention every 4th layer, sliding window otherwise
- *QK-Norm* — for training stability
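If you're curious what the shared-expert routing looks like, here's a simplified NumPy sketch (the real model is PyTorch; the dimensions are toy, and I'm showing top-1 routing for clarity — each token picks its single best routed expert, and the shared expert always runs):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_routed = 64, 128, 4  # toy sizes, not the real model's

def make_expert():
    # One expert = a tiny two-layer MLP (random placeholder weights).
    return (rng.normal(0, 0.02, (d_model, d_ff)),
            rng.normal(0, 0.02, (d_ff, d_model)))

routed = [make_expert() for _ in range(n_routed)]
shared = make_expert()
router_w = rng.normal(0, 0.02, (d_model, n_routed))

def expert_forward(x, weights):
    w1, w2 = weights
    return np.maximum(x @ w1, 0.0) @ w2  # ReLU MLP

def moe_ffn(x):
    # Softmax router scores each token against the 4 routed experts.
    logits = x @ router_w                        # (tokens, n_routed)
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    best = probs.argmax(-1)                      # top-1 expert per token
    out = np.zeros_like(x)
    for i, e in enumerate(best):
        out[i] = probs[i, e] * expert_forward(x[i:i+1], routed[e])[0]
    # Shared expert runs on every token, unconditionally.
    return out + expert_forward(x, shared)

tokens = rng.normal(size=(5, d_model))
print(moe_ffn(tokens).shape)  # (5, 64)
```

Only the chosen routed expert's MLP runs per token, which is why a 3.6B MoE can have the per-token compute of a much smaller dense model.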
Base: Qwen3-1.7B, with every FFN layer completely replaced by my custom MoE blocks (the MoE layers are fully custom).
## How I trained it
- Hardware: 1× RTX 5090 rented from Nova Cloud
- Cost: literally $1
- Time: 55 minutes
- Final loss: 2.66
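For context on what that final loss means: cross-entropy loss (in nats) converts to perplexity via exp(loss), so you can sanity-check it in one line:

```python
import math

final_loss = 2.66               # cross-entropy in nats
perplexity = math.exp(final_loss)
print(round(perplexity, 1))     # ≈ 14.3
```

A perplexity around 14 means the model is effectively choosing among ~14 equally likely next tokens on average — a long way from frontier models, but respectable for 55 minutes on one GPU.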
Data mix: smoltalk + python-codes-25k + FineWeb-Edu + custom identity and Christian Q&A data I wrote myself.
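Mixing sources like this usually comes down to weighted sampling. Here's a minimal sketch of the idea — the weights below are purely illustrative, not my actual ratios:

```python
import random

random.seed(0)

# Illustrative sampling weights per source (not the real mix ratios).
sources = {
    "smoltalk": 0.4,
    "python-codes-25k": 0.3,
    "fineweb-edu": 0.2,
    "custom-identity-qa": 0.1,
}

def sample_source():
    # Draw one source name, proportional to its weight.
    return random.choices(list(sources), weights=sources.values(), k=1)[0]

batch = [sample_source() for _ in range(10)]
print(batch)
```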
## What it's good at
- Python and general coding, with explanations
- Christian questions (Bible, theology, Christian living)
- General chat and learning
## The honest truth
It's not GPT-4. A loss of 2.66 on a 1.7B base after 55 minutes of training isn't going to beat frontier models. But it runs locally, it's open source, and it's mine.
I still need to do SFT (the identity responses aren't perfect yet) and GGUF conversion is blocked by the custom MoE architecture. Working on both.
## Why I built it
I'm building *North.ai* — an AI startup focused on powerful models that run on small hardware. Wind Arc is our flagship model. Our platform Neurotype will let anyone train, deploy, and use AI without needing expensive cloud budgets.
I've trained 10+ models. This is the first one I'm actually proud enough to release.
## Links
HuggingFace: https://huggingface.co/arthu1/wind-arc-1-6
Would love feedback from people who actually know what they're doing. Be honest — I can take it.
— Arthur, age 11