Make, think, repeat
I've shipped an open-source e-paper computer to makers in 20+ countries, written genomics tooling in Rust, and put healthcare software into production. Lately my focus has narrowed to one problem I find genuinely worth solving: getting language models to run well on the hardware in front of you (your laptop, your phone, the edge), not in someone else's data center.
notes on making LLMs run well on-device
Gemma 4 QAT on a 16 GB Mac: the E4B matches the 12B at 42% less RAM and 3× the speed
Google's quantization-aware-trained Gemma 4 E4B posts the same math and factual scores as the full 12B on an M3 MacBook Air, in 6.6 GB of RAM instead of 11.4 GB and at 8.2 tok/s instead of 2.7 tok/s. On the tasks I can measure cleanly, QAT is a free lunch. Here are the honest numbers, measured under a real 2048-token load.
My benchmark graded '7! = 5040' as wrong — and three other ways it lied to me
Re-running my own LLM benchmark, I found a bug that had inflated the quality scores in three already-published posts. Then a second bug. Then a third. Here's how a wrong number looks exactly like a right one — and why you spot-check the failures, not the passes.
One flag makes Qwen3-4B beat Llama-3.1-8B on a 16 GB Mac — at half the RAM
On an M3 MacBook Air, Qwen3-4B with the thinking trace turned off scores 19/21 on a verifiable suite — beating Llama-3.1-8B's 17/21 at half the memory and nearly double the speed. With thinking on, the same model drops to 7/21. The flag is enable_thinking=False, and here's exactly what it changes and why it matters.
software · hardware · tools · the occasional bit of fun
paperd.ink
Open-source e-paper dev board — like Arduino or Raspberry Pi, but built around e-ink displays. I secured the grant funding, managed production in China, handled global shipping, and ran customer support. In makers' hands across 20+ countries.
vcfkit
VCF is the format genomics runs on — and the tooling around it hasn't kept up. vcfkit is a single static binary that normalizes variants, lifts coordinates between genome builds, and filters by expression or plain English via AI (variant data stays local). 4× faster than bcftools on hot paths. Zero dependencies. Runs on macOS, Linux, and Windows.
penna.ink
Paste any article URL, get a LinkedIn post in your voice in 30 seconds. A two-pass AI pipeline that synthesizes your brand voice, not a template filler. For founders who have opinions but not the hour it usually takes to share them.
Redacted 9
Full-stack admin portal for a US pharmacy, HIPAA-compliant with no compromises. Orders management, patient action queue, provider fax workflows, and a custom auth layer with 15-minute inactivity auto-logout. React 19 + TypeScript, near-real-time updates via 30-second polling, silent token refresh. The kind of software that runs ops and stays invisible.
Redacted 1
Describe a hardware product — pick your power source, MCU, sensors, and peripherals in plain English — and the tool generates a complete PCB schematic with BOM and KiCad export. No EDA knowledge required. Bridges the gap between "I have a hardware idea" and "I have a schematic."
Redacted 2
600+ programmatically generated pages — one for every major university in the UK, US, and Canada — helping Indian students find verified housing abroad. Ranks organically, converts consistently, generates real referral revenue without running ads.
Redacted 3
A SaaS spreadsheet translator that works meaning-for-meaning, not word-for-word. Upload your sheet, pick target languages, get back translations that preserve tone, intent, and cultural register — not literal swaps that read like nobody wrote them.
Redacted 4
A multi-step GPT pipeline for medical content. Select a drug, set keywords and word count, and get a fully structured article with FAQs, side effects, dosage, and cost breakdowns. Over 500k words generated across 250+ articles. Content that ranks and reads like a subject-matter expert wrote it.
Hacker Newspaper
I quit Instagram and YouTube and switched to Hacker News — great content, frustrating mobile UI. So I built a better one. Comments-first (tap opens discussion, not the link), readable nested threads, auto-resume where you left off. Newspaper layout, no algorithmic noise.
Redacted 5
A voice AI therapist — speak freely, get thoughtful responses. Built around the idea that talking out loud is fundamentally different from typing, and that emotional support should be accessible without a calendar invite.
Redacted 6
Describe the ugliest workflow in your business — the one held together by spreadsheets and Slack threads. The tool breaks it down and shows exactly what to automate, and how. Built and shipped in a day.
Redacted 7
Reads live Hacker News hiring threads via the Algolia API and surfaces companies matching a custom ICP definition. No scraping, no manual trawling — a live feed of companies worth talking to, filtered by what actually matters to the sales team.
Redacted 8
An AI-native care planning tool that maps a patient's healthcare journey — appointments, follow-ups, medication timelines — into a coherent view. Built to show what proactive, coordinated care looks like with an AI layer.
Pi Oracle
Ask the oracle anything. It answers only in π. Built for Pi Day 2026 — absurd premise, oddly satisfying. The kind of thing that makes you think your answer was already somewhere in 3.14159…
Otto
Tell Otto your supplement stack and meal times, and it schedules everything so each one actually absorbs — no fish oil quietly cancelling your iron. A small, fun tool for people who like to over-optimize.
Open to conversations about new products and collaborations. If you're making something interesting, reach out.