Spellhand
Overview
Sparked by a chance encounter with a deaf stranger I couldn't communicate with, Spellhand is a mobile-first ASL fingerspelling trainer that runs entirely in the browser. It is a solo-built remake of the desktop-only fingerspelling.xyz, targeting phones, where most casual learners actually are. Hand tracking runs on-device via MediaPipe HandLandmarker, classification is rule-based (one TypeScript function per letter, no trained model), and completing the final Challenge issues a shareable certificate backed by Supabase magic-link auth.
Tech Stack
- Next.js 16 + TypeScript for the App Router app with Turbopack
- TailwindCSS v4 for styling using the CSS-first @theme configuration
- @mediapipe/tasks-vision for HandLandmarker (21 keypoints per frame, runs in-browser)
- Custom rule-based recognition module inspired by the fingerpose gesture-rule pattern
- Motion (motion/react) for stage transitions
- Supabase + @supabase/ssr for auth, RLS, and the certificate flow
- Vercel for hosting and the dynamic OG image route
Features
- 24 Static ASL Letters. Public-domain Wikimedia SVG references for every letter except J and Z, which are signed with motion and are deferred to a later phase.
- Progressive Levels & Final Challenge. Four levels of increasing difficulty plus a memory-mode Challenge that gates the certificate flow.
- In-Browser Hand Tracking. MediaPipe HandLandmarker runs entirely client-side; no video frame ever leaves the device.
- Rule-Based Per-Letter Classifier. Each letter is a TypeScript function decomposing the handshape into independent per-finger sub-checks: curled, extended, angle, tip distance.
- Per-Finger Sub-Check Overlay. When detection fails, the skeleton highlights exactly which finger is wrong instead of just showing a confidence score.
- Right- & Left-Handed Toggle. Mirrors detection logic, video crop, and reference image together, not just the visual flip.
- Magic-Link Auth & Shareable Certificate. Public /cert/[token] share page backed by a dynamic OG image and an anon-readable Supabase view.
Challenges & Solutions
Recognition without a training set. Bundling a TensorFlow model would bloat the JS payload and obscure what the classifier is actually doing. Solved by writing one rule-based TypeScript function per letter, decomposing each handshape into independent per-finger sub-checks. Every rule is debuggable in plain TS, and a debug overlay exposes every sub-check value live.
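As a rough illustration of the per-finger decomposition described above, here is a minimal sketch. It is not Spellhand's actual code: the function names, the 160 and 110 degree thresholds, and the example handshape (index extended, other three fingers curled, thumb check omitted) are all assumptions made for this example. Only the landmark indices follow MediaPipe's documented hand model.

```typescript
// A landmark in MediaPipe's normalized image coordinates.
type Landmark = { x: number; y: number; z: number };
type Finger = "index" | "middle" | "ring" | "pinky";
type Pose = "extended" | "curled";

// MediaPipe hand landmark indices per finger: [MCP, PIP, DIP, TIP].
const JOINTS: Record<Finger, [number, number, number, number]> = {
  index: [5, 6, 7, 8],
  middle: [9, 10, 11, 12],
  ring: [13, 14, 15, 16],
  pinky: [17, 18, 19, 20],
};

// Angle in degrees at joint b, between the segments b->a and b->c.
function angleAt(a: Landmark, b: Landmark, c: Landmark): number {
  const u = [a.x - b.x, a.y - b.y, a.z - b.z];
  const v = [c.x - b.x, c.y - b.y, c.z - b.z];
  const dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2];
  const norm = Math.hypot(...u) * Math.hypot(...v);
  if (norm === 0) return 0;
  const cos = Math.min(1, Math.max(-1, dot / norm));
  return (Math.acos(cos) * 180) / Math.PI;
}

type SubCheck = { finger: Finger; expected: Pose; pass: boolean };

// One independent sub-check: is this finger in the expected pose?
// Thresholds are illustrative assumptions, not tuned values.
function checkFinger(lm: Landmark[], finger: Finger, expected: Pose): SubCheck {
  const [mcp, pip, , tip] = JOINTS[finger];
  const bend = angleAt(lm[mcp], lm[pip], lm[tip]);
  const pass = expected === "extended" ? bend > 160 : bend < 110;
  return { finger, expected, pass };
}

// Hypothetical rule: index extended, the other three fingers curled.
// A real letter rule would also need thumb and distance checks.
function checkPointingShape(lm: Landmark[]): { match: boolean; checks: SubCheck[] } {
  const checks = [
    checkFinger(lm, "index", "extended"),
    checkFinger(lm, "middle", "curled"),
    checkFinger(lm, "ring", "curled"),
    checkFinger(lm, "pinky", "curled"),
  ];
  return { match: checks.every((c) => c.pass), checks };
}
```

Because the rule returns every sub-check rather than a single boolean, the same result can feed both the pass/fail decision and a debug overlay.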
Actionable feedback for learners. A single confidence number tells the learner nothing useful. Solved by making the skeleton itself the feedback surface: each finger lights up or stays gray, so the learner sees which finger is wrong, not just that something is wrong.
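One way to turn sub-check results into skeleton colors is sketched below. The function name, the result shape, and the two colors are assumptions for illustration. The bone index pairs follow MediaPipe's hand landmark numbering.

```typescript
type Finger = "thumb" | "index" | "middle" | "ring" | "pinky";

// Whether each finger currently passes all of its sub-checks.
type FingerResults = Record<Finger, boolean>;

// Landmark index pairs forming the three bones of each finger.
const FINGER_BONES: Record<Finger, [number, number][]> = {
  thumb: [[1, 2], [2, 3], [3, 4]],
  index: [[5, 6], [6, 7], [7, 8]],
  middle: [[9, 10], [10, 11], [11, 12]],
  ring: [[13, 14], [14, 15], [15, 16]],
  pinky: [[17, 18], [18, 19], [19, 20]],
};

// Illustrative colors: lit when the finger is correct, gray otherwise.
const LIT = "#22c55e";
const GRAY = "#9ca3af";

type ColoredBone = { from: number; to: number; color: string };

// Expand per-finger results into one colored segment per bone,
// ready for a canvas or SVG layer to draw.
function colorBones(results: FingerResults): ColoredBone[] {
  const bones: ColoredBone[] = [];
  for (const finger of Object.keys(FINGER_BONES) as Finger[]) {
    const color = results[finger] ? LIT : GRAY;
    for (const [from, to] of FINGER_BONES[finger]) {
      bones.push({ from, to, color });
    }
  }
  return bones;
}
```

A drawing loop would then stroke a line between landmarks `from` and `to` in the given color, so a wrong finger stays gray while the correct ones light up.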
Handedness without breaking the logic. A naive horizontal flip looks correct but breaks classification because landmark indices stay the same. Solved by mirroring detection logic, video crop, and reference image together so the layout swaps while rule checks still reference the actual hand.
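A common way to keep one set of rules for both hands is to reflect a left hand's landmarks into a right-hand frame before the rules run. The sketch below assumes that approach and is not necessarily how Spellhand implements it. `toCanonicalRightHand` and `thumbTipLeftOfIndexMcp` are hypothetical names.

```typescript
type Landmark = { x: number; y: number; z: number };
type Handedness = "right" | "left";

// Reflect a left hand across the vertical axis so right-hand rules apply.
// MediaPipe x coordinates are normalized to [0, 1], so the mirror is 1 - x.
function toCanonicalRightHand(lm: Landmark[], hand: Handedness): Landmark[] {
  if (hand === "right") return lm;
  return lm.map((p) => ({ ...p, x: 1 - p.x }));
}

// Hypothetical direction-sensitive sub-check: thumb tip (4) lies to the
// left of the index MCP (5) in the canonical frame.
function thumbTipLeftOfIndexMcp(lm: Landmark[]): boolean {
  return lm[4].x < lm[5].x;
}

// Run the same rule for either hand by canonicalizing first.
function checkThumbSide(lm: Landmark[], hand: Handedness): boolean {
  return thumbTipLeftOfIndexMcp(toCanonicalRightHand(lm, hand));
}
```

Landmark indices are untouched; only coordinates are reflected, which is why direction-sensitive checks keep working while the video crop and reference image are flipped separately.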
Privacy by default. Sending video to a server would force a compliance review and inflate hosting cost. Solved by running every frame on-device and persisting only the outcome (the certificate row itself) to Supabase. Tables are RLS-protected, and the public share view exposes only share_token, display_name, and issued_at.
Mobile-first without sacrificing accuracy. Camera orientation, hand size, and aspect ratio differ wildly on phones, and the original desktop trainer was never tuned for them. Solved by normalizing measurements by hand size and using angle and distance ratios instead of absolute pixel positions.
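A minimal sketch of a size-normalized distance follows. It assumes palm size is measured from the wrist (landmark 0) to the middle-finger MCP (landmark 9), which is a choice made for this example and not confirmed by the project. Because MediaPipe normalizes x by frame width and y by frame height, the sketch converts to pixels first so the ratio does not depend on aspect ratio.

```typescript
type Landmark = { x: number; y: number; z: number };

// Distance in pixels between two normalized landmarks.
// x and y are scaled separately because they are normalized separately.
function pixelDistance(a: Landmark, b: Landmark, width: number, height: number): number {
  return Math.hypot((a.x - b.x) * width, (a.y - b.y) * height);
}

// Thumb tip (4) to index tip (8) distance, divided by palm size
// (wrist 0 to middle MCP 9). Returns null if the palm is degenerate.
function tipDistanceRatio(lm: Landmark[], width: number, height: number): number | null {
  const palm = pixelDistance(lm[0], lm[9], width, height);
  if (palm === 0) return null;
  return pixelDistance(lm[4], lm[8], width, height) / palm;
}
```

A rule can then compare the ratio against a threshold (for example, "tips touching" as a small ratio) and get the same answer whether the hand fills the frame or sits far from the camera.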
Lessons Learned
- Rule-based beats trained models when the rules are short and a dataset would be expensive. Every letter is one function and every failure is debuggable in plain TS.
- Per-finger feedback is the product. A single confidence number is invisible to learners; coloring the skeleton is what actually teaches.
- Privacy as a hard constraint simplifies architecture. No video pipeline, no video storage layer, no ML inference budget: just landmarks and rules.
- Ship the smallest schema that works. Two tables and one view keep the certificate flow easy to reason about; mastery, streaks, and tiered certs can wait until they earn their keep.