From MERN to Edge ML: Building a Hybrid Tone Classifier for Night City Voices
“You get to a certain age, you drop all your illusions. Life just gets easier from there.” — Viktor Vektor
Is that hopeful? Cynical? It sounds like wisdom. It lands like defeat.
That’s the problem. And it’s harder to solve than it looks.
Night City Voices started as a #100DaysOfCode experiment: a React frontend hitting an Express REST API, backed by MongoDB, serving random Cyberpunk 2077 quotes. Somewhere along the way, classifying the tone of those quotes became the real challenge. And solving it led to something I didn’t plan: a hybrid, edge-inferred ML system with no Python server in the production path.
The Real Problem: Tone ≠ Sentiment
Cyberpunk dialogue doesn’t behave like normal sentiment datasets. A character can say something brutal that reads hopeful. Something optimistic that lands cynical. Something calm that feels dark. Binary positive/negative models collapse immediately. The goal shifted:
Given any quote from Cyberpunk 2077, classify it as DARK, HOPEFUL, or CYNICAL, accurately enough to be useful and honestly enough to surface ambiguity.
That’s when the architecture had to change.
The Evolution
Phase 1 — Rule-Based Heuristics
The first classifier was pure JavaScript: three word lists (DARK_WORDS, HOPEFUL_WORDS, CYNICAL_WORDS), count matches, pick the strongest signal.
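A minimal sketch of that counting approach could look like the following. The list contents and the `classifyByWordCount` helper are placeholders for illustration; the project's actual word lists are not reproduced in this post.

```javascript
// Illustrative word lists: the real DARK_WORDS / HOPEFUL_WORDS / CYNICAL_WORDS
// contents are not shown in this post, so these entries are placeholders.
const DARK_WORDS = ["death", "burn", "blood", "dead"];
const HOPEFUL_WORDS = ["dream", "dreams", "hope", "future"];
const CYNICAL_WORDS = ["eddies", "corpo", "cost", "illusions"];

// Lowercase the text and split it into word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Count how many tokens appear in a given word list.
function countMatches(tokens, words) {
  const set = new Set(words);
  return tokens.filter((t) => set.has(t)).length;
}

// Hypothetical helper: pick the class with the most matches.
// Ties resolve to the first class in this order; zero matches returns null.
function classifyByWordCount(text) {
  const tokens = tokenize(text);
  const scores = {
    DARK: countMatches(tokens, DARK_WORDS),
    HOPEFUL: countMatches(tokens, HOPEFUL_WORDS),
    CYNICAL: countMatches(tokens, CYNICAL_WORDS),
  };
  let best = null;
  let bestScore = 0;
  for (const [label, score] of Object.entries(scores)) {
    if (score > bestScore) {
      best = label;
      bestScore = score;
    }
  }
  return { label: best, scores };
}
```

Because the score is nothing more than a vocabulary count, a line such as "Hope is a dream for the dead" comes out HOPEFUL with these placeholder lists, which is exactly the kind of miss this phase ran into.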
It worked — until sarcasm entered the room.
Quotes with hopeful vocabulary but dark structure? Misclassified. Quotes with contrast (“dreams cost eddies”)? Broken.
This was 44% accuracy territory. Cynical recall was worse.
Fine for a frontend gimmick. Not fine for a portfolio piece.
Phase 2 — TF-IDF + Logistic Regression
Training moved offline to Python:
- scikit-learn 1.5.2
- TF-IDF vectorizer (1–2 grams, sublinear_tf=True)
- LogisticRegression(C=0.3, class_weight="balanced")
- 155 labeled quotes across 3 classes
| Model | Accuracy | Dark recall | Hopeful recall | Cynical recall |
|---|---|---|---|---|
| TF-IDF + LogReg | 54% | 43% | 55% | 64% |
Better. Still not good enough.
But here’s the key decision that shaped everything else: inference would not run on a Python server. The model would export its weights as JSON, and a pure-JS implementation would handle TF-IDF vectorization, logistic regression softmax, and probability scoring. All at the edge.
No Flask. No FastAPI. No server.
That constraint forced the interesting engineering work.
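To make the "weights as JSON, inference in pure JS" idea concrete, here is a rough sketch of the three steps: TF-IDF vectorization, per-class logits, and softmax. The `model` object is a toy stand-in; its field names (`vocabulary`, `idf`, `coef`, `intercept`, `classes`) and all of its numbers are assumptions, not the project's actual export schema.

```javascript
// Toy stand-in for model_export.json. The field names and every number here
// are assumptions for illustration; the real export schema is not shown.
const model = {
  classes: ["CYNICAL", "DARK", "HOPEFUL"],
  vocabulary: { dreams: 0, cost: 1, "dreams cost": 2, burn: 3, hope: 4 },
  idf: [1.2, 1.5, 2.0, 1.8, 1.1],
  coef: [
    [0.2, 0.9, 1.4, -0.3, -0.6], // CYNICAL
    [-0.4, 0.1, -0.2, 1.3, -0.8], // DARK
    [0.7, -0.5, -0.6, -0.9, 1.2], // HOPEFUL
  ],
  intercept: [0.0, -0.1, 0.1],
};

// Lowercase, keep word tokens of 2+ characters, then emit unigrams and bigrams.
// \w is ASCII-only in JS, so this only approximates scikit-learn's
// Unicode-aware default token pattern.
function ngrams(text) {
  const words = text.toLowerCase().match(/\b\w\w+\b/g) || [];
  const bigrams = words.slice(0, -1).map((w, i) => `${w} ${words[i + 1]}`);
  return words.concat(bigrams);
}

// Sublinear TF (1 + ln(count)) times IDF, followed by L2 normalization.
function vectorize(text, { vocabulary, idf }) {
  const counts = new Map();
  for (const term of ngrams(text)) {
    if (Object.prototype.hasOwnProperty.call(vocabulary, term)) {
      const index = vocabulary[term];
      counts.set(index, (counts.get(index) || 0) + 1);
    }
  }
  const vec = new Array(idf.length).fill(0);
  let sumSquares = 0;
  for (const [index, count] of counts) {
    vec[index] = (1 + Math.log(count)) * idf[index];
    sumSquares += vec[index] ** 2;
  }
  const norm = Math.sqrt(sumSquares) || 1; // avoid dividing by zero
  return vec.map((v) => v / norm);
}

// Numerically stable softmax over the per-class logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((z) => Math.exp(z - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// logits[k] = coef[k] . x + intercept[k], then softmax to probabilities.
function predictProba(text, m) {
  const x = vectorize(text, m);
  const logits = m.coef.map((row, k) =>
    row.reduce((sum, w, j) => sum + w * x[j], m.intercept[k])
  );
  const probs = softmax(logits);
  return Object.fromEntries(m.classes.map((c, k) => [c, probs[k]]));
}
```

Calling `predictProba("Dreams cost eddies", model)` returns one probability per class. Matching the Python pipeline exactly depends on reproducing its tokenizer and exporting the fitted IDF values rather than recomputing them, which is why the sketch reads `idf` straight from the exported object.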
The Three-Layer Hybrid System
The final architecture is layered by confidence, not complexity. Each layer only activates if the previous one can’t resolve with enough certainty.
fetchTone(text)
│
▼
Layer 1 — Embedded Lookup (MiniLM)
│
▼
Layer 2 — TF-IDF + LogReg (Cloudflare Worker)
│
▼
Layer 3 — Deterministic Rule Fallback
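The chain in the diagram can be sketched as a single async function. The layer functions are passed in so the example stays self-contained; their names, the result shape, and the 0.5 confidence threshold are assumptions rather than the project's actual values.

```javascript
// Hypothetical wiring of the three layers. The injected layer functions and
// the default minConfidence of 0.5 are assumptions for illustration.
async function fetchTone(text, { lookup, callWorker, ruleFallback, minConfidence = 0.5 }) {
  // Layer 1: known quote, resolved without a network call.
  const known = lookup(text);
  if (known) return { label: known, source: "lookup" };

  // Layer 2: edge inference. Errors and low-confidence answers fall through.
  try {
    const result = await callWorker(text);
    if (result && result.maxProb >= minConfidence) {
      return { label: result.label, source: "worker" };
    }
  } catch (err) {
    // Network failure, rate limit, or Worker error: continue to Layer 3.
  }

  // Layer 3: deterministic rules, which always produce a label.
  return { label: ruleFallback(text), source: "rules" };
}
```

Returning the `source` alongside the label is optional, but it makes it easy to see in the UI or in logs which layer actually answered.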
Layer 1 — MiniLM Semantic Lookup (Client-Side)
Rather than calling an API for every request, all 155 labeled quotes are pre-embedded using all-MiniLM-L6-v2 (384-dim) and injected directly into index.html as EMBEDDED_LABELS.
If the incoming quote exists in the corpus:
- Resolution is instant
- Zero network call
- Zero inference cost
| Model | Accuracy | Dark recall | Hopeful recall | Cynical recall |
|---|---|---|---|---|
| MiniLM + LogReg | 67% | 36% | 91% | 79% |
67% sounds modest, but on a 155-sample, 3-class ambiguous dataset — where the classes genuinely overlap — this is the expected ceiling before adding more labeled data. The fallback chain handles the rest. This layer deliberately trades page weight for latency elimination on known quotes.
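For the known-quote path, the lookup itself can be very small. The sketch below assumes `EMBEDDED_LABELS` is keyed by normalized quote text; the entries, the normalization rules, and the `lookupTone` helper are illustrative guesses, and the MiniLM vectors that the real table carries are left out.

```javascript
// Stand-in for the EMBEDDED_LABELS table injected into index.html at build
// time. The shape (normalized quote text -> label) and both entries are
// assumptions; the real table also carries MiniLM embeddings.
const EMBEDDED_LABELS = new Map([
  ["dreams cost eddies", "CYNICAL"],
  ["placeholder quote for illustration", "DARK"],
]);

// Normalize so that casing, punctuation, and spacing differences still match.
function normalizeQuote(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Returns the stored label for a known quote, or null to signal a miss.
function lookupTone(text) {
  return EMBEDDED_LABELS.get(normalizeQuote(text)) || null;
}
```

A `null` result is the signal to move on to the next layer.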
Layer 2 — Edge Inference (TF-IDF Worker)
If a quote isn’t in the embedded lookup, the frontend sends a POST /tone request to a Cloudflare Worker.
Inside the Worker:
- JS TF-IDF vectorization
- Softmax probability calculation
- Polarity contrast nudge
- Analogy marker detection ("like a", "as if")
- Rate limited: 60 req/min per IP
If cynical probability ≥ 0.35 → classify cynical. Otherwise return the full probability vector.
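That decision rule, together with the analogy marker check, might be sketched like this. The 0.35 threshold and the two markers come from the text above; the function names and the returned shape are assumptions, and the sketch only reports whether a marker was found because how it feeds into the scores is not spelled out here.

```javascript
const CYNICAL_THRESHOLD = 0.35; // threshold stated in the rule above
const ANALOGY_MARKERS = ["like a", "as if"]; // markers named in the list above

// True when the quote contains one of the analogy markers.
function hasAnalogyMarker(text) {
  const lower = text.toLowerCase();
  return ANALOGY_MARKERS.some((marker) => lower.includes(marker));
}

// Hypothetical post-processing step. `probs` is assumed to look like
// { DARK: 0.4, HOPEFUL: 0.2, CYNICAL: 0.4 }.
function decideTone(text, probs) {
  const analogy = hasAnalogyMarker(text);
  if (probs.CYNICAL >= CYNICAL_THRESHOLD) {
    return { label: "CYNICAL", probs, analogy };
  }
  // No override: hand the full distribution back to the caller.
  return { label: null, probs, analogy };
}
```

Note that the override fires even when another class has a higher probability, which is one way to compensate for a class that is systematically under-predicted.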
No server round-trip beyond the Worker. No Python runtime anywhere in the inference path.
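The 60 requests per minute per IP limit could be approximated with a fixed-window counter like the one below. This is a sketch of the general technique, not the project's limiter: `createRateLimiter` is a hypothetical helper, and because module-level state in a Worker is not shared across isolates or locations, an in-memory map only enforces the limit loosely.

```javascript
const LIMIT = 60; // requests allowed per window, per the list above
const WINDOW_MS = 60_000; // one minute

// Hypothetical in-memory fixed-window limiter keyed by client IP.
function createRateLimiter(limit = LIMIT, windowMs = WINDOW_MS) {
  const windows = new Map(); // ip -> { start, count }
  return function allow(ip, now = Date.now()) {
    const entry = windows.get(ip);
    // First request from this IP, or the previous window has expired.
    if (!entry || now - entry.start >= windowMs) {
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```

When `allow` returns false, the Worker would answer with a 429 and the frontend would drop to the rule-based layer.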
Layer 3 — Deterministic Rule-Based Fallback
If the Worker fails, hits a rate limit, or returns a max_prob below the confidence threshold, the system falls back to CP77-specific word lists with contrast detection: hopeful + dark signals → cynical boost.
This layer lives entirely in the frontend. No deploy required. No dependency. No AI cost.
It exists for one reason: a request must never return without a label. Correctness floors matter.
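A sketch of such a fallback, including the contrast boost and the "always return a label" guarantee, is shown below. The word lists, the boost size, and the default label are all assumptions chosen for illustration.

```javascript
// Placeholder CP77-flavoured word lists; the project's real lists are not
// reproduced here.
const FALLBACK_WORDS = {
  DARK: ["dead", "burn", "blood", "flatline"],
  HOPEFUL: ["dream", "dreams", "hope", "legend"],
  CYNICAL: ["eddies", "corpo", "cost", "illusions"],
};
const CONTRAST_BOOST = 2; // assumed weight for the hopeful + dark contrast signal
const DEFAULT_LABEL = "DARK"; // assumed default when nothing matches

function ruleBasedTone(text) {
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  const scores = { DARK: 0, HOPEFUL: 0, CYNICAL: 0 };
  for (const [label, words] of Object.entries(FALLBACK_WORDS)) {
    scores[label] = tokens.filter((t) => words.includes(t)).length;
  }
  // Contrast detection: hopeful and dark vocabulary in the same quote is
  // treated as a cynical signal.
  if (scores.HOPEFUL > 0 && scores.DARK > 0) {
    scores.CYNICAL += CONTRAST_BOOST;
  }
  // Highest score wins; with no matches at all, the default label is used so
  // the caller always receives a label.
  let label = DEFAULT_LABEL;
  let best = 0;
  for (const [candidate, score] of Object.entries(scores)) {
    if (score > best) {
      label = candidate;
      best = score;
    }
  }
  return label;
}
```

With these placeholder lists, "Dreams burn" has one hopeful and one dark match, so the contrast boost tips it to CYNICAL, while an empty string still returns the default label.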
Ambiguity Is a Feature
Most classifiers silently return argmax.
This one doesn’t.
If the top two probabilities fall within a set threshold of each other, the UI shows both:
DARK | CYNICAL
Low confidence is information. A quote that reads dark and cynical should say so. This small design decision makes the system feel more honest — and prevents false certainty.
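The dual-label rule can be sketched in a few lines. The 0.1 margin and the `displayLabel` name are assumptions; the post does not state the actual threshold.

```javascript
const AMBIGUITY_MARGIN = 0.1; // assumed gap; the project's threshold is not stated

// Return "TOP | SECOND" when the two highest probabilities are within the
// margin of each other, otherwise just the top label.
function displayLabel(probs, margin = AMBIGUITY_MARGIN) {
  const ranked = Object.entries(probs).sort((a, b) => b[1] - a[1]);
  const [[topLabel, topProb], [secondLabel, secondProb]] = ranked;
  return topProb - secondProb <= margin
    ? `${topLabel} | ${secondLabel}`
    : topLabel;
}
```

For example, `{ DARK: 0.42, CYNICAL: 0.38, HOPEFUL: 0.20 }` renders as `DARK | CYNICAL`, while `{ DARK: 0.7, CYNICAL: 0.2, HOPEFUL: 0.1 }` renders as plain `DARK`.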
The Real Engineering Problem (Revisited)
How do you train in Python and infer in a zero-dependency JavaScript edge environment with no server in between?
Key pieces that made it work:
- model_export.json: custom weight serialization
- Pure JS TF-IDF reimplementation
- Dimension validation at Worker cold start
- Regex-based HTML injection for embedding table
- Deterministic fallback chain
- Rate-limited Worker edge deployment
Training lives offline. Inference lives at the edge. The two communicate only through serialized weights.
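Since serialized weights are the only contract between the two sides, a shape check at Worker startup is a cheap guard against drift. The sketch below assumes the same hypothetical export layout (`vocabulary`, `idf`, `coef`, `intercept`, `classes`); the real schema may differ.

```javascript
// Hypothetical check run once when the Worker module loads. The field names
// mirror an assumed model_export.json layout, not the project's actual schema.
function validateModel(model) {
  const vocabSize = Object.keys(model.vocabulary).length;
  const classCount = model.classes.length;
  const errors = [];
  if (model.idf.length !== vocabSize) {
    errors.push(`idf length ${model.idf.length} != vocabulary size ${vocabSize}`);
  }
  if (model.coef.length !== classCount) {
    errors.push(`coef rows ${model.coef.length} != classes ${classCount}`);
  }
  model.coef.forEach((row, k) => {
    if (row.length !== vocabSize) {
      errors.push(`coef[${k}] length ${row.length} != vocabulary size ${vocabSize}`);
    }
  });
  if (model.intercept.length !== classCount) {
    errors.push(`intercept length ${model.intercept.length} != classes ${classCount}`);
  }
  if (errors.length > 0) {
    throw new Error(`model_export.json failed validation: ${errors.join("; ")}`);
  }
  return true;
}
```

Failing loudly here means a mismatched vocabulary and weight matrix surfaces at startup instead of producing silently wrong probabilities.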
Failure Mode Thinking
| Failure | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Worker inference error | Low | Moderate | Layer 3 fallback |
| Rate limit exceeded | Low–Medium | Low | Silent fallback |
| Weight/vocab drift | Medium | High | Dimension validation on startup |
| Dark misclassification | High | Low | Dual-label ambiguity display |
| Build injection failure | Low | High | Token validation during build |
Dark recall remains low by design. Dark quotes are linguistically sparse — they lack contrast markers and resist surface features. Improving that requires more labeled data, not architectural tweaks.
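For the build injection row in the table, "token validation during build" could look something like this. The comment-style marker tokens and the `injectEmbeddings` helper are hypothetical; the project's real placeholder format is not shown.

```javascript
// Hypothetical marker pair inside index.html; the project's real token is not shown.
const TOKEN = /\/\*EMBEDDED_LABELS_START\*\/[\s\S]*?\/\*EMBEDDED_LABELS_END\*\//g;

// Replace whatever sits between the markers with the serialized table,
// failing the build unless exactly one marker pair is present.
function injectEmbeddings(html, table) {
  const matches = html.match(TOKEN) || [];
  if (matches.length !== 1) {
    throw new Error(`expected exactly 1 injection token, found ${matches.length}`);
  }
  const payload =
    `/*EMBEDDED_LABELS_START*/${JSON.stringify(table)}/*EMBEDDED_LABELS_END*/`;
  // A replacer function keeps "$" sequences in the JSON from being
  // interpreted as replacement patterns.
  return html.replace(TOKEN, () => payload);
}
```

Keeping the markers in the output makes the step repeatable: running the build again replaces the previous table instead of failing to find the token.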
What This Demonstrates
This is a small system with narrow scope. But it demonstrates:
- Bridging Python ML pipelines to edge JS inference
- Designing fallback chains based on confidence thresholds
- Cost-aware architecture (no per-request AI API calls)
- Surfacing uncertainty instead of hiding it
- Deploying ML to Cloudflare Workers without a server
And it shows one more thing: this project started as a MERN quote generator during #100DaysOfCode. It ended as a hybrid, edge-inferred ML system with layered guarantees.
Different UI. Completely different architecture underneath.
Sometimes you only see the journey looking back.