Night City Voices

From MERN to Edge ML: Building a Hybrid Tone Classifier for Night City Voices

“You get to a certain age, you drop all your illusions. Life just gets easier from there.” — Viktor Vektor

Is that hopeful? Cynical? It sounds like wisdom. It lands like defeat.

That’s the problem. And it’s harder to solve than it looks.

Night City Voices started as a #100DaysOfCode experiment: a React frontend hitting an Express REST API, backed by MongoDB, serving random Cyberpunk 2077 quotes. Somewhere along the way, classifying the tone of those quotes became the real challenge. And solving it led to something I didn’t plan: a hybrid, edge-inferred ML system with no Python server in the production path.

The Real Problem: Tone ≠ Sentiment

Cyberpunk dialogue doesn’t behave like normal sentiment datasets. A character can say something brutal that reads hopeful. Something optimistic that lands cynical. Something calm that feels dark. Binary positive/negative models collapse immediately. The goal shifted:

Given any quote from Cyberpunk 2077, classify it as DARK, HOPEFUL, or CYNICAL: accurately enough to be useful, and honestly enough to surface ambiguity.

That’s when the architecture had to change.

The Evolution

Phase 1 — Rule-Based Heuristics

The first classifier was pure JavaScript: three word lists (DARK_WORDS, HOPEFUL_WORDS, CYNICAL_WORDS), count matches, pick the strongest signal.
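A minimal sketch of that counting approach. The word lists below are short, hypothetical stand-ins rather than the project's actual lists, and the tie-breaking rule (the first label wins) is an assumption.

```javascript
// Hypothetical, abbreviated word lists; the real lists are not shown in this post.
const DARK_WORDS = ["dead", "blood", "grave", "burn"];
const HOPEFUL_WORDS = ["dream", "hope", "future", "free"];
const CYNICAL_WORDS = ["eddies", "corpo", "sell", "cost"];

function countMatches(tokens, words) {
  return tokens.filter((t) => words.includes(t)).length;
}

function classifyByWordLists(text) {
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  const scores = {
    DARK: countMatches(tokens, DARK_WORDS),
    HOPEFUL: countMatches(tokens, HOPEFUL_WORDS),
    CYNICAL: countMatches(tokens, CYNICAL_WORDS),
  };
  // Pick the label with the highest count; ties go to the first label listed.
  let best = "DARK";
  for (const label of Object.keys(scores)) {
    if (scores[label] > scores[best]) best = label;
  }
  return { label: best, scores };
}
```

Because the classifier only counts exact token matches, it has no notion of word order or contrast, which is where it starts to fail.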

It worked — until sarcasm entered the room.

Quotes with hopeful vocabulary but dark structure? Misclassified. Quotes with contrast (“dreams cost eddies”)? Broken.

Accuracy sat around 44%, and cynical recall was lower still.

Fine for a frontend gimmick. Not fine for a portfolio piece.

Phase 2 — TF-IDF + Logistic Regression

Training moved offline to Python:

scikit-learn 1.5.2
TF-IDF vectorizer (1–2 grams, sublinear_tf=True)
LogisticRegression (C=0.3, class_weight="balanced")
155 labeled quotes across 3 classes

Model	Accuracy	Dark recall	Hopeful recall	Cynical recall
TF-IDF + LogReg	54%	43%	55%	64%

Better. Still not good enough.

But here’s the key decision that shaped everything else: inference would not run on a Python server. The model would export its weights as JSON, and a pure-JS implementation would handle TF-IDF vectorization, logistic regression softmax, and probability scoring. All at the edge.

No Flask. No FastAPI. No Python server to keep alive.

That constraint forced the interesting engineering work.
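To make the idea concrete, here is a hedged sketch of inference from exported weights. The JSON shape (classes, vocabulary, idf, coef, intercept) and every number in it are assumptions made up for illustration; the real export format is not shown in this post. The math follows scikit-learn's documented behavior for sublinear TF, L2 normalization, and multinomial logistic regression.

```javascript
// Hypothetical export shape with toy values; not the project's real weights.
const model = {
  classes: ["cynical", "dark", "hopeful"],
  vocabulary: { dream: 0, cost: 1, "dream cost": 2, dead: 3 },
  idf: [1.2, 1.5, 2.0, 1.8],
  coef: [
    [0.1, 0.9, 1.1, -0.2],
    [-0.4, 0.0, -0.3, 1.3],
    [0.8, -0.6, -0.5, -0.9],
  ],
  intercept: [0.0, 0.1, -0.1],
};

// Unigrams plus bigrams. The pattern approximates scikit-learn's default
// tokenizer (words of 2+ characters); JS \w is ASCII-only, unlike Python's.
function ngrams(text) {
  const tokens = text.toLowerCase().match(/\b\w\w+\b/g) || [];
  const bigrams = tokens.slice(0, -1).map((t, i) => `${t} ${tokens[i + 1]}`);
  return tokens.concat(bigrams);
}

function vectorize(text, m) {
  const counts = new Map();
  for (const term of ngrams(text)) {
    if (Object.prototype.hasOwnProperty.call(m.vocabulary, term)) {
      const idx = m.vocabulary[term];
      counts.set(idx, (counts.get(idx) || 0) + 1);
    }
  }
  const vec = new Array(m.idf.length).fill(0);
  // sublinear_tf=True means tf is replaced by 1 + ln(tf) before the idf weighting.
  for (const [idx, tf] of counts) vec[idx] = (1 + Math.log(tf)) * m.idf[idx];
  // L2-normalize, leaving an all-zero vector untouched.
  const norm = Math.hypot(...vec);
  return norm === 0 ? vec : vec.map((v) => v / norm);
}

function softmax(logits) {
  const max = Math.max(...logits); // subtract the max for numerical stability
  const exps = logits.map((z) => Math.exp(z - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

function predictProba(text, m) {
  const x = vectorize(text, m);
  const logits = m.coef.map((row, k) =>
    row.reduce((acc, w, j) => acc + w * x[j], m.intercept[k])
  );
  const probs = softmax(logits);
  return Object.fromEntries(m.classes.map((c, k) => [c, probs[k]]));
}
```

Calling predictProba("dream cost", model) returns an object keyed by class name whose values sum to 1. Any mismatch between this tokenizer and the Python one would silently shift the probabilities, so the two sides need to be compared on the same inputs.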

The Three-Layer Hybrid System

The final architecture is layered by confidence, not complexity. Each layer only activates if the previous one can’t resolve with enough certainty.

fetchTone(text)
      │
      ▼
Layer 1 — Embedded Lookup (MiniLM)
      │
      ▼
Layer 2 — TF-IDF + LogReg (Cloudflare Worker)
      │
      ▼
Layer 3 — Deterministic Rule Fallback
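The cascade can be sketched as a single async function. The three layer functions are passed in as parameters, and their names, return shapes, and the minConfidence default are all assumptions for illustration, not the project's actual code.

```javascript
// lookup, callWorker, and ruleFallback are hypothetical injected dependencies.
async function fetchTone(text, { lookup, callWorker, ruleFallback, minConfidence = 0.5 }) {
  // Layer 1: known quote, resolved locally with no network call.
  const known = lookup(text);
  if (known) return { label: known, source: "lookup" };

  // Layer 2: edge inference; an error or a low-confidence result falls through.
  try {
    const result = await callWorker(text);
    if (result && result.maxProb >= minConfidence) {
      return { label: result.label, source: "worker" };
    }
  } catch (err) {
    // Swallowed on purpose: a Worker failure must not block a label.
  }

  // Layer 3: deterministic rules always return something.
  return { label: ruleFallback(text), source: "rules" };
}
```

Returning the source alongside the label is an optional extra; it makes it easy to see which layer resolved a given quote.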

Layer 1 — MiniLM Semantic Lookup (Client-Side)

Rather than calling an API for every request, all 155 labeled quotes are pre-embedded using all-MiniLM-L6-v2 (384-dim) and injected directly into index.html as EMBEDDED_LABELS.
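One way the build-time injection could look, as a sketch only. The placeholder token and the function name are hypothetical; the post does not show the real build script.

```javascript
// Hypothetical placeholder; the real build token is not shown in this post.
const TOKEN = /\/\*__EMBEDDED_LABELS__\*\//g;

function injectLabels(html, labels) {
  const matches = html.match(TOKEN) || [];
  // Fail the build loudly if the placeholder is missing or duplicated.
  if (matches.length !== 1) {
    throw new Error(`expected exactly 1 injection token, found ${matches.length}`);
  }
  // Escape "<" so a quote containing "</script>" cannot close the script tag.
  const json = JSON.stringify(labels).replace(/</g, "\\u003c");
  // A replacer function keeps "$" sequences in the data from being interpreted.
  return html.replace(TOKEN, () => json);
}
```

A template line such as `const EMBEDDED_LABELS = /*__EMBEDDED_LABELS__*/;` would then be rewritten with the serialized table at build time.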

If the incoming quote exists in the corpus:

Resolution is instant
Zero network call
Zero inference cost
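A minimal sketch of the lookup, assuming quotes are matched on a normalized key. The table entry, its label, and the normalization rules are illustrative assumptions.

```javascript
// Hypothetical one-entry table; the label is chosen for illustration only.
const EMBEDDED_LABELS = new Map([
  ["wake up, samurai. we have a city to burn.", "DARK"],
]);

// Normalizing case, curly quotes, and whitespace is an assumption about how keys match.
function normalizeQuote(text) {
  return text
    .toLowerCase()
    .replace(/[\u2018\u2019]/g, "'")
    .replace(/[\u201C\u201D]/g, '"')
    .replace(/\s+/g, " ")
    .trim();
}

// Returns the stored label, or null so the caller can fall through to the next layer.
function lookupTone(text) {
  return EMBEDDED_LABELS.get(normalizeQuote(text)) ?? null;
}
```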

Model	Accuracy	Dark recall	Hopeful recall	Cynical recall
MiniLM + LogReg	67%	36%	91%	79%

67% sounds modest, but on a 155-sample, 3-class dataset where the classes genuinely overlap, it is roughly the ceiling I’d expect before adding more labeled data. The fallback chain handles the rest. This layer deliberately trades page weight for latency elimination on known quotes.

Layer 2 — Edge Inference (TF-IDF Worker)

If a quote isn’t in the embedded lookup, the frontend sends a POST /tone request to a Cloudflare Worker.

Inside the Worker:

JS TF-IDF vectorization
Softmax probability calculation
Polarity contrast nudge
Analogy marker detection ("like a", "as if")
Rate limited: 60 req/min per IP
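The rate limit in the list above can be illustrated with a fixed-window counter. This is a sketch of the counting logic only: it keeps state in memory, and memory is not shared between Worker instances, so a real deployment would need a shared store. The function names are hypothetical.

```javascript
// Fixed-window limiter: at most `limit` requests per `windowMs` for each IP.
function createRateLimiter({ limit = 60, windowMs = 60_000 } = {}) {
  const windows = new Map(); // ip -> { start, count }; never pruned in this sketch
  return function allow(ip, now = Date.now()) {
    const w = windows.get(ip);
    // Start a fresh window on first sight or once the old one has expired.
    if (!w || now - w.start >= windowMs) {
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= limit;
  };
}
```

When allow returns false, the Worker would reject the request and the frontend would drop to its local fallback.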

If the cynical probability is ≥ 0.35, the Worker classifies the quote as cynical. Otherwise it returns the full probability vector.
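As a sketch, the decision rule is a few lines. The 0.35 value comes from the rule above; the response shape and the function name are assumptions.

```javascript
const CYNICAL_THRESHOLD = 0.35; // value taken from the rule described above

// probs: { dark, hopeful, cynical }, assumed to sum to 1.
function decide(probs) {
  if (probs.cynical >= CYNICAL_THRESHOLD) {
    return { label: "CYNICAL", probs };
  }
  // No forced label: the caller receives the full vector and decides what to show.
  return { label: null, probs };
}
```

Note that this rule can override argmax: with dark at 0.40 and cynical at 0.36, the quote is still labeled cynical.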

No server round-trip beyond the Worker. No Python runtime anywhere in the inference path.

Layer 3 — Deterministic Rule-Based Fallback

If the Worker fails, hits a rate limit, or returns a max_prob below the confidence threshold, the system falls back to CP77-specific word lists with contrast detection: hopeful + dark signals → cynical boost.
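A hedged sketch of the contrast rule. The lexicon and the size of the boost are hypothetical; the post does not give the real values.

```javascript
// Hypothetical word lists and boost size, for illustration only.
const FALLBACK_LEXICON = {
  DARK: ["dead", "blood", "grave", "burn"],
  HOPEFUL: ["dream", "dreams", "hope", "future"],
  CYNICAL: ["eddies", "corpo", "cost", "sell"],
};
const CONTRAST_BOOST = 2;

function ruleFallback(text) {
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  const scores = {};
  for (const [label, words] of Object.entries(FALLBACK_LEXICON)) {
    scores[label] = tokens.filter((t) => words.includes(t)).length;
  }
  // Contrast: hopeful and dark vocabulary in the same quote reads as cynical.
  if (scores.HOPEFUL > 0 && scores.DARK > 0) scores.CYNICAL += CONTRAST_BOOST;
  // Always return a label; when nothing matches, the first label listed wins.
  return Object.keys(scores).reduce((a, b) => (scores[b] > scores[a] ? b : a));
}
```

With these toy lists, "Hope is dead" scores one hopeful and one dark match, so the boost pushes it to CYNICAL even though it contains no cynical vocabulary.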

This layer lives entirely in the frontend. No deploy required. No dependency. No AI cost.

It exists for one reason: a request must never return without a label. Correctness floors matter.

Ambiguity Is a Feature

Most classifiers silently return argmax.

This one doesn’t.

If the top two probabilities fall within a set threshold of each other, the UI shows both:

DARK | CYNICAL

Low confidence is information. A quote that reads dark and cynical should say so. This small design decision makes the system feel more honest — and prevents false certainty.
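A minimal sketch of the dual-label rule. The margin value is hypothetical, since the exact threshold is not stated here.

```javascript
// margin = 0.1 is an assumed value, not the project's actual threshold.
function displayLabel(probs, margin = 0.1) {
  // Rank labels from most to least probable.
  const ranked = Object.entries(probs).sort((a, b) => b[1] - a[1]);
  const [top, second] = ranked;
  // Show both labels when the runner-up is within the margin of the winner.
  if (second && top[1] - second[1] <= margin) {
    return `${top[0]} | ${second[0]}`;
  }
  return top[0];
}
```

For example, probabilities of 0.45 for DARK and 0.40 for CYNICAL would render as "DARK | CYNICAL", while 0.80 against 0.15 renders as just "DARK".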

The Real Engineering Problem (Revisited)

How do you train in Python and infer in a zero-dependency JavaScript edge environment with no server in between?

Key pieces that made it work:

model_export.json — custom weight serialization
Pure JS TF-IDF reimplementation
Dimension validation at Worker cold start
Regex-based HTML injection for embedding table
Deterministic fallback chain
Rate-limited Worker edge deployment
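Of the pieces above, dimension validation is the simplest to illustrate. The field names below follow a hypothetical export shape (classes, vocabulary, idf, coef, intercept), not the actual model_export.json.

```javascript
// Throws if the exported arrays disagree on size; returns the model otherwise.
function validateModel(m) {
  const nFeatures = m.idf.length;
  const vocabSize = Object.keys(m.vocabulary).length;
  if (vocabSize !== nFeatures) {
    throw new Error(`vocabulary has ${vocabSize} terms but idf has ${nFeatures}`);
  }
  if (m.coef.length !== m.classes.length || m.intercept.length !== m.classes.length) {
    throw new Error("coef/intercept rows do not match the number of classes");
  }
  m.coef.forEach((row, k) => {
    if (row.length !== nFeatures) {
      throw new Error(`coef row ${k} has ${row.length} weights, expected ${nFeatures}`);
    }
  });
  return m;
}
```

Running a check like this once at startup turns a retrain that changed the vocabulary size into an immediate, visible failure instead of quietly wrong probabilities.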

Training lives offline. Inference lives at the edge. The two communicate only through serialized weights.

Failure Mode Thinking

Failure	Likelihood	Impact	Mitigation
Worker inference error	Low	Moderate	Layer 3 fallback
Rate limit exceeded	Low–Medium	Low	Silent fallback
Weight/vocab drift	Medium	High	Dimension validation on startup
Dark misclassification	High	Low	Dual-label ambiguity display
Build injection failure	Low	High	Token validation during build

Dark recall remains low, a limitation the design accepts rather than hides. Dark quotes are linguistically sparse: they lack contrast markers and resist surface features. Improving that requires more labeled data, not architectural tweaks.

What This Demonstrates

This is a small system with narrow scope. But it demonstrates:

Bridging Python ML pipelines to edge JS inference
Designing fallback chains based on confidence thresholds
Cost-aware architecture (no per-request AI API calls)
Surfacing uncertainty instead of hiding it
Deploying ML to Cloudflare Workers without a server

And it shows one more thing: this project started as a MERN quote generator during #100DaysOfCode. It ended as a hybrid, edge-inferred ML system with layered guarantees.

Different UI. Completely different architecture underneath.

Sometimes you only see the journey looking back.