Tone ML Engine — Biokoder

Following Through on a #100DaysOfCode Promise

A while back during #100DaysOfCode I tweeted a progress update on Night City Voices, a Cyberpunk 2077 quote generator, and listed what I’d build next.

Future improvements: add more endpoints, open-source the API.

I didn’t.

Other projects took over. It sat there for two years. When I came back to the project, I didn’t just open-source it; I rebuilt its core logic.

I wanted to classify tone, not sentiment. Sentiment is blunt: positive vs. negative. Tone is narrower. It’s the emotional register of the writing.

A line can read hopeful and still land cynical. Classic sentiment models miss this completely.

Always forgive your enemies; nothing annoys them so much.

A Wilde epigram often scores “positive.” It’s not. The wording is cheerful. The intent is cynical.

The Constraint That Shaped Everything

I didn’t want a Python server or a hosted API in the loop. I wanted inference to run where the request lands and always return a label, even if the ML layer goes down.

That led to a three-layer pipeline:

Input text
│
▼ seen in embedding lookup? (MiniLM, pre-computed, client-side)
YES → label, zero network calls
NO ↓
▼ POST /tone → Cloudflare Worker (TF-IDF + LogReg, pure JS)
confidence ≥ 0.38 → label
NO ↓
▼ rule-based word lists (client-side, always resolves)

Layer 1 — MiniLM labels baked into the HTML as a flat lookup. Any text the model has seen before resolves client-side in microseconds. No network call.

Layer 2 — TF-IDF + logistic regression running in a Cloudflare Worker. The model exports its weights as static JSON. tonemodel.js reimplements the math. No Python runtime in the inference path. Weight/vocabulary drift gets caught at training time, not production.

Layer 3 — Rule-based word lists. Always returns something.
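A minimal sketch of how the three layers could chain together. Only the /tone route and the 0.38 threshold come from the diagram above. The function names, the label names (hopeful, cynical, neutral), the word lists, and the request/response shape are all assumptions made for illustration, not the repo's actual code.

```javascript
// Hypothetical sketch of the fallback chain; names and word lists are invented.
const CONFIDENCE_THRESHOLD = 0.38; // from the pipeline diagram

// Layer 3: tiny illustrative word lists (not the project's real lists).
const WORD_LISTS = {
  hopeful: ["hope", "dawn", "light", "rise"],
  cynical: ["nothing", "fool", "lie", "annoys"],
};

function ruleBasedTone(text, defaultLabel = "neutral") {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  let best = defaultLabel;
  let bestHits = 0;
  for (const [label, list] of Object.entries(WORD_LISTS)) {
    const hits = words.filter((w) => list.includes(w)).length;
    if (hits > bestHits) {
      best = label;
      bestHits = hits;
    }
  }
  return best; // always a label, even with zero hits
}

// Assumed request/response shape: { text } in, { label, confidence } out.
const postToWorker = async (text) => {
  const res = await fetch("/tone", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) throw new Error(`Worker returned ${res.status}`);
  return res.json();
};

async function classifyTone(text, { lookup, callWorker = postToWorker }) {
  // Layer 1: pre-computed labels in a Map, no network call.
  const key = text.trim().toLowerCase();
  if (lookup.has(key)) return { label: lookup.get(key), layer: 1 };

  // Layer 2: remote TF-IDF + LogReg. Any failure falls through.
  try {
    const { label, confidence } = await callWorker(text);
    if (confidence >= CONFIDENCE_THRESHOLD) return { label, layer: 2 };
  } catch (_err) {
    // Worker unreachable or response malformed: continue to Layer 3.
  }

  // Layer 3: rule-based, always resolves.
  return { label: ruleBasedTone(text), layer: 3 };
}
```

Passing `callWorker` in as a parameter keeps the chain testable without a network, and wrapping Layer 2 in try/catch is what lets a Worker outage degrade to the word lists instead of surfacing an error.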

Accuracy

Tone isn’t sentiment. A line can be brutal and read hopeful, or optimistic and land cynical. That ambiguity is the domain, and it’s why binary classifiers don’t apply here.

Layer 1 (MiniLM embeddings) reaches 67% accuracy on the training corpus. Layer 2 (TF-IDF + Logistic Regression) reaches 54% accuracy on held-out novel quotes. Three classes, 155 training examples, no pretrained model in the inference path. That’s the signal this much data supports, and it should scale with the dataset.

Rather than forcing a single label, the system surfaces uncertainty. When the top two scores fall within a set threshold of each other, it returns both, treating ambiguity as output, not noise to suppress.
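One way the two-label behaviour could look, as a rough sketch. The margin value of 0.1, the function name, and the shape of the scores object are assumptions; the post does not state the actual threshold.

```javascript
// Hypothetical margin: the real threshold value is not given in the post.
const AMBIGUITY_MARGIN = 0.1;

function topLabels(scores, margin = AMBIGUITY_MARGIN) {
  // scores: per-class probabilities, e.g. { hopeful: 0.46, cynical: 0.41, neutral: 0.13 }
  const ranked = Object.entries(scores).sort((a, b) => b[1] - a[1]);
  if (ranked.length === 0) return { labels: [], ambiguous: false };

  const [first, second] = ranked;
  // If the runner-up is close enough to the winner, report both labels.
  if (second && first[1] - second[1] <= margin) {
    return { labels: [first[0], second[0]], ambiguous: true };
  }
  return { labels: [first[0]], ambiguous: false };
}
```

With scores of 0.46 and 0.41 for the top two classes, this returns both labels. With 0.80 against 0.15, it returns only the winner.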

What this project actually explores:

Exporting scikit-learn weights to a static JSON format reconstructed in pure JavaScript.
Running TF-IDF vectorization and logistic regression inference inside a Cloudflare Worker. No Python runtime, no server.
A three-layer fallback chain that guarantees a label on every request.
Treating low-confidence predictions as signal worth communicating.

Performance scales with the dataset. The architecture doesn’t change.
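To make the first two items concrete, here is a minimal sketch of TF-IDF plus logistic regression inference from exported weights. It is not the contents of tonemodel.js. The JSON field names, the toy weights, and the class labels are invented for illustration. It assumes scikit-learn's default TfidfVectorizer settings (lowercasing, tokens of two or more word characters, L2 normalization) and a multinomial (softmax) LogisticRegression.

```javascript
// Assumed export shape (hypothetical field names), e.g. dumped from Python via
// vectorizer.vocabulary_, vectorizer.idf_, clf.coef_, clf.intercept_, clf.classes_.
// The numbers below are made-up toy values, not trained weights.
const model = {
  classes: ["cynical", "hopeful", "neutral"],
  vocabulary: { enemies: 0, forgive: 1, hope: 2 },
  idf: [1.4, 1.4, 1.0],
  coef: [
    [1.2, 0.3, -0.9],
    [-0.8, 0.1, 1.5],
    [-0.4, -0.4, -0.6],
  ],
  intercept: [0.0, 0.0, 0.1],
};

function tfidfVector(text, { vocabulary, idf }) {
  // Approximates the sklearn default tokenizer; JS \w is ASCII-only,
  // so non-ASCII text would tokenize differently than in Python.
  const tokens = text.toLowerCase().match(/\b\w\w+\b/g) || [];
  const vec = new Array(idf.length).fill(0);
  for (const t of tokens) {
    // Own-property check so tokens like "constructor" are not matched
    // against Object.prototype.
    if (Object.prototype.hasOwnProperty.call(vocabulary, t)) {
      const i = vocabulary[t];
      vec[i] += idf[i]; // raw count times idf, accumulated per occurrence
    }
  }
  // L2 normalization; an all-zero vector is returned unchanged.
  const norm = Math.sqrt(vec.reduce((s, v) => s + v * v, 0));
  return norm === 0 ? vec : vec.map((v) => v / norm);
}

function predictTone(text, m) {
  const x = tfidfVector(text, m);
  // One logit per class: w . x + b
  const logits = m.coef.map((row, k) =>
    row.reduce((s, w, j) => s + w * x[j], m.intercept[k])
  );
  // Softmax, shifted by the max logit for numerical stability.
  const max = Math.max(...logits);
  const exps = logits.map((z) => Math.exp(z - max));
  const total = exps.reduce((s, e) => s + e, 0);
  const probs = exps.map((e) => e / total);
  const best = probs.indexOf(Math.max(...probs));
  return { label: m.classes[best], confidence: probs[best], probs };
}
```

Calling `predictTone("Always forgive your enemies", model)` with these toy weights picks "cynical". A text with no known vocabulary produces a zero vector, so the prediction is driven by the intercepts alone, which is one reason a confidence threshold and a rule-based fallback are useful downstream.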

Why This Exists

This project came out of Night City Voices, but it’s not really about Cyberpunk quotes.

It’s about building systems you understand end to end.

Reimplementing TF-IDF and logistic regression in JavaScript isn’t new. Exporting sklearn weights to JSON and running inference outside Python is a known pattern. sklearn-porter exists. So does m2cgen. This doesn’t use them.

Take a trained model, understand the math well enough to reproduce it, run it in a different runtime, and still have it behave predictably. This project rebuilds that path inside a Cloudflare Worker, with a fallback chain that guarantees a result even under partial failure.

The training corpus is 161 lines of public domain poetry, but the engine is dataset-agnostic. Swap in dialogue, moderation labels, chatbot tone filters, or game writing. Give it a CSV with two columns, text and tone, and the same pipeline runs.

The quotes were just the first corpus. The engine is the point.

The repo is here: github.com/colombomf/tone-ml-engine. Clone it, swap the corpus, break the layers, make it yours.

Live Demo: tonemlengine.biokoder.com