
Building in Public

log entry No. 04

Tone ML Engine

Following Through on a #100DaysOfCode Promise

A while back during #100DaysOfCode I tweeted a progress update on Night City Voices, a Cyberpunk 2077 quote generator, and listed what I’d build next.

Future improvements: add more endpoints, open-source the API.

I didn’t.

Other projects took over, and it sat there for two years. When I came back to the project, I didn’t just open-source it; I rebuilt its core logic.

I wanted to classify tone, not sentiment. Sentiment is blunt: positive vs. negative. Tone is narrower. It’s the emotional register of the writing.

A line can read hopeful and still land cynical. Classic sentiment models miss this completely.

Always forgive your enemies; nothing annoys them so much.

A Wilde epigram often scores “positive.” It’s not. The wording is cheerful. The intent is cynical.

The Constraint That Shaped Everything

I didn’t want a Python server or a hosted API in the loop. I wanted inference to run where the request lands and always return a label, even if the ML layer goes down.

That led to a three-layer pipeline:

Input text
│
▼ seen in embedding lookup? (MiniLM, pre-computed, client-side)
YES → label, zero network calls
NO ↓
▼ POST /tone → Cloudflare Worker (TF-IDF + LogReg, pure JS)
confidence ≥ 0.38 → label
NO ↓
▼ rule-based word lists (client-side, always resolves)
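The chain in the diagram can be sketched in a few lines of client-side JavaScript. This is a minimal illustration, not the repo's code: the names `LOOKUP`, `WORD_LISTS`, `ruleBasedTone`, and `classifyTone`, the `{ label, confidence }` response shape, the third label `neutral`, and the tiny word lists are all assumptions made for the example. Only the `/tone` route and the 0.38 threshold come from the diagram.

```javascript
// Hypothetical sketch of the three-layer fallback chain.
const CONFIDENCE_THRESHOLD = 0.38; // value taken from the diagram

// Layer 1: pre-computed labels keyed by normalized text (assumed shape).
const LOOKUP = new Map([
  ["always forgive your enemies; nothing annoys them so much.", "cynical"],
]);

// Layer 3: tiny illustrative word lists (assumed labels and words).
const WORD_LISTS = {
  hopeful: ["hope", "dawn", "light"],
  cynical: ["enemies", "annoys", "fool"],
};

const normalize = (text) => text.trim().toLowerCase();

// Counts word-list hits per label; falls back to "neutral" when nothing matches.
function ruleBasedTone(text) {
  const words = normalize(text).split(/\W+/);
  let best = { label: "neutral", hits: 0 };
  for (const [label, list] of Object.entries(WORD_LISTS)) {
    const hits = words.filter((w) => list.includes(w)).length;
    if (hits > best.hits) best = { label, hits };
  }
  return best.label;
}

// fetchImpl is injectable so the chain can be exercised without a network.
async function classifyTone(text, fetchImpl = globalThis.fetch) {
  // Layer 1: known text resolves locally, with no network call.
  const cached = LOOKUP.get(normalize(text));
  if (cached) return { label: cached, layer: 1 };

  // Layer 2: ask the Worker; accept the answer only if it is confident enough.
  try {
    const res = await fetchImpl("/tone", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text }),
    });
    if (res.ok) {
      const { label, confidence } = await res.json();
      if (confidence >= CONFIDENCE_THRESHOLD) return { label, layer: 2 };
    }
  } catch (err) {
    // Network failure or Worker outage: fall through to layer 3.
  }

  // Layer 3: rule-based word lists always resolve to some label.
  return { label: ruleBasedTone(text), layer: 3 };
}
```

Because every failure path falls through rather than throwing, the caller always gets a label, plus a `layer` field saying which stage produced it.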

Layer 1 — MiniLM labels baked into the HTML as a flat lookup. Any text the model has seen before resolves client-side in microseconds. No network call.

Layer 2 — TF-IDF + logistic regression running in a Cloudflare Worker. The model exports its weights as static JSON. tonemodel.js reimplements the math. No Python runtime in the inference path. Weight/vocabulary drift gets caught at training time, not production.

Layer 3 — Rule-based word lists. Always returns something.
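To make Layer 2 concrete, here is a rough sketch of what reimplementing TF-IDF + logistic regression inference from exported JSON can look like. It is not the contents of tonemodel.js. The `MODEL` object, its field names, its numbers, and the class labels are invented for illustration. The sketch also assumes the vectorizer used scikit-learn's default settings (raw term counts, a precomputed IDF vector, L2 normalization) and that the classifier was fit as a multinomial (softmax) model.

```javascript
// Hypothetical export shape; the real JSON in the repo may differ.
const MODEL = {
  classes: ["cynical", "hopeful", "neutral"],
  vocabulary: { enemies: 0, hope: 1, forgive: 2 }, // term -> column index
  idf: [1.5, 1.2, 1.8],                            // one weight per column
  coef: [                                          // one row per class
    [2.0, -1.0, 0.5],
    [-1.0, 2.0, 0.3],
    [-0.5, -0.5, -0.2],
  ],
  intercept: [0.0, 0.0, 0.1],
};

// Approximates scikit-learn's default token pattern: lowercase, runs of 2+
// word characters. JavaScript's \w is ASCII-only, so non-ASCII text may
// tokenize differently than it did at training time.
function tokenize(text) {
  return text.toLowerCase().match(/\b\w\w+\b/g) || [];
}

// Term counts * idf, then L2-normalized.
function tfidfVector(text, model) {
  const vec = new Array(model.idf.length).fill(0);
  for (const token of tokenize(text)) {
    if (Object.prototype.hasOwnProperty.call(model.vocabulary, token)) {
      vec[model.vocabulary[token]] += 1;
    }
  }
  const weighted = vec.map((count, i) => count * model.idf[i]);
  const norm = Math.sqrt(weighted.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? weighted : weighted.map((v) => v / norm);
}

// Numerically stable softmax.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// scores_k = coef_k . x + intercept_k, then softmax over classes.
function predictTone(text, model = MODEL) {
  const x = tfidfVector(text, model);
  const scores = model.coef.map((row, k) =>
    row.reduce((sum, w, i) => sum + w * x[i], model.intercept[k])
  );
  const probs = softmax(scores);
  const best = probs.indexOf(Math.max(...probs));
  return { label: model.classes[best], confidence: probs[best], probs };
}
```

The returned `confidence` is the kind of value the 0.38 threshold in the pipeline would be compared against. Tokenization is the most likely place for a port like this to drift from the Python original, which is one reason to check parity at training time.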

Accuracy

Tone isn’t sentiment. A line can be brutal and read hopeful, or optimistic and land cynical. That ambiguity is the domain, and it’s why binary classifiers don’t apply here.

Layer 1 (MiniLM embeddings) reaches 67% accuracy on the training corpus. Layer 2 (TF-IDF + logistic regression) reaches 54% accuracy on held-out novel quotes. Three classes, 155 training examples, no pretrained model in the inference path. That’s the signal the data supports, and I expect it to improve as the dataset grows.

Rather than forcing a single label, the system surfaces uncertainty. When the top two scores fall within a set threshold of each other, it returns both, treating ambiguity as output, not noise to suppress.
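One way to express that rule in code is sketched below. The function name `resolveLabels`, the input shape (an object mapping label to probability), and the margin of 0.1 are assumptions; the post does not state the actual threshold.

```javascript
// Hypothetical margin; the real threshold is not given in the post.
const AMBIGUITY_MARGIN = 0.1;

// probs: object mapping label -> probability, e.g. { hopeful: 0.45, ... }
function resolveLabels(probs, margin = AMBIGUITY_MARGIN) {
  // Rank labels from most to least likely.
  const ranked = Object.entries(probs).sort((a, b) => b[1] - a[1]);
  const [first, second] = ranked;

  // Ambiguous when the runner-up is within `margin` of the winner.
  const ambiguous = second !== undefined && first[1] - second[1] <= margin;

  return {
    labels: ambiguous ? [first[0], second[0]] : [first[0]],
    ambiguous,
  };
}

// Example: a near-tie returns both labels, a clear winner returns one.
const nearTie = resolveLabels({ hopeful: 0.45, cynical: 0.4, neutral: 0.15 });
const clear = resolveLabels({ hopeful: 0.7, cynical: 0.2, neutral: 0.1 });
```

Here `nearTie.labels` holds both `hopeful` and `cynical`, while `clear.labels` holds only `hopeful`.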

What this project actually explores:

Why This Exists

This project came out of Night City Voices, but it’s not really about Cyberpunk quotes.

It’s about building systems you understand end to end.

Reimplementing TF-IDF and logistic regression in JavaScript isn’t new. Exporting sklearn weights to JSON and running inference outside Python is a known pattern. sklearn-porter exists. So does m2cgen. This doesn’t use them.

Take a trained model, understand the math well enough to reproduce it, run it in a different runtime, and still have it behave predictably. This project rebuilds that path inside a Cloudflare Worker, with a fallback chain that guarantees a result even under partial failure.

The training corpus is 161 lines of public domain poetry, but the engine is dataset-agnostic. Swap in dialogue, moderation labels, chatbot tone filters, or game writing: supply a CSV with two columns, text and tone, and the same pipeline runs.

The quotes were just the first corpus. The engine is the point.

The repo is here: github.com/colombomf/tone-ml-engine

Clone it, swap the corpus, break the layers, make it yours.

Live Demo: tonemlengine.biokoder.com