2026-06-18T14:28:59Z by Showboat 0.6.1
The payoff: browse a curated reading list by grade band — and trust what you see.
A parent or teacher asks the library a simple question: what can my kid actually read? The honest answer is scattered across scales that don’t line up — Lexile, Accelerated Reader (ATOS), Guided Reading / Fountas & Pinnell — and for many books it isn’t catalogued at all. The companion walkthrough “What grade is this book?” showed how we extract those measures honestly and where the catalog is silent. This one is the payoff: turning that scattered, multi-scale data into one grade band per book that you can actually browse — with the estimates clearly marked, never hidden.
The design decision that makes it work: AI-estimated bands are included in the browse filter, but rendered faint with an ⓘ that shows exactly how they were made. Including them keeps the browsable surface useful (it roughly doubles what you can filter); rendering them faint keeps it honest. A real catalogued grade is solid; anything derived or estimated is faint.
Every command below drives the real consumer code against a small frozen sample of real Newbery titles, their real catalog measures, and a real AI-estimate audit trail captured once from the lake. showboat verify re-runs each one and diffs the output.
The scales don’t convert to each other cleanly, so RL3 resolves each book’s measures per work into one coarse grade band, native-first: a real catalogued grade wins and shows solid; a scale we had to convert is kept but flagged as an estimate; genuine disagreement widens the band rather than inventing a midpoint.
consumers/reading_lists_datasette/.venv/bin/python docs/demos/browse-by-level/_driver.py consensus
One band from many scales
=========================
A book can carry several measures on scales that DON'T compare directly
(a Lexile is not an ATOS is not an F&P letter). RL3 resolves them per WORK into
one coarse grade band, native-first: a real catalogued grade (ATOS/MARC) wins and
is solid; a scale we had to CONVERT (e.g. Lexile→grade) is kept but flagged as an
estimate; genuine disagreement WIDENS the band, it never invents a midpoint.
title                  band         basis                    shown
---------------------- ------------ ------------------------ ---------
Holes                  Grades 3-5   native (catalogued)      solid
    from: AR 4.6, 660L
A Wrinkle in Time      Grades 3-5   converted (scale→grade)  faint
    from: 740L
A Single Shard         Grades 6-8   AI estimate              faint
    from: (no catalogued measure)
26 Fairmount Avenue    Grades K-2   AI estimate              faint
    from: (no catalogued measure)
Holes carries TWO scales — ATOS 4.6 and Lexile 660L — that both land in grades 3–5;
the native ATOS wins, so its band is solid. A Wrinkle in Time has only a Lexile, so
its band is CONVERTED — kept, but marked an estimate. The two with no catalogued
measure at all fall through to the AI tier (next beat).
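The native-first rule above can be sketched in a few lines. This is a minimal illustration, not the RL3 code: the `Measure` class, the `resolve` function, the label for a widened band, and the converted grade values are all hypothetical, and it assumes each measure has already been turned into a grade equivalent upstream.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Coarse bands: (low label, high label, lowest grade, highest grade); K counts as grade 0.
BANDS = [("K", "2", 0, 2), ("3", "5", 3, 5), ("6", "8", 6, 8)]


@dataclass
class Measure:
    scale: str    # e.g. "atos", "lexile"
    grade: float  # grade equivalent; for a converted scale, the already-converted value
    native: bool  # True when the scale is itself a grade (ATOS/MARC)


def band_index(grade: float) -> int:
    """Return the position in BANDS of the band containing this grade."""
    whole = int(grade)  # 4.6 reads as fourth grade
    for i, (_, _, lo, hi) in enumerate(BANDS):
        if lo <= whole <= hi:
            return i
    raise ValueError(f"grade {grade} is outside K-8")


def resolve(measures: List[Measure]) -> Optional[Tuple[str, str]]:
    """Resolve one work's measures into (band label, basis), native-first."""
    if not measures:
        return None  # no catalogued measure: left to the AI tier
    native = [m for m in measures if m.native]
    chosen = native or measures  # native measures win when present
    hits = [band_index(m.grade) for m in chosen]
    lo, hi = min(hits), max(hits)  # disagreement widens, never averages
    label = f"Grades {BANDS[lo][0]}-{BANDS[hi][1]}"
    return label, ("native" if native else "converted")


# Converted grades below are placeholders, not real Lexile conversions.
holes = [Measure("atos", 4.6, True), Measure("lexile", 4.0, False)]
wrinkle = [Measure("lexile", 5.0, False)]
print(resolve(holes))    # ('Grades 3-5', 'native')
print(resolve(wrinkle))  # ('Grades 3-5', 'converted')
```

Two native measures that land in different bands, say grades 2.0 and 4.0, come back as one wider band covering both rather than a midpoint between them.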
For the books with no catalogued measure at all, a local model — running on our own hardware, so the description never leaves the library network — estimates a band from the book’s description. The safeguard is self-consistency: three independent reads must agree within about one grade, or we abstain and show nothing. Every read is kept on a transparency view, so an estimate is never a black box.
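One way to picture the self-consistency gate is as a spread check over the three reads. The sketch below is an assumption-laden illustration, not the production check: it treats each read as a grade range, measures spread as the gap between the highest and lowest range midpoints, and uses a threshold of exactly 1.0 for "about one grade". The names `agreement_spread` and `decide` are hypothetical.

```python
from typing import List, Tuple

Read = Tuple[int, int]  # one model read as an inclusive (low grade, high grade) range
MAX_SPREAD = 1.0        # "about one grade"; the exact threshold is an assumption
REQUIRED_READS = 3


def agreement_spread(reads: List[Read]) -> float:
    """Distance in grades between the highest and lowest read midpoints."""
    mids = [(lo + hi) / 2 for lo, hi in reads]
    return max(mids) - min(mids)


def decide(reads: List[Read]) -> str:
    """Return 'publish' when the reads agree closely enough, else 'abstain'."""
    if len(reads) < REQUIRED_READS:
        return "abstain"  # too few reads to call it agreement
    return "publish" if agreement_spread(reads) <= MAX_SPREAD else "abstain"


print(decide([(5, 8), (5, 8), (5, 8)]))  # publish (spread 0.0)
print(decide([(2, 3), (5, 8), (3, 5)]))  # abstain (spread 4.0)
```

Abstaining is a first-class outcome here: the function never averages disagreeing reads into a band.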
consumers/reading_lists_datasette/.venv/bin/python docs/demos/browse-by-level/_driver.py estimate
The AI estimate, honestly
=========================
When a book has NO catalogued measure, a local model (qwen3:32b, on our own
hardware — the description never leaves the library network) estimates a grade band
from the book's description. But only when it agrees with itself: we take 3
independent reads; they must land within ~1 grade or we ABSTAIN and show nothing.
Every read is kept for inspection on the transparency view:
'A Single Shard' — 3 independent reads (qwen3:32b):
read 0: grades 5–8 (medium confidence)
“Historical context, complex vocabulary (e.g., 'celadon ware,' 'irascible temper'), and thematic depth (perseverance, cultural craftsmanship) suggest middle-grade to young adult readability with nuanced character development and setting details.”
read 1: grades 5–8 (medium confidence)
“The description uses compound sentences, historical context, and specific vocabulary (e.g., 'celadon ware,' 'irascible temper') suggesting mid-to-upper elementary difficulty. Thematic depth and narrative complexity align with middle-grade readers.”
read 2: grades 5–8 (medium confidence)
“Historical setting, technical pottery terms, and emotional depth suggest mid-grade difficulty. Protagonist's age and accessible themes balance complexity.”
agreement spread: 0.0 grades → all three agree exactly → PUBLISH
→ reading band: Grades 6-8, marked is_estimated=true (basis 'ai').
Had the reads disagreed by more than ~1 grade, we would have published nothing —
"don't know" is an honest answer, and a better one than a confident guess.
This is the heart of it, and it’s the real render plugin the public site runs. A native band is a solid coloured pill; a converted or AI band is the same colour but faint and dashed, with an ⓘ linking to “How AI estimates are made.” The colour tracks the band: green for K-2, blue for 3-5, indigo for 6-8. Here is the literal HTML the plugin emits for each sample book:
consumers/reading_lists_datasette/.venv/bin/python docs/demos/browse-by-level/_driver.py render
How it renders — the real plugin
================================
The browse surface runs these exact functions. A native band is a SOLID coloured
pill; a converted or AI band is the same colour but FAINT and dashed, with an ⓘ
that links to “How AI estimates are made”. The colour grows up with the band
(K-2 green → 3-5 blue → 6-8 indigo). Below is the literal HTML the plugin emits:
Holes <span class="reading-band band--3-5">Grades 3-5</span>
A Wrinkle in Time <span class="reading-band band--3-5 band--estimated">Grades 3-5 <a class="band-info" href="/how-ai-estimates-are-made" title="Estimated — how?">ⓘ</a></span>
A Single Shard <span class="reading-band band--6-8 band--estimated">Grades 6-8 <a class="band-info" href="/how-ai-estimates-are-made" title="Estimated — how?">ⓘ</a></span>
26 Fairmount Avenue <span class="reading-band band--k-2 band--estimated">Grades K-2 <a class="band-info" href="/how-ai-estimates-are-made" title="Estimated — how?">ⓘ</a></span>
Raw 'as catalogued' measures render as verbatim chips (Holes):
<span class="chips"><span class="chip chip--catalogued meas--atos">AR 4.6</span><span class="chip chip--catalogued meas--lexile">660L</span></span>
The colour class is keyed on the band; estimates add band--estimated:
Grades K-2 -> band--k-2
Grades 3-5 -> band--3-5
Grades 6-8 -> band--6-8
Solid vs faint is the whole honesty contract: a parent can never mistake an
estimate for what the catalogue actually says — and the ⓘ shows exactly how it was made.
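For readers who want to see the contract as code, here is a small sketch that reproduces the markup shown above. It is not the plugin source: `band_class` and `render_band` are hypothetical names, and only the class names, link target, and title text are taken from the emitted HTML.

```python
from html import escape

# Link markup copied from the emitted HTML above.
INFO_LINK = (
    '<a class="band-info" href="/how-ai-estimates-are-made" '
    'title="Estimated — how?">ⓘ</a>'
)


def band_class(band: str) -> str:
    """Map a label such as 'Grades K-2' to its colour class 'band--k-2'."""
    return "band--" + band.split(" ", 1)[1].lower()


def render_band(band: str, is_estimated: bool) -> str:
    """Render a band pill: solid when native, faint plus an info link when estimated."""
    classes = ["reading-band", band_class(band)]
    inner = escape(band)
    if is_estimated:
        classes.append("band--estimated")
        inner = f"{inner} {INFO_LINK}"
    class_attr = " ".join(classes)
    return f'<span class="{class_attr}">{inner}</span>'


print(render_band("Grades 3-5", is_estimated=False))
print(render_band("Grades K-2", is_estimated=True))
```

Keeping the estimate flag as the only thing that adds `band--estimated` and the ⓘ link means the solid and faint pills cannot drift apart in colour.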
On chimpy-reader the grade band is a real facet: pick Grades 3-5 and a curated list filters to those titles. Because the estimates are in the facet (marked), the browsable surface stays useful instead of being cut in half. A snapshot of the live distribution:
consumers/reading_lists_datasette/.venv/bin/python docs/demos/browse-by-level/_driver.py facet
Browse it — filter a curated list by level
==========================================
On chimpy-reader the grade band is a real Datasette facet: pick “Grades 3-5” and the
curated list filters to those titles. AI-estimated bands are INCLUDED in the facet
(marked faint), so the browsable surface stays useful instead of being halved.
This snapshot of the live surface — 99 of 177 curated titles carry a band:
Grades K-2 5 █████
Grades 3-5 67 ███████████████████████████████████████████████████████████████████
Grades 6-8 27 ███████████████████████████
99 banded titles browsable by level today — roughly half AI estimates
(faint), the rest catalogued or converted (solid). Coverage grows on its own as
the catalog backfill and the daily enrichment timers land more measures.
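The arithmetic behind the snapshot is a plain tally. The sketch below rebuilds it from the totals shown above using a synthetic list of band labels; `facet_counts` and `bar_lines` are hypothetical helpers, and the real surface gets its counts from the Datasette facet rather than from code like this.

```python
from collections import Counter
from typing import List, Optional

BAND_ORDER = ["Grades K-2", "Grades 3-5", "Grades 6-8"]


def facet_counts(bands: List[Optional[str]]) -> Counter:
    """Count titles per band; titles with no band (None) are left out."""
    return Counter(b for b in bands if b is not None)


def bar_lines(counts: Counter) -> List[str]:
    """One text bar per band, youngest band first."""
    return [f"{band} {counts[band]:>3} {'█' * counts[band]}" for band in BAND_ORDER]


# Synthetic list rebuilt from the snapshot totals: 99 banded titles out of 177.
titles = ["Grades K-2"] * 5 + ["Grades 3-5"] * 67 + ["Grades 6-8"] * 27 + [None] * 78
counts = facet_counts(titles)
print("\n".join(bar_lines(counts)))
print(f"{sum(counts.values())} of {len(titles)} titles carry a band")
```

Titles without a band simply do not appear in the facet, which is why the 78 unbanded titles are counted in the total but draw no bar.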
Every command above re-runs under showboat verify. The on-thesis invariants (native bands solid, converted and AI bands faint + ⓘ + inspectable, every band colour-coded, raw measures verbatim) are pinned by the demo’s own guard test, run against the real plugin:
PYTHONPATH=src consumers/reading_lists_datasette/.venv/bin/python -m pytest tests/demos/test_browse_by_level.py -q 2>&1 | sed -E 's/ in [0-9.]+s//'
..... [100%]
5 passed
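The sed step in that command is what keeps the captured output diffable: pytest's summary line ends with a run time that changes on every run, so it is stripped before comparison. A rough Python equivalent of that one substitution, with a hypothetical helper name and an illustrative timing value:

```python
import re

# Same pattern as the sed expression: " in <seconds>s".
TIMING = re.compile(r" in [0-9.]+s")


def strip_timing(line: str) -> str:
    """Remove the first ' in <seconds>s' fragment, like the sed expression."""
    return TIMING.sub("", line, count=1)


print(strip_timing("5 passed in 0.12s"))  # 5 passed
```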
Browse-by-reading-level ships today. The next moves are designed or merely imagined, and posed here as questions, not claims:
The discipline is the point. We shipped the part we can stand behind — one honest, browsable band per book, estimates marked and inspectable — and everything above re-runs on demand.
Rendered from 226199c on 2026-06-18 · showboat verify: reproduces. A living artifact: the version ledger is git.