I ❤️ Datasette

From ILS Data to a $5/Month Analysis Platform

Ray Voelker
Cincinnati & Hamilton County Public Library
ray.voelker@chpl.org

Your Data Is Trapped

  • Libraries sit on TONS of structured data
  • ILSs weren't designed for flexible, arbitrary queries and analysis tasks
  • The people who know the collection best often have the least direct access to the data

So What Do People Do?

  • Wrestle with complicated data pipelines and stale spreadsheets
  • Rely on one or two staff to produce reports
  • Scrape the Public WebPAC (this is especially bad)

What If You Could Just... Browse It?

  • Browse the entire collection metadata in a web browser
  • Filter by location, format, status, date — instantly
  • Run pre-built queries without writing SQL
  • Export results to CSV
  • Access a JSON API for advanced use

collection-analysis.cincy.pl
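
As a rough sketch of what the filter, CSV, and JSON bullets look like as URLs: Datasette serves each table at `/database/table`, and adding `.json` or `.csv` changes the output format. Column filters such as `column__exact=value`, plus `_size` and `_shape=array`, are standard Datasette query-string options. The database name `current_collection`, the table name `item`, and the column `location_code` below are hypothetical placeholders, not names confirmed by these slides.

```python
from urllib.parse import urlencode

BASE_URL = "https://collection-analysis.cincy.pl"  # host shown on the slide


def table_url(database, table, fmt="json", filters=None, size=None):
    """Build a Datasette table URL with optional column filters.

    `database`, `table`, and the filter column names are whatever the
    instance actually exposes; the values used below are placeholders.
    """
    params = dict(filters or {})
    if size is not None:
        params["_size"] = size  # limit the number of rows returned
    if fmt == "json":
        params["_shape"] = "array"  # plain list of row objects
    url = f"{BASE_URL}/{database}/{table}.{fmt}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


# Hypothetical names, for illustration only.
json_url = table_url(
    "current_collection", "item",
    filters={"location_code__exact": "main"}, size=5,
)
csv_url = table_url("current_collection", "item", fmt="csv")
print(json_url)
print(csv_url)
```

The same filters work in the browser UI, so a URL built this way can be pasted into the address bar before it goes into a script.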

What is Datasette?

  • Created by Simon Willison (co-creator of Django)
  • "A tool for exploring and publishing data"
  • Aimed at journalists, museum curators, archivists, local governments, and anyone else who has data
  • Read-only by default — safe to hand to anyone

datasette.io

Things I ❤️ about Datasette

  • Written in Python / Available via PyPI.org
  • Easy and intuitive to use
  • Well-documented
  • Useful, large, and growing plugin library
  • Built-in API... THE API IS SQL!
  • Open source + supportive developer
  • Flexible deployment with CHEAP hosting options
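
"The API is SQL" means roughly this: a read-only SQL query goes in the `sql` query-string parameter, and any `:named` placeholders are filled from extra parameters of the same name. The sketch below only builds the URL. It assumes the `/database.json?sql=...` form, which newer Datasette versions may redirect to a different path. The database, table, and column names are hypothetical.

```python
from urllib.parse import urlencode


def sql_url(base_url, database, sql, **named_params):
    """Build a Datasette URL that runs `sql` and returns JSON rows.

    Values for :named placeholders in the SQL are passed as ordinary
    query-string parameters with the same names.
    """
    params = {"sql": sql, "_shape": "array"}
    params.update(named_params)
    return f"{base_url}/{database}.json?{urlencode(params)}"


# Hypothetical database, table, and column names, for illustration only.
url = sql_url(
    "https://collection-analysis.cincy.pl",
    "current_collection",
    "select location_code, count(*) as n from item "
    "where format = :fmt group by location_code",
    fmt="DVD",
)
print(url)
```

Fetching that URL with any HTTP client should return a JSON list with one object per result row.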

The Pipeline


Sierra ILS PostgreSQL (Sierra DB)
    ↓
Python Scripts (sqlite-utils)
    ↓
SQLite Database
    ↓
Datasette → collection-analysis.cincy.pl

github.com/cincinnatilibrary/collection-analysis
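
A minimal sketch of the middle step of the pipeline, not the code from the repository above. To stay dependency-free it swaps in Python's standard-library `sqlite3` module where the real scripts use sqlite-utils, and it uses a few made-up rows where the real pipeline would query the Sierra PostgreSQL database. The table layout and column names are assumptions.

```python
import sqlite3


def load_items(db_path, rows):
    """Rebuild a hypothetical `item` table from an iterable of dict rows.

    Returns the number of rows in the table after loading.
    """
    conn = sqlite3.connect(db_path)
    with conn:  # commit on success, roll back on error
        conn.execute("DROP TABLE IF EXISTS item")
        conn.execute(
            "CREATE TABLE item ("
            "id INTEGER PRIMARY KEY, title TEXT, "
            "location_code TEXT, format TEXT)"
        )
        conn.executemany(
            "INSERT INTO item (id, title, location_code, format) "
            "VALUES (:id, :title, :location_code, :format)",
            rows,
        )
    count = conn.execute("SELECT count(*) FROM item").fetchone()[0]
    conn.close()
    return count


# Placeholder rows standing in for the result of a Sierra query.
sample_rows = [
    {"id": 1, "title": "Example Title A", "location_code": "main", "format": "Book"},
    {"id": 2, "title": "Example Title B", "location_code": "branch", "format": "DVD"},
]
```

Calling `load_items("collection.db", sample_rows)` writes a SQLite file, and running `datasette collection.db` then serves it for browsing.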

Demo Time!

Hosting: $5 a Month. Seriously.

  • DigitalOcean Droplet: $5/month
    • 1 vCPU, 1 GB RAM, 25 GB SSD
  • Apache reverse proxy (free)
  • Let's Encrypt TLS (free)
  • Datasette + Python (free, open source)

Once You Have the Pattern, Everything Is a Dataset

  • ILS collection data ✓
  • Newspaper indexes (Newsdex) — up next!

Resources

Thanks!

Please let me know if you come up with any cool queries, ❤️ Datasette as much as I do, or want to use this at your library!

Ray Voelker
ray.voelker@gmail.com | ray.voelker@chpl.org
github.com/rayvoelker