python-playground

Playground for developing and demonstrating Python development techniques.

This repository hosts Acoustic Reference Book — Phase 1: a schema-driven pipeline that turns acoustic calculation output into a validated XML Acoustic Dataset. It is built as a learning project where every decision is documented and defensible.

Quick start

Open in a GitHub Codespace (Code → Codespaces → Create codespace). It provisions Python 3.9.4 (matching the target system) and runs make bootstrap automatically. Then:

make verify        # lint + tests — confirm the environment is green
make docs-serve    # browse the docs at http://localhost:8000

Working locally instead? You need Python 3.9.x and make; then run make bootstrap. Run make on its own to see all targets.

Self-guided adventures

New here? Work through these in order. Each is a short, hands-on loop — run a command, look at what it produced, then peek at the code that did it. Everything below has been run end-to-end, so the output you see should match.

All commands assume make bootstrap has finished (it runs automatically in a Codespace). If acoustic isn't found, the package is exposed as a module too: replace acoustic with python -m acoustic_dataset.cli.

Adventure 1 — Run the whole pipeline and watch the gates

make verify                                  # lint + type-check + tests: is the env green?
acoustic pipeline                            # input XML -> typed objects -> XML -> validate

You should see something like:

pipeline ok: 10 band(s) -> build/acoustic_dataset.xml (schema-valid, round-trip-equal)

Open the file it wrote (build/acoustic_dataset.xml) and compare it to the input (examples/calculation_input.xml). Then run the structural gate on its own and the migration-safety diff against the known-good reference:

acoustic validate --xml build/acoustic_dataset.xml
acoustic compare build/acoustic_dataset.xml examples/reference/trial_known_good.xml

Now make it your own. examples/calculation_input.xml is your sandbox — edit a value and re-run to watch the validated output change. Bump <BandCount>10 to 20, or <BaseLevelDb>140.0 to 155.0, then:

acoustic pipeline

The band count in the message and the generated XML move with your edit. This is the tightest loop in the project: input XML in, validated dataset out.
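
To make "round-trip-equal" concrete, here is a minimal sketch using only the standard library. It is not the project's implementation (the real pipeline works with xsdata-generated models and XSD validation). The Band class, the RadiatedNoise, Band, and CentreFrequency element names, and both helper functions are hypothetical stand-ins chosen for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class Band:
    """Hypothetical stand-in for one generated model class."""

    centre_frequency: Decimal


def bands_to_xml(bands: List[Band]) -> str:
    """Serialise typed objects to XML (element names are assumptions)."""
    root = ET.Element("RadiatedNoise")
    for band in bands:
        node = ET.SubElement(root, "Band")
        ET.SubElement(node, "CentreFrequency").text = str(band.centre_frequency)
    return ET.tostring(root, encoding="unicode")


def bands_from_xml(xml_text: str) -> List[Band]:
    """Parse the XML back into typed objects."""
    root = ET.fromstring(xml_text)
    return [
        Band(Decimal(node.findtext("CentreFrequency")))
        for node in root.findall("Band")
    ]


def round_trip_equal(bands: List[Band]) -> bool:
    """True when objects -> XML -> objects gives back equal objects."""
    return bands_from_xml(bands_to_xml(bands)) == bands
```

The idea is that a lossy serialiser (one that drops a field or rounds a Decimal, say) would fail this check even if the XML it wrote were perfectly schema-valid.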

Then break it on purpose to watch the gates earn their keep. Set <BandCount> to a negative number, or put text where a number belongs, and re-run:

error: build rejected a value before serialisation: ... schema: value must be positive

The pipeline rejects it, prints the reason, exits non-zero, and does not write a stale artifact. (Restore the file afterward — git checkout examples/calculation_input.xml.)
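
The behaviour described above (reject, report, exit non-zero, write nothing) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in build.py: BuildError, check_band_count, write_artifact, and main are hypothetical names, and the XML written here is a toy placeholder.

```python
import os
import sys
import tempfile
from pathlib import Path


class BuildError(ValueError):
    """Hypothetical error for a value rejected before serialisation."""


def check_band_count(raw: str) -> int:
    """Validate the raw text of a BandCount value."""
    try:
        value = int(raw)
    except ValueError:
        raise BuildError(f"BandCount must be an integer, got {raw!r}") from None
    if value <= 0:
        raise BuildError(f"BandCount must be positive, got {value}")
    return value


def write_artifact(raw_band_count: str, out_path: Path) -> None:
    """Validate first; only then write, via a temp file and an atomic rename."""
    band_count = check_band_count(raw_band_count)  # raises before any file is touched
    xml_text = f"<AcousticDataset><BandCount>{band_count}</BandCount></AcousticDataset>"
    fd, tmp_name = tempfile.mkstemp(dir=str(out_path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(xml_text)
    os.replace(tmp_name, out_path)


def main(raw_band_count: str, out_path: Path) -> int:
    """Return a process exit code: 0 on success, 1 on a rejected value."""
    try:
        write_artifact(raw_band_count, out_path)
    except BuildError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0
```

Validating before opening any file is what keeps a failed run from leaving a half-written or misleading output behind.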

Now read why: docs/concepts/two-verification-gates.md explains why "schema-valid" and "correct" are two different checks. The code lives in src/acoustic_dataset/validate.py and compare.py, dispatched from cli.py.
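
As a rough illustration of why the two gates differ, the sketch below uses toy stand-ins built on the standard library: structurally_ok plays the role of the structural gate and same_content plays the role of the compare gate. Both functions, the Dataset root element, and the sample documents are assumptions made for this example; the real gates use the XSD and the project's own comparison logic.

```python
import xml.etree.ElementTree as ET

REFERENCE = "<Dataset><BaseLevelDb>140.0</BaseLevelDb></Dataset>"
CANDIDATE = "<Dataset><BaseLevelDb>155.0</BaseLevelDb></Dataset>"


def structurally_ok(xml_text: str) -> bool:
    """Toy structural gate: expected root and a numeric BaseLevelDb."""
    root = ET.fromstring(xml_text)
    if root.tag != "Dataset":
        return False
    try:
        float(root.findtext("BaseLevelDb"))
    except (TypeError, ValueError):
        return False
    return True


def same_content(a: str, b: str) -> bool:
    """Toy compare gate: equal once both documents are canonicalised."""
    return ET.canonicalize(a, strip_text=True) == ET.canonicalize(b, strip_text=True)


# Both documents pass the structural check...
both_valid = structurally_ok(REFERENCE) and structurally_ok(CANDIDATE)
# ...but only the comparison notices that the value itself has changed.
matches_reference = same_content(CANDIDATE, REFERENCE)
```

Canonicalising before comparing means differences in indentation or attribute order do not count as a mismatch, while a changed value does.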

Adventure 2 — Explore a typed Python data class

The pipeline never carries data in a loose dict; it builds one typed object that meets the schema. Feel the difference yourself.

First, experience editor autocomplete. Open examples/explore_platform.py in VS Code (the Codespace ships Pylance). Find the commented-out line # platform.radiated_noise. and delete the #. Put your cursor right after the trailing dot and start typing, or press Ctrl+Space to trigger suggestions. VS Code pops up the object's declared attributes (band, …) statically, before the code has even run. Try platform. too for the top-level attributes (radiated_noise, sensors, …). Keep drilling (platform.radiated_noise.band[0].) and it completes all the way down. That instant, schema-shaped autocomplete is the real payoff of declared fields. Run the file any time with:

python examples/explore_platform.py

If autocomplete doesn't appear, give Pylance a moment to index, and make sure VS Code's Python interpreter is set to the Codespace's 3.9 (bottom-right status bar / Python: Select Interpreter).

Then poke at it in the REPL:

python

>>> from acoustic_dataset import build, serialize
>>> platform = build.build_platform_from_file("examples/calculation_input.xml")
>>> type(platform).__name__
'Platform'
>>> platform.radiated_noise.band[0].centre_frequency      # a Decimal, not a string
>>> print(serialize.to_xml(platform)[:500])               # the same object, as XML
>>> from acoustic_dataset.models.acoustic_dataset import Sector
>>> Sector(bering=1, level=2)        # a typo is a TypeError, not a silent new key

The REPL can also complete attributes, but that's a separate readline/rlcompleter feature that introspects the live object at runtime — not the static, declared-field completion you saw in the editor. If <TAB> doesn't complete, enable it for the session with import readline, rlcompleter; readline.parse_and_bind("tab: complete").
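
To see that runtime introspection directly, the standard library's rlcompleter can be driven by hand. In this sketch the platform object is a SimpleNamespace stand-in invented for the example, not the real Platform class.

```python
import rlcompleter
from types import SimpleNamespace

# Hypothetical stand-in object; the real one comes from build_platform_from_file.
platform = SimpleNamespace(
    radiated_noise=SimpleNamespace(band=[]),
    sensors=[],
)

# The completer evaluates "platform" in this namespace and inspects the live object.
completer = rlcompleter.Completer({"platform": platform})
first_match = completer.complete("platform.ra", 0)
print(first_match)
```

Because the completer calls dir() on a live object, it only knows about attributes that exist at that moment, which is the contrast with the editor's static analysis of declared fields.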

Read docs/concepts/typed-vs-dicts.md for what declared fields buy you over a dictionary, then look at how the builder rejects out-of-range values in src/acoustic_dataset/build.py.
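
A small side-by-side makes the difference tangible. The Sector class below is a hand-written stand-in with assumed field names (bearing, level); the real class is generated from the XSD and may differ.

```python
from dataclasses import dataclass


@dataclass
class Sector:
    """Hand-written stand-in for the generated class (field names assumed)."""

    bearing: int
    level: int


# A dictionary accepts the misspelled key without complaint...
loose = {"bering": 1, "level": 2}
# ...and the mistake only shows up later, as a missing value.
missing = loose.get("bearing")

# A declared class rejects the same typo at construction time.
error_message = None
try:
    Sector(bering=1, level=2)
except TypeError as exc:
    error_message = str(exc)
```

With the dictionary, the failure appears wherever the value is eventually read; with the declared class, it appears on the line that made the mistake.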

Adventure 3 — Reverse-engineer the data classes from the schema

Those data classes aren't hand-written — they're generated from the XSD by xsdata. The XSD is the single source of truth; the Python is a build artifact (note the # DO NOT EDIT BY HAND header on every model file).

acoustic generate                            # regenerate models from every schema/*.xsd

Trace the chain for one element:

1. Open schema/acoustic_dataset.xsd and find an element, e.g. Sector.
2. Open the generated src/acoustic_dataset/models/acoustic_dataset.py and find the matching @dataclass. Notice how XSD types, ranges, and docs became Python field metadata.
3. Read src/acoustic_dataset/generate.py to see the exact xsdata invocation, along with the two post-processing steps that keep the output deterministic and Python-3.9-compatible.
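
For a feel of what "field metadata" means before opening the generated file, here is a hand-written approximation in the style xsdata typically emits. The field name, element name, and the 0 to 360 range are assumptions for illustration, and out_of_range is a hypothetical helper, not part of the project or of xsdata.

```python
from dataclasses import dataclass, field, fields
from decimal import Decimal
from typing import List, Optional


@dataclass
class Sector:
    """Approximation of a generated class; the real model may differ."""

    bearing: Optional[Decimal] = field(
        default=None,
        metadata={
            "name": "Bearing",          # XML element name (assumed)
            "type": "Element",
            "required": True,
            "min_inclusive": Decimal("0"),    # assumed range
            "max_exclusive": Decimal("360"),  # assumed range
        },
    )


def out_of_range(obj) -> List[str]:
    """List the fields whose value falls outside the declared range metadata."""
    problems = []
    for f in fields(obj):
        value = getattr(obj, f.name)
        if value is None:
            continue
        low = f.metadata.get("min_inclusive")
        high = f.metadata.get("max_exclusive")
        if (low is not None and value < low) or (high is not None and value >= high):
            problems.append(f.name)
    return problems
```

The point of the sketch is that schema facets travel with the class as plain data, so tooling can read them back with dataclasses.fields instead of re-parsing the XSD.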

Want to see generation in action? Change a <xs:documentation> string in the XSD, run acoustic generate, and git diff the models — the docstring tracks the schema. (Revert the XSD edit afterward; CI fails if committed models drift from the schema — ADR 0008.)
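
A drift check of this kind can be sketched as "regenerate into a scratch directory and compare with what is committed". The function below is an assumption about the general shape of such a check, not the project's CI step; drifted_files and its regenerate callback are hypothetical.

```python
import filecmp
import tempfile
from pathlib import Path
from typing import Callable, List


def drifted_files(committed_dir: Path, regenerate: Callable[[Path], object]) -> List[str]:
    """Regenerate into a scratch dir and name the files that differ.

    `regenerate` is a hypothetical callback that writes fresh models
    into the directory it is given.
    """
    with tempfile.TemporaryDirectory() as scratch:
        fresh_dir = Path(scratch)
        regenerate(fresh_dir)
        names = sorted(
            {p.name for p in committed_dir.glob("*.py")}
            | {p.name for p in fresh_dir.glob("*.py")}
        )
        # shallow=False compares file contents, not just size and mtime.
        _, mismatch, errors = filecmp.cmpfiles(
            committed_dir, fresh_dir, names, shallow=False
        )
        # `errors` holds files present on only one side, which also counts as drift.
        return sorted(mismatch + errors)
```

A CI job built on this idea would fail whenever the returned list is non-empty, which is why deterministic generator output matters.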

Adventure 4 — Generate and view the XSD documentation

Two complementary docs come out of this repo. First, a standalone HTML schema reference rendered straight from the XSD (via the vendored xs3p stylesheet):

acoustic gen-schema-docs --out build/schema-preview.html

Open build/schema-preview.html (in a Codespace: right-click the file → Download, or use the Live Preview extension) to browse every element, type, and constraint of the contract. The pre-built version is committed at docs/reference/schema/index.html. The generator lives in src/acoustic_dataset/schema_html.py.

Second, the full project site — tutorials, concepts, ADRs, and a Mermaid ERD — served by MkDocs Material:

make docs-serve                              # browse at http://localhost:8000

In a Codespace, when the port-forward notification appears, click Open in Browser.

Adventure 5 — Point it at the real (private) schema

Everything above runs against the placeholder schema committed in schema/. The real, proprietary XSD and corpus are meant to live under private/ — a directory that is entirely gitignored (see .gitignore) so it never reaches git, CI, or the internet.

One constraint shapes the workflow: src/acoustic_dataset/models/ is committed (the placeholder-generated models that CI drift-checks), so you must not regenerate over it from a real schema — an accidental commit would leak that structure. Generate real models into private/ instead. The same CLI takes explicit paths, so once the real material is in place (real XSD in private/schema/, inputs in private/examples/, known-good XML in private/reference/):

# Generate typed models from the real XSD into the gitignored output dir
acoustic generate --schema private/schema/<real>.xsd --out private/models

# Structural gate (XSD + round-trip) on a real file against the real schema
acoustic validate --xml private/examples/<real>.xml --schema private/schema/<real>.xsd

# Migration-safety diff: generated vs known-good reference
acoustic compare private/<generated>.xml private/reference/<known_good>.xml
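
The rule that real models must land under private/ could be backed by a small guard like the one below. This is a hypothetical sketch, not something the CLI is known to do: is_inside and check_generate_paths are invented names, and the paths in the usage lines are examples only.

```python
from pathlib import Path


def is_inside(path: Path, root: Path) -> bool:
    """True when path is root itself or lies somewhere beneath it."""
    path, root = path.resolve(), root.resolve()
    return path == root or root in path.parents


def check_generate_paths(schema: Path, out_dir: Path, repo_root: Path) -> None:
    """Refuse to write models from a private schema anywhere outside private/."""
    private_root = repo_root / "private"
    if is_inside(schema, private_root) and not is_inside(out_dir, private_root):
        raise ValueError(
            f"schema {schema} is private, so output must stay under {private_root}"
        )


# Example usage with illustrative paths (nothing is read or written):
repo = Path("repo")
check_generate_paths(repo / "private/schema/real.xsd", repo / "private/models", repo)
```

Resolving both paths first means a relative path or a ".." segment cannot slip past the check.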

Caveat — the full pipeline command. acoustic pipeline imports acoustic_dataset.models (the committed package), so it always builds against the placeholder-derived bindings even when you pass --schema private/.... Running the end-to-end pipeline on real-generated models would mean repointing that import at private/models/ — a deliberate change that touches committed code, so raise it for review first. With the real schema you can use generate, validate, and compare against private/ paths immediately; full pipeline needs that reviewed change.

Where to go next

Follow the guided tutorial that ties all of this together: docs/tutorials/01-start-here.md, then skim the decision records in docs/decisions/ — each says why a choice was made and what was rejected.

Documentation

The full, navigable documentation — tutorials, how-to guides, concepts, decision records (ADRs), and a Mermaid schema ERD — lives in docs/ and renders as an attractive HTML site via MkDocs Material (make docs).

Start here: docs/tutorials/01-start-here.md.

Spec-driven development

This project uses GitHub Spec Kit. The active feature's specification, plan, and design artifacts are under specs/001-codespace-xml-scaffold/. Use the /speckit-* skills (specify, plan, tasks, implement) to drive development.

Layout

| Path | What it is |
| --- | --- |
| `.devcontainer/` | Codespaces environment (Python 3.9.4) |
| `schema/` | The XSD contract |
| `src/acoustic_dataset/` | The pipeline package (`models/` is generated, never hand-edited) |
| `examples/` | Example calculation input + `reference/` known-good files |
| `tests/` | Unit, integration, and golden-file tests |
| `docs/` | The documentation site (MkDocs Material) |
| `specs/` | Spec Kit feature specs, plans, and contracts |
