Battle-tested AWS + Databricks resource naming automation with ACID transactions.
This guide shows how to use the three key parts of the system together:
- .dpn/*.yaml - Configuration files (naming values and patterns)
- blueprints/*.json - Resource definitions (what to create)
- schemas/*.json - Validation schemas (ensure correctness)
Set up your project's naming configuration:
# Interactive prompts
uv run dpn config init
# Or specify values directly
uv run dpn config init --project myproject --environment dev --region us-east-1

This creates two configuration files in .dpn/:
- .dpn/naming-values.yaml - Variable values (project, environment, region, etc.)
- .dpn/naming-patterns.yaml - Name templates with placeholders
Customize if needed:
# Edit values
vim .dpn/naming-values.yaml
# Edit patterns (optional - defaults usually work)
vim .dpn/naming-patterns.yaml
# Validate your changes
uv run dpn config validate
# View current configuration
uv run dpn config show

Generate a blueprint that defines what resources to create:
# Creates blueprints/dev.json
uv run dpn plan init --env dev --project myproject
# Creates blueprints/prd.json
uv run dpn plan init --env prd --project myproject

Edit the blueprint to customize your resources:
vim blueprints/dev.json

The blueprint specifies:
- Environment metadata (env, project, region, team, cost center)
- AWS resources (S3 buckets, Glue databases/tables)
- Databricks resources (clusters, jobs, Unity Catalog)
Validate blueprint structure and configuration:
# Validate blueprint against schema
uv run dpn plan validate blueprints/dev.json
# Validate configuration files
uv run dpn config validate

Validation uses schemas in schemas/:
- naming-values-schema.json - Validates configuration values
- naming-patterns-schema.json - Validates naming patterns
- Blueprint schema (internal) - Validates blueprint structure
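As a rough illustration of what schema validation buys you, the sketch below checks a blueprint document against a few required top-level keys. This is a deliberately simplified, hypothetical stand-in for the full JSON Schema enforcement that the real schemas in schemas/ provide:

```python
import json

# Hypothetical, simplified stand-in for JSON Schema validation:
# only checks that required top-level keys exist with the right types.
REQUIRED = {"version": str, "metadata": dict, "resources": dict}

def validate_blueprint(raw: str) -> list[str]:
    """Return a list of validation errors (empty list means valid)."""
    doc = json.loads(raw)
    errors = []
    for key, expected_type in REQUIRED.items():
        if key not in doc:
            errors.append(f"missing required key: {key}")
        elif not isinstance(doc[key], expected_type):
            errors.append(f"{key} must be {expected_type.__name__}")
    return errors

blueprint = '{"version": "1.0", "metadata": {"environment": "dev"}, "resources": {}}'
print(validate_blueprint(blueprint))  # → []
```

The real validator reports schema-path details as well; the point is simply that structural errors are caught before any resource is touched.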
See what names will be created before executing:
# Table format (default)
uv run dpn plan preview blueprints/dev.json
# JSON format for automation
uv run dpn plan preview blueprints/dev.json --format json --output preview.json
# With runtime overrides
uv run dpn plan preview blueprints/dev.json --override environment=prd

Deploy your infrastructure:
# Dry run first (recommended)
uv run dpn create --blueprint blueprints/dev.json --dry-run
# Execute creation
uv run dpn create --blueprint blueprints/dev.json

.dpn/
├── naming-values.yaml → Variable values (project, environment, etc.)
└── naming-patterns.yaml → Name templates with {placeholders}
↓
(combine values + patterns)
↓
Generated Names
↓
blueprints/
└── dev.json → Resource definitions
↓
(validated against)
↓
schemas/
├── naming-values-schema.json
├── naming-patterns-schema.json
└── (internal blueprint schema)
↓
(creates resources)
↓
AWS & Databricks Resources
Complete workflow in 5 commands:
# 1. Set up configuration
uv run dpn config init --project dataplatform --environment dev
# 2. Create blueprint
uv run dpn plan init --env dev --project dataplatform
# 3. Preview names
uv run dpn plan preview blueprints/dev.json
# 4. Dry run
uv run dpn create --blueprint blueprints/dev.json --dry-run
# 5. Create resources
uv run dpn create --blueprint blueprints/dev.json

Values (.dpn/naming-values.yaml):
defaults:
  project: dataplatform
  environment: dev
  region: us-east-1

Patterns (.dpn/naming-patterns.yaml):
patterns:
  aws_s3_bucket:
    template: "{project}-{purpose}-{layer}-{environment}-{region_code}"

Blueprint (blueprints/dev.json):
{
  "version": "1.0",
  "metadata": {"environment": "dev", "project": "dataplatform"},
  "resources": {
    "aws": {"s3_buckets": [{"purpose": "raw", "layer": "raw"}]}
  }
}

Result: S3 bucket named dataplatform-raw-raw-dev-use1
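Under the hood, a name like this comes from filling the pattern's placeholders with resolved values. A simplified Python sketch of that flow (not the actual dpn code; the region mapping and the 63-character S3 limit mirror the validation config shown later in this guide):

```python
# Simplified sketch of template-based name generation: fill the
# {placeholders} from config values, apply the region_code
# transformation, and enforce the S3 bucket name length limit.
REGION_MAPPING = {"us-east-1": "use1", "us-west-2": "usw2"}

def generate_s3_name(values: dict) -> str:
    template = "{project}-{purpose}-{layer}-{environment}-{region_code}"
    variables = dict(values)
    variables["region_code"] = REGION_MAPPING[values["region"]]
    name = template.format(**variables)
    if len(name) > 63:  # S3 bucket name limit
        raise ValueError(f"name exceeds 63 chars: {name}")
    return name

values = {"project": "dataplatform", "purpose": "raw",
          "layer": "raw", "environment": "dev", "region": "us-east-1"}
print(generate_s3_name(values))  # → dataplatform-raw-raw-dev-use1
```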
# Install UV (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone repository
git clone https://github.com/robandrewford/data-platform-naming.git
cd data-platform-naming
# Install dependencies (creates .venv automatically)
uv sync
# Verify installation
uv run dpn --help
# Preview names
uv run dpn plan preview blueprints/prd.json
# Generate blueprint (creates blueprints/prd.json by default)
uv run dpn plan init --env prd --project platform
# Create resources (dry-run first)
uv run dpn create --blueprint blueprints/prd.json --dry-run
uv run dpn create --blueprint blueprints/prd.json

Note:
uv sync automatically creates a .venv virtual environment. You don't need to run uv venv manually or activate the environment - uv run handles everything!
uv sync
uv run dpn --help

Alternatively, install with pip:

pip install data-platform-naming
dpn --help

- AWS: S3, Glue (databases/tables)
- Databricks: Clusters, Jobs, Unity Catalog (3-tier)
- Consistent: Environment-aware patterns
- Validated: Schema enforcement
- Atomic: All-or-nothing execution
- Consistent: Pre/post validation
- Isolated: File-lock WAL
- Durable: Crash recovery
- Declarative: JSON configuration
- Validated: JSON Schema compliance
- Preview: Dry-run mode
- Dependencies: Auto-resolved execution order
- Create: Batch resource provisioning
- Read: Configuration inspection
- Update: In-place modifications
- Delete: Archive or permanent removal
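The auto-resolved execution order mentioned above can be illustrated with a topological sort over a resource dependency graph. A hypothetical sketch using Python's standard library, not the actual dpn resolver:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each resource maps to the set of
# resources that must exist before it can be created.
deps = {
    "glue_table:dim_customers": {"glue_database:finance-gold"},
    "dbx_job:customer-ingestion": {"dbx_cluster:etl"},
    "glue_database:finance-gold": set(),
    "dbx_cluster:etl": set(),
}

# static_order() yields a creation order that respects every dependency.
order = list(TopologicalSorter(deps).static_order())
print(order)  # databases and clusters come before the tables/jobs that use them
```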
The configuration system allows you to customize naming patterns and values without modifying code:
# 1. Initialize configuration
uv run dpn config init
# Prompts for: project, environment, region
# 2. Validate configuration
uv run dpn config validate
# 3. View configuration
uv run dpn config show
uv run dpn config show --resource-type aws_s3_bucket
# 4. Use in commands (automatic with .dpn/ configs)
uv run dpn plan preview dev.json
uv run dpn create --blueprint dev.json

Two YAML files control resource naming:
naming-values.yaml - Variable values with precedence hierarchy:
defaults:
  project: dataplatform
  environment: dev
  region: us-east-1
  team: data-engineering

environments:
  prd:
    environment: prd
    team: data-platform

resource_types:
  aws_s3_bucket:
    purpose: analytics

naming-patterns.yaml - Template patterns with placeholders:
patterns:
  aws_s3_bucket:
    template: "{project}-{purpose}-{layer}-{environment}-{region_code}"
    required_variables: ["project", "purpose", "layer", "environment", "region_code"]
  dbx_cluster:
    template: "{project}-{workload}-{cluster_type}-{environment}"
    required_variables: ["project", "workload", "cluster_type", "environment"]

transformations:
  region_mapping:
    us-east-1: "use1"
    us-west-2: "usw2"

validation:
  max_length:
    aws_s3_bucket: 63
    dbx_cluster: 100

Default location: .dpn/ (project directory)
- .dpn/naming-values.yaml
- .dpn/naming-patterns.yaml
Custom paths: Use flags with any command
uv run dpn plan preview dev.json \
--values-config custom-values.yaml \
  --patterns-config custom-patterns.yaml

Override any value at runtime without modifying config files:
# Single override
uv run dpn plan preview dev.json --override environment=prd
# Multiple overrides
uv run dpn plan preview dev.json \
--override environment=prd \
--override project=oncology \
--override region=us-west-2
# Works with all commands
uv run dpn create --blueprint dev.json \
--override environment=prd \
  --dry-run

Values are resolved with the following precedence (highest first):

- Runtime overrides (--override flags)
- Blueprint metadata (explicit values in blueprint)
- Resource type overrides (in naming-values.yaml)
- Environment overrides (in naming-values.yaml)
- Defaults (in naming-values.yaml)
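This precedence hierarchy behaves like a layered dictionary merge in which higher-priority layers overwrite lower ones. A minimal Python sketch (illustrative only; the layer values mirror the YAML examples above):

```python
# Merge configuration layers from lowest to highest precedence;
# later layers overwrite earlier ones key by key.
def resolve(defaults, env_overrides, resource_overrides,
            blueprint_metadata, runtime_overrides):
    merged = {}
    for layer in (defaults, env_overrides, resource_overrides,
                  blueprint_metadata, runtime_overrides):
        merged.update(layer)
    return merged

resolved = resolve(
    defaults={"project": "dataplatform", "environment": "dev",
              "team": "data-engineering"},
    env_overrides={"environment": "prd", "team": "data-platform"},
    resource_overrides={"purpose": "analytics"},
    blueprint_metadata={"project": "dataplatform"},
    runtime_overrides={"environment": "prd"},
)
print(resolved["environment"])  # → prd (runtime override wins)
```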
# Initialize (one-time setup)
uv run dpn config init --project myproject --environment dev --region us-east-1
# Customize values
vim .dpn/naming-values.yaml
# Customize patterns (optional)
vim .dpn/naming-patterns.yaml
# Validate changes
uv run dpn config validate
# View effective configuration
uv run dpn config show
# Test with preview
uv run dpn plan preview my-blueprint.json
# Create resources
uv run dpn create --blueprint my-blueprint.json

Commands work without configuration files (legacy mode):
# Without config - uses hardcoded patterns
uv run dpn plan preview dev.json
# ⚠ No configuration files found, using legacy mode
# Run 'uv run dpn config init' to create configuration files
# With config - uses customizable patterns
uv run dpn config init
uv run dpn plan preview dev.json
# ✓ Using configuration-based naming

For detailed migration instructions, see the Configuration Migration Guide.
uv run dpn plan init --env prd --project dataplatform --output prod.json

uv run dpn plan validate prod.json

# Table format
uv run dpn plan preview prod.json
# JSON export
uv run dpn plan preview prod.json --format json --output preview.json

uv run dpn plan schema --output blueprint-schema.json

# Dry run
uv run dpn create --blueprint prod.json --dry-run
# Execute
uv run dpn create --blueprint prod.json
# With AWS profile
uv run dpn create --blueprint prod.json --aws-profile production
# With Databricks
uv run dpn create --blueprint prod.json \
--dbx-host https://workspace.cloud.databricks.com \
  --dbx-token dapi123...

# JSON format
uv run dpn read --resource-id cluster-name --type cluster --format json
# YAML format
uv run dpn read --resource-id bucket-name --type s3 --format yaml
# Table format
uv run dpn read --resource-id db-name --type glue-db --format table

# Rename
uv run dpn update --resource-id old-name --rename new-name
# Update params
uv run dpn update --resource-id cluster-name --params updates.json

# Permanent delete (with confirmation)
uv run dpn delete --resource-id cluster-name --type cluster
# Archive (soft delete)
uv run dpn delete --resource-id cluster-name --type cluster --archive

uv run dpn recover

uv run dpn status

{
  "version": "1.0",
  "metadata": {
    "environment": "prd",
    "project": "dataplatform",
    "region": "us-east-1",
    "team": "data-engineering",
    "cost_center": "IT-1001"
  },
  "resources": {
    "aws": {
      "s3_buckets": [
        {
          "purpose": "raw",
          "layer": "raw",
          "versioning": true,
          "lifecycle_days": 90
        }
      ],
      "glue_databases": [
        {
          "domain": "finance",
          "layer": "gold",
          "description": "Finance gold layer"
        }
      ],
      "glue_tables": [
        {
          "database_ref": "finance-gold",
          "entity": "customers",
          "table_type": "dim",
          "columns": [...]
        }
      ]
    },
    "databricks": {
      "clusters": [
        {
          "workload": "etl",
          "cluster_type": "shared",
          "node_type": "i3.xlarge",
          "autoscale": {"min": 2, "max": 8}
        }
      ],
      "jobs": [
        {
          "job_type": "batch",
          "purpose": "customer-ingestion",
          "cluster_ref": "etl",
          "schedule": "daily"
        }
      ],
      "unity_catalog": {
        "catalogs": [
          {
            "catalog_type": "main",
            "schemas": [
              {
                "domain": "finance",
                "layer": "gold",
                "tables": [
                  {
                    "entity": "customers",
                    "table_type": "dim",
                    "columns": [...]
                  }
                ]
              }
            ]
          }
        ]
      }
    }
  }
}

S3 Bucket:
{project}-{purpose}-{layer}-{env}-{region_code}
Example: dataplatform-raw-raw-prd-use1

Glue Database:
{project}_{domain}_{layer}_{env}
Example: dataplatform_finance_gold_prd

Glue Table:
{type}_{entity}
Example: dim_customers, fact_transactions

Cluster:
{project}-{workload}-{type}-{env}
Example: dataplatform-etl-shared-prd

Job:
{project}-{type}-{purpose}-{env}-{schedule}
Example: dataplatform-batch-customer-load-prd-daily

Unity Catalog (3-tier):
Catalog: {project}_{type}_{env}
Schema: {domain}_{layer}
Table: {type}_{entity}
Full: dataplatform_main_prd.finance_gold.dim_customers

# Install dev dependencies
uv sync --dev
# Format
uv run black src/ tests/
# Lint
uv run ruff check src/ tests/
# Type check
uv run mypy src/

docker run --rm \
-v $(pwd):/tmp/lint \
  oxsecurity/megalinter:v7

Supported Languages:
- Python (black, ruff, mypy)
- Bash (shellcheck, shfmt)
- SQL (sqlfluff)
- R (lintr)
- Scala (scalafmt)
- Terraform (tflint, terraform-fmt)
# Run tests
uv run pytest
# With coverage
uv run pytest --cov
# Specific test
uv run pytest tests/test_transaction_manager.py

Atomicity: All operations succeed or all rollback. No partial changes.
Consistency: Pre/post-condition validation ensures valid state transitions.
Isolation: File-based locking prevents concurrent transaction conflicts.
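File-based lock acquisition can be sketched with POSIX advisory locks. This is illustrative only, not the actual lock manager, and it assumes a Unix-like OS:

```python
import fcntl
import os
import tempfile

# Sketch: take an exclusive advisory lock on a lock file so only one
# transaction mutates state at a time; a second attempt fails fast
# instead of blocking.
lock_path = os.path.join(tempfile.gettempdir(), "dpn.lock")
rejected = False

with open(lock_path, "w") as holder:
    fcntl.flock(holder, fcntl.LOCK_EX | fcntl.LOCK_NB)  # first caller wins
    try:
        with open(lock_path, "w") as second:
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        rejected = True  # a concurrent transaction would be refused here
# lock released automatically when `holder` is closed
print("second lock rejected:", rejected)
```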
Durability: Write-ahead logging (WAL) persists all state. Automatic crash recovery.
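Putting durability and atomicity together: log intent to the WAL before each operation, execute, and on any failure undo completed operations in reverse order. A simplified sketch with hypothetical helpers, not the real transaction manager:

```python
import json
import os
import tempfile

def run_transaction(operations, wal_dir):
    """Each operation is (name, do, undo). Log intent to the WAL before
    executing; on any failure, roll back completed operations in reverse."""
    wal = os.path.join(wal_dir, "tx-0001.wal")
    done = []
    try:
        for name, do, undo in operations:
            with open(wal, "a") as f:            # durable intent log
                f.write(json.dumps({"op": name}) + "\n")
            do()
            done.append((name, undo))
    except Exception:
        for name, undo in reversed(done):        # reverse-order rollback
            undo()
        raise

state = []
ops = [
    ("create_cluster", lambda: state.append("cluster"),
     lambda: state.remove("cluster")),
    ("create_job", lambda: state.append("job"),
     lambda: state.remove("job")),
    ("create_table", lambda: 1 / 0, lambda: None),  # this one fails
]
try:
    run_transaction(ops, tempfile.mkdtemp())
except ZeroDivisionError:
    print("rolled back, state =", state)  # → rolled back, state = []
```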
Transaction
├── Write to WAL
├── Execute Operations
│ ├── Operation 1 → Success
│ ├── Operation 2 → Success
│ └── Operation 3 → Failed
└── Rollback (reverse order)
    ├── Revert Operation 2
    └── Revert Operation 1

Creating resources... ━━━━━━━━━━━━━━ 67% [23.4s] Creating job: batch-customer-prd
✓ Cluster created: dataplatform-etl-shared-prd
✓ Job created: dataplatform-batch-customer-load-prd-daily
✗ Table failed: Invalid schema
→ Rolling back transaction...
✓ Transaction rolled back: tx-abc123

# AWS
export AWS_PROFILE=production
export AWS_REGION=us-east-1
# Databricks
export DATABRICKS_HOST=https://workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapi123...

.dpn/
├── wal/ # Write-ahead log
│ ├── {tx-id}.wal
│ ├── {tx-id}.committed
│ └── {tx-id}.rolled_back
└── state/
    └── state.json       # Resource state

# Development
uv run dpn create --blueprint dev.json --aws-profile dev
# Staging
uv run dpn create --blueprint stg.json --aws-profile staging
# Production
uv run dpn create --blueprint prd.json --aws-profile production

# List all S3 buckets
uv run dpn read --type s3 --resource-id "dataplatform-*"
# Cluster details
uv run dpn read --type cluster --resource-id etl-cluster --format json

# Recover failed transactions
uv run dpn recover
# Archive resources
uv run dpn delete --resource-id old-cluster --type cluster --archive
# Check system status
uv run dpn status

git clone https://github.com/yourusername/data-platform-naming.git
cd data-platform-naming
uv sync --dev

uv run pytest
uv run pytest --cov

uv run ruff check src/ tests/
uv run black src/ tests/
uv run mypy src/

docker run --rm -v $(pwd):/tmp/lint oxsecurity/megalinter:v7

- Fork repository
- Create feature branch
- Run tests + linters
- Submit pull request
MIT License - see LICENSE file.
- Issues: https://github.com/robandrewford/data-platform-naming/issues
- Documentation: https://github.com/robandrewford/data-platform-naming/docs
- Discussions: https://github.com/robandrewford/data-platform-naming/discussions
- GitHub Actions integration
- Terraform provider
- Text UI (TUI)
- GCP naming support
- API server mode