Spec for an interactive Jupyter notebook to explore dataset pool statistics (source balance, scenario type, mutation families, obligation coverage, governance triggers) before running the GRS v3.0 sampling operation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ebook 10-task plan covering all 8 notebook sections with complete cell code, execution verification steps, and commit checkpoints. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implement phase2_draw() to balance scenario selection across sources (claude, gemini, gpt). Allocates budget equally with remainders distributed to first sources alphabetically. Handles shortfalls by drawing from overflow pools of non-exhausted sources. Wires Phase 2 into main() with early exit when n_phase2 <= 0 (no budget remaining). Adds T08 (source balance tolerance check) and T15 (Phase 1-only when target == obligations). All 6 tests pass including new Phase 2 tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
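The allocation rule described in this commit (equal split, remainder to the alphabetically first sources, shortfalls topped up from sources that still have records) can be illustrated roughly as follows. This is a minimal sketch, not the PR's actual `phase2_draw()`: the names `allocate_budget` and `phase2_draw_sketch`, the record shape, the round-robin top-up order, and the default seed are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def allocate_budget(sources: List[str], budget: int) -> Dict[str, int]:
    """Split the budget equally; any remainder goes to the alphabetically first sources."""
    ordered = sorted(sources)
    base, extra = divmod(budget, len(ordered))
    return {s: base + (1 if i < extra else 0) for i, s in enumerate(ordered)}


def phase2_draw_sketch(
    pools: Dict[str, List[dict]], budget: int, seed: int = 42
) -> List[dict]:
    """Draw up to `budget` records, balanced across sources (hypothetical helper)."""
    if budget <= 0 or not pools:
        return []  # mirrors the early exit when no Phase 2 budget remains
    rng = random.Random(seed)  # local RNG keeps the draw reproducible
    quota = allocate_budget(list(pools), budget)
    drawn: List[dict] = []
    overflow: Dict[str, List[dict]] = {}
    for source in sorted(pools):
        shuffled = list(pools[source])
        rng.shuffle(shuffled)
        take = min(quota[source], len(shuffled))
        drawn.extend(shuffled[:take])
        overflow[source] = shuffled[take:]  # leftovers usable for shortfalls
    shortfall = budget - len(drawn)
    # Assumed policy: top up round-robin from sources that are not exhausted.
    while shortfall > 0:
        progressed = False
        for source in sorted(overflow):
            if shortfall == 0:
                break
            if overflow[source]:
                drawn.append(overflow[source].pop())
                shortfall -= 1
                progressed = True
        if not progressed:
            break  # every pool is exhausted; return what we have
    return drawn
```

With three sources and a budget of 10, this sketch gives `claude` four slots and `gemini` and `gpt` three each. If one pool holds fewer records than its quota, the missing slots are filled from the others.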
…-sampling notebook
grs_debug_sampler.py: add normalize_record() to derive the five sampler-required fields (source, obligation_id, scenario_type, mutation_type, primary_dimension) from the pipeline's native schema (seed_trace, mutation_trace, governance_triggers), so the sampler accepts actual pipeline output without any pre-processing step.
post_sampling_analysis.ipynb: new notebook with 13 sections covering audit results, source/phase breakdown, obligation coverage, dimension and mutation family distributions, domain/industry diversity, governance triggers heatmap, risk level, and sample-vs-pool comparison.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
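To make the schema bridging in `normalize_record()` concrete, here is a hedged sketch of one way such a function could look. Only the three native field names and the five derived field names come from this PR. The nested key layout inside `seed_trace`, `mutation_trace`, and `governance_triggers`, the function name `normalize_record_sketch`, and the error handling are assumptions, not the pipeline's actual schema.

```python
from typing import Any, Dict

# The five fields the sampler requires, as listed in the commit message.
REQUIRED_FIELDS = (
    "source",
    "obligation_id",
    "scenario_type",
    "mutation_type",
    "primary_dimension",
)


def normalize_record_sketch(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` with the five sampler fields filled in."""
    seed = record.get("seed_trace") or {}
    mutation = record.get("mutation_trace") or {}
    triggers = record.get("governance_triggers") or []
    first_trigger = triggers[0] if triggers else {}

    # ASSUMPTION: where each value lives in the native schema is hypothetical.
    derived = {
        "source": seed.get("source"),
        "obligation_id": seed.get("obligation_id"),
        "scenario_type": seed.get("scenario_type"),
        "mutation_type": mutation.get("mutation_type"),
        "primary_dimension": (
            first_trigger.get("dimension")
            if isinstance(first_trigger, dict)
            else first_trigger
        ),
    }

    out = dict(record)
    for field, value in derived.items():
        # Values already present on the record take precedence.
        if out.get(field) is None:
            out[field] = value

    missing = [f for f in REQUIRED_FIELDS if out.get(f) is None]
    if missing:
        raise ValueError(f"cannot derive fields: {missing}")
    return out
```

Raising on missing fields, rather than silently dropping the record, is a design choice of this sketch. The real tool may handle such records differently.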
Describe your changes
Summary
Introduces grs_debug_sampler.py, a reproducible scenario sub-sampler for smoke-testing the GRS pipeline, and post_sampling_analysis.ipynb, a companion notebook for inspecting the produced dataset.
grs_debug_sampler.py
A standalone CLI tool that draws a small (15–100), reproducible sample from the three GRS scenario pools (GPT, Gemini, Claude) and writes a JSONL output file plus a JSON manifest.
Sampling algorithm
…seed=42), guaranteeing full obligation coverage.
…seed+1) for deterministic output ordering.
Schema normalisation
Adds normalize_record() to bridge the gap between the pipeline's native output schema (seed_trace, mutation_trace, governance_triggers) and the five sampler-required fields (source, obligation_id, scenario_type, mutation_type, primary_dimension). No pre-processing step needed.
Audit checks (recorded in manifest)
source_balance
obligation_coverage
sample_size
id_uniqueness
scenario_ids
dimension_coverage
Usage
Test plan
audit PASS, manifest written, no drops
no errors, all sections render correctly
Write your issue number after "Fixes "
This PR does not intend to fix any specific issues.