Pressure-Based Moral Emergence and Structural Output Constraints (“Protected Set”) #1609
Replies: 1 comment
-
|
Hello, I would like to share a preprint proposing a structural framing of moral emergence and AI safety constraints. The core claim is not normative but architectural: Moral labeling emerges from pressure interactions across layers (biological, heuristic/schema, reflective). Because reflective correction is slower than pressure accumulation, safety must be instantiated structurally at the output layer. I call this minimal structural constraint a “Protected Set.” The Protected Set is not a governance authority. This framing may offer: A pressure-based interpretation of alignment failures A systems-engineering grounding for rate limits, refusal mechanisms, and boundary enforcement A way to reason about “banality of good” effects in optimization systems GitHub: Preprint (DOI archived): I would be grateful for technical feedback, especially regarding: Whether this framing meaningfully differs from existing alignment literature Whether the pressure formalism is useful for eval design Whether “output-layer structural constraints” is already well-formalized elsewhere Thank you for your time. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Beta Was this translation helpful? Give feedback.
All reactions