Pressure-Based Moral Emergence and Structural Output Constraints (“Protected Set”) #1609

aritakahayashi-png · 2025-12-29T01:33:26Z

aritakahayashi-png (Author) · 2026-02-13T23:18:26Z

Hello,

I would like to share a preprint proposing a structural framing of moral emergence and AI safety constraints.

The core claim is not normative but architectural:

- Moral labeling emerges from pressure interactions across layers (biological, heuristic/schema, reflective).
- Because reflective correction is slower than pressure accumulation, safety must be instantiated structurally at the output layer.

I call this minimal structural constraint a “Protected Set.”

The Protected Set is not a governance authority.
It functions more like a circuit breaker or rate limiter:
it prevents irreversible fracture without evaluating moral correctness.
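
To make the circuit-breaker analogy concrete, here is a minimal sketch in Python. It is only an illustration, not the formalism from the preprint: the class `OutputBreaker`, its parameters (`max_actions`, `window`, `cooldown`), and the choice of a sliding-window action count as the trip condition are all assumptions introduced for this example. The point it tries to show is that the gate looks only at how fast outputs are emitted and never inspects or scores their content.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque


@dataclass
class OutputBreaker:
    """Hypothetical output-layer gate: it counts actions, it never judges them."""

    max_actions: int   # assumed budget: actions allowed per window
    window: float      # sliding-window length, in seconds
    cooldown: float    # how long the breaker stays tripped, in seconds
    _stamps: Deque[float] = field(default_factory=deque)
    _open_until: float = 0.0

    def allow(self, now: float) -> bool:
        # While tripped, block every output until the cooldown has passed.
        if now < self._open_until:
            return False
        # Forget actions that have left the sliding window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        # Trip when the budget is exhausted. Content is not inspected.
        if len(self._stamps) >= self.max_actions:
            self._open_until = now + self.cooldown
            return False
        self._stamps.append(now)
        return True


# Usage: three actions pass, the fourth trips the breaker,
# and output resumes only after the cooldown has elapsed.
breaker = OutputBreaker(max_actions=3, window=10.0, cooldown=60.0)
results = [breaker.allow(t) for t in (0.0, 1.0, 2.0, 3.0, 30.0, 70.0)]
print(results)  # [True, True, True, False, False, True]
```

The breaker has no notion of a "good" or "bad" action. Whether a rate-based trip condition is the right proxy for what the post calls irreversible fracture is an open question that this sketch does not settle.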

This framing may offer:

- A pressure-based interpretation of alignment failures
- A systems-engineering grounding for rate limits, refusal mechanisms, and boundary enforcement
- A way to reason about “banality of good” effects in optimization systems

GitHub:
https://github.com/aritakahayashi-png/protected-set-theory

Preprint (DOI archived):
https://zenodo.org/records/18633423

I would be grateful for technical feedback, especially regarding:

- Whether this framing meaningfully differs from the existing alignment literature
- Whether the pressure formalism is useful for eval design
- Whether “output-layer structural constraints” are already well formalized elsewhere

Thank you for your time.
