AAAI-26 Undergraduate Consortium Submission
This repository implements a Jungian-inspired reinforcement learning architecture where agents maintain a four-dimensional proto-emotional state that influences learning and decision-making. The framework decomposes emotional regulation into four modules inspired by Carl Jung's Map of the Soul:
- Persona & Shadow Modulators: Reward perception scaling based on affective state
- Ego Modulator: Dynamic arbitration between expert policies
- Self-Regulation: Entropy-based homeostatic balance
The affective state evolves continuously through a linear dynamical system with theoretical boundedness guarantees, enabling stable long-horizon learning.
Install via:

```bash
make install
```

Or manually:

```bash
pip install -r requirements.txt
```

Dependencies:

- Python >= 3.10
- PyTorch >= 2.1
- gymnasium == 0.29.1
- pettingzoo == 1.24.1
- stable-baselines3 == 2.3.0
```
mots_rl/
├── core.py                  # CoreAffectiveDynamics: A_t = αA_{t-1} + βg(r,δ,φ(s))
├── modulators/
│   ├── persona.py           # Persona modulator: positive affect amplification
│   ├── shadow.py            # Shadow modulator: negative affect processing
│   ├── ego.py               # Ego arbitration: π = Σ ω_i(A,s) π_i
│   └── self_reg.py          # Self-regulation: entropy-based homeostasis
├── policy/
│   ├── ppo_policy.py        # PPO with affective state conditioning
│   └── sac_policy.py        # SAC with affective state conditioning
├── trainers/
│   ├── trainer.py           # Base training loop with robustness mechanisms
│   └── multi_agent.py       # Multi-agent training with partner modeling
├── baselines/
│   ├── vanilla.py           # Standard PPO/SAC
│   ├── icm.py               # Intrinsic Curiosity Module
│   ├── rnd.py               # Random Network Distillation
│   └── rl2.py               # Meta-RL baseline
├── envs/
│   ├── diagnostic.py        # Custom environments for theoretical validation
│   └── wrappers.py          # Robustness wrappers (noise, dropout, etc.)
└── utils/
    ├── replay.py            # Experience replay buffer
    └── logging.py           # TensorBoard and CSV logging

configs/                     # Hydra configuration files
├── stage1/                  # Theoretical validation (4 configs)
├── stage2/                  # Single-agent tasks (4 configs)
├── stage3/                  # Multi-agent tasks (4 configs)
├── robustness/              # Distribution shift tests (5 configs)
├── ablations/               # Component ablations (9 configs)
└── baselines/               # Baseline comparisons (6 configs)

scripts/
├── train.py                 # Main training entry point
├── evaluate.py              # Generate all figures
├── generate_results.py      # Synthetic result generation for prototyping
└── run_experiments.sh       # Batch experiment runner
```
Train a single experiment:

```bash
python scripts/train.py +experiment=stage2/mujoco.yaml
```

Run all experiments:

```bash
make train-all
```

Or manually:

```bash
bash scripts/run_experiments.sh
```

Generate all figures:

```bash
python scripts/evaluate.py
```

Or:

```bash
make eval-all
```

Run the test suite:

```bash
make test
```

or:

```bash
python -m pytest tests/
```

The affective state evolves according to:
```
A_{t+1} = α ⊙ A_t + β ⊙ g(r_t, δ_t, φ(s_t))
```

where:

- `A_t ∈ R^4` is the affective state vector
- `α ∈ (0,1)^4` are learned decay coefficients (temporal persistence)
- `β ∈ (0,1)^4` are learned update coefficients (sensitivity to new stimuli)
- `g: R^n → R^4` is a two-layer tanh MLP mapping observations to affect
- `r_t` is the reward, `δ_t` is the TD error, and `φ(s_t)` is the observation encoding
Because the stimulus g is tanh-bounded in [-1, 1] and each α_i < 1, this linear dynamical system guarantees boundedness: ||A_t||_∞ ≤ β_max / (1 - α_max).
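The bound can be checked numerically. A minimal standalone NumPy sketch, with α and β sampled to match the initialization ranges used in `core.py` and a tanh-bounded stimulus:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = rng.uniform(0.7, 1.0, size=4)   # decay coefficients, matching core.py's init range
beta = rng.uniform(0.1, 0.2, size=4)    # update coefficients, matching core.py's init range

A = np.zeros(4)
max_norm = 0.0
for _ in range(10_000):
    g = np.tanh(rng.normal(size=4))     # stimulus is tanh-bounded in (-1, 1)
    A = alpha * A + beta * g
    max_norm = max(max_norm, np.abs(A).max())

bound = beta.max() / (1 - alpha.max())
print(max_norm <= bound)  # True: the trajectory never exceeds the theoretical bound
```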
The final policy is a weighted mixture of expert policies:
```
π(a|s, A_t) = Σ_i ω_i(A_t, s_t) π_i(a|s, A_t)
```

where `ω_i(A_t, s_t)` are context-dependent gating weights computed by the Ego module.
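For discrete actions this is a gated convex combination of expert action distributions. A minimal PyTorch sketch (`mixture_probs` is an illustrative name, not a function from this repo):

```python
import torch

def mixture_probs(weights, expert_probs):
    """Convex combination of expert policies.

    weights:      (batch, n_experts), rows sum to 1 (e.g. softmax gating output)
    expert_probs: (batch, n_experts, n_actions), each row a valid distribution
    """
    return (weights.unsqueeze(-1) * expert_probs).sum(dim=1)

w = torch.softmax(torch.randn(2, 3), dim=-1)      # stand-in gating weights ω_i
p = torch.softmax(torch.randn(2, 3, 5), dim=-1)   # stand-in expert policies π_i
mix = mixture_probs(w, p)                         # (2, 5), still a valid distribution
```

Because each expert distribution sums to 1 and the gating weights sum to 1, the mixture is itself a valid distribution.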
Perceived reward is modulated by Persona and Shadow:
```
r_perceived = r_raw + λ_persona · f_persona(A_t, r_raw) + λ_shadow · f_shadow(A_t, r_raw)
```

The Persona amplifies positive experiences when affect is high; the Shadow processes negative experiences when affect is low.
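A toy sketch of the combined modulation (the sigmoid gates below stand in for the learned `f_persona`/`f_shadow` MLPs, and the λ values are arbitrary):

```python
import torch

def perceived_reward(affect, r_raw, lam_persona=0.5, lam_shadow=0.5):
    # Sigmoid of mean affect stands in for the learned Persona/Shadow gates
    gate_hi = torch.sigmoid(affect.mean())             # opens when affect is high
    gate_lo = torch.sigmoid(-affect.mean())            # opens when affect is low
    f_persona = gate_hi * torch.clamp(r_raw, min=0.0)  # acts on positive rewards only
    f_shadow = gate_lo * torch.clamp(r_raw, max=0.0)   # acts on negative rewards only
    return r_raw + lam_persona * f_persona + lam_shadow * f_shadow

high = torch.tensor([0.8, 0.2, 0.5, 0.1])
r_pos = perceived_reward(high, torch.tensor(1.0))    # amplified above 1.0
r_neg = perceived_reward(-high, torch.tensor(-1.0))  # deepened below -1.0
```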
Validates mathematical properties of the affective dynamics:
- `boundedness.yaml`: Empirically verify `||A_t||_∞` stays below the theoretical bound across 100 episodes
- `recovery.yaml`: Inject large perturbations (A_t = 10) and measure exponential decay rates
- `timescale.yaml`: Analyze autocorrelation functions to confirm multi-timescale representation
- `coupled.yaml`: Compute cross-correlation matrix to validate coherent but non-redundant dynamics
Evaluate on standard RL benchmarks:
- `minigrid.yaml`: Discrete navigation (MiniGrid-FourRooms, sparse rewards)
- `mujoco.yaml`: Continuous control (HalfCheetah-v4, dense rewards)
- `memory.yaml`: Memory-dependent tasks (T-Maze with long horizons)
- `mo_hopper.yaml`: Multi-objective locomotion (Hopper with speed-stability trade-off)
Test social reasoning capabilities:
- `ipd.yaml`: Iterated Prisoner's Dilemma (cooperation emergence)
- `mpe_nav.yaml`: Multi-agent navigation (coordination without communication)
- `mpe_pred.yaml`: Predator-prey (strategic opponent modeling)
- `coin_game.yaml`: Coin collection game (competitive dynamics)
Distribution shift scenarios:
- `r1_reward_swap.yaml`: Flip reward sign at episode 125
- `r2_partner.yaml`: Swap partner policy mid-training (multi-agent only)
- `r3_obs_noise.yaml`: Add 20% Gaussian noise to observations
- `r4_action_dropout.yaml`: Randomly drop 15% of actions
- `r5_zero_shot.yaml`: Transfer to unseen task variant
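For instance, the r3 observation-noise shift can be emulated in a few lines (a pure-NumPy sketch; the actual wrapper lives in `envs/wrappers.py`, and "20%" is interpreted here as noise scaled to 20% of each observation element's magnitude):

```python
import numpy as np

def add_obs_noise(obs, noise_frac=0.2, rng=None):
    """Add zero-mean Gaussian noise scaled to noise_frac of each element's magnitude."""
    rng = rng or np.random.default_rng()
    scale = noise_frac * (np.abs(obs) + 1e-8)  # per-element noise standard deviation
    return obs + rng.normal(0.0, scale)

obs = np.ones(4)
noisy = add_obs_noise(obs, noise_frac=0.2, rng=np.random.default_rng(0))
```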
Component-wise importance analysis:
- `persona_off.yaml`: Disable persona modulator (no positive amplification)
- `shadow_off.yaml`: Disable shadow modulator (no negative processing)
- `ego_fixed.yaml`: Fix expert weights uniformly (no dynamic arbitration)
- `self_off.yaml`: Disable self-regulation (no entropy balancing)
- `affect_2d.yaml`: Reduce affect to 2D valence-arousal only
- `affect_random.yaml`: Random affect updates (control condition)
- `gating_lstm.yaml`: Replace ego with LSTM-based gating
- `alpha_sweep_fast.yaml`: Fast dynamics (α ≈ 0.5, short memory)
- `alpha_sweep_slow.yaml`: Slow dynamics (α ≈ 0.99, long memory)
Baseline comparisons:

- `vanilla.yaml`: Standard PPO/SAC without affective mechanisms
- `vanilla_plus.yaml`: Capacity-matched baseline (same parameter count)
- `icm.yaml`: Intrinsic Curiosity Module (Pathak et al., 2017)
- `rnd.yaml`: Random Network Distillation (Burda et al., 2019)
- `dim_affect.yaml`: Simplified 2D affect without the Jung framework
- `rl2.yaml`: Meta-RL (Wang et al., 2016)
Implements the linear dynamical system with learnable parameters:

```python
import torch
import torch.nn as nn

class CoreAffectiveDynamics(nn.Module):
    def __init__(self, affect_dim=4, obs_dim=64):
        super().__init__()
        self.alpha = nn.Parameter(torch.rand(affect_dim) * 0.3 + 0.7)  # decay in [0.7, 1.0)
        self.beta = nn.Parameter(torch.rand(affect_dim) * 0.1 + 0.1)   # update in [0.1, 0.2)
        self.affine_net = nn.Sequential(
            nn.Linear(obs_dim + 2, 128),  # obs + reward + TD error
            nn.Tanh(),
            nn.Linear(128, affect_dim),
            nn.Tanh(),
        )

    def forward(self, affect_prev, obs, reward, td_error):
        # reward and td_error are expected as (..., 1) tensors for concatenation
        stimulus = self.affine_net(torch.cat([obs, reward, td_error], -1))
        return self.alpha * affect_prev + self.beta * stimulus
```

Persona (positive amplification):
```python
def forward(self, affect, reward):
    # Amplify positive rewards when affect is high
    if reward > 0:
        return torch.sigmoid(self.mlp(affect)) * reward * self.scale
    return 0
```

Shadow (negative processing):
```python
def forward(self, affect, reward):
    # Process negative rewards when affect is low
    if reward < 0:
        return torch.sigmoid(self.mlp(-affect)) * reward * self.scale
    return 0
```

Ego (expert arbitration):
```python
def forward(self, affect, obs):
    # Context-dependent gating over expert policies
    weights = torch.softmax(self.mlp(torch.cat([affect, obs], -1)), dim=-1)
    return weights
```

Self-Regulation (homeostasis):
```python
def forward(self, policy_dist):
    # Encourage exploration when entropy falls below the target
    entropy = policy_dist.entropy()
    bonus = torch.relu(self.target_entropy - entropy) * self.coef
    return bonus
```

After running `python scripts/evaluate.py`, figures are saved to `results/figures/`:

- `stage1_core.png`: Four-panel diagnostic validation (boundedness, recovery, timescale, coupling)
- `stage2_learning.png`: Single-agent learning curves with baseline comparisons
- `stage3_social.png`: Multi-agent performance and cooperation emergence
- `robustness_shift.png`: Performance under five distribution shift scenarios
- `ablations_heatmap.png`: Component importance heatmap
```bibtex
@inproceedings{litchiowong2026persona,
  title       = {Persona, Ego, Shadow, and Self: A Map of the Soul Framework for Proto-Emotional Homeostasis in AI},
  author      = {Litchiowong, Napassorn},
  institution = {National University of Singapore},
  booktitle   = {AAAI Conference on Artificial Intelligence - Undergraduate Consortium},
  year        = {2026}
}
```

MIT License