Toy model: TMS with correlations between input dimensions

**Migrated from:** goodfire-ai/spd-gf#41
**Original author:** @leesharkey

---

One of the toy models that we think SPD should be able to decompose easily is the Toy Model of Superposition (TMS) with correlated input features.

In particular, we care most about the case where the correlation between some features is exactly 1, so that whenever one feature activates, the other always co-activates. SPD should learn to group the components for these features into a single component (after the clustering step).

We also think that a second case, where 0 < correlation < 1, will be useful for sanity-checking SPD, since it should still learn distinct components for the correlated features in this case. If it does not, then there is a problem somewhere.
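
As a starting point, here is a minimal sketch of how such data could be sampled. It is not taken from the SPD codebase: the function name `sample_correlated_batch` and its parameters (`feature_prob`, `pairs`, `rho`) are hypothetical, and it assumes that "correlation" refers to the correlation between the on/off activation masks of two features, with magnitudes drawn independently from a uniform distribution as in the standard TMS setup. For each pair `(i, j)`, feature `j` copies the mask of feature `i` with probability `rho` and otherwise keeps its own independent mask, which gives a mask correlation of `rho` when both features share the same activation probability.

```python
import numpy as np


def sample_correlated_batch(batch_size, n_features, feature_prob, pairs, rho, rng):
    """Sample sparse TMS-style inputs with correlated activation masks.

    Hypothetical helper (not part of the SPD codebase).
    pairs: list of (i, j) index pairs; each j should appear in at most one pair.
    rho:   target correlation between the activation masks of i and j, in [0, 1].
    """
    # Independent Bernoulli(feature_prob) activation mask for every feature.
    mask = rng.random((batch_size, n_features)) < feature_prob
    for i, j in pairs:
        # With probability rho, feature j copies feature i's mask for that sample;
        # otherwise it keeps its own independent mask.
        copy = rng.random(batch_size) < rho
        mask[:, j] = np.where(copy, mask[:, i], mask[:, j])
    # Magnitudes are independent Uniform[0, 1), as in the usual TMS data.
    values = rng.random((batch_size, n_features))
    return mask * values


rng = np.random.default_rng(0)
# Correlation = 1: features 0 and 1 always co-activate.
x_full = sample_correlated_batch(10_000, 5, 0.2, [(0, 1)], 1.0, rng)
# 0 < correlation < 1: features 0 and 1 co-activate more often than chance.
x_partial = sample_correlated_batch(200_000, 5, 0.2, [(0, 1)], 0.5, rng)
```

With `rho=1.0` this reproduces the perfectly co-activating case, and intermediate values of `rho` give the sanity-check case. Other definitions of correlation (for example, correlating magnitudes as well as masks) would need a different sampler.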

This should be integrated with the evals suite.
