Fiftynet

FFT-Net (Hybrid Transformer with Frequency Processing)

A minimal research model that mixes tokens in the frequency domain using rotary position embeddings and learnable spectral filters.

Features

ComplexRoPE rotary positional encoding in the complex domain.
Learnable frequency filtering via NeuralFourierOperator.
FFTNet blocks that combine FFT/IFFT with MLP layers.
Scripts for training, distillation, evaluation, and model management.
Weight I/O using safetensors with complex parameter support.

See FEATURES.md for a more detailed list.

Architecture

[input tokens]
    ↓
[embedding layer]
    ↓
[ComplexRoPE]
    ↓
[FFTNetBlock]
  ├─ FFT → NeuralFourierOperator
  └─ IFFT → MLP
    ↓
[output logits]

The stack of FFTNetBlocks mixes information globally in the frequency domain before projecting back to token space.

Installation

(Optional) create and activate a virtual environment.
Install the Python dependencies:
```
pip install -r requirements.txt
```
Run the test suite to verify the installation:
```
pytest
```
Try the demo script:
```
python scripts/main.py
```

Quick Start

Train

python scripts/train_fresh.py --data-path path/to/corpus.txt --epochs 1

Set --split-seed to reproduce the train/validation split across runs.

Inference

python fftnet_infer.py --model trained --prompt "the quick"

Resources

Configuration: JSON and YAML files in config/ define model shapes and module wiring (e.g., config/fiftynet_config.json).
Datasets: Provide a plain text file via --data-path; it is loaded with TextFileDataset in fftnet/data.py.
Scripts: Training, evaluation, and model management utilities live in the scripts/ directory.

See ROADMAP.md for planned milestones and TASKS.md for current progress.

FiftyNet: A Hybrid Frequency-Based Transformer for High-Dimensional Resonant Cognition

Authors: William Mabery, with contributions from Kai (OpenAI)

Abstract

We propose FiftyNet, a hybrid transformer architecture that encodes token sequences not as static points in high-dimensional space, but as dynamic, reconstructable signals in frequency space. Leveraging complex rotary positional encoding (Complex RoPE), discrete Fourier transforms (DFT), and learnable Neural Fourier Operator (NFO) layers, FiftyNet captures semantic, syntactic, and temporal structure simultaneously. This approach enables greater information density, signal-level reconstruction, and a fundamentally new paradigm for embedding cognition and memory as waveform dynamics.

1. Introduction

Large Language Models (LLMs) such as GPT, LLaMA, and Mixtral represent language in high-dimensional vector spaces. These embeddings encode meaning geometrically via distance, direction, and linear projections. However, such models lack explicit mechanisms to encode rhythm, phase, or structural time — properties central to sound, signal processing, and potentially cognition.

Inspired by signal theory and neurosymbolic reasoning, FiftyNet introduces a fundamentally different embedding and modeling paradigm: language as frequency.

2. Core Hypothesis

We hypothesize that:
"Language is not just symbolic or spatial — it is resonant."

In this view:

Tokens are not just positions in space, but samples in time.
Meaning is not just vector direction, but waveform shape.
Memory is not just stored structure, but stored signal.

3. Architecture Overview

FiftyNet is composed of the following stages:

3.1. Token Input + Complex Rotary Positional Encoding (CRoPE)

Tokens are embedded into a complex-valued vector space. Positional information is applied using log-scaled phase rotation, turning each embedding into a modulated waveform across dimensions.

Each dimension now acts as a frequency band, encoding:

Amplitude (importance)
Phase (timing)
Frequency (recurrence/pattern)

3.2. FNet-Style Fourier Transform

Instead of traditional self-attention, the model performs a discrete Fourier transform (DFT) on the token sequence, globally mixing positional signals in the frequency domain.

This produces a spectral signature for the sequence, where:

Low frequencies capture general structure
High frequencies encode fine-grained shifts

3.3. Neural Fourier Operator (NFO) Layer

Learnable filters are applied in frequency space, analogous to convolutional filters in vision models. These operations can:

Enhance or suppress specific frequencies
Modulate phase and amplitude
Implement attention-like operations in spectral space

3.4. (Optional) Inverse FFT

The model can reconstruct the waveform back into token space via inverse FFT, preserving temporal and structural integrity.

3.5. Feedforward Layers and Output

Standard MLP blocks operate on the waveform-encoded embeddings, allowing downstream tasks such as next-token prediction, classification, or alignment.

4. Embedding as Signal

In traditional LLMs:

A token’s meaning is encoded as a static vector
Positional encodings are added or multiplied

In FiftyNet:

A token’s embedding is treated as a waveform
The full sequence is a composite signal
The signal can be analyzed, stored, and reconstructed via Fourier methods

Example: If each token has d=4096 dimensions, this represents a 4096-channel signal, each with its own oscillation and meaning — like a synthesizer across time.

5. Advantages of the Frequency Model

Property	Traditional Transformer	FiftyNet
Positional Encoding	Learned or Rotary	Log-scaled phase rotation
Structure Representation	Implicit via attention	Explicit via waveform shape
Long-Term Dependency	Costly to capture	Naturally encoded in low freq
Reconstruction	Approximate via vectors	Signal-accurate via iFFT
Multisensory Embedding	Complex and disjoint	Unified signal representation
Information Density	Depends on token count	Compressible via frequency

6. Implementation Notes

FiftyNet is designed for compatibility with existing frameworks, including:

PyTorch / TorchScript for tensor computation
FFT/IFFT via torch.fft.fft and ifft
Complex number support for RoPE
NVIDIA 5090 GPU support for real-time training/inference

Training can optionally begin:

From scratch on token data (Wikipedia, C4, etc.)
By “retuning” existing models via spectral transformation

7. Implications for AGI

By treating meaning as resonance, and embedding as signal, we open new directions for:

Memory as waveform snapshots (time-aware)
Reasoning as frequency mixing and harmonics
Perception as unified signal space (text, audio, vision, etc.)
Emotion encoding as modulations of base signals
AGI as a resonant, multi-frequency cognitive field

We posit that this approach offers a pathway toward more holistic, efficient, and integrated general intelligence.

8. Future Work

Integration with multisensory input (e.g., audio, camera, sensor streams)
Training with vectorized waveform storage
Waveform-based interpretability and introspection
Distributed spectral training on heterogeneous GPU clusters
Real-time signal memory and retrieval (Fourier Memory Bank)

9. License and Use

FiftyNet is open for experimentation and academic research.
License: MIT (to be defined by author)

10. Acknowledgments

This work builds upon foundational ideas in:

Signal Processing
Positional Embeddings (RoPE, Complex RoPE)
FNet and Fourier Neural Operators
Neurosymbolic cognition and vector semantics

Special thanks to Cody (Codex) and OpenAI tools for accelerating implementation.

Name		Name	Last commit message	Last commit date
Latest commit History 57 Commits
.github/workflows		.github/workflows
benchmarks		benchmarks
config		config
fftnet		fftnet
scripts		scripts
tests		tests
AGENTS.md		AGENTS.md
EXTENSION_PLAN.md		EXTENSION_PLAN.md
FEATURES.md		FEATURES.md
LICENSE		LICENSE
README.md		README.md
ROADMAP.md		ROADMAP.md
TASKS.md		TASKS.md
TESTING_PLAN.md		TESTING_PLAN.md
UNDOCUMENTED_FEATURES.md		UNDOCUMENTED_FEATURES.md
complex_rope.py		complex_rope.py
fftnet_block.py		fftnet_block.py
fftnet_infer.py		fftnet_infer.py
model.py		model.py
neural_fourier_operator.py		neural_fourier_operator.py
requirements.txt		requirements.txt
tokenizer.py		tokenizer.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Fiftynet

Features

Architecture

Installation

Quick Start

Train

Inference

Resources

FiftyNet: A Hybrid Frequency-Based Transformer for High-Dimensional Resonant Cognition

Abstract

1. Introduction

2. Core Hypothesis

3. Architecture Overview

3.1. Token Input + Complex Rotary Positional Encoding (CRoPE)

3.2. FNet-Style Fourier Transform

3.3. Neural Fourier Operator (NFO) Layer

3.4. (Optional) Inverse FFT

3.5. Feedforward Layers and Output

4. Embedding as Signal

5. Advantages of the Frequency Model

6. Implementation Notes

7. Implications for AGI

8. Future Work

9. License and Use

10. Acknowledgments

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Fiftynet

Features

Architecture

Installation

Quick Start

Train

Inference

Resources

FiftyNet: A Hybrid Frequency-Based Transformer for High-Dimensional Resonant Cognition

Abstract

1. Introduction

2. Core Hypothesis

3. Architecture Overview

3.1. Token Input + Complex Rotary Positional Encoding (CRoPE)

3.2. FNet-Style Fourier Transform

3.3. Neural Fourier Operator (NFO) Layer

3.4. (Optional) Inverse FFT

3.5. Feedforward Layers and Output

4. Embedding as Signal

5. Advantages of the Frequency Model

6. Implementation Notes

7. Implications for AGI

8. Future Work

9. License and Use

10. Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages