An interactive tool for exploring the latent spaces of multimodal AI models
through real-time perception, introspective feedback, and symbolic memory.
- Captures user input (image, speech) in real time
- Uses existing AI models (CLIP, Whisper, Ollama) to generate latent representations
- Tokenizes perceptions into symbolic structures (via Sentience)
- Reflects on them via a cognitive agent loop (AI-Ego)
- Displays everything in a comprehensive interface with multiple views:
- Live Perception (Webcam + STT + Audio Visualization)
- Latent Insight (Embeddings, tokens, semantic facets)
- Thought Stream (LLM-based internal reflections + consciousness metrics)
- Memory Timeline (STM → LTM events + waypoint system)
- 2D Map View - Top-down perspective with clustering
- 3D Space View - Immersive 3D navigation with camera presets
- 3D Scatter Plot - Interactive point cloud exploration
- Interactive Navigation (Camera presets, mini-map, filtering)
- Memory Timeline Playback - Speed-controlled memory exploration
- Memory Event Analysis - AI-powered insights and reflection
- Export Functionality - Data persistence and sharing
- Memory Annotation - Categorization and labeling system
- Memory Consolidation (AI-powered concept creation)
- Service Health Monitoring (Real-time status tracking)
- Responsive Design (Mobile and desktop optimized)
latent-journey/
├── docs/ # Documentation and design decisions
│ ├── 0_PRODUCT_GOAL_AND_ACCEPTANCE.md
│ ├── 1_GOAL_INTERPRETATION.md
│ ├── 2_PROBLEM_FRAMING.md
│ ├── 3_DESIGN_DECISIONS.md
│ ├── 4_SYSTEM_ARCHITECTURE.md
│ ├── 5_INTERFACE_AND_FLOW.md
│   └── ...
├── cmd/
│ └── gateway/ # Go main (HTTP + SSE)
│ ├── main.go
│ └── go.mod
├── pkg/
│ ├── api/ # Go handlers, DTOs, event bus
│ │ ├── routes.go
│ │ └── sse.go
│ └── go.mod
├── services/
│ ├── ml-py/ # Python FastAPI ML Service
│ │ ├── app.py
│ │ └── requirements.txt
│ ├── sentience-rs/ # Rust Sentience Service
│ │ ├── Cargo.toml
│ │ ├── src/main.rs
│ │ └── data/memory.jsonl
│ ├── llm-py/ # Python LLM Service
│ │ ├── app.py
│ │ └── requirements.txt
│ └── ego-rs/ # Rust Ego Service (AI Reflection)
│ ├── Cargo.toml
│ ├── src/main.rs
│ ├── src/handlers.rs
│ ├── src/reflection.rs
│ ├── src/memory.rs
│ └── src/types.rs
├── ui/ # React Frontend (Vite)
│ ├── index.html
│ ├── src/
│ │ ├── App.tsx
│ │ ├── components/
│ │ │ ├── ThoughtStream.tsx
│ │ │ ├── LatentSpacePage.tsx
│ │ │ ├── LatentSpaceView.tsx
│ │ │ ├── LatentSpace3D.tsx
│ │ │ └── LatentScatter.tsx
│ │ └── main.tsx
│ ├── package.json
│ └── vite.config.ts
├── sentience-dsl/ # Sentience DSL implementation
├── Makefile # Service orchestration
├── Cargo.toml # Root Rust workspace
└── README.md
Make sure you have the following installed:
- Go (1.21+)
- Python (3.8+)
- Rust (1.70+)
- Node.js (18+)
- Ollama (for AI thought generation)
> ⚠️ **Required:** Ollama is required for AI thought generation. Without it, the Thought Stream feature will not work.
- macOS: `brew install ollama`
- Linux: `curl -fsSL https://ollama.ai/install.sh | sh`
- Windows: download from https://ollama.ai/download
```bash
# Start Ollama service (keep this running)
ollama serve

# In a separate terminal, download the model
ollama pull llama3.2:3b
```

Alternative models:

- `llama3.1:8b-instruct` (larger, better quality)
- `phi:2.7b` (smaller, faster)
- `gemma:latest` (Google's model)
Install dependencies and start everything:

```bash
make install
make dev
```

This will start all services with health checks:
- Gateway (Go): http://localhost:8080
- ML Service (Python): http://localhost:8081
- Sentience Service (Rust): http://localhost:8082
- LLM Service (Python): http://localhost:8083
- Ego Service (Rust): http://localhost:8084
- UI (React): http://localhost:5173
Visit http://localhost:5173 in your browser to see the latent-journey interface.
The main interface showing live perception, latent insights, thought stream, and memory timeline
Interactive 3D latent space exploration with multiple view modes and navigation controls
Advanced memory analysis with timeline playback, export functionality, and annotation system
- Enable Camera & Microphone: Allow browser permissions when prompted
- Check Service Status: Ensure all services show "online" in the status bar
- Start with Vision: Show objects to the camera to see CLIP analysis
- Try Speech: Click the microphone button and speak to see Whisper transcription
- Explore 3D Space: Navigate to the Latent Space page for visualization
- Analyze Memories: Use the Memory Analysis page for detailed exploration
- Camera Feed: Real-time video capture with CLIP analysis
- Audio Visualization: Live audio levels and frequency display
- Capture Controls: Snap photos and record audio for processing
- Semantic Facets: Real-time display of perception tokens
- Progress Bars: Visual representation of affect/valence and arousal
- Confidence Scores: CLIP and Whisper confidence indicators
- AI Reflections: Real-time LLM-generated thoughts about perceptions
- Consciousness Metrics: Attention, salience, and coherence tracking
- Auto/Manual Modes: Control when thoughts are generated
- Event History: Chronological display of all memory events
- Filtering: Filter by source type (All/Speech/Vision)
- Waypoint System: Bookmark interesting states for comparison
- Multiple Views: Switch between 2D Map, 3D Space, and 3D Scatter
- Interactive Navigation: Camera presets and free navigation
- Mini-Map: Overview with filtering and trajectory visualization
- Waypoint Comparison: A/B state comparison interface
- Timeline Playback: Speed-controlled memory exploration
- Event Analysis: Detailed insights and AI-powered reflection
- Export Data: Save memory data for external analysis
- Annotation System: Label and categorize memories
If you prefer not to use Ollama, you can configure other providers in `services/llm-py/app.py`:

- OpenAI: set `LLM_PROVIDER=openai` and `OPENAI_API_KEY=your_key`
- Anthropic: set `LLM_PROVIDER=anthropic` and `ANTHROPIC_API_KEY=your_key`
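For illustration, provider selection driven by these environment variables might be wired like the following sketch. The function name and config shape here are assumptions, not the actual code in `services/llm-py/app.py`:

```python
import os

# Hypothetical provider table; the real app.py may structure this differently.
SUPPORTED_PROVIDERS = {"ollama", "openai", "anthropic"}

def resolve_provider() -> dict:
    """Read LLM_PROVIDER (and the matching API key) from the environment."""
    provider = os.getenv("LLM_PROVIDER", "ollama").lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported LLM_PROVIDER: {provider}")
    config = {"provider": provider}
    if provider == "openai":
        config["api_key"] = os.environ["OPENAI_API_KEY"]
    elif provider == "anthropic":
        config["api_key"] = os.environ["ANTHROPIC_API_KEY"]
    else:  # Ollama needs no key, only a local model name
        config["model"] = os.getenv("OLLAMA_MODEL", "llama3.2:3b")
    return config
```

Defaulting to Ollama keeps the system fully local unless a hosted provider is explicitly opted into.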
Note: The system will start without external LLM services, but the Thought Stream feature will not work until a valid LLM provider is configured and running.
The system offers different startup modes depending on your needs:
```bash
make dev    # Full functionality (all services, ~15-20 seconds)
make quick  # Essential services only (~8-10 seconds)
make fast   # UI development only (~5-8 seconds)
```

```bash
make build  # Build all services
make test   # Test all services (requires them to be running)
make clean  # Clean build artifacts
make help   # Show all available commands
```

**Port conflicts:**

```bash
# Check if ports are available
lsof -i :8080-8084,5173

# Kill processes using ports
sudo lsof -ti:8080-8084,5173 | xargs kill -9

# Restart services
make clean && make dev
```

**Ollama issues:**

```bash
# Check if Ollama is running
ollama list

# Restart Ollama
pkill ollama
ollama serve

# Download model if missing
ollama pull llama3.2:3b
```

**Camera/microphone issues:**

- Check browser permissions for camera and microphone
- Ensure no other applications are using the camera
- Try refreshing the page and re-granting permissions

**3D visualization issues:**

- Check browser WebGL support: https://get.webgl.org/
- Try disabling hardware acceleration in the browser
- Clear the browser cache and refresh

**Performance issues:**

- Check available disk space (models require ~2GB)
- Monitor system memory usage
- Restart services if memory usage is high
Visit these URLs to check individual services:
- Gateway: http://localhost:8080/healthz
- ML Service: http://localhost:8081/health
- Sentience: http://localhost:8082/health
- LLM Service: http://localhost:8083/health
- Ego Service: http://localhost:8084/health
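The checks above can also be scripted. This is a small stand-alone sketch (not part of the repo) that polls each health endpoint from the list and reports its status:

```python
import urllib.request

# Health endpoints from the list above; a service that does not answer
# within the timeout is reported as "offline".
SERVICES = {
    "gateway":   "http://localhost:8080/healthz",
    "ml":        "http://localhost:8081/health",
    "sentience": "http://localhost:8082/health",
    "llm":       "http://localhost:8083/health",
    "ego":       "http://localhost:8084/health",
}

def check(url: str) -> str:
    """Return 'online' when the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return "online" if 200 <= resp.status < 300 else "offline"
    except OSError:
        return "offline"

if __name__ == "__main__":
    for name, url in SERVICES.items():
        print(f"{name:10} {check(url)}")
```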
```bash
# View service logs
make logs

# Check specific service logs
tail -f services/ego-rs/logs/app.log
tail -f services/sentience-rs/logs/app.log

# Debug mode (verbose logging)
DEBUG=1 make dev
```

The project follows a microservices architecture:

- `cmd/gateway/` - Go HTTP gateway and SSE server
- `pkg/api/` - Shared Go API handlers and types
- `services/ml-py/` - Python ML service (CLIP, Whisper)
- `services/sentience-rs/` - Rust service for tokenization
- `services/ego-rs/` - Rust service for AI reflection
- `services/llm-py/` - Python LLM service (Ollama integration)
- `ui/` - React frontend with TypeScript
- `sentience-dsl/` - Custom DSL for AI agent programming
```bash
# Install development dependencies
make install

# Start in development mode with hot reload
make dev

# Run tests
make test

# Format code
make format

# Lint code
make lint
```

To extend the system:

- Backend Services: Add new endpoints in `pkg/api/routes.go`
- Frontend Components: Add to `ui/src/components/`
- Memory System: Extend `services/ego-rs/src/memory.rs`
- Sentience DSL: Modify `sentience-dsl/src/` for new agent behaviors
Environment variables can be set in a `.env` file:

```bash
# LLM Configuration
LLM_PROVIDER=ollama
OLLAMA_MODEL=llama3.2:3b

# Service Ports
GATEWAY_PORT=8080
ML_PORT=8081
SENTIENCE_PORT=8082
LLM_PORT=8083
EGO_PORT=8084
UI_PORT=5173

# Debug Mode
DEBUG=false
LOG_LEVEL=info
```

Gateway endpoints:

- `GET /healthz` - Health check
- `POST /api/vision/frame` - Process image
- `POST /api/speech/transcript` - Process audio
- `GET /events` - SSE event stream

Memory and ego endpoints:

- `GET /memory` - Retrieve memory events
- `POST /api/ego/thought` - Generate AI thought
- `GET /api/ego/memories` - Get STM data
- `GET /api/ego/experiences` - Get LTM data
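To exercise these endpoints by hand, a minimal client helper can be sketched as below. The request and response schemas used here are assumptions; check the handlers in `pkg/api/routes.go` for the actual shapes:

```python
import json
import urllib.request

GATEWAY = "http://localhost:8080"

def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request against the gateway."""
    return urllib.request.Request(
        GATEWAY + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def post_json(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(path, payload), timeout=5) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # The payload field below is an illustrative assumption, not the real schema.
    try:
        print(post_json("/api/ego/thought", {"trigger": "manual"}))
    except OSError as exc:
        print("gateway not reachable:", exc)
```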
- Input → Camera/Microphone captures data
- Perception → CLIP/Whisper processes input
- Tokenization → Sentience service creates semantic tokens
- Memory → Events stored in STM/LTM
- Reflection → Ego service generates AI thoughts
- Visualization → UI displays all data in real-time
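The steps above can be sketched end to end as pure functions. These are toy stand-ins shown only to make the flow concrete; the real stages run CLIP, Whisper, the Sentience tokenizer, and an LLM behind HTTP/SSE:

```python
# Toy stand-ins for each pipeline stage; names and shapes are illustrative.

def perceive(raw: str) -> dict:
    """Stage 2: turn raw input into a (fake) latent representation."""
    return {"modality": "speech", "embedding": [float(ord(c)) for c in raw[:4]], "text": raw}

def tokenize(percept: dict) -> list[str]:
    """Stage 3: derive symbolic tokens from the percept."""
    return [f"token:{w}" for w in percept["text"].split()]

def remember(stm: list, tokens: list[str]) -> list:
    """Stage 4: append the tokens to short-term memory."""
    return stm + [tokens]

def reflect(stm: list) -> str:
    """Stage 5: produce a (fake) thought about the newest memory."""
    return f"I noticed {len(stm[-1])} salient tokens."

stm: list = []
percept = perceive("a red cup")
stm = remember(stm, tokenize(percept))
thought = reflect(stm)  # stage 6 would render this in the Thought Stream
```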
- Live Interactive Exploration: Real-time latent space navigation
- Multi-modal Interface: Vision + Speech + 3D visualization
- AI Model Integration: CLIP, Whisper, Ollama working seamlessly
- Journey: AI reflection system with consciousness metrics
- 3D Visualization: Three distinct exploration modes
- Memory Consolidation: AI-powered concept creation
- Interactive Navigation: Camera presets and mini-map
- Professional UI: Modern dark theme with smooth animations
**Minimum:**

- OS: macOS 10.15+, Ubuntu 18.04+, Windows 10+
- RAM: 8GB (16GB recommended)
- Storage: 5GB free space
- CPU: 4 cores (8 cores recommended)
- GPU: Integrated graphics (dedicated GPU recommended for 3D visualization)

**Recommended:**

- RAM: 16GB+ for smooth operation
- GPU: Dedicated GPU with WebGL 2.0 support
- Network: Stable internet connection for model downloads
- Browser: Chrome 90+, Firefox 88+, Safari 14+
- Vision Processing: <100ms (target: <250ms) - 2.5x faster
- Speech Processing: <200ms (target: <500ms) - 2.5x faster
- UI Responsiveness: <50ms updates
- Memory Operations: <100ms with optimized state management
- System Reliability: >99% uptime in development
- Memory Footprint: <2GB (optimized with efficient state management)
- CPU Usage: <50% average (efficient rendering and processing)
- Network Bandwidth: <10Mbps (local processing, minimal network usage)
- Storage: <1GB for models (CLIP and Whisper models optimized)
This project implements the theoretical framework defined in:
Structured Synthetic Memory: The SynthaMind Hypothesis and Architecture Overview
A 15-page theoretical paper describing how intelligence, consciousness, and personality may emerge from structured, relational memory - implemented in this repository via Sentience DSL, ego.thought stream, and stratified memory subsystems.
Comprehensive documentation lives in the /docs folder. Start with:
- `docs/README.md` - Complete documentation index
- `docs/7_SYNTHAMIND_HYPOTHESIS.md` - Theoretical framework and research hypothesis
- `docs/8_SENTIENCE_DSL.md` - Sentience DSL language documentation
- `docs/4_SYSTEM_ARCHITECTURE.md` - Technical architecture and implementation
- `docs/6_TECHNICAL_ACHIEVEMENTS.md` - Detailed technical accomplishments