A production-ready reference app demonstrating the RunAnywhere Kotlin SDK capabilities for on-device AI. This app showcases how to build privacy-first, offline-capable AI features with LLM chat, speech-to-text, text-to-speech, and a complete voice assistant pipeline—all running locally on your device.
Important: This sample app consumes the RunAnywhere Kotlin SDK as a local Gradle included build. Before opening this project, you must first build the SDK's native libraries.
# 1. Navigate to the Kotlin SDK directory
cd runanywhere-sdks/sdk/runanywhere-kotlin
# 2. Run the setup script (~10-15 minutes on first run)
# This builds the native C++ JNI libraries and sets testLocal=true
./scripts/build-kotlin.sh --setup
# 3. Open this sample app in Android Studio
# File > Open > examples/android/RunAnywhereAI
# 4. Wait for Gradle sync to complete
# 5. Connect an Android device (ARM64 recommended) or use an emulator
# 6. Click Run

This sample app uses settings.gradle.kts with includeBuild() to reference the local Kotlin SDK:
This Sample App → Local Kotlin SDK (sdk/runanywhere-kotlin/)
↓
Local JNI Libraries (sdk/runanywhere-kotlin/src/androidMain/jniLibs/)
↑
Built by: ./scripts/build-kotlin.sh --setup
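For reference, the includeBuild() wiring described above can be sketched as follows. This is an illustrative settings.gradle.kts fragment, not a verbatim copy of the one in this repo; the relative path is inferred from the directory layout shown in this README.

```kotlin
// settings.gradle.kts (sketch) — substitutes the published SDK artifact
// with the local source build so SDK changes are picked up immediately.
rootProject.name = "RunAnywhereAI"

// Path inferred from the repo layout (examples/android/RunAnywhereAI -> sdk/runanywhere-kotlin).
includeBuild("../../../sdk/runanywhere-kotlin")

include(":app")
```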
The build-kotlin.sh --setup script:
- Downloads dependencies (Sherpa-ONNX, ~500MB)
- Builds the native C++ libraries from runanywhere-commons
- Copies JNI .so files to sdk/runanywhere-kotlin/src/androidMain/jniLibs/
- Sets runanywhere.testLocal=true in gradle.properties
- Kotlin SDK code changes: rebuild in Android Studio or run ./gradlew assembleDebug
- C++ code changes (in runanywhere-commons): cd sdk/runanywhere-kotlin && ./scripts/build-kotlin.sh --local --rebuild-commons
Download the app from Google Play Store to try it out.
This sample app demonstrates the full power of the RunAnywhere SDK:
| Feature | Description | SDK Integration |
|---|---|---|
| AI Chat | Interactive LLM conversations with streaming responses | RunAnywhere.generateStream() |
| Thinking Mode | Support for models with <think>...</think> reasoning | Thinking tag parsing |
| Real-time Analytics | Token speed, generation time, inference metrics | MessageAnalytics |
| Speech-to-Text | Voice transcription with batch & live modes | RunAnywhere.transcribe() |
| Text-to-Speech | Neural voice synthesis with Piper TTS | RunAnywhere.synthesize() |
| Voice Assistant | Full STT -> LLM -> TTS pipeline with auto-detection | RunAnywhere.processVoice() |
| Model Management | Download, load, and manage multiple AI models | RunAnywhere.downloadModel() |
| Storage Management | View storage usage and delete models | RunAnywhere.storageInfo() |
| Offline Support | All features work without internet | On-device inference |
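The thinking-mode feature above depends on separating `<think>...</think>` reasoning from the visible reply. A minimal, illustrative parser — not the app's actual implementation, and `ParsedResponse` is a hypothetical type — might look like:

```kotlin
// Hypothetical helper (not the SDK's parser): splits a model response into
// the hidden "thinking" portion and the visible answer.
data class ParsedResponse(val thinking: String?, val answer: String)

fun parseThinking(raw: String): ParsedResponse {
    // (?s) lets '.' match newlines, since reasoning often spans lines.
    val regex = Regex("(?s)<think>(.*?)</think>")
    val match = regex.find(raw) ?: return ParsedResponse(null, raw.trim())
    val answer = raw.removeRange(match.range).trim()
    return ParsedResponse(match.groupValues[1].trim(), answer)
}

fun main() {
    val parsed = parseThinking("<think>User wants a greeting.</think>Hello!")
    println(parsed.thinking) // User wants a greeting.
    println(parsed.answer)   // Hello!
}
```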
The app follows modern Android architecture patterns:
┌─────────────────────────────────────────────────────────────────┐
│ Jetpack Compose UI │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────┐ │
│ │ Chat │ │ STT │ │ TTS │ │ Voice │ │Settings│ │
│ │ Screen │ │ Screen │ │ Screen │ │ Screen │ │ Screen │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └───┬────┘ │
├───────┼────────────┼────────────┼────────────┼───────────┼──────┤
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────┐ │
│ │ Chat │ │ STT │ │ TTS │ │ Voice │ │Settings│ │
│ │ViewModel │ │ViewModel │ │ViewModel │ │ViewModel │ │ViewModel│
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └───┬────┘ │
├───────┴────────────┴────────────┴────────────┴───────────┴──────┤
│ │
│ RunAnywhere Kotlin SDK │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Extension Functions (generate, transcribe, synthesize) │ │
│ │ EventBus (LLMEvent, STTEvent, TTSEvent, ModelEvent) │ │
│ │ Model Management (download, load, unload, delete) │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌──────────────────┴──────────────────┐ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ LlamaCpp │ │ ONNX Runtime │ │
│ │ (LLM/GGUF) │ │ (STT/TTS) │ │
│ └─────────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
- MVVM Pattern — ViewModels manage UI state with StateFlow; Compose observes changes
- Single Activity — Jetpack Navigation Compose handles all screen transitions
- Coroutines & Flow — All async operations use Kotlin coroutines with structured concurrency
- EventBus Pattern — SDK events (model loading, generation, etc.) propagate via EventBus.events
- Repository Abstraction — ConversationStore persists chat history
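To make the EventBus pattern concrete: the real SDK exposes events as a coroutines Flow, but the idea can be shown with a dependency-free sketch (all names here are illustrative, not SDK API):

```kotlin
// Simplified sketch of the EventBus pattern: typed events fan out to
// subscribers. The SDK's EventBus.events is a Flow; a plain listener list
// stands in here so the example has no coroutines dependency.
sealed interface AppEvent
data class ModelLoaded(val modelId: String) : AppEvent
data class TokenGenerated(val token: String) : AppEvent

object SimpleEventBus {
    private val listeners = mutableListOf<(AppEvent) -> Unit>()
    fun subscribe(listener: (AppEvent) -> Unit) { listeners += listener }
    fun publish(event: AppEvent) = listeners.forEach { it(event) }
}

fun main() {
    val tokens = StringBuilder()
    // A ViewModel would subscribe and filter for the event types it cares about.
    SimpleEventBus.subscribe { event ->
        if (event is TokenGenerated) tokens.append(event.token)
    }
    SimpleEventBus.publish(ModelLoaded("smollm2-360m-q8_0"))
    SimpleEventBus.publish(TokenGenerated("Hello"))
    SimpleEventBus.publish(TokenGenerated(", world"))
    println(tokens) // Hello, world
}
```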
RunAnywhereAI/
├── app/
│ ├── src/main/
│ │ ├── java/com/runanywhere/runanywhereai/
│ │ │ ├── RunAnywhereApplication.kt # SDK initialization, model registration
│ │ │ ├── MainActivity.kt # Entry point, initialization state handling
│ │ │ │
│ │ │ ├── data/
│ │ │ │ └── ConversationStore.kt # Chat history persistence
│ │ │ │
│ │ │ ├── domain/
│ │ │ │ ├── models/
│ │ │ │ │ ├── ChatMessage.kt # Message data model with analytics
│ │ │ │ │ └── SessionState.kt # Voice session states
│ │ │ │ └── services/
│ │ │ │ └── AudioCaptureService.kt # Microphone audio capture
│ │ │ │
│ │ │ ├── presentation/
│ │ │ │ ├── chat/
│ │ │ │ │ ├── ChatScreen.kt # LLM chat UI with streaming
│ │ │ │ │ ├── ChatViewModel.kt # Chat logic, thinking mode
│ │ │ │ │ └── components/
│ │ │ │ │ └── MessageInput.kt # Chat input component
│ │ │ │ │
│ │ │ │ ├── stt/
│ │ │ │ │ ├── SpeechToTextScreen.kt # STT UI with waveform
│ │ │ │ │ └── SpeechToTextViewModel.kt # Batch & live transcription
│ │ │ │ │
│ │ │ │ ├── tts/
│ │ │ │ │ ├── TextToSpeechScreen.kt # TTS UI with playback
│ │ │ │ │ └── TextToSpeechViewModel.kt # Synthesis & audio playback
│ │ │ │ │
│ │ │ │ ├── voice/
│ │ │ │ │ ├── VoiceAssistantScreen.kt # Full voice pipeline UI
│ │ │ │ │ └── VoiceAssistantViewModel.kt # STT→LLM→TTS orchestration
│ │ │ │ │
│ │ │ │ ├── settings/
│ │ │ │ │ ├── SettingsScreen.kt # Storage & model management
│ │ │ │ │ └── SettingsViewModel.kt # Storage info, cache clearing
│ │ │ │ │
│ │ │ │ ├── models/
│ │ │ │ │ ├── ModelSelectionBottomSheet.kt # Model picker UI
│ │ │ │ │ └── ModelSelectionViewModel.kt # Download & load logic
│ │ │ │ │
│ │ │ │ ├── navigation/
│ │ │ │ │ └── AppNavigation.kt # Bottom nav, routing
│ │ │ │ │
│ │ │ │ └── common/
│ │ │ │ └── InitializationViews.kt # Loading/error states
│ │ │ │
│ │ │ └── ui/theme/
│ │ │ ├── Theme.kt # Material 3 theming
│ │ │ ├── AppColors.kt # Color palette
│ │ │ ├── Type.kt # Typography
│ │ │ └── Dimensions.kt # Spacing constants
│ │ │
│ │ ├── res/ # Resources (icons, strings)
│ │ └── AndroidManifest.xml # Permissions, app config
│ │
│ ├── src/test/ # Unit tests
│ └── src/androidTest/ # Instrumentation tests
│
├── build.gradle.kts # Project build config
├── settings.gradle.kts # Module settings
└── README.md # This file
- Android Studio Hedgehog (2023.1.1) or later
- Android SDK 24+ (Android 7.0 Nougat)
- JDK 17+
- Device/Emulator with arm64-v8a architecture (recommended: physical device)
- ~2GB free storage for AI models
# Clone the repository
git clone https://github.com/RunanywhereAI/runanywhere-sdks.git
cd runanywhere-sdks/examples/android/RunAnywhereAI
# Build debug APK
./gradlew assembleDebug
# Install on connected device
./gradlew installDebug

- Open the project in Android Studio
- Wait for Gradle sync to complete
- Select a physical device (arm64 recommended) or emulator
- Click Run or press Shift + F10
# Install and launch
./gradlew installDebug
adb shell am start -n com.runanywhere.runanywhereai.debug/.MainActivity

The SDK is initialized in RunAnywhereApplication.kt:
// Initialize SDK with development environment
RunAnywhere.initialize(environment = SDKEnvironment.DEVELOPMENT)
// Complete services initialization (device registration)
RunAnywhere.completeServicesInitialization()
// Register AI backends
LlamaCPP.register(priority = 100) // LLM backend (GGUF models)
ONNX.register(priority = 100) // STT/TTS backend
// Register models
RunAnywhere.registerModel(
id = "smollm2-360m-q8_0",
name = "SmolLM2 360M Q8_0",
url = "https://huggingface.co/prithivMLmods/SmolLM2-360M-GGUF/...",
framework = InferenceFramework.LLAMA_CPP,
memoryRequirement = 500_000_000,
)

// Download with progress tracking
RunAnywhere.downloadModel("smollm2-360m-q8_0").collect { progress ->
println("Download: ${(progress.progress * 100).toInt()}%")
}
// Load into memory
RunAnywhere.loadLLMModel("smollm2-360m-q8_0")

// Generate with streaming
RunAnywhere.generateStream(prompt).collect { token ->
// Display token in real-time
displayToken(token)
}
// Or non-streaming
val result = RunAnywhere.generate(prompt)
println("Response: ${result.text}")

// Load STT model
RunAnywhere.loadSTTModel("sherpa-onnx-whisper-tiny.en")
// Transcribe audio bytes
val transcription = RunAnywhere.transcribe(audioBytes)
println("Transcription: $transcription")

// Load TTS voice
RunAnywhere.loadTTSVoice("vits-piper-en_US-lessac-medium")
// Synthesize speech
val result = RunAnywhere.synthesize(text, TTSOptions(
rate = 1.0f,
pitch = 1.0f,
))
// result.audioData contains WAV audio bytes

// Process voice through full pipeline
val result = RunAnywhere.processVoice(audioData)
if (result.speechDetected) {
println("User said: ${result.transcription}")
println("AI response: ${result.response}")
// result.synthesizedAudio contains TTS audio
}

What it demonstrates:
- Streaming text generation with real-time token display
- Thinking mode support (<think>...</think> tags)
- Message analytics (tokens/sec, time to first token)
- Conversation history management
- Model selection bottom sheet integration
Key SDK APIs:
- RunAnywhere.generateStream() — Streaming generation
- RunAnywhere.generate() — Non-streaming generation
- RunAnywhere.cancelGeneration() — Stop generation
- EventBus.events.filterIsInstance<LLMEvent>() — Listen for LLM events
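The message analytics shown in this screen (tokens/sec, time to first token) can be derived from timestamps captured while collecting the token stream. A hedged sketch — `MessageAnalytics` fields and the function name here are illustrative, not the app's exact types:

```kotlin
// Illustrative analytics computation: record wall-clock times while
// collecting the stream, then derive throughput and latency.
data class MessageAnalytics(
    val tokensPerSecond: Double,
    val timeToFirstTokenMs: Long,
)

fun computeAnalytics(
    startMs: Long,       // when generation was requested
    firstTokenMs: Long,  // when the first token arrived
    endMs: Long,         // when the stream completed
    tokenCount: Int,
): MessageAnalytics {
    val elapsedSec = (endMs - startMs) / 1000.0
    return MessageAnalytics(
        tokensPerSecond = if (elapsedSec > 0) tokenCount / elapsedSec else 0.0,
        timeToFirstTokenMs = firstTokenMs - startMs,
    )
}

fun main() {
    val a = computeAnalytics(startMs = 0, firstTokenMs = 250, endMs = 2000, tokenCount = 60)
    println(a.tokensPerSecond)    // 30.0
    println(a.timeToFirstTokenMs) // 250
}
```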
What it demonstrates:
- Batch mode: Record full audio, then transcribe
- Live mode: Real-time streaming transcription
- Audio level visualization
- Transcription metrics (confidence, RTF, word count)
Key SDK APIs:
- RunAnywhere.loadSTTModel() — Load Whisper model
- RunAnywhere.transcribe() — Batch transcription
- RunAnywhere.transcribeStream() — Streaming transcription
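One of the metrics this screen surfaces is RTF (real-time factor): processing time divided by audio duration, where RTF < 1.0 means transcription runs faster than real time. A small sketch of the arithmetic, assuming 16 kHz mono 16-bit PCM (the helper names are illustrative):

```kotlin
// Duration of a raw PCM buffer: bytes / (sampleRate * bytesPerSample).
fun audioDurationSec(pcmBytes: Int, sampleRate: Int = 16_000, bytesPerSample: Int = 2): Double =
    pcmBytes.toDouble() / (sampleRate * bytesPerSample)

// Real-time factor: how long transcription took relative to the audio length.
fun realTimeFactor(processingSec: Double, audioSec: Double): Double =
    processingSec / audioSec

fun main() {
    val audioSec = audioDurationSec(pcmBytes = 320_000) // 10 s of 16 kHz mono 16-bit PCM
    println(audioSec)                       // 10.0
    println(realTimeFactor(1.5, audioSec))  // 0.15 — ~6.7x faster than real time
}
```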
What it demonstrates:
- Neural voice synthesis with Piper TTS
- Speed and pitch controls
- Audio playback with progress
- Fun sample texts for testing
Key SDK APIs:
- RunAnywhere.loadTTSVoice() — Load TTS model
- RunAnywhere.synthesize() — Generate speech audio
- RunAnywhere.stopSynthesis() — Cancel synthesis
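Since the synthesis result is described as WAV bytes, a player needs the sample rate before handing the audio to AudioTrack. A hedged sketch that reads it from a canonical 44-byte PCM WAV header (little-endian uint32 at offset 24); real-world WAV files can have extra chunks, which this does not handle:

```kotlin
// Reads the sample rate from a canonical PCM WAV header.
// Assumes the standard 44-byte layout: sample rate is a little-endian
// uint32 at byte offset 24.
fun wavSampleRate(wav: ByteArray): Int {
    require(wav.size >= 28) { "Not a complete WAV header" }
    return (wav[24].toInt() and 0xFF) or
        ((wav[25].toInt() and 0xFF) shl 8) or
        ((wav[26].toInt() and 0xFF) shl 16) or
        ((wav[27].toInt() and 0xFF) shl 24)
}

fun main() {
    // Minimal fake header: zeros except sample rate 22050 (0x5622) at offset 24.
    val header = ByteArray(44)
    header[24] = 0x22.toByte()
    header[25] = 0x56.toByte()
    println(wavSampleRate(header)) // 22050
}
```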
What it demonstrates:
- Complete voice AI pipeline
- Automatic speech detection with silence timeout
- Continuous conversation mode
- Model status tracking for all 3 components (STT, LLM, TTS)
Key SDK APIs:
- RunAnywhere.startVoiceSession() — Start voice session
- RunAnywhere.processVoice() — Process audio through pipeline
- RunAnywhere.voiceAgentComponentStates() — Check component status
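The "silence timeout" behavior above boils down to ending the utterance once the audio stays quiet for long enough. An illustrative energy-based detector — not the SDK's actual VAD; the threshold and frame counts are made-up values:

```kotlin
import kotlin.math.abs

// Illustrative end-of-utterance detector: fires once average absolute
// amplitude stays below a threshold for N consecutive frames.
class SilenceDetector(
    private val threshold: Double = 500.0,   // amplitude cutoff (hypothetical)
    private val silentFramesToStop: Int = 5, // e.g. 5 x 100 ms frames = 500 ms timeout
) {
    private var silentFrames = 0

    /** Feed one frame of 16-bit PCM samples; returns true when the timeout fires. */
    fun onFrame(samples: ShortArray): Boolean {
        val avg = samples.sumOf { abs(it.toInt()) }.toDouble() / samples.size
        silentFrames = if (avg < threshold) silentFrames + 1 else 0
        return silentFrames >= silentFramesToStop
    }
}

fun main() {
    val detector = SilenceDetector()
    val loud = ShortArray(160) { 2000 }
    val quiet = ShortArray(160) { 10 }
    println(detector.onFrame(loud))  // false — speech resets the counter
    repeat(4) { detector.onFrame(quiet) }
    println(detector.onFrame(quiet)) // true — 5th consecutive silent frame
}
```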
What it demonstrates:
- Storage usage overview
- Downloaded model management
- Model deletion with confirmation
- Cache clearing
Key SDK APIs:
- RunAnywhere.storageInfo() — Get storage details
- RunAnywhere.deleteModel() — Remove downloaded model
- RunAnywhere.clearCache() — Clear temporary files
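Byte counts like the memoryRequirement values used at registration need human-readable formatting in the storage screen. A small helper sketch (not an SDK API, just typical UI code):

```kotlin
import java.util.Locale

// Hypothetical UI helper: renders a byte count as e.g. "476.8 MB".
fun formatBytes(bytes: Long): String {
    val units = listOf("B", "KB", "MB", "GB")
    var value = bytes.toDouble()
    var unit = 0
    while (value >= 1024 && unit < units.lastIndex) {
        value /= 1024
        unit++
    }
    // Locale.US keeps the decimal point stable regardless of device locale.
    return "%.1f %s".format(Locale.US, value, units[unit])
}

fun main() {
    println(formatBytes(500_000_000L)) // 476.8 MB
}
```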
./gradlew test

./gradlew connectedAndroidTest

# Detekt static analysis
./gradlew detekt
# ktlint formatting check
./gradlew ktlintCheck
# Android lint
./gradlew lint

Filter logcat for RunAnywhere SDK logs:

adb logcat -s "RunAnywhere:D" "RunAnywhereApp:D" "ChatViewModel:D"

| Tag | Description |
|---|---|
| RunAnywhereApp | SDK initialization, model registration |
| ChatViewModel | LLM generation, streaming |
| STTViewModel | Speech transcription |
| TTSViewModel | Speech synthesis |
| VoiceAssistantVM | Voice pipeline |
| ModelSelectionVM | Model downloads, loading |
- Open Android Studio Profiler
- Select your app process
- Record memory allocations during model loading
- Expected: ~300MB-2GB depending on model size
| Variant | Description |
|---|---|
| debug | Development build with debugging enabled |
| release | Optimized build with R8/ProGuard |
| benchmark | Release-like build for performance testing |
export KEYSTORE_PATH=/path/to/keystore.jks
export KEYSTORE_PASSWORD=your_password
export KEY_ALIAS=your_alias
export KEY_PASSWORD=your_key_password

| Model | Size | Memory | Description |
|---|---|---|---|
| SmolLM2 360M Q8_0 | ~400MB | 500MB | Fast, lightweight chat |
| Qwen 2.5 0.5B Q6_K | ~500MB | 600MB | Multilingual, efficient |
| LFM2 350M Q4_K_M | ~200MB | 250MB | LiquidAI, ultra-compact |
| Llama 2 7B Chat Q4_K_M | ~4GB | 4GB | Powerful, larger model |
| Mistral 7B Instruct Q4_K_M | ~4GB | 4GB | High quality responses |
| Model | Size | Description |
|---|---|---|
| Sherpa Whisper Tiny (EN) | ~75MB | English transcription |
| Model | Size | Description |
|---|---|---|
| Piper US English (Medium) | ~65MB | Natural American voice |
| Piper British English (Medium) | ~65MB | British accent |
- ARM64 Only — Native libraries are built for arm64-v8a only (x86 emulators are not supported)
- Memory Usage — Large models (7B+) require devices with 6GB+ RAM
- First Load — Initial model loading takes 1-3 seconds (cached afterward)
- Thermal Throttling — Extended inference may trigger thermal throttling on some devices
See CONTRIBUTING.md for guidelines.
# Fork and clone
git clone https://github.com/YOUR_USERNAME/runanywhere-sdks.git
cd runanywhere-sdks/examples/android/RunAnywhereAI
# Create feature branch
git checkout -b feature/your-feature
# Make changes and test
./gradlew assembleDebug
./gradlew test
./gradlew detekt ktlintCheck
# Commit and push
git commit -m "feat: your feature description"
git push origin feature/your-feature
# Open Pull Request

This project is licensed under the Apache License 2.0 - see LICENSE for details.
- Discord: Join our community
- GitHub Issues: Report bugs
- Email: san@runanywhere.ai
- Twitter: @RunanywhereAI
- RunAnywhere Kotlin SDK — Full SDK documentation
- iOS Example App — iOS counterpart
- React Native Example — Cross-platform option
- Main README — Project overview
