Add multi-batch inference support, fix hivemind dependency, and improve installation process by JiuChen0 · Pull Request #27 · ai-decentralized/BloomBee

JiuChen0 · 2025-11-01T00:21:16Z

Key Changes

- Multi-batch inference support
  Enables inference with variable batch sizes for better performance and scalability.
- Fix hivemind dependency installation issue
  Resolved issues with installing hivemind dependencies in certain environments.
- Remove hardcoded model restriction
  Removed the hardcoded limitation that only allowed loading LLaMA 7B, enabling support for other model sizes.
- Update installation instructions in README.md
  Simplified the installation process. The entire BloomBee environment can now be installed and configured with a single command:
```
pip install -e .
```
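To make the multi-batch change concrete, here is a minimal sketch of how a `--batch_size` argument (the flag named in the commit notes) might group prompts for parallel processing. The helper names `parse_args` and `make_batches` are hypothetical and are not taken from the BloomBee code; the default of 1 is also an assumption.

```python
import argparse


def parse_args(argv=None):
    """Parse a --batch_size flag (hypothetical driver, not BloomBee's CLI)."""
    parser = argparse.ArgumentParser(description="Toy batched-inference driver")
    parser.add_argument(
        "--batch_size",
        type=int,
        default=1,  # assumed default: one sequence at a time
        help="number of sequences processed together",
    )
    args = parser.parse_args(argv)
    if args.batch_size < 1:
        parser.error("--batch_size must be a positive integer")
    return args


def make_batches(prompts, batch_size):
    """Split prompts into consecutive groups of at most batch_size items."""
    return [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]


if __name__ == "__main__":
    args = parse_args()
    for batch in make_batches(["prompt A", "prompt B", "prompt C"], args.batch_size):
        print(batch)
```

The last group may be smaller than `batch_size`, which is one reason the inference path has to accept variable batch sizes rather than a fixed one.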

…ve installation process (ai-decentralized#27)

* Add batch inference support and CPU compatibility
  - Add --batch_size CLI argument for parallel sequence processing
  - Add conditional CUDA stream creation for CPU-only mode
  - Add device-aware ExecutionEnv and Policy resource distribution
  - Fix MPS compatibility on macOS
* fix hardcode of model loading and support batch size
* Resolving dependency conflicts
* docs: refine README setup and usage sections for clarity and correctness
* Add batch size related updates
* delete ddebug output
* delete .id files
* fix max token size problem
* add prompt
* clear the debug print

---------

Co-authored-by: Danny Willow Liu <dannywillowliu@uchicago.edu>
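The "device-aware ... resource distribution" item can be pictured with a small sketch. Everything here is hypothetical: `ToyPolicy`, its two fields, and `policy_for` are illustrative stand-ins, not the `Policy` or `ExecutionEnv` classes changed in this PR. The only idea shown is that a CPU-only run should place nothing on an accelerator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPolicy:
    """Hypothetical placement policy: share of tensors kept on each tier."""

    accel_percent: int  # share placed on the accelerator (e.g. GPU)
    cpu_percent: int    # share placed in host memory

    def __post_init__(self):
        if not 0 <= self.accel_percent <= 100:
            raise ValueError("accel_percent must be between 0 and 100")
        if self.accel_percent + self.cpu_percent != 100:
            raise ValueError("percentages must sum to 100")


def policy_for(device, accel_percent=100):
    """Return a placement policy that matches the available device."""
    if device == "cpu":
        # No accelerator present: everything has to live in host memory.
        return ToyPolicy(accel_percent=0, cpu_percent=100)
    return ToyPolicy(accel_percent=accel_percent, cpu_percent=100 - accel_percent)
```

For example, `policy_for("cpu")` keeps all tensors on the host, while `policy_for("cuda", 60)` splits them 60/40 between accelerator and host.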

dannywillowliu-uchi and others added 10 commits October 15, 2025 22:43

Add batch inference support and CPU compatibility

6192e15

- Add --batch_size CLI argument for parallel sequence processing
- Add conditional CUDA stream creation for CPU-only mode
- Add device-aware ExecutionEnv and Policy resource distribution
- Fix MPS compatibility on macOS
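As an illustration of the CPU-only and MPS items above, the following sketch shows one common way to create a CUDA stream only when CUDA is actually present. It is not the code from this commit: `pick_device`, `make_stream`, and `stream_context` are hypothetical helpers, and the optional `torch` import is an assumption made so the sketch also loads on machines without PyTorch.

```python
import contextlib

try:
    import torch
except ImportError:  # assumption: let the sketch load without PyTorch
    torch = None


def pick_device():
    """Prefer CUDA, then Apple MPS, and fall back to CPU."""
    if torch is None:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def make_stream(device):
    """Create a CUDA stream only on CUDA; return None on CPU or MPS."""
    if device == "cuda" and torch is not None:
        return torch.cuda.Stream()
    return None


def stream_context(stream):
    """Return a context manager that is a no-op when there is no stream."""
    if stream is None:
        return contextlib.nullcontext()
    return torch.cuda.stream(stream)


device = pick_device()
stream = make_stream(device)
with stream_context(stream):
    pass  # work placed here runs on the stream if one exists
```

Returning `None` and pairing it with `contextlib.nullcontext()` keeps the calling code identical on every device, so no `if cuda:` branches are needed at each call site.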

fix hardcode of model loading and support batch size

48fbd69

Resolving dependency conflicts

3d3ff5b

docs: refine README setup and usage sections for clarity and correctness

9fafef5

Add batch size related updates

0b5b97a

delete ddebug output

4ad4882

delete .id files

136054a

fix max token size problem

b717a53

add prompt

5d26e9b

clear the debug print

612787f

HaibaraAiChan approved these changes Nov 1, 2025

HaibaraAiChan merged commit 862bd3b into ai-decentralized:main Nov 1, 2025

HaibaraAiChan merged 10 commits into ai-decentralized:main from JiuChen0:upload
