
Batch Inference Guide

This guide explains how to run batch inference with JarvisArt.

Step 1: Installation

  1. Conda Environment: Set up a conda environment with the required dependencies before running inference:
conda create -n jarvisart_infer python=3.11
conda activate jarvisart_infer
conda install -c conda-forge ffmpeg=7 pkg-config -y
pip install -r envs/requirements_demo.txt
  2. Install Adobe Lightroom: Download and install Adobe Lightroom Classic on your local machine from the official website, then sign in with your Adobe account credentials.

Note: Adobe Lightroom Classic is a commercial product and may require a subscription or trial account.

Step 2: Download Model Weights

To run the batch inference, you need to download the JarvisArt model weights from Hugging Face:

  1. Create the weights directory (if it doesn't exist):

    cd JarvisArt/
    mkdir -p ./checkpoints/pretrained/JarvisArt-preview/
  2. Download the JarvisArt weights from Hugging Face repository:

    # Using huggingface-cli (recommended)
    huggingface-cli download JarvisArt/JarvisArt-preview --local-dir ./checkpoints/pretrained/JarvisArt-preview
    
    # Or using git-lfs
    git lfs install
    git clone https://huggingface.co/JarvisArt/JarvisArt-preview ./checkpoints/pretrained/JarvisArt-preview
  3. If you've placed the model weights in a different location, remember to update the model_name_or_path parameter in src/inference/config/qwen2_vl.yaml to point to your custom model directory.
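For orientation, the relevant fields of src/inference/config/qwen2_vl.yaml look roughly like the excerpt below. This is an illustrative fragment assembled from the defaults listed in the Parameter Explanation section later in this guide; the actual file contains additional fields.

```
# Excerpt (illustrative) from src/inference/config/qwen2_vl.yaml
model_name_or_path: ./checkpoints/pretrained/JarvisArt-preview/  # change if you moved the weights
template: qwen2_vl
infer_backend: vllm
max_new_tokens: 10240
vllm_maxlen: 10240
```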

Step 3: Choose Your Inference Mode

JarvisArt provides two inference modes:

  • Mode 1 (Basic): Generate Lightroom preset files only (.lrtemplate)
  • Mode 2 (End-to-End): Generate presets AND automatically process images with Lightroom

Choose the mode that best fits your workflow below.


Mode 1: Basic Inference (Preset Generation Only)

This mode generates Lightroom preset files (.lrtemplate) that you can manually apply in Adobe Lightroom Classic.

3.1 Start JarvisArt API Service

Start the JarvisArt API service on the server:

cd JarvisArt/
API_PORT=8002 CUDA_VISIBLE_DEVICES=0 llamafactory-cli api src/inference/config/qwen2_vl.yaml

Note: This service must remain running; start it in a separate terminal window.

3.2 Run Basic Batch Inference

python inference.py --image_path /path/to/your/images

Replace /path/to/your/images with the path to the directory containing the images you want to process.

Input Directory Structure:

.
 |-- first_image
 |   |-- before.jpg
 |   |-- user_want.txt      # This file contains the user's desired output description
 |   `-- output_image.lrtemplate  # Output file generated by the script
 `-- second_image
     |-- before.png
     |-- user_want.txt
     `-- output_image.lrtemplate

Note:

  • The script will generate .lrtemplate preset files in the same directory as the input images
  • You can manually import and apply these presets in Adobe Lightroom Classic
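The input layout above can be checked with a short script before launching a batch. The following is a hypothetical sketch (find_tasks is not part of the repository) that mirrors the conventions the guide describes: each subdirectory holds a before.jpg or before.png plus a user_want.txt prompt file.

```python
from pathlib import Path

def find_tasks(image_path: str, prompt_file_name: str = "user_want.txt"):
    """Collect (image, prompt) pairs from the expected directory layout."""
    tasks = []
    for sub in sorted(Path(image_path).iterdir()):
        if not sub.is_dir():
            continue
        prompt = sub / prompt_file_name
        # Accept either before.jpg or before.png, as shown in the layout above.
        image = next((p for p in (sub / "before.jpg", sub / "before.png")
                      if p.exists()), None)
        if image is not None and prompt.exists():
            tasks.append((image, prompt))
    return tasks
```

Subdirectories missing either file are silently skipped, which matches the per-directory layout shown above.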

Mode 2: End-to-End Inference (with Lightroom Integration)

This mode provides complete end-to-end processing: it generates Lightroom presets AND automatically processes images using the Lightroom reverse connection service.

4.1 Start Server-Side Services

4.1.1 Start JarvisArt API Service (Server-Side - Terminal 1)

First, start the JarvisArt API service on the server:

cd JarvisArt/
API_PORT=8002 CUDA_VISIBLE_DEVICES=0 llamafactory-cli api src/inference/config/qwen2_vl.yaml

Note: This service must remain running; start it in a separate terminal window.

4.1.2 Start Lightroom Reverse Connection Server (Server-Side - Terminal 2)

Open a new terminal window on the server and start the Lightroom reverse connection server:

cd lrc_scripts/servers
bash start_reverse_server.sh

Default Configuration:

  • Listen Address: 0.0.0.0
  • Listen Port: 8081
  • Upload Directory: JarvisArt/lrc_scripts/servers/lr_caches/uploads
  • Results Directory: JarvisArt/lrc_scripts/servers/lr_caches/results

Custom Configuration Examples:

# Customize port and directories
bash start_reverse_server.sh --port 8082 --max-retries 10 --wait-timeout 300

# View all available parameters
bash start_reverse_server.sh --help

Note: This service must remain running; start it in a separate terminal window.

4.2 Start Mac/Windows Client Service

4.2.1 Configure Server Connection Information

On the local Mac/Windows machine, first configure the server address to connect to. Edit lrc_scripts/clients/start_mac_client.sh and modify the following configuration:

# Change to your server IP and port (supports multiple servers, separated by commas) in start_mac_client.sh
DEFAULT_LINUX_SERVERS="YOUR_SERVER_IP:8081"
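The comma-separated server list has the form IP:PORT,IP:PORT. A hypothetical sketch of how a client might parse such a value (the actual parsing in start_mac_client.sh may differ):

```python
def parse_servers(spec: str):
    """Split 'IP:PORT,IP:PORT' into a list of (host, port) tuples."""
    servers = []
    for entry in spec.split(","):
        # rpartition keeps everything before the last ':' as the host part.
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers
```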

4.2.2 Install Agent-to-Lightroom (A2L) Plugin

IMPORTANT: Before starting the Mac/Windows client, you must first install the A2L plugin in Adobe Lightroom Classic. This plugin enables communication between the agent and Lightroom for automated photo processing.

Installation Steps:

  1. Open Adobe Lightroom Classic
  2. Navigate to File → Plug-in Manager...
  3. Click the Add button in the Plugin Manager window
  4. Browse and select the lrc_scripts/clients/agent_to_lightroom/XMPlayer.lrplugin/ directory
  5. Click Done to complete the installation

The XMP Player plugin should now appear in your plugin list and be ready to use.

For detailed installation instructions with screenshots, please refer to: Agent-to-Lightroom Plugin Documentation

4.2.3 Start Mac/Windows Client

After installing the A2L plugin, run the following command on the Mac/Windows local machine to start the client:

cd lrc_scripts/clients
bash start_mac_client.sh

Default Configuration:

  • Server Address: Needs to be configured in the script
  • Local Lightroom API Port: 7777
  • Polling Interval: 1.0 seconds

Custom Configuration Examples:

# Specify server address and port
bash start_mac_client.sh --servers "192.168.1.100:8081,192.168.1.101:8081" --api-port 7777

# View all available parameters
bash start_mac_client.sh --help

Prerequisites:

  • Ensure Adobe Lightroom Classic is installed and running on Mac/Windows
  • Ensure the A2L plugin is properly installed (see section 4.2.2)
  • Ensure network connectivity to server port 8081

Note: This service must remain running. The client automatically connects to the server and maintains a heartbeat.

4.3 Run End-to-End Batch Inference

Once all services are successfully started and connected, run the end-to-end inference on the server side:

# Basic usage
python inference_e2e.py \
    --image_path /path/to/your/images \
    --api_endpoint localhost \
    --api_port 8002

# Advanced usage with multiple threads and load balancing
python inference_e2e.py \
    --image_path /path/to/your/images \
    --save_base_path /path/to/save/results \
    --api_endpoint localhost \
    --api_port 8002 8003 8004 \
    --model_name qwen2_vl \
    --api_key 0 \
    --max_threads 20 \
    --prompt_file_name user_want.txt \
    --default_timeout 180 \
    --api_timeout 30
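When several --api_port values are given, requests can be distributed across the corresponding endpoints. The sketch below shows one simple round-robin scheme as an illustration; the actual load-balancing strategy inside inference_e2e.py may differ.

```python
import itertools

def make_endpoint_picker(host: str, ports: list[int]):
    """Return a callable that cycles through host:port endpoints round-robin."""
    cycle = itertools.cycle(ports)
    return lambda: f"{host}:{next(cycle)}"

# Each worker thread calls pick() to choose the endpoint for its next request.
pick = make_endpoint_picker("localhost", [8002, 8003, 8004])
```

With --max_threads workers drawing endpoints this way, load spreads evenly across the API server instances.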

Input Directory Structure:

.
 |-- first_image
 |   |-- before.jpg
 |   `-- user_want.txt      # This file contains the user's desired output description
 `-- second_image
     |-- before.png
     `-- user_want.txt

Output Directory Structure:

After processing, each image directory will contain:

result_dir/
├── before.jpg              # Original image copy
├── output_image.lua        # Lua format Lightroom parameters
├── output_image.lrtemplate # lrtemplate format preset file
├── processed.jpg           # Image processed by Lightroom
├── response.txt            # Complete model response
└── conversation_history.json  # Conversation history (JSON format)

Note:

  • The script will automatically process all images using Lightroom
  • Processed images are saved as processed.jpg in the result directory
  • The script supports resuming from interruption (skips already processed images)
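Resuming from interruption amounts to skipping any result directory that already contains the final output. A minimal sketch of such a check, assuming processed.jpg marks completion as described in the output structure above:

```python
from pathlib import Path

def needs_processing(result_dir: str) -> bool:
    """A directory is considered done once Lightroom has written processed.jpg."""
    return not (Path(result_dir) / "processed.jpg").exists()
```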

Parameter Explanation

API Service Parameters (Step 3.1 / 4.1.1)

  • API_PORT: Port number for the API server (default: 8002)
  • CUDA_VISIBLE_DEVICES: GPU device to use (e.g., 0 for first GPU)
  • src/inference/config/qwen2_vl.yaml: Configuration file containing:
    • model_name_or_path: Path to the model checkpoint directory (default: ./checkpoints/pretrained/JarvisArt-preview/)
    • template: Model template type (default: qwen2_vl)
    • infer_backend: Inference backend (default: vllm)
    • max_new_tokens: Maximum number of tokens to generate (default: 10240)
    • vllm_maxlen: Maximum sequence length (default: 10240)

Lightroom Reverse Connection Server Parameters (Step 4.1.2)

  • --host HOST: Listen address (default: 0.0.0.0)
  • --port PORT: Listen port (default: 8081)
  • --upload-dir DIR: Upload file storage directory (default: ./lr_caches/uploads)
  • --results-dir DIR: Result file storage directory (default: ./lr_caches/results)
  • --max-retries NUM: Maximum retry count (default: 5)
  • --wait-timeout SEC: File wait timeout in seconds (default: 180.0)
  • --retry-delay SEC: Retry delay in seconds (default: 2.0)
  • --backoff-factor NUM: Backoff factor (default: 1.5)
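One common reading of the three retry parameters above is an exponential backoff schedule, where the n-th retry waits retry-delay × backoff-factor^n seconds. The sketch below uses the listed defaults; the server's exact schedule may differ.

```python
def backoff_delays(max_retries: int = 5, retry_delay: float = 2.0,
                   backoff_factor: float = 1.5):
    """Delay before each retry attempt: retry_delay * backoff_factor**attempt."""
    return [retry_delay * backoff_factor ** attempt
            for attempt in range(max_retries)]
```

With the defaults, the delays grow as 2.0, 3.0, 4.5, 6.75, 10.125 seconds, totalling well under the 180-second --wait-timeout.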

Mac/Windows Client Parameters (Step 4.2.3)

  • --servers SERVERS: Linux server addresses (format: IP:PORT,IP:PORT)
  • --api-port PORT: Local Lightroom API port (default: 7777)
  • --api-path PATH: API_Lightroom project path (default: ./)
  • --client-id ID: Client ID (default: auto-generated)
  • --poll-interval SEC: Polling interval in seconds (default: 1.0)
  • --retry-delay SEC: Connection retry delay in seconds (default: 3.0)
  • --max-failures NUM: Maximum consecutive failures (default: 5)
  • --health-interval SEC: Health check interval in seconds (default: 30.0)
  • --max-empty-polls NUM: Consecutive empty polls threshold (default: 50)
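The client-side thresholds above can be pictured as a simple polling loop. This hypothetical sketch shows how --max-empty-polls and --max-failures might interact; it is an illustration, not the client's actual code, and the real client also sleeps --poll-interval seconds between polls (omitted here).

```python
def run_poll_loop(fetch, max_empty_polls: int = 50, max_failures: int = 5):
    """Poll `fetch` until too many consecutive empty polls or failures.

    `fetch` returns a job (truthy), None for an empty poll, or raises on error.
    Returns the number of jobs handled.
    """
    empty, failures, handled = 0, 0, 0
    while empty < max_empty_polls and failures < max_failures:
        try:
            job = fetch()
        except Exception:
            failures += 1          # consecutive failures end the loop
            continue
        failures = 0               # any successful poll resets the failure count
        if job is None:
            empty += 1             # consecutive empty polls end the loop
        else:
            empty = 0
            handled += 1
    return handled
```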

End-to-End Inference Parameters (Step 4.3)

API Configuration:

  • --api_endpoint: API server address (default: localhost)
  • --api_port: API server port(s) (default: 8002). Passing multiple ports, e.g. --api_port 8002 8003, enables load balancing
  • --api_key: API authentication key (default: 0)
  • --model_name: AI model name for image processing (default: qwen2_vl)

Processing Configuration:

  • --max_threads: Maximum concurrent processing threads (default: 10)

File Paths:

  • --image_path (required): Directory containing input images with subdirectories. Each subdirectory should contain an image file (before.jpg or before.png) and a user prompt file
  • --save_base_path: Base directory for saving processing results (default: {image_path}/results)
  • --prompt_file_name: Filename of user prompt file in each image directory (default: user_want.txt)

Processing Parameters:

  • --default_timeout: Default timeout for API requests in seconds (default: 180)
  • --api_timeout: API connection timeout in seconds (default: 30)

Troubleshooting

Mode 1 (Basic) Issues

API Connection Failed:

  • Verify the API service is running
  • Check the port number is correct
  • Verify network connectivity

Mode 2 (E2E) Issues

Lightroom Processing Failed:

  • Ensure the Lightroom reverse connection server is running
  • Verify environment variables LIGHTROOM_SERVER_HOST and LIGHTROOM_SERVER_PORT
  • Check the Mac/Windows client is connected
  • Verify the A2L plugin is installed in Adobe Lightroom Classic

Client Connection Issues:

  • Ensure Adobe Lightroom Classic is running
  • Verify the A2L plugin is properly installed
  • Check network connectivity between client and server
  • Review client logs for connection errors

Performance Issues:

  • Reduce the --max_threads value to decrease concurrent load
  • Add more API server instances for load balancing
  • Check GPU memory usage and adjust accordingly