Pruna and ComfyUI
Getting Started
Setting up Pruna within ComfyUI is straightforward. With just a few steps, you'll be ready to optimize your Stable Diffusion or Flux models for faster inference right inside the ComfyUI interface. Here's a quick guide to get started.

Step 1 - Prerequisites
You will need a Linux system with a GPU to run our nodes. First, set up a conda environment, then install both ComfyUI and Pruna:
Create a conda environment, e.g., with
conda create -n comfyui python=3.11 && conda activate comfyui
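Then install ComfyUI and Pruna in this environment. The exact steps may vary with your setup; as a rough sketch, assuming ComfyUI's official repository and the pruna package on PyPI (pinned to the same version used below):
git clone https://github.com/comfyanonymous/ComfyUI.git && cd ComfyUI && pip install -r requirements.txt
pip install pruna==0.2.3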
To use Pruna Pro, you also need to export your Pruna token as an environment variable:
export PRUNA_TOKEN=<your_token_here>
[Optional] If you want to use the x-fast or stable-fast compiler, you need to install additional dependencies:
pip install pruna[stable-fast]==0.2.3
Note: To use our caching nodes or the x-fast compiler, you need access to Pruna Pro.
Step 2 - Pruna node integration
With your environment prepared, you're ready to integrate Pruna nodes into your ComfyUI setup. Follow these steps to clone the repository and launch ComfyUI:
Navigate to your ComfyUI installation’s custom_nodes folder:
cd <path_to_comfyui>/custom_nodes
Clone the ComfyUI_pruna repository:
git clone https://github.com/PrunaAI/ComfyUI_pruna.git
Launch ComfyUI:
cd <path_to_comfyui> && python main.py --disable-cuda-malloc --gpu-only
After completing these steps, you should see all the Pruna nodes in the nodes menu, under the Pruna category.
Pruna nodes - A short explanation
Pruna adds four powerful nodes to ComfyUI:
A compilation node that optimizes inference speed through model compilation. While this technique preserves output quality, performance gains can vary depending on the model.
Three distinct caching nodes, each implementing a unique strategy to accelerate inference by reusing intermediate computations:
Adaptive Caching: Dynamically adjusts caching for each prompt by identifying the optimal inference steps to reuse cached outputs.
Periodic Caching: Caches model outputs at fixed intervals, reusing them in subsequent steps to reduce computation.
Auto Caching: Automatically determines the optimal caching schedule to achieve a target latency reduction with minimal quality trade-off.
By tuning the hyperparameters of each node, you can achieve the best trade-off between speed and output quality for your specific use case. Please check out the detailed guide in our repo or the documentation for more details.
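Under the hood, these nodes wrap the pruna Python library, so the same optimizations can also be applied outside ComfyUI. The following is a minimal illustrative sketch, not the nodes' actual implementation; the config keys ("compiler", "cacher"), their values, and the example model ID are assumptions based on pruna's documented SmashConfig/smash API and may differ in your installed version:
# Minimal sketch: optimizing a diffusers pipeline with pruna outside ComfyUI.
# Assumption: the "compiler"/"cacher" config keys and the values below follow
# pruna's documented API; verify against your installed release.
from diffusers import StableDiffusionPipeline
from pruna import SmashConfig, smash

# Example model ID; substitute the Stable Diffusion or Flux checkpoint you use.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1"
).to("cuda")

smash_config = SmashConfig()
smash_config["compiler"] = "torch_compile"   # quality-preserving compilation
# smash_config["cacher"] = "adaptive"        # Pruna Pro caching (hypothetical value)

# smash() returns an optimized wrapper that is called like the original pipeline.
smashed_pipe = smash(model=pipe, smash_config=smash_config)
image = smashed_pipe("an astronaut riding a horse").images[0]
The ComfyUI nodes expose these same choices as node parameters, so you can experiment with compiler and caching settings directly in the graph instead of writing code.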
Read the full article: https://www.pruna.ai/blog/faster-comfyui-nodes