
Flux-Dev

We ran a comprehensive benchmark comparing FLUX-juiced with the “FLUX.1 [dev]” endpoints offered by Replicate, Fal, Fireworks AI, and Together AI.

  • Regarding cost efficiency, FLUX-juiced generates up to 180 images per dollar, compared to the baseline’s 60, making it 3× more cost-efficient (see the quick check after this list).

  • Regarding inference speed, FLUX-juiced delivers images in 2.5 seconds, versus 6–7 seconds for the base model—a 2.4–2.8× speedup, translating to ~18 hours saved when generating 1 million images.

  • Overall, FLUX-juiced consistently sits on the Pareto front: top-tier quality with faster output and lower cost, saving up to $20,000 at scale.
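
The headline ratios follow directly from the per-image numbers quoted above. Here is a quick, illustrative sanity check using only the published figures (2.5 s per image and 180 images per dollar for FLUX-juiced, 6–7 s and 60 images per dollar for the baseline); no new measurements are introduced:

# Quick check of the ratios quoted above, using only the published numbers.
images_per_dollar_juiced = 180
images_per_dollar_base = 60
print(images_per_dollar_juiced / images_per_dollar_base)  # 3.0 -> 3x more cost-efficient

latency_juiced = 2.5        # seconds per image
latency_base = (6.0, 7.0)   # seconds per image, reported range
print(latency_base[0] / latency_juiced,
      latency_base[1] / latency_juiced)  # 2.4 2.8 -> 2.4-2.8x speedup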

Try it on your setup.

import torch
from diffusers import FluxPipeline
from pruna_pro import SmashConfig, smash

# Load the base FLUX.1 [dev] pipeline in bfloat16 on the GPU.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Configure the optimization: torch.compile as the compiler plus Taylor caching.
smash_config = SmashConfig()
smash_config["compiler"] = "torch_compile"
smash_config["cacher"] = "taylor_auto"
# lightly juiced (0.6), juiced (0.5), extra juiced (0.4)
smash_config["taylor_auto_speed_factor"] = 0.4

# Apply the configuration with your Pruna Pro token.
smash_token = "<your-token>"
smashed_pipe = smash(
    model=pipe,
    token=smash_token,
    smash_config=smash_config,
)

# Generate an image with the optimized pipeline.
smashed_pipe("A cute, round, knitted purple prune.").images[0]
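
To check the latency on your own hardware, you can time the optimized pipeline directly. The sketch below continues from the snippet above (it reuses smashed_pipe); the prompt, step count, and number of runs are illustrative choices, not the benchmark's exact settings:

import time
import torch

prompt = "A cute, round, knitted purple prune."

# Warm-up run so torch.compile and cache setup are not counted in the timing.
smashed_pipe(prompt, num_inference_steps=28)

runs = 5
torch.cuda.synchronize()
start = time.perf_counter()
for _ in range(runs):
    smashed_pipe(prompt, num_inference_steps=28)
torch.cuda.synchronize()
print(f"average latency: {(time.perf_counter() - start) / runs:.2f} s/image")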

Read the full benchmark: https://www.pruna.ai/blog/flux-juiced-the-fastest-image-generation-endpoint