
Flux-Dev

We ran a comprehensive benchmark comparing FLUX-juiced with the “FLUX.1 [dev]” endpoints offered by Replicate, Fal, Fireworks AI, and Together AI.

  • Regarding cost efficiency, FLUX-juiced generates up to 180 images per dollar, compared to the baseline’s 60, making it 3× more cost-efficient (see the quick check after this list).

  • Regarding inference speed, FLUX-juiced delivers images in 2.5 seconds, versus 6–7 seconds for the base model—a 2.4–2.8× speedup, translating to ~18 hours saved when generating 1 million images.

  • Overall, FLUX-juiced consistently sits on the Pareto front: top-tier quality with faster output and lower cost, saving up to $20,000 at scale.
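
The headline ratios follow directly from the per-image numbers quoted above. Here is a quick, illustrative sanity check using only the published figures (2.5 s per image and 180 images per dollar for FLUX-juiced, 6–7 s and 60 images per dollar for the baseline); no new measurements are introduced:

# Quick check of the ratios quoted above, using only the published numbers.
images_per_dollar_juiced = 180
images_per_dollar_base = 60
print(images_per_dollar_juiced / images_per_dollar_base)  # 3.0 -> 3x more cost-efficient

latency_juiced = 2.5        # seconds per image
latency_base = (6.0, 7.0)   # seconds per image, reported range
print(latency_base[0] / latency_juiced,
      latency_base[1] / latency_juiced)  # 2.4 2.8 -> 2.4-2.8x speedup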

Try it on your setup.

import torch
from diffusers import FluxPipeline
from pruna_pro import SmashConfig, smash

# Load the base FLUX.1 [dev] pipeline in bfloat16 on the GPU.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Configure the optimization: torch.compile as the compiler plus Taylor caching.
smash_config = SmashConfig()
smash_config["compiler"] = "torch_compile"
smash_config["cacher"] = "taylor_auto"
# lightly juiced (0.6), juiced (0.5), extra juiced (0.4)
smash_config["taylor_auto_speed_factor"] = 0.4

# Apply the configuration with your Pruna Pro token.
smash_token = "<your-token>"
smashed_pipe = smash(
    model=pipe,
    token=smash_token,
    smash_config=smash_config,
)

# Generate an image with the optimized pipeline.
smashed_pipe("A cute, round, knitted purple prune.").images[0]
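
To check the latency on your own hardware, you can time the optimized pipeline directly. The sketch below continues from the snippet above (it reuses smashed_pipe); the prompt, step count, and number of runs are illustrative choices, not the benchmark's exact settings:

import time
import torch

prompt = "A cute, round, knitted purple prune."

# Warm-up run so torch.compile and cache setup are not counted in the timing.
smashed_pipe(prompt, num_inference_steps=28)

runs = 5
torch.cuda.synchronize()
start = time.perf_counter()
for _ in range(runs):
    smashed_pipe(prompt, num_inference_steps=28)
torch.cuda.synchronize()
print(f"average latency: {(time.perf_counter() - start) / runs:.2f} s/image")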

Read the full benchmark: https://www.pruna.ai/blog/flux-juiced-the-fastest-image-generation-endpoint