
How can I get a Model Benchmark?

We’re often asked: “What can Pruna do on the XYZ model?” And the answer depends on your goal. Are you exploring possibilities or validating for real-world use?

We offer two clear paths:

🟡 Simple Overview: If you're just looking for a ballpark view (something like "what can you do on Model X?"), we will either:

  • Share existing benchmark numbers, if we’ve already run the model.

  • Run a quick optimization pass on our side using our optimization agent.

It's free and suitable for early discovery. You’ll get quick signals based on open-source models and general datasets, with no custom tuning.
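For context, a quick optimization pass with the open-source `pruna` package might look roughly like the sketch below. This is an illustration only, not the exact procedure our optimization agent follows; the model choice, the algorithm names ("hqq", "torch_compile"), and the hardware assumption (a CUDA GPU) are examples, and availability depends on your installed version.

```python
# Minimal sketch of a quick, untuned optimization pass with the open-source
# `pruna` package (pip install pruna). Assumes a CUDA GPU and that the chosen
# example algorithms are available in your installed version.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from pruna import SmashConfig, smash

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # any small open-weights model works here
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Generic, off-the-shelf algorithms; no custom tuning or custom datasets.
smash_config = SmashConfig()
smash_config["quantizer"] = "hqq"           # example quantizer
smash_config["compiler"] = "torch_compile"  # example compiler

smashed_model = smash(model=model, smash_config=smash_config)

# Rough latency signal for the smashed model. For a real comparison, also time
# the original model, add warm-up runs, and average over many prompts.
inputs = tokenizer("Summarize what model optimization does.", return_tensors="pt").to("cuda")
start = time.perf_counter()
with torch.inference_mode():
    output = smashed_model.generate(**inputs, max_new_tokens=64)
torch.cuda.synchronize()
print(f"smashed generate: {time.perf_counter() - start:.2f}s")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```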

🟢 Full Benchmark: The go-to path when inference optimization is critical enough to justify time and budget. We replicate your production setup, test multiple strategies, and show whether there’s real performance to gain and ROI to capture.

It all starts with an intake: the Benchmark Request Document, where we collect:

  • The context needed to avoid wrong assumptions and align on your success criteria

  • Your technical environment: hosting provider, hardware, serving framework

  • Your inference setup: latency targets, batch size, evaluation metrics, custom logic

You can fill out a request at bench.pruna.ai
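To make the intake concrete, here is a hypothetical sketch of the kind of information worth gathering before submitting the form. The field names and values below are illustrative only; they are not the actual form schema.

```python
# Hypothetical example of the details a benchmark request typically covers.
# Field names and values are illustrative, not the actual form fields.
benchmark_request = {
    "context": {
        "use_case": "real-time image generation for an e-commerce product",
        "success_criteria": "p95 latency under 800 ms with no visible quality loss",
    },
    "technical_environment": {
        "hosting_provider": "AWS",
        "hardware": "1x NVIDIA L40S per replica",
        "serving_framework": "Triton Inference Server",
    },
    "inference_setup": {
        "latency_target_ms": 800,
        "batch_size": 1,
        "evaluation_metrics": ["p95 latency", "throughput", "output quality"],
        "custom_logic": "LoRA swapping between requests",
    },
}
```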