Containers

Overview

With our Containers service, you can create your own inference endpoints to serve your models while paying only for the compute that is in active use.

We support loading containers from any registry and are quite flexible about how the container is built.

You can deploy your first container by following this guide: Quick: Deploy with vLLM.
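
Once deployed, the endpoint behaves like any HTTP service. Below is a minimal sketch of querying a vLLM-based deployment, which exposes an OpenAI-compatible API; the endpoint URL, model name, and bearer-token header are placeholder assumptions here, so substitute the values shown in your dashboard.

```python
import requests

# Hypothetical endpoint URL and API key: replace with the values from
# your dashboard after deploying the container.
ENDPOINT = "https://containers.datacrunch.io/my-vllm-deployment"
API_KEY = "<your-inference-api-key>"

# vLLM serves an OpenAI-compatible API, so the deployment can be
# queried with a plain HTTP POST to /v1/completions.
response = requests.post(
    f"{ENDPOINT}/v1/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
        "prompt": "Explain serverless containers in one sentence.",
        "max_tokens": 64,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["text"])
```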

Serverless Containers pricing

Price is calculated in 10-minute intervals for the currently running replicas of your container. The number of currently running replicas will depend on your Scaling and health-checks settings.

GPU model            VRAM    Price
NVIDIA H200 SXM5     141GB   $4.125/h
NVIDIA H100 SXM5     80GB    $3.975/h
NVIDIA A100 SXM4     40GB    $1.290/h
NVIDIA L40S          48GB    $1.290/h
NVIDIA RTX6000 Ada   48GB    $1.290/h
General Compute      24GB    $0.890/h
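
As a rough worked example of interval-based billing, the sketch below estimates cost from the hourly rate, replica count, and runtime. Rounding partial intervals up is an assumption made here for illustration, not documented behaviour.

```python
import math

def estimate_cost(hourly_rate: float, replicas: int, minutes: float) -> float:
    """Rough cost estimate for one container deployment.

    Assumes each running replica is billed per started 10-minute
    interval at 1/6 of the hourly rate (rounding partial intervals
    up is an assumption).
    """
    intervals = math.ceil(minutes / 10)
    return replicas * intervals * (hourly_rate / 6)

# Two H100 SXM5 replicas ($3.975/h each) running for 25 minutes:
# 25 min -> 3 started intervals -> 2 * 3 * $0.6625 = $3.975
print(f"${estimate_cost(3.975, replicas=2, minutes=25):.3f}")
```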

Features

  • Scale to hundreds of GPUs when needed with our battle-tested inference cluster

  • Scale to zero when idle, so you only pay while your container is running

  • Support for any container registry, using either registry-specific authentication methods or a vanilla Docker config.json-style auth (see the sketch after this list)

  • Both manual and request queue-based autoscaling, with adjustable scaling sensitivity

  • Logging and metrics in the dashboard

  • RESTful API for managing your deployments
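
For the config.json-style auth, the expected shape is the standard Docker client format: a base64-encoded user:password pair keyed by registry host. Below is a minimal sketch of building such a credential blob; the registry host and credentials are placeholders.

```python
import base64
import json

# Standard Docker config.json auth format: the "auth" field is the
# base64 encoding of "username:password". The registry host and
# credentials below are placeholders.
registry = "registry.example.com"
username = "my-user"
password = "my-token"

auth = base64.b64encode(f"{username}:{password}".encode()).decode()
config = {"auths": {registry: {"auth": auth}}}

print(json.dumps(config, indent=2))
```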

Coming soon

  • Python SDK

  • Shared storage between the Containers and Cloud GPU instances

  • Support for async / polling requests
