Prerequisites

Make sure you have:

  • Access to your webconsole instance (as root user)

Push your model image

Suppose you have a custom model image that you want to deploy to SoraNova. You should first pull the image to your webconsole instance.

docker login <your-custom-model-registry>
docker pull <your-custom-model-image>

For example, if your custom model image is hosted on docker.soracloud.net, you can pull it with the following commands:

docker login docker.soracloud.net # if your registry is private
docker pull docker.soracloud.net/your-custom-model-image:latest
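
Optionally, you can confirm the pull succeeded by listing the image locally (standard Docker, nothing SoraNova-specific):

docker image ls docker.soracloud.net/your-custom-model-image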

Then, push your custom model image to your SoraNova registry with the following command:

# This command pushes your custom image to your SoraNova registry
sora image push <your-custom-model-image> <your-custom-model-image-name>:<your-custom-model-image-tag>
# e.g.
# sora image push docker.soracloud.net/your-custom-model-image:latest your-custom-model-image:latest

The sora CLI is a command-line interface used to interact with your SoraNova instance. If you haven’t installed it yet, follow the Quickstart guide.

Define the deployment recipe

Next, define a new deployment recipe by creating a model.hcl file with the following contents:

The deployment configuration DSL shown below is experimental and subject to change in future releases.

# ----- Variables Block: Defines variables that can be customized during deployment -----

# Amount of RAM (in MiB) allocated to the model container
variable "memory_mib" {
  type        = number
  description = "Memory allocation in MiB for the model task"
  default     = 14360
}

# GPU device name (e.g., 'nvidia/gpu')
variable "gpu_name" {
  type        = string
  description = "GPU name for the model task"
  default     = "nvidia/gpu"
}

# List of GPU memory allocations (in MiB) — one per GPU device
variable "gpu_memory_mibs" {
  type        = list(number)
  description = "GPU memory allocation in MiB for the model task"
  default     = [20480, 20480]
}

# Strategy for GPU sharing — 'mps' enables CUDA MPS for multi-process sharing
variable "sharing_strategy" {
  type        = string
  description = "GPU sharing strategy for the model task"
  default     = "mps"
}

# Number of replicas (i.e., how many copies of the model to run)
variable "replica_count" {
  type        = number
  description = "Number of replicas for the model task"
  default     = 1
}

# ----- Application Block: Defines the application configuration -----
application {
  name        = "My Custom Model"  # Display name of your application
  summary     = "https://huggingface.co/your-company/custom-model"  # Optional link to model card or docs
  description = "Adding of a custom model"  # Short description of this deployment

  # Define a service named "model"
  service "model" {
    name = "your-model"  # Internal service name used for routing and discovery.

    # Define a task group to run one or more identical tasks (replicas)
    task_group "model" {
      count = "${var.replica_count}"  # Number of replicas to launch

      task {
        name   = "your-model-name"  # Unique name for the task
        driver = "docker"           # Use Docker as the container runtime

        config {
          image    = "your-custom-model-image:latest"  # Docker image to run
          ipc_mode = "host"                            # Share IPC namespace with host (Optional - useful for some GPU workloads)
          args     = [
            "--host", "0.0.0.0",                       # Pass args the image needs
            "--port", "5000"
          ]
        }

        # Define an endpoint for the model server
        endpoint llm {
          port = 5000  # Port exposed by the model

          health {                    # Health check - used to determine when the service is up and healthy
            type     = "http"         # Health check type (HTTP-based)
            path     = "/health"      # Path used for liveness check
            interval = "10s"          # How often to check health
            timeout  = "2s"           # How long to wait before marking as unhealthy
          }
        }

        # Declare the hardware and resource requirements
        resources {
          cpu_mhz    = 10240  # CPU requested for this task (in MHz)
          memory_mib = "${var.memory_mib}"  # Memory allocated to the task

          sharing_strategy = "${var.sharing_strategy}"  # GPU sharing strategy

          device {
            name        = "${var.gpu_name}"  # GPU device name
            memory_mibs = "${var.gpu_memory_mibs}"  # GPU memory for each device
          }
        }
      }
    }
  }
}

The health check path must be served by your image; SoraNova uses it to determine when the model is up and healthy.
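
If you want to verify the health route before seeding the recipe, you can run the image locally with Docker and probe the endpoint yourself. This is only a sketch: it assumes your image listens on port 5000, serves /health, and accepts the same arguments as configured above; adjust to match your server.

# Start the container in the background (assumes the entrypoint accepts these args)
docker run -d --name model-health-check -p 5000:5000 your-custom-model-image:latest --host 0.0.0.0 --port 5000
# Probe the same path SoraNova will use; -f makes curl exit non-zero on HTTP errors
curl -f http://localhost:5000/health
# Clean up
docker rm -f model-health-check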

Deploy and Interact

To deploy the model:

sora recipe seed model.hcl
sora recipe list
sora recipe deploy <recipe-slug> # replace with the slug from the previous command

After deployment, list your models using:

sora model list

If your model has a /openapi.json endpoint, you can inspect its API using sora model api <model-slug>.
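
Once the model is live, you can send it a request directly. The example below is only a sketch: it assumes your image serves an OpenAI-compatible /v1/chat/completions route (common for vLLM-style servers) and that <model-endpoint> is the address reported for your deployment; substitute whatever API your model actually exposes.

curl http://<model-endpoint>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "your-model", "messages": [{"role": "user", "content": "Hello!"}]}'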

🎉 That’s it — your model is now live and ready to serve requests. Happy building!