Prerequisites
Make sure you have:
- Access to your webconsole instance (as root user)
Push your model image
Suppose you have a custom model image that you want to deploy to SoraNova. You should first pull the image to your webconsole instance.
docker login <model-registry>
docker pull <model-image>
For example, if you have a custom model image hosted on docker.soracloud.net
, you can pull it with the following command:
docker login docker.soracloud.net # if your registry is private
docker pull docker.soracloud.net/model-image:latest
Then, you can seed your custom model with the following command:
# This command pushes your custom image to your SoraNova registry
sora image push <repo>/<model-image> <model-image>:<image-tag>
# e.g.
# sora image push docker.soracloud.net/model-image:latest model-image:latest
The sora
CLI is a command-line interface used to interact with your SoraNova instance. If you haven’t installed it yet, follow the Quickstart guide.
Define the deployment recipe
You can deploy it by creating a new recipe by creating a model.hcl
file with the following contents:
The deployment configuration DSL shown below is experimental and subject to change in future releases.
# ----- Variables Block: Defines variables that can be customized during deployment -----
# GPU device name (e.g., 'nvidia/gpu')
variable "gpu_name" {
type = string
description = "GPU name for the model task"
default = "nvidia/gpu"
}
# List of GPU memory allocations (in MiB) — one per GPU device
variable "gpu_memory_mibs" {
type = list(number)
description = "GPU memory allocation in MiB for the model task"
default = [20480, 20480]
}
# Strategy for GPU sharing — 'mps' enables CUDA MPS for multi-process sharing
variable "sharing_strategy" {
type = string
description = "GPU sharing strategy for the model task"
default = "mps"
}
# Number of replicas (i.e., how many copies of the model to run)
variable "replica_count" {
type = number
description = "Number of replicas for the model task"
default = 1
}
# ----- Model Block: Defines the application configuration -----
model {
name = "your-custom-model-name" # Display name of your model. Has to be unique.
count = "${var.replica_count}"
config {
image = "your-model-name:latest"
args = [] # Arguments to pass to the model container
}
env {
# Add any environment variables needed for your model
}
endpoint llm {
port = 5000
health {
type = "http"
path = "/health"
interval = "10s"
timeout = "2s"
}
}
resources {
cpu_mhz = 10240
memory_mib = 14360
sharing_strategy = "${var.sharing_strategy}"
device {
name = "${var.gpu_name}"
memory_mibs = "${var.gpu_memory_mibs}"
}
}
}
health
needs to be defined by the image to tell SoraNova how to check if the model is healthy.
Deploy and Interact
To deploy the model:
sora recipe seed model.hcl
sora recipe list
sora recipe deploy <recipe-slug> # replace with the slug from the previous command
After deployment, list your models using:
🎉 That’s it — your model is now live and ready to serve requests. Happy building!