Overview
SoraNova lets you deploy two main types of applications:
- Standard applications: These can include a backend API, a frontend, or both, and are exposed via endpoints.
- Daemon jobs: Background services that run on every node in your cluster (e.g., log shippers, monitoring agents).
Below are examples for both types.
Example 1: Deploying a Custom Application (Frontend + Backend)
Suppose you have two container images:
- A backend API (my-backend:latest)
- A frontend web app (my-frontend:latest)
You want to deploy both, and have the frontend know the backend’s URL via an injected environment variable.
First, push your images to your SoraNova registry:
docker pull <backend-image>
docker pull <frontend-image>
sora image push <backend-image> my-backend:latest
sora image push <frontend-image> my-frontend:latest
Then, create an application recipe, e.g. my-app.hcl:
variable "replica_count" {
type = number
default = 1
}
application {
name = "My Web App"
summary = "A simple frontend and backend application"
description = "This app demonstrates deploying a backend API and a frontend, with the frontend receiving the backend URL via an environment variable."
tags = ["frontend", "backend", "example"]
service "backend" {
name = "backend"
task_group "backend" {
name = "backend-task-group"
count = "${var.replica_count}"
task {
name = "backend"
driver = "docker"
config {
image = "my-backend:latest"
}
resources {
cpu_mhz = 1000
memory_mib = 1024
}
env {
# key = "value"
}
endpoint api {
port = 8080
health {
type = "http"
path = "/health"
interval = "10s"
timeout = "2s"
}
}
}
}
}
service "frontend" {
name = "frontend"
task_group "frontend" {
name = "frontend-task-group"
count = "${var.replica_count}"
task {
driver = "docker"
config {
image = "my-frontend:latest"
}
env = {
BACKEND_URL = "${service.backend}"
}
resources {
cpu_mhz = 1000
memory_mib = 1024
}
endpoint web {
port = 3000
health {
type = "http"
path = "/"
interval = "10s"
timeout = "2s"
}
}
}
}
config {
depends_on = ["backend"]
}
}
}
The ${service.backend} variable injects the backend's endpoint URL into the frontend container as the BACKEND_URL environment variable.
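For instance, the frontend process can read this variable at startup. Below is a minimal sketch, assuming a Python process and assuming the injected value is a full URL; your actual frontend stack will differ:

import os
import urllib.request

# BACKEND_URL is injected by SoraNova from ${service.backend} at deploy time
backend_url = os.environ["BACKEND_URL"]

# e.g. verify the backend is reachable before serving traffic
with urllib.request.urlopen(f"{backend_url}/health") as resp:
    print("backend status:", resp.status)  # 200 once the backend is healthy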
To deploy:
sora recipe seed my-app.hcl
sora recipe list
sora recipe deploy <recipe-slug>
Example 2: Deploying a Daemon Application
Daemon applications are background applications that run on every node in your SoraNova cluster. They are ideal for log shippers, monitoring agents, or any persistent process that should be present on all nodes.
Below is a sample configuration for deploying Vector as a log shipper daemon. This will collect logs from /var/log and Docker, and can be configured to push them to an external HTTP endpoint.
variable "custom_sink_uri" {
type = string
}
application {
name = "Vector Log Daemon"
summary = "High-performance log collection and forwarding daemon that efficiently pushes logs to various sinks."
description = "Vector is a lightweight, ultra-fast tool for building observability pipelines. It collects, transforms, and routes logs from multiple sources to various sinks with minimal resource overhead and maximum reliability."
category = "daemon"
tags = ["logging", "observability", "daemon"]
daemon {
name = "vector"
config {
image = "timberio/vector:latest-debian"
volumes = [
"/var/log:/var/log:ro"
]
args = [
"--config-dir", "/local/etc/vector"
]
}
template {
data = <<EOF
[sources.my_source]
type = "file"
include = ["/var/log/**/*.log"]
fingerprint.strategy = "device_and_inode"
[sources.docker_logs]
type = "docker_logs"
[sinks.my_sink]
type = "http"
inputs = ["my_source", "docker_logs"]
uri = "${var.custom_sink_uri}"
encoding.codec = "json"
EOF
destination = "/local/etc/vector/vector.toml"
}
resources {
cpu_mhz = 100
memory_mib = 1024
}
}
}
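Before seeding the recipe, you can sanity-check the Vector config locally using Vector's built-in validate command. A sketch, assuming Docker is available, the template body is saved as vector.toml, and a real URI has been substituted for ${var.custom_sink_uri}:

docker run --rm \
  -v "$PWD/vector.toml:/etc/vector/vector.toml:ro" \
  timberio/vector:latest-debian validate /etc/vector/vector.toml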
Example 3: Deploying a GPU-Accelerated Custom Application (Frontend + Backend)
Suppose you have:
- A custom ML backend image (my-gpu-backend:latest)
- A custom frontend image (my-frontend:latest)
First, pull and push your images to your SoraNova registry:
docker pull <your-backend-image>
docker pull <your-frontend-image>
sora image push <your-backend-image> my-gpu-backend:latest
sora image push <your-frontend-image> my-frontend:latest
Then, create an application recipe, e.g. my-gpu-app.hcl:
variable "memory_mib" {
type = number
description = "Memory allocation in MiB for the backend task"
default = 14360
}
variable "gpu_name" {
type = string
description = "GPU name for the backend task"
default = "nvidia/gpu/NVIDIA L4"
}
variable "gpu_memory_mibs" {
type = list(number)
description = "GPU memory allocation in MiB for the backend task"
default = [10240] # request for a node with at least 1 gpu having 10,240 MiB of memory by default
}
variable "sharing_strategy" {
type = string
description = "GPU sharing strategy for the backend task"
default = "mps"
}
variable "replica_count" {
type = number
description = "Number of replicas for the backend task"
default = 1
}
application {
name = "Custom GPU Web App"
summary = "A GPU-accelerated backend with a frontend"
description = "This app demonstrates deploying a GPU-accelerated backend and a frontend, with the frontend receiving the backend URL via an environment variable."
tags = ["frontend", "backend", "gpu", "example"]
service "backend" {
name = "gpu-backend"
task_group "backend" {
name = "gpu-backend-task-group"
count = "${var.replica_count}"
task {
name = "gpu-backend"
driver = "docker"
config {
image = "my-gpu-backend:latest"
ipc_mode = "host" # Use host IPC for GPU access
}
env {
# Add any required environment variables for your backend here
}
endpoint api {
port = 5000
health {
type = "http"
path = "/health"
interval = "10s"
timeout = "2s"
}
}
resources {
cpu_mhz = 10240
memory_mib = "${var.memory_mib}"
sharing_strategy = "${var.sharing_strategy}"
device {
name = "${var.gpu_name}"
memory_mibs = "${var.gpu_memory_mibs}"
}
}
}
}
}
service "frontend" {
name = "frontend"
task_group "frontend" {
name = "frontend-task-group"
count = 1
task {
name = "frontend"
driver = "docker"
config {
image = "my-frontend:latest"
}
env = {
BACKEND_URL = "${service.gpu-backend}"
}
resources {
memory_mib = 4096
}
endpoint web {
port = 3000
health {
type = "http"
path = "/"
interval = "10s"
timeout = "2s"
}
}
}
}
config {
depends_on = ["gpu-backend"]
}
}
}
Note: Your custom images must implement their respective health check endpoints, which must return a 200 status code when the service is healthy. These endpoints are used by the orchestration platform to determine when your service is up and ready to receive traffic.
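Below is a minimal sketch of such an endpoint, assuming a Python backend listening on the port declared in the recipe; any language or framework works, as long as the path returns 200 when the service is ready:

from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # A 200 response tells the platform this instance is ready for traffic
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

# Port 5000 matches the backend endpoint declared in the recipe above
HTTPServer(("0.0.0.0", 5000), Handler).serve_forever()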
The ${service.gpu-backend} variable injects the backend's endpoint URL into the frontend container as the BACKEND_URL environment variable.
To deploy:
sora recipe seed my-gpu-app.hcl
sora recipe list
sora recipe deploy <recipe-slug>
Example 4: Deploying a GPU-Accelerated Daemon (Echo Server Example)
You can run any GPU-enabled daemon on every node that meets your GPU requirements. For example, here’s how to deploy a simple echo server as a GPU daemon. You can replace the image with your own.
First, pull and push the image to your SoraNova registry:
docker pull jmalloc/echo-server
sora image push jmalloc/echo-server echo-server:latest
Then, create an application recipe, e.g. gpu-echo-daemon.hcl:
application {
  name        = "GPU Job Processor"
  summary     = "A GPU-accelerated job processing backend"
  description = "This service fetches jobs from a queue, processes them using the GPU, stores results, and repeats."
  tags        = ["gpu", "job-processing", "backend", "example"]

  daemon {
    name = "gpu-job-processor"

    config {
      image    = "echo-server:latest"
      ipc_mode = "host" # Use host IPC for direct access to the GPU
    }

    resources {
      cpu_mhz          = 10240
      memory_mib       = 10240
      sharing_strategy = "mps"

      device {
        name        = "nvidia/gpu"
        memory_mibs = [10240]
      }
    }

    # endpoint - only required if the daemon needs to be accessible via HTTP.
    # If not specified, the daemon will run in the background without an HTTP endpoint.
    endpoint api {
      port   = 8080 # container port
      static = 8080 # static port on the host

      health {
        type     = "http"
        path     = "/"
        interval = "30s"
        timeout  = "5s"
      }
    }

    env {
      # Example: JOB_QUEUE_URL, RESULT_STORE_URL, etc.
      # JOB_QUEUE_URL = "amqp://my-queue"
      # RESULT_STORE_URL = "s3://my-bucket/results"
    }
  }
}
You can replace echo-server:latest with any image you want. The daemon will run on all nodes with a GPU that meets the specified requirements. When new machines are provisioned, they will automatically start running this daemon if they have the required resources.
To deploy:
sora recipe seed gpu-echo-daemon.hcl
sora recipe list
sora recipe deploy <recipe-slug>
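Once deployed, you can spot-check the daemon from any machine that can reach a node, since the endpoint is pinned to static host port 8080 (here, <node-address> is a hypothetical stand-in for a real node IP or hostname):

curl http://<node-address>:8080/

The echo server replies with the details of the request it received.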
Key Points
- Standard applications: Use the service block for frontend/backend workloads. Use ${service.<name>} to inject service URLs.
- Daemon applications run on every node: Use them for log collection, monitoring, or any background process.
- Resource allocation: Use the resources block to control CPU and memory usage.
- GPU allocation: Use the device block to specify GPU requirements. You can target specific GPU models or specify memory requirements (see the sketch after this list).
- GPU sharing: Use sharing_strategy = "mps" to enable CUDA MPS for multi-process sharing on the same GPU.
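For reference, here are the two GPU request styles used in the examples above, side by side (a sketch; the exact model strings available depend on your cluster's hardware):

resources {
  sharing_strategy = "mps" # share one GPU across processes via CUDA MPS

  # Any NVIDIA GPU with at least 10,240 MiB of memory:
  device {
    name        = "nvidia/gpu"
    memory_mibs = [10240]
  }

  # Or target a specific model, as in Example 3:
  # device {
  #   name        = "nvidia/gpu/NVIDIA L4"
  #   memory_mibs = [10240]
  # }
}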
Deploying
- Save your configuration to a file (e.g., my-app.hcl or vector-daemon.hcl).
- Deploy it using the SoraNova CLI:
sora recipe seed <your-file>.hcl
sora recipe list
sora recipe deploy <recipe-slug>
- Your application or daemon will now run in your cluster.
🎉 That’s it! You now know how to deploy both standard applications and daemon applications on your SoraNova cluster.