Overview

SoraNova lets you deploy two main types of applications:

  • Standard applications: These can include a backend API, a frontend, or both, and are exposed via endpoints.
  • Daemon applications: Background services that run on every node in your cluster (e.g., log shippers, monitoring agents).

Below are examples for both types.


Example 1: Deploying a Custom Application (Frontend + Backend)

Suppose you have two container images:

  • A backend API (my-backend:latest)
  • A frontend web app (my-frontend:latest)

You want to deploy both, and have the frontend discover the backend’s URL through an injected environment variable.

First, pull your images and push them to your SoraNova registry:

docker pull <backend-image>
docker pull <frontend-image>
sora image push <backend-image> my-backend:latest
sora image push <frontend-image> my-frontend:latest
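
For example, assuming your source images are published as acme/my-backend and acme/my-frontend (hypothetical names; substitute your own image references):

docker pull acme/my-backend:latest
docker pull acme/my-frontend:latest
sora image push acme/my-backend:latest my-backend:latest
sora image push acme/my-frontend:latest my-frontend:latest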

Then, create an application recipe, e.g. my-app.hcl:

variable "replica_count" {
  type    = number
  default = 1
}

application {
  name        = "My Web App"
  summary     = "A simple frontend and backend application"
  description = "This app demonstrates deploying a backend API and a frontend, with the frontend receiving the backend URL via an environment variable."
  tags        = ["frontend", "backend", "example"]

  service "backend" {
    name = "backend"

    task_group "backend" {
      name  = "backend-task-group"
      count = "${var.replica_count}"

      task {
        name  = "backend"
        driver = "docker"
        config {
          image = "my-backend:latest"
        }

        resources {
          cpu_mhz    = 1000
          memory_mib = 1024
        }

        env {
          # key = "value"
        }

        endpoint api {
          port = 8080
          health {
            type     = "http"
            path     = "/health"
            interval = "10s"
            timeout  = "2s"
          }
        }
      }

    }
  }

  service "frontend" {
    name = "frontend"

    task_group "frontend" {
      name  = "frontend-task-group"
      count = "${var.replica_count}"

      task {
        name   = "frontend"
        driver = "docker"
        config {
          image = "my-frontend:latest"
        }

        env {
          BACKEND_URL = "${service.backend}"
        }

        resources {
          cpu_mhz    = 1000
          memory_mib = 1024
        }

        endpoint web {
          port = 3000
          health {
            type     = "http"
            path     = "/"
            interval = "10s"
            timeout  = "2s"
          }
        }
      }
    }
    config {
      depends_on = ["backend"]
    }
  }
}

The ${service.backend} variable injects the backend’s endpoint URL into the frontend container as the BACKEND_URL environment variable.
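
Inside the frontend container, BACKEND_URL is then an ordinary environment variable. As a minimal sketch (Python, standard library only; the exact URL format depends on your endpoint configuration), application code might read it like this:

import os
import urllib.request

# BACKEND_URL is injected by the ${service.backend} interpolation in the recipe.
backend_url = os.environ["BACKEND_URL"]

# Probe the backend's /health endpoint defined in the recipe above.
with urllib.request.urlopen(f"{backend_url}/health") as resp:
    print(resp.status)  # prints 200 once the backend is healthy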

To deploy:

sora recipe seed my-app.hcl
sora recipe list
sora recipe deploy <recipe-slug>

Example 2: Deploying a Daemon Application

Daemon applications are background services that run on every node in your SoraNova cluster. They are ideal for log shippers, monitoring agents, or any persistent process that should be present on all nodes.
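
At its simplest, a daemon recipe is an application block with category = "daemon" wrapping a daemon block that names the process, its image, and its resources. Here is a minimal sketch with placeholder names (the full Vector example below adds volumes, arguments, and a config template):

application {
  name     = "My Daemon"
  summary  = "A placeholder background service"
  category = "daemon"
  tags     = ["daemon"]

  daemon {
    name = "my-daemon"

    config {
      image = "my-daemon:latest"
    }

    resources {
      cpu_mhz    = 100
      memory_mib = 256
    }
  }
}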

Below is a sample configuration that deploys Vector as a log-shipping daemon. It collects logs from /var/log and from Docker containers and pushes them to the external HTTP endpoint given by the custom_sink_uri variable.

variable "custom_sink_uri" {
    type = string
}

application {
    name        = "Vector Log Daemon"
    summary     = "High-performance log collection and forwarding daemon that efficiently pushes logs to various sinks."
    description = "Vector is a lightweight, ultra-fast tool for building observability pipelines. It collects, transforms, and routes logs from multiple sources to various sinks with minimal resource overhead and maximum reliability."
    category    = "daemon"
    tags        = ["logging", "observability", "daemon"]

  daemon {
    name = "vector"

    config {
      image = "timberio/vector:latest-debian"

      volumes = [
        "/var/log:/var/log:ro"
      ]

      args = [
        "--config-dir", "/local/etc/vector"
      ]
    }

    template {
      data = <<EOF
[sources.my_source]
type = "file"
include = ["/var/log/**/*.log"]
fingerprint.strategy = "device_and_inode"

[sources.docker_logs]
type = "docker_logs"

[sinks.my_sink]
type = "http"
inputs = ["my_source", "docker_logs"]
uri = "${var.custom_sink_uri}"
encoding.codec = "json"

EOF

      destination = "/local/etc/vector/vector.toml"
    }

    resources {
      cpu_mhz    = 100
      memory_mib = 1024
    }
  }
}
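
Save this as e.g. vector-daemon.hcl, then deploy it like any other recipe:

sora recipe seed vector-daemon.hcl
sora recipe list
sora recipe deploy <recipe-slug>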

Example 3: Deploying a GPU-Accelerated Custom Application (Frontend + Backend)

Suppose you have:

  • A custom ML backend image (my-gpu-backend:latest)
  • A custom frontend image (my-frontend:latest)

First, pull and push your images to your SoraNova registry:

docker pull <your-backend-image>
docker pull <your-frontend-image>
sora image push <your-backend-image> my-gpu-backend:latest
sora image push <your-frontend-image> my-frontend:latest

Then, create an application recipe, e.g. my-gpu-app.hcl:

variable "memory_mib" {
  type        = number
  description = "Memory allocation in MiB for the backend task"
  default     = 14360
}

variable "gpu_name" {
  type        = string
  description = "GPU name for the backend task"
  default     = "nvidia/gpu/NVIDIA L4"
}

variable "gpu_memory_mibs" {
  type        = list(number)
  description = "GPU memory allocation in MiB for the backend task"
  default     = [10240] # by default, request a node with at least one GPU that has 10,240 MiB of memory
}

variable "sharing_strategy" {
  type        = string
  description = "GPU sharing strategy for the backend task"
  default     = "mps"
}

variable "replica_count" {
  type        = number
  description = "Number of replicas for the backend task"
  default     = 1
}

application {
  name        = "Custom GPU Web App"
  summary     = "A GPU-accelerated backend with a frontend"
  description = "This app demonstrates deploying a GPU-accelerated backend and a frontend, with the frontend receiving the backend URL via an environment variable."
  tags        = ["frontend", "backend", "gpu", "example"]

  service "backend" {
    name = "gpu-backend"

    task_group "backend" {
      name  = "gpu-backend-task-group"
      count = "${var.replica_count}"

      task {
        name   = "gpu-backend"
        driver = "docker"
        config {
          image = "my-gpu-backend:latest"
          ipc_mode = "host" # Use host IPC for GPU access
        }

        env {
          # Add any required environment variables for your backend here
        }

        endpoint api {
          port = 5000
          health {
            type     = "http"
            path     = "/health"
            interval = "10s"
            timeout  = "2s"
          }
        }

        resources {
          cpu_mhz          = 10240
          memory_mib       = "${var.memory_mib}"
          sharing_strategy = "${var.sharing_strategy}"
          device {
            name        = "${var.gpu_name}"
            memory_mibs = "${var.gpu_memory_mibs}"
          }
        }
      }
    }
  }

  service "frontend" {
    name = "frontend"

    task_group "frontend" {
      name  = "frontend-task-group"
      count = 1

      task {
        name   = "frontend"
        driver = "docker"
        config {
          image = "my-frontend:latest"
        }

        env {
          BACKEND_URL = "${service.gpu-backend}"
        }

        resources {
          memory_mib = 4096
        }

        endpoint web {
          port = 3000
          health {
            type     = "http"
            path     = "/"
            interval = "10s"
            timeout  = "2s"
          }
        }
      }
    }

    config {
      depends_on = ["gpu-backend"]
    }
  }
}

Note: Your custom images must implement their respective health check endpoints, which return a 200 status code when the service is healthy. The orchestration platform uses these endpoints to determine when your service is up and ready to receive traffic. The ${service.gpu-backend} variable injects the backend’s endpoint URL into the frontend container as the BACKEND_URL environment variable.
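
A health endpoint can be as simple as an HTTP handler that answers 200. Here is a minimal Python sketch using only the standard library (your real backend will likely use its own framework; port 5000 matches the api endpoint above):

from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer 200 on /health so the platform marks the task healthy.
        self.send_response(200 if self.path == "/health" else 404)
        self.end_headers()

HTTPServer(("0.0.0.0", 5000), HealthHandler).serve_forever()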

To deploy:

sora recipe seed my-gpu-app.hcl
sora recipe list
sora recipe deploy <recipe-slug>

Example 4: Deploying a GPU-Accelerated Daemon (Echo Server Example)

You can run any GPU-enabled daemon on every node that meets your GPU requirements. For example, here’s how to deploy a simple echo server as a GPU daemon. You can replace the image with your own.

First, pull and push the image to your SoraNova registry:

docker pull jmalloc/echo-server
sora image push jmalloc/echo-server echo-server:latest

Then, create an application recipe, e.g. gpu-echo-daemon.hcl:

application {
  name        = "GPU Job Processor"
  summary     = "A GPU-accelerated job processing backend"
  description = "This service fetches jobs from a queue, processes them using the GPU, stores results, and repeats."
  tags        = ["gpu", "job-processing", "backend", "example"]

  daemon {
    name = "gpu-job-processor"

    config {
      image = "echo-server:latest"
      ipc_mode = "host" # Use host networking for direct access to the GPU
    }

    resources {
      cpu_mhz          = 10240
      memory_mib       = 10240
      sharing_strategy = "mps"
      device {
        name = "nvidia/gpu"
        memory_mibs = [10240]
      }
    }

    # endpoint is only required if the daemon must be reachable over HTTP.
    # If omitted, the daemon runs in the background without an HTTP endpoint.
    endpoint api {
      port   = 8080 # container port
      static = 8080 # static port on the host
      health {
        type     = "http"
        path     = "/"
        interval = "30s"
        timeout  = "5s"
      }
    }

    env {
      # Example: JOB_QUEUE_URL, RESULT_STORE_URL, etc.
      # JOB_QUEUE_URL    = "amqp://my-queue"
      # RESULT_STORE_URL = "s3://my-bucket/results"
    }
  }
}

You can replace echo-server:latest with any image you want. The daemon will run on all nodes with a GPU that meets the specified requirements. When new machines are provisioned, they will automatically start running this daemon if they have the required resources.

To deploy:

sora recipe seed gpu-echo-daemon.hcl
sora recipe list
sora recipe deploy <recipe-slug>

Key Points

  • Standard applications: Use the service block for frontend/backend workloads. Use ${service.<name>} to inject service URLs.
  • Daemon applications: Run on every node; use them for log collection, monitoring, or any background process.
  • Resource allocation: Use the resources block to control CPU and memory usage.
  • GPU allocation: Use the device block to specify GPU requirements. You can target specific GPU models or set minimum memory requirements (see the fragment below).
  • GPU sharing: Use sharing_strategy = "mps" to enable CUDA MPS for multi-process sharing on the same GPU.
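
For reference, the resources block from Example 3 combines GPU allocation and sharing (values are that example’s defaults; adjust for your hardware):

resources {
  cpu_mhz          = 10240
  memory_mib       = 14360
  sharing_strategy = "mps"                # share the GPU across processes via CUDA MPS
  device {
    name        = "nvidia/gpu/NVIDIA L4"  # target a specific GPU model
    memory_mibs = [10240]                 # at least one GPU with 10,240 MiB of memory
  }
}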

Deploying

  1. Save your configuration to a file (e.g., my-app.hcl or vector-daemon.hcl).

  2. Deploy it using the SoraNova CLI:

    sora recipe seed <your-file>.hcl
    sora recipe list
    sora recipe deploy <recipe-slug>
    
  3. Your application or daemon will now run in your cluster.


🎉 That’s it! You now know how to deploy both standard applications and daemon applications on your SoraNova cluster.