Compute
Pick the GPU type for your replica containers.
Container
Replica config for container image, entrypoint, disk size, and more.
Scaling
Scaling config for replicas. You can control the minimum, the maximum, and how fast to scale.
Environment
Env variables to expose to your replica container.
App Config Details
Let’s get started. Follow these steps to create the app.
1
Compute
Pick the best GPU option depending on your application.
2
Container
Pick any name
Define the container image. It can be hosted on Docker Hub, GCR, or any other private repository.
Example:
8scale/hello-world:latest
Use the UI to create auth if necessary. Auth secrets are encrypted before they are stored.
Define a container entrypoint to run a different service than the default.
Temporary disk in GiB given to a replica container while it is active. It is removed when the replica is no longer active.
Persistent disk in GiB given to a replica container. It persists even when the replica is not active, so it can be used to cache artifacts. Multiple replicas on the same server share the same cache volume.
The persistent disk is mounted in the replica container at this path. If this path already exists in the container image, it is overwritten by this volume and its data.
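As an illustration of using the persistent disk as an artifact cache, here is a minimal Python sketch. The `get_cached` helper, the `/cache` mount path, and the `model.bin` file name are all hypothetical and not part of the platform. The helper writes to a temporary file first and then renames it, on the assumption that several replicas may share the same volume.

```python
import os
from pathlib import Path


def get_cached(cache_dir, name, build):
    """Return the path of a cached artifact, creating it on a cache miss.

    cache_dir: the persistent disk mount path (hypothetical, e.g. /cache).
    name:      file name of the artifact inside the cache.
    build:     zero-argument callable returning the artifact as bytes.
    """
    cache_dir = Path(cache_dir)
    target = cache_dir / name
    if target.exists():
        # Cache hit: an earlier run already stored the artifact.
        return target

    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a per-process temp file, then rename into place, so a replica
    # sharing the volume never reads a partially written artifact.
    tmp = cache_dir / f"{name}.{os.getpid()}.tmp"
    tmp.write_bytes(build())
    os.replace(tmp, target)
    return target
```

A call such as `get_cached("/cache", "model.bin", download_model)` would then run the hypothetical `download_model` only when the file is not already on the volume.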
3
Scaling
Minimum replicas to keep active at all times. If set to 0, cost will be lower, but the first request may incur a longer cold start while replicas scale up to become active.
Maximum replicas the app can scale to during traffic spikes. You can also use this setting to cap the maximum spend per app.
Number of requests a single replica can handle. If set to 1, every request requires its own replica and causes the app to scale up. If set to 20, the first replica can handle up to 20 requests before a scaling event starts another replica to handle more traffic.
When a running replica gets no traffic, it waits this many seconds before scaling down and changing to the idle state. This can help curb premature scale-down and scale-up events that cause more cold starts. Keeping app replicas active for longer can give your users a better experience.
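To see how these four settings interact, the following Python sketch models them with simple arithmetic. It is only a mental model under stated assumptions, not the platform's actual autoscaler: the function and parameter names are hypothetical, and it assumes requests are counted as in-flight requests at a given moment.

```python
import math


def desired_replicas(in_flight, per_replica, min_replicas, max_replicas):
    """Replicas needed for the current load, clamped to the configured range.

    in_flight:   requests currently being served (assumed measure of load).
    per_replica: number of requests a single replica can handle.
    """
    needed = math.ceil(in_flight / per_replica)
    return max(min_replicas, min(max_replicas, needed))


def should_scale_down(idle_seconds, idle_timeout):
    """True once a replica has gone without traffic for the full timeout."""
    return idle_seconds >= idle_timeout
```

With 20 requests per replica, `desired_replicas(20, 20, 0, 5)` stays at 1 and `desired_replicas(21, 20, 0, 5)` moves to 2. With a minimum of 0 and no traffic the result is 0, which is the case where the next request pays the cold start.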
4
Environment
Define any environment variables your replica container may require.
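Inside the replica container, the application can read these values from its process environment. The sketch below shows one way to do that in Python. `MODEL_NAME` and `LOG_LEVEL` are hypothetical variable names chosen for illustration, and `load_settings` is not a platform API.

```python
import os


def load_settings(environ=os.environ):
    """Read app settings from environment variables.

    MODEL_NAME and LOG_LEVEL are hypothetical names; use whichever
    variables you defined for your replica container.
    """
    return {
        # Required: raises KeyError at startup if the variable is missing.
        "model_name": environ["MODEL_NAME"],
        # Optional: falls back to a default when the variable is not set.
        "log_level": environ.get("LOG_LEVEL", "INFO"),
    }
```

Failing fast on a missing required variable makes a misconfigured app visible when the replica starts, not on the first request.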