Compute
Pick the GPU type for your replica containers.
Container
Replica config for container image, entrypoint, disk size, and more.
Scaling
Scaling config for replicas. You can control the minimum and maximum number of replicas and how fast to scale.
Environment
Env variables to expose to your replica container.
https://{app-id}.8scale.app
App Replicas
An app runs as a set of replicas, and the number of replicas determines the scale of the app. You can specify the maximum number of replicas in your config. You can also set the minimum number of replicas above zero so that some replicas are always running, which gives the lowest latency for first requests. Each replica runs a container and sends telemetry such as logs and metrics back to your app. While replica and container may sound redundant, a replica is a higher-level element than a container because it has a longer lifecycle.
Replica Port
By default, you must expose your REST API on port 80. To use another port, define the env variable PORT with an integer value. 8Scale automatically forwards traffic from https://{app-id}.8scale.app to your replicas.
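As an illustration of the port rule above, here is a minimal Python sketch of how a container entrypoint might choose its listening port. The helper name `resolve_port` and the range check are assumptions made for this example and are not part of 8Scale. Only the PORT variable and the default of 80 come from the text above.

```python
import os

DEFAULT_PORT = 80  # default port stated in the section above


def resolve_port(env=None):
    """Return the port the REST API should listen on.

    Hypothetical helper: reads PORT from the environment and falls back
    to 80 when the variable is unset or empty.
    """
    env = os.environ if env is None else env
    raw = env.get("PORT")
    if raw is None or raw.strip() == "":
        return DEFAULT_PORT
    port = int(raw)  # raises ValueError if PORT is not an integer
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

For example, `resolve_port({"PORT": "8080"})` returns 8080, while an empty environment yields 80. Failing fast on a malformed value is a design choice of this sketch, so that a misconfigured replica stops at startup instead of listening on an unexpected port.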
Replica Health
GET /health or /ping must return a 200 status
GET /health or /ping can return a 204 status while the container is initializing or loading a model
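The health rules above could be served along these lines. This is a sketch using only the Python standard library, not an 8Scale SDK. The names `ready`, `health_status`, `load_model`, and `serve` are hypothetical, and the sketch assumes that 200 is what signals a replica as ready once loading has finished.

```python
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical readiness flag: set once initialization has finished.
ready = threading.Event()


def health_status(is_ready):
    """Map readiness to the status codes described above."""
    return 200 if is_ready else 204


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in ("/health", "/ping"):
            status = health_status(ready.is_set())
            self.send_response(status)
            if status == 200:
                # Empty body; a 204 response carries no Content-Length.
                self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_error(404)


def load_model():
    """Placeholder for slow startup work (assumption for illustration)."""
    # ... load weights, warm caches, etc. ...
    ready.set()


def serve():
    """Start loading in the background and serve health checks meanwhile."""
    port = int(os.environ.get("PORT", "80"))
    threading.Thread(target=load_model, daemon=True).start()
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` from the container entrypoint starts the server. Loading runs in a background thread so that the health endpoint can answer 204 during initialization instead of refusing connections.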