Configuration
systemg uses YAML files to define services and their relationships.
Complete example
version: "1"
env:
  vars:
    APP_ENV: "production"
logs:
  sink: file
  max_bytes: 10485760
  max_files: 5
status:
  snapshot_mode: summary
  snapshot_interval_secs: 5
services:
  postgres:
    command: "postgres -D /var/lib/postgresql/data"
    restart_policy: "always"
  redis:
    command: "redis-server /etc/redis/redis.conf"
    restart_policy: "always"
  api:
    command: >
      gunicorn app:application
      --bind 0.0.0.0:8000
    env:
      file: "/etc/myapp/production.env"
      vars:
        PORT: "8000"
        DATABASE_URL: "postgres://localhost/myapp"
    depends_on:
      - postgres
      - redis
    restart_policy: "always"
    backoff: "10s"
    deployment:
      strategy: "rolling"
      pre_start: "python manage.py migrate"
      health_check:
        command: "curl --fail http://localhost:8000/health"
    hooks:
      on_start:
        success:
          command: "echo 'API started'"
      on_stop:
        error:
          command: "curl --request POST https://alerts.example.com/api/crash"
  worker:
    command: >
      celery -A tasks worker
      --loglevel=info
    depends_on:
      - redis
    restart_policy: "on-failure"
    max_restarts: 5
  backup:
    command: >
      pg_dump mydb >
      /backups/db-$(date +%Y%m%d).sql
    cron: "0 2 * * *"
Configuration sections
version
Required. Specifies the configuration schema version.
env
Optional environment variables shared by all services.
env:
  vars:
    LOG_LEVEL: "info"
    APP_ENV: "production"
  file: "/etc/myapp/common.env"
logs
Optional defaults for service stdout/stderr handling.
logs:
  sink: file
  max_bytes: 10485760
  max_files: 5
Fields:
sink: where service output goes. file captures it in systemg-managed log files; none discards it without creating log-writer threads or files.
max_bytes: size in bytes the active log file may reach before it is rotated (file sink only).
max_files: number of rotated files to retain per active log.
Use sink: none for noisy production services when service output is already collected by another logging pipeline.
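As a rough sketch of how the global default and a per-service override might be combined, the fragment below discards output everywhere except for one service. It assumes a service-level logs block replaces the global defaults for that service, as the Service logs section describes; the service names and commands are illustrative only.

```yaml
logs:
  sink: none                      # global default: discard service output
services:
  api:
    command: "python app.py"
    logs:
      sink: file                  # this service keeps systemg-managed log files
      max_bytes: 5242880          # rotate the active file at 5 MiB
      max_files: 3
  noisy_worker:
    command: "worker --verbose"   # no logs block, so it uses the global sink: none
```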
status
Optional defaults for cached status and inspect snapshots.
status:
  snapshot_mode: summary
  snapshot_interval_secs: 5
Fields:
snapshot_mode: off, summary, or detailed.
snapshot_interval_secs: seconds between background snapshot refreshes, clamped between 1 and 300.
Modes:
summary: default. Tracks service state, pid, health, last exit, cron state, and sampled metric summaries while skipping expensive process tree expansion.
detailed: includes runtime command details and process/spawn descendants for richer inspect output.
off: disables background live process snapshot refresh and uses persisted state plus pid files.
For large deployments, keep summary globally and use focused inspect --service workflows when deeper investigation is needed.
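For a small deployment where richer inspect output matters more than snapshot cost, the modes above suggest a configuration along these lines. This is only a sketch: the interval of 30 is an arbitrary illustrative value inside the documented 1 to 300 range, and a value outside that range would be clamped rather than used as written.

```yaml
status:
  snapshot_mode: detailed        # include command details and process descendants
  snapshot_interval_secs: 30     # illustrative; values are clamped to 1..300
```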
services
Required. Defines the services to manage.
services:
  web:
    command: "python app.py"
Service configuration
command
Required. The command to execute.
services:
  web:
    command: "python app.py"
depends_on
Services that must start before this one.
services:
  api:
    command: "python app.py"
    depends_on:
      - postgres
      - redis
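Dependencies can presumably be chained as well. The sketch below assumes that ordering is resolved transitively, so postgres would start before api and api before worker; that behavior is inferred from the description above rather than confirmed, and the service names are illustrative.

```yaml
services:
  postgres:
    command: "postgres -D /var/lib/postgresql/data"
  api:
    command: "python app.py"
    depends_on:
      - postgres        # api waits for postgres
  worker:
    command: "celery -A tasks worker"
    depends_on:
      - api             # worker waits for api, and so indirectly for postgres
```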
env
Service-specific environment configuration.
services:
  api:
    command: "python app.py"
    env:
      vars:
        PORT: "8000"
        DATABASE_URL: "postgres://localhost/myapp"
      file: "/etc/myapp/production.env"
restart_policy
Control how services recover from crashes.
services:
  api:
    command: "python app.py"
    restart_policy: "always"
    backoff: "5s"
    max_restarts: 10
Policies:
always - Restart on any exit
on-failure - Restart only on non-zero exit codes
never - Don’t restart
Service logs
Override global logging settings for one service.
services:
  api:
    command: "python app.py"
    logs:
      sink: file
      max_bytes: 5242880
      max_files: 3
  noisy_worker:
    command: "worker --verbose"
    logs:
      sink: none
hooks
Run commands when services start or stop.
services:
  api:
    command: "python app.py"
    hooks:
      on_start:
        success:
          command: "curl --request POST https://status.example.com/api/up"
        error:
          command: "curl --request POST https://status.example.com/api/down"
      on_stop:
        error:
          command: "curl --request POST https://alerts.example.com/crash"
health_check
Verify services are ready before marking them healthy.
services:
  api:
    command: "python app.py"
    health_check:
      command: "curl --fail http://localhost:8000/health"
      interval: "10s"
      timeout: "5s"
      retries: 3
cron
Run services on a schedule instead of continuously.
services:
  backup:
    command: >
      pg_dump mydb >
      /backups/db-$(date +%Y%m%d).sql
    cron: "0 2 * * *"
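The expression above, 0 2 * * *, runs the backup daily at 02:00 under standard five-field cron syntax. Assuming systemg accepts other expressions in that same syntax (the page only shows this one), a more frequent job might look like the sketch below; the cleanup service and its command are hypothetical.

```yaml
services:
  cleanup:
    # Hypothetical job: delete files older than 7 days from a scratch directory
    command: "find /tmp/myapp -type f -mtime +7 -delete"
    cron: "*/15 * * * *"   # every 15 minutes, assuming standard five-field syntax
```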
deployment
Control how services update during restarts.
services:
  api:
    command: "python app.py"
    deployment:
      strategy: "rolling"
      pre_start: "python manage.py migrate"
      health_check:
        command: "curl --fail http://localhost:8000/health"
        interval: "5s"
        timeout: "30s"
      grace_period: "5s"
      blue_green:
        env_var: "PORT"
        slots: ["8000", "8001"]
        candidate_health_check:
          command: "curl --fail http://127.0.0.1:{slot}/health"
          interval: "2s"
        switch_command: "/usr/local/bin/switch-upstream {candidate_slot}"
        switch_verify:
          command: "curl --fail http://localhost:8000/health"
        state_path: ".state/api-slot.json"
Rolling deployments start the new instance, wait for its health checks to pass, then stop the old instance. For zero-downtime updates on a single host where ports are fixed, use blue_green: it keeps two identical slots, starts the new version in the idle slot, verifies it, and switches traffic only once the candidate is ready.
Field reference
Service fields
Primary keys available on each service definition.
| Field | Type | Description |
|---|---|---|
| command | string | Command to execute (required) |
| depends_on | array | Services that must start first |
| env | object | Environment configuration |
| restart_policy | string | always, on-failure, or never |
| backoff | string | Time between restart attempts |
| max_restarts | number | Maximum restart attempts |
| hooks | object | Lifecycle event handlers |
| health_check | object | Service readiness probe |
| cron | string | Cron schedule expression |
| deployment | object | Update strategy configuration |
| logs | object | Service stdout/stderr capture and rotation settings |
Environment object
Environment sources and inline overrides merged into the service process environment.
| Field | Type | Description |
|---|---|---|
| vars | object | Key-value environment variables |
| file | string | Path to env file |
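The description above calls vars "inline overrides", which suggests that an inline value wins when the same key also appears in the env file. The sketch below relies on that reading; the precedence and the assumed contents of the env file are not confirmed by this page.

```yaml
services:
  api:
    command: "python app.py"
    env:
      file: "/etc/myapp/production.env"   # assumed to define PORT=8000, among others
      vars:
        PORT: "9000"                      # assumed to override the file's PORT
```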
Hooks object
Lifecycle callbacks you can trigger on service start/stop/restart outcomes.
| Field | Type | Description |
|---|---|---|
| on_start | object | Commands for start events |
| on_stop | object | Commands for stop events |
| on_restart | object | Commands for restart events |
Each hook has success and error handlers with:
command - Command to execute
timeout - Maximum execution time
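The on_restart hook and the timeout field are listed here but not shown in the earlier examples. A minimal sketch follows, assuming timeout takes the same duration-string form ("10s") used by health checks; the notification URLs are placeholders.

```yaml
services:
  api:
    command: "python app.py"
    hooks:
      on_restart:
        success:
          command: "curl --request POST https://status.example.com/api/restarted"
          timeout: "10s"   # assumed duration format, modeled on health_check
        error:
          command: "curl --request POST https://alerts.example.com/crash"
          timeout: "10s"
```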
Health check object
Probe configuration used to determine readiness/health during deployment workflows.
| Field | Type | Description |
|---|---|---|
| command | string | Check command |
| url | string | HTTP endpoint (alternative to command) |
| interval | string | Time between checks |
| timeout | string | Check timeout |
| retries | number | Attempts before marking unhealthy |
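Since url is listed as an alternative to command, the curl-based check shown earlier could presumably be written as follows. This is a sketch only: the page does not say which HTTP responses count as healthy.

```yaml
services:
  api:
    command: "python app.py"
    health_check:
      url: "http://localhost:8000/health"   # used instead of a check command
      interval: "10s"
      timeout: "5s"
      retries: 3
```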
Deployment object
Controls how restarts are performed and what validation happens before cutover.
| Field | Type | Description |
|---|---|---|
| strategy | string | rolling or immediate |
| pre_start | string | Command to run before starting |
| health_check | object | Health check configuration |
| grace_period | string | Time before stopping old instance |
| blue_green | object | Single-host blue/green rollout settings |
Blue/green deployment object
Single-host zero-downtime options for alternating between two rollout slots (typically ports).
| Field | Type | Description |
|---|---|---|
| env_var | string | Env var injected with slot value (PORT default) |
| slots | array | Exactly two slot values to alternate between |
| switch_command | string | Command to switch traffic to candidate slot |
| candidate_health_check | object | Optional candidate verification check ({slot} supported in url or command) |
| switch_verify | object | Optional post-switch verification check |
| state_path | string | Optional persisted active-slot state file path |
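A pared-down blue/green block might rely on the defaults in the table above: env_var is omitted because PORT is the stated default, and the candidate check uses url with the {slot} placeholder instead of a command. Treat this as a sketch under those assumptions; the switch script path is copied from the earlier example and is site-specific.

```yaml
services:
  api:
    command: "python app.py"
    deployment:
      strategy: "rolling"
      blue_green:
        slots: ["8000", "8001"]                      # exactly two slot values
        candidate_health_check:
          url: "http://127.0.0.1:{slot}/health"      # {slot} is replaced with the candidate slot
        switch_command: "/usr/local/bin/switch-upstream {candidate_slot}"
```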
::::info Manifest schema migrations
The top-level version field declares the manifest schema version. systemg accepts the current version as either a string or integer, so version: "1" and version: 1 are equivalent.
When upgrading systemg, the binary reads the declared manifest version first, then migrates supported older manifest schemas into the current runtime configuration before starting services. This compatibility step happens in memory; systemg does not rewrite your YAML file just because it can migrate it.
When downgrading systemg, the older binary can only parse and migrate manifest versions it knows about. If a manifest has been updated to a newer schema, keep a copy of the previous manifest or convert it back to a version supported by the older binary before starting that older release.
::::
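To illustrate the string/integer equivalence described in the note above, a minimal manifest using the integer form could look like this. It contains only the two required top-level keys, and the service shown is a placeholder.

```yaml
version: 1            # integer form; treated the same as version: "1"
services:
  web:
    command: "python app.py"
```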