# Deployment Workflows — Locus Build

> Companion guide to [`SKILL.md`](./SKILL.md). Covers deployment timing, agent monitoring workflow, communication best practices, and lifecycle endpoints.

## When To Load

Load this file only when timing, polling cadence, cancel, rollback, restart, redeploy, or deployment lifecycle behavior matters.

## Table of Contents

- [Expected Timing](#expected-timing)
- [Recommended Agent Workflow](#recommended-agent-workflow)
- [Agent Communication Best Practices](#agent-communication-best-practices)
- [Deployment Lifecycle Endpoints](#deployment-lifecycle-endpoints)
- [Status Clarifications](#status-clarifications)
- [Redeploy vs. Fresh Deploy](#redeploy-vs-fresh-deploy)
- [When to Check Logs](#when-to-check-logs)

## Expected Timing

| Phase | Duration | What's Happening |
|-------|----------|------------------|
| **Queued** | 5-30 seconds | Deployment request received, waiting for worker |
| **Building** | 2-5 minutes | Cloning GitHub repo, running `docker build`, pushing to registry |
| **Deploying** | 1-5 minutes | Pulling image, starting container, waiting for ECS rollout, verifying public endpoint |
| **Total (GitHub)** | 3-9 minutes | From trigger to `healthy` |
| **Total (Image)** | 1-2 minutes | Skips build phase |

**Important:** A `queued` status that lasts 2-5 minutes is **NORMAL**. The reported status only changes to `building` after the build job starts, which can take 30-60 seconds after the deployment is queued, so the deployment may still show `queued` while the build phase is effectively under way.

## Recommended Agent Workflow

**For AI agents:** Follow this workflow when deploying services to provide clear status updates to your human and avoid confusion about deployment timing.

> **Reminder:** If the user asks you to deploy a **different codebase** than what's in the current project, create a new project and environment first. Never deploy a new codebase into a project that belongs to a different application.

**Step 1: Trigger the deployment**
```bash
DEPLOYMENT_ID=$(curl -s -X POST https://api.buildwithlocus.com/v1/deployments \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"serviceId": "'"$SERVICE_ID"'"}' | jq -r '.id')
```

**Step 2: Immediately inform your human**
```
Deployment triggered!
   Deployment ID: deploy_abc123
   Status: queued
   Expected time: 3-9 minutes (GitHub source) or 1-2 minutes (image)

I'll monitor the build and update you when it's live.
```

**Step 3: Poll status once per tool call (not in a blocking loop)**
```bash
# Poll once — do NOT use a while loop (it blocks your output and makes you go silent)
RESPONSE=$(curl -s -H "Authorization: Bearer $TOKEN" \
  "https://api.buildwithlocus.com/v1/deployments/$DEPLOYMENT_ID")
STATUS=$(printf '%s' "$RESPONSE" | jq -r '.status')  # quote the variable so the JSON is passed through unmodified
echo "[$(date +%H:%M:%S)] Deployment status: $STATUS"

# If terminal, report to human. Otherwise, tell human you'll check again in ~60s.
```

**Important:** Do NOT wrap polling in a `while true` loop. Blocking loops prevent you from sending status updates to the human — you'll go silent for 5+ minutes. Instead:
1. Poll once
2. Report the status to the human
3. Wait ~60 seconds, then poll again in a new tool call
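
The poll-report-wait cycle above can be sketched as a pair of small helpers that turn one poll result into one message. This is a minimal sketch, not part of the Locus API: the function names are hypothetical, and it assumes `healthy`, `failed`, and `cancelled` are the terminal statuses.

```bash
# Succeeds if the status is terminal (no further polling needed).
# Assumption: healthy, failed, and cancelled are the terminal statuses.
is_terminal_status() {
  case "$1" in
    healthy|failed|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}

# Turn the result of a single poll into the message to send to the human.
poll_message() {
  local status="$1"
  if is_terminal_status "$status"; then
    echo "Deployment finished with status: $status"
  else
    echo "Deployment status: $status. I'll check again in ~60s."
  fi
}
```

After a single poll, `poll_message "$STATUS"` gives the text to relay. The next poll then happens in a separate tool call rather than inside a loop.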

**Step 4: Notify human on completion**

**On success:**
```
Deployment complete!
   Service is live and healthy
   URL: https://svc-{id}.buildwithlocus.com (replace underscores with hyphens)
   Time taken: 4m 23s
```
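
The URL and "Time taken" lines in the success message could be produced with helpers like the ones below. This is only a sketch: the function names are hypothetical, and it assumes the `{id}` placeholder is the only part of the hostname that needs underscores replaced.

```bash
# Build the public URL from the {id} portion of the hostname,
# replacing underscores with hyphens as the message above describes.
service_url() {
  local id
  id=$(printf '%s' "$1" | tr '_' '-')
  printf 'https://svc-%s.buildwithlocus.com\n' "$id"
}

# Format a duration given in whole seconds as "Xm Ys".
format_elapsed() {
  printf '%dm %ds\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}
```

For example, recording `START=$(date +%s)` when the deployment is triggered lets you call `format_elapsed $(( $(date +%s) - START ))` once the status is terminal.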

> **⚠️ Service Discovery Delay:** Before `healthy`, the public URL may return **503 for up to 60 seconds** while the container registers with service discovery. This is normal, not a bug. Tell the user:
> ```
> Your service is still finishing startup. The URL may take up to 60 seconds to
> start responding while the container registers with the load balancer.
> ```

**On failure:**
```
Deployment failed
   Check logs: GET /v1/deployments/{id}/logs
   Common issues: container crash, port mismatch, missing dependencies, health check timeout (if healthCheckPath set)
```

## Agent Communication Best Practices

**DO:**
- Tell the human immediately when deployment is queued
- Set clear expectations about timing (3-9 minutes for GitHub builds)
- Poll every 60 seconds (not every 5 seconds — builds take time)
- Report the status after each poll, and always report the final outcome (success/failure)
- Include the deployment ID for reference

**DON'T:**
- Use blocking shell loops (`while true`) for deployment monitoring — they make you go silent
- Send updates every 10 seconds ("still building...")
- Assume something is broken if `queued` lasts 2-3 minutes
- Poll faster than once per minute (wastes API calls, builds won't finish faster)
- Say "something might be wrong" before 7-8 minutes have elapsed
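
The pacing rules above could be encoded as two small checks. This is a sketch under stated assumptions: the names are hypothetical, timestamps are Unix epoch seconds, and the 480-second threshold takes the upper end of the "7-8 minutes" guidance.

```bash
POLL_INTERVAL=60    # seconds between polls, per the guidance above
CONCERN_AFTER=480   # 8 minutes; assumption: upper end of "7-8 minutes"

# Print how many seconds remain before the next poll is allowed (0 = poll now).
seconds_until_next_poll() {
  local last_poll="$1" now="$2"
  local remaining=$(( POLL_INTERVAL - (now - last_poll) ))
  if [ "$remaining" -lt 0 ]; then remaining=0; fi
  echo "$remaining"
}

# Succeeds once enough time has elapsed to mention a possible problem.
may_raise_concern() {
  [ "$1" -ge "$CONCERN_AFTER" ]
}
```

For example, `seconds_until_next_poll "$LAST_POLL" "$(date +%s)"` returns a positive number while it is still too early to poll again.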

## Deployment Lifecycle Endpoints

| Action | Endpoint | Notes |
|--------|----------|-------|
| **Cancel** | `POST /v1/deployments/:id/cancel` | Cancels queued/building/deploying. Sets status to `cancelled`. Returns 409 if already terminal |
| **Rollback** | `POST /v1/deployments/:id/rollback` | `{reason?}` — deploys previous healthy image. Requires a previous `healthy` deployment with `imageUri`. First deploy or deploys that never reached `healthy` can't be rolled back |
| **Restart** | `POST /v1/services/:id/restart` | Rolling restart without rebuild. Requires a healthy deployment and running container. Returns 400 if no healthy deploy exists or container not found |
| **Redeploy** | `POST /v1/services/:id/redeploy` | Redeploys latest successful image (skips build) or re-runs source build |
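
As a sketch of how the table maps to requests, the helper below builds the URL for each lifecycle action. The function name `lifecycle_url` is hypothetical, and the commented `curl` calls assume the same `$TOKEN`, `$DEPLOYMENT_ID`, and `$SERVICE_ID` variables used elsewhere in this guide. The `reason` value is illustrative only.

```bash
API_BASE="https://api.buildwithlocus.com"

# Print the endpoint URL for a lifecycle action.
# cancel and rollback take a deployment ID; restart and redeploy take a service ID.
lifecycle_url() {
  local action="$1" id="$2"
  case "$action" in
    cancel|rollback)  echo "$API_BASE/v1/deployments/$id/$action" ;;
    restart|redeploy) echo "$API_BASE/v1/services/$id/$action" ;;
    *) echo "unknown action: $action" >&2; return 1 ;;
  esac
}

# Example calls (not executed here):
#   curl -s -X POST -H "Authorization: Bearer $TOKEN" \
#     "$(lifecycle_url cancel "$DEPLOYMENT_ID")"
#   curl -s -X POST -H "Authorization: Bearer $TOKEN" \
#     -H "Content-Type: application/json" \
#     -d '{"reason": "bad release"}' \
#     "$(lifecycle_url rollback "$DEPLOYMENT_ID")"
```

Keeping URL construction separate from the request makes it harder to send a service ID to a deployment endpoint by mistake.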

## Status Clarifications

- **`healthy` = ECS rollout completed and the public endpoint responded.** The deployment pipeline waits for ECS to finish rolling out the task, then performs an HTTP check against the public service URL. Use `GET /v1/services/:id?include=runtime` only when you need lower-level ECS details.
- **Restart requires running instances.** `POST /v1/services/:id/restart` needs actual ECS tasks running. If `runtime_instances.status` is `not_deployed`, trigger a new deployment (`POST /v1/deployments`) instead.
- **Rollback requires a previous healthy deployment with `imageUri`.** See the [Rollback entry in troubleshooting.md](./troubleshooting.md) for details.
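
The restart rule above can be sketched as a simple decision on the runtime status. The function name is hypothetical, and treating every status other than `not_deployed` as restartable is a simplifying assumption; how you extract `runtime_instances.status` from the service response depends on the actual JSON shape.

```bash
# Print the request to send, given runtime_instances.status and the service ID.
# Assumption: any status other than not_deployed means tasks are running.
recovery_request() {
  local runtime_status="$1" service_id="$2"
  if [ "$runtime_status" = "not_deployed" ]; then
    # No ECS tasks to restart, so a new deployment is needed.
    echo "POST /v1/deployments"
  else
    echo "POST /v1/services/$service_id/restart"
  fi
}
```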

## Redeploy vs. Fresh Deploy

| | Fresh deploy (`POST /v1/deployments`) | Redeploy (`POST /v1/services/:id/redeploy`) | Git push |
|---|---|---|---|
| **Builds from source?** | Yes | No — reuses last successful image | Yes |
| **Applies `buildArgs`?** | Yes | **No** — build is skipped | Yes |
| **Applies env var changes?** | Yes | Yes | Yes |
| **Applies `startCommand` changes?** | Yes | Yes | Yes |
| **Typical duration** | 3-9 min | 1-2 min | 3-9 min |
| **When to use** | First deploy, or after changing Dockerfile / build args | Env var updates, restart with latest config | Code changes pushed to repo |

> **Key takeaway:** If you changed `buildArgs` in the service config, you must trigger a fresh build (push a new commit or create a new deployment from source). A `redeploy` skips the build phase entirely, so build arg changes won't take effect.
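
One way to read the comparison table is as a lookup from the kind of change to the deploy method. The sketch below uses hypothetical change labels (`build_args`, `dockerfile`, `env_vars`, `start_command`, `code`) that are not part of the API.

```bash
# Suggest a deploy method for a given kind of change, following the table above.
# The change labels are hypothetical names chosen for this sketch.
deploy_method_for() {
  case "$1" in
    build_args|dockerfile)  echo "fresh deploy: POST /v1/deployments" ;;
    env_vars|start_command) echo "redeploy: POST /v1/services/:id/redeploy" ;;
    code)                   echo "git push" ;;
    *) echo "unknown change kind: $1" >&2; return 1 ;;
  esac
}
```

Env var and `startCommand` changes are applied by every method, so the sketch picks redeploy for them because it skips the build and is the fastest option.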

## When to Check Logs

Only fetch logs if:
- Deployment is `failed`
- Deployment is stuck in the same state for >10 minutes
- Human explicitly asks for logs

```bash
# Fetch logs (phase-aware)
curl -H "Authorization: Bearer $TOKEN" \
  "https://api.buildwithlocus.com/v1/deployments/$DEPLOYMENT_ID/logs"

# Stream logs in real-time (SSE)
curl -N -H "Authorization: Bearer $TOKEN" \
  "https://api.buildwithlocus.com/v1/deployments/$DEPLOYMENT_ID/logs?follow=true"
```
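
The three conditions for fetching logs could be checked with a helper like this before running either command. It is a sketch with a hypothetical function name, and it is meant for deployments that are failed or still in progress.

```bash
# Succeeds if logs should be fetched.
# Args: status, seconds spent in the current state, "yes" if the human asked.
should_fetch_logs() {
  local status="$1" seconds_in_state="$2" human_asked="${3:-no}"
  [ "$status" = "failed" ] && return 0          # deployment failed
  [ "$seconds_in_state" -gt 600 ] && return 0   # stuck for more than 10 minutes
  [ "$human_asked" = "yes" ] && return 0        # explicit request
  return 1
}
```

For example, `should_fetch_logs "$STATUS" "$SECONDS_IN_STATE" no` succeeds only for a failed deployment or one stuck in the same state for more than 10 minutes.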

For detailed log streaming, search, and best practices, see [logs.md](./logs.md).