# Smart Deploy Full Documentation

> Canonical full-text snapshot for long-context agents. Source index: /llms.txt

---

# README

# Smart Deploy

Smart Deploy is a preview-driven deployment platform for solo developers.

Scan a repo, review a live blueprint of what will run, edit infrastructure files in context, and deploy only after the plan makes sense.

Preview the deploy. Then ship it.

## Highlights

| What you get | Why it matters |
|--------------|----------------|
| Preview-first workflow | See services, routing, and artifacts before anything runs |
| Blueprint view | One place to understand build steps, containers, and traffic flow |
| Smart Analysis | Railpack build plans with optional build verification |
| Multi-target deploys | ECS Fargate for containers, S3 for static sites |
| Live deploy feedback | Stream logs, track run history, and watch health status update in place |
| Deployment Agent | Ask about your deployments, history, and runtime health |

## Workflow

1. **Scan and define** — Connect a repo and run Smart Analysis to resolve deploy units and target shape.
2. **Preview** — Open the blueprint, review the deployment path, and adjust config before anything runs.
3. **Deploy** — Start the deploy when the preview looks right, then follow live logs, run history, and health updates.

```mermaid
flowchart LR
  A[Repo scan] --> B[Blueprint preview]
  B --> C{Plan looks right?}
  C -->|Edit| B
  C -->|Yes| D[Deploy]
  D --> E[Logs and health]
```

## Deploy targets

| Target | Best for | What Smart Deploy provisions |
|--------|----------|------------------------------|
| **ECS Fargate** | Server apps, Railpack builds, existing Docker images | CodeBuild → ECR → Fargate task behind a shared ALB |
| **Static sites** | SPAs and static builds (no runtime) | CodeBuild → S3, optional CloudFront invalidation |

## Documentation

User-facing guides for deploying and debugging your apps:

- **[Documentation home](docs/README.md)** — full index
- **[Getting Started](docs/GETTING_STARTED.md)** — first deployment
- **[Debugging Deployments](docs/DEBUGGING_DEPLOYMENTS.md)** — production issues runbook
- **[Deployment Agent](docs/DEPLOYMENT_AGENT.md)** — AI inspector for your deploys
- **[FAQ](docs/FAQ.md)** · **[Error Catalog](docs/ERROR_CATALOG.md)**

Browse all docs in the app at `/docs`. For AI agents: [`/llms.txt`](/llms.txt) and [`/llms-full.txt`](/llms-full.txt).

## Tech stack

- Next.js 16, React 19, TypeScript
- Tailwind CSS 4, shadcn/ui
- Better Auth, Supabase
- GraphQL (Yoga) + REST API routes
- WebSocket worker for deploy execution
- AWS SDK (CodeBuild, ECR, ECS, ALB, Route 53, S3, CloudFront, Secrets Manager)
- Railpack + Mise for container builds
- Vitest, Playwright

## License

Smart Deploy is licensed under the Apache License 2.0. See [LICENSE](./LICENSE).

Smart Deploy was created and is maintained by Anirudh Raghavendra Makuluri. Forks and derivative projects should preserve the required license and attribution notices, and should not imply that they are the official Smart Deploy project unless explicitly authorized.

---

# docs/AI_ASSISTANCE.md

# AI Assistance

Smart Deploy includes AI features to speed up production debugging. Each has a different data source and scope.

## Feature comparison

| Feature | Where | Data source | Best for |
|---------|-------|-------------|----------|
| **Deployment Agent** | Header → Agent | Live DB + runtime health via tools | Quick triage: list deploys, check health, recent failures |
| **Analyze failure** | History / Logs | Full run logs + failure classification | Deep dive on one failed deploy attempt |
| **Improve scan** | Scan results | SD Artifacts feedback stream | Fix Railpack plan or build after scan/verification failure |

## Recommended debugging flow

```text
1. Deployment Agent → "Why did my last deployment fail?"
2. Deployment History → full step logs for the failed run
3. Analyze failure → LLM explanation using complete logs
4. Improve scan → if build/plan issue (re-scan before redeploy)
5. Targeted guide → Build Failures, Health Checks, etc.
```

## Deployment Agent

Read-only inspector. Uses up to 2 tool calls per question.

- ✅ "Show my deployments", "is api healthy?", "what failed last time?"
- ❌ Cannot deploy, rollback, or edit config

See [Deployment Agent](./DEPLOYMENT_AGENT.md).

## Analyze failure

Runs on a **specific deployment history entry**.
Sends:

- Failure code and classification
- Step summary
- Full logs from object storage (when available)

Use when the agent's log excerpts are not enough or you need a narrative root-cause analysis.

Available from:

- **Deployment History** — on a failed entry
- **Deploy Logs** — during or after a failed deploy

## Improve scan

Sends failure context back to SD Artifacts to regenerate or repair the Railpack plan.

Use when:

- Build verification failed
- Railpack plan looks wrong
- Deploy failed at Build with dependency or Dockerfile issues tied to scan output

Always review the updated scan and blueprint before redeploying.

## Choosing the right tool

| Situation | Start with |
|-----------|------------|
| "Is my app down?" | Deployment Agent → Runtime Health tab |
| "Deploy failed 5 minutes ago" | Deployment Agent → History → Analyze failure |
| "Build failed in CodeBuild" | History logs → [Build Failures](./BUILD_FAILURES.md) → Improve scan if plan-related |
| "Wrong Node version" | [Railpack](./RAILPACK.md) → Improve scan if auto-detection failed |

## Related

- [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md)
- [Deployment Logs](./DEPLOYMENT_LOGS.md)
- [Smart Analysis](./SMART_ANALYSIS.md)

---

# docs/BLUEPRINT_AND_PREVIEW.md

# Blueprint and Preview

The blueprint answers one question before deploy: **What exactly is going to happen to this app?**

It is the center of the product — a visual pipeline you can read and edit without losing context.

## Five preview steps

| Step | What you see |
|------|--------------|
| **Auth and resolve ref** | Repo, branch, commit SHA |
| **Build** | Deploy units, Railpack plans or Dockerfile, CodeBuild output (ECR or S3) |
| **Setup** | AWS region, Fargate networking or static bucket |
| **Deploy** | Runtime env vars (Secrets Manager on ECS), ALB host rules |
| **Done** | Hosted subdomain and public Visit URL |

Each step surfaces artifacts: Railpack plan JSON, Docker build units, ECS prerequisites, and domain routing.

## What you can edit in preview

| Field | Effect |
|-------|--------|
| **Branch** | Which ref CodeBuild checks out |
| **Region** | Where AWS resources are created |
| **Env vars** | Build-time vars to CodeBuild; runtime vars to ECS Secrets Manager |
| **Hosted subdomain** | Hostname on the platform domain (`myapp.example.com`) |

Edits stay in the same preview surface — you do not lose the pipeline context.

## Reading Railpack artifacts

When the scan used Railpack, the blueprint shows:

- **Deploy units** — name, root path, framework, port, provider
- **Railpack plan** — install/build steps and `deploy.startCommand`
- **Build status** — passed, failed, or skipped from verification

A missing Railpack plan on a non-Docker unit is a blocker — re-run Smart Analysis or use Improve scan.

See [Railpack](./RAILPACK.md) and [Smart Analysis](./SMART_ANALYSIS.md).

## Deploy shape indicators

| Shape | Blueprint hint |
|-------|----------------|
| `server` | Container on ECS via Railpack |
| `static` / `static_build` (no start command) | Static S3 path |
| `static_build` (with start command) | Container serving built assets |
| `existing_docker` | Dockerfile via CodeBuild → ECS |
| `multi` | Multiple deploy units; compose-style builds |

## Validation before deploy

The preview model warns when:

- Build status is not passed (when verification ran)
- Railpack plan is missing for a unit
- Required scan data is incomplete

Review warnings in the blueprint before pressing deploy.

## After preview

When the plan looks right:

1. Deploy from the workspace
2. Watch the same stages execute with live logs
3. Compare blueprint expectations to actual step output if something fails

## Related

- [Deployment Pipeline](./DEPLOYMENT_PIPELINE.md)
- [Getting Started](./GETTING_STARTED.md)
- [Environment Variables](./ENVIRONMENT_VARIABLES.md)

---

# docs/BUILD_FAILURES.md

# Build Failures

Build failures happen during **CodeBuild** (deploy step `build` or `publish`).
The image or static artifact never reaches production.

## First steps

1. Open **Deployment History** → failed run → **Build** step logs
2. Find the first failing command (npm, pip, docker, railpack)
3. Check scan **build_status** — did Smart Analysis verification pass?
4. If plan-related → **Improve scan** before redeploy

## Failure code

Most build failures map to: **`CODEBUILD_DOCKER_IMAGE_BUILD_FAILED`**

Exact log strings:

- `Docker image build failed. Check build logs above.`
- `CodeBuild failed: Docker image build did not succeed`

## Railpack build failures

CodeBuild runs:

```text
docker buildx build -f /tmp/railpack-plan.json ...
```

| Cause | What to check |
|-------|---------------|
| Dependency install failed | Lockfile committed, registry auth env vars, Node/Python version files |
| Build script failed | `npm run build` errors locally at same commit |
| Missing files | Monorepo `package path` — wrong service root |
| Railpack plan empty | Re-run Smart Analysis |

See [Railpack](./RAILPACK.md) and [Smart Analysis](./SMART_ANALYSIS.md).

## Dockerfile failures (`existing_docker`)

| Cause | What to check |
|-------|---------------|
| Wrong context | Dockerfile path matches selected service directory |
| Base image pull failed | Docker Hub rate limit — platform may need registry credentials |
| Multi-stage copy fails | Paths relative to build context |

## Docker Hub rate limits

Anonymous pulls in CodeBuild can hit 429 errors. Symptoms in logs: rate limit, 429, toomanyrequests.

Retry after a cooldown or ensure build uses authenticated registry pulls via env configuration.

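
The three log symptoms listed above can be checked mechanically when you have build logs exported as text. The following is a minimal sketch, not part of Smart Deploy: `findRateLimitLines` and `RATE_LIMIT_PATTERNS` are hypothetical names, and the input is assumed to be an array of plain log lines.

```typescript
// Hypothetical helper, not a Smart Deploy API.
// Patterns mirror the symptoms named in this guide: "rate limit", "429", "toomanyrequests".
const RATE_LIMIT_PATTERNS: RegExp[] = [
  /rate limit/i,
  /\b429\b/, // word boundaries avoid matching values such as "1429ms"
  /toomanyrequests/i,
];

// Returns every log line that looks like a Docker Hub rate-limit symptom.
function findRateLimitLines(logLines: readonly string[]): string[] {
  return logLines.filter((line) =>
    RATE_LIMIT_PATTERNS.some((pattern) => pattern.test(line)),
  );
}
```

If the function returns any lines, a retry after a cooldown is a reasonable first step before changing the build itself.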
## Static build failures

S3-targeted builds fail when:

- Build command does not produce expected output directory
- `RAILPACK_SPA_OUTPUT_DIR` wrong for monorepo SPA
- Build-time env vars missing (`NEXT_PUBLIC_*`)

## Scan verification vs deploy build

| Phase | When |
|-------|------|
| Scan `build_verification` | During Smart Analysis (SD Artifacts) |
| CodeBuild | During deploy |

Verification can pass but deploy build fail if branch/commit/env differ. Align branch and env vars between scan and deploy.

## Fix checklist

- [ ] Reproduce build locally at the same commit
- [ ] Confirm version files (`.node-version`, `mise.toml`) match expected runtime
- [ ] Confirm env vars needed at build time are set in Smart Deploy
- [ ] For monorepos, confirm correct package path / service
- [ ] Run Improve scan if Railpack plan looks wrong
- [ ] Redeploy after fixes

## Related

- [Railpack](./RAILPACK.md)
- [Environment Variables](./ENVIRONMENT_VARIABLES.md)
- [Error Catalog](./ERROR_CATALOG.md)
- [Deployment Logs](./DEPLOYMENT_LOGS.md)

---

# docs/CUSTOM_DOMAINS.md

# Custom Domains and Deployment URLs

Smart Deploy assigns each deployment a public **Visit** URL on a shared platform domain. Routing uses a hosted subdomain plus AWS DNS and load balancer rules.

## Default URL shape

```text
https://{hosted-subdomain}.{deployment-domain}
```

Example: service `myapp` with subdomain `myapp` → `https://myapp.smart-deploy.xyz`

You choose the **hosted subdomain** in blueprint preview or config. It must be unique across the platform.

## How routing works

| Target | Routing |
|--------|---------|
| **ECS (containers)** | Wildcard DNS `*.domain` → shared ALB; per-service ALB listener rule matches your subdomain |
| **Static S3** | Build output synced to S3; optional CloudFront; DNS points at the static endpoint |

Each ECS service gets its own host rule — multiple services mean multiple subdomains, not path-based routing on one hostname.

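
To make the URL shape and the one-subdomain-per-service rule concrete, here is a small sketch. It assumes nothing about Smart Deploy internals: `visitUrl` and `serviceUrls` are hypothetical helpers, and the deployment domain is passed in as a plain string.

```typescript
// Hypothetical helpers for illustration only.

// Builds the Visit URL in the documented shape: https://{hosted-subdomain}.{deployment-domain}
function visitUrl(hostedSubdomain: string, deploymentDomain: string): string {
  return `https://${hostedSubdomain}.${deploymentDomain}`;
}

// Maps each service to its own URL. Two services sharing one subdomain is
// rejected, because routing is per host rule rather than per path.
function serviceUrls(
  subdomainsByService: Record<string, string>,
  deploymentDomain: string,
): Record<string, string> {
  const seen = new Set<string>();
  const urls: Record<string, string> = {};
  for (const [service, subdomain] of Object.entries(subdomainsByService)) {
    if (seen.has(subdomain)) {
      throw new Error(`subdomain "${subdomain}" is used by more than one service`);
    }
    seen.add(subdomain);
    urls[service] = visitUrl(subdomain, deploymentDomain);
  }
  return urls;
}
```

For example, `visitUrl("myapp", "smart-deploy.xyz")` yields the example URL shown above.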
## HTTPS

HTTPS terminates at the shared ALB (ECS) or CloudFront (static when configured). Your app receives HTTP behind the load balancer; bind to `PORT` inside the container.

## Subdomain rules

- Use lowercase alphanumeric characters and hyphens
- Subdomain is global — two users cannot claim the same name
- Changing subdomain after deploy updates DNS and ALB rules on the next deploy

## Custom apex or external DNS

Today, user-facing deploy URLs use the **platform deployment domain** with a hosted subdomain. Bringing your own apex domain (for example `app.yourcompany.com`) requires DNS you control pointing at the platform ALB or static endpoint.

For DNS not resolving or TLS issues, see [Domain and TLS Issues](./DOMAIN_AND_TLS_ISSUES.md).

## Multiple services in one repo

Each service needs its own subdomain:

| Service | Subdomain | URL |
|---------|-----------|-----|
| `web` | `acme-web` | `https://acme-web.{domain}` |
| `api` | `acme-api` | `https://acme-api.{domain}` |

## After changing subdomain

1. Save the new subdomain in config
2. Redeploy so Route 53 and ALB rules update
3. Allow DNS propagation (minutes to hours depending on TTL)

## Related

- [Domain and TLS Issues](./DOMAIN_AND_TLS_ISSUES.md)
- [Health Checks](./HEALTH_CHECKS.md)
- [Getting Started](./GETTING_STARTED.md)

---

# docs/DEBUGGING_DEPLOYMENTS.md

# Debugging Deployments

Use this runbook when a deploy fails or a live app looks unhealthy.

## Quick triage (5 minutes)

### 1. Ask the Deployment Agent

Open **Agent** in the header:

- "Why did my last deployment fail?"
- "Is {repo} {service} healthy right now?"

The agent lists deployments, checks history, or loads runtime health (read-only, up to 2 tool calls).

### 2. Check deployment status

In the deploy workspace **Overview**:

| Status | Meaning |
|--------|---------|
| `deploying` | Pipeline still running — watch live logs |
| `running` | Last deploy succeeded; check runtime health if URL fails |
| `failed` | Last deploy did not complete — open History |
| `degraded` / `unreachable` | Runtime health probe failing |

See [Deployment Status Reference](./DEPLOYMENT_STATUS_REFERENCE.md).

### 3. Open Deployment History

Find the latest failed entry:

1. Note **failed step** (Build, Verify, Deploy, etc.)
2. Expand step logs — find the first `❌` or `error` line
3. Note **failure code** if shown (for example `CODEBUILD_DOCKER_IMAGE_BUILD_FAILED`)

### 4. Escalate by failure type

| Failed at | Guide |
|-----------|-------|
| Build / Publish | [Build Failures](./BUILD_FAILURES.md) |
| Verify | [Health Checks](./HEALTH_CHECKS.md), [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md) |
| Deploy / Rollout | [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md) |
| URL loads but wrong behavior | Runtime logs, env vars — [Environment Variables](./ENVIRONMENT_VARIABLES.md) |
| URL does not resolve | [Domain and TLS Issues](./DOMAIN_AND_TLS_ISSUES.md) |

### 5. Analyze failure (optional)

On a specific history entry, run **Analyze failure** for an LLM summary using full run logs.

### 6. Fix and redeploy

- Config issue → update env vars / branch / subdomain → redeploy
- Build plan issue → **Improve scan** → review blueprint → redeploy
- App code issue → fix repo → push → redeploy

## Severe production outage

If users are impacted and you need service back before root-cause analysis:

1. **Rollback** to the last successful history entry (manual, by commit)
2. Confirm URL healthy via Overview or Deployment Agent
3. Debug the failed commit separately

See [Deployment History and Rollback](./DEPLOYMENT_HISTORY_AND_ROLLBACK.md).

Automatic rollback failure codes exist in classification, but **manual rollback by commit** is the supported recovery path today.

## Collect evidence before asking for help

1. Exact error text from step logs
2. Repo name, service name, branch, commit SHA
3. Failure code from history (if any)
4. Whether scan build_status was passed
5. Recent config changes (env vars, subdomain)

## Related

- [Error Catalog](./ERROR_CATALOG.md)
- [Deployment Agent](./DEPLOYMENT_AGENT.md)
- [AI Assistance](./AI_ASSISTANCE.md)
- [Deployment Logs](./DEPLOYMENT_LOGS.md)

---

# docs/DEPLOYMENT_AGENT.md

# Deployment Agent

The **Deployment Agent** is a read-only AI inspector for your live deployments. Open it from the header **Agent** button.

It answers questions by fetching **your** deployment data — status, history, health — through tools. It does not guess repos, services, or health states.

## What it can do

| Capability | Example questions |
|------------|-------------------|
| List deployments | "Show me my deployments" |
| Inspect current state | "What's the status of my api service?" |
| Review history | "Why did my last deployment fail?" |
| Check health | "Is my service healthy right now?" |

## What it cannot do

The agent is **read-only**. It cannot:

- Trigger deploys or rollbacks
- Change env vars, branch, or subdomain
- Re-run Smart Analysis or Improve scan
- Access another user's deployments

For actions, use the deploy workspace UI.

## How to ask effective questions

**Use repo and service names from your dashboard:**

| You say | Agent uses |
|---------|------------|
| `acme/smart-deploy` | `repoName: smart-deploy` |
| service `web` | `serviceName: web` |

**Good prompts:**

- "Show me my deployments"
- "Why did smart-deploy web fail on the last deploy?"
- "Is shop-api healthy right now?"

**Ambiguous prompts:** If you mention only `api` without a repo, the agent lists deployments first instead of guessing.

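
The name mapping and the "list first when ambiguous" behavior can be pictured with a short sketch. This is only an illustration of the documented behavior, not the agent's actual code: `toRepoName`, `planLookup`, and the `AgentLookup` type are hypothetical.

```typescript
// Hypothetical model of how a prompt's names could map to a lookup.
type AgentLookup =
  | { kind: "inspect"; repoName: string; serviceName: string | undefined }
  | { kind: "list_first"; reason: string };

// "acme/smart-deploy" -> "smart-deploy"; a bare name is returned unchanged.
function toRepoName(fullName: string): string {
  return fullName.slice(fullName.lastIndexOf("/") + 1);
}

// Without a repo, the sketch falls back to listing deployments rather than guessing.
function planLookup(
  repo: string | undefined,
  service: string | undefined,
): AgentLookup {
  if (repo === undefined || repo.trim() === "") {
    return { kind: "list_first", reason: "no repo named, so list deployments first" };
  }
  return { kind: "inspect", repoName: toRepoName(repo), serviceName: service };
}
```

In practice this means a prompt such as "Why did smart-deploy web fail?" gives the agent enough to inspect one deployment directly, while "is api healthy?" does not.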
## Tools the agent uses

| Tool | Returns |
|------|---------|
| `list_deployments` | Up to 25 deployments: status, branch, target, URL |
| `get_deployment_details` | Status, commit, revision, region, cloud resources, scan summary |
| `get_deployment_history` | Recent attempts: success/fail, failed step, log excerpts |
| `get_runtime_health` | Recent probes: app status, HTTP code, latency, ECS/ALB signals |

## Live status updates

While working, the agent streams progress:

- Accepted → status updates → tool started/completed → final message

If the WebSocket worker is offline, you see: *"The deployment agent is offline right now. Refresh the page and try again."*

## Limits

| Limit | Value |
|-------|-------|
| Tool calls per question | 2 |
| Conversation memory | Last 6 turns |
| History/health samples per tool | 5 |
| Write actions | None |

Complex root-cause analysis may hit the tool limit. Use **Analyze failure** on a specific history entry or read full logs in the History tab.

## Starter prompts

Built-in shortcuts in the agent sheet:

1. Show me my deployments
2. Why did my last deployment fail?
3. Is my service healthy right now?

## When to escalate

| Need | Use instead |
|------|-------------|
| Full step logs | Deployment History tab |
| Deep failure analysis | Analyze failure on a history entry |
| Fix build plan | Improve scan |
| Platform how-to | [FAQ](./FAQ.md), [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md) |

See [AI Assistance](./AI_ASSISTANCE.md).

## Related

- [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md)
- [Runtime Health](./RUNTIME_HEALTH.md)
- [Deployment History and Rollback](./DEPLOYMENT_HISTORY_AND_ROLLBACK.md)

---

# docs/DEPLOYMENT_HISTORY_AND_ROLLBACK.md

# Deployment History and Rollback

Every deploy attempt is recorded. Use history to audit changes, compare failures, and roll back to a known-good commit.

## History entries

Each entry includes:

| Field | Description |
|-------|-------------|
| **timestamp** | When the attempt started |
| **success** | Whether the pipeline completed healthy |
| **branch / commitSha** | Git ref that was deployed |
| **duration** | Wall-clock time for the attempt |
| **steps** | Per-step status and logs |
| **failureCode** | Structured code when failed (see [Status Reference](./DEPLOYMENT_STATUS_REFERENCE.md)) |
| **failureClassification** | Summary, likely cause, evidence |
| **release_artifact** | ECR image URI/digest or S3 path for reproducibility |

## Viewing history

Open the **History** tab in the deploy workspace. Failed entries highlight the failed step and show failure code when classified.

Use **Analyze failure** on an entry for LLM analysis with full logs.

## Manual rollback

Rollback redeploys a **previous commit SHA** from a successful history entry.

### What rollback restores

| Restored | Kept from current config |
|----------|--------------------------|
| Commit SHA (and thus built artifact for that commit) | Current env vars |
| Branch context from selected entry | Current subdomain, region |

Rollback is a **new deploy attempt** using the old commit — not an instant ALB pointer swap.

### How to roll back

1. Open **History**
2. Select a **successful** entry before the bad deploy
3. Confirm rollback
4. Wait for pipeline to complete and verify health

### Rollback limitations

- Fails with `MANUAL_ROLLBACK_FAILED` if artifact metadata is missing or redeploy errors
- Automatic rollback codes exist in classification but **automatic rollback is not implemented** in the deploy handler today
- Pause/resume are not supported on AWS — only delete or redeploy

## Release artifacts

Successful deploys store `release_artifact` metadata:

- **ECS**: ECR image URI and digest
- **Static**: S3 bucket and prefix

Used for rollback context and reproducibility. If missing, rollback may not reconstruct the prior release.

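
Choosing a rollback target by hand follows a simple rule: the most recent successful entry before the bad deploy that still has artifact metadata. The sketch below expresses that rule in code. It is an assumption-laden illustration: the `HistoryEntry` shape is simplified from the field table above, `release_artifact` is modeled as an opaque string or `null`, timestamps are assumed to be ISO 8601, and `findRollbackCandidate` is a hypothetical name.

```typescript
// Simplified, assumed shape of a history entry (not the real schema).
interface HistoryEntry {
  timestamp: string; // ISO 8601 start time (assumption)
  success: boolean;
  commitSha: string;
  release_artifact: string | null; // opaque artifact reference (assumption)
}

// Picks the latest successful entry that started before the bad deploy
// and still carries artifact metadata. Returns undefined if none qualifies.
function findRollbackCandidate(
  entries: readonly HistoryEntry[],
  badDeployTimestamp: string,
): HistoryEntry | undefined {
  const cutoff = Date.parse(badDeployTimestamp);
  return entries
    .filter(
      (entry) =>
        entry.success &&
        entry.release_artifact !== null &&
        Date.parse(entry.timestamp) < cutoff,
    )
    .sort((a, b) => Date.parse(b.timestamp) - Date.parse(a.timestamp))[0];
}
```

Entries without artifact metadata are skipped here because the limitations above note that rollback can fail when that metadata is missing.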
## Comparing attempts

When debugging regressions:

1. Compare failed entry to last successful entry
2. Diff commit messages and SHAs
3. Check if env vars changed between attempts (rollback keeps current env)
4. Review whether scan `build_status` changed after Improve scan

## Related

- [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md)
- [Deployment Logs](./DEPLOYMENT_LOGS.md)
- [Deployment Agent](./DEPLOYMENT_AGENT.md)

---

# docs/DEPLOYMENT_LOGS.md

# Deployment Logs

Smart Deploy gives you three log surfaces: **live deploy logs**, **history logs**, and **ECS runtime logs**.

## Live deploy logs

During an active deploy, logs stream over WebSocket to the **Logs** tab in the deploy workspace.

| Property | Detail |
|----------|--------|
| **Transport** | WebSocket worker (`deploy:log`, `deploy:steps`) |
| **Format** | Timestamped lines grouped by deploy step |
| **Reconnection** | Subscribing to workspace replays in-progress state (`deploy:snapshot`) |

If logs stop updating:

- Check system health indicator in the header (worker offline = degraded)
- Refresh the page to reconnect WebSocket

## Deploy steps in logs

Typical step order for ECS:

```text
auth → build → publish → setup → deploy → rollout → verify → done
```

Each step shows status (`running`, `success`, `error`) and accumulated log lines. On failure, the first error line in the **failed step** is usually the root cause.

## History logs

Every deploy attempt is stored in **Deployment History**.

| Field | Content |
|-------|---------|
| `steps` | Per-step status and inline log lines |
| `failureCode` | Structured code when classified |
| `failureClassification` | Summary, likely cause, evidence |
| Full logs | Fetched from object storage via history UI (when `logRef` exists) |

Use history when the live stream is gone or you need an older attempt.

## ECS CloudWatch logs (runtime)

For **running** ECS deployments, the Logs tab can tail **CloudWatch** log group for the service (last ~50 lines).
Use for:

- App crashes after successful deploy
- Verify failures where the image built but the process exits
- Runtime exceptions not visible in deploy-step logs

Runtime logs appear after the task is running — not during CodeBuild.

## Build log excerpts (scan)

Smart Analysis stores `build_verification.log_excerpt` and `repair_history[].build_log_excerpt` in scan results. Check these when deploy fails at Build but CodeBuild logs are sparse in history.

## Reading logs effectively

1. Find the **failed step** id
2. Search for `error`, `Error`, `FAILED`, `exit code`, first npm/pip/Docker failure
3. Ignore trailing cascade errors — fix the earliest failure
4. For verify failures, scroll to ECS diagnostics appended after probe timeout

## Analyze failure

Pulls the **full** log payload for one history entry and sends it to the LLM with failure classification. Use when inline history logs are truncated.

## Agent log excerpts

The Deployment Agent returns **summarized** log lines from history (last few lines per failed step). For complete logs, open History directly.

## Related

- [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md)
- [Build Failures](./BUILD_FAILURES.md)
- [Deployment Agent](./DEPLOYMENT_AGENT.md)

---

# docs/DEPLOYMENT_PIPELINE.md

# Deployment Pipeline

Every deploy follows the same high-level stages. The blueprint shows them before you ship; the deploy workspace streams them live.

## Pipeline stages

| Stage | What happens |
|-------|--------------|
| **Auth** | Resolve GitHub access, branch, and commit SHA |
| **Build** | CodeBuild clones the repo and builds the artifact |
| **Publish** | Push image to ECR (containers) or sync to S3 (static) |
| **Setup** | Region, ECS service/task, or static bucket configuration |
| **Deploy** | Roll out the release, configure ALB host rules, apply secrets |
| **Rollout** | Wait for ECS tasks to become running (container path) |
| **Verify** | HTTP probes until the app responds healthy or timeout |
| **Done** | Hosted URL is live; status updates to running or failed |

## Container path (ECS Fargate)

Used for server apps, Railpack-built containers, and repos with an existing Dockerfile.

```text
GitHub repo @ commit
  → CodeBuild (Railpack plan or Dockerfile)
  → ECR image
  → ECS Fargate task
  → shared ALB + host rule
  → Route 53 subdomain
  → verification probes
  → https://{subdomain}.{domain}
```

**Build input**: Railpack plan JSON (`docker buildx build -f /tmp/railpack-plan.json`) or repo `Dockerfile`.

**Runtime**: ECS task uses Railpack `deploy.startCommand` or the image default `CMD`.

See [Railpack](./RAILPACK.md).

## Static path (S3)

Used for plain static files or build-only SPAs without a runtime start command.

```text
GitHub repo @ commit
  → CodeBuild (static build)
  → S3 prefix sync
  → optional CloudFront invalidation
  → public base URL
```

**Routing rule**: `deploy_shape: static`, or `static_build` where Railpack has no `deploy.startCommand`.

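
The routing rule between the two paths is small enough to write down as a function. The following is a sketch of the documented rule only, not Smart Deploy's implementation: `resolveTarget` and the `"s3"` / `"ecs"` labels are hypothetical, while the shape names come from the docs.

```typescript
// Shape names as documented; target labels are illustrative.
type DeployShape = "static" | "static_build" | "server" | "multi" | "existing_docker";
type DeployTarget = "s3" | "ecs";

// static                          -> S3
// static_build, no start command  -> S3
// static_build with start command -> ECS
// server, multi, existing_docker  -> ECS
function resolveTarget(shape: DeployShape, startCommand?: string): DeployTarget {
  if (shape === "static") {
    return "s3";
  }
  if (shape === "static_build") {
    const hasStartCommand = startCommand !== undefined && startCommand.trim() !== "";
    return hasStartCommand ? "ecs" : "s3";
  }
  return "ecs";
}
```

The only input that changes the outcome for a given shape is whether a `static_build` unit carries a `deploy.startCommand`, which is why an unexpected start command in the Railpack plan can move a site from S3 to a container.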
## What Smart Deploy provisions

| Resource | Container | Static |
|----------|-----------|--------|
| CodeBuild project | Yes | Yes |
| ECR repository | Yes | No |
| ECS cluster/service | Yes | No |
| ALB host rule | Yes | No |
| Route 53 record | Yes (wildcard + host rule) | Per-subdomain when needed |
| S3 prefix | No | Yes |
| Secrets Manager | Runtime env (ECS) | No |

## Editable before deploy

From blueprint preview or config tabs:

- **Branch** — defaults to repo default branch if blank
- **AWS region** — where resources are created
- **Env vars** — build-time in CodeBuild; runtime via Secrets Manager on ECS
- **Hosted subdomain** — `https://{subdomain}.{deployment-domain}`

Changing env vars after a successful deploy typically requires a **redeploy** for runtime values to take effect on ECS.

## Verification

After rollout, Smart Deploy probes your URL for up to roughly five minutes:

- Paths tried: `/`, `/health`, `/healthz`, `/api/health`
- Success: HTTP 2xx or 3xx
- Failure: deploy marked failed with `DEPLOYMENT_VERIFICATION_FAILED`

See [Health Checks](./HEALTH_CHECKS.md).

## Failure stages

Failures are classified by stage: `clone`, `auth`, `build`, `publish`, `setup`, `deploy`, `rollout`, `verify`, `rollback`.

See [Deployment Status Reference](./DEPLOYMENT_STATUS_REFERENCE.md) and [Error Catalog](./ERROR_CATALOG.md).

## Related

- [Blueprint and Preview](./BLUEPRINT_AND_PREVIEW.md)
- [Build Failures](./BUILD_FAILURES.md)
- [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md)

---

# docs/DEPLOYMENT_STATUS_REFERENCE.md

# Deployment Status Reference

Vocabulary for deployment status, failure stages, categories, and codes.

## Deployment status

| Status | Meaning |
|--------|---------|
| `pending` | Created, never deployed |
| `deploying` | Pipeline in progress |
| `running` | Last deploy succeeded |
| `failed` | Last deploy failed |
| `degraded` | Running but runtime health degraded |
| `unreachable` | Running but probes failing |
| `paused` | UI state; AWS pause not fully supported |

## Runtime health status

| Status | Meaning |
|--------|---------|
| `healthy` | App and infrastructure signals OK |
| `degraded` | Partial failure (for example ALB targets unhealthy) |
| `unreachable` | HTTP probe or ECS counts indicate outage |
| `unknown` | Insufficient recent samples |

## Failure stages

| Stage | Typical failure point |
|-------|----------------------|
| `clone` | Git checkout |
| `detect` | Service/scan detection |
| `auth` | GitHub or cloud credentials |
| `build` | CodeBuild / Railpack / Docker |
| `publish` | ECR push or S3 sync |
| `setup` | Infrastructure preparation |
| `deploy` | ECS service update |
| `rollout` | Waiting for tasks |
| `verify` | HTTP health probes |
| `rollback` | Rollback attempt |
| `unknown` | Unclassified |

## Failure categories

| Category | Description |
|----------|-------------|
| `auth_failure` | Credential or permission problem |
| `build_failure` | Image or artifact build failed |
| `startup_failure` | Process failed to start (often overlaps verify) |
| `health_check_failure` | Verification probes failed |
| `rollback_failure` | Rollback could not complete |
| `infrastructure_failure` | Transient network or cloud reachability |
| `unknown_failure` | No pattern matched |

## Failure codes

| Code | Summary | Retryable |
|------|---------|-----------|
| `AUTHENTICATION_FAILED` | GitHub, cloud, or registry auth failed | No |
| `CODEBUILD_DOCKER_IMAGE_BUILD_FAILED` | CodeBuild image build failed | No |
| `DEPLOYMENT_VERIFICATION_FAILED` | Post-deploy health check failed | No |
| `AUTOMATIC_ROLLBACK_FAILED` | Verify failed and auto-rollback failed | No |
| `AUTOMATIC_ROLLBACK_NO_CANDIDATE` | No prior release to restore | No |
| `MANUAL_ROLLBACK_FAILED` | User rollback failed | No |
| `INFRASTRUCTURE_NETWORK_FAILURE` | Transient network issue | Yes |
| `DEPLOYMENT_FAILED_GENERIC` | Unclassified — check step logs | No |

Detailed symptoms and fixes: [Error Catalog](./ERROR_CATALOG.md).

## Scan build status

| Value | Meaning |
|-------|---------|
| `passed` | Build verification succeeded |
| `failed` | Verification or build repair failed |
| `skipped` | Verification not run |

## Deploy shapes

| Shape | Typical target |
|-------|----------------|
| `static` | S3 |
| `static_build` (no start command) | S3 |
| `static_build` (with start command) | ECS |
| `server` | ECS |
| `multi` | ECS (multiple units) |
| `existing_docker` | ECS via Dockerfile |

## Related

- [Error Catalog](./ERROR_CATALOG.md)
- [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md)
- [Glossary](./GLOSSARY.md)

---

# docs/DOMAIN_AND_TLS_ISSUES.md

# Domain and TLS Issues

Use this guide when your Visit URL does not load, shows certificate errors, or DNS does not resolve.

## Expected URL

```text
https://{hosted-subdomain}.{deployment-domain}
```

Confirm the subdomain in config matches what you are visiting. Typos and stale bookmarks are common.

## DNS propagation

After first deploy or subdomain change:

| Factor | Typical wait |
|--------|--------------|
| New ALB host rule | Seconds to minutes |
| DNS TTL | Up to prior TTL (often 300s–3600s) |
| Wildcard record | Must exist for ECS `*.domain` routing |

**Check resolution:**

```bash
dig +short myapp.example.com
nslookup myapp.example.com
```

Expected: ALB DNS name or CloudFront distribution for static endpoints.

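
When the lookup prints a hostname, its suffix usually tells you which kind of endpoint it is. The sketch below classifies one answer line by suffix. It relies on the standard AWS naming for load balancers (`*.elb.amazonaws.com`) and CloudFront distributions (`*.cloudfront.net`); `classifyDnsTarget` is a hypothetical helper. Note that if the record is a Route 53 alias, `dig +short` prints IP addresses instead of a hostname, and this function reports `unknown`.

```typescript
type EdgeKind = "alb" | "cloudfront" | "unknown";

// Classifies one line of `dig +short` output by its hostname suffix.
// IP addresses (as returned for alias records) fall through to "unknown".
function classifyDnsTarget(answer: string): EdgeKind {
  // Normalize: trim whitespace, lowercase, drop the trailing dot of an FQDN.
  const host = answer.trim().toLowerCase().replace(/\.$/, "");
  if (host.endsWith(".elb.amazonaws.com")) {
    return "alb";
  }
  if (host.endsWith(".cloudfront.net")) {
    return "cloudfront";
  }
  return "unknown";
}
```

An `unknown` result is not an error by itself; it only means the answer needs a closer look, for example comparing the returned IPs with the load balancer's addresses.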
## TLS / HTTPS errors

| Error | Common cause |
|-------|--------------|
| Certificate mismatch | Visiting wrong hostname (subdomain not deployed) |
| NET::ERR_CERT_AUTHORITY_INVALID | DNS points to wrong endpoint |
| SSL handshake failed | ALB listener or cert not ready — retry after deploy completes |

HTTPS terminates at the platform ALB or CloudFront — your container serves HTTP internally.

## 502 / 503 with valid DNS

DNS works but edge returns bad gateway:

- Usually **runtime** issue, not DNS — see [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md)
- ALB has no healthy targets behind your host rule

## Subdomain conflicts

Hosted subdomains are globally unique. If deploy DNS steps fail:

- Another deployment may already use that subdomain
- Pick a different subdomain and redeploy

## Multiple services

Each service needs a distinct subdomain and host rule. Visiting the wrong subdomain shows another service or default ALB response.

## After fixing DNS or subdomain

1. Save config
2. Redeploy (updates Route 53 / ALB rules)
3. Wait for propagation
4. Hard-refresh browser or clear local DNS cache

Windows: `ipconfig /flushdns`

macOS: `sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder`

## Related

- [Custom Domains](./CUSTOM_DOMAINS.md)
- [Health Checks](./HEALTH_CHECKS.md)
- [Error Catalog](./ERROR_CATALOG.md)

---

# docs/ENVIRONMENT_VARIABLES.md

# Environment Variables

Smart Deploy uses env vars at **build time** (CodeBuild) and **runtime** (ECS tasks). Static S3 deploys only use build-time vars during the build step.

## Build-time vs runtime

| Phase | Used for | ECS deploys | Static S3 |
|-------|----------|-------------|-----------|
| **Build-time** | `npm ci`, `npm run build`, Railpack install | Injected into CodeBuild as `.env` | Same |
| **Runtime** | App process after container starts | AWS Secrets Manager → ECS task env | N/A (static files only) |

Changing runtime vars in the UI updates Secrets Manager, but the running task may need a **redeploy** to pick up new values.

## Where to set them

- **Blueprint preview** or **Config → Environment** tab
- Format: `KEY=value` per line; lines starting with `#` are ignored

## ECS runtime secrets

For container deploys, runtime env vars are stored in **AWS Secrets Manager** per deployment (`secretsArn`). Smart Deploy syncs your env string to the secret and mounts keys on the ECS task definition.

Secrets that the build step needs belong in build-time vars for CodeBuild, not in runtime vars.

## Common variables

| Variable | When needed |
|----------|-------------|
| `PORT` | Container must listen on the port ECS expects (often from scan/Railpack plan) |
| `NODE_ENV` | `production` for Node server apps |
| `DATABASE_URL` | Runtime connection string for server apps |
| `RAILPACK_PACKAGES` | Pin Mise packages at build: `node@22` |
| `RAILPACK_SPA_OUTPUT_DIR` | Monorepo SPA dist path for static routing |
| `RAILPACK_BUILD_CMD` / `RAILPACK_START_CMD` | Override Railpack commands |

See [Railpack](./RAILPACK.md).

## Railpack plan variables

Railpack may embed `deploy.variables` in the scan plan. User env vars in Smart Deploy merge with the build context. For conflicting keys, prefer explicit env tab values for overrides.

## Build-time `.env` in CodeBuild

During build, Smart Deploy writes your env string to `.env` in the CodeBuild workspace before the Railpack/Docker build runs.
Useful for: - `NEXT_PUBLIC_*` vars needed at build - Private registry tokens for `npm`/`pip` - Build-args equivalents for static site builds ## Static sites Only build-time vars apply. There is no server process — anything the built HTML/JS needs at build time must be present during CodeBuild. ## Security practices - Never commit production secrets to the repo; use Smart Deploy env vars or Secrets Manager - Rotate credentials if logs show leaked values - Separate staging and production deployments rather than reusing secrets across branches ## Troubleshooting | Symptom | Check | |---------|-------| | App missing config at runtime | Runtime env tab; redeploy after changes | | Build can't reach private npm | Build-time token vars | | Wrong API URL in built SPA | `NEXT_PUBLIC_*` at build time, not runtime | | Container exits immediately | Required runtime vars (`DATABASE_URL`, `PORT`) | ## Related - [Railpack](./RAILPACK.md) - [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md) - [Build Failures](./BUILD_FAILURES.md) --- # docs/ERROR_CATALOG.md # Error Catalog Symptom-first reference for deployment and runtime issues. Match your error text, run quick checks, apply the fix. **Agent retrieval notes:** Each entry includes exact strings, `stage`, `retryable`, and `agent_signals` for automated matching. --- ## How to use 1. Match exact error text or closest symptom 2. Run quick checks in order 3. Apply fix and redeploy 4. Escalate via [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md) or [Deployment Agent](./DEPLOYMENT_AGENT.md) --- ## Deployment failure codes ### DEPLOYMENT_FAILED_GENERIC - **Stage:** unknown - **Retryable:** no - **Exact strings:** `Deployment failed` - **Symptoms:** Deploy ends in `failed` without a specific code in UI - **Likely cause:** Wrapper failure — root cause is in step logs - **Quick checks:** 1. Deployment History → first step with `error` status 2. First `❌` or `error` line in that step's logs 3. 
Map to a specific code below - **Fix:** Address the underlying step failure; redeploy - **Related:** [Deployment Logs](./DEPLOYMENT_LOGS.md) - **Agent signals:** `failed`, `error`, `step logs` ### CODEBUILD_DOCKER_IMAGE_BUILD_FAILED - **Stage:** build - **Retryable:** no - **Exact strings:** - `Docker image build failed. Check build logs above.` - `CodeBuild failed: Docker image build did not succeed` - **Symptoms:** Pipeline stops at Build; `failed` status - **Likely cause:** Railpack/Dockerfile build failed — deps, version, context - **Quick checks:** 1. Build step logs for first failing npm/pip/docker command 2. Scan `build_status` — did verification pass? 3. Correct package path for monorepo service? - **Fix:** Fix Dockerfile/plan/deps; Improve scan if plan wrong; redeploy - **Related:** [Build Failures](./BUILD_FAILURES.md), [Railpack](./RAILPACK.md) - **Agent signals:** `build`, `CodeBuild`, `Dockerfile`, `Railpack`, `npm`, `pip` ### DEPLOYMENT_VERIFICATION_FAILED - **Stage:** verify - **Retryable:** no - **Exact strings:** `Deployment verification failed`, health probe timeout messages - **Symptoms:** Build/deploy succeed; Verify step fails; URL unhealthy - **Likely cause:** App not listening on PORT, crash on start, no 2xx on probed paths - **Quick checks:** 1. Verify step logs and ECS diagnostics 2. CloudWatch runtime logs for crash stack trace 3. App binds `0.0.0.0` and `PORT` 4. `/health` returns 200 - **Fix:** Fix startup/port/env; redeploy - **Related:** [Health Checks](./HEALTH_CHECKS.md), [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md) - **Agent signals:** `verify`, `health`, `502`, `503`, `unreachable` ### AUTHENTICATION_FAILED - **Stage:** auth - **Retryable:** no - **Exact strings:** `unauthorized`, `GitHub not connected`, `access denied`, `invalid token` - **Symptoms:** Early pipeline failure at auth or clone - **Likely cause:** GitHub token expired or cloud credentials invalid - **Quick checks:** 1. Re-link GitHub account 2. 
Retry deploy after fresh sign-in - **Fix:** Restore GitHub connection; retry - **Related:** [FAQ](./FAQ.md) - **Agent signals:** `auth`, `GitHub`, `token`, `unauthorized` ### INFRASTRUCTURE_NETWORK_FAILURE - **Stage:** deploy - **Retryable:** yes - **Exact strings:** `ECONNREFUSED`, `timed out`, `ENOTFOUND`, `socket hang up` - **Symptoms:** Intermittent deploy failure mid-pipeline - **Likely cause:** Transient AWS or network reachability - **Quick checks:** 1. Retry deploy 2. If persistent, check AWS service health - **Fix:** Retry; contact operator if repeated - **Agent signals:** `timeout`, `network`, `ECONNREFUSED` ### MANUAL_ROLLBACK_FAILED - **Stage:** rollback - **Retryable:** no - **Exact strings:** `Rollback failed`, `could not restore the selected release` - **Symptoms:** Rollback action errors in UI - **Likely cause:** Missing release artifact or redeploy of old commit failed - **Quick checks:** 1. Select a different successful history entry 2. Confirm entry has commit SHA and success=true - **Fix:** Pick another rollback target or fix forward with new deploy - **Related:** [Deployment History and Rollback](./DEPLOYMENT_HISTORY_AND_ROLLBACK.md) - **Agent signals:** `rollback`, `restore`, `artifact` ### AUTOMATIC_ROLLBACK_FAILED / AUTOMATIC_ROLLBACK_NO_CANDIDATE - **Stage:** rollback - **Retryable:** no - **Note:** Classification codes exist; automatic rollback is not active in deploy handler. Treat as verify failure and use manual rollback. - **Related:** [Deployment History and Rollback](./DEPLOYMENT_HISTORY_AND_ROLLBACK.md) --- ## Runtime symptoms (no deploy failure) ### APP_RETURNS_502_503 - **Stage:** verify / runtime - **Retryable:** no - **Symptoms:** URL loads but ALB returns 502 or 503 - **Likely cause:** ECS tasks unhealthy or not listening on correct port - **Quick checks:** 1. Runtime health in Overview 2. CloudWatch logs 3. 
`PORT` and bind address - **Fix:** [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md) - **Agent signals:** `502`, `503`, `unhealthy`, `ALB` ### RUNTIME_DEGRADED - **Stage:** runtime - **Symptoms:** Status `running` but health `degraded` - **Likely cause:** Partial infrastructure or app probe failure - **Quick checks:** Deployment Agent → runtime health; ECS vs ALB signals - **Fix:** [Runtime Health](./RUNTIME_HEALTH.md) - **Agent signals:** `degraded`, `health`, `ECS`, `ALB` --- ## Scan and build verification ### SCAN_BUILD_VERIFICATION_FAILED - **Stage:** build (scan) - **Symptoms:** Smart Analysis completes with `build_status: failed` - **Likely cause:** Railpack plan does not build at scanned commit - **Quick checks:** `build_verification.log_excerpt`, `repair_history` - **Fix:** Improve scan; fix repo; re-scan before deploy - **Related:** [Smart Analysis](./SMART_ANALYSIS.md), [Build Failures](./BUILD_FAILURES.md) - **Agent signals:** `scan`, `verification`, `repair`, `Railpack` ### MISSING_RAILPACK_PLAN - **Stage:** build (preview) - **Exact strings:** `Missing Railpack plan for` - **Symptoms:** Blueprint warning; deploy may fail at buildspec generation - **Likely cause:** Scan incomplete or wrong service path - **Fix:** Re-run Smart Analysis on correct service - **Related:** [Railpack](./RAILPACK.md) --- ## GitHub and access ### GITHUB_NOT_CONNECTED - **Exact strings:** `GitHub not connected` - **Symptoms:** Scan or deploy actions blocked - **Fix:** Sign in with GitHub or link GitHub account - **Agent signals:** `GitHub`, `connected`, `OAuth` ### REDIRECTED_TO_WAITING_LIST - **Symptoms:** Sign-in succeeds then `/waiting-list` - **Likely cause:** Email not approved on this instance - **Fix:** Request access from platform operator - **Agent signals:** `waiting list`, `approved` --- ## Domain and DNS ### CUSTOM_DOMAIN_NOT_RESOLVING - **Symptoms:** Visit URL does not resolve or wrong site - **Quick checks:** 1. Subdomain spelling in config 2. 
`dig` / `nslookup` for hostname 3. Redeploy after subdomain change - **Fix:** [Domain and TLS Issues](./DOMAIN_AND_TLS_ISSUES.md) - **Agent signals:** `DNS`, `domain`, `resolve`, `TLS` --- ## Deployment Agent ### DEPLOYMENT_AGENT_OFFLINE - **Exact strings:** `The deployment agent is offline right now` - **Symptoms:** Agent button returns immediately with offline message - **Likely cause:** WebSocket worker disconnected - **Quick checks:** System health indicator in header - **Fix:** Refresh page; wait for worker recovery - **Agent signals:** `agent offline`, `WebSocket`, `worker` ### DEPLOYMENT_AGENT_TOOL_LIMIT - **Exact strings:** `couldn't finish the inspection within the current tool-call limit` - **Symptoms:** Agent stops after partial answer - **Fix:** Ask narrower question; use History + Analyze failure - **Related:** [Deployment Agent](./DEPLOYMENT_AGENT.md) - **Agent signals:** `tool limit`, `inspection` --- ## Docker registry ### DOCKERHUB_RATE_LIMIT_429 - **Exact strings:** `429`, `toomanyrequests`, rate limit in CodeBuild logs - **Symptoms:** Build fails pulling base images - **Fix:** Retry later; use authenticated registry pulls - **Related:** [Build Failures](./BUILD_FAILURES.md) - **Agent signals:** `429`, `Docker Hub`, `rate limit` --- ## Notes for retrieval quality - Prefer exact quoted error strings in search indexes - Tag entries with `user_facing: true` and `stage` - Append new incidents; avoid renaming stable codes - Cross-link to deep guides for fixes --- # docs/FAQ.md # FAQ ## Getting started ### What is Smart Deploy? A preview-driven deployment platform. Scan a repo, review a blueprint, edit config, then deploy to AWS (ECS or static S3). See [What is Smart Deploy](./WHAT_IS_SMART_DEPLOY.md). ### How do I deploy my first app? Connect GitHub → open repo → detect services → Smart Analysis → review blueprint → deploy. See [Getting Started](./GETTING_STARTED.md). ### Why am I sent to `/waiting-list` after sign-in? 
Your email is not on the approved users list for this Smart Deploy instance. Contact the platform operator for access. ## GitHub and repos ### Can I deploy without GitHub? GitHub is required for repo scanning, cloning, and deploys from Git repositories. ### "GitHub not connected" — what does that mean? Your session has no linked GitHub OAuth token. Sign in with GitHub or link GitHub in account settings. ### Does Smart Deploy support monorepos? Yes. It detects workspace packages, compose dirs, and multiple services. Each service can have its own deployment. See [Monorepos and Multi-Service](./MONOREPOS_AND_MULTI_SERVICE.md). ## Scan and Railpack ### What is Smart Analysis? The repo scan that detects deploy shape, generates Railpack plans, and optionally verifies builds. See [Smart Analysis](./SMART_ANALYSIS.md). ### What is Railpack? The default build system that produces container images from your repo without a Dockerfile. It uses Mise for runtimes. See [Railpack](./RAILPACK.md). ### Why did build verification fail? Dependency errors, wrong runtime version, or monorepo path issues. Review scan logs and `repair_history`, then try Improve scan. See [Build Failures](./BUILD_FAILURES.md). ## Deploy and runtime ### ECS vs static S3 — how is it chosen? Server apps and containers go to ECS. Plain static files and build-only SPAs (no Railpack start command) go to S3. See [How It Works](./HOW_IT_WORKS.md). ### My app works locally but deploy fails — why? Common causes: missing env vars at build/runtime, wrong port binding, lockfile not committed, or Node/Python version mismatch. See [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md). ### Deploy succeeded but URL returns 502/503 Usually runtime startup failure — port, missing `DATABASE_URL`, or crash on boot. Check CloudWatch logs and health probes. See [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md). ### Which port should my app listen on? Use `PORT` from the environment. Bind `0.0.0.0`, not `127.0.0.1`. 
Default depends on framework (often 3000 for Node). ### How do env vars work? Build-time vars go to CodeBuild; runtime vars on ECS go to Secrets Manager. Redeploy after runtime changes. See [Environment Variables](./ENVIRONMENT_VARIABLES.md). ## URLs and domains ### How is my deployment URL generated? `https://{hosted-subdomain}.{deployment-domain}` — you pick the subdomain in config. See [Custom Domains](./CUSTOM_DOMAINS.md). ### Domain not loading after deploy? Wait for DNS propagation, confirm subdomain spelling, redeploy after DNS-related config changes. See [Domain and TLS Issues](./DOMAIN_AND_TLS_ISSUES.md). ## Debugging and AI ### What does the Agent button do? Opens the **Deployment Agent** — read-only AI that inspects your deployments, history, and health. See [Deployment Agent](./DEPLOYMENT_AGENT.md). ### Why is the Deployment Agent offline? The WebSocket worker is disconnected. Refresh the page; check system health indicator in the header. ### Agent says it hit the tool-call limit Ask a narrower question or open Deployment History for full logs. Use Analyze failure for one failed run. See [AI Assistance](./AI_ASSISTANCE.md). ### How do I roll back? Pick a successful entry in Deployment History and confirm rollback. Redeploys that commit; keeps current env vars. See [Deployment History and Rollback](./DEPLOYMENT_HISTORY_AND_ROLLBACK.md). ## Where to start when stuck 1. [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md) 2. [Error Catalog](./ERROR_CATALOG.md) 3. Deployment Agent in the header --- # docs/GETTING_STARTED.md # Getting Started This guide walks through your first deployment on Smart Deploy. ## Prerequisites - A GitHub account linked to Smart Deploy - Access approved for your email (if your instance uses a waiting list) - A repository Smart Deploy can scan (Node, Python, Go, Docker, static site, or monorepo) ## 1. Sign in and connect GitHub Sign in with GitHub so Smart Deploy can list repos, read branches, and clone code for scans and deploys. 
If repo actions fail with **GitHub not connected**, link GitHub from your account settings and retry. ## 2. Open a repository From the dashboard, open `owner/repo`. Smart Deploy loads the repo and any existing deployments. ## 3. Detect services If no services appear, run **Detect services**. Smart Deploy walks the repo and lists deployable units — root app, monorepo packages, compose directories, or Dockerfile folders. For monorepos, you may see multiple services. Each gets its own deployment row. See [Monorepos and Multi-Service](./MONOREPOS_AND_MULTI_SERVICE.md). ## 4. Select a service and run Smart Analysis Pick the service you want to deploy (for example `web` or `.` for the root app). Run **Smart Analysis** (scan). Progress moves through: 1. Scanner — resolve commit and scope 2. Clone repo 3. Classifier — deploy shape and units 4. Railpack prepare — build plan 5. Deploy briefing — operator summary 6. Build and repair — verify build (when enabled) 7. Finalize When the scan completes, review **build status** and the deploy briefing before deploying. See [Smart Analysis](./SMART_ANALYSIS.md). ## 5. Review the blueprint Open the **blueprint** to see the full pipeline before anything runs: - Which branch and commit will deploy - Build units and artifacts (Railpack plan or Dockerfile) - AWS region and target (ECS or static S3) - Env vars and subdomain - Final public URL Adjust branch, region, env vars, or hosted subdomain from preview if needed. See [Blueprint and Preview](./BLUEPRINT_AND_PREVIEW.md). ## 6. Deploy When the preview looks right, start the deploy. Watch live step logs: - Auth → Build → Publish → Deploy → Rollout → Verify → Done On success you get a **Visit** URL like `https://your-service.yourdomain.com`. See [Deployment Pipeline](./DEPLOYMENT_PIPELINE.md). ## 7. 
Confirm it is running - **Overview** — URL, screenshot, runtime health sparkline - **Logs** — deploy steps or ECS CloudWatch tail for running services - **History** — all deploy attempts Ask the **Deployment Agent** (header **Agent** button): *"Is my service healthy right now?"* ## If something fails 1. Open **Deployment History** and find the failed run 2. Read the first error line in the failed step's logs 3. Ask the **Deployment Agent**: *"Why did my last deployment fail?"* 4. Use **Analyze failure** on that history entry for a deeper explanation 5. Follow [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md) ## Common first-deploy issues | Symptom | Where to look | |---------|---------------| | Scan fails | [Smart Analysis](./SMART_ANALYSIS.md), [Build Failures](./BUILD_FAILURES.md) | | Deploy fails at Build | [Build Failures](./BUILD_FAILURES.md), [Railpack](./RAILPACK.md) | | Deploy succeeds but URL unhealthy | [Health Checks](./HEALTH_CHECKS.md), [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md) | | Wrong Node/Python version | [Railpack](./RAILPACK.md) — version files and `RAILPACK_PACKAGES` | ## Next steps - [Environment Variables](./ENVIRONMENT_VARIABLES.md) — configure build and runtime - [Custom Domains](./CUSTOM_DOMAINS.md) — how URLs are formed - [FAQ](./FAQ.md) --- # docs/GLOSSARY.md # Glossary | Term | Definition | |------|------------| | **Blueprint** | Visual preview of the deploy pipeline before anything runs | | **Build verification** | SD Artifacts step that test-builds the Railpack plan during scan | | **CodeBuild** | AWS service that runs the Docker/static build during deploy | | **Deploy briefing** | Markdown summary produced at end of Smart Analysis | | **Deploy shape** | Scan classification: `static`, `static_build`, `server`, `multi`, `existing_docker` | | **Deploy unit** | One buildable unit in a scan (name, root path, Railpack plan, port) | | **Deployment Agent** | Read-only AI that inspects deployments via tools (list, 
details, history, health) | | **Deployment target** | `ecs` (Fargate container) or `static_s3` (static hosting) | | **ECS Fargate** | AWS serverless containers used for Railpack and Dockerfile deploys | | **Failure code** | Structured deploy failure identifier (for example `DEPLOYMENT_VERIFICATION_FAILED`) | | **Failure classification** | Summary, likely cause, evidence, and retryable flag for a failed run | | **Hosted subdomain** | User-chosen hostname label on the platform domain | | **Improve scan** | Feedback flow to SD Artifacts to fix scan/build plan after failures | | **Mise** | Runtime/toolchain manager used inside Railpack builds | | **Package path** | Repo-relative directory scoped for a service scan (for example `apps/web`) | | **Railpack** | Build system generating plans and container images from repo analysis | | **Railpack plan** | JSON build spec (`steps`, `deploy.startCommand`) used as CodeBuild input | | **Release artifact** | Stored ECR image or S3 path metadata for a successful deploy | | **Runtime health** | Ongoing probes of app HTTP, ECS, and ALB after deploy succeeds | | **SD Artifacts** | Backend service for scan, Railpack, build verification, and improve-scan | | **Service catalog** | List of detected deployable services on a repo page | | **Smart Analysis** | User-facing name for the repo scan / analyze stream | | **Verify step** | Post-deploy HTTP probes until app responds or timeout | | **WebSocket worker** | Process that runs deploys, streams logs, reconciles health, runs Deployment Agent | --- # docs/HEALTH_CHECKS.md # Health Checks Smart Deploy verifies your app is reachable after deploy and continues probing running deployments for runtime health. 
## Post-deploy verification

After ECS rollout (or static publish), the deploy pipeline runs **verification**:

| Setting | Value |
|---------|-------|
| **Paths probed** | `/`, `/health`, `/healthz`, `/api/health` |
| **Success** | HTTP 2xx or 3xx on any path |
| **Window** | Up to ~5 minutes with multiple rounds |
| **On failure** | Deploy status `failed`, code `DEPLOYMENT_VERIFICATION_FAILED` |

Verification logs include ECS service events and filtered CloudWatch excerpts on failure.

## Adding a health endpoint (recommended)

Expose a lightweight route that returns 200 when your app is ready:

```javascript
// Express example
app.get('/health', (_req, res) => res.status(200).send('ok'));
```

```python
# FastAPI example
@app.get("/health")
def health():
    return {"status": "ok"}
```

If your app only serves SPA routes, `/` may return 200 once the server is up, so verification can succeed without a dedicated `/health`.

## Database-dependent health

For production-grade checks, verify critical dependencies:

```javascript
app.get('/health', async (_req, res) => {
  try {
    await db.ping(); // fail the check when the database is unreachable
    res.status(200).send('ok');
  } catch {
    res.status(503).send('unavailable');
  }
});
```

Keep checks fast, because verification runs repeatedly during the deploy window.

## Static sites

Static S3 deploys verify the public URL serves expected content (HTTP success on probed paths). There is no container process, so ensure `index.html` is at the correct S3 prefix.

## Common verification failures

| Symptom | Likely cause |
|---------|--------------|
| Connection refused | App not listening on `PORT`; container crashed on start |
| 502/503 from ALB | Tasks not healthy; app still starting or wrong port |
| 404 on all paths | Wrong start command or missing build output |
| Timeout | Slow cold start; make the app ready sooner or optimize startup |

See [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md).

## Runtime health (ongoing)

Separate from deploy verification, a background reconciler probes **running** deployments every ~10 minutes.
See [Runtime Health](./RUNTIME_HEALTH.md). ## Related - [Deployment Pipeline](./DEPLOYMENT_PIPELINE.md) - [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md) - [Error Catalog](./ERROR_CATALOG.md) — `DEPLOYMENT_VERIFICATION_FAILED` --- # docs/HOW_IT_WORKS.md # How It Works Smart Deploy connects your GitHub repo to AWS through a predictable pipeline: **scan → preview → deploy → monitor**. ```mermaid flowchart LR A[Connect GitHub] --> B[Detect services] B --> C[Smart Analysis] C --> D[Blueprint preview] D --> E{Plan OK?} E -->|Edit| D E -->|Yes| F[Deploy] F --> G[Logs and health] ``` ## Components you interact with | Piece | What it does for you | |-------|----------------------| | **Dashboard** | Lists repos and deployments | | **Repo page** | Service catalog, scan, and deploy workspace per repo | | **Blueprint** | Visual preview of the full deploy path | | **Deploy workspace** | Live logs, overview, history, and config while deploying or running | | **Deployment Agent** | Read-only AI that inspects your deployments via tools | ## Behind the scenes | Component | Role | |-----------|------| | **SD Artifacts** | Scans repos, generates Railpack plans, verifies builds, supports improve-scan feedback | | **WebSocket worker** | Runs long deploy jobs, streams logs, reconciles runtime health | | **AWS** | CodeBuild, ECR, ECS Fargate, ALB, Route 53, S3, CloudFront, Secrets Manager, CloudWatch | | **Database** | Stores deployments, scan results, history, and agent conversations | You do not configure these primitives directly — Smart Deploy orchestrates them from your repo scan and deployment config. ## Scan → plan 1. **Service detection** discovers deployable units (monorepo packages, compose dirs, Dockerfiles, root apps). 2. 
**Smart Analysis** clones your repo at a commit and runs SD Artifacts: - Classifies **deploy shape** (`server`, `static`, `static_build`, `multi`, `existing_docker`) - Generates **Railpack plans** (or uses your Dockerfile) - Optionally **verifies the build** and attempts repair 3. Results are stored and linked to your deployment as the source of truth for preview and deploy. See [Smart Analysis](./SMART_ANALYSIS.md) and [Railpack](./RAILPACK.md). ## Preview → configure The blueprint turns scan results into a five-step pipeline you can read and edit: 1. Auth and resolve ref (repo, branch, commit) 2. Build (deploy units, CodeBuild output) 3. Setup (region, networking or static bucket) 4. Deploy (env vars, routing, secrets) 5. Done (hosted URL) See [Blueprint and Preview](./BLUEPRINT_AND_PREVIEW.md). ## Deploy → release When you deploy, the WebSocket worker runs the pipeline: - **Container path**: CodeBuild → ECR → ECS Fargate → ALB host rule → Route 53 → HTTP verification - **Static path**: CodeBuild → S3 sync → optional CloudFront invalidation Logs stream to the UI in real time. Each attempt is recorded in deployment history. See [Deployment Pipeline](./DEPLOYMENT_PIPELINE.md). ## Monitor → debug After deploy: - **Runtime health** probes your URL and ECS/ALB signals on a schedule - **Deployment history** keeps every attempt with step logs and failure classification - **Deployment Agent** can list deployments, load details, history, and health on demand See [Runtime Health](./RUNTIME_HEALTH.md) and [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md). 
## Deploy routing (ECS vs static S3) | Scan signal | Target | |-------------|--------| | `deploy_shape: static` | S3 | | `deploy_shape: static_build` with no Railpack start command | S3 | | `deploy_shape: server`, `multi`, `existing_docker`, or static with runtime | ECS Fargate | Railpack's `deploy.startCommand` is the key fork for `static_build` — build-only SPAs go to S3; apps that serve built assets at runtime go to ECS. ## Related - [Getting Started](./GETTING_STARTED.md) - [Deployment Pipeline](./DEPLOYMENT_PIPELINE.md) - [Glossary](./GLOSSARY.md) --- # docs/MONOREPOS_AND_MULTI_SERVICE.md # Monorepos and Multi-Service Smart Deploy detects multiple deployable units in one repository and lets you deploy each as a separate service with its own scan, config, and URL. ## How services are discovered The service catalog merges several sources (in order): 1. **Docker Compose directories** — one catalog row per compose folder (collapsed for UI) 2. **Monorepo packages** — `apps/*`, `services/*`, `packages/*`, `modules/*`, `projects/*` when workspace tooling exists 3. **Dockerfile directories** — any folder with a `Dockerfile` 4. **Root siblings** — immediate subdirs like `api/` + `web/` without a root workspace 5. **Root app** — optional extra row when the repo root is also an app Libraries under `packages/*` without `start`, `serve`, or `dev` scripts are skipped. ## Monorepo indicators Detection runs when the repo root has: - `pnpm-workspace.yaml`, or - `package.json` with `workspaces`, or - `turbo.json`, `nx.json`, or `lerna.json` ## Deploying a specific package Each service row has a **package path** (for example `apps/web`). 
Smart Analysis scopes the scan to that path: - Classifier and Railpack prepare target the package directory - Deploy units filter to services under that path - You get a separate deployment, subdomain, and history per service ## Multi-service compose repos | UI catalog | Deploy-time behavior | |------------|---------------------| | One row per compose directory | May expand to one deploy unit per compose service | | Collapsed summary in repo cards | Full unit list in scan `deploy_units` | Root compose with app + database services: database-only compose services are filtered out. ## Multiple services, multiple URLs Each service gets: - Its own `serviceName` (for example `web`, `api`) - Its own hosted subdomain (must be unique on the platform) - Its own ALB host rule: `https://{subdomain}.{domain}` Plan subdomains before deploying several services from one monorepo. ## Choosing ECS vs S3 per service Routing is per service based on that service's scan: - Frontend `static_build` without start command → S3 - API `server` shape → ECS - Mixed monorepos commonly have both targets in one repo ## Adding a service manually You can add a repo-relative directory as a new service root. Smart Deploy validates the path and infers language/framework for that folder. ## Tips | Scenario | Recommendation | |----------|----------------| | Turborepo `apps/web` + `apps/api` | Deploy each app as its own service | | Compose at repo root | One catalog entry; review expanded units in scan | | Only want one service from a big repo | Select the right package path before scanning | | Shared env vars | Set per deployment; no automatic sharing across services | ## Related - [Smart Analysis](./SMART_ANALYSIS.md) - [Railpack](./RAILPACK.md) — `RAILPACK_SPA_OUTPUT_DIR` for monorepo SPAs - [Environment Variables](./ENVIRONMENT_VARIABLES.md) --- # docs/RAILPACK.md # Railpack **Railpack** is the default build system for most apps on Smart Deploy. 
It generates a build plan at scan time and produces a container image at deploy time, without you writing a Dockerfile.

Smart Deploy does not call Railpack directly in the UI. SD Artifacts runs Railpack during Smart Analysis; CodeBuild executes the plan during deploy.

## Railpack + Mise

[Railpack is built on Mise](https://railpack.com/config/mise/). Mise manages language runtimes and tools inside the built image:

- Auto-detects version files: `.node-version`, `.python-version`, `mise.toml`, `.tool-versions`, `.nvmrc`, `.go-version`, etc.
- Installs the correct Node, Python, Ruby, Go, and other toolchains during the install step
- Railpack writes global Mise config to `/etc/mise/config.toml` in the image; your repo's `mise.toml` can override it

You typically configure runtimes via repo files rather than touching Mise directly.

## Railpack plan structure

Stored in scan results as `railpack_plan`:

```json
{
  "steps": [
    { "name": "install", "commands": [{ "cmd": "npm ci" }] },
    { "name": "build", "commands": [{ "cmd": "npm run build" }] }
  ],
  "deploy": {
    "startCommand": "npm run start",
    "variables": {}
  }
}
```

| Part | Role |
|------|------|
| `steps[]` entry named `install` | Dependencies and toolchain setup |
| `steps[]` entry named `build` | Compile or bundle the app |
| `deploy.startCommand` | Container command on ECS; also determines ECS vs S3 for `static_build` |
| `deploy.variables` | Build-time variables embedded in the plan |

## Three build paths

| Path | When |
|------|------|
| **Railpack → ECS** | Server apps, static with runtime, most monorepo services |
| **Railpack → S3** | `static_build` with no `deploy.startCommand` |
| **Dockerfile → ECS** | `existing_docker`: repo Dockerfile, Railpack skipped |

## How deploy uses the plan

CodeBuild:

1. Decodes the Railpack plan JSON from the scan
2. Runs `docker buildx build -f /tmp/railpack-plan.json`
3. Pushes the image to ECR

ECS runs `deploy.startCommand` as the container entrypoint.
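The way `deploy.startCommand` decides between ECS and S3 can be sketched in a few lines. This is an illustration under stated assumptions, not Smart Deploy's actual routing code: `choose_target` is a hypothetical function name, and the return values reuse the `ecs` and `static_s3` target names from the glossary.

```python
import json


def choose_target(deploy_shape, railpack_plan=None):
    """Return 'ecs' or 'static_s3' for a scanned service (illustrative sketch only)."""
    deploy = (railpack_plan or {}).get("deploy") or {}
    start_command = deploy.get("startCommand")

    if deploy_shape == "static":
        return "static_s3"
    if deploy_shape == "static_build" and not start_command:
        # Build-only SPA: no process to run, so the build output goes to S3
        return "static_s3"
    # server, multi, existing_docker, or static_build with a start command
    return "ecs"


# Sample plan in the shape shown in "Railpack plan structure"
plan = json.loads("""
{
  "steps": [
    { "name": "install", "commands": [{ "cmd": "npm ci" }] },
    { "name": "build", "commands": [{ "cmd": "npm run build" }] }
  ],
  "deploy": { "startCommand": "npm run start", "variables": {} }
}
""")
print(choose_target("static_build", plan))
```

With the sample plan, the `static_build` service goes to ECS because a start command is present. Removing `startCommand` from the plan would send the same service to S3.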
## Editing commands

From scan results you can override install, build, and start commands before deploy. Changes update the in-session plan used for the next deploy.

Treat overrides as a temporary workaround; prefer fixing the repo (scripts, `package.json`, version files).

## Railpack environment variables

Set these in Smart Deploy env vars (build-time unless noted):

| Variable | Description |
|----------|-------------|
| `RAILPACK_PACKAGES` | Extra Mise packages: `node@22 python@3.12 jq@latest` |
| `RAILPACK_INSTALL_CMD` | Override install step |
| `RAILPACK_BUILD_CMD` | Override build step |
| `RAILPACK_START_CMD` | Override start command |
| `RAILPACK_BUILD_APT_PACKAGES` | Extra apt packages at build time |
| `RAILPACK_DEPLOY_APT_PACKAGES` | Extra apt packages in final image |
| `RAILPACK_SPA_OUTPUT_DIR` | Monorepo SPA output path for static artifact routing |

See [Railpack env var docs](https://railpack.com/config/environment-variables/).

## Version pinning

| Method | Example |
|--------|---------|
| `.node-version` | `22` |
| `.python-version` | `3.12` |
| `mise.toml` | `[tools] node = "22"` |
| `RAILPACK_PACKAGES` | `node@22` in env vars |

Wrong runtime version after deploy? Check these before overriding commands.

## Common issues

| Symptom | Check |
|---------|-------|
| Build fails at `npm ci` / `pip install` | Dependency files, lockfiles, private registry tokens |
| Wrong Node/Python | Version files, `RAILPACK_PACKAGES`, `mise.toml` |
| No start command on SPA | Expected for S3 path; add start command if you need ECS runtime |
| Plan missing | Re-run Smart Analysis; check classifier logs |

See [Build Failures](./BUILD_FAILURES.md).

## Related

- [Smart Analysis](./SMART_ANALYSIS.md)
- [Environment Variables](./ENVIRONMENT_VARIABLES.md)
- [Deployment Pipeline](./DEPLOYMENT_PIPELINE.md)

---

# docs/README.md

# Smart Deploy Documentation

User-facing guides for deploying and debugging apps on Smart Deploy.
These docs focus on **your deployments**, not platform self-hosting.

## Learn

| Guide | Summary |
|-------|---------|
| [What is Smart Deploy](./WHAT_IS_SMART_DEPLOY.md) | Why the platform exists and who it is for |
| [How It Works](./HOW_IT_WORKS.md) | Scan → blueprint → deploy → monitor at a high level |
| [Getting Started](./GETTING_STARTED.md) | Connect a repo and ship your first deployment |
| [Glossary](./GLOSSARY.md) | Key terms used across the product |

## Deploy

| Guide | Summary |
|-------|---------|
| [Deployment Pipeline](./DEPLOYMENT_PIPELINE.md) | Stages from clone to live URL (ECS and static S3) |
| [Blueprint and Preview](./BLUEPRINT_AND_PREVIEW.md) | Read and edit the deploy plan before you ship |
| [Smart Analysis](./SMART_ANALYSIS.md) | Repo scan, deploy shapes, and build verification |
| [Railpack](./RAILPACK.md) | How apps are built, including Mise runtimes |
| [Monorepos and Multi-Service](./MONOREPOS_AND_MULTI_SERVICE.md) | Multiple services per repo and package paths |
| [Environment Variables](./ENVIRONMENT_VARIABLES.md) | Build-time vs runtime configuration |
| [Custom Domains](./CUSTOM_DOMAINS.md) | Deployment URLs, subdomains, DNS, and HTTPS |

## Debug

| Guide | Summary |
|-------|---------|
| [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md) | Step-by-step runbook for failed or unhealthy deploys |
| [Deployment Agent](./DEPLOYMENT_AGENT.md) | AI inspector for status, history, and health |
| [AI Assistance](./AI_ASSISTANCE.md) | When to use the agent, Analyze failure, and Improve scan |
| [Deployment Logs](./DEPLOYMENT_LOGS.md) | Live logs, history, and ECS CloudWatch |
| [Health Checks](./HEALTH_CHECKS.md) | Post-deploy verification and app health endpoints |
| [Runtime Health](./RUNTIME_HEALTH.md) | Ongoing health signals and status meanings |
| [Deployment History and Rollback](./DEPLOYMENT_HISTORY_AND_ROLLBACK.md) | Past attempts and manual rollback |
| [Build Failures](./BUILD_FAILURES.md) | CodeBuild, Railpack, and Docker build errors |
| [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md) | Crashes, ports, and 502/503 after deploy |
| [Domain and TLS Issues](./DOMAIN_AND_TLS_ISSUES.md) | DNS propagation and certificate problems |

## Reference

| Guide | Summary |
|-------|---------|
| [FAQ](./FAQ.md) | Common questions |
| [Error Catalog](./ERROR_CATALOG.md) | Symptom-first errors with checks and fixes |
| [Deployment Status Reference](./DEPLOYMENT_STATUS_REFERENCE.md) | Status vocabulary, stages, and failure codes |

## AI agents

- Index: [/llms.txt](/llms.txt)
- Full snapshot: [/llms-full.txt](/llms-full.txt)

---

# docs/RUNTIME_HEALTH.md

# Runtime Health

After a successful deploy, Smart Deploy tracks **runtime health** — whether your app stays reachable and ECS/ALB signals look normal.

## Where to see it

- **Overview** tab — status badge and health sparkline
- **Deployment Agent** — "Is my service healthy right now?" uses `get_runtime_health`

## Health states

| Status | Meaning |
|--------|---------|
| **healthy** | App probe succeeded; ECS/ALB signals nominal |
| **degraded** | Partial failure — for example app up but ALB targets unhealthy |
| **unreachable** | HTTP probe failed or ECS desired ≠ running |
| **unknown** | Not enough recent samples or deployment not running |

Statuses use **anti-flap** logic — brief blips do not immediately flip `healthy` to `unreachable`.

## What gets probed

Each reconciliation cycle (~10 minutes) collects:

| Signal | Source |
|--------|--------|
| **App HTTP** | GET to deployment URL (same paths as verify) |
| **ECS** | Desired vs running task count, rollout state |
| **ALB** | Healthy vs unhealthy target count |

Samples are stored in runtime health history and exposed via API for charts and the Deployment Agent.
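The health states and the anti-flap behavior described above can be illustrated with a small sketch. This is not Smart Deploy's implementation: the sample field names (`httpOk`, `ecsDesired`, `ecsRunning`, `albUnhealthy`), the function names, and the rule "change status only after 3 consecutive samples agree" are all assumptions chosen to show the idea. The real thresholds are not documented here.

```javascript
// Hypothetical sample shape: { httpOk, ecsDesired, ecsRunning, albUnhealthy }.
// Maps one probe sample to a health state, following the Health states table.
function classifySample(sample) {
  if (!sample) return "unknown";
  if (!sample.httpOk || sample.ecsRunning !== sample.ecsDesired) return "unreachable";
  if (sample.albUnhealthy > 0) return "degraded"; // app up, but ALB targets unhealthy
  return "healthy";
}

// Anti-flap sketch: keep the current status unless the last `needed`
// samples all classify to the same state. `needed = 3` is an assumed value.
function stableStatus(current, samples, needed = 3) {
  if (samples.length < needed) return current;
  const recent = samples.slice(-needed).map(classifySample);
  const allAgree = recent.every((state) => state === recent[0]);
  return allAgree ? recent[0] : current;
}

const healthySample = { httpOk: true, ecsDesired: 2, ecsRunning: 2, albUnhealthy: 0 };
const failedSample = { httpOk: false, ecsDesired: 2, ecsRunning: 2, albUnhealthy: 0 };

// One failed probe after two good ones does not change the status.
console.log(stableStatus("healthy", [healthySample, healthySample, failedSample]));
// Three failed probes in a row do.
console.log(stableStatus("healthy", [failedSample, failedSample, failedSample]));
```

With a ~10 minute reconciliation cycle, a rule like this would trade detection speed for stability, which is the usual cost of anti-flap logic.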
## Deploy status vs runtime health

| Field | When set |
|-------|----------|
| Deployment `status: running` | Last **deploy** succeeded |
| Runtime `degraded` / `unreachable` | **Ongoing** probes failing after deploy |

A deployment can show `running` while runtime health is `degraded` — the release deployed but the app is misbehaving now.

## Debugging degraded health

1. Open **Logs** → ECS CloudWatch tail
2. Ask Deployment Agent for runtime health entries (HTTP code, latency)
3. Check recent deploy or config change in History
4. Follow [Startup and Runtime Failures](./STARTUP_AND_RUNTIME_FAILURES.md)

## ECS-specific signals

| Signal | Interpretation |
|--------|----------------|
| `running < desired` | Tasks crashing or failing health checks |
| `rolloutState: FAILED` | ECS deployment circuit breaker or failed rollout |
| Unhealthy ALB targets | Port mismatch, app not listening on `PORT`, or slow startup |

## Related

- [Health Checks](./HEALTH_CHECKS.md)
- [Deployment Agent](./DEPLOYMENT_AGENT.md)
- [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md)

---

# docs/SMART_ANALYSIS.md

# Smart Analysis

**Smart Analysis** is the repo scan that runs before you deploy. It detects how your app should be built, generates Railpack plans (or recognizes a Dockerfile), and optionally verifies the build.

## When to run it

- First time deploying a service
- After significant repo changes (new framework, Dockerfile, monorepo layout)
- When a deploy failed due to build or scan issues — then use **Improve scan** with failure context

Re-scanning updates `scanResults` linked to your deployment and refreshes the blueprint.
## Scan progress nodes

| Node | Description |
|------|-------------|
| **Scanner** | Resolve commit and repo scope |
| **Clone repo** | Check out repository at commit |
| **Classifier** | Detect deploy shape and deploy units |
| **Railpack prepare** | Generate Railpack build plan per unit |
| **Deploy briefing** | Operator summary (markdown) |
| **Build and repair** | Verify build; AI repair loop when enabled |
| **Finalize** | Schema version and final build status |

Watch scan logs in the UI for the first `❌` or error line if a node fails.

## Deploy shapes

| Shape | Meaning |
|-------|---------|
| `static` | Plain static files, no build step |
| `static_build` | SPA or static site with a build step |
| `server` | Server app built and run as a container |
| `multi` | Multiple deploy units (compose-style) |
| `existing_docker` | Uses repo Dockerfile instead of Railpack |

Deploy routing uses shape plus Railpack `deploy.startCommand` to choose ECS vs S3. See [How It Works](./HOW_IT_WORKS.md).

## Scan result fields

Key fields in the stored analysis response:

| Field | Purpose |
|-------|---------|
| `deploy_units[]` | Name, root, type, framework, port, `railpack_plan` |
| `build_status` | `passed`, `failed`, `skipped`, etc. |
| `build_verification` | Verification attempt logs and message |
| `repair_history[]` | AI repair attempts if build failed initially |
| `deploy_briefing` | Human-readable scan summary |
| `railpack_version` | Railpack version used for the plan |

## Build verification

When enabled, SD Artifacts runs a test build after plan generation. Outcomes:

- **Passed** — safer to deploy; blueprint shows green build status
- **Failed** — review `build_verification.log_excerpt` and `repair_history`
- **Skipped** — verification not run for this scan

Use **Improve scan** to send failure logs back to SD Artifacts for remediation.

## Improve scan (feedback)

After a failed deploy or failed verification:

1. Open Improve scan from scan results
2. Add context about what failed
3. SD Artifacts re-analyzes with your failure evidence
4. Review updated plan before redeploying

## Package path (monorepos)

Scans are scoped to a **package path** per service (for example `apps/web`). The classifier and Railpack prepare run in that scope while still understanding repo layout.

See [Monorepos and Multi-Service](./MONOREPOS_AND_MULTI_SERVICE.md).

## Related

- [Railpack](./RAILPACK.md)
- [Build Failures](./BUILD_FAILURES.md)
- [Blueprint and Preview](./BLUEPRINT_AND_PREVIEW.md)

---

# docs/STARTUP_AND_RUNTIME_FAILURES.md

# Startup and Runtime Failures

These issues appear when the **build succeeded** but the app does not stay healthy — verify fails, ECS tasks exit, or the URL returns 502/503.

## Symptom map

| Symptom | Likely stage |
|---------|--------------|
| Verify step fails | App not responding on probed paths within timeout |
| Deploy `running` then `degraded` | Crashes after initial success |
| ALB 502/503 | No healthy targets — wrong port or crashing process |
| ECS tasks stop repeatedly | Exit on boot — missing env, DB connection, wrong CMD |

Failure code for verify: **`DEPLOYMENT_VERIFICATION_FAILED`**

## Port binding

ECS expects your container to listen on the port from the scan/Railpack plan (commonly `3000` for Node, `8000` for Python).

| Check | Action |
|-------|--------|
| App binds `localhost` only | Bind `0.0.0.0` |
| Hardcoded port | Use `process.env.PORT` or plan port |
| Railpack start command | Must start the server, not just build |

Example:

```javascript
// `app` stands for your framework's server instance (for example, an Express app).
const port = process.env.PORT || 3000;
// Bind 0.0.0.0 so traffic from outside the container (the ALB) can reach the app.
app.listen(port, '0.0.0.0');
```

## Missing runtime environment

Container starts then exits — check ECS CloudWatch logs for:

- `DATABASE_URL` undefined
- connection refused to Postgres/Redis
- missing API keys

Set vars in Smart Deploy **runtime** env and **redeploy**. See [Environment Variables](./ENVIRONMENT_VARIABLES.md).
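A container that exits because a variable is undefined is easier to diagnose when the app fails fast with an explicit message in the logs. The following is a minimal sketch for a Node app, not something Smart Deploy requires: `missingEnv` and `assertEnv` are hypothetical helper names, and `API_KEY` is a placeholder variable; only `DATABASE_URL` comes from this guide.

```javascript
// Hypothetical helpers for illustration; adapt the variable list to your app.

// Returns the names of required variables that are unset or blank.
function missingEnv(required, env = process.env) {
  return required.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Throws one error naming every missing variable, so the first lines of
// the runtime logs explain why the task exited.
function assertEnv(required, env = process.env) {
  const missing = missingEnv(required, env);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}

// Typical use at the top of your entry file, before the server starts:
//   assertEnv(["DATABASE_URL", "API_KEY"]);
console.log(missingEnv(["DATABASE_URL", "API_KEY"], { DATABASE_URL: "postgres://example" }));
```

Checking every variable in one pass, instead of failing on the first, avoids a redeploy per missing value.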
## Wrong start command

Railpack `deploy.startCommand` must match how your app runs in production:

| Mistake | Fix |
|---------|-----|
| `npm run dev` in production | Use `npm run start` or `node dist/index.js` |
| Migrating DB on every boot without DB | Add migrations or fix `DATABASE_URL` |
| SPA static server on wrong path | Point serve command at `dist` or `build` output |

Override in scan results or fix repo scripts.

## Slow cold start

Verification waits up to ~5 minutes. If your app needs longer:

- Optimize startup (lazy init, smaller image)
- Add `/health` that returns 200 only when ready
- Reduce dependencies loaded at boot

See [Health Checks](./HEALTH_CHECKS.md).

## ECS diagnostics on verify failure

Failed verify appends to logs:

- ECS service events (task failed to start, unhealthy target)
- Filtered CloudWatch high-signal lines

Read these before re-running deploy blindly.

## Static sites

Runtime failures are rare — usually wrong S3 content or missing `index.html`:

- 404 on all routes → build output path wrong
- Blank page → JS bundle path wrong for asset prefix

Check `RAILPACK_SPA_OUTPUT_DIR` and build logs.

## Debugging workflow

1. Deployment History → Verify step logs
2. Logs tab → CloudWatch runtime tail
3. Deployment Agent → runtime health (HTTP status, ECS counts)
4. Compare env vars to local `.env` that works
5. Roll back if outage is severe — [History and Rollback](./DEPLOYMENT_HISTORY_AND_ROLLBACK.md)

## Related

- [Health Checks](./HEALTH_CHECKS.md)
- [Runtime Health](./RUNTIME_HEALTH.md)
- [Debugging Deployments](./DEBUGGING_DEPLOYMENTS.md)

---

# docs/WHAT_IS_SMART_DEPLOY.md

# What is Smart Deploy?

Smart Deploy is a **preview-driven deployment platform** for solo developers. You scan a GitHub repo, review a live blueprint of what will run, adjust configuration in context, and deploy only when the plan makes sense.

**Preview the deploy. Then ship it.**

## The problem

Most deployment tools ask you to commit before you can see the plan.

- A PaaS moves fast, but the real deploy path stays hidden until something breaks.
- Raw cloud tooling gives control, but dumps the full surface area on you at once.
- Solo developers need a middle path: ship quickly without flying blind.

Smart Deploy is built around **preview**. You should know what will run, how traffic will flow, and which cloud resources are involved before you press deploy.

## What you get

| Capability | Why it matters |
|------------|----------------|
| **Repo scan** | Detects services, frameworks, and deploy shape automatically |
| **Blueprint preview** | Shows build units, routing, and cloud targets before anything runs |
| **Editable config** | Branch, region, env vars, and subdomain from the same preview surface |
| **Real cloud deploys** | ECS Fargate for containers, S3 (+ optional CloudFront) for static sites |
| **Live feedback** | Stream deploy logs, track history, and watch health update in place |
| **Deployment Agent** | Ask questions about your deployments, history, and runtime health |

## Who it is for

Smart Deploy targets developers who:

- Want PaaS-like speed without a black-box deploy path
- Ship from GitHub and need multi-service or monorepo support
- Prefer inspecting infrastructure before committing cloud resources
- Need structured debugging when production deploys fail

## What Smart Deploy is not

- **Not a generic CI runner** — deploys are opinionated paths to AWS (ECS or static S3).
- **Not a local dev environment** — it builds and runs your app in cloud primitives.
- **Not fully multi-cloud today** — production deploy code targets AWS; GCP paths are not active in the current deploy handler.

## Next steps

- [Getting Started](./GETTING_STARTED.md) — first deployment walkthrough
- [How It Works](./HOW_IT_WORKS.md) — architecture and data flow
- [Deployment Pipeline](./DEPLOYMENT_PIPELINE.md) — what happens when you deploy