CompliTru Self-Hosted Documentation Package

Architecture, security, deployment, and operations reference for the customer-owned CompliTru deployment.
CompliTru is a cloud security and compliance platform that closes the loop between finding misconfigurations and fixing them. Most security tools surface findings and stop. CompliTru continues — assessing blast radius, generating remediation, and applying fixes (with approval) inside the same workflow.
The self-hosted edition runs entirely inside your AWS account. No customer data leaves your VPC. AI features are deployment-configurable: AWS Bedrock in your account, your own OpenAI key, or fully disabled.
| Capability | What it does |
|---|---|
| Cloud configuration scanning | Scans AWS, Azure, and GCP for misconfigurations across 600+ checks mapped to CIS, NIST, SOC 2, ISO 27001, HIPAA, PCI DSS |
| Vulnerability assessment | Aggregates findings from cloud-native scanners (Inspector, Defender, Security Command Center) plus CompliTru's own scan engine |
| Impact-aware remediation | Before suggesting a fix, evaluates blast radius — what depends on this resource, what breaks if changed, what running workloads are affected |
| Automated remediation | One-click apply for safe fixes; multi-step playbooks for complex changes; full audit trail of every change |
| Compliance reporting | Live mapping of findings to SOC 2, ISO 27001, NIST CSF, CIS Benchmarks, HIPAA, PCI DSS controls |
| AI-powered analysis | Optional LLM-driven remediation suggestions, finding summarization, risk narration, and natural-language querying |
| Cost optimization | Right-sizing recommendations that check workload dependencies before suggesting downgrades |
| Identity and access analysis | CIEM — analyzes IAM policies for over-privileged identities, unused permissions, privilege escalation paths |
| Multi-account / multi-cloud | Cross-account roles for AWS organizations; equivalent setups for Azure tenants and GCP organizations |
AI providers supported (configured per deployment):

- AWS Bedrock (Claude Sonnet 4 / Haiku 3.5) — default for self-hosted
- OpenAI (GPT-4o, GPT-4o-mini) — requires customer-provided API key
- Disabled — all AI features turn off cleanly; product remains fully functional without AI
See AI_PROVIDER.md for configuration details.
See SOC2_CONTROLS.md for the full SOC 2 control mapping.
Built-in roles: admin, analyst, viewer, auditor (default); custom roles with granular permissions.

schellman/
├── README.md Quick start, requirements, package contents
├── docs/
│ ├── PRODUCT_OVERVIEW.md This document
│ ├── ARCHITECTURE.md System architecture, components, data flow
│ ├── SECURITY.md Security controls, threat model, hardening
│ ├── DEPLOYMENT.md Step-by-step deployment runbook
│ ├── CONFIGURATION.md All environment variables and tunables
│ ├── AI_PROVIDER.md OpenAI / Bedrock / disabled configuration
│ ├── OPERATIONS.md Logs, metrics, troubleshooting, day-2 ops
│ ├── SOC2_CONTROLS.md SOC 2 control-by-control mapping
│ └── UPGRADE.md Version upgrade process
├── docker/
│ ├── Dockerfile.backend Flask API + Gunicorn (Python 3.12)
│ ├── Dockerfile.frontend Next.js 14 SSR (Node 20)
│ ├── Dockerfile.worker Celery worker for async jobs
│ ├── docker-compose.yml Local verification stack
│ └── .dockerignore
├── terraform/
│ ├── main.tf Root module composing all infrastructure
│ ├── variables.tf All tunable parameters
│ ├── outputs.tf URLs, ARNs, connection info
│ ├── versions.tf Provider pinning
│ ├── example.tfvars Example values
│ └── modules/
│ ├── networking/ VPC, subnets, NAT gateways, security groups
│ ├── rds/ MySQL RDS with encryption, backups, parameter group
│ ├── ecs/ Fargate cluster, task definitions, services, IAM roles
│ ├── secrets/ Secrets Manager entries for app secrets
│ └── loadbalancer/ ALB with ACM cert integration, optional WAF
└── scripts/
├── bootstrap.sh Pre-deployment setup (ACM cert, S3 backend)
├── migrate.sh Run DB migrations after deployment
└── verify-deployment.sh Post-deploy smoke tests
| Target | Status | Notes |
|---|---|---|
| AWS ECS Fargate (recommended) | ✅ Production-ready | Terraform modules included |
| AWS EKS | ✅ Helm chart available on request | Deployment via Helm + values file |
| AWS EC2 + Docker Compose | ✅ Reference compose included | For dev/test/POC environments |
| Standalone Kubernetes (on-prem) | ✅ Helm chart available on request | Requires customer-provided MySQL and Redis |
| Azure (AKS / Azure Container Apps) | 🚧 Roadmap (Q3 2026) | Equivalent Terraform modules planned |
| GCP (GKE / Cloud Run) | 🚧 Roadmap (Q4 2026) | Equivalent Terraform modules planned |
To set expectations clearly:
The CompliTru differentiator versus other CSPM/CNAPP tools (Wiz, Lacework, Orca, Prisma Cloud):
For deeper dives:

- System architecture and data flow → ARCHITECTURE.md
- Security controls and threat model → SECURITY.md
- Deployment runbook → DEPLOYMENT.md
- Configuration reference → CONFIGURATION.md
- AI provider configuration → AI_PROVIDER.md
- SOC 2 control mapping → SOC2_CONTROLS.md
CompliTru is a three-tier application deployed entirely within your AWS account. All customer data, scan findings, and audit logs remain in your infrastructure at all times.
- ALB routing: /api/* to backend, /* to frontend
- Health checks: GET /api/health and GET /health
- RDS parameter group: require_secure_transport = ON (TLS required), log_bin_trust_function_creators = 0, local_infile = 0

Single secret at path complitru/${environment}/app containing:
- DATABASE_URL — RDS connection string (auto-populated from RDS module)
- REDIS_URL — ElastiCache endpoint (auto-populated)
- SECRET_KEY — Flask session secret (auto-generated, rotatable)
- ENCRYPTION_KEY — Fernet key for field-level encryption (auto-generated)
- JWT_SECRET — API token signing (auto-generated)
- COMPLITRU_LICENSE_KEY — Your license key (you provide)
- SMTP_* — Optional email configuration (you provide)
- OPENAI_API_KEY — Optional, if you enable AI features (you provide)

Rotation supported via native Secrets Manager rotation for DB credentials.
Two buckets created:
- complitru-reports-${account}-${region} — report storage, SSE-S3, versioned, 365-day lifecycle
- complitru-audit-${account}-${region} — immutable audit logs, Object Lock enabled, 7-year retention

User → Route 53 → ALB (TLS) → Frontend (SSR) → API Backend → RDS / Redis
↓
Secrets Manager (token signing)
Backend startup → License check against license.complitru.ai (HTTPS)
↓
Every 60 minutes → Refresh license status
↓
If offline > 7 days → Application fails closed (refuses scans)
Only outbound traffic. License server never initiates connections to your infrastructure.
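The fail-closed grace behavior described above can be sketched as follows. This is an illustrative helper, not the product's actual implementation; the 7-day window corresponds to LICENSE_OFFLINE_GRACE_DAYS.

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(days=7)  # LICENSE_OFFLINE_GRACE_DAYS

def scans_allowed(last_successful_check: datetime, now: datetime,
                  fail_closed: bool = True) -> bool:
    """Refuse scans once the license server has been unreachable longer
    than the grace window (hypothetical helper for illustration)."""
    if not fail_closed:
        return True  # LICENSE_FAIL_CLOSED=false: warn only, keep working
    return (now - last_successful_check) <= GRACE

now = datetime(2026, 4, 10, tzinfo=timezone.utc)
print(scans_allowed(now - timedelta(days=3), now))  # True: within grace
print(scans_allowed(now - timedelta(days=8), now))  # False: fails closed
```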
VPC flow logs enabled and shipped to CloudWatch Logs with 90-day retention.
Sizing is controlled by backend_cpu / backend_memory, frontend_cpu / frontend_memory, worker_cpu / worker_memory, and db_instance_class.

| Component | Size | Cost estimate |
|---|---|---|
| Backend ECS (2 tasks) | 1 vCPU / 2 GB each | ~$70/mo |
| Frontend ECS (2 tasks) | 0.5 vCPU / 1 GB each | ~$35/mo |
| Worker ECS (1 task) | 1 vCPU / 2 GB | ~$35/mo |
| RDS db.t3.medium Multi-AZ | 2 vCPU / 4 GB | ~$140/mo |
| ElastiCache t3.micro | 2 nodes | ~$25/mo |
| ALB + NAT + data | — | ~$55/mo |
| Total | — | ~$360/mo |
Actual costs vary by region and traffic volume.
Backing up secrets via aws secretsmanager get-secret-value to encrypted S3 is optional.

Alarms publish to an SNS topic you control — route to PagerDuty, Slack, Opsgenie, email, etc.
The current package is a fully self-hosted monolithic deployment. Roadmap includes:
Current package is forward-compatible — migration path will be documented when v2 ships.
Security architecture, threat model, and hardening reference for the self-hosted CompliTru deployment.
| Threat | Control | Residual risk |
|---|---|---|
| External attacker exploits web vulnerability | WAF (optional), HTTPS only, security headers, input validation, CSP | Low |
| Attacker with stolen user credentials | MFA support, session timeout, audit log of every admin action, IP allow-listing option | Low |
| Malicious insider with AWS console access | CloudTrail captures all console activity, Object Lock on audit bucket prevents tampering | Medium — mitigated by customer's own governance |
| Container escape | Non-root execution, minimal base image, no privileged containers, read-only root filesystem where possible | Low |
| RDS credential theft | Secrets Manager rotation, TLS required for DB connections, no credentials in code or env | Low |
| Supply chain attack on container images | Images built from verified base, CVE scanning at build time, SBOM provided | Medium — customer should pin image digests |
| DNS hijack / MITM | HSTS enforced, TLS 1.2+, certificate pinning at ALB | Low |
| Data exfiltration via compromised task | VPC endpoints restrict S3/Secrets Manager access to your VPC, flow logs capture anomalies | Low |
- RDS enforces TLS (require_secure_transport = ON at the parameter group).
- Base images: python:3.12-slim-bookworm for backend and worker; node:20-alpine for the frontend build, gcr.io/distroless/nodejs20-debian12 for the runtime.
- No --privileged flag.
- Secrets delivered via the ECS secrets block (references to Secrets Manager), never passed as plaintext env vars.

Image digests published at each release:
```
complitru/backend:1.0.0@sha256:...
complitru/frontend:1.0.0@sha256:...
complitru/worker:1.0.0@sha256:...
```
Pin to digests in your task definitions for reproducible deployments.
Built-in roles: admin, analyst, viewer, auditor.

Every admin action, auth event, and data mutation logged to:
- audit_log table (queryable via UI)

Each entry includes:
- Timestamp (UTC, ISO-8601)
- Actor (user ID, API key ID, or system)
- Source IP and user agent
- Action (e.g., user.create, scan.start, finding.resolve)
- Target resource ID
- Before/after state for mutations
- Request correlation ID
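As an illustration, a minimal entry with the fields listed above might be assembled like this. The helper and field names are hypothetical; the real schema lives in the audit_log table.

```python
import json
import uuid
from datetime import datetime, timezone

def audit_entry(actor, action, target, source_ip, user_agent,
                before=None, after=None):
    # Hypothetical shape mirroring the documented fields.
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "source_ip": source_ip,
        "user_agent": user_agent,
        "action": action,
        "target": target,
        "before": before,
        "after": after,
        "correlation_id": str(uuid.uuid4()),
    }

entry = audit_entry("user:42", "finding.resolve", "finding:9001",
                    "10.0.1.5", "Mozilla/5.0",
                    before={"status": "open"}, after={"status": "resolved"})
print(json.dumps(entry, indent=2))
```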
Four IAM roles created per deployment, each with tightly scoped permissions:
complitru-ecs-task-backend:
- secretsmanager:GetSecretValue → complitru/* only
- ssm:GetParameter → /complitru/* only
- s3:* → complitru-reports-* and complitru-audit-* only
- sts:AssumeRole → customer-provided cross-account roles (for scans)
- textract:DetectDocumentText → * (needed for OCR feature)
- bedrock:InvokeModel → specific model ARNs only (if AI enabled)

complitru-ecs-task-worker: Same as backend, plus:
- ses:SendEmail → identity-based restrictions

complitru-ecs-task-frontend: (no AWS permissions — frontend is read-only, proxies through backend)

complitru-rds-monitoring: AWS managed AmazonRDSEnhancedMonitoringRole
Full IAM policy documents are in terraform/modules/ecs/iam.tf.
All secrets live in AWS Secrets Manager. No secrets in:
Rotation support:
Application secrets (SECRET_KEY, ENCRYPTION_KEY, JWT_SECRET): manual rotation documented in docs/DEPLOYMENT.md#rotating-secrets.

```bash
# Run as part of CI
trivy image complitru/backend:${VERSION} --exit-code 1 --severity CRITICAL,HIGH
trivy image complitru/frontend:${VERSION} --exit-code 1 --severity CRITICAL,HIGH
trivy image complitru/worker:${VERSION} --exit-code 1 --severity CRITICAL,HIGH
```

Release images must pass with zero CRITICAL/HIGH unresolved.
Recommended: Enable Amazon ECR image scanning on the repositories where you mirror CompliTru images. ECR will notify on new CVEs discovered post-deploy.
Notifications sent to the email tied to your license key.
- audit_log table + VPC flow logs
- Incident response procedures: docs/DEPLOYMENT.md#incident-response

CompliTru security team reachable at security@complitru.ai for coordinated disclosure on product vulnerabilities.
Security researchers may disclose vulnerabilities in the CompliTru product to security@complitru.ai. PGP key available on request. Response SLA: 48 hours to initial acknowledgment, 30 days to fix or mitigation for critical issues.
CompliTru's own release process:
SBOMs for each release are included in the package under sbom/.
CompliTru's AI features are deployment-configurable. Choose AWS Bedrock (default for self-hosted), OpenAI, or disable AI entirely. The product is fully functional in all three modes — AI is an enhancement, not a dependency.
| Mode | Data flow | Use when |
|---|---|---|
| Bedrock (recommended) | All inference happens via Bedrock in your AWS account. No data leaves your AWS boundary. | Default for self-hosted deployments; aligns with "no data to vendor" requirements |
| OpenAI | Requests sent to OpenAI's API using your enterprise OpenAI key + BAA. CompliTru is never in the data path. | You have an existing OpenAI enterprise contract and prefer GPT-4o-class models |
| Disabled | All AI features hidden in UI; AI-powered endpoints return raw findings without LLM processing. | AI not yet approved by your security team; air-gapped environments without Bedrock; cost reduction |
Enable model access in your AWS account (one-time, console-only):

- Open the AWS Bedrock console in the region of your deployment
- Navigate to Model access → Manage model access
- Request access to:
  - Anthropic Claude Sonnet 4 (anthropic.claude-sonnet-4-20250514-v1:0) — used for default and powerful
  - Anthropic Claude 3.5 Haiku (anthropic.claude-3-5-haiku-20241022-v1:0) — used for fast

Verify IAM permissions — the ECS task IAM roles (complitru-ecs-task-backend and complitru-ecs-task-worker) need:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:Converse",
        "bedrock:ConverseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-sonnet-4-*",
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-3-5-haiku-*",
        "arn:aws:bedrock:*:*:inference-profile/us.anthropic.claude-sonnet-4-*",
        "arn:aws:bedrock:*:*:inference-profile/us.anthropic.claude-3-5-haiku-*"
      ]
    }
  ]
}
```
Provided automatically when you deploy via the included Terraform (terraform/modules/ecs/iam.tf).
```hcl
ai_enabled  = true
ai_provider = "bedrock"
ai_region   = "us-east-1" # Defaults to deployment region
```

Or set the corresponding environment variables on the ECS task:

```
AI_ENABLED=true
AI_PROVIDER=bedrock
AWS_REGION=us-east-1
```
Provision an OpenAI enterprise key with a BAA in place if you handle PHI.
Store the key in Secrets Manager under the existing app secret:
```bash
aws secretsmanager update-secret \
  --secret-id complitru/production/app \
  --secret-string '{
    "OPENAI_API_KEY": "sk-...",
    ...other existing secrets...
  }'
```
```hcl
ai_enabled  = true
ai_provider = "openai"
```

Environment variable equivalent:

```
AI_ENABLED=true
AI_PROVIDER=openai
```

To disable AI entirely:

```hcl
ai_enabled = false
```

Environment variable equivalent:

```
AI_ENABLED=false
```
When AI_ENABLED=false:
This mode is appropriate for:

- Initial deployment while AI is going through internal security review
- Air-gapped environments without Bedrock model access
- Customers who prefer to avoid AI entirely
- Cost-conscious deployments (AI accounts for ~10–30% of compute spend depending on usage)
Three logical model tiers map to provider-specific model IDs:
| Logical name | Bedrock model | OpenAI model | When used |
|---|---|---|---|
| default | Claude Sonnet 4 | GPT-4o-mini | Most AI calls (remediation suggestions, summarization) |
| fast | Claude 3.5 Haiku | GPT-4o-mini | High-volume, lower-stakes operations (auto-classification, light summaries) |
| powerful | Claude Sonnet 4 | GPT-4o | Complex reasoning (multi-step remediation playbooks, risk narration) |
You can override per call site, but the logical names are the defaults. This abstraction means switching providers requires zero application code changes — just a configuration update.
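A minimal sketch of this abstraction, using the mapping from the table above. The resolve_model name is illustrative, not the product's API; the real mapping lives in app/ai/provider.py.

```python
# Logical-name to model-ID mapping, taken from the table above.
MODEL_MAP = {
    "bedrock": {
        "default": "us.anthropic.claude-sonnet-4-20250514-v1:0",
        "fast": "us.anthropic.claude-3-5-haiku-20241022-v1:0",
        "powerful": "us.anthropic.claude-sonnet-4-20250514-v1:0",
    },
    "openai": {
        "default": "gpt-4o-mini",
        "fast": "gpt-4o-mini",
        "powerful": "gpt-4o",
    },
}

def resolve_model(provider: str, logical: str = "default") -> str:
    """Resolve a logical tier to a provider-specific model ID."""
    return MODEL_MAP[provider][logical]

print(resolve_model("bedrock", "fast"))    # us.anthropic.claude-3-5-haiku-20241022-v1:0
print(resolve_model("openai", "powerful")) # gpt-4o
```

Because call sites reference only the logical names, swapping AI_PROVIDER changes the resolved model IDs without touching application code.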
After configuring, verify the AI provider is working:
```bash
# Inside an ECS exec session into the backend task:
python3 -c "
from app.ai.provider import get_provider
provider = get_provider()
print(f'Provider: {provider.provider_name}')
result = provider.chat_completion(
    messages=[{'role': 'user', 'content': 'Reply with one word: ok'}],
    max_tokens=10,
)
print(f'Response: {result.content}')
print(f'Tokens used: {result.usage[\"total_tokens\"]}')
"
```

Expected output:

```
Provider: bedrock
Response: ok
Tokens used: 12
```
If you see Bedrock AccessDenied, model access is not enabled in the Bedrock console (step 1 of Bedrock setup).
When AI is enabled and processing data that may contain PHI/PII (the de-identify pipeline at /api/deidentify/*), identifiers are replaced with tokens ([NAME_001], [DATE_002], [SSN_003], etc.) before any text reaches the model.

This pipeline runs server-side. Token maps are never persisted to disk or database, never sent to the AI provider, and never leave the application memory of the request handler.
For deployments that handle PHI, this de-identification step is enforced at the route layer regardless of AI provider — it is not a configuration option to disable.
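A toy sketch of the tokenization idea, covering only SSN-shaped strings. The real pipeline handles many more entity types; the token format follows the examples above, and the map stays in memory only.

```python
import re

def deidentify(text: str):
    """Replace SSN-shaped strings with sequential tokens, keeping the
    token map only in memory (illustrative single-entity sketch)."""
    token_map = {}
    counter = 0

    def repl(match):
        nonlocal counter
        counter += 1
        token = f"[SSN_{counter:03d}]"
        token_map[token] = match.group(0)  # map token back to original value
        return token

    redacted = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", repl, text)
    return redacted, token_map

redacted, token_map = deidentify("Patient SSN 123-45-6789 reported.")
print(redacted)   # Patient SSN [SSN_001] reported.
print(token_map)  # {'[SSN_001]': '123-45-6789'}
```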
Costs accrue directly in your AWS bill. As of April 2026:
| Model | Input price | Output price |
|---|---|---|
| Claude Sonnet 4 | $3.00 / 1M tokens | $15.00 / 1M tokens |
| Claude 3.5 Haiku | $0.80 / 1M tokens | $4.00 / 1M tokens |
Typical CompliTru AI usage at moderate scale (one mid-sized customer environment, daily scans, weekly compliance reports):
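As a back-of-envelope example using the Bedrock prices above. The monthly token volumes here are hypothetical, chosen only to show the arithmetic, not vendor-measured figures.

```python
# $ per 1M tokens (input, output), from the April 2026 price table above.
PRICES = {
    "sonnet": (3.00, 15.00),
    "haiku": (0.80, 4.00),
}

def monthly_cost(model: str, input_millions: float, output_millions: float) -> float:
    """Monthly spend for a model given token volumes in millions."""
    in_price, out_price = PRICES[model]
    return input_millions * in_price + output_millions * out_price

# Hypothetical volumes: 5M in / 1M out on Sonnet, 20M in / 4M out on Haiku.
total = monthly_cost("sonnet", 5, 1) + monthly_cost("haiku", 20, 4)
print(f"${total:.2f}/month")  # $62.00/month
```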
Costs accrue on your OpenAI organization account.
| Model | Input price | Output price |
|---|---|---|
| GPT-4o | $2.50 / 1M tokens | $10.00 / 1M tokens |
| GPT-4o-mini | $0.15 / 1M tokens | $0.60 / 1M tokens |
Typical usage cost: $15–$120/month at the same scale, since most calls hit default (GPT-4o-mini equivalent).
Zero AI cost.
Switching between providers is a runtime configuration change — no application redeploy required:
- Update the AI_PROVIDER value in Secrets Manager (or Terraform variable + apply)

Cached responses are not invalidated — historical AI outputs in the database remain whatever they were originally generated with. New requests use the new provider.
For finer-grained control beyond the global AI_ENABLED toggle, individual AI features can be disabled via Terraform variables or environment flags:
```
AI_FEATURE_REMEDIATION_SUGGESTIONS=true|false
AI_FEATURE_FINDING_SUMMARIZATION=true|false
AI_FEATURE_RISK_NARRATION=true|false
AI_FEATURE_NATURAL_LANGUAGE_QUERY=true|false
AI_FEATURE_AUTO_CLASSIFICATION=true|false
```
Defaults: all true when AI_ENABLED=true. Setting AI_ENABLED=false overrides all per-feature flags.
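The precedence rule above can be sketched as follows. This is an illustration of the documented behavior, not the product's actual config loader.

```python
def effective_flags(env: dict) -> dict:
    """AI_ENABLED=false overrides every per-feature flag; per-feature
    flags default to true (sketch of the documented precedence)."""
    master = env.get("AI_ENABLED", "true") == "true"
    features = [
        "REMEDIATION_SUGGESTIONS", "FINDING_SUMMARIZATION",
        "RISK_NARRATION", "NATURAL_LANGUAGE_QUERY", "AUTO_CLASSIFICATION",
    ]
    return {
        f: master and env.get(f"AI_FEATURE_{f}", "true") == "true"
        for f in features
    }

print(effective_flags({"AI_ENABLED": "true",
                       "AI_FEATURE_RISK_NARRATION": "false"}))
print(effective_flags({"AI_ENABLED": "false"}))  # everything off
```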
Every LLM invocation is logged to the application audit log (audit_log table) with:

- Provider used (bedrock / openai)

Prompt and response content are not logged by default to avoid retaining sensitive data. Enable AI_LOG_FULL_PAYLOADS=true in non-production environments only if you need to debug LLM behavior.
The provider abstraction lives in app/ai/provider.py. To call AI from custom code:
```python
from app.ai.provider import get_provider

provider = get_provider()  # Returns configured provider (Bedrock or OpenAI)
result = provider.chat_completion(
    messages=[
        {"role": "system", "content": "You are a security analyst."},
        {"role": "user", "content": "Summarize this finding: ..."},
    ],
    model="default",  # logical name, mapped per provider
    temperature=0.3,
    max_tokens=1000,
)
print(result.content)  # str — the AI response text
print(result.usage)    # {"total_tokens": int}
```
Streaming, function/tool calling, and JSON mode are all supported and work identically across providers. See app/ai/provider.py source for the full interface.
Every tunable parameter, environment variable, and Terraform variable, with defaults, valid values, and where each is read.
Configuration is layered: Terraform variables (terraform/terraform.tfvars) drive the environment variables and secrets that the application reads at startup (app/config/).

In production, you should manage configuration via Terraform — never edit env vars or secrets directly in the AWS console, as those changes will be overwritten on the next terraform apply.
All entries below live in a single Secrets Manager secret at the path complitru/${environment}/app. Auto-created by the terraform/modules/secrets/ module.
| Key | Description | Auto-generated? | Rotation |
|---|---|---|---|
| `DATABASE_URL` | RDS connection string (`mysql+pymysql://user:pass@host:3306/dbname`) | ✅ Yes (from RDS module) | Native Secrets Manager rotation |
| `REDIS_URL` | ElastiCache endpoint (`rediss://:auth@host:6379/0`) | ✅ Yes (from ElastiCache module) | Manual |
| `SECRET_KEY` | Flask session signing key (32+ bytes) | ✅ Yes (Terraform `random_password`) | Manual via Terraform; rotation invalidates active sessions |
| `ENCRYPTION_KEY` | Fernet key for field-level encryption (32 url-safe base64 bytes) | ✅ Yes | Rotate with care — encrypts API keys in DB |
| `JWT_SECRET` | API token signing key | ✅ Yes | Manual; rotation invalidates active API tokens |
| `RESET_CODE_PEPPER` | Password reset code pepper (32+ bytes) | ✅ Yes | Manual; rotation invalidates outstanding reset tokens |
| `COMPLITRU_LICENSE_KEY` | Your CompliTru license key (provided by CompliTru) | ❌ Provide | Yearly with subscription renewal |
| `OPENAI_API_KEY` | Optional OpenAI API key (only if `AI_PROVIDER=openai`) | ❌ Provide | Per your OpenAI key rotation policy |
| `SMTP_HOST`, `SMTP_PORT`, `SMTP_USER`, `SMTP_PASS`, `SMTP_FROM` | Outbound email config | ❌ Provide if SMTP used | Per your SMTP credentials policy |
| `SES_FROM` | SES sender identity (if using SES instead of SMTP) | ❌ Provide if SES used | N/A |
These are set on the ECS task definition (environment block) or via Terraform variables that map to them.
| Variable | Default | Valid values | Description |
|---|---|---|---|
| `FLASK_ENV` | `production` | `production`, `development` | Flask environment. Set to `production` for self-hosted deployments. |
| `FLASK_DEBUG` | `0` | `0`, `1` | Always `0` in production. |
| `APP_PORT` | `1100` | port number | Backend listen port. ALB target group must match. |
| `APP_BASE_URL` | (required) | URL | External base URL of the deployment, e.g., `https://complitru.your-domain.com`. Used in emails, OAuth callbacks, webhooks. |
| `WORKERS` | `4` | int | Gunicorn worker count. Recommend 2x vCPUs. |
| `WORKER_CLASS` | `eventlet` | `eventlet`, `sync`, `gevent` | Gunicorn worker class. `eventlet` required for Socket.IO. |
| `WORKER_TIMEOUT` | `120` | seconds | Gunicorn worker timeout. Increase if you have long-running synchronous handlers. |
| Variable | Default | Description |
|---|---|---|
| `AWS_REGION` | (required) | Region of your deployment. Used by all boto3 clients unless overridden. |
| `AWS_DEFAULT_REGION` | (mirrors `AWS_REGION`) | boto3 fallback. |
The deployment uses the ECS task IAM role for AWS authentication — no access keys required.
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | (from Secrets Manager) | Full RDS connection string. |
| `SQLALCHEMY_POOL_SIZE` | `10` | Connection pool size per worker. |
| `SQLALCHEMY_MAX_OVERFLOW` | `20` | Max overflow connections beyond pool. |
| `SQLALCHEMY_POOL_RECYCLE` | `3600` | Recycle connections after N seconds (avoid stale connections). |
| `SQLALCHEMY_POOL_PRE_PING` | `true` | Test connections before checkout. |
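One practical consequence of these pool settings: the worst-case connection count against RDS is bounded by pool_size + max_overflow per Gunicorn worker. A quick capacity check, assuming the documented defaults (WORKERS=4, backend_desired_count=2):

```python
# Worst-case MySQL connections from the backend tier alone, using the
# documented defaults. Compare against the max_connections limit of your
# RDS instance class before raising any of these values.
pool_size = 10        # SQLALCHEMY_POOL_SIZE
max_overflow = 20     # SQLALCHEMY_MAX_OVERFLOW
gunicorn_workers = 4  # WORKERS
backend_tasks = 2     # backend_desired_count

per_worker = pool_size + max_overflow
worst_case = per_worker * gunicorn_workers * backend_tasks
print(worst_case)  # 240
```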
| Variable | Default | Description |
|---|---|---|
| `REDIS_URL` | (from Secrets Manager) | ElastiCache Redis URL with TLS (`rediss://...`). |
| `CELERY_BROKER_URL` | (mirrors `REDIS_URL` with `/0`) | Celery message broker. |
| `CELERY_RESULT_BACKEND` | (mirrors `REDIS_URL` with `/1`) | Celery result backend. |
| `CELERY_TASK_TIME_LIMIT` | `1800` | Hard timeout per task (seconds). |
| `CELERY_TASK_SOFT_TIME_LIMIT` | `1500` | Soft timeout (raises exception worker can catch). |
| `CELERY_WORKER_CONCURRENCY` | `4` | Concurrent tasks per worker process. |
| `RATE_LIMIT_STORAGE_URI` | (mirrors `REDIS_URL` with `/2`) | Rate limit counter storage. |
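The mirrored URLs differ only in the Redis logical database suffix. A sketch of that derivation, assuming REDIS_URL ends with a /<db> segment as in the defaults (the helper name is illustrative):

```python
def with_db(redis_url: str, db: int) -> str:
    """Swap the trailing /<db> segment of a Redis URL, illustrating how
    CELERY_BROKER_URL (/0), CELERY_RESULT_BACKEND (/1), and
    RATE_LIMIT_STORAGE_URI (/2) relate to REDIS_URL."""
    base, _, _ = redis_url.rpartition("/")
    return f"{base}/{db}"

redis_url = "rediss://:auth@cache.internal:6379/0"
print(with_db(redis_url, 1))  # rediss://:auth@cache.internal:6379/1
print(with_db(redis_url, 2))  # rediss://:auth@cache.internal:6379/2
```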
See AI_PROVIDER.md for full detail.
| Variable | Default | Valid values | Description |
|---|---|---|---|
| `AI_ENABLED` | `true` | `true`, `false` | Master toggle for all AI features. |
| `AI_PROVIDER` | `bedrock` | `bedrock`, `openai` | Which provider to use when `AI_ENABLED=true`. |
| `BEDROCK_REGION` | (mirrors `AWS_REGION`) | AWS region | Region for Bedrock API calls. |
| `BEDROCK_MODEL_DEFAULT` | `us.anthropic.claude-sonnet-4-20250514-v1:0` | model ID | Override the `default` logical model. |
| `BEDROCK_MODEL_FAST` | `us.anthropic.claude-3-5-haiku-20241022-v1:0` | model ID | Override the `fast` logical model. |
| `BEDROCK_MODEL_POWERFUL` | `us.anthropic.claude-sonnet-4-20250514-v1:0` | model ID | Override the `powerful` logical model. |
| `OPENAI_API_KEY` | (from Secrets Manager) | string | Required if `AI_PROVIDER=openai`. |
| `AI_FEATURE_REMEDIATION_SUGGESTIONS` | `true` | `true`, `false` | Per-feature toggle. |
| `AI_FEATURE_FINDING_SUMMARIZATION` | `true` | `true`, `false` | Per-feature toggle. |
| `AI_FEATURE_RISK_NARRATION` | `true` | `true`, `false` | Per-feature toggle. |
| `AI_FEATURE_NATURAL_LANGUAGE_QUERY` | `true` | `true`, `false` | Per-feature toggle. |
| `AI_FEATURE_AUTO_CLASSIFICATION` | `true` | `true`, `false` | Per-feature toggle. |
| `AI_LOG_FULL_PAYLOADS` | `false` | `true`, `false` | Logs full prompts/responses to CloudWatch. Non-production only. |
| Variable | Default | Description |
|---|---|---|
| `SECRET_KEY` | (from Secrets Manager) | Flask session signing. |
| `JWT_SECRET` | (from Secrets Manager) | API token signing. |
| `SESSION_TIMEOUT_HOURS` | `24` | Session lifetime. |
| `MFA_REQUIRED_FOR_ADMINS` | `true` | Force TOTP MFA for admin role. |
| `MAX_LOGIN_ATTEMPTS` | `5` | Lockout threshold. |
| `LOCKOUT_DURATION_MINUTES` | `15` | Lockout duration after exceeded threshold. |
| `PASSWORD_MIN_LENGTH` | `12` | Minimum password length on signup/reset. |
| `PASSWORD_REQUIRE_COMPLEXITY` | `true` | Require uppercase, lowercase, digit, symbol. |
| `OAUTHLIB_INSECURE_TRANSPORT` | `0` | Always `0` in production. Allows OAuth over HTTP for local dev only. |
| Variable | Default | Description |
|---|---|---|
| `ARGON2_TIME_COST` | `3` | Iterations. |
| `ARGON2_MEMORY_COST` | `65536` | Memory in KiB. |
| `ARGON2_PARALLELISM` | `2` | Parallel threads. |
| Variable | Default | Description |
|---|---|---|
| `RESET_CODE_TTL_MINUTES` | `15` | Code lifetime in minutes. |
| `RESET_TOKEN_TTL_SECONDS` | `1200` | Token lifetime in seconds. |
| `RESET_MAX_ATTEMPTS` | `5` | Max code attempts before invalidation. |
| `RESET_RESEND_COOLDOWN_SECONDS` | `60` | Cooldown between resend requests. |
| `RESET_CODE_PEPPER` | (from Secrets Manager) | Server-side pepper for reset codes. |
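A server-side pepper means the stored digest is useless without a secret that never touches the database. A minimal sketch of the idea, assuming HMAC-SHA256 — the helper names and construction are illustrative, not the product's implementation:

```python
import hmac
import hashlib

# Placeholder pepper; in a deployment this is RESET_CODE_PEPPER from
# Secrets Manager, never stored alongside the hashes.
PEPPER = b"example-pepper-not-for-production"

def hash_reset_code(code: str) -> str:
    # Store only the HMAC; a database leak alone cannot be brute-forced
    # without the server-side pepper.
    return hmac.new(PEPPER, code.encode(), hashlib.sha256).hexdigest()

def verify_reset_code(candidate: str, stored_digest: str) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(hash_reset_code(candidate), stored_digest)

stored = hash_reset_code("123456")
print(verify_reset_code("123456", stored))  # True
print(verify_reset_code("654321", stored))  # False
```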
| Variable | Default | Description |
|---|---|---|
| `GOOGLE_CLIENT_ID` | unset | Google OAuth client ID. |
| `GOOGLE_CLIENT_SECRET` | unset | Google OAuth client secret. |
| `MICROSOFT_CLIENT_ID` | unset | Azure AD app registration client ID. |
| `MICROSOFT_CLIENT_SECRET` | unset | Azure AD app registration client secret. |
| `MICROSOFT_TENANT_ID` | `common` | `common`, `organizations`, `consumers`, or specific tenant ID. |
OAuth is optional. If unset, only username/password authentication is available.
| Variable | Default | Description |
|---|---|---|
| `SAML_ENABLED` | `false` | Enable SAML 2.0 SSO. |
| `SAML_IDP_METADATA_URL` | unset | URL to fetch IdP metadata XML. |
| `SAML_SP_ENTITY_ID` | (mirrors `APP_BASE_URL`) | Service provider entity ID. |
| `SAML_SP_ACS_URL` | `${APP_BASE_URL}/auth/saml/acs` | Assertion consumer service URL. |
| `SAML_SP_CERT_PATH` | `/etc/complitru/saml/sp.crt` | Path to SP signing certificate. |
| `SAML_SP_KEY_PATH` | `/etc/complitru/saml/sp.key` | Path to SP signing private key. |
| Variable | Default | Description |
|---|---|---|
| `EMAIL_PROVIDER` | `smtp` | `smtp`, `ses`, `disabled` |
| `SMTP_HOST` | unset | SMTP server hostname. |
| `SMTP_PORT` | `587` | SMTP port. |
| `SMTP_USER` | unset | SMTP username. |
| `SMTP_PASS` | (from Secrets Manager) | SMTP password. |
| `SMTP_FROM` | unset | Default `From:` address. |
| `SMTP_STARTTLS` | `true` | Use STARTTLS. |
| `SES_FROM` | unset | Required if `EMAIL_PROVIDER=ses`. SES sender identity. |
| `SES_REGION` | (mirrors `AWS_REGION`) | SES region. |
| Variable | Default | Description |
|---|---|---|
| `RATE_LIMITS_IP` | `10/hour` | Auth endpoints per IP. |
| `RATE_LIMITS_EMAIL` | `5/hour` | Per email address. |
| `RATE_LIMITS_API` | `1000/hour` | Per API key, applied to data endpoints. |
| `RATE_LIMIT_STORAGE_URI` | `redis://...` | Where to store counters (use Redis in production). |
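The limit strings follow a count/period shape. A small sketch of parsing them into a (count, window_seconds) pair, assuming that format (the parser is illustrative, not the product's code):

```python
PERIODS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

def parse_limit(spec: str) -> tuple:
    """Parse limits like '10/hour' into (count, window_seconds)."""
    count, period = spec.split("/")
    return int(count), PERIODS[period]

print(parse_limit("10/hour"))    # (10, 3600)
print(parse_limit("1000/hour"))  # (1000, 3600)
```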
| Variable | Default | Description |
|---|---|---|
| `LOG_LEVEL` | `INFO` | `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |
| `LOG_FORMAT` | `json` | `json` for CloudWatch parsing, `text` for human-readable |
| `LOG_INCLUDE_REQUEST_ID` | `true` | Adds correlation ID to every log line |
| `LOG_INCLUDE_USER_ID` | `true` | Adds authenticated user ID to log entries |
| `AUDIT_LOG_TO_CLOUDWATCH` | `true` | Mirror audit log entries to dedicated CloudWatch log group |
| `AUDIT_LOG_TO_S3` | `true` | Mirror audit log entries to Object-Locked S3 bucket |
| Variable | Default | Description |
|---|---|---|
| `COMPLITRU_LICENSE_KEY` | (from Secrets Manager) | Your CompliTru license. |
| `LICENSE_SERVER_URL` | `https://license.complitru.ai` | License validation endpoint. |
| `LICENSE_CHECK_INTERVAL_SECONDS` | `3600` | Refresh frequency. |
| `LICENSE_OFFLINE_GRACE_DAYS` | `7` | Days the app continues working if license server unreachable. |
| `LICENSE_FAIL_CLOSED` | `true` | If `true`, refuse scans when license invalid; if `false`, log warning only. |
| Variable | Default | Description |
|---|---|---|
| `SCAN_DEFAULT_REGIONS` | `us-east-1,us-east-2,us-west-2,us-west-1,eu-west-1` | Default regions to scan when not specified per-account. |
| `SCAN_MAX_PARALLEL_ACCOUNTS` | `10` | Cap on parallel account scans (controls IAM rate limit pressure). |
| `SCAN_API_RETRY_MAX` | `5` | Max retries on AWS API throttling. |
| `SCAN_API_RETRY_BACKOFF` | `exponential` | `exponential`, `linear`, `constant` |
| `SCAN_TIMEOUT_MINUTES` | `60` | Hard timeout per single account scan. |
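The three SCAN_API_RETRY_BACKOFF strategies can be sketched as delay schedules. The base delay and cap here are assumptions for illustration; the product's exact timings are not documented in this table.

```python
def backoff_delays(max_retries: int = 5, base: float = 1.0,
                   cap: float = 60.0, strategy: str = "exponential"):
    """Delay schedule (seconds) for the three documented strategies.
    base and cap are illustrative assumptions."""
    delays = []
    for attempt in range(max_retries):
        if strategy == "exponential":
            delay = min(cap, base * (2 ** attempt))
        elif strategy == "linear":
            delay = min(cap, base * (attempt + 1))
        else:  # constant
            delay = base
        delays.append(delay)
    return delays

print(backoff_delays())                      # [1.0, 2.0, 4.0, 8.0, 16.0]
print(backoff_delays(strategy="linear"))     # [1.0, 2.0, 3.0, 4.0, 5.0]
print(backoff_delays(strategy="constant"))   # [1.0, 1.0, 1.0, 1.0, 1.0]
```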
| Variable | Default | Description |
|---|---|---|
| `S3_REPORT_BUCKET` | (from Terraform) | Bucket for generated reports. |
| `S3_AUDIT_BUCKET` | (from Terraform) | Object-Locked bucket for audit logs. |
| `S3_REPORT_RETENTION_DAYS` | `365` | Lifecycle delete after N days. |
| `S3_AUDIT_RETENTION_YEARS` | `7` | Object Lock retention period. |
| Variable | Default | Description |
|---|---|---|
| `FEATURE_AUTO_REMEDIATION` | `false` | Allow remediation without per-finding approval. Default off for safety. |
| `FEATURE_NL_QUERY` | `true` | Natural language querying via AI. |
| `FEATURE_CIEM` | `true` | Identity and access analysis module. |
| `FEATURE_COST_OPTIMIZATION` | `true` | Right-sizing recommendations. |
| `FEATURE_AI_GOVERNANCE` | `false` | Beta — AI governance and policy module. |
| `FEATURE_FREE_SCAN_PUBLIC` | `false` | Public free-scan landing page (only relevant for hosted SaaS). |
Set in terraform/terraform.tfvars. Drives the values above.
| Variable | Default | Description |
|---|---|---|
| `vpc_cidr` | `10.0.0.0/16` | VPC CIDR block. |
| `availability_zones` | `["us-east-1a", "us-east-1b"]` | AZs for multi-AZ deployment. |
| `enable_nat_ha` | `false` | Single NAT (cost) vs. one NAT per AZ (HA). |
| `enable_vpc_endpoints` | `true` | Create endpoints for S3, Secrets Manager, KMS, ECR, CloudWatch Logs. |
| `enable_flow_logs` | `true` | VPC flow logs to CloudWatch. |
| Variable | Default | Description |
|---|---|---|
| `backend_cpu` | `1024` | Fargate CPU units (1024 = 1 vCPU). |
| `backend_memory` | `2048` | Fargate memory in MB. |
| `backend_desired_count` | `2` | Min running tasks. |
| `backend_max_count` | `10` | Auto-scaling ceiling. |
| `frontend_cpu` | `512` | |
| `frontend_memory` | `1024` | |
| `frontend_desired_count` | `2` | |
| `frontend_max_count` | `10` | |
| `worker_cpu` | `1024` | |
| `worker_memory` | `2048` | |
| `worker_desired_count` | `1` | |
| `worker_max_count` | `5` | |
| `scaling_target_cpu_percent` | `70` | Target tracking auto-scale threshold. |
| Variable | Default | Description |
|---|---|---|
| `db_instance_class` | `db.t3.medium` | RDS instance class. |
| `db_allocated_storage_gb` | `100` | Initial storage. |
| `db_max_storage_gb` | `1000` | Storage autoscaling ceiling. |
| `db_multi_az` | `true` | Multi-AZ for production. |
| `db_backup_retention_days` | `30` | Automated backup retention. |
| `db_deletion_protection` | `true` | Prevent accidental deletion. |
| `db_parameter_group_family` | `mysql8.0` | RDS parameter group family. |
| Variable | Default | Description |
|---|---|---|
| `redis_node_type` | `cache.t3.micro` | Instance class. |
| `redis_num_cache_nodes` | `2` | Node count (replication enabled). |
| `redis_at_rest_encryption` | `true` | KMS encryption. |
| `redis_in_transit_encryption` | `true` | TLS. |
| Variable | Default | Description |
|---|---|---|
| `domain_name` | (required) | Your custom domain (e.g., `complitru.your-corp.com`). |
| `acm_certificate_arn` | (required) | ARN of the ACM cert for `domain_name`. |
| `enable_waf` | `false` | Attach AWS WAF with managed OWASP rule sets. |
| `alb_idle_timeout_seconds` | `60` | ALB idle timeout. |
| Variable | Default | Description |
|---|---|---|
| `ai_enabled` | `true` | Master AI toggle. Maps to `AI_ENABLED`. |
| `ai_provider` | `bedrock` | `bedrock` or `openai`. Maps to `AI_PROVIDER`. |
| `ai_region` | (mirrors `aws_region`) | Region for Bedrock. |
| Variable | Default | Description |
|---|---|---|
| `tags` | `{}` | Map of tags applied to all resources. Recommended: `Owner`, `CostCenter`, `Environment`. |
| `environment` | `production` | Used in resource names and Secrets Manager paths. |
On startup, the backend validates configuration and refuses to start if:
- SECRET_KEY is unset or matches the default placeholder
- DATABASE_URL is unset or unparseable
- APP_BASE_URL is unset
- COMPLITRU_LICENSE_KEY is unset (unless LICENSE_FAIL_CLOSED=false for trial mode)
- AI_PROVIDER=openai but OPENAI_API_KEY is unset
- EMAIL_PROVIDER=ses but SES_FROM is unset

Validation failures are logged to stdout (and CloudWatch via the ECS log driver) before the container exits.
To dump the effective configuration (with secrets redacted):
```bash
# Inside an ECS exec session:
python3 -c "from app.config import dump_effective_config; dump_effective_config()"
```
Outputs a JSON tree of every effective configuration value with secrets shown as `***REDACTED***`. Useful for verifying what configuration was actually loaded after a deploy.
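Redacting a config tree for dumping is a simple recursive walk. A minimal sketch, assuming key names containing `KEY`/`SECRET`/`PASSWORD`/`TOKEN` mark secrets (the actual marker list used by `dump_effective_config` is not documented here):

```python
import json

# Assumed secret markers; a real implementation may use an explicit allow/deny list.
SECRET_MARKERS = ("KEY", "SECRET", "PASSWORD", "TOKEN")

def redact(config):
    """Return a copy of the config tree with secret leaf values replaced."""
    if isinstance(config, dict):
        return {
            k: "***REDACTED***"
            if any(m in k.upper() for m in SECRET_MARKERS)
            and not isinstance(v, (dict, list))
            else redact(v)
            for k, v in config.items()
        }
    if isinstance(config, list):
        return [redact(v) for v in config]
    return config  # non-secret leaf value passes through unchanged

cfg = {"AI_PROVIDER": "bedrock", "OPENAI_API_KEY": "sk-...",
       "db": {"region": "us-east-1", "PASSWORD": "hunter2"}}
print(json.dumps(redact(cfg), indent=2))
```

Note that values embedding credentials (e.g., a `DATABASE_URL` with an inline password) need URL-aware handling beyond this key-name heuristic.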
End-to-end deployment of CompliTru into your AWS account using the included Terraform package. First-time deployment takes ~45 minutes.
| Requirement | Version | Notes |
|---|---|---|
| AWS account | — | Admin or equivalent permissions for initial deployment |
| AWS CLI | >= 2.x | Configured with credentials (aws sts get-caller-identity returns your account) |
| Terraform | >= 1.5 | terraform.io/downloads |
| Domain name | — | A subdomain you can point at the ALB (e.g., complitru.your-corp.com) |
| ACM certificate | — | Valid certificate for the chosen domain in the deployment region |
| CompliTru license key | — | Provided by CompliTru in ctl_... format |
| Container registry access | — | Either pull from ghcr.io/complitru/* (default) or mirror to your own ECR |
The IAM principal running `terraform apply` needs permissions to create resources in VPC/EC2, ECS, RDS, ElastiCache, ALB, S3, Secrets Manager, IAM, CloudWatch, and Bedrock (if `ai_provider = "bedrock"`).

The simplest grant is the AWS-managed `AdministratorAccess` policy for the deployment role, used only for the initial Terraform apply. Subsequent updates can run with a more scoped policy.
Images are published to GitHub Container Registry by default:
```
ghcr.io/complitru/backend:1.0.0
ghcr.io/complitru/frontend:1.0.0
ghcr.io/complitru/worker:1.0.0
```
For air-gapped or compliance reasons, you can mirror these to your private ECR. See Mirroring images to ECR below.
Pin to digests in your task definitions for reproducible deployments:
```
ghcr.io/complitru/backend:1.0.0@sha256:<digest>
ghcr.io/complitru/frontend:1.0.0@sha256:<digest>
ghcr.io/complitru/worker:1.0.0@sha256:<digest>
```
(Replace <digest> with the digest from your release email.)
```bash
# Verify AWS access
aws sts get-caller-identity

# Verify Terraform version
terraform version   # >= 1.5

# Verify domain DNS is in your control
dig +short complitru.your-corp.com
```
If you don't already have an ACM certificate for your domain:
```bash
aws acm request-certificate \
  --domain-name "complitru.your-corp.com" \
  --validation-method DNS \
  --region us-east-1
```
Add the DNS validation record to your Route 53 / external DNS provider. Wait for ISSUED status:
```bash
aws acm describe-certificate --certificate-arn <arn> --query 'Certificate.Status'
```
Note the certificate ARN — you'll provide it as acm_certificate_arn in Terraform.
If `ai_provider = "bedrock"`: in the AWS Console, open Bedrock → Model access in your deployment region and enable access for the Claude models CompliTru uses.

Skip this step if `ai_provider = "openai"` or `ai_enabled = false`.
```bash
cd schellman/terraform
cp example.tfvars terraform.tfvars
```
Edit terraform.tfvars with your values:
```hcl
# Required
aws_region            = "us-east-1"
environment           = "production"
domain_name           = "complitru.your-corp.com"
acm_certificate_arn   = "arn:aws:acm:us-east-1:123456789012:certificate/abc..."
complitru_license_key = "ctl_..."

# AI configuration (see AI_PROVIDER.md)
ai_enabled  = true
ai_provider = "bedrock"

# Networking
vpc_cidr           = "10.42.0.0/16" # Adjust to avoid CIDR conflicts
availability_zones = ["us-east-1a", "us-east-1b"]

# Compute sizing — see ARCHITECTURE.md "Production defaults" for guidance
backend_cpu           = 1024
backend_memory        = 2048
backend_desired_count = 2

# Database
db_instance_class = "db.t3.medium"
db_multi_az       = true

# Tagging
tags = {
  Owner       = "platform-team"
  CostCenter  = "engineering"
  Environment = "production"
  Application = "complitru"
}
```
Full variable reference: CONFIGURATION.md.
```bash
terraform init
```

Recommended: configure a remote backend (S3 + DynamoDB lock table) for state management. See `terraform/backend.tf.example`.
```bash
terraform plan -out=tfplan
```

Carefully review the plan: on a first apply it should contain only resource additions (`+` only on first apply) and no destructive changes. Then:

```bash
terraform apply tfplan
```
Apply takes 30–45 minutes, most of it waiting on RDS (Multi-AZ) and ElastiCache provisioning.

Outputs at the end:
```
alb_dns_name      = "complitru-alb-12345.us-east-1.elb.amazonaws.com"
alb_zone_id       = "Z35SXDOTRQ7X7K"
db_endpoint       = "complitru-prod.cluster-abc.us-east-1.rds.amazonaws.com:3306"
ecs_cluster_name  = "complitru-prod"
secret_arn        = "arn:aws:secretsmanager:us-east-1:123456789012:secret:complitru/production/app-AbCdEf"
s3_report_bucket  = "complitru-reports-123456789012-us-east-1"
s3_audit_bucket   = "complitru-audit-123456789012-us-east-1"
```
Create a Route 53 alias record (or CNAME for non-Route 53 DNS):
```bash
# Example using Route 53 CLI:
aws route53 change-resource-record-sets \
  --hosted-zone-id <YOUR-ZONE-ID> \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "complitru.your-corp.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "complitru-alb-12345.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    }]
  }'
```
Wait for DNS propagation (typically under 60 seconds).
```bash
cd ../scripts
./migrate.sh
```
This invokes `aws ecs run-task` with the migration task definition, which applies the database schema migrations, seeds the initial admin user (the address you set in `terraform.tfvars`), and loads the compliance check catalog.

Output:
```
[migrate.sh] Running migrations against complitru-prod...
[migrate.sh] Applied 47 migrations
[migrate.sh] Seeded admin user: admin@your-corp.com
[migrate.sh] Loaded 612 checks across 8 frameworks
[migrate.sh] Migration complete
```
The admin user receives a one-time login email at the address you specified.
```bash
./verify-deployment.sh
```
Runs end-to-end smoke tests:

- ALB health check against `/health`
- Backend → RDS connectivity (writes to a `health_check` table)
- Backend → Redis connectivity
- Celery worker liveness (enqueues a `health_check` task and waits for completion)
- License validation against `license.complitru.ai`
- AI provider test prompt (skipped if `ai_enabled = false`)
- S3 report bucket write and CloudWatch log delivery

Output:
```
✓ ALB health check (200 OK in 45ms)
✓ Backend → RDS connectivity (query in 12ms)
✓ Backend → Redis connectivity (PING in 3ms)
✓ Celery worker online (task processed in 280ms)
✓ License validation (status: active, expires 2027-04-15)
✓ AI provider (bedrock) (test prompt completed in 1.2s, 14 tokens)
✓ S3 report bucket writeable (test object PUT succeeded)
✓ CloudWatch log delivery (recent log entries present)

All checks passed. Deployment is healthy.
```
Open `https://complitru.your-corp.com` in your browser and log in with the one-time admin link.

To connect the cloud accounts CompliTru will scan, the simplest pattern is a cross-account IAM role:
The cross-account role uses an external ID for security and grants read-only permissions plus targeted write permissions for remediation actions. The full IAM policy is provided in the CloudFormation template at `terraform/templates/customer-account-role.yaml`.
For air-gapped or compliance-driven deployments, mirror CompliTru images to your own ECR:
```bash
# Authenticate to source registry
echo $GITHUB_TOKEN | docker login ghcr.io -u USERNAME --password-stdin

# Authenticate to destination ECR
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin \
    123456789012.dkr.ecr.us-east-1.amazonaws.com

# Create ECR repositories
for component in backend frontend worker; do
  aws ecr create-repository \
    --repository-name complitru/$component \
    --image-scanning-configuration scanOnPush=true \
    --image-tag-mutability IMMUTABLE
done

# Mirror images
for component in backend frontend worker; do
  docker pull ghcr.io/complitru/$component:1.0.0
  docker tag ghcr.io/complitru/$component:1.0.0 \
    123456789012.dkr.ecr.us-east-1.amazonaws.com/complitru/$component:1.0.0
  docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/complitru/$component:1.0.0
done
```
Then in terraform.tfvars:
```hcl
container_registry = "123456789012.dkr.ecr.us-east-1.amazonaws.com"
```
Apply Terraform to update the task definitions to pull from your ECR.
The Terraform module is environment-agnostic. To deploy staging alongside production:
```bash
# Use a separate Terraform workspace
cd terraform
terraform workspace new staging
cp terraform.tfvars terraform.staging.tfvars
# Edit terraform.staging.tfvars: change `environment`, `domain_name`, `vpc_cidr`, sizing
terraform apply -var-file=terraform.staging.tfvars
```
Resources are namespaced by environment so staging and production coexist in the same AWS account without conflict.
`SECRET_KEY` (Flask session signing). Impact: all active user sessions are invalidated; users must log in again.
```bash
NEW_KEY=$(python3 -c "import secrets; print(secrets.token_hex(32))")

aws secretsmanager update-secret \
  --secret-id complitru/production/app \
  --secret-string "$(aws secretsmanager get-secret-value \
      --secret-id complitru/production/app \
      --query SecretString --output text \
    | jq --arg k "$NEW_KEY" '.SECRET_KEY = $k')"

# Force backend tasks to pick up new value
aws ecs update-service \
  --cluster complitru-production \
  --service complitru-backend \
  --force-new-deployment
```
`ENCRYPTION_KEY` (Fernet field encryption). Impact: existing encrypted DB fields (API keys, integration credentials) become unreadable under a new key, so rotation requires a re-encryption migration.
Run the included rotation script:
```bash
./scripts/rotate-encryption-key.sh
```
Which generates a new key, re-encrypts all affected DB fields with the new key, atomically swaps the Secrets Manager value, and forces a service restart.
Use native Secrets Manager rotation (configured by Terraform):
```bash
aws secretsmanager rotate-secret --secret-id complitru/production/db
```
The rotation Lambda swaps credentials with zero downtime — RDS supports two simultaneous credential pairs during rotation.
`COMPLITRU_LICENSE_KEY`. When CompliTru issues a new license at renewal:
```bash
aws secretsmanager update-secret \
  --secret-id complitru/production/app \
  --secret-string "$(aws secretsmanager get-secret-value \
      --secret-id complitru/production/app \
      --query SecretString --output text \
    | jq --arg k "$NEW_LICENSE_KEY" '.COMPLITRU_LICENSE_KEY = $k')"

aws ecs update-service \
  --cluster complitru-production \
  --service complitru-backend \
  --force-new-deployment
```
```bash
# List automated snapshots
aws rds describe-db-snapshots \
  --db-instance-identifier complitru-production \
  --snapshot-type automated \
  --query 'DBSnapshots[*].[DBSnapshotIdentifier,SnapshotCreateTime,Status]'

# Restore the most recent snapshot to a verification instance
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier complitru-verify-$(date +%Y%m%d) \
  --db-snapshot-identifier <most-recent-snapshot-id> \
  --db-instance-class db.t3.small \
  --no-multi-az

# After verification, delete the verification instance
aws rds delete-db-instance \
  --db-instance-identifier complitru-verify-$(date +%Y%m%d) \
  --skip-final-snapshot
```
```bash
# List recent audit log entries (Object Lock prevents deletion)
aws s3 ls s3://complitru-audit-${ACCOUNT_ID}-${REGION}/audit/$(date +%Y/%m)/

# Verify Object Lock retention is in place
aws s3api get-object-retention \
  --bucket complitru-audit-${ACCOUNT_ID}-${REGION} \
  --key audit/$(date +%Y/%m/%d)/sample.json
```
When CloudWatch alarms fire (high error rate, high latency, license expiration approaching):

1. Check the dashboards in the `complitru-production` group:
   - ECS service CPU/memory
   - ALB 4xx / 5xx rates
   - RDS connections, CPU, IOPS
   - Celery queue depth
2. Mitigate:
   - Scale out the backend: `aws ecs update-service --service complitru-backend --desired-count 6`
   - Block a bad actor IP at WAF: add the IP to the deny list
   - Roll back to the previous task definition: `aws ecs update-service --task-definition complitru-backend:N-1`
   - Pause Celery tasks: `aws ecs update-service --service complitru-worker --desired-count 0`
3. Investigate:
   - CloudWatch Logs Insights: `fields @timestamp, @message | filter @logStream like /backend/ | sort @timestamp desc | limit 100`
   - Application audit log: query the `audit_log` table in RDS for the affected user / time window
   - VPC flow logs for network anomalies

For coordinated disclosure of CompliTru product vulnerabilities, email security@complitru.ai. Initial response within 48 hours.
To fully remove a CompliTru deployment:
```bash
cd terraform

# 1. Disable deletion protection on RDS first
terraform apply -var="db_deletion_protection=false"

# 2. Empty S3 buckets (Terraform won't delete non-empty buckets)
aws s3 rm s3://complitru-reports-${ACCOUNT_ID}-${REGION} --recursive
# Note: audit bucket cannot be emptied due to Object Lock — handle per your retention policy

# 3. Destroy infrastructure
terraform destroy
```
S3 audit bucket may need to remain in place until Object Lock retention period expires (default 7 years). The bucket can be left as a standalone resource after Terraform destroy of everything else.
Contacts:

- Support: support@complitru.ai
- Security: security@complitru.ai
- Sales / licensing: sales@complitru.ai

Reference your license key ID in all support correspondence for fastest routing.
Day-2 operations: monitoring, troubleshooting, scaling, common runbooks. Targets the platform engineer who keeps the deployment healthy.
| Endpoint | Returns | Use for |
|---|---|---|
| `GET /health` (backend) | `200 {"status": "ok"}` if Flask + RDS + Redis reachable | ALB target group health check |
| `GET /api/health` (frontend) | `200 {"status": "ok"}` | Frontend health |
| `GET /api/health/deep` (backend) | Detailed: license status, queue depth, AI provider status | Manual diagnosis |
| Celery worker | `celery -A app.celery inspect ping` (run via `aws ecs execute-command`) | Worker liveness |
ALB target group health check configuration:
```
Path:                /health
Healthy threshold:   2
Unhealthy threshold: 3
Timeout:             10 seconds
Interval:            30 seconds
Success codes:       200
```
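A deep health endpoint like `/api/health/deep` typically runs each dependency check independently and degrades the overall status on any failure, so one slow dependency never masks the others. A sketch of that aggregation pattern (the check functions below are stand-ins, not CompliTru's actual probes):

```python
from typing import Callable

def deep_health(checks: dict[str, Callable[[], dict]]) -> dict:
    """Run each dependency check; degrade overall status if any raises."""
    result = {"status": "ok", "checks": {}}
    for name, check in checks.items():
        try:
            result["checks"][name] = {"status": "ok", **check()}
        except Exception as exc:
            result["checks"][name] = {"status": "error", "detail": str(exc)}
            result["status"] = "degraded"
    return result

def failing_ai_check():
    # Stand-in for an AI provider probe that times out.
    raise TimeoutError("bedrock timeout")

report = deep_health({
    "license": lambda: {"expires": "2027-04-15"},
    "queue": lambda: {"depth": 3},
    "ai_provider": failing_ai_check,
})
print(report["status"])  # degraded
```

Returning per-dependency detail alongside a single top-level status is what makes the endpoint useful for manual diagnosis while staying cheap to poll.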
| Source | CloudWatch Log Group | Notes |
|---|---|---|
| Backend (Flask + Gunicorn) | `/aws/ecs/complitru-backend` | All HTTP requests, app logs, errors |
| Frontend (Next.js) | `/aws/ecs/complitru-frontend` | SSR errors, build warnings |
| Worker (Celery) | `/aws/ecs/complitru-worker` | Task execution logs, scan progress |
| ALB access logs | S3 bucket `complitru-alb-logs-*` (if enabled) | Per-request access logs |
| VPC flow logs | `/aws/vpc/flowlogs/complitru` | Network traffic, security investigation |
| RDS slow query log | `/aws/rds/instance/complitru-production/slowquery` | DB performance debugging |
| RDS error log | `/aws/rds/instance/complitru-production/error` | DB errors |
| Audit log | `/aws/ecs/complitru-audit` + S3 audit bucket | Immutable copy in S3 with Object Lock |
JSON-structured logs by default (LOG_FORMAT=json):
```json
{
  "timestamp": "2026-04-15T14:23:45.123Z",
  "level": "INFO",
  "logger": "app.scan_engine",
  "message": "Scan completed",
  "request_id": "req_a1b2c3d4",
  "user_id": 42,
  "account_id": "aws-account-123",
  "duration_ms": 18420,
  "findings_count": 47
}
```
All errors in the last hour:
```
fields @timestamp, @message
| filter level = "ERROR"
| sort @timestamp desc
| limit 100
```
Slowest 20 requests:
```
fields @timestamp, request_id, path, duration_ms
| filter ispresent(duration_ms)
| sort duration_ms desc
| limit 20
```
All actions by a specific user:
```
fields @timestamp, action, target, source_ip
| filter user_id = 42
| sort @timestamp desc
| limit 200
```
Bedrock errors:
```
fields @timestamp, @message
| filter @message like /bedrock/i
| filter level in ["WARNING", "ERROR"]
| sort @timestamp desc
| limit 100
```
Celery task failures:
```
fields @timestamp, task_name, exception
| filter @logStream like /worker/
| filter @message like /Task .* raised/
| sort @timestamp desc
| limit 50
```
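Because the logs are structured JSON, the same filter/sort/limit logic is easy to reproduce offline against an exported log file. A small sketch (not a Logs Insights client, just local post-processing):

```python
import json

def filter_logs(lines, level=None, contains=None, limit=100):
    """Mimic a Logs Insights filter/sort/limit over JSON log lines."""
    records = []
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip any non-JSON lines mixed into the export
        if level and rec.get("level") != level:
            continue
        if contains and contains.lower() not in rec.get("message", "").lower():
            continue
        records.append(rec)
    # Newest first, like `sort @timestamp desc`
    records.sort(key=lambda r: r.get("timestamp", ""), reverse=True)
    return records[:limit]

lines = [
    '{"timestamp": "2026-04-15T14:23:45Z", "level": "ERROR", "message": "Bedrock call failed"}',
    '{"timestamp": "2026-04-15T14:24:00Z", "level": "INFO", "message": "Scan completed"}',
    "not json",
]
print(len(filter_logs(lines, level="ERROR")))  # 1
```

Useful when an incident postmortem needs ad-hoc queries against logs already exported from CloudWatch.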
Created by Terraform; published to SNS topic complitru-alerts (route to PagerDuty / Slack / email).
| Alarm | Threshold | Severity |
|---|---|---|
| `complitru-backend-cpu-high` | Avg CPU > 85% for 10 min | Warning |
| `complitru-backend-memory-high` | Avg memory > 85% for 10 min | Warning |
| `complitru-backend-task-count-low` | Healthy task count < 1 for 5 min | Critical |
| `complitru-worker-task-count-low` | Healthy task count < 1 for 5 min | Critical |
| `complitru-alb-5xx-rate` | 5xx rate > 1% over 5 min | Warning |
| `complitru-alb-target-response-time` | p95 response time > 5s for 10 min | Warning |
| `complitru-rds-cpu-high` | Avg CPU > 80% for 15 min | Warning |
| `complitru-rds-storage-low` | Free storage < 20% | Warning |
| `complitru-rds-connections-high` | Connections > 80% of max | Warning |
| `complitru-redis-memory-high` | Memory usage > 85% | Warning |
| `complitru-celery-queue-depth` | Queue depth > 1000 for 15 min | Warning |
| `complitru-license-expiry-warning` | License expires in < 7 days | Critical |
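Each of these alarms follows the standard CloudWatch shape: aggregate a metric per period, compare to a threshold, and alarm only after N consecutive breaching periods. A pure-Python sketch of that evaluation (a simplification of CloudWatch's actual state machine):

```python
def alarm_state(datapoints, threshold, periods_required, above=True):
    """datapoints: oldest-first list of per-period aggregates.
    ALARM only if the most recent `periods_required` points all breach."""
    if len(datapoints) < periods_required:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods_required:]
    breached = all((d > threshold) if above else (d < threshold) for d in recent)
    return "ALARM" if breached else "OK"

# complitru-backend-cpu-high: avg CPU > 85% for 10 min = two 5-minute periods
print(alarm_state([70, 90, 92], threshold=85, periods_required=2))  # ALARM
```

Requiring consecutive breaches is what keeps a single noisy datapoint (a deploy spike, a GC pause) from paging anyone.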
Custom CloudWatch dashboards created by Terraform:
Access at AWS Console → CloudWatch → Dashboards → complitru-*.
Backend exports custom metrics to CloudWatch under namespace CompliTru/Application:
| Metric | Unit | Description |
|---|---|---|
| `ScansStarted` | Count | Scans initiated per period |
| `ScansCompleted` | Count | Scans completed successfully |
| `ScansFailed` | Count | Scans failed |
| `ScanDuration` | Milliseconds | Scan completion time |
| `FindingsCreated` | Count | New findings per period |
| `RemediationsApplied` | Count | Remediations executed |
| `AICompletions` | Count | LLM calls per provider |
| `AITokensUsed` | Count | Total AI tokens consumed |
| `LicenseValidationFailures` | Count | License check failures |
ECS auto-scaling is target-tracking on CPU. Adjust thresholds in Terraform:
```hcl
backend_desired_count      = 2
backend_max_count          = 10
scaling_target_cpu_percent = 70

worker_desired_count       = 1
worker_max_count           = 5
worker_scaling_queue_depth = 100 # scale workers when queue > N tasks
```
Apply with terraform apply. New tasks come online in ~60 seconds.
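Target tracking scales proportionally: roughly `desired = ceil(current_count × metric / target)`, clamped to the configured min/max. A sketch of that arithmetic (an approximation of AWS's actual algorithm, which also applies cooldowns):

```python
import math

def desired_count(current, metric_pct, target_pct, min_count, max_count):
    """Approximate a target-tracking scaling decision."""
    if current == 0:
        return min_count
    desired = math.ceil(current * metric_pct / target_pct)
    return max(min_count, min(max_count, desired))

# 2 tasks averaging 95% CPU against a 70% target -> scale to 3
print(desired_count(2, 95, 70, min_count=2, max_count=10))  # 3
```

This is why a modest CPU overshoot adds one task at a time, while a large spike can jump straight toward `backend_max_count`.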
To increase task size:
```hcl
backend_cpu    = 2048 # 2 vCPU
backend_memory = 4096 # 4 GB
```
`terraform apply` triggers a rolling restart. Connection draining keeps existing requests served.
Storage scales automatically up to db_max_storage_gb. To resize the instance class:
```hcl
db_instance_class = "db.r5.large" # was db.t3.medium
```
terraform apply triggers an RDS modification. With Multi-AZ, this is a zero-downtime modification (failover to standby, modify, failover back). With single-AZ, expect ~5 minutes of downtime.
Check ECS service health:
```bash
aws ecs describe-services \
  --cluster complitru-production \
  --services complitru-backend \
  --query 'services[0].{desired:desiredCount,running:runningCount,pending:pendingCount}'
```
Check recent task failures:
```bash
aws ecs list-tasks \
  --cluster complitru-production \
  --service-name complitru-backend \
  --desired-status STOPPED \
  --max-items 5
```
Inspect why tasks stopped:
```bash
aws ecs describe-tasks \
  --cluster complitru-production \
  --tasks <task-arn> \
  --query 'tasks[0].{stoppedReason,stopCode,exitCode:containers[0].exitCode}'
```
Common stopped reasons and remediation:
| Reason | Remediation |
|---|---|
| `Essential container exited` + exit code 1 | Check CloudWatch logs for the stack trace; roll back to the previous task definition if a recent deploy |
| `Out of memory` | Increase `backend_memory` in Terraform |
| `Health check failed` | Backend returning non-200; check the `/health` deep dive endpoint |
| `Task placement failed` | Subnet running out of IPs; expand subnet CIDR or add a subnet |
| `Image pull failure` | ECR auth issue or image deleted; verify the image exists and the ECS task role has pull permissions |
Check worker count:
```bash
aws ecs describe-services \
  --cluster complitru-production \
  --services complitru-worker \
  --query 'services[0].runningCount'
```
Check queue depth:
```bash
# Open a shell in a backend task via ECS exec
aws ecs execute-command \
  --cluster complitru-production \
  --task <task-arn> \
  --container backend \
  --interactive \
  --command "/bin/bash"
```

Inside the task:

```python
from app.celery import celery

inspect = celery.control.inspect()
print('Active:', inspect.active())
print('Reserved:', inspect.reserved())
print('Scheduled:', inspect.scheduled())
```
Scale workers up:
```bash
aws ecs update-service \
  --cluster complitru-production \
  --service complitru-worker \
  --desired-count 4
```
If workers are running but tasks are not being picked up, suspect Redis connectivity. Check the Redis security group and TLS configuration.
Symptom: backend logs License validation failed repeatedly.
Check license key in Secrets Manager:
```bash
aws secretsmanager get-secret-value \
  --secret-id complitru/production/app \
  --query 'SecretString' --output text \
  | jq '.COMPLITRU_LICENSE_KEY' | head -c 20
```
(Just verify it starts with ctl_ — full key is sensitive.)
Check license server reachability from a backend task:
```bash
# Via ECS exec
curl -v https://license.complitru.ai/health
```
Check VPC egress — license check requires outbound HTTPS via NAT gateway. If NAT is broken, license validation fails.
Check license expiry — every license has an expiration. Contact sales@complitru.ai to renew. The 7-day offline grace period buys time, but the application will fail closed after grace expires.
If license is valid but check still fails — potential clock skew. Verify task time is correct (NTP via ECS task agent should handle this automatically).
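The grace-period behavior can be reasoned about as a small decision table: the app fails closed only once the license has expired, or once the license server is unreachable *and* the last successful validation is older than the grace window. A sketch (the 7-day window comes from this doc; the function and its "expired licenses never get grace" rule are assumptions about the actual implementation):

```python
from datetime import datetime, timedelta

GRACE = timedelta(days=7)  # offline grace window documented above

def license_decision(now, last_ok_validation, license_expiry, server_reachable):
    """Return 'ok', 'grace', or 'fail_closed'."""
    if now > license_expiry:
        return "fail_closed"  # assumed: an expired license gets no grace
    if server_reachable:
        return "ok"
    if now - last_ok_validation <= GRACE:
        return "grace"        # offline but inside the 7-day window
    return "fail_closed"

now = datetime(2026, 4, 15)
print(license_decision(now, now - timedelta(days=3),
                       datetime(2027, 4, 15), server_reachable=False))  # grace
```

This framing makes the two failure modes distinct during an incident: a NAT outage puts you in `grace` with a 7-day clock ticking, while an actual expiry needs sales@complitru.ai.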
Identify slow queries:
```sql
SELECT * FROM mysql.slow_log
ORDER BY query_time DESC
LIMIT 20;
```
Check connection count:
```sql
SHOW STATUS LIKE 'Threads_connected';
```
If approaching max_connections, increase the parameter group value or scale instance class.
Common culprits:
- Unindexed query on the `findings` table — verify indexes exist on common filter columns (`account_id`, `severity`, `status`)
- Audit log query without a time bound — always filter audit log queries by `created_at`
- Long-running scan transaction — check the processlist for queries > 60s

Emergency mitigation: kill long-running queries with `KILL <thread_id>` (admin only).
Most common: Bedrock model access not enabled or region mismatch.
Check the actual error in CloudWatch logs:
```
fields @timestamp, @message
| filter @message like /bedrock/i
| filter level = "ERROR"
| sort @timestamp desc
| limit 20
```
Common errors and fixes:
| Error | Fix |
|---|---|
| `AccessDeniedException` | Enable model access in the Bedrock console for the region |
| `Model not found` | Verify `BEDROCK_MODEL_DEFAULT` matches an enabled model in your region |
| `ThrottlingException` | Bedrock per-region quota hit; request a quota increase or distribute load across regions |
| `Timeout` | Increase `WORKER_TIMEOUT` for long-running AI calls; check VPC endpoint configuration if using one |
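For `ThrottlingException` specifically, a client-side retry with exponential backoff and jitter usually rides out short quota bursts. A generic sketch of the pattern (not CompliTru's actual Bedrock client; the exception class here is a local stand-in):

```python
import random
import time

class ThrottlingException(Exception):
    """Stand-in for the provider's throttling error."""

def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry fn() on ThrottlingException with exponential backoff + jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottlingException:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.0)
            sleep(delay)  # injectable so tests don't actually wait

# Simulated provider: throttles twice, then succeeds.
attempts = {"n": 0}
def flaky_bedrock_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ThrottlingException()
    return "completion"

print(call_with_backoff(flaky_bedrock_call, sleep=lambda _: None))  # completion
```

The jitter term matters: without it, many workers throttled at once retry in lockstep and hit the quota again together.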
Tip: set `AI_ENABLED=false` temporarily to keep the product operational while diagnosing.

Check ALB target health:

```bash
aws elbv2 describe-target-health \
  --target-group-arn <backend-tg-arn> \
  --query 'TargetHealthDescriptions[*].{id:Target.Id,health:TargetHealth.State,reason:TargetHealth.Reason}'
```

- If targets are `unhealthy` with reason `Target.Timeout` — the task is overloaded; scale up
- If targets are `unused` — security group not allowing ALB → task port; check the Terraform SG
- If the target list is empty — the service has 0 running tasks; investigate why

CompliTru-published images are immutable and signed. To verify an image before deploy:
# Verify signature
cosign verify ghcr.io/complitru/backend:1.0.0 \
--certificate-identity-regexp '.*@complitru\.ai$' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com'
# Verify SBOM
syft ghcr.io/complitru/backend:1.0.0 -o spdx-json > sbom.json
diff sbom.json official-sbom-1.0.0.json
If verification fails, do not deploy. Contact security@complitru.ai.
Secrets to rotate on a regular schedule: `SECRET_KEY`, `ENCRYPTION_KEY`, `JWT_SECRET`.

API latency target: p95 < 500ms for read endpoints, < 2s for scan triggers.

If exceeding:

- `WORKERS=4` is conservative; bump Gunicorn workers to `2 * vCPU + 1`

Scan throughput target: a 1000-resource AWS account scanned in under 5 minutes.

If slower:

- `SCAN_MAX_PARALLEL_ACCOUNTS` — defaults to 10 concurrent accounts
- `CELERY_WORKER_CONCURRENCY` — more concurrent task processing per worker
- Increase `worker_desired_count` and `worker_max_count`

AI response latency: typically 0.5–1.5s for the fast model (a higher bound applies to the default model).

If slower than typical:

- `BEDROCK_REGION` — should match the deployment region (cross-region adds 50–200ms)
- Switch `chat_completion` to `stream_completion` for perceived faster response

Typical monthly cost for a default deployment:
| Component | Cost |
|---|---|
| ECS Fargate (backend, frontend, worker) | ~$140 |
| RDS db.t3.medium Multi-AZ | ~$140 |
| ElastiCache t3.micro | ~$25 |
| ALB | ~$25 |
| NAT Gateway (single) | ~$30 |
| CloudWatch (logs + metrics + alarms) | ~$40 |
| S3 | ~$10 |
| Secrets Manager | ~$5 |
| Bedrock (depends on usage) | ~$30–200 |
| **Total** | **~$445–615/mo** |
Cost reduction options:
- `db_multi_az = false` for non-prod (~$70/mo savings)
- `AI_ENABLED=false` for environments not using AI (~$30–200/mo savings)

| Issue type | Contact | Response SLA |
|---|---|---|
| Operational issue with the deployment | support@complitru.ai | 4 business hours |
| Security vulnerability in the product | security@complitru.ai | 48 hours initial response |
| License renewal / contract | sales@complitru.ai | 1 business day |
| Critical production outage | support@complitru.ai + phone (in your support contract) | 1 hour |
Always include in support requests:

- Your license key ID (first 12 chars only)
- Deployment region and version
- Relevant log snippets (CloudWatch query results acceptable)
- Steps to reproduce
- Impact statement
How to upgrade your CompliTru deployment to a new version safely with zero data loss and minimal downtime.
CompliTru uses semantic versioning: MAJOR.MINOR.PATCH.
| Bump type | Compatibility | Action required |
|---|---|---|
| Patch (e.g., 1.0.0 → 1.0.1) | Fully backward-compatible — bug fixes only | Image swap, no migrations |
| Minor (e.g., 1.0.x → 1.1.0) | Backward-compatible — new features, additive schema changes | Image swap + migrations (online) |
| Major (e.g., 1.x.x → 2.0.0) | Breaking changes possible — schema, configuration, or API | Image swap + migrations + configuration review |
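The required action can be derived mechanically from the two version strings; a small sketch (the classification mirrors the table above):

```python
def upgrade_action(current: str, target: str) -> str:
    """Classify an upgrade per the MAJOR.MINOR.PATCH compatibility table."""
    cur = tuple(int(p) for p in current.split("."))
    tgt = tuple(int(p) for p in target.split("."))
    if tgt < cur:
        return "rollback"
    if tgt[0] > cur[0]:
        return "major: image swap + migrations + configuration review"
    if tgt[1] > cur[1]:
        return "minor: image swap + migrations (online)"
    return "patch: image swap, no migrations"

print(upgrade_action("1.0.5", "1.1.0"))  # minor: image swap + migrations (online)
```

A check like this is handy in deployment automation: it can gate a major bump behind a manual-approval step while letting patches flow through.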
| Release type | Frequency | Notification |
|---|---|---|
| Critical security patch | Within 72 hours of CVE disclosure | Email to license-tied address |
| Patch | Bi-weekly | Email + release notes |
| Minor | Quarterly | Email + release notes + 30-day preview |
| Major | Annually | Email + release notes + 90-day preview + migration guide |
Release notes for every version: https://complitru.ai/releases (license required).
Before any upgrade:

- Take a manual RDS snapshot (named `pre-upgrade-${VERSION}-${DATE}`)
- Read the release notes for the target version

Patch upgrades: zero downtime, rolling restart.
```bash
# 1. Update Terraform variable
cd terraform
sed -i.bak 's/complitru_version = "1.0.0"/complitru_version = "1.0.1"/' terraform.tfvars

# 2. Plan and apply
terraform plan
terraform apply

# 3. Monitor deployment
aws ecs wait services-stable \
  --cluster complitru-production \
  --services complitru-backend complitru-frontend complitru-worker

# 4. Verify
cd ../scripts
./verify-deployment.sh
```
ECS performs a rolling deploy: new tasks come up healthy, then old tasks drain. Connection draining handles in-flight requests. Total time: 5–10 minutes.
```bash
# Revert version
sed -i.bak 's/complitru_version = "1.0.1"/complitru_version = "1.0.0"/' terraform.tfvars
terraform apply
```
ECS rolls back to the previous task definition revision. Patches are guaranteed schema-compatible — no data migration required for rollback.
Online schema migrations. Brief feature flag rollout possible.
Minor releases may include:

- New configuration variables (with safe defaults)
- New database tables / columns (additive only)
- Deprecated features (with at least one minor cycle of warning)
- New optional features behind feature flags
```bash
cd terraform

# Update version
sed -i.bak 's/complitru_version = "1.0.5"/complitru_version = "1.1.0"/' terraform.tfvars

# Add any new variables called out in release notes
# Example: a new feature flag introduced in 1.1.0
echo 'feature_xyz_enabled = false # New in 1.1.0, default off' >> terraform.tfvars
```
Migrations are run as a one-off ECS task with the new image, before promoting the service:
```bash
cd ../scripts
./migrate.sh --version=1.1.0 --dry-run   # Preview migrations
./migrate.sh --version=1.1.0             # Apply
```
The migration script:

- Uses Alembic with idempotent operations
- Applies in a single transaction per migration (auto-rollback on error)
- Logs every step to CloudWatch
- Refuses to run if any migration would cause more than 30 seconds of table lock (manual approval required for those)
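The 30-second lock guard amounts to a pre-flight partition over the planned migrations. A sketch (the estimated lock times here are hypothetical inputs; the real script derives them from table statistics):

```python
MAX_LOCK_SECONDS = 30  # threshold documented above

def partition_migrations(migrations):
    """Split planned migrations into auto-applicable vs needs-manual-approval."""
    auto, manual = [], []
    for m in migrations:
        (manual if m["est_lock_s"] > MAX_LOCK_SECONDS else auto).append(m["id"])
    return auto, manual

# Hypothetical migration plan with per-migration lock estimates.
planned = [
    {"id": "0048_add_findings_index", "est_lock_s": 4},
    {"id": "0049_rebuild_audit_pk", "est_lock_s": 180},
]
auto, manual = partition_migrations(planned)
print(auto, manual)
```

Anything landing in the manual bucket is exactly what the next paragraph's online-migration tooling exists for.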
For long-running migrations (large tables, index rebuilds), the script supports online migration via pt-online-schema-change patterns. Documented per-migration in release notes.
```bash
cd ../terraform
terraform apply
```
Same rolling deploy pattern as patch upgrades. Total time: 10–20 minutes including migration.
```bash
cd ../scripts
./verify-deployment.sh
```
Plus check the CompliTru UI:

- Login works
- Dashboards render
- A scan can be initiated and completes
- New features in release notes are present (if any)
If issues are detected post-upgrade:
```bash
# Revert application version
sed -i.bak 's/complitru_version = "1.1.0"/complitru_version = "1.0.5"/' terraform.tfvars
terraform apply
```
Application code rolls back. Schema is forward-compatible (additive changes), so old code continues working with new schema. Do not attempt to roll back schema migrations — additive changes are safe to leave; reverting them risks data loss.
If rollback fails or behavior is broken:

- Restore RDS from the pre-upgrade snapshot taken in the pre-upgrade checklist
- Redeploy the old version against the restored DB
- Time: ~30 minutes
- Data loss window: between the pre-upgrade snapshot and the incident
Major upgrades may include breaking changes. Plan a maintenance window.
Every major release ships with a dedicated migration guide at https://complitru.ai/releases/2.0.0/migration. Read it end-to-end.
Key sections to expect:

- Configuration changes (renamed / removed variables)
- Schema changes (any non-additive changes called out explicitly)
- API changes (deprecated endpoints, breaking changes)
- IAM permission changes (any new permissions required)
- Behavioral changes (defaults that changed)
Spin up a staging deployment from a recent production backup:
```bash
cd terraform
terraform workspace new staging-2x-test
cp terraform.tfvars terraform.staging.tfvars
# Edit: set environment, domain, smaller sizing
terraform apply -var-file=terraform.staging.tfvars

# Restore production snapshot to staging RDS
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier complitru-staging \
  --db-snapshot-identifier <recent-prod-snapshot>

# Run migration against staging
./scripts/migrate.sh --version=2.0.0 --target=staging
```
Run your test suite, manual QA, and integration tests in staging before touching production.
Notify users in advance via:
- In-app banner (Settings → Maintenance Mode → Schedule)
- Email to admin distribution list
- Status page update (if you publish one)
Recommended window: 2 hours during low-traffic period for first major upgrade. Subsequent ones can be shorter once you have experience.
```bash
# 1. Take a pre-upgrade manual snapshot. Capture the name once so the
#    create and wait commands reference the same snapshot (calling
#    `date` twice could produce two different identifiers).
SNAP_ID="pre-upgrade-2.0.0-$(date +%Y%m%d-%H%M)"
aws rds create-db-snapshot \
  --db-instance-identifier complitru-production \
  --db-snapshot-identifier "$SNAP_ID"

# Wait for snapshot to complete
aws rds wait db-snapshot-available \
  --db-snapshot-identifier "$SNAP_ID"

# 2. Enable maintenance mode (returns 503 to users with maintenance message)
aws ecs update-service \
  --cluster complitru-production \
  --service complitru-backend \
  --task-definition complitru-backend-maintenance:LATEST

# 3. Wait for tasks to drain
sleep 60

# 4. Run migrations
cd scripts
./migrate.sh --version=2.0.0

# 5. Update Terraform with new version + any new required variables
cd ../terraform
# (edit terraform.tfvars per migration guide)

# 6. Apply
terraform apply

# 7. Disable maintenance mode (Terraform apply restores normal task definitions)

# 8. Verify
cd ../scripts
./verify-deployment.sh
```
Major upgrades may include schema changes that are not safely reversible. The rollback procedure for a major upgrade is restore from snapshot:
```bash
# 1. Stop application traffic
aws ecs update-service \
  --cluster complitru-production \
  --service complitru-backend \
  --desired-count 0

# 2. Rename current DB instance (preserves the upgraded data for forensics)
aws rds modify-db-instance \
  --db-instance-identifier complitru-production \
  --new-db-instance-identifier complitru-production-2x-failed \
  --apply-immediately

# 3. Restore from pre-upgrade snapshot to the production identifier
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier complitru-production \
  --db-snapshot-identifier pre-upgrade-2.0.0-${TIMESTAMP}

# 4. Wait for restored instance to be available
aws rds wait db-instance-available \
  --db-instance-identifier complitru-production

# 5. Update Terraform back to old version, apply
cd terraform
sed -i.bak 's/complitru_version = "2.0.0"/complitru_version = "1.x.y"/' terraform.tfvars
terraform apply

# 6. Verify
cd ../scripts
./verify-deployment.sh
```
Data loss: anything written between snapshot time and rollback time. Document the gap and communicate to users.
CompliTru-published images use immutable tags. Once a version is published, the digest never changes:
```
ghcr.io/complitru/backend:1.0.0 → sha256:abc123... (always)
```
If a critical issue is found, a new patch version is released — never a re-tag of the same version.
For maximum reproducibility, pin to digests in terraform.tfvars:
```hcl
backend_image  = "ghcr.io/complitru/backend@sha256:abc123..."
frontend_image = "ghcr.io/complitru/frontend@sha256:def456..."
worker_image   = "ghcr.io/complitru/worker@sha256:ghi789..."
```
Digests for each release are published in release notes and in your release email.
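If you adopt digest pinning, a CI guard can enforce it. This is a hedged sketch (not part of the product) that fails when any `*_image` variable in `terraform.tfvars` still uses a mutable tag:

```shell
# Fail the pipeline if any image variable uses a tag instead of a digest.
check_pinned() {
  # Succeed only when every *_image assignment contains an @sha256: digest.
  ! grep -E '^[a-z_]*image[[:space:]]*=' "$1" | grep -vq '@sha256:'
}

cat > /tmp/example.tfvars <<'EOF'
backend_image  = "ghcr.io/complitru/backend@sha256:abc123"
frontend_image = "ghcr.io/complitru/frontend:1.0.0"
EOF
check_pinned /tmp/example.tfvars || echo "unpinned image reference found"
```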
Every CompliTru image is signed with cosign. Verify before deploy:
cosign verify ghcr.io/complitru/backend:1.0.0 \
--certificate-identity-regexp '.*@complitru\.ai$' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com'
If verification fails, do not deploy. Contact security@complitru.ai.
Each release publishes an SBOM (Software Bill of Materials) in SPDX JSON format:
# Pull the official SBOM
curl -O https://releases.complitru.ai/1.0.0/sbom-backend.spdx.json
# Generate SBOM from the image you have
syft ghcr.io/complitru/backend:1.0.0 -o spdx-json > my-sbom.json
# Compare
diff <(jq -S . sbom-backend.spdx.json) <(jq -S . my-sbom.json)
Differences indicate the image was tampered with or you have the wrong version.
License keys carry forward across upgrades. No action required.
CompliTru migrations follow these rules:
- Major releases may include destructive changes (drops, type changes). These are always called out in the migration guide and require explicit operator approval.
- When a new release renames or removes a configuration variable, the required terraform.tfvars edits are documented in the migration guide.
We strongly recommend maintaining a staging environment for upgrade testing. Patterns:
Persistent staging: run a full staging deployment continuously, sized down. Cost: roughly 25–30% of production. Provides the best confidence — staging is always real and current.
Ephemeral staging: spin up staging only when an upgrade is planned. Restore from a production snapshot, run the upgrade, validate, tear down. Cost: ~$50/upgrade. Slower but cheaper.
Both patterns are supported by the included Terraform modules — use a separate Terraform workspace for staging.
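Under the separate-workspace approach, a sized-down staging variable file might look like this. The variable names below are illustrative, not the modules' documented inputs — check the module source for the actual names:

```hcl
# staging.tfvars — hypothetical sized-down overrides for a staging workspace
environment        = "staging"
backend_task_count = 1                 # production typically runs more
db_instance_class  = "db.t4g.medium"   # smaller than production
multi_az           = false             # single-AZ is acceptable for upgrade rehearsal
```

Select it with `terraform workspace select staging` and `terraform apply -var-file=staging.tfvars`.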
CompliTru tracks upgrade history in the application:
SELECT
version,
upgraded_at,
upgraded_by,
migrations_applied,
duration_seconds
FROM upgrade_history
ORDER BY upgraded_at DESC;
Use this for compliance evidence (SOC 2 CC8.1 — change management).
For paid support customers, CompliTru provides upgrade assistance:
Contact support@complitru.ai at least one week ahead of a major upgrade to schedule.
How the CompliTru self-hosted deployment satisfies SOC 2 Trust Services Criteria. This document is structured for use as auditor evidence: each control statement maps to specific technical implementations with file references and verification steps.
The mapping covers Trust Services Criteria for:
- Security (Common Criteria CC1–CC9) — required for all SOC 2 reports
- Availability (A1) — applicable
- Confidentiality (C1) — applicable
- Processing Integrity (PI1) — applicable
Privacy (P1–P8) is applicable when CompliTru is used to handle personal data; controls noted where relevant.
Implementation:
- CompliTru's source code is governed by a Code of Conduct (in product source repository)
- All committers sign a Contributor License Agreement
- Customer-facing security disclosures handled via security@complitru.ai with PGP key
Customer evidence:
- Acceptable Use Policy enforced via Terms of Service displayed at first login
- Audit log captures every action attributed to a named user — see audit_log table
Customer responsibility — CompliTru provides audit log evidence to support board-level reporting.
Customer evidence within CompliTru:
- RBAC with admin, analyst, viewer, auditor roles (Settings → Users & Roles)
- Role assignment audit log entries (audit_log filtered by action = 'role.assign')
- Optional approval workflows for high-impact remediation (Settings → Approval Workflows)
Implementation:
- CompliTru provides quarterly product training webinars to customer admins
- Built-in inline documentation for every check and remediation
- Onboarding workflow walks new admins through configuration
Customer evidence:
- Audit log captures actor, source IP, and before/after state for every mutation
- Reports queryable per user, per time window: Settings → Audit Log → Export
- Daily / weekly admin activity report scheduled to a security distribution list
Implementation in CompliTru:
- Notifications module (Slack, Teams, email, PagerDuty) for security findings and remediation events
- Configurable digest reports per role (admin daily, analyst weekly, executive monthly)
- In-app notification center for finding assignments and escalations

Implementation:
- Webhooks for outbound integration with customer-managed systems (SIEM, ticketing)
- API for programmatic access to findings and remediation status
- CompliTru security advisories published via license-key-tied email distribution
The customer remains responsible for how CompliTru-provided data is used in external communications.
Implementation:
- Compliance frameworks built-in: SOC 2, ISO 27001, NIST 800-53, NIST CSF, CIS Benchmarks, HIPAA, PCI DSS
- Custom framework authoring for organization-specific objectives
- Posture dashboard: live percentage compliance per framework

Implementation:
- Continuous scanning across configured cloud accounts
- 600+ checks identify misconfigurations, vulnerabilities, identity risks, drift
- AI-assisted risk narration explains business impact (when AI enabled)

Implementation:
- CIEM module identifies privilege escalation paths, dormant identities with elevated permissions, cross-account access risks
- Audit log captures every admin action with actor and source IP
- Anomaly detection on user behavior (failed login spikes, unusual scan patterns)

Implementation:
- Drift detection: previously-resolved findings that reappear are flagged as drift events
- CloudTrail integration surfaces configuration changes affecting compliance posture
- Scheduled reports show week-over-week and month-over-month posture changes

Implementation:
- Continuous scanning at customer-defined cadence (default daily)
- Real-time event-driven scans triggered by CloudTrail events (optional)
- Manual scans on demand via API or UI
- Health dashboard: live status of scanners, queues, integrations

Implementation:
- Findings prioritized by severity (Critical / High / Medium / Low / Info)
- Configurable SLAs per severity (e.g., Critical resolved within 24 hours)
- Overdue findings escalated automatically to designated channels
- SLA compliance dashboard for management reporting

Implementation:
- 600+ pre-built checks mapped to SOC 2, ISO, NIST, CIS, HIPAA, PCI DSS controls
- Custom check authoring SDK for organization-specific controls
- Frameworks editor for mapping checks to internal controls
Implementation:
- Built-in checks for: encryption at rest, encryption in transit, IAM policies, network segmentation, logging, monitoring, backup, change management, vulnerability management
- Coverage matrix visible at Settings → Frameworks → SOC 2
Implementation:
- Remediation playbooks codify standard fix procedures
- Approval workflows enforce required reviews before changes apply
- Audit log creates immutable record of every applied change

CompliTru self-hosted implementation:
- All traffic over TLS 1.2+ (HSTS enforced)
- Authentication via username/password (Argon2 hashed), OAuth (Google / Microsoft), or SAML 2.0 SSO
- MFA via TOTP (required for admin role by default)
- API key authentication for programmatic access (SHA-256 hashed at rest)
- Session management: configurable timeout, concurrent session limits, IP allow-listing
- See SECURITY.md
Implementation:
- Self-service registration disabled by default — admin must invite new users
- Invitation emails are time-limited, single-use
- Default role on invitation is viewer — privilege elevation requires explicit admin action
- All user creation events logged to audit log with actor and timestamp
Implementation:
- Admin can deactivate users with one click — sessions invalidated immediately
- API keys revocable with immediate effect
- SAML/OAuth users auto-deactivated when removed from upstream identity provider (with optional sync interval)
- Audit log of all deactivation events
Customer responsibility for AWS data center physical security — covered under AWS's SOC 2 report (AWS is a sub-processor; their report should be referenced in your audit).
Implementation:
- WAF (optional, AWS WAF managed rule sets) at ALB
- Rate limiting per IP and per API key
- Failed authentication lockout (configurable threshold, default 5 attempts → 15-minute lockout)
- Suspicious activity alerts (impossible travel, unusual access patterns)
- VPC isolation: application and DB tier in private subnets, no public IPs on application tasks
Customer responsibility for AWS S3 / EBS configuration. CompliTru-created buckets follow least-privilege defaults; see SECURITY.md.
Implementation:
- All data in transit encrypted (TLS 1.2+)
- Internal ALB → ECS task traffic in private subnet (encryption-in-transit configurable)
- AWS service calls (Secrets Manager, RDS, S3, Bedrock) use TLS by default
- See SECURITY.md

Implementation:
- All admin actions write to immutable audit log (Object Lock S3 bucket, 7-year retention by default)
- Configuration drift detection identifies changes to scanned resources
- Approval workflows require multi-stage approval for sensitive changes
- IAM policies follow least-privilege per service component

Self-hosted CompliTru implementation:
- Container images built from minimal base, scanned with Trivy at CI time, SBOM published
- Customer-side ECR scanning recommended for ongoing CVE detection on mirrored images
- Quarterly security patches with 72-hour SLA for critical CVEs
- See SECURITY.md

Customer environment scanning (the product itself):
- Continuous vulnerability scanning of customer cloud resources
- Vulnerability findings prioritized by EPSS score, exploitability, blast radius

Implementation:
- CloudWatch container insights enabled on ECS cluster
- Pre-configured alarms for: ECS task health, ALB error rates, RDS performance, license expiration
- All application logs ship to CloudWatch Logs
- Audit log mirrored to immutable S3 bucket
Implementation:
- Application audit log entries can be queried via UI (Settings → Audit Log)
- CloudWatch Logs Insights provides query language for incident investigation
- Optional integration with SIEM (Splunk, Datadog, Sumo Logic, Security Hub) for correlation
Implementation:
- Documented incident response runbook in DEPLOYMENT.md
- Containment actions (scale to zero, WAF block, secret rotation) documented with example commands
- CompliTru security team contact for product-side incidents: security@complitru.ai
Implementation:
- RDS automated backups with point-in-time recovery (default 30-day retention)
- S3 versioning on report bucket with lifecycle to Glacier
- ECS stateless — redeploy from known-good container image
- Documented rotation procedures for SECRET_KEY, ENCRYPTION_KEY, JWT_SECRET, license key, DB credentials

CompliTru product change management (vendor responsibility):
- All changes via pull request with required code review
- CI runs SAST (Bandit), SCA (npm audit, Trivy), secret scanning (Gitleaks)
- All commits signed
- SBOM published per release
- Release notes describe every change

Customer change management for the deployment:
- Terraform state captures every infrastructure change
- ECS task definitions versioned (cannot be edited, only replaced)
- Container image digests pinnable for reproducible deployments
- Application configuration changes via Secrets Manager versioning

Implementation:
- Risk dashboard ranks open findings by severity × exploitability × asset criticality
- Remediation suggestions provided for every finding
- Impact analysis evaluates blast radius before remediation is applied
- Auto-remediation available for whitelisted check categories
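The secret rotation mentioned above can be sketched as follows. This is a hedged outline, not the authoritative procedure (see DEPLOYMENT.md); the secret path `complitru/SECRET_KEY` is an assumed name:

```shell
# Hypothetical SECRET_KEY rotation sketch.
generate_key() { openssl rand -hex 32; }   # 64 hex characters

new_secret="$(generate_key)"
echo "generated ${#new_secret}-character key"
# Store the new value, then force a new ECS deployment so tasks pick it up:
# aws secretsmanager put-secret-value \
#   --secret-id complitru/SECRET_KEY \
#   --secret-string "$new_secret"
# aws ecs update-service --cluster complitru-production \
#   --service complitru-backend --force-new-deployment
```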
Implementation:
- Subprocessor list maintained at complitru.ai/legal/subprocessors
- AWS sub-processors used: KMS, Secrets Manager, S3, RDS, ECS, ALB, CloudWatch, CloudTrail, ACM, Bedrock (optional), Textract (optional), Comprehend Medical (optional)
- No SaaS sub-processors when self-hosted with ai_provider = "bedrock" and ai_enabled = true (Bedrock inference runs inside your own AWS account)
- See SECURITY.md for CompliTru's own supply chain controls
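As a concrete illustration, a deployment that keeps AI inference in-account would set the two variables named above (values shown are defaults for illustration, not authoritative):

```hcl
# Keep AI inference inside your AWS account — no SaaS AI sub-processor involved
ai_enabled  = true
ai_provider = "bedrock"   # "openai" adds an external sub-processor; "disabled" turns AI off
```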
Implementation:
- ECS auto-scaling: target tracking on CPU > 70% (configurable)
- CloudWatch alarms on memory utilization, queue depth, RDS connections
- Capacity dashboards in CloudWatch container insights

Customer responsibility / AWS:
- Multi-AZ RDS deployment by default
- Multi-AZ ECS service distribution
- Optional multi-region disaster recovery (RDS cross-region snapshot copy)

Customer responsibility:
- Documented backup verification procedure in DEPLOYMENT.md
- Recommended quarterly DR drill: restore RDS from snapshot to verification instance, confirm application starts

Implementation:
- Field-level encryption (Fernet) for sensitive data: API keys, integration credentials, webhook secrets
- Encryption keys stored in Secrets Manager (KMS-encrypted)
- Database-level encryption at rest via RDS KMS encryption
- Application classifies findings by sensitivity (public / internal / confidential / restricted)

Implementation:
- Configurable retention per finding type (default 365 days for resolved findings)
- Audit log retention: configurable, default 7 years (Object Lock prevents earlier deletion)
- User deletion: GDPR-style erasure with audit trail of what was removed
- S3 lifecycle rules move old reports to Glacier then expire per policy

Implementation:
- Scan results include source resource ID, scan timestamp, check version
- Findings include reproducible "evidence" — the exact API responses that triggered the finding
- Provenance trail from finding → scan → check version → policy that was evaluated

Implementation:
- All inputs validated at API boundary (JSON schemas, type checking)
- File uploads restricted (MIME allowlist, size limits, optional virus scanning hook)
- ORM-only database access (no raw SQL on user input)
- ReDoS protection on regex patterns from custom checks

Implementation:
- Idempotent scan operations (re-running a scan produces consistent findings)
- Atomic remediation operations with rollback capture
- Distributed task execution with idempotency keys (Celery + Redis)
- Database transactions wrap multi-step operations

Implementation:
- Findings tagged with confidence score and check version
- Reports include scan metadata: timestamp, scope, exclusions, scanner version
- Output encoding on all rendered HTML to prevent injection
- Signed JWT for API responses where integrity is critical

Implementation:
- All data at rest encrypted (RDS KMS, S3 SSE-S3 / SSE-KMS, Secrets Manager KMS)
- Field-level encryption for highest-sensitivity fields
- Database backups encrypted, retention enforced
- Audit log written to Object Lock bucket (immutable per retention period)
When deployed self-hosted:
| Sub-processor | Purpose | Customer relationship |
|---|---|---|
| AWS (your account) | Compute, storage, network, secrets, KMS | Direct — your AWS contract |
| AWS Bedrock (your account) | AI inference (if ai_provider=bedrock) | Direct — your AWS contract |
| OpenAI | AI inference (if ai_provider=openai) | Direct — your OpenAI contract |
| AWS Comprehend Medical (your account) | PHI detection (if de-identification used) | Direct — your AWS contract |
| AWS Textract (your account) | OCR for image-based documents (if used) | Direct — your AWS contract |
| license.complitru.ai | License key validation (outbound only, no customer data) | CompliTru |
When ai_enabled = false and image OCR / de-identification are not used, CompliTru is the only sub-processor with any visibility — and that visibility is limited to license validation pings (license key ID + timestamp, no customer data).
For auditors performing a SOC 2 examination of the customer's CompliTru deployment:
| Evidence type | Where to find it |
|---|---|
| Audit log of admin actions | RDS table audit_log; export via Settings → Audit Log → Export CSV |
| Audit log of authentication events | RDS table audit_log filtered by category = 'auth'; CloudWatch Logs /aws/ecs/complitru-backend filtered by event = 'auth.*' |
| Configuration baseline | Terraform state file (S3-backed); terraform.tfvars versioned in customer repo |
| Encryption verification | aws rds describe-db-instances, aws s3api get-bucket-encryption, aws kms describe-key |
| Backup verification | aws rds describe-db-snapshots, S3 versioning enabled state |
| Vulnerability scan results | ECR image scan results, customer-side Trivy or Inspector results on container images |
| Compliance posture history | CompliTru Reports → Compliance Posture History; export PDF or CSV |
| Change management | Terraform state history (S3 versioning); ECS task definition revisions; container image digest history in ECR |
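When collecting the evidence above, timestamped artifact names make each capture self-documenting for the audit period. A small sketch (the AWS commands in the comments mirror the table; file-name convention is our suggestion, not a product feature):

```shell
# Capture point-in-time evidence with UTC-timestamped filenames.
evidence_file() {
  printf '%s-%s.json' "$1" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Example usage with the commands from the table above:
#   aws rds describe-db-instances > "$(evidence_file rds-encryption)"
#   aws s3api get-bucket-encryption --bucket <report-bucket> > "$(evidence_file s3-encryption)"
echo "would write: $(evidence_file rds-encryption)"
```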
CompliTru can produce supplementary letters of attestation for: - Image build provenance (signed attestations from CompliTru's CI) - SBOM accuracy (Syft-generated, signed) - Security patch SLA performance (historical mean time to patch)
Contact support@complitru.ai to request these for a specific reporting period.