See also: SECURITY-MODEL.md for the architectural security model (trust boundaries, privilege asymmetry, and reasoning behind the security posture).
This document covers operational security for the cloudcoop sandbox environment: what to configure, how to monitor, and how to respond to incidents.
The sandbox is designed to:
- Contain agent actions - Agents run in tmux sessions with shared VM access
- Limit GCP access - Minimal IAM permissions via service account
- Prevent lateral movement - Network isolation within the VM
- Audit activity - Logging of all agent actions
Note: Agents run as tmux windows on the VM, not in Docker containers. However, agents may use Docker as part of their work (building images, running tests, etc.). The Docker configuration below applies to containers the agents create, not to the agents themselves.
cloudcoop requires a dedicated service account for VMs. The cloud.gcp.service_account
configuration is mandatory; VM creation fails without it. This prevents VMs from inheriting
the overly permissive default Compute Engine service account (which is granted the Editor role by default).
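A check of this kind can be sketched in a few lines of shell. This is a hypothetical helper, not cloudcoop's actual validation code (which this document does not show). It assumes the validator should reject both an empty value and the default Compute Engine account, whose email follows the pattern PROJECT_NUMBER-compute@developer.gserviceaccount.com.

```bash
#!/usr/bin/env bash
# Hypothetical helper: cloudcoop's real validation logic is not shown here.

# Returns 0 (true) if the email looks like the default Compute Engine
# service account: PROJECT_NUMBER-compute@developer.gserviceaccount.com
is_default_compute_sa() {
  local email="$1"
  [[ "$email" =~ ^[0-9]+-compute@developer\.gserviceaccount\.com$ ]]
}

# Returns 0 if the value is acceptable for cloud.gcp.service_account.
validate_service_account() {
  local email="$1"
  if [[ -z "$email" ]]; then
    echo "error: cloud.gcp.service_account is required" >&2
    return 1
  fi
  if is_default_compute_sa "$email"; then
    echo "error: refusing default Compute Engine service account: $email" >&2
    return 1
  fi
  return 0
}
```

For example, `validate_service_account "sandbox-vm@my-project.iam.gserviceaccount.com"` succeeds, while an empty string or a `...-compute@developer.gserviceaccount.com` address is rejected. The account and project names here are placeholders.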
The VM should run under a dedicated service account with minimal permissions:
✅ Recommended Permissions:
- roles/logging.logWriter # Write logs to Cloud Logging
- roles/monitoring.metricWriter # Write metrics
- roles/artifactregistry.reader # Pull container images (optional)
- roles/secretmanager.secretAccessor # Access specific secrets (optional)
❌ NOT Allowed (Never Grant These):
- roles/compute.* # Cannot create/delete VMs
- roles/iam.* # Cannot modify IAM
- roles/storage.* # Cannot access arbitrary buckets
- roles/billing.* # Cannot affect billing
- roles/editor # Overly broad permissions
- roles/owner # Full project access
See SETUP-FLOW.md for setup instructions.
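The "never grant" list above can be turned into a simple audit. The sketch below is a hypothetical helper (the function names are not part of cloudcoop) that reads role names on stdin and flags any that match the forbidden patterns.

```bash
#!/usr/bin/env bash
# Hypothetical audit helper; the patterns mirror the "never grant" list above.

# Returns 0 (true) if the role is on the never-grant list.
is_forbidden_role() {
  case "$1" in
    roles/compute.*|roles/iam.*|roles/storage.*|roles/billing.*) return 0 ;;
    roles/editor|roles/owner) return 0 ;;
    *) return 1 ;;
  esac
}

# Reads one role per line on stdin, prints each forbidden role,
# and returns non-zero if any were found.
audit_roles() {
  local role found=0
  while IFS= read -r role; do
    [[ -z "$role" ]] && continue
    if is_forbidden_role "$role"; then
      echo "FORBIDDEN: $role"
      found=1
    fi
  done
  return "$found"
}

# Possible usage (PROJECT_ID and SA_EMAIL are placeholders):
#   gcloud projects get-iam-policy PROJECT_ID \
#     --flatten="bindings[].members" \
#     --filter="bindings.members:serviceAccount:SA_EMAIL" \
#     --format="value(bindings.role)" | audit_roles
```

The audit only matches predefined role names. A custom role that bundles the same permissions would pass unnoticed, so treat a clean result as a first check rather than proof.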
- Dedicated VPC - Sandbox runs in isolated network
- No ingress - Default deny all inbound (except SSH via IAP)
- Egress allowed - Agents need internet for API calls
- IAP-only SSH - No public IP exposure required
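As an illustration of the IAP-only SSH rule, the sketch below prints the firewall command rather than running it. The rule name and network name are placeholders, not cloudcoop defaults. 35.235.240.0/20 is the source range Google documents for IAP TCP forwarding.

```bash
#!/usr/bin/env bash
# Sketch only: the rule name and network are placeholders, not cloudcoop defaults.

# Source range Google documents for IAP TCP forwarding.
IAP_RANGE="35.235.240.0/20"

# Prints the gcloud command that would allow inbound SSH only from IAP.
# Printing instead of executing keeps the example safe to run anywhere.
iap_ssh_rule_cmd() {
  local network="$1"
  printf 'gcloud compute firewall-rules create allow-iap-ssh --network=%s --direction=INGRESS --allow=tcp:22 --source-ranges=%s\n' \
    "$network" "$IAP_RANGE"
}
```

With such a rule in place and no external IP on the VM, an operator would typically connect with `gcloud compute ssh claude-sandbox --zone=ZONE --tunnel-through-iap`.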
When agents create Docker containers as part of their work, recommend these security settings:
security_opt:
  - no-new-privileges:true  # Prevent privilege escalation
deploy:
  resources:
    limits:
      cpus: "2"     # CPU limits
      memory: 6G    # Memory limits
Instead of service account keys, use Workload Identity:
# The VM's service account automatically provides credentials
# No key files to manage or rotate
Agents can:
- Read/write files within their workspace
- Execute arbitrary code (that's the point)
- Make API calls to external services
- Run Docker commands to build/test their work (Docker installed on VM)
- Access secrets via Secret Manager (configured ones only)
Agents cannot:
- Access other agents' workspaces (directory conventions)
- Modify GCP IAM permissions
- Create/delete GCP resources
- Access production databases (not in network)
# Update the secret
printf '%s' "new-api-key" | gcloud secrets versions add anthropic-api-key --data-file=-
# Old versions remain accessible until disabled
gcloud secrets versions disable 1 --secret=anthropic-api-key
For higher isolation, run the sandbox in a dedicated GCP project:
# Sandbox project has no access to production resources
gcloud projects create claude-sandbox-project
# In terraform/main.tf
resource "google_project_iam_audit_config" "all" {
  project = var.project_id
  service = "allServices"
  audit_log_config {
    log_type = "ADMIN_READ"
  }
  audit_log_config {
    log_type = "DATA_WRITE"
  }
}
# Alert on suspicious activity
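# Display names and duration are illustrative
gcloud alpha monitoring policies create \
  --display-name="Sandbox high CPU" --combiner=OR \
  --notification-channels=YOUR_CHANNEL \
  --condition-display-name="CPU utilization above 90%" \
  --condition-filter='resource.type="gce_instance" AND metric.type="compute.googleapis.com/instance/cpu/utilization"' \
  --if="> 0.9" --duration=300s
- Review Cloud Audit Logs weekly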
- Check for unexpected API calls
- Monitor costs for anomalies
- Rotate credentials quarterly
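For the weekly audit log review, one possible starting point is a query for IAM policy changes over the last seven days. This is a sketch under assumptions: the function names are hypothetical, and the filter covers only `SetIamPolicy` calls in the Admin Activity log, which is one of many things worth reviewing.

```bash
#!/usr/bin/env bash
# Hypothetical review helper; extend the filter to match your own review scope.

# Builds a Cloud Logging filter for Admin Activity audit entries
# that changed an IAM policy.
iam_change_filter() {
  printf '%s AND %s' \
    'logName:"cloudaudit.googleapis.com%2Factivity"' \
    'protoPayload.methodName:"SetIamPolicy"'
}

# Lists matching entries from the last 7 days as a table.
weekly_iam_review() {
  gcloud logging read "$(iam_change_filter)" --freshness=7d \
    --format='table(timestamp, protoPayload.authenticationInfo.principalEmail, protoPayload.methodName)'
}
```

Running `weekly_iam_review` against the sandbox project should normally return nothing, since agents cannot modify IAM. Any entry is worth investigating.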
1. Stop all agents immediately:
   ./scripts/stop-agents.sh
2. Revoke API key (VERSION is the number of the active secret version):
   gcloud secrets versions disable VERSION --secret=anthropic-api-key
3. Isolate the VM:
   gcloud compute instances stop claude-sandbox --zone=ZONE
4. Review logs:
   gcloud logging read "resource.type=gce_instance"
5. Rotate all credentials before resuming
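The response steps above could be wrapped in a single script so they run in a fixed order under pressure. The following is a minimal sketch, not an existing cloudcoop script: `run` and `incident_response` are hypothetical names, the zone and secret version are operator-supplied arguments, and `DRY_RUN` defaults to 1 so that nothing executes unless explicitly requested.

```bash
#!/usr/bin/env bash
# Sketch of the response steps. DRY_RUN defaults to 1: commands are only printed.

# Prints the command in dry-run mode; otherwise runs it and warns on failure
# without aborting, so one failed step does not block the next.
run() {
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    printf '[dry-run] %s\n' "$*"
  else
    "$@" || echo "WARN: step failed: $*" >&2
  fi
}

# Usage: incident_response ZONE SECRET_VERSION
incident_response() {
  local zone="$1" secret_version="$2"
  run ./scripts/stop-agents.sh
  run gcloud secrets versions disable "$secret_version" --secret=anthropic-api-key
  run gcloud compute instances stop claude-sandbox --zone="$zone"
  run gcloud logging read "resource.type=gce_instance"
  echo "Reminder: rotate all credentials before resuming agents."
}
```

Calling `incident_response ZONE VERSION` prints the four commands for review. Setting `DRY_RUN=0` executes them. The script deliberately continues after a failed step so that, for example, a broken stop script does not delay key revocation.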
- Data residency: Agents make API calls to Anthropic (US-based)
- PII handling: Do not process PII in sandbox environment
- Audit trail: Cloud Logging retains logs in the _Default bucket for 30 days by default
- Service account has minimal permissions (enforced by config validation)
- IAP enabled for SSH access
- No external IP (or restricted firewall)
- API key stored in Secret Manager
- Audit logging enabled
- Alerts configured for anomalies
- Docker images scanned for vulnerabilities
- Regular credential rotation scheduled
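Some checklist items can be verified mechanically. As one example, the sketch below checks the "no external IP" item by reading the VM's first network interface. The function names are hypothetical, and the check looks only at the first access config of the first interface, so a VM with multiple interfaces would need a broader query.

```bash
#!/usr/bin/env bash
# Hypothetical checklist helper; inspects only the first network interface.

# Prints the VM's external IP, or nothing if it has none.
external_ip() {
  gcloud compute instances describe "$1" --zone="$2" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
}

# Usage: check_no_external_ip INSTANCE ZONE
# Returns 0 if no external IP, 1 if one exists, 2 if the lookup failed.
check_no_external_ip() {
  local ip
  ip="$(external_ip "$1" "$2")" || { echo "UNKNOWN: could not describe $1"; return 2; }
  if [[ -z "$ip" ]]; then
    echo "OK: $1 has no external IP"
  else
    echo "FAIL: $1 has external IP $ip"
    return 1
  fi
}
```

For the sandbox VM this would be invoked as `check_no_external_ip claude-sandbox ZONE`.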