Cloud Logging
TMI supports cloud-native logging to send application logs directly to cloud provider logging services. This enables centralized log management, long-term retention, and integration with cloud monitoring and alerting systems.
The cloud logging feature provides:
- Provider-agnostic interface - The CloudLogWriter interface supports multiple cloud providers
- OCI Logging implementation - First-class support for Oracle Cloud Infrastructure Logging
- Asynchronous buffered writes - Non-blocking log delivery with configurable batching
- Health tracking - Monitors cloud logging connectivity
- Graceful shutdown - Flushes pending logs before application exit
- Dual logging - Sends logs to both local files and cloud simultaneously
┌─────────────────────────────┐
│ TMI Application │
│ │
│ slog.Logger │
│ │ │
│ ▼ │
│ CloudLogHandler │
│ │ │
│ ├──► Local Handler │──► File/Console
│ │ │
│ └──► CloudLogWriter │──► Cloud Provider
│ (async buffer) │
└─────────────────────────────┘
| Variable | Description | Default |
|---|---|---|
| TMI_LOG_LEVEL | Minimum log level | info |
| TMI_LOG_DIR | Directory for local log files | logs/ |
| TMI_CLOUD_LOG_ENABLED | Enable cloud logging | false |
| TMI_CLOUD_LOG_LEVEL | Minimum level for cloud logs | Same as TMI_LOG_LEVEL |
Note: The async write buffer size (default: 1000) is configured programmatically via Config.CloudLogBufferSize, not through an environment variable.
| Variable | Description | Required |
|---|---|---|
| TMI_CLOUD_LOG_PROVIDER | Cloud provider (oci) | Yes |
| TMI_OCI_LOG_ID | OCI Log OCID | Yes |

Before enabling OCI logging, make sure the following are in place:
- OCI Logging service enabled in your tenancy
- Log Group created in OCI Console
- Custom Log created within the Log Group
- IAM policy allowing OKE worker nodes (or other compute) to write logs
The OCI cloud writer uses the following authentication priority:
- Explicit ConfigProvider - If provided in OCICloudWriterConfig
- Resource Principal - For OCI Container Instances and Functions
- Instance Principal - For OCI VMs
- Default config (~/.oci/config) - For local development
The Terraform logging module creates the required OCI resources. It supports OKE control plane logs (SERVICE type) and container stdout/stderr log collection via the OCI Unified Monitoring Agent:
module "logging" {
  source = "../../modules/logging/oci"

  compartment_id           = var.compartment_id
  tenancy_ocid             = var.tenancy_ocid
  name_prefix              = var.name_prefix
  object_storage_namespace = data.oci_objectstorage_namespace.ns.namespace

  # OKE control plane log (SERVICE log)
  create_oke_log = true
  oke_cluster_id = var.oke_cluster_id

  # Container stdout/stderr log collection via Unified Monitoring Agent
  create_container_log = true

  retention_days         = 30
  archive_retention_days = 365
  create_archive_bucket  = true
  create_alert_topic     = true
  alert_email            = "<alert-email-address>"
  create_alarms          = true

  tags = local.tags
}

Note: Setting create_container_log = true also creates the dynamic group and IAM policy for OKE worker nodes to ship logs. Application logs must be in JSON format (use slog.NewJSONHandler, not TextHandler) for the Unified Monitoring Agent parser to extract structured fields.
If not using Terraform:
- Create Log Group:
oci logging log-group create \
  --compartment-id <compartment-ocid> \
  --display-name "tmi-logs"
- Create Custom Log:
oci logging log create \
  --log-group-id <log-group-ocid> \
  --display-name "tmi-application" \
  --log-type CUSTOM \
  --is-enabled true
- Create IAM Policy:
Allow dynamic-group tmi-oke-workers to use log-content in compartment id <compartment-ocid>
Allow dynamic-group tmi-oke-workers to manage log-groups in compartment id <compartment-ocid>
The CloudLogWriter interface enables support for multiple cloud providers:
type CloudLogWriter interface {
	io.Writer
	// WriteLog sends a structured log entry to the cloud provider
	WriteLog(ctx context.Context, entry LogEntry) error
	// Flush forces any buffered logs to be sent immediately
	Flush(ctx context.Context) error
	// Close releases resources and flushes remaining logs
	Close() error
	// Name returns the provider name for identification
	Name() string
	// IsHealthy returns true if the cloud provider is reachable
	IsHealthy(ctx context.Context) bool
}

type LogEntry struct {
	Timestamp time.Time
	Level     slog.Level
	Message   string
	Attrs     map[string]any
	Source    string // file:line if available
}

The OCICloudWriter implements CloudLogWriter for the OCI Logging service.
- Batched writes - Collects entries and sends in batches (default: 100 entries)
- Periodic flushing - Flushes buffer every 5 seconds even if not full
- Health tracking - Monitors successful/failed write operations
- Structured logging - Preserves log attributes as JSON fields
config := OCICloudWriterConfig{
	LogID:          "ocid1.log.oc1...",
	Source:         "tmi-server",
	Subject:        "production",
	BatchSize:      100,
	FlushTimeout:   5 * time.Second,
	ConfigProvider: nil, // Uses default OCI config
}
writer, err := NewOCICloudWriter(ctx, config)

Logs appear in OCI Logging with this structure:
{
  "data": {
    "level": "INFO",
    "message": "Request processed successfully",
    "source": "api/server.go:245",
    "request_id": "abc123",
    "user_id": "user-456",
    "latency_ms": 45
  },
  "id": "1706123456789000000",
  "time": "2024-01-24T12:34:56.789Z"
}

import "github.com/ericfitz/tmi/internal/slogging"
// Create OCI cloud writer
ociWriter, err := slogging.NewOCICloudWriter(ctx, slogging.OCICloudWriterConfig{
	LogID:  os.Getenv("TMI_OCI_LOG_ID"),
	Source: "tmi-server",
})
if err != nil {
	log.Fatalf("Failed to create OCI writer: %v", err)
}

// Initialize logger with cloud support
cloudLevel := slogging.LogLevelInfo
err = slogging.Initialize(slogging.Config{
	Level:              slogging.LogLevelDebug,
	IsDev:              false,
	LogDir:             "logs",
	AlsoLogToConsole:   true,
	CloudWriter:        ociWriter,
	CloudLogLevel:      &cloudLevel,
	CloudLogBufferSize: 1000,
})

Always close the logger to flush pending cloud logs:
defer func() {
	if err := slogging.Get().Close(); err != nil {
		log.Printf("Error closing logger: %v", err)
	}
}()

logger := slogging.Get()

// Check error count
if errCount := logger.CloudLogErrors(); errCount > 0 {
	log.Printf("Cloud logging has %d errors", errCount)
}

// Get last error
if err := logger.CloudLogLastError(); err != nil {
	log.Printf("Last cloud log error: %v", err)
}

The Terraform logging module creates the following resources. Most are conditional on input variables:
| Resource | Description | Condition |
|---|---|---|
| Log Group | Container for related logs | Always created |
| OKE Control Plane Log | SERVICE log for kube-apiserver, controller manager, scheduler | create_oke_log = true |
| Container Custom Log | Custom log for container stdout/stderr | create_container_log = true |
| Unified Monitoring Agent | Fluentd-based agent that tails /var/log/containers/*.log on OKE workers | create_container_log = true |
| Dynamic Group | IAM group matching OKE worker node instances | create_container_log = true |
| IAM Policy | Permissions for worker nodes to ship logs | create_container_log = true |
| Object Storage Bucket | Archive storage (Archive tier) | create_archive_bucket = true |
| Service Connector | Automated log archival to Object Storage | create_archive_bucket = true and create_oke_log = true |
| Notification Topic | Alert delivery channel | create_alert_topic = true |
| Monitoring Alarm | Error rate alarm | create_alarms = true and create_oke_log = true |
| Tier | Retention | Cost |
|---|---|---|
| Live Logs | 30 days (configurable, 1-180 days) | Standard |
| Archive | 365 days (configurable) | Archive tier pricing |
The module can create a monitoring alarm:
- Error Rate Alarm - Triggers when error count exceeds the configured threshold (default: 10 errors in 1 minute) with a 5-minute pending duration. Notifications are sent to the alert topic if configured.

To view logs in the OCI Console:
- Navigate to Observability & Management > Logging > Logs
- Select your log group and log
- Use the search interface to filter logs
# Search logs
oci logging-search search-logs \
  --search-query "search \"ocid1.log.oc1...\" | where level='ERROR'" \
  --time-start 2024-01-24T00:00:00Z \
  --time-end 2024-01-24T23:59:59Z

# Get recent logs
oci logging log-content get \
  --log-id <log-ocid> \
  --start-time 2024-01-24T12:00:00Z

-- Find all errors
search "ocid1.log.oc1..." | where level='ERROR'
-- Find by request ID
search "ocid1.log.oc1..." | where request_id='abc123'
-- Find slow requests
search "ocid1.log.oc1..." | where latency_ms > 1000
-- Find by user
search "ocid1.log.oc1..." | where user_id='user-456'

| Level | Cloud | Use Case |
|---|---|---|
| DEBUG | Optional | Detailed debugging (high volume) |
| INFO | Yes | Normal operations, request logs |
| WARN | Yes | Recoverable issues |
| ERROR | Yes | Failures requiring attention |
To keep cloud logging costs down, consider setting the cloud log level to INFO while local logs capture DEBUG.
Always use structured logging for better searchability:
logger.Info("Request processed",
	"request_id", requestID,
	"user_id", userID,
	"method", r.Method,
	"path", r.URL.Path,
	"status", statusCode,
	"latency_ms", latency.Milliseconds(),
)

Cloud logging errors don't fail the application. Monitor error counts:
// Periodic health check
go func() {
	ticker := time.NewTicker(5 * time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		if errCount := slogging.Get().CloudLogErrors(); errCount > 100 {
			// Alert on high error count (alerting.Send stands in for your alerting integration)
			alerting.Send("Cloud logging degraded", errCount)
		}
	}
}()

To add support for a new cloud provider (e.g., AWS CloudWatch):
- Create a new file aws_cloud_writer.go
- Implement the CloudLogWriter interface
- Add configuration handling
- Update documentation
Example skeleton:
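type AWSCloudWriter struct {
	client *cloudwatchlogs.Client
	// ...
}

func NewAWSCloudWriter(ctx context.Context, config AWSCloudWriterConfig) (*AWSCloudWriter, error) {
	// Initialize AWS CloudWatch Logs client
	return nil, errors.New("not implemented") // placeholder so the skeleton compiles
}

func (w *AWSCloudWriter) WriteLog(ctx context.Context, entry LogEntry) error {
	// Send to CloudWatch
	return errors.New("not implemented") // placeholder so the skeleton compiles
}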
// Implement remaining interface methods...

If logs do not appear in OCI:
- Verify TMI_OCI_LOG_ID is correct
- Check IAM policy grants write access
- Verify container can reach OCI services (Service Gateway)
- Check CloudLogErrors() for failures

If log delivery is slow:
- Increase BatchSize to reduce API calls
- Verify network connectivity to OCI
- Check if buffer is consistently full (increase CloudLogBufferSize)
If logs are being dropped:
- Increase CloudLogBufferSize
- Reduce FlushTimeout for faster delivery
- Consider a higher CloudLogLevel to reduce volume
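To see why buffer size matters, the sketch below models a bounded, non-blocking buffer that counts what it has to drop. The boundedBuffer type is hypothetical and only assumes that a full buffer discards new entries instead of blocking the caller; the burst size and capacities are illustrative.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// boundedBuffer is a hypothetical model of an async log buffer that drops
// entries when full rather than blocking the caller.
type boundedBuffer struct {
	ch      chan string
	dropped atomic.Int64
}

func newBoundedBuffer(size int) *boundedBuffer {
	return &boundedBuffer{ch: make(chan string, size)}
}

// Offer enqueues the entry if there is room and reports whether it was kept.
func (b *boundedBuffer) Offer(entry string) bool {
	select {
	case b.ch <- entry:
		return true
	default:
		b.dropped.Add(1)
		return false
	}
}

// Dropped returns how many entries have been discarded so far.
func (b *boundedBuffer) Dropped() int64 { return b.dropped.Load() }

func main() {
	// Simulate a burst of 1500 entries arriving before any flush drains the buffer.
	for _, size := range []int{1000, 2000} {
		buf := newBoundedBuffer(size)
		for i := 0; i < 1500; i++ {
			buf.Offer("entry")
		}
		fmt.Printf("buffer=%d dropped=%d\n", size, buf.Dropped())
	}
	// prints:
	// buffer=1000 dropped=500
	// buffer=2000 dropped=0
}
```

A larger buffer absorbs bigger bursts at the cost of memory, while faster flushing or a higher minimum level reduces how quickly the buffer fills in the first place.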
- Terraform-Deployment - Infrastructure provisioning with Terraform
- Monitoring-and-Health - Monitoring and health checks
- Configuration-Reference - Server and application configuration
- OCI Logging Documentation