
fix: replace removed bitnami/kafka with apache/kafka, production-hard… #38

Merged
tusharkhatriofficial merged 1 commit into main from dev on Feb 18, 2026

Conversation

@tusharkhatriofficial
Owner

…en deployment

  • Switch Kafka image from bitnami/kafka:3.7 (removed from Docker Hub) to apache/kafka:latest (official KRaft image)
  • Convert Bitnami KAFKA_CFG_* env vars to native KAFKA_* format
  • Rewrite Dockerfile as 3-stage build: Node (dashboard) → Maven (JAR) → JRE runtime — serves both API and dashboard on :8080
  • Add .dockerignore to exclude .git, node_modules, ai-docs from builds
  • Use eclipse-temurin:21-jre instead of jdk (saves ~300MB)
  • Run app as non-root user (eventara)
  • Add resource limits to prod compose (postgres 1G, kafka 1G, redis 512M, eventara 1G)
  • Add healthcheck start_period for slower VPS environments
  • Remove SPRING_JPA_HIBERNATE_DDL_AUTO override that conflicted with Flyway in dev compose
  • Rewrite DEPLOYMENT.md with Coolify, DigitalOcean, and VPS guides
  • Add EVENTARA_PORT and JAVA_OPTS to .env.example
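The healthcheck start_period item above can be sketched as a compose fragment. This is illustrative only: the probe command and endpoint path are assumptions, not taken from the repo.

```yaml
services:
  eventara:
    healthcheck:
      # Illustrative probe; the actual command/path may differ in the repo
      test: ["CMD", "curl", "-f", "http://localhost:8080/actuator/health"]
      interval: 30s
      timeout: 5s
      retries: 5
      # Grace period before failed probes count, for slow VPS cold starts
      start_period: 60s
```

Without a start_period, a JVM app that takes a minute to boot on a small VPS can be marked unhealthy and restarted in a loop before it ever finishes starting.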


@charliecreates charliecreates Bot left a comment

Key deployment hardening items are currently undermined by (1) apache/kafka:latest being unpinned and (2) Compose deploy.resources.limits often being ignored outside Swarm. The container’s JAVA_OPTS setting is also not applied by the Dockerfile entrypoint, so JVM tuning guidance won’t work. .dockerignore is overly aggressive (notably ignoring Dockerfile), which can break remote/CI builders that rely on the build context.

Additional notes (2)
  • Compatibility | docker-compose.prod.yaml:18-18
    deploy.resources.limits in docker-compose.prod.yaml is ignored by docker compose up outside of Swarm mode for many Docker setups. That means the documented “memory limits” may not actually apply on the target VPS, defeating the stated hardening.

If your intent is to enforce limits in standard Compose, you should use Compose-supported settings (e.g., mem_limit) or document clearly that limits require Swarm / a compatible implementation.
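A minimal sketch of the non-Swarm approach, using service names from this PR (the values shown mirror the PR's documented limits):

```yaml
services:
  kafka:
    # mem_limit is honored by plain `docker compose up` (no Swarm required)
    mem_limit: 1g
  redis:
    mem_limit: 512m
```

The trade-off: `deploy.resources.limits` is the Compose-spec/Swarm form, while `mem_limit` is the legacy per-container form that standard `docker compose up` actually enforces.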

  • Maintainability | Dockerfile:1-8
    The dashboard build stage only copies eventara-dashboard/package*.json before npm ci. If the dashboard uses package-lock.json (good) this is fine, but if it uses pnpm-lock.yaml/yarn.lock or npm workspaces, the caching/install may be incorrect. Also, npm ci --production=false is an odd flag; npm ci already installs devDependencies by default unless NODE_ENV=production or --omit=dev is used.

This isn’t necessarily broken, but it’s brittle and a bit confusing in a “production hardened” Dockerfile.
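For reference, the npm behavior described above, as a Dockerfile fragment (a sketch, not the project's actual Dockerfile):

```dockerfile
# npm ci installs devDependencies by default in a build stage,
# so no extra flag is needed here:
RUN npm ci

# The modern way to skip devDependencies (replacing the deprecated
# --production flag) would be:
# RUN npm ci --omit=dev
```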

Summary of changes

  • Added a .dockerignore to reduce Docker build context (ignores node_modules, .git, docs, compose files, etc.).
  • Updated .env.example to reflect Kafka KRaft CLUSTER_ID and documented EVENTARA_PORT + JAVA_OPTS.
  • Rewrote DEPLOYMENT.md with updated guidance (Coolify + VPS steps), resource limits, and env var table.
  • Replaced bitnami/kafka:3.7 with apache/kafka:latest in both compose files and migrated Bitnami KAFKA_CFG_* env vars to native KAFKA_*.
  • Hardened docker-compose.prod.yaml with start_period on healthchecks, memory limits, configurable app port, and JAVA_OPTS.
  • Reworked the Dockerfile into a 3-stage build (dashboard → backend → JRE runtime); the container now runs as a non-root user.
  • Minor README tweak to remove the pinned Kafka version string.

Comment thread docker-compose.prod.yaml
Comment on lines 43 to +69
   kafka:
-    image: bitnami/kafka:3.7
+    image: apache/kafka:latest
     restart: unless-stopped
     environment:
-      ALLOW_PLAINTEXT_LISTENER: "yes"
-      KAFKA_KRAFT_CLUSTER_ID: ${KAFKA_KRAFT_CLUSTER_ID:-NvDmnaWzQgiH8qbnraqxcg}
+      CLUSTER_ID: ${CLUSTER_ID:-NvDmnaWzQgiH8qbnraqxcg}

-      KAFKA_CFG_NODE_ID: 1
-      KAFKA_CFG_PROCESS_ROLES: controller,broker
-      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 1@kafka:9094
-      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER
+      KAFKA_NODE_ID: 1
+      KAFKA_PROCESS_ROLES: controller,broker
+      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka:9094
+      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER

       # Internal-only listener for production (no host port published)
-      KAFKA_CFG_LISTENERS: INTERNAL://:9092,CONTROLLER://:9094
-      KAFKA_CFG_ADVERTISED_LISTENERS: INTERNAL://kafka:9092
-      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT
-      KAFKA_CFG_INTER_BROKER_LISTENER_NAME: INTERNAL
+      KAFKA_LISTENERS: INTERNAL://:9092,CONTROLLER://:9094
+      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9092
+      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT
+      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL

-      KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE: "true"
-      KAFKA_CFG_NUM_PARTITIONS: 3
-      KAFKA_CFG_DEFAULT_REPLICATION_FACTOR: 1
+      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
+      KAFKA_NUM_PARTITIONS: 3
+      KAFKA_DEFAULT_REPLICATION_FACTOR: 1

-      KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
-      KAFKA_CFG_TRANSACTION_STATE_LOG_MIN_ISR: 1
-      KAFKA_CFG_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
-      KAFKA_CFG_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
+      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
+      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
+      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
+      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
     volumes:
-      - kafka_data:/bitnami/kafka
+      - kafka_data:/opt/kafka/kafka-logs

Using apache/kafka:latest is a reproducibility and stability risk. A new upstream release can change defaults, KRaft metadata behavior, or paths, and your deployment could break on the next docker compose up.

Given this PR’s goal is to “production-harden” deployment, the image should be pinned to a specific Kafka version (or at least a major/minor tag) and ideally to a digest.

Suggestion

Pin the Kafka image to a known-good version and (optionally) a digest, e.g.:

  • image: apache/kafka:3.7.2 (or whatever you validated)
  • or image: apache/kafka@sha256:<digest>

Do the same in docker-compose.yaml.
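A hedged sketch of the pinned form; the tag shown is a placeholder to be replaced with whichever release you validated, and the digest is deliberately left elided:

```yaml
services:
  kafka:
    # Pin to an exact, validated release rather than a moving :latest tag
    image: apache/kafka:3.7.2
    # Stricter still: pin by immutable digest (placeholder shown)
    # image: apache/kafka@sha256:<digest>
```

Tag pins protect against surprise upgrades; digest pins additionally protect against a tag being re-pushed with different content.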

Reply with "@CharlieHelps yes please" if you'd like me to add a commit pinning the image tag in both compose files.

Comment thread Dockerfile
Comment on lines 30 to 31
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]

JAVA_OPTS is set in the container environment, but the Dockerfile ENTRYPOINT never consumes it (it always runs java -jar app.jar). Unless your base image provides a wrapper that reads JAVA_OPTS (Temurin images typically do not), this variable will be ignored and the memory tuning guidance in .env.example/docs won’t work.

This is a functional issue: users will think memory limits are applied when they aren’t.

Suggestion

Update the entrypoint to actually apply JAVA_OPTS. Common patterns:

  • Shell form:
    • ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS -jar app.jar"]

Or keep exec form by using CMD:

  • ENTRYPOINT ["sh", "-c"]
  • CMD ["exec java $JAVA_OPTS -jar app.jar"]
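The shell-form pattern above, written out as a Dockerfile fragment (a sketch; it assumes the JAR is named app.jar, as in this PR's entrypoint):

```dockerfile
# sh -c lets the shell expand $JAVA_OPTS; `exec` replaces the shell with
# java so it runs as PID 1 and receives SIGTERM directly on `docker stop`
ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS -jar app.jar"]
```

Without `exec`, the shell stays as PID 1 and the JVM may not see SIGTERM, turning graceful shutdowns into 10-second kill timeouts.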

Reply with "@CharlieHelps yes please" if you'd like me to add a commit implementing this change safely (including quoting/exec).

Comment thread Dockerfile
Comment on lines +1 to +16
# ===== Stage 1: Build the dashboard =====
FROM node:20-alpine AS dashboard-build
WORKDIR /app/eventara-dashboard
COPY eventara-dashboard/package*.json ./
RUN npm ci --production=false
COPY eventara-dashboard/ .
# Vite builds to ../src/main/resources/static (relative to eventara-dashboard)
RUN npm run build

# ===== Stage 2: Build the Spring Boot app =====
FROM maven:3.9.9-eclipse-temurin-21 AS backend-build
WORKDIR /app
COPY pom.xml .
COPY src ./src
# Copy dashboard build output into Spring Boot static resources
COPY --from=dashboard-build /app/src/main/resources/static ./src/main/resources/static/

The Dockerfile comment says Vite builds to ../src/main/resources/static relative to eventara-dashboard, but the actual copy step pulls from /app/src/main/resources/static (i.e., outside /app/eventara-dashboard). That only works if the Vite config truly outputs to /app/src/main/resources/static inside the container.

Given the build stage WORKDIR is /app/eventara-dashboard, a common output would be /app/eventara-dashboard/dist. If the Vite output path ever changes, the Docker build will silently produce an image without the dashboard.

This is fragile because it relies on a non-standard output directory and a hard-coded cross-directory path.

Suggestion

Make the dashboard artifact path explicit and robust:

  • Prefer building to the default dist/ and then copying into backend resources:
    • COPY --from=dashboard-build /app/eventara-dashboard/dist ./src/main/resources/static/

If you intentionally build directly into ../src/main/resources/static, add a quick sanity check to fail fast:

  • RUN test -f /app/src/main/resources/static/index.html
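Both options above, combined into one fragment for the backend stage (a sketch; it assumes the dashboard is reconfigured to build to the default dist/ directory):

```dockerfile
# Copy the dashboard output from Vite's default dist/ directory,
# instead of relying on a cross-directory ../src/... output path
COPY --from=dashboard-build /app/eventara-dashboard/dist ./src/main/resources/static/

# Fail the build early if the dashboard entrypoint is missing,
# rather than silently shipping an image without the dashboard
RUN test -f ./src/main/resources/static/index.html
```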

Reply with "@CharlieHelps yes please" if you'd like me to add a commit that switches to copying from dist/ (or adds the fail-fast check).

Comment thread .env.example
Comment on lines 5 to 7
 # Required for Kafka KRaft mode. Do not change after first boot unless you wipe the kafka volume.
-KAFKA_KRAFT_CLUSTER_ID=NvDmnaWzQgiH8qbnraqxcg
+CLUSTER_ID=NvDmnaWzQgiH8qbnraqxcg


You renamed KAFKA_KRAFT_CLUSTER_ID → CLUSTER_ID in .env.example and compose. That’s fine for the new image, but it’s a breaking change for anyone with existing .env files from previous versions. The docs mention stability, but there’s no explicit migration note that users must rename the variable (or they’ll silently fall back to the default).

Given the importance of cluster IDs, silently defaulting could accidentally create a new cluster metadata state when users expected their old one.

Suggestion

Add a short migration note in DEPLOYMENT.md (or README) such as:

  • "If upgrading from older versions, rename KAFKA_KRAFT_CLUSTER_ID to CLUSTER_ID in your .env."
  • Optionally support both temporarily in compose: CLUSTER_ID: ${CLUSTER_ID:-${KAFKA_KRAFT_CLUSTER_ID:-NvDm...}}
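The backward-compatible fallback, written out as a compose fragment. Note the assumption: nested interpolation defaults like `${A:-${B:-c}}` require a reasonably recent Docker Compose v2; older versions may not expand the inner default.

```yaml
services:
  kafka:
    environment:
      # Prefer the new CLUSTER_ID, fall back to the legacy variable,
      # then to the repo's documented default
      CLUSTER_ID: ${CLUSTER_ID:-${KAFKA_KRAFT_CLUSTER_ID:-NvDmnaWzQgiH8qbnraqxcg}}
```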

Reply with "@CharlieHelps yes please" if you'd like me to add a commit implementing backward-compatible env var fallback plus the doc note.

Comment thread DEPLOYMENT.md
Comment on lines +69 to +81
## Resource limits

The prod compose file sets memory limits:

| Service | Memory Limit |
|---|---|
| Postgres | 1 GB |
| Kafka | 1 GB |
| Redis | 512 MB |
| Eventara (API + Dashboard) | 1 GB |

Total: ~3.5 GB. A 4 GB VPS works for demos and light use.


The prod docs state the compose file "sets memory limits" totaling ~3.5GB. Given deploy.resources often won’t apply in standard Docker Compose (non-Swarm), this section is potentially misleading and could cause operators to size their VPS incorrectly.

This should be aligned with the actual behavior of the compose file.

Suggestion

Clarify how limits are enforced:

  • Add a note like: "deploy.resources is applied in Swarm; for plain docker compose up limits may not be enforced".
  • If you adopt mem_limit (see compose suggestion), update the docs to say limits are enforced by Docker.

Reply with "@CharlieHelps yes please" if you'd like me to add a commit updating DEPLOYMENT.md to accurately describe limit enforcement.

@charliecreates charliecreates Bot removed the request for review from CharlieHelps February 18, 2026 18:04
@tusharkhatriofficial tusharkhatriofficial merged commit cb50cc9 into main Feb 18, 2026
2 checks passed