Conversation
Not up to standards ⛔

🔴 Issues

| Category | Results |
|---|---|
| BestPractice | 6 medium, 2 minor |
| CodeStyle | 24 minor |

🟢 Metrics

| Metric | Results |
|---|---|
| Complexity | 572 |

> TIP: This summary will be updated as you push new changes.
## Code Review

This is a substantial and well-motivated change - replacing ~6,000 lines of ad-hoc Raft-like code with Apache Ratis 3.2.1 is a significant improvement in correctness guarantees. The new test suite (36 tests including 17 comprehensive ones) is thorough. Below are issues worth addressing before merge.

### Bugs

1. `System.out.println` left in production code:

```java
// HALog.java - remove this line:
System.out.println("[HA-" + level + "] " + formatted);
```

2. …

3. The format-string mutation is a no-op for `%s` and redundant:

```java
final String formatted = String.format(message.replace("%s", "%s").replace("%d", "%s"), args);
```

4. Spurious anonymous block in the exception handler:

```java
}
{ // <-- unnecessary anonymous block
  try {
    proxyToLeader(exchange, leaderAddr);
    return;
  } catch ...
}
sendErrorResponse(...)
```

The extra block is a no-op but visually confusing and should be removed.

### Data Durability Concern

5. **Leader commits locally before achieving quorum.** Standard Raft commits only after quorum acknowledgement. This design is explicitly documented as intentional, but it weakens the durability guarantees. At a minimum, this should be prominently documented as a trade-off.

### Security

6. **Timing-attack-vulnerable cluster token comparison:**

```java
// validateClusterForwardedAuth - vulnerable:
!expectedToken.equals(providedToken)
// constant-time alternative:
!MessageDigest.isEqual(expectedToken.getBytes(StandardCharsets.UTF_8), providedToken.getBytes(StandardCharsets.UTF_8))
```

7. **User enumeration via error message:**

```java
sendErrorResponse(exchange, 401, "Unknown forwarded user: " + forwardedUserValues.getFirst(), null, null);
```

Exposing the username in the response body of an auth failure enables user enumeration. Return a generic error message instead.

### Design / Maintainability

8. Silently enforced minimum retry count:

```java
// RemoteHttpComponent:
if (maxRetry < 3)
  maxRetry = 3;
```

This overrides the caller's intent without signalling it. Callers passing a lower value are silently ignored.

9. **ATTRIBUTIONS.md not updated.** Per project conventions, every new dependency needs an entry.

10. …

### Minor

…

Overall this is a high-quality implementation of a very complex change. The comprehensive test coverage and the transparent design documentation are appreciated. Addressing the data durability concern (#5) and the security issues (#6, #7) are the highest priority before merge.
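A self-contained sketch of the constant-time comparison recommended in #6; the class and helper names here are illustrative, not ArcadeDB's actual code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class TokenCheck {
  // Constant-time comparison: MessageDigest.isEqual examines every byte
  // regardless of where the first mismatch occurs, so response time leaks
  // no information about how long a matching prefix was.
  static boolean tokensMatch(final String expected, final String provided) {
    if (expected == null || provided == null)
      return false;
    return MessageDigest.isEqual(
        expected.getBytes(StandardCharsets.UTF_8),
        provided.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(final String[] args) {
    System.out.println(tokensMatch("s3cret", "s3cret")); // true
    System.out.println(tokensMatch("s3cret", "s3cres")); // false
  }
}
```

By contrast, `String.equals` returns at the first differing character, which is what makes byte-by-byte timing measurement feasible for an attacker.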
## Code Review
This pull request replaces the custom HA replication protocol with Apache Ratis, providing a formally verified Raft consensus implementation. The changes include a new Ratis state machine, snapshot management, and improved cluster monitoring. I have identified a flaw in the HTTP proxy's session-based authentication forwarding, a reliance on unreliable Thread.sleep in tests, and some unused code that should be cleaned up. Please address these issues to ensure robust authentication and reliable test execution.
```java
final String basicAuth = exchange.getAttachment(BASIC_AUTH_KEY);
if (basicAuth != null) {
  // Extract username from stored Basic auth
  final String decoded = new String(Base64.getDecoder().decode(
      basicAuth.substring(AUTHORIZATION_BASIC.length() + 1)), java.nio.charset.StandardCharsets.UTF_8);
  conn.setRequestProperty(HEADER_FORWARDED_USER, decoded.split(":")[0]);
}
```
There appears to be a flaw in the logic for forwarding session-based authentication. This code attempts to get the original Basic auth from an attachment on the current exchange. However, if the request was authenticated with a session Bearer token, this attachment will not be present, and the forwarded user will not be set.
To fix this, you should retrieve the session object from the session token and get the user's name from there. This would be more robust and would not require attaching the basic auth string to the exchange.
Here's a suggested implementation:
```java
if (auth != null && auth.startsWith("Bearer AU-")) {
  // Session token: use cluster-internal auth headers instead
  conn.setRequestProperty(HEADER_CLUSTER_TOKEN, raftHA.getClusterToken());
  final String sessionToken = auth.substring(AUTHORIZATION_BEARER.length()).trim();
  final HttpAuthSession session = httpServer.getAuthSessionManager().getSession(sessionToken);
  if (session != null) {
    conn.setRequestProperty(HEADER_FORWARDED_USER, session.getUser().getName());
  } else {
    // Handle case where session is invalid or expired
  }
} else if (auth != null) {
  // ...
```

```java
final JSONObject cluster = getClusterInfo(container);
if (cluster.has("isLeader") && cluster.getBoolean("isLeader"))
  return true;
} catch (final Exception ignored) {}
```
Empty catch blocks should be avoided, even in tests, as they can swallow exceptions and hide underlying problems. It would be better to at least log the exception at a DEBUG or TRACE level. This would make debugging test failures easier. This applies to other similar empty catch blocks in this file and other new test files.
```java
} catch (final Exception e) {
  // Ignored during polling for leader
}
```

```java
// Toxiproxy "slow_close" + "limit_data" simulate packet loss behavior.
// The "timeout" toxic with a probability achieves drop behavior.
// Using bandwidth toxic with very low rate to simulate packet-level loss.
```
The comment here is a bit misleading. It mentions slow_close, limit_data, and timeout toxics to simulate packet loss, but the implementation uses the bandwidth toxic. While limiting bandwidth can simulate some effects of packet loss (like timeouts), it's not the same as randomly dropping packets. The comment should be updated to accurately reflect the implementation, for example, by stating that it simulates packet loss effects by severely limiting bandwidth.
```java
private ResultSet forwardCommandToLeader(final String language, final String query, final Map<String, Object> namedParams,
    final Object[] positionalParams) {
  HALog.log(this, HALog.DETAILED, "Forwarding command to leader: %s %s (db=%s)", language, query, getName());

  // Rollback the local transaction started by DatabaseAbstractHandler.transaction() wrapper.
  // The command executes on the leader, so no local changes should be committed.
  if (isTransactionActive())
    rollback();

  final RaftHAServer raftHA = server.getRaftHA();
  final byte[] resultBytes = raftHA.forwardCommand(getName(), language, query, namedParams, positionalParams);
  HALog.log(this, HALog.TRACE, "Command forwarded successfully: %d bytes result", resultBytes.length);

  // Wait for the leader's WAL changes to be applied locally on this follower.
  // Without this, a subsequent read on this server may not see the changes yet.
  raftHA.waitForLocalApply();

  // Check for error response
  if (resultBytes.length > 0 && resultBytes[0] == 'E') {
    final String error = new String(resultBytes, 1, resultBytes.length - 1);
    throw new com.arcadedb.exception.CommandExecutionException(error);
  }

  // Deserialize binary result into ResultSet
  final java.util.List<Map<String, Object>> rows =
      com.arcadedb.server.ha.ratis.RaftLogEntry.deserializeCommandResult(resultBytes);
  final com.arcadedb.query.sql.executor.InternalResultSet rs = new com.arcadedb.query.sql.executor.InternalResultSet();
  for (final Map<String, Object> row : rows) {
    final com.arcadedb.query.sql.executor.ResultInternal result = new com.arcadedb.query.sql.executor.ResultInternal(proxied);
    result.setPropertiesFromMap(row);
    rs.add(result);
  }
  return rs;
}
```
The private method forwardCommandToLeader seems to be unused. The logic in command() now throws ServerIsNotTheLeaderException, and the forwarding is handled by the HTTP proxy layer. This method appears to be a remnant of a different implementation strategy. To improve maintainability and avoid confusion, it should be removed if it's no longer needed.
```java
.atMost(5, TimeUnit.MINUTES)
.pollInterval(1, TimeUnit.SECONDS)
.until(() -> getServer(serverNumber).getHA().getMessagesInQueue() == 0);
// With Ratis, replication is handled internally. Wait for state machine application.
```
Replacing the explicit wait for the replication queue to be empty with a fixed Thread.sleep() is unreliable and can lead to flaky tests. A better approach would be to implement a more deterministic wait. For example, you could get the leader's commit index and then poll the followers until their applied index matches the leader's commit index. This would ensure that replication is actually complete before proceeding with test assertions.
```java
// With Ratis, replication is handled internally. Wait for state machine application on followers.
final ArcadeDBServer leader = getLeaderServer();
if (leader == null)
  return; // NO LEADER, CANNOT WAIT
final long leaderCommitIndex = leader.getHA().getCommitIndex();
for (int i = 0; i < getServerCount(); ++i) {
  final ArcadeDBServer server = getServer(i);
  if (server != leader)
    Awaitility.await().atMost(1, TimeUnit.MINUTES).until(() -> server.getHA().getLastAppliedIndex() >= leaderCommitIndex);
}
```
📜 License Compliance Check

✅ License check passed. See artifacts for full report.
Created first HA benchmark with multiple servers, testing an insert, and the results are VERY POOR. Key findings: …

The benchmark is tagged `@Tag("benchmark")` so it's excluded from normal test runs. Run with:

```
mvn test -pl server -Dtest=HAInsertBenchmark -Dgroups=benchmark
```

We need to analyze this before merging.
Code review posted - see full review details in the next comment
test4 ignore |
Deleting test - please ignore this comment |
## Code Review - Replace custom Raft protocol with Apache Ratis

This is a major and well-structured change. Replacing a custom, ad-hoc Raft implementation with Apache Ratis is the right call for long-term reliability. The overall architecture is sound, the test coverage is impressive (36 tests + Docker e2e), and the documentation is thorough. Below are issues found during review.

### CRITICAL

1. **System.out.println in production code (HALog.java).** HALog.log() unconditionally calls System.out.println in addition to LogManager. Per project conventions (CLAUDE.md): remove any System.out you used for debug when you have finished. This will spam stdout in production when arcadedb.ha.logVerbose is nonzero. Also, the .replace in the format call is a no-op, and building a String.format() string just for the println while also calling LogManager is redundant. The println should simply be removed.

2. **Platform-default charset in UUID.nameUUIDFromBytes(clusterName.getBytes()) in RaftHAServer.java.** All nodes must agree on the same groupId or they will never form a cluster. Using .getBytes() without a charset argument is platform-dependent. This should use StandardCharsets.UTF_8 explicitly, as already done for the token derivation two lines later:

```java
UUID.nameUUIDFromBytes(clusterName.getBytes(StandardCharsets.UTF_8))
```

### HIGH

3. **Busy-wait polling in waitForAppliedIndex() and waitForLocalApply().** Both methods spin with Thread.sleep(10) up to the quorum timeout. This adds up to 10ms latency on every READ_YOUR_WRITES and LINEARIZABLE read and holds a thread. Since ArcadeDBStateMachine already has lastAppliedIndex as an AtomicLong, a LockSupport-based condition or a CompletableFuture completed when the index advances would be lower-latency.

4. **Dead code: forwardCommandToLeader() private method in ReplicatedDatabase.java.** The method is defined but never called. All command paths now throw ServerIsNotTheLeaderException instead, relying on the HTTP proxy. The method should be removed to avoid confusion.

5. **No connection/read timeout on the HTTP proxy in AbstractServerHttpHandler.proxyToLeader().** The call to new java.net.URI(targetUrl).toURL().openConnection() has no setConnectTimeout or setReadTimeout. If the leader is slow or unreachable, the follower thread blocks indefinitely. Explicit timeouts matching quorumTimeout should be set.

6. **Fully qualified names in production code.** CLAUDE.md says do not use fully qualified names if possible, always import the class. There are many FQN usages in production code in RaftHAServer.java and ReplicatedDatabase.java. For example: peerHttpAddresses is declared as new java.util.concurrent.ConcurrentHashMap, lagMonitorExecutor as java.util.concurrent.ScheduledExecutorService, and getFollowerStates returns java.util.List of java.util.Map. These should be imported at the top of the file.

### MEDIUM

7. **isCurrentNodeLeader() race in ArcadeDBStateMachine.applyTransaction().** The state machine skips apply on the leader (returns early if isCurrentNodeLeader()). During a leadership transition, a node that was leader when commit2ndPhase() ran could lose leadership before applyTransaction() is called, causing double-application of WAL page changes. Worth adding a comment explaining why this is safe in the Ratis model, or verifying it actually cannot happen.

8. **Breaking change: connectCluster / disconnectCluster throw UnsupportedOperationException.** Existing users calling POST /api/v1/server with command connect or disconnect will get a 500 response instead of a helpful message. These should return a proper HTTP 400 with a migration hint rather than an uncaught exception.

9. **Spurious double-block in exception handler in AbstractServerHttpHandler.java.** There is an extra unnecessary block wrapping the proxyToLeader try block in the ServerIsNotTheLeaderException catch handler. It is a no-op but visually confusing and should be removed.

### LOW / NITS

10. **Cluster token written back to GlobalConfiguration.** initClusterToken() calls configuration.setValue(GlobalConfiguration.HA_CLUSTER_TOKEN, this.clusterToken). Since GlobalConfiguration values can appear in server info dumps and logs, the derived token (which is a function of the root password) could be inadvertently exposed. Consider keeping it only in the clusterToken field without writing it back to the global config.

11. **getFollowerStates() index correlation assumption.** The method assumes matchIndices[i] corresponds to the i-th non-self peer in raftGroup.getPeers(). This assumption depends on Ratis internal ordering that may not be guaranteed across versions. Should be validated against the Ratis API contract.

12. **System.out.println in test infrastructure.** RaftClusterStarter System.out usage is fine as CLI output, but actual test classes such as RaftHAComprehensiveIT should not emit to stdout to avoid cluttering CI output.

### Summary

The architectural direction is excellent and the test suite is thorough. The System.out.println in HALog and the platform-default charset in the group ID derivation are the two that should be addressed before merge.
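The explicit-charset fix in #2 can be demonstrated in isolation; a hypothetical sketch of deterministic group-ID derivation (class and method names are illustrative, not the actual RaftHAServer code):

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class GroupIdDerivation {
  // Derive a cluster-wide group ID from the cluster name. Passing the
  // charset explicitly makes the input bytes (and therefore the UUID)
  // identical on every node, regardless of each JVM's platform-default
  // encoding - which is exactly what a shared Raft groupId requires.
  static UUID groupIdFor(final String clusterName) {
    return UUID.nameUUIDFromBytes(clusterName.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(final String[] args) {
    // nameUUIDFromBytes produces a type-3 (name-based, MD5) UUID:
    // the same name always yields the same UUID.
    System.out.println(groupIdFor("arcadedb"));
    System.out.println(groupIdFor("arcadedb").equals(groupIdFor("arcadedb"))); // true
  }
}
```

With the no-argument `getBytes()`, a node running with a non-UTF-8 default charset could derive different bytes for the same cluster name and silently fail to join the group.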
## Code Review - PR 3798: Replace Custom Raft with Apache Ratis

This is a substantial, well-motivated architectural replacement. Replacing a hand-rolled Raft protocol with Apache Ratis is the right call - the PR description is honest about the limitations of the old implementation. The new code is well-structured and comes with an impressive test suite (17 comprehensive tests + Docker e2e tests with network fault injection). The documentation in docs/arcadedb-ha-26.4.1.md is thorough. Below are the issues found, grouped by severity.

### SECURITY - High

1. **Weak cluster token derivation (RaftHAServer.initClusterToken()).** … Recommendation: use HMAC-SHA256:

```java
Mac mac = Mac.getInstance("HmacSHA256");
mac.init(new SecretKeySpec(rootPassword.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
this.clusterToken = Base64.getUrlEncoder().withoutPadding().encodeToString(mac.doFinal(clusterName.getBytes(StandardCharsets.UTF_8)));
```

2. **SnapshotHttpHandler - unauthenticated database download surface.** The snapshot endpoint is registered on …

### SECURITY - Medium

3. **validateClusterForwardedAuth() - username blindly trusted from header.** After validating the cluster token, the username from …

4. **proxyToLeader() - silent privilege escalation to root:**

```java
conn.setRequestProperty(HEADER_CLUSTER_TOKEN, raftHA.getClusterToken());
conn.setRequestProperty(HEADER_FORWARDED_USER, "root");
```

If a request arrives at a follower with no Authorization header (from a handler where …

5. **Snapshot downloaded over plain HTTP with no integrity verification.** …

### BUGS - High

6. **System.out.println in production HALog.java.** … Additionally the format string mutation …

7. **DDL replication inside the database write lock (ReplicatedDatabase.recordFileChanges()).** …

### BUGS - Medium

8. **Files.list() stream not closed in RaftHAServer.startService():**

```java
final boolean storageExists = java.nio.file.Files.exists(storagePath)
    && java.nio.file.Files.list(storagePath).findAny().isPresent();
```

9. **Potential double-apply for TRANSACTION_FORWARD entries.** …

10. **getFollowerStates() - array index alignment not guaranteed.** The code assumes …

### PERFORMANCE

11. **proxyToLeader() uses HttpURLConnection with no connection pool or read timeout.** Every proxied follower-to-leader request opens a new TCP connection. The default read timeout is infinite (0). Under load this creates many TIME_WAIT sockets and risks blocking proxy threads indefinitely.

12. **Busy-poll in waitForLocalApply() under LINEARIZABLE reads.** The 10ms poll loop for applied index creates unnecessary scheduling overhead. Ratis provides …

### CODE QUALITY

13. **Fully-qualified class names in production code.** Multiple files use …

14. **Duplicate Javadoc block on waitForLocalApply().** Two consecutive …

15. **forwardCommandToLeader() in ReplicatedDatabase appears to be dead code.** Writes now throw …

16. **Magic byte 'C' for state machine routing.** ASCII …

### Summary

Must fix before merge: …

The overall design and test coverage are excellent. The Ratis integration is architecturally sound. These are fixable issues that should be addressed before merging.
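The `Files.list()` leak flagged in #8 closes cleanly with try-with-resources; a self-contained sketch with a simplified stand-in for the storage-existence check (the method name is illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class StorageCheck {
  // Files.list() returns a Stream backed by an open DirectoryStream;
  // without an explicit close, the OS file handle leaks until GC.
  // try-with-resources releases it as soon as findAny() has answered.
  static boolean isNonEmptyDirectory(final Path dir) throws IOException {
    if (!Files.exists(dir))
      return false;
    try (Stream<Path> entries = Files.list(dir)) {
      return entries.findAny().isPresent();
    }
  }

  public static void main(final String[] args) throws IOException {
    final Path tmp = Files.createTempDirectory("storage-check");
    System.out.println(isNonEmptyDirectory(tmp)); // false - directory is empty
    Files.createFile(tmp.resolve("data.bin"));
    System.out.println(isNonEmptyDirectory(tmp)); // true
  }
}
```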
## Code Review - PR #3798 (follow-up)

There are already thorough reviews on this PR. This comment focuses on two areas not yet addressed in the review thread.

### Performance: 29x HA overhead warrants documentation before merge

The benchmark posted by @lvca shows: …

A 29x throughput drop is expected for synchronous Raft consensus (each batch requires a majority-quorum round-trip over gRPC), but the absolute numbers matter for users: 53 op/s at batch-size 100 means roughly 5,300 single-record writes per second before the leader becomes the bottleneck. Before merge, the HA guide should document: …

### Test coverage gaps in critical paths

Codecov reports 47.48% patch coverage with 1,548 lines missing. The most concerning gaps: …

### Remaining items from today's review thread still open

Per the 18:08 UTC review today, these items remain: …

Items 3 and 4 are low-risk changes that should be addressed before merge.
## Code Review - Replace custom Raft protocol with Apache Ratis

This is a very substantial and well-architected change. Replacing the custom Raft implementation with Apache Ratis is the right call - battle-tested consensus implementations are notoriously difficult to get right, and using Ratis brings pre-vote, parallel voting, and gRPC streaming essentially for free. Below are observations organized by priority. Most are edge cases; the overall design quality is high.

### High Priority

- The origin-skip optimization (leader skips re-applying its own log entries because they were applied via …
- When …

### Medium Priority

- A hard-coded size cap:

```java
final long maxBytes = 10L * 1024 * 1024 * 1024; // line ~434
```

For clusters storing large graphs this will be hit. A 10 GB cap on a single database file will cause follower sync to fail silently. This should be exposed as a …

- The watchdog thread closes the underlying …
- When …
- If the serialization format ever needs to change (new entry type, field reordering) there is no version tag to distinguish old from new entries in a mixed-version rolling upgrade window. Adding a single version byte immediately after the type byte would make future format evolution safe.

### Low Priority / Observations

- The token is derived from …
- The comment already notes this is "NOT a substitute for mTLS" and references #3890. For visibility: until mTLS lands, any host that can spoof DNS or sit in the same DNS zone could potentially bypass the IP-based allowlist. The correct mitigation is the referenced mTLS work, and the current approach is an acceptable interim measure as long as the limitation is documented in the HA guide (it is - good).
- If all pods in the resolved DNS set are unreachable (e.g. during a full cluster cold start) the loop completes without a successful …

### Positive Highlights

…

Overall this is a high-quality implementation of a genuinely hard problem. The issues above are mostly edge cases in failure modes, not steady-state correctness bugs. The test coverage (unit + integration + Docker e2e) is comprehensive for a change of this scope.
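The single-version-byte suggestion can be sketched with a plain `ByteBuffer`; the entry layout below is hypothetical, not the actual Ratis log entry format:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class VersionedEntry {
  static final byte FORMAT_V1 = 1;

  // Hypothetical layout: [type byte][version byte][payload length][payload].
  // The version byte immediately after the type lets a newer node detect
  // and still parse older entries during a mixed-version rolling upgrade.
  static byte[] encode(final byte type, final byte[] payload) {
    final ByteBuffer buf = ByteBuffer.allocate(2 + 4 + payload.length);
    buf.put(type).put(FORMAT_V1).putInt(payload.length).put(payload);
    return buf.array();
  }

  static byte[] decode(final byte expectedType, final byte[] entry) {
    final ByteBuffer buf = ByteBuffer.wrap(entry);
    if (buf.get() != expectedType)
      throw new IllegalArgumentException("Unexpected entry type");
    final byte version = buf.get();
    if (version != FORMAT_V1)
      throw new IllegalArgumentException("Unsupported format version: " + version);
    final byte[] payload = new byte[buf.getInt()];
    buf.get(payload);
    return payload;
  }

  public static void main(final String[] args) {
    final byte[] entry = encode((byte) 'C', "select 1".getBytes(StandardCharsets.UTF_8));
    System.out.println(new String(decode((byte) 'C', entry), StandardCharsets.UTF_8)); // select 1
  }
}
```

A decoder that sees an unknown version can fail loudly (as here) or dispatch to a legacy parser, which is the property the review is asking for.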
## Code Review - Replace custom Raft protocol with Apache Ratis

This is a major, well-motivated architectural improvement. Replacing the custom RAFT implementation with Apache Ratis is the right call - a formally correct, battle-tested consensus library is far more reliable than a homegrown protocol. The implementation is thorough, well-documented, and includes impressive test coverage (unit, integration, Docker E2E with Toxiproxy). Here is my detailed feedback.

### Highlights worth calling out

…

### Issues to fix

1.

```java
// Current (lines 91-93):
protected final ArcadeDBServer server;
protected final LocalDatabase proxied;
protected final long timeout;
```

These fields are set in the constructor and never modified. They are …

2. Looking at lines 189-200:

```java
if (!snapshotSemaphore.tryAcquire()) {
  // ...return 503...
}
ScheduledFuture<?> watchdog = null;
try {
  // ... rest of logic ...
} finally {
  if (watchdog != null) watchdog.cancel(false);
  snapshotSemaphore.release();
}
```

The only statement between the acquire and the …

```java
ScheduledFuture<?> watchdog = null;
boolean acquired = snapshotSemaphore.tryAcquire();
if (!acquired) {
  // return 503
}
try {
  // ...
} finally {
  if (acquired) snapshotSemaphore.release();
  if (watchdog != null) watchdog.cancel(false);
}
```

3.

```java
// Line 285:
public org.apache.ratis.util.LifeCycle.State getRaftLifeCycleState() {
```

The …

4.

```java
return CompletableFuture.supplyAsync(() -> {
  // ... heavy HTTP download + db.close() + db.reopen() ...
});
```

…

5. **Stale follower risk after failed snapshot download (acknowledged, but worth flagging).** In …

```java
needsSnapshotDownload.set(true);
lifecycleExecutor.submit(() -> {
  if (needsSnapshotDownload.compareAndSet(true, false)) {
    snapshotInstaller.installDatabasesFromLeader();
    // If this fails, the flag stays false
  }
});
```

The comment correctly notes: "If the download itself fails, the flag stays false until the next apply failure re-arms it." On a quiet cluster with no new writes, the follower remains permanently stale. Consider adding a periodic health check in …

6.

```java
// ArcadeDBStateMachine.java lines 934-940:
@Deprecated
static WALFile.WALTransaction parseWalTransaction(final Binary buffer) {
  return RaftLogEntryCodec.parseWalTransaction(buffer);
}
```

This deprecated method was added in the same PR that introduced its replacement. If no callers exist outside of this PR, remove it entirely rather than deprecating it immediately.

7. The Javadoc comment states: "all current callers pass …

8. …

### Minor / non-blocking observations

…

### Test coverage

Excellent. The combination of: …

...is significantly better coverage than most HA implementations. The use of …

Overall this is high-quality work. The critical issues (#1, #3) are easy fixes. Issue #4 (ForkJoinPool) and #5 (stale follower re-arm) are worth addressing before merge given their potential impact in production. The rest are improvements or minor clarifications.

🤖 Generated with Claude Code
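Issue #4 is addressable by handing `supplyAsync` a dedicated executor instead of the common pool; a minimal sketch with the download/reopen work stubbed out (class, method, and thread names are illustrative):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SnapshotTask {
  // A dedicated single-thread executor keeps long-running, blocking
  // snapshot work off ForkJoinPool.commonPool(), whose threads are
  // shared JVM-wide and sized for CPU-bound tasks, not blocking I/O.
  static String runInstall(final ExecutorService snapshotExecutor) throws Exception {
    final CompletableFuture<String> install = CompletableFuture.supplyAsync(() -> {
      // ...heavy HTTP download + db.close() + db.reopen() would go here...
      return Thread.currentThread().getName();
    }, snapshotExecutor); // <-- explicit executor instead of the common pool
    return install.get();
  }

  public static void main(final String[] args) throws Exception {
    final ExecutorService exec =
        Executors.newSingleThreadExecutor(r -> new Thread(r, "snapshot-installer"));
    try {
      System.out.println(runInstall(exec)); // snapshot-installer
    } finally {
      exec.shutdown();
    }
  }
}
```

The single-argument `supplyAsync(supplier)` overload silently defaults to the common pool; the two-argument overload makes the threading choice explicit and auditable.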
Not up to standards ⛔

🔴 Issues

| Category | Results |
|---|---|
| BestPractice | 11 medium |
| Documentation | 11 minor |
| ErrorProne | 2 high |
| CodeStyle | 22 minor |
| Complexity | 1 medium |

🟢 Metrics

| Metric | Results |
|---|---|
| Complexity | 1261 |

> TIP: This summary will be updated as you push new changes.
## Code Review - Apache Ratis HA Integration

This is a significant and well-motivated architectural improvement. Replacing a bespoke, under-tested Raft implementation with Apache Ratis (battle-tested in Apache Ozone, IoTDB, Alluxio) is the right call. The test suite depth - especially the Toxiproxy-based chaos tests - is excellent. Below are issues grouped by severity.

### Critical / Data Safety

1. **Leader commits locally before achieving quorum.** The PR description states that … This pattern may be intentional for performance reasons, but it needs to be prominently documented as a durability trade-off, not just a performance one. The new …

2. **Ratis shading conflict with ….** Ratis ships with a shaded gRPC (…

### Security

3. **Token comparison must use constant-time equality:**

```java
// Vulnerable:
if (!expectedToken.equals(providedToken)) { ... }
// Safe:
if (!MessageDigest.isEqual(expectedToken.getBytes(StandardCharsets.UTF_8),
    providedToken.getBytes(StandardCharsets.UTF_8))) { ... }
```

4. **Deriving a shared secret as ….** Use …

5. Filtering by IP address provides a very shallow security boundary - IPs can be spoofed and containers often share NAT. Without mTLS (tracked in issue #3890), peer identity is entirely unverified. This is acceptable as a known gap for the initial integration, but a security warning should be added to the documentation and the configuration.

6. The auto-join logic queries the Kubernetes API to discover peers. Please verify: …

### Correctness / Bugs

7. **Missing HTTP timeout in …**

8. …

9. The format call mutates its own format string:

```java
String.format(message.replace("%s", "%s").replace("%d", "%s"), args)
```

10. The …

11. There are now two e2e test modules: …

### Performance

12. …

13. …

### Module Structure

14. Per … Verify that …

### Minor / Non-blocking

…

### Summary

The test coverage - particularly the Toxiproxy chaos tests and split-brain scenarios - is genuinely excellent. Issues #3, #7, and #8 are straightforward to fix. Issue #1 (durability semantics) is the most architecturally significant and deserves explicit documentation even if the current behavior is intentional.

🤖 Generated with Claude Code
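On #4, one common derivation choice is the JDK's built-in PBKDF2; a sketch whose iteration count, key length, and salt choice are illustrative, not ArcadeDB's actual parameters:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class ClusterTokenDerivation {
  // Derive a cluster token from the root password with PBKDF2-HMAC-SHA256.
  // The cluster name serves as salt for domain separation: two clusters
  // sharing a root password still derive distinct tokens. The 100k
  // iterations make brute-forcing the password from a leaked token costly.
  static String deriveToken(final char[] rootPassword, final String clusterName) throws Exception {
    final PBEKeySpec spec = new PBEKeySpec(
        rootPassword, clusterName.getBytes(StandardCharsets.UTF_8), 100_000, 256);
    final byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
        .generateSecret(spec).getEncoded();
    return Base64.getUrlEncoder().withoutPadding().encodeToString(key);
  }

  public static void main(final String[] args) throws Exception {
    final String token = deriveToken("root-password".toCharArray(), "arcadedb");
    System.out.println(token.length()); // 43 - 256 bits as unpadded URL-safe Base64
    // Deterministic: every node with the same inputs derives the same token.
    System.out.println(token.equals(deriveToken("root-password".toCharArray(), "arcadedb"))); // true
  }
}
```

Unlike a single unsalted hash, this derivation is slow by design and keyed to the cluster name, addressing both weaknesses the reviews raise.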
## Code Review

This is an excellent, well-engineered replacement of the custom Raft implementation with Apache Ratis. The code quality is very high: thorough Javadoc, clear crash-safety analysis, strong security posture, and a well-factored module structure. The following observations are in rough priority order.

### Bugs / Correctness

1. **Emergency-stop thread is not a daemon thread.** In the …

```java
final Thread stopThread = new Thread(() -> {
  try { server.stop(); } catch (final Throwable t) { ... }
}, "arcadedb-emergency-stop");
stopThread.start(); // no setDaemon(true)
```

Every other background stop/restart thread in the module calls …

2. Both methods call …

3. **Misleading comment in ….** The comment reads: …

### Security

4.

```java
private static boolean constantTimeTokenEquals(final String expected, final String provided) {
  return MessageDigest.isEqual(
      expected.getBytes(StandardCharsets.UTF_8),
      provided.getBytes(StandardCharsets.UTF_8));
}
```

5. **gRPC port authentication gap in K8s mode (logged but no startup guard).** The …

### Read Consistency

6.

```java
if (remaining <= 0) {
  HALog.log(this, HALog.DETAILED, "waitForLocalApply timed out...");
  return; // silent
}
```

On timeout the method returns without throwing, so …

7. **LINEARIZABLE on a follower without a bookmark silently degrades to READ_YOUR_WRITES.** The Javadoc in …

### Performance / Operational

8. **Schema reload blocks the Ratis apply thread** (…

9. **Confirm quorumTimeout sizing for ALL quorum batch shared deadline.** In …

### Minor

10. Missing return (RET): `result.get();`

11. In …

### What's Well Done

…

Overall this is a very solid PR. The identified issues are mostly minor or documentation-level, with the daemon-thread omission (#1) being the most important to address before merge.
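The daemon-thread fix in #1 is a one-liner before `start()`; a self-contained sketch with `server.stop()` stubbed as a `Runnable` (class and method names are illustrative):

```java
public class EmergencyStop {
  // Marking the stop thread as a daemon guarantees it can never be the
  // last thread keeping the JVM alive: the JVM exits when only daemon
  // threads remain, so a stuck stop procedure cannot hang shutdown forever.
  static Thread emergencyStopThread(final Runnable stopAction) {
    final Thread stopThread = new Thread(() -> {
      try {
        stopAction.run();
      } catch (final Throwable t) {
        // last-resort handler: nothing left to do but report
        t.printStackTrace();
      }
    }, "arcadedb-emergency-stop");
    stopThread.setDaemon(true); // <-- the missing call
    return stopThread;
  }

  public static void main(final String[] args) throws InterruptedException {
    final Thread t = emergencyStopThread(() -> System.out.println("stopping"));
    System.out.println(t.isDaemon()); // true
    t.start();
    t.join();
  }
}
```

The flag must be set before `start()`; calling `setDaemon` on a live thread throws `IllegalThreadStateException`, which is why returning the unstarted thread (as here) or setting the flag inline both work, but patching it after `start()` does not.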
… new test case and improved code coverage
## Code Review - Update (apache-ratis branch, be3ba69)

This is a follow-up review of the updated branch. The issues called out in the April 5 review have been addressed: …

The overall code quality is high. The security additions (PBKDF2 token derivation with 100k iterations, constant-time comparisons, zip-slip + symlink + decompression-bomb protection in …

### Remaining items worth addressing

1.

```java
if (!leaderReady)
  LogManager.instance().log(... "waitForLeaderReady timed out after %dms - proceeding with potentially stale leader state" ...);
```

After a leader election, if the state machine catch-up task does not complete within …

2.

```java
if (System.currentTimeMillis() - lastResolveMs >= refreshIntervalMs) {
  resolveNow();
  if (allowedIps.get().contains(ip))
    return attrs;
}
```

…

3.

```java
if ("production".equals(configuration.getValueAsString(GlobalConfiguration.SERVER_MODE))
    && "arcadedb".equalsIgnoreCase(clusterName))
  LogManager.instance().log(this, Level.WARNING, ...);
```

The warning about using the default cluster name (weaker token domain separation) fires only in production mode. A staging or pre-production cluster using the default name gets no feedback. The production-mode gate for the rotating-token warning is appropriate, but the default-name warning should probably fire in all modes since the risk is the same regardless of …

4. **Test ….** Several integration test files use bare …

5. **Minor - constant definitions placed after first use.** …

### Security note (already documented but worth reiterating)

The PR description and …

### Summary

The implementation is significantly more mature than the previous review cycle. Items 1-3 above are the most actionable before merge; the rest are improvement suggestions for follow-up. The Apache Ratis integration, snapshot machinery, and two-phase commit protocol are well-designed and thoroughly tested with a comprehensive suite of unit, integration, and Docker e2e tests.
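On item 4, a plain-JDK deadline-polling helper avoids bare `Thread.sleep()` in tests (Awaitility provides the same pattern with richer diagnostics); this helper is hypothetical, not part of the test suite:

```java
import java.util.function.BooleanSupplier;

public class PollUntil {
  // Poll a condition until it holds or a deadline passes. Unlike a fixed
  // sleep, this returns as soon as the condition becomes true and fails
  // loudly - rather than silently proceeding - when it never does.
  static void await(final BooleanSupplier condition, final long timeoutMs, final long pollMs)
      throws InterruptedException {
    final long deadline = System.currentTimeMillis() + timeoutMs;
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline)
        throw new AssertionError("Condition not met within " + timeoutMs + "ms");
      Thread.sleep(pollMs);
    }
  }

  public static void main(final String[] args) throws InterruptedException {
    final long start = System.currentTimeMillis();
    // Condition becomes true after ~50ms; await returns shortly after,
    // instead of a worst-case fixed sleep sized for the slowest machine.
    await(() -> System.currentTimeMillis() - start >= 50, 5_000, 10);
    System.out.println("condition met");
  }
}
```

The failure mode is the key difference: a bare sleep that was too short lets the test run its assertions against incomplete state, while this helper converts "replication never caught up" into an explicit, diagnosable error.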
Code Review - Follow-up (Apache Ratis integration) Thank you for the significant iteration - 30+ fix commits implementing prior review recommendations show real responsiveness. This follow-up focuses on the benchmark concern and remaining open items. PERFORMANCE - 29x overhead warrants investigation before merge @lvca noted 53 ops/sec (HA, 3-node) vs 1,546 ops/sec (no-HA) with batch=100. The P99 latency of ~28ms per batch on what is likely a single-machine test suggests suboptimal Ratis configuration rather than pure network cost. Suggested investigation:
If the benchmark ran all 3 nodes on one machine: real multi-machine RTT (1-5ms) typically gives 5-10x better throughput. Please note the hardware setup in the benchmark comment so others can calibrate. CI FAILURES Three checks are currently failing on this push:
TEST COVERAGE GAPS Codecov shows critical-path files at very low coverage:
Not asking for 80% coverage, but at least one integration test exercising SnapshotInstaller's happy path would materially reduce the risk of shipping a broken recovery path. STILL OPEN: Snapshot partial-failure cleanup Commit 18ced57 acknowledged this as 'valid but requires significant refactoring'. The risk: if ZIP extraction fails mid-way in SnapshotInstaller, the partially extracted database dir is left behind and corrupts the node on the next restart. An atomic swap (extract to db.tmp, then rename) fixes this. Please open a GitHub issue to track this before merging so it is not lost. RESOLVED from prior reviews (good work) Platform-default charset in group ID derivation, timing-attack on cluster token comparison, System.out.println in HALog, dead code (COMMAND_FORWARD), missing proxy timeouts, ATTRIBUTIONS.md update - all appear addressed based on commit messages. Summary The architectural direction is correct and the implementation has matured substantially through multiple fix rounds. Before merge:
The 36-test suite plus Docker e2e coverage and the detailed documentation are appreciated for a change of this scope.
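The extract-to-temp-then-rename approach suggested under "STILL OPEN" can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual `SnapshotInstaller`: the class name, the ZIP handling, and the fallback for filesystems without atomic rename are all illustrative.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.Enumeration;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical sketch: extract the snapshot into a sibling temp directory
// first, then promote it with a single rename, so a crash mid-extraction
// never leaves a half-written database directory in place.
public final class AtomicSnapshotInstall {

  public static void install(final Path snapshotZip, final Path databaseDir) throws IOException {
    final Path tmp = databaseDir.resolveSibling(databaseDir.getFileName() + ".tmp");
    deleteRecursively(tmp);          // clear leftovers from a previous failed attempt
    extract(snapshotZip, tmp);       // may fail mid-way; only the temp dir is affected
    deleteRecursively(databaseDir);  // old dir removed only after extraction succeeded
    try {
      Files.move(tmp, databaseDir, StandardCopyOption.ATOMIC_MOVE);
    } catch (final AtomicMoveNotSupportedException e) {
      Files.move(tmp, databaseDir);  // fallback for filesystems without atomic rename
    }
  }

  private static void extract(final Path zip, final Path target) throws IOException {
    Files.createDirectories(target);
    try (ZipFile zf = new ZipFile(zip.toFile())) {
      final Enumeration<? extends ZipEntry> entries = zf.entries();
      while (entries.hasMoreElements()) {
        final ZipEntry entry = entries.nextElement();
        final Path out = target.resolve(entry.getName()).normalize();
        if (!out.startsWith(target))   // zip-slip guard
          throw new IOException("Illegal entry: " + entry.getName());
        if (entry.isDirectory()) {
          Files.createDirectories(out);
        } else {
          Files.createDirectories(out.getParent());
          try (InputStream in = zf.getInputStream(entry)) {
            Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
          }
        }
      }
    }
  }

  private static void deleteRecursively(final Path dir) throws IOException {
    if (!Files.exists(dir))
      return;
    try (Stream<Path> paths = Files.walk(dir)) {
      paths.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (final IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

Because the temp directory is a sibling of the database directory, it lives on the same filesystem, which is what makes the `ATOMIC_MOVE` rename possible.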
**Code Review - Replace custom Raft protocol with Apache Ratis**

This is a substantial and well-motivated change. Replacing a bespoke Raft implementation with Apache Ratis is the right architectural call - Ratis is battle-tested at scale (Apache Ozone, IoTDB) and provides correctness guarantees that would take years to replicate manually. The test coverage is commendable: unit tests, integration tests, and Docker-based E2E tests with network partition and packet-loss injection are exactly what this kind of infrastructure change needs. Below are observations grouped by severity.

**Bugs / Correctness Issues**

**1.**
**2. Spurious no-op**

The first

**3. Missing HTTP timeout in**

The follower-to-leader proxy path opens an

**4.** In the retry loop, the caller-supplied

**Security Issues**

**5. Timing-attack vulnerability in cluster token comparison**

Token equality is checked with `String.equals()`. Example fix:

```java
// Instead of:
if (!expected.equals(provided)) { ... }

// Use:
if (!MessageDigest.isEqual(expected.getBytes(StandardCharsets.UTF_8),
                           provided.getBytes(StandardCharsets.UTF_8))) { ... }
```

**6. User enumeration in auth error messages**

Some auth failure responses include the username in the error message (e.g. the "Unknown forwarded user" response noted earlier).

**Data Durability Concern**

**7. Leader applies locally before replication quorum**

Please verify (and document in a comment) that the local apply step in

**Code Quality / Maintainability**

**8.** 39 new

Many new

**9.** The snapshot download endpoint must be protected by the cluster token. Verify it checks

**10.** The background flusher

**11. TODO section**

The documentation contains a TODO section. Per project conventions, open items should be tracked as GitHub issues rather than left in committed docs. Convert these to issues before merging.

**Attribution / Licensing**

**12. ATTRIBUTIONS.md not updated for Apache Ratis**

Per the project's CLAUDE.md, every new dependency must have an entry in `ATTRIBUTIONS.md`.

**Minor / Nits**
Overall this is a strong PR. Addressing the security issues (5, 6, 9) and the data durability question (7) is the most important work before merge. The remaining items are improvements that can follow in a subsequent PR if needed.
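The constant-time comparison suggested for issue 5 can be packaged as a small helper. A minimal self-contained sketch - the class and method names are illustrative, not the PR's actual code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Hypothetical helper for comparing cluster tokens without leaking timing
// information about where the first mismatching character sits.
public final class ClusterTokenCheck {

  static boolean tokensMatch(final String expected, final String provided) {
    if (expected == null || provided == null)
      return false;
    // MessageDigest.isEqual examines every byte when lengths match, unlike
    // String.equals, which returns at the first differing character.
    return MessageDigest.isEqual(
        expected.getBytes(StandardCharsets.UTF_8),
        provided.getBytes(StandardCharsets.UTF_8));
  }
}
```

Note that `MessageDigest.isEqual` still returns early on a length mismatch, so it leaks only the token length, which is acceptable for fixed-length tokens.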
Not up to standards ⛔

🔴 Issues
| Category | Results |
|---|---|
| BestPractice | 11 medium |
| Documentation | 11 minor |
| ErrorProne | 2 high |
| CodeStyle | 23 minor |
| Complexity | 1 medium |
🟢 Metrics

| Metric | Results |
|---|---|
| Complexity | 1261 |
🟢 Coverage 64.65% diff coverage · -10.46% coverage variation
| Metric | Results |
|---|---|
| Coverage variation | ✅ -10.46% |
| Diff coverage | ✅ 64.65% |

Coverage variation details:

| | Coverable lines | Covered lines | Coverage |
|---|---|---|---|
| Common ancestor commit (e076207) | 117023 | 86128 | 73.60% |
| Head commit (88cea6b) | 149282 (+32259) | 94262 (+8134) | 63.14% (-10.46%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: `<coverage of head commit> - <coverage of common ancestor commit>`

Diff coverage details:

| | Coverable lines | Covered lines | Diff coverage |
|---|---|---|---|
| Pull request (#3798) | 3972 | 2568 | 64.65% |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: `<covered lines added or modified> / <coverable lines added or modified> * 100%`
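As a quick sanity check, the reported percentages follow from the line counts in the report above. A small throwaway calculation (the numbers are taken directly from the Codecov figures):

```java
// Reproduce the Codecov percentages from the raw line counts above.
public final class CoverageCheck {
  public static void main(final String[] args) {
    final double ancestor = 86128.0 / 117023 * 100;  // common ancestor coverage
    final double head = 94262.0 / 149282 * 100;      // head commit coverage
    final double variation = head - ancestor;        // coverage variation
    final double diff = 2568.0 / 3972 * 100;         // diff coverage
    System.out.printf("%.2f %.2f %.2f %.2f%n", ancestor, head, variation, diff);
    // prints 73.60 63.14 -10.46 64.65
  }
}
```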
**Code Review - Follow-up (2026-04-17, latest commit: d52f160)**

This is a follow-up review pass after the extensive prior iteration. The PR has come a long way and most of the earlier concerns have been addressed. Here is the current status.

**Confirmed Fixed (latest commits)**
**Still Open**

1. TOCTOU race in SnapshotHttpHandler (NPE risk)

The database can be dropped between the handler's lookup and its use, so the lookup may return null. Fix: add a null guard before unwrapping:

```java
final DatabaseInternal db = server.getDatabase(databaseName);
if (db == null) {
  exchange.setStatusCode(404);
  return;
}
db.getEmbedded()...
```

2. The configuration description says "Raft log gap threshold for lag warnings (0=disabled)" but does not say what N means. An operator expecting milliseconds would set this to 1000 and never get warnings (since 1000 log entries behind is a significant gap). Suggest appending: "A value of N means the replica is N committed log entries behind the leader."

3.
4. Using hash codes to deduplicate WARN log output is unreliable due to hash collisions. Two different cluster configurations could have the same hash code, suppressing a legitimate warning. For a safety-critical log, cache the last-printed string and compare with `String.equals()`.

**Overall Assessment**

The PR is in good shape. The core architecture (three-phase commit, group commit batching, read consistency levels, K8s auto-join, crash-safe snapshot swap) is sound. The security fundamentals (PBKDF2 cluster token, constant-time comparison, zip-slip protection, decompression bomb guards) are well-implemented. The test matrix - unit, integration, Docker e2e - is excellent. Issues 3 and 4 are minor. Issue 1 is a correctness concern that should be fixed before merge. Issue 2 is a documentation improvement that would help operators. Once issue 1 is addressed, this is ready to merge.
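Issue 4's suggestion - remember the last printed message and compare it with `equals()` instead of comparing hash codes - can be sketched as follows. The class name and the in-memory sink are illustrative, not the PR's actual logging code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch: suppress consecutive duplicate warnings by comparing
// full strings. Comparing hashCode() values instead could silently swallow a
// legitimately new warning when two different messages collide.
public final class DedupLogger {

  private String lastPrinted;
  final List<String> printed = new ArrayList<>();  // stand-in for the real log sink

  synchronized void warn(final String message) {
    if (Objects.equals(message, lastPrinted))
      return;              // same text as last time: suppress the repeat
    lastPrinted = message;
    printed.add(message);  // new text: always reaches the log
  }
}
```

The extra memory cost is one cached string, which is negligible next to the risk of a suppressed cluster-configuration warning.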
This comes from @robfrank after working on improving HA for many months with poor results. The main reason was the RAFT implementation itself, which I created years ago (my fault) and which was quite limited and poorly tested. Reaching the level of rock-solidity we want might have taken a year or more. So @robfrank had the idea to replace our internal RAFT system with Apache Ratis 3.2.2 - a battle-tested, formally correct implementation of the Raft consensus protocol used in production by Apache Ozone, IoTDB, and Alluxio.
After reviewing Apache Ratis' internal architecture, I started the integration from scratch, in parallel with @robfrank's branch `ha-redesign`. I've taken many tests and some components from that branch.

This change is transparent to users - the HTTP API, database API, query languages, and client libraries remain unchanged. The internal HA protocol now uses gRPC (shaded by Ratis) instead of custom TCP binary messages.
92 files changed, +7,917 / -6,639 lines (net +1,278)
What was removed (34 files)
The entire custom HA stack: HAServer, Leader2ReplicaNetworkExecutor, Replica2LeaderNetworkExecutor, LeaderNetworkListener, ReplicationLogFile, ReplicationProtocol, 21 message classes (TxRequest, CommandForwardRequest, etc.), and related infrastructure.
What was added (22 new files)
Core HA (5 production files in server/ha/ratis/):
Tests (10 files):
Docker e2e tests (7 files, tagged e2e-ha, require Docker):
Key features
New configuration settings
Test plan