fix: InfluxDB line protocol silent data loss and missing gzip support#3833
Conversation
…#3821) Return 400 instead of silent 204 when line protocol samples reference unknown timeseries types, and decompress Content-Encoding: gzip bodies in the write handler. Adds negative grammar tests for undocumented CREATE TIMESERIES TYPE syntax (PRECISION, COMPACTION INTERVAL). Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Code Review
This pull request fixes critical bugs in the TimeSeries InfluxDB Line Protocol, including silent data loss on unknown measurement types and missing gzip decompression support. It also adds grammar validation tests for the CREATE TIMESERIES TYPE statement. Review feedback identifies a race condition in the asynchronous payload handling and suggests using JSONObject for safer error response generation.
```java
final AtomicReference<byte[]> bytesRef = new AtomicReference<>();
e.getRequestReceiver().receiveFullBytes(
    (exchange, data) -> bytesRef.set(data),
    (exchange, err) -> {
      LogManager.instance().log(this, Level.SEVERE, "receiveFullBytes completed with an error: %s", err, err.getMessage());
      exchange.setStatusCode(StatusCodes.INTERNAL_SERVER_ERROR);
      exchange.getResponseSender().send("Invalid Request");
    });

final byte[] rawBytes = bytesRef.get();
if (rawBytes == null) {
  rawPayload = null;
  return null;
}
```
The use of receiveFullBytes is asynchronous and does not block the calling thread. Since bytesRef.get() is called immediately after, it will likely return null because the request body has not been fully read yet, leading to an incorrect "Request body is empty" error. Since this handler runs on a worker thread and has called e.startBlocking(), you should use the blocking getInputStream() to read the payload.
```java
final byte[] rawBytes;
try {
  rawBytes = e.getInputStream().readAllBytes();
} catch (final IOException ex) {
  LogManager.instance().log(this, Level.SEVERE, "Error reading request payload", ex);
  return null;
}
if (rawBytes.length == 0) {
  rawPayload = null;
  return null;
}
```
Pushback: AbstractServerHttpHandler.parseRequestPayload (lines 63-73) uses this exact same pattern - AtomicReference + receiveFullBytes + .get(). When blocking mode is started and the handler runs on a worker thread, receiveFullBytes completes synchronously before returning. This is the established pattern across all handlers in the codebase.
```java
return new ExecutionResponse(400,
    "{ \"error\" : \"Unknown timeseries type(s): " + String.join(", ", unknownTypes)
    + ". Create the type first with CREATE TIMESERIES TYPE.\"}");
```
Manually constructing JSON strings is error-prone and can lead to invalid JSON if any measurement names contain characters that require escaping (like double quotes). It is safer to use the JSONObject class to build the response body.
Suggested change:

```java
return new ExecutionResponse(400,
    new JSONObject().put("error", "Unknown timeseries type(s): " + String.join(", ", unknownTypes)
        + ". Create the type first with CREATE TIMESERIES TYPE.").toString());
```
Fixed in f02ef8b. Switched to new JSONObject().put("error", ...).toString() to prevent JSON injection from measurement names containing quotes.
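For illustration, a minimal standalone sketch of the injection the review describes. The class and helper names below are hypothetical, not ArcadeDB code; `escape` is a simplified version of what a JSON library does for string values:

```java
// Hypothetical demo class: shows how a measurement name containing quote
// characters injects an extra field into naively concatenated JSON, and how
// escaping (as JSONObject does internally) prevents it.
public class JsonEscapeDemo {
  // Naive concatenation, as in the original handler code.
  static String naive(final String name) {
    return "{ \"error\" : \"Unknown timeseries type(s): " + name + "\"}";
  }

  // Minimal JSON string escaping (simplified; a real library covers more cases).
  static String escape(final String s) {
    final StringBuilder sb = new StringBuilder();
    for (final char c : s.toCharArray()) {
      switch (c) {
        case '"':  sb.append("\\\""); break;
        case '\\': sb.append("\\\\"); break;
        case '\n': sb.append("\\n");  break;
        default:   sb.append(c);
      }
    }
    return sb.toString();
  }

  static String safe(final String name) {
    return "{ \"error\" : \"Unknown timeseries type(s): " + escape(name) + "\"}";
  }

  public static void main(final String[] args) {
    final String hostile = "cpu\",\"admin\":\"true";
    System.out.println(naive(hostile)); // carries an injected "admin" field
    System.out.println(safe(hostile));  // quotes escaped, single "error" field
  }
}
```

With the hostile name above, `naive` actually produces *valid* JSON containing an attacker-chosen `"admin"` field, which is worse than a parse error: downstream consumers would accept it silently.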
Up to standards ✅

🟢 Issues

| Metric | Results |
|---|---|
| Complexity | 10 |

🟢 Coverage: 67.57% diff coverage · -7.84% coverage variation

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch (`<coverage of head commit> - <coverage of common ancestor commit>`):

| Commit | Coverable lines | Covered lines | Coverage |
|---|---|---|---|
| Common ancestor (598496b) | 116361 | 85639 | 73.60% |
| Head (f02ef8b) | 147613 (+31252) | 97073 (+11434) | 65.76% (-7.84%) |

Diff coverage is the percentage of lines covered by tests out of the coverable lines the pull request added or modified (`<covered lines added or modified> / <coverable lines added or modified> * 100%`):

| | Coverable lines | Covered lines | Diff coverage |
|---|---|---|---|
| Pull request (#3833) | 37 | 25 | 67.57% |
Code Review

Good fixes overall - the silent data loss and gzip issues are real bugs worth addressing. A few concerns to address before merging.

**Critical: JSON injection in error response**

`PostTimeSeriesWriteHandler.java` line 181 builds JSON via string concatenation using measurement names that come directly from the InfluxDB line protocol request body. A name containing a quote character or backslash will produce malformed or injected JSON. Use `JSONObject` as the project convention requires (CLAUDE.md: use `com.arcadedb.serializer.json.JSONObject`) when building the `ExecutionResponse(400, …)` body.

**Medium: gzip decompression failure returns 500 not 400**

The `IllegalArgumentException` thrown on line 98 propagates to the framework as a 500 Internal Server Error, even though the client sent a malformed gzip body (a 400-level error). Catch it in `execute()` or return `ExecutionResponse(400, ...)` instead.

**Medium: Planning document committed to repo root**

`3821-timeseries-line-protocol-bugs.md` at the repo root is a work-notes file and should not be committed to the repository. Remove it before merging.

**Minor: Silent skip when type exists but is not a LocalTimeSeriesType**

When `inserted == 0` and `unknownTypes.isEmpty()`, the code returns 204 silently. This happens when samples match a schema type that is not a `LocalTimeSeriesType` (the `continue` on line 146). That branch also hides data loss - worth logging a warning there too, consistent with the unknown-type handling added in this PR.

**Pre-existing issue (not introduced by this PR)**

`rawPayload` is an instance field on a shared handler. Since Undertow handlers are typically singleton instances, concurrent requests could overwrite each other's `rawPayload`. Not introduced here but worth tracking.

Positive observations:
…on-timeseries types, remove planning doc

- Use `JSONObject` to build the error response body instead of string concatenation, preventing JSON injection from measurement names
- Log a warning when samples match types that exist but are not TimeSeries types (consistent with unknown-type handling)
- Remove work-tracking doc from repo root

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Addressed in f02ef8b:

- **JSON injection:** Fixed. Error response now uses `JSONObject.put().toString()` instead of string concatenation.
- **gzip 500 vs 400:** Pushback. `AbstractServerHttpHandler.handleRequest()` lines 202-206 catch `IllegalArgumentException` and map it to `sendErrorResponse(exchange, 400, ...)`. The `parseRequestPayload` call at line 155 is inside that same try block, so the `IllegalArgumentException` from a gzip decompression failure is already returned as 400.
- **Planning doc:** Removed from the repo.
- **Non-timeseries type silent skip:** Fixed. Added a `nonTimeSeriesTypes` tracking set with a `Level.WARNING` log, consistent with the unknown-type handling.
- **rawPayload thread safety:** Acknowledged as pre-existing. The parent class has the same pattern (`rawPayload` is an instance field on a singleton handler). Out of scope for this PR.
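The decompression step under discussion can be sketched with the JDK's `GZIPInputStream`. The class and method names below are hypothetical, not the handler's actual code; the point is the translation of an `IOException` from a malformed body into `IllegalArgumentException`, which the framework's existing catch block then maps to a 400:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class illustrating the gzip-body handling pattern.
public class GzipBodyDemo {
  // Decompress a request body; a malformed stream becomes
  // IllegalArgumentException (a client error), not a raw IOException.
  static byte[] decompress(final byte[] body) {
    try (final GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
      final ByteArrayOutputStream out = new ByteArrayOutputStream();
      final byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) > 0)
        out.write(buffer, 0, read);
      return out.toByteArray();
    } catch (final IOException e) {
      throw new IllegalArgumentException("Invalid gzip payload", e);
    }
  }

  // Helper for round-trip testing: compress a payload as a client would.
  static byte[] compress(final byte[] data) {
    final ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (final GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(data);
    } catch (final IOException e) {
      throw new IllegalStateException(e);
    }
    return out.toByteArray();
  }
}
```

Note that `GZIPInputStream`'s constructor already throws on a bad magic number, so a body that is not gzip at all fails fast inside the same try block.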
Good fixes overall - the gzip support and silent-data-loss prevention are both real improvements. A few issues worth addressing before merging:

**Bug: Silent data loss still possible for non-timeseries types**

The 400 error is only returned when `unknownTypes` is non-empty. The check

```java
if (inserted == 0 && !unknownTypes.isEmpty())
```

should cover both skip reasons:

```java
if (inserted == 0 && (!unknownTypes.isEmpty() || !nonTimeSeriesTypes.isEmpty()))
```

Otherwise a payload whose only measurement is a known-but-wrong type still returns 204 with no data written - the exact silent-data-loss problem this PR is fixing for unknown types. A test covering that path is also missing.

**Empty transaction committed before returning 400**

When all samples are skipped and a 400 is about to be returned, …

**Thread-safety of …**
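The combined condition the review asks for can be sketched in isolation. The class and variable names below mirror the review's wording but are hypothetical, not the handler's actual code:

```java
import java.util.Set;

// Hypothetical demo: decide the HTTP status from the insert count and the two
// skip-reason sets, so a payload whose only measurements are known but
// non-timeseries types also gets a 400 instead of a silent 204.
public class WriteStatusDemo {
  static int statusFor(final int inserted, final Set<String> unknownTypes,
      final Set<String> nonTimeSeriesTypes) {
    if (inserted == 0 && (!unknownTypes.isEmpty() || !nonTimeSeriesTypes.isEmpty()))
      return 400; // nothing written and at least one skip reason was recorded
    return 204;   // something was written, or the payload was simply empty
  }
}
```

An empty payload (no inserts, no skip reasons) still yields 204 here, matching the pre-existing "Request body is empty" handling upstream of this check.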
Codecov Report

❌ Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #3833      +/-   ##
==========================================
+ Coverage   64.73%   65.30%   +0.56%
==========================================
  Files        1579     1579
  Lines      116361   116399     +38
  Branches    24675    24679      +4
==========================================
+ Hits        75328    76011    +683
+ Misses      30751    30063    -688
- Partials    10282    10325     +43
```

☔ View full report in Codecov by Sentry.
Summary

Fixes #3821 (reported in discussion #3819)

- `PostTimeSeriesWriteHandler` now returns 400 with a descriptive error when all line protocol samples reference unknown measurement types, instead of silently returning 204. Partial success (some known, some unknown) still returns 204 but logs a warning.
- `Content-Encoding: gzip` request bodies are now decompressed before parsing. Telegraf's `[[outputs.influxdb]]` plugin sends gzip by default, which previously produced garbled "malformed line protocol" errors.
- Undocumented `CREATE TIMESERIES TYPE` syntax (`PRECISION NANOSECOND`, `COMPACTION INTERVAL`) is properly rejected with `CommandParsingException`.

Test plan

- `CreateTimeSeriesTypeStatementTest` - 8 tests (6 existing + 2 new negative grammar tests)
- `LineProtocolParserTest` - 15 existing tests (no regressions)
- `PostTimeSeriesWriteHandlerIT` - 5 tests (3 existing + 2 new: `unknownMeasurementTypeReturnsError`, `gzipCompressedBodyIsAccepted`)

🤖 Generated with Claude Code