
fix: InfluxDB line protocol silent data loss and missing gzip support#3833

Open
robfrank wants to merge 2 commits into main from fix/3821-timeseries-line-protocol-bugs

Conversation

@robfrank
Collaborator

Summary

Fixes #3821 (reported in discussion #3819)

  • Silent data loss: PostTimeSeriesWriteHandler now returns 400 with a descriptive error when all line protocol samples reference unknown measurement types, instead of silently returning 204. Partial success (some known, some unknown) still returns 204 but logs a warning.
  • Gzip support: The write handler now decompresses Content-Encoding: gzip request bodies before parsing. Telegraf's [[outputs.influxdb]] plugin sends gzip by default, which previously produced garbled "malformed line protocol" errors.
  • Grammar negative tests: Added regression tests confirming that undocumented CREATE TIMESERIES TYPE syntax (PRECISION NANOSECOND, COMPACTION INTERVAL) is properly rejected with CommandParsingException.
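
The gzip handling described in the second bullet can be sketched as follows. This is a minimal standalone illustration, not the actual `PostTimeSeriesWriteHandler` code; the `readBody` helper and its signature are assumptions for the example:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBodyDemo {
  // Decompress the request body only when Content-Encoding is gzip;
  // otherwise treat the raw bytes as the line protocol payload.
  static String readBody(final byte[] raw, final String contentEncoding) throws IOException {
    if ("gzip".equalsIgnoreCase(contentEncoding)) {
      try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(raw))) {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
      }
    }
    return new String(raw, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    final String line = "cpu,host=web01 usage=0.64 1700000000000000000";
    // Compress the way Telegraf's gzip output would.
    final ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
      out.write(line.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println(readBody(buf.toByteArray(), "gzip").equals(line)); // true
  }
}
```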

Test plan

  • CreateTimeSeriesTypeStatementTest - 8 tests (6 existing + 2 new negative grammar tests)
  • LineProtocolParserTest - 15 existing tests (no regressions)
  • PostTimeSeriesWriteHandlerIT - 5 tests (3 existing + 2 new: unknownMeasurementTypeReturnsError, gzipCompressedBodyIsAccepted)

🤖 Generated with Claude Code

…#3821)

Return 400 instead of silent 204 when line protocol samples reference
unknown timeseries types, and decompress Content-Encoding: gzip bodies
in the write handler. Adds negative grammar tests for undocumented
CREATE TIMESERIES TYPE syntax (PRECISION, COMPACTION INTERVAL).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request fixes critical bugs in the TimeSeries InfluxDB Line Protocol, including silent data loss on unknown measurement types and missing gzip decompression support. It also adds grammar validation tests for the CREATE TIMESERIES TYPE statement. Review feedback identifies a race condition in the asynchronous payload handling and suggests using JSONObject for safer error response generation.

Comment on lines +78 to +91
    final AtomicReference<byte[]> bytesRef = new AtomicReference<>();
    e.getRequestReceiver().receiveFullBytes(
        (exchange, data) -> bytesRef.set(data),
        (exchange, err) -> {
          LogManager.instance().log(this, Level.SEVERE, "receiveFullBytes completed with an error: %s", err, err.getMessage());
          exchange.setStatusCode(StatusCodes.INTERNAL_SERVER_ERROR);
          exchange.getResponseSender().send("Invalid Request");
        });

    final byte[] rawBytes = bytesRef.get();
    if (rawBytes == null) {
      rawPayload = null;
      return null;
    }
Contributor


Severity: high

The use of receiveFullBytes is asynchronous and does not block the calling thread. Since bytesRef.get() is called immediately after, it will likely return null because the request body has not been fully read yet, leading to an incorrect "Request body is empty" error. Since this handler runs on a worker thread and has called e.startBlocking(), you should use the blocking getInputStream() to read the payload.

    final byte[] rawBytes;
    try {
      rawBytes = e.getInputStream().readAllBytes();
    } catch (final IOException ex) {
      LogManager.instance().log(this, Level.SEVERE, "Error reading request payload", ex);
      return null;
    }

    if (rawBytes.length == 0) {
      rawPayload = null;
      return null;
    }

Collaborator Author


Pushback: AbstractServerHttpHandler.parseRequestPayload (lines 63-73) uses this exact same pattern - AtomicReference + receiveFullBytes + .get(). When blocking mode is started and the handler runs on a worker thread, receiveFullBytes completes synchronously before returning. This is the established pattern across all handlers in the codebase.

Comment on lines +180 to +182
    return new ExecutionResponse(400,
        "{ \"error\" : \"Unknown timeseries type(s): " + String.join(", ", unknownTypes)
        + ". Create the type first with CREATE TIMESERIES TYPE.\"}");
Contributor


Severity: medium

Manually constructing JSON strings is error-prone and can lead to invalid JSON if any measurement names contain characters that require escaping (like double quotes). It is safer to use the JSONObject class to build the response body.

Suggested change

Before:

    return new ExecutionResponse(400,
        "{ \"error\" : \"Unknown timeseries type(s): " + String.join(", ", unknownTypes)
        + ". Create the type first with CREATE TIMESERIES TYPE.\"}");

After:

    return new ExecutionResponse(400,
        new JSONObject().put("error", "Unknown timeseries type(s): " + String.join(", ", unknownTypes)
            + ". Create the type first with CREATE TIMESERIES TYPE.").toString());

Collaborator Author


Fixed in f02ef8b. Switched to new JSONObject().put("error", ...).toString() to prevent JSON injection from measurement names containing quotes.

@robfrank robfrank added this to the 26.4.1 milestone Apr 12, 2026
@codacy-production

codacy-production bot commented Apr 12, 2026

Up to standards ✅

🟢 Issues: 0 new issues

🟢 Metrics: complexity 10

🟢 Coverage: 67.57% diff coverage · -7.84% coverage variation

Coverage variation details

    Commit                            Coverable lines   Covered lines    Coverage
    Common ancestor commit (598496b)  116361            85639            73.60%
    Head commit (f02ef8b)             147613 (+31252)   97073 (+11434)   65.76% (-7.84%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

    Pull request (#3833): 37 coverable lines, 25 covered, 67.57% diff coverage

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%


@claude

claude bot commented Apr 12, 2026

Code Review

Good fixes overall - the silent data loss and gzip issues are real bugs worth addressing. A few concerns to address before merging.

Critical: JSON injection in error response

PostTimeSeriesWriteHandler.java line 181 builds JSON via string concatenation using measurement names that come directly from the InfluxDB line protocol request body. A name containing a quote character or backslash will produce malformed or injected JSON.

Use JSONObject as the project convention requires (CLAUDE.md: use com.arcadedb.serializer.json.JSONObject):

    return new ExecutionResponse(400,
        new JSONObject().put("error",
            "Unknown timeseries type(s): " + String.join(", ", unknownTypes)
            + ". Create the type first with CREATE TIMESERIES TYPE.").toString());

Medium: gzip decompression failure returns 500 not 400

The IllegalArgumentException thrown on line 98 propagates to the framework as a 500 Internal Server Error, even though the client sent a malformed gzip body (a 400-level error). Catch it in execute() or return ExecutionResponse(400, ...) instead.
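
The suggested status mapping can be sketched as a standalone example. `Response` and `handle` here are hypothetical stand-ins for `ExecutionResponse` and the handler's `execute()`; the point is only that a malformed gzip body surfaces as a 400, not a 500:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public class GzipErrorMapping {
  // Hypothetical stand-in for the handler's ExecutionResponse.
  record Response(int status, String body) {}

  static byte[] decompress(final byte[] body) {
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
      return in.readAllBytes();
    } catch (final IOException e) {
      // A malformed gzip stream is a client error, not a server fault.
      throw new IllegalArgumentException("Invalid gzip payload: " + e.getMessage(), e);
    }
  }

  static Response handle(final byte[] body) {
    try {
      decompress(body); // a real handler would parse the line protocol here
      return new Response(204, "");
    } catch (final IllegalArgumentException e) {
      // Map the client-side failure to 400 instead of letting it bubble up as 500.
      return new Response(400, "{\"error\":\"" + e.getMessage() + "\"}");
    }
  }

  public static void main(String[] args) {
    System.out.println(handle("not gzip at all".getBytes()).status()); // 400
  }
}
```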

Medium: Planning document committed to repo root

3821-timeseries-line-protocol-bugs.md at the repo root is a work-notes file and should not be committed to the repository. Remove it before merging.

Minor: Silent skip when type exists but is not a LocalTimeSeriesType

When inserted == 0 and unknownTypes.isEmpty(), the code returns 204 silently. This happens when samples match a schema type that is not a LocalTimeSeriesType (the continue on line 146). That branch also hides data loss - worth logging a warning there too, consistent with the unknown-type handling added in this PR.

Pre-existing issue (not introduced by this PR): rawPayload is an instance field on a shared handler. Since Undertow handlers are typically singleton instances, concurrent requests could overwrite each other's rawPayload. Not introduced here but worth tracking.

Positive observations:

  • Gzip handling correctly checks Content-Encoding (not Content-Type), matching RFC 7231.
  • LinkedHashSet for unknownTypes preserves insertion order for deterministic error messages.
  • The unknownMeasurementTypeReturnsError and gzipCompressedBodyIsAccepted integration tests are well-structured.
  • The negative grammar tests are a good regression net for parser-level contract validation.

…on-timeseries types, remove planning doc

- Use JSONObject to build error response body instead of string
  concatenation, preventing JSON injection from measurement names
- Log a warning when samples match types that exist but are not
  TimeSeries types (consistent with unknown-type handling)
- Remove work-tracking doc from repo root

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@robfrank
Collaborator Author

robfrank commented Apr 12, 2026

Addressed in f02ef8b:

JSON injection: Fixed. Error response now uses JSONObject.put().toString() instead of string concatenation.

gzip 500 vs 400: Pushback. AbstractServerHttpHandler.handleRequest() line 202-206 catches IllegalArgumentException and maps it to sendErrorResponse(exchange, 400, ...). The parseRequestPayload call at line 155 is inside that same try block, so the IllegalArgumentException from gzip decompression failure is already returned as 400.

Planning doc: Removed from the repo.

Non-timeseries type silent skip: Fixed. Added nonTimeSeriesTypes tracking set with a Level.WARNING log, consistent with the unknown-type handling.

rawPayload thread safety: Acknowledged as pre-existing. The parent class has the same pattern (rawPayload is an instance field on a singleton handler). Out of scope for this PR.

@claude

claude bot commented Apr 12, 2026

Good fixes overall - the gzip support and silent-data-loss prevention are both real improvements. A few issues worth addressing before merging:

Bug: Silent data loss still possible for non-timeseries types

The 400 error is only returned when unknownTypes is non-empty, but a measurement that exists as a non-timeseries type also silently drops data when it is the only kind of sample. The condition on line 182:

    if (inserted == 0 && !unknownTypes.isEmpty())

should cover both skip reasons:

    if (inserted == 0 && (!unknownTypes.isEmpty() || !nonTimeSeriesTypes.isEmpty()))

Otherwise a payload whose only measurement is a known-but-wrong type still returns 204 with no data written - the exact silent-data-loss problem this PR is fixing for unknown types. A test covering that path is also missing.

Empty transaction committed before returning 400

When all samples are skipped and a 400 is about to be returned, database.commit() has already been called (line 176) on an empty transaction. It is harmless, but wasteful and slightly misleading. Consider moving the commit inside the inserted > 0 branch or rolling back when there is nothing to commit.

Thread-safety of rawPayload instance field

rawPayload is a mutable instance field on what appears to be a singleton handler. Under concurrent requests this is a data race: one request can overwrite another's payload before execute() reads it. This pre-dates the PR, but the PR's new gzip branch depends on this field, so the risk surface is slightly larger now. The cleanest fix is to pass the decompressed string back through the return value and have execute() receive it via the payload parameter (or a ThreadLocal). At minimum it is worth opening a follow-up issue.
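
The race-free alternative can be sketched as follows. This is a minimal illustration of the return-value pattern, not the actual handler API; `ParsedPayload`, `parseRequestPayload`, and `execute` are named for the example only:

```java
import java.nio.charset.StandardCharsets;

public class StatelessPayloadDemo {
  record ParsedPayload(String text) {}

  // Return the payload instead of stashing it in a shared instance field,
  // so concurrent requests can never overwrite each other's data.
  static ParsedPayload parseRequestPayload(final byte[] raw) {
    if (raw == null || raw.length == 0)
      return null;
    return new ParsedPayload(new String(raw, StandardCharsets.UTF_8));
  }

  static int execute(final byte[] raw) {
    // Local, per-request state: no field shared across worker threads.
    final ParsedPayload payload = parseRequestPayload(raw);
    return payload == null ? 400 : 204;
  }

  public static void main(String[] args) {
    System.out.println(execute("cpu value=1".getBytes(StandardCharsets.UTF_8))); // 204
  }
}
```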

Partial-success behaviour is underdocumented

The PR description says "Partial success (some known, some unknown) still returns 204 but logs a warning." That is a policy decision - InfluxDB itself returns 204 with a X-Influxdb-Error header for partial writes. Worth noting in the handler's Javadoc so future maintainers understand why the mixed case was left as a warning rather than an error.

Minor: gzip.finish() vs try-with-resources in the test

In gzipCompressedBodyIsAccepted the test calls gzip.finish() inside the try block but the stream is also closed by the try-with-resources - finish() is called twice. It works but is redundant; removing the explicit gzip.finish() call would clean it up.
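
The cleaned-up compression helper the comment suggests might look like this; it is a sketch of the general pattern, not the test's actual code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class GzipTestBodyDemo {
  // try-with-resources is enough: GZIPOutputStream.close() calls finish()
  // before closing, so no explicit finish() is needed.
  static byte[] gzip(final byte[] data) throws IOException {
    final ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
      out.write(data);
    }
    return buf.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    final byte[] compressed = gzip("cpu value=1".getBytes());
    // Gzip magic bytes 0x1f 0x8b confirm a complete, finished stream.
    System.out.println((compressed[0] & 0xff) == 0x1f && (compressed[1] & 0xff) == 0x8b); // true
  }
}
```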


The grammar regression tests are a nice addition - testing that undocumented syntax is rejected clearly is useful. The overall direction of the PR is correct.

@codecov

codecov bot commented Apr 12, 2026

Codecov Report

❌ Patch coverage is 48.64865% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.30%. Comparing base (598496b) to head (f02ef8b).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...erver/http/handler/PostTimeSeriesWriteHandler.java 48.64% 12 Missing and 7 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3833      +/-   ##
==========================================
+ Coverage   64.73%   65.30%   +0.56%     
==========================================
  Files        1579     1579              
  Lines      116361   116399      +38     
  Branches    24675    24679       +4     
==========================================
+ Hits        75328    76011     +683     
+ Misses      30751    30063     -688     
- Partials    10282    10325      +43     




Development

Successfully merging this pull request may close these issues.

TimeSeries InfluxDB Line Protocol: silent data loss, no gzip support, doc/grammar mismatch

1 participant