c/engine: add litert_lm_engine_settings_set_max_num_images (C API parity for #1686) by DenisovAV · Pull Request #2157 · google-ai-edge/LiteRT-LM

DenisovAV · 2026-05-03T10:23:27Z

Summary

Adds the C API parity for `max_num_images` so multimodal vision models work via the C API path. Without this, vision input is silently dropped on the C API path even though the Kotlin/JVM path was fixed in #1686.

Closes #2156.

What changes

`c/engine.h` — new declaration:

```c
LITERT_LM_C_API_EXPORT
void litert_lm_engine_settings_set_max_num_images(
    LiteRtLmEngineSettings* settings, int max_num_images);
```

`c/engine.cc` — one-line impl that mirrors what `kotlin/.../jni/litertlm.cc:474` already does:

```cpp
settings->settings->GetMutableMainExecutorSettings().SetMaxNumImages(max_num_images);
```

with a `max_num_images > 0` guard matching the Kotlin `require` precondition.

`+21/-0` total. No new behavior on the runtime side — `SetMaxNumImages` already exists in the executor settings; this just exposes a setter to the C API surface.
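For readers without the diff open, the wrapper described above can be sketched as follows. This is a minimal sketch, not the PR's actual code. `ExecutorSettings`, `EngineSettings`, and the layout of `LiteRtLmEngineSettings` are hypothetical stand-ins that model only the call chain quoted above. Treating a null handle or a non-positive count as a silent no-op is also an assumption about how the guard behaves.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the real LiteRT-LM classes; only the call chain
// quoted in this PR (GetMutableMainExecutorSettings().SetMaxNumImages) is modeled.
class ExecutorSettings {
 public:
  void SetMaxNumImages(int n) { max_num_images_ = n; }
  int GetMaxNumImages() const { return max_num_images_; }

 private:
  int max_num_images_ = 0;  // 0 models the "vision disabled" default from #1686
};

class EngineSettings {
 public:
  ExecutorSettings& GetMutableMainExecutorSettings() { return main_; }

 private:
  ExecutorSettings main_;
};

// Assumed layout of the opaque C handle: it owns the C++ settings object.
struct LiteRtLmEngineSettings {
  std::unique_ptr<EngineSettings> settings;
};

extern "C" void litert_lm_engine_settings_set_max_num_images(
    LiteRtLmEngineSettings* settings, int max_num_images) {
  // Assumption: a null handle is ignored rather than crashing the caller.
  if (settings == nullptr || settings->settings == nullptr) return;
  // Guard from the PR description: only positive counts are forwarded.
  if (max_num_images <= 0) return;
  settings->settings->GetMutableMainExecutorSettings().SetMaxNumImages(
      max_num_images);
}
```

With this shape, a call with `max_num_images = 0` leaves any previously set value untouched instead of resetting it. Whether the real implementation behaves that way should be confirmed against the diff.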

Test plan

flutter_gemma has carried this exact addition as a downstream patch via `patch_c_api.sh` since v0.14.0; with this setter, multimodal Gemma 3 Nano (vision) works correctly on Android, iOS, and macOS through Native-Assets-bundled FFI.
Behavior parity verified against `kotlin/.../jni/litertlm.cc:474` — same field, same conditional, same downstream effect.
Upstream CI on this branch.

Cross-references

#1686, "JVM SDK: Vision encoder loads but max_num_images=0, model ignores images": the original bug (closed when Kotlin was fixed; this PR completes the parity for the C API).
#2156, "[Feature Request] C API parity for max_num_images - multimodal vision silently disabled on the C API path": the issue this PR closes.

If a Windows shared-lib build (#2154 / PR #2155) lands, the new entry will need to be added to `c/windows_exports.def`. Happy to follow up there if both land.

CLA

Will sign Individual CLA before merge if needed; flagging upfront.

Multimodal vision models require max_num_images > 0 to be set on the main executor settings before engine_create. Otherwise the vision tower stays uninitialized and image inputs are silently dropped, so the model hallucinates a response from text alone (the original bug in google-ai-edge#1686).

Kotlin EngineConfig.maxNumImages and the JNI's GetMutableMainExecutorSettings().SetMaxNumImages() call (added in the google-ai-edge#1686 fix) handle this on the JVM path. The C API in c/engine.h didn't get the parity setter, so any cross-language consumer still hits the original bug.

Add a thin wrapper that mirrors the JNI behavior:

    void litert_lm_engine_settings_set_max_num_images(
        LiteRtLmEngineSettings* settings, int max_num_images);

with a max_num_images > 0 guard on the impl side (matches the Kotlin `require(maxNumImages == null || maxNumImages > 0)` precondition).

Closes google-ai-edge#2156.
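The ordering requirement in the commit message (set max_num_images before the engine is created) is easy to get wrong from an FFI consumer, so here is a toy illustration of the failure mode. None of the names below are LiteRT-LM APIs. `ToySettings`, `ToyEngine`, and `CreateToyEngine` are hypothetical, and the model assumes only what the commit message states: the engine reads the value once at creation, and a value of 0 means image inputs are dropped.

```cpp
#include <cassert>

// Toy model of the failure mode; these are NOT LiteRT-LM types.
struct ToySettings {
  int max_num_images = 0;  // the default leaves vision off, as in #1686
};

struct ToyEngine {
  bool vision_enabled;

  // Returns how many of the supplied images the engine actually consumes.
  int AcceptedImages(int supplied) const {
    // With vision off, images are dropped without any error being reported.
    return vision_enabled ? supplied : 0;
  }
};

// Assumption: the engine snapshots the settings at creation time, so a
// change made afterwards does not reach an engine that already exists.
ToyEngine CreateToyEngine(const ToySettings& settings) {
  return ToyEngine{settings.max_num_images > 0};
}
```

In this model, an engine built from default settings accepts zero images even if the settings object is updated later; only an engine created after the update sees them. That is why the setter has to be called on the settings handle before engine creation.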

DenisovAV wants to merge 1 commit into `google-ai-edge:main` from `DenisovAV:feat/c-api-set-max-num-images`
