ALEC-295: Add demo module and router scenario integration tests #153

Open
joseanesONMS wants to merge 3 commits into OpenNMS-Plugins:develop from joseanesONMS:ja/jira/ALEC-295

joseanesONMS commented Apr 29, 2026

Summary

Adds a new demo module and supporting integration tests that:

  • Populate realistic devices with a realistic topology into OpenNMS.
  • Inject alarms that should generate situations.
  • Allow demonstrating ALEC functionality end-to-end.
  • Provide cleanup scripts to remove the demo afterwards, so the demo can be run again from a clean state.

This should also benefit regression testing.

Ticket: ALEC-295

Changes

  • New demo Maven module registered in the parent pom.xml
  • DemoRunner with setup / cleanup subcommands; cleanup uses demo-state.json to remove exactly what setup created
  • RouterTopologyBuilder, AlarmInjector, SituationVerifier, and an OpenNMSClient for REST interactions
  • New engine integration tests under engine/itest: MibIIAlarms, RouterScenarioTest, RouterTopology
  • .gitignore updated to exclude the runtime-generated demo-state.json
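
The state-driven cleanup described above can be illustrated with a minimal sketch. It is not the PR's DemoState: the class name, `recordNode`, and the `Predicate`-based delete callback are all hypothetical, and persistence to demo-state.json is left out. The point is only that cleanup iterates over what setup recorded and nothing else.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the PR's DemoState; names are assumptions.
class TrackedStateSketch {
    private final List<String> createdNodeIds = new ArrayList<>();

    // Setup would call this right after each successful create.
    void recordNode(String nodeId) {
        createdNodeIds.add(nodeId);
    }

    // Deletes only recorded IDs, newest first. IDs whose delete fails stay
    // in the state so that a later cleanup run can retry them.
    List<String> cleanup(Predicate<String> deleteNode) {
        List<String> remaining = new ArrayList<>();
        for (int i = createdNodeIds.size() - 1; i >= 0; i--) {
            String id = createdNodeIds.get(i);
            if (!deleteNode.test(id)) {
                remaining.add(id);
            }
        }
        createdNodeIds.clear();
        createdNodeIds.addAll(remaining);
        return new ArrayList<>(remaining);
    }
}
```

In this sketch the delete callback returns `true` on success; a real client call would go in its place.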

Test plan

  • mvn -pl demo -am verify builds the new module
  • mvn -pl engine/itest verify passes the new integration tests
  • Run DemoRunner setup against a local OpenNMS, verify nodes/alarms/situations appear
  • Run DemoRunner cleanup and confirm everything created by setup is removed

Create a demo that:

- Populates realistic devices with a realistic topology into OpenNMS.
- Injects alarms that should generate situations.
- Allows us to show the ALEC functionality.
- Has cleanup scripts to remove the demo afterwards, so that we can do
  the whole demo all over again next time.

This should also benefit regression testing.

Copilot AI left a comment

Pull request overview

Adds a new standalone demo module for provisioning demo topologies in OpenNMS, injecting alarms, and verifying ALEC situations, alongside new engine-level router scenario tests. It extends the repo with both live-system demo tooling and synthetic regression coverage for DBSCAN correlation behavior.

Changes:

  • Registers a new Maven demo module with a CLI runner, REST client, state tracking, setup/cleanup flow, and runtime logging.
  • Adds live integration tests in demo and engine-level router scenario tests/helpers under engine/itest.
  • Adds supporting docs/config for running the demo and ignoring generated state.

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 16 comments.

| File | Description |
| --- | --- |
| pom.xml | Registers the new demo module in the reactor build. |
| engine/itest/src/test/java/org/opennms/alec/engine/itest/RouterTopology.java | Adds reusable synthetic router topology builders for engine tests. |
| engine/itest/src/test/java/org/opennms/alec/engine/itest/RouterScenarioTest.java | Adds DBSCAN scenario tests for single, chain, and star failures. |
| engine/itest/src/test/java/org/opennms/alec/engine/itest/MibIIAlarms.java | Adds helper methods for generating synthetic router alarm patterns. |
| demo/src/test/java/org/opennms/alec/demo/regression/RegressionIT.java | Adds live OpenNMS integration tests for demo scenarios. |
| demo/src/main/resources/log4j2.xml | Adds logging config for the demo module. |
| demo/src/main/java/org/opennms/alec/demo/verify/SituationVerifier.java | Adds polling/summary logic for ALEC situations. |
| demo/src/main/java/org/opennms/alec/demo/topology/RouterTopologyBuilder.java | Adds OpenNMS topology provisioning for single, chain, and star scenarios. |
| demo/src/main/java/org/opennms/alec/demo/state/DemoState.java | Adds persisted state tracking for cleanup. |
| demo/src/main/java/org/opennms/alec/demo/inject/AlarmInjector.java | Adds alarm/event injection helpers and alarm polling. |
| demo/src/main/java/org/opennms/alec/demo/DemoRunner.java | Adds the CLI entry point for setup/inject/verify/cleanup flows. |
| demo/src/main/java/org/opennms/alec/demo/client/OpenNMSClient.java | Adds the OpenNMS REST/v2 client used by the demo tooling. |
| demo/src/main/java/org/opennms/alec/demo/client/model/SnmpInterfaceDef.java | Adds SNMP interface request model/XML serialization. |
| demo/src/main/java/org/opennms/alec/demo/client/model/NodeDef.java | Adds requisition node request model/XML serialization. |
| demo/src/main/java/org/opennms/alec/demo/client/model/Event.java | Adds event payload model/factories for injected alarms. |
| demo/src/main/java/org/opennms/alec/demo/client/model/AlarmList.java | Adds REST alarm list deserialization model. |
| demo/src/main/java/org/opennms/alec/demo/client/model/Alarm.java | Adds REST alarm/situation deserialization model. |
| demo/src/main/java/org/opennms/alec/demo/cleanup/DemoCleanup.java | Adds cleanup flow for UDLs, nodes, requisitions, alarms, and state. |
| demo/README.md | Documents demo usage, scenarios, expectations, and test commands. |
| demo/pom.xml | Defines build, test, packaging, and runtime dependencies for the demo module. |
| demo/FINDINGS.md | Documents investigation notes about default ALEC clustering behavior. |
| .gitignore | Ignores the runtime-generated demo state file. |

Comment thread demo/src/main/java/org/opennms/alec/demo/cleanup/DemoCleanup.java Outdated
Comment on lines +110 to +113
    // 7. Remove state file
    try {
        state.delete(stateFile);
        LOG.info("Removed state file: {}", stateFile);
Comment on lines +78 to +82
    LOG.info("Waiting for at least {} alarms (timeout: {}s)...", expectedCount, timeout.getSeconds());
    await().atMost(timeout.toMillis(), TimeUnit.MILLISECONDS)
            .pollInterval(3, TimeUnit.SECONDS)
            .until(() -> client.getAlarms().size() >= expectedCount);
    LOG.info("Found {} alarms", client.getAlarms().size());
Comment on lines +52 to +57
    public List<Alarm> waitForSituation(Duration timeout) {
        LOG.info("Waiting for at least 1 active situation (timeout: {}s)...", timeout.getSeconds());
        await().atMost(timeout.toMillis(), TimeUnit.MILLISECONDS)
                .pollInterval(5, TimeUnit.SECONDS)
                .until(() -> !client.getActiveSituations().isEmpty());
        List<Alarm> situations = client.getActiveSituations();
Comment on lines +154 to +155
    LOG.info("Alarm injection complete. Waiting for alarms to be processed...");
    injector.waitForAlarms(3, Duration.ofMinutes(1));

    AlarmInjector injector = new AlarmInjector(client, state);
    injector.injectLinearChainFailure();
    injector.waitForAlarms(3, Duration.ofMinutes(1));

    AlarmInjector injector = new AlarmInjector(client, state);
    injector.injectStarFailure();
    injector.waitForAlarms(3, Duration.ofMinutes(1));
Comment on lines +173 to +177
    private void saveState() {
        try {
            state.save(stateFile);
        } catch (IOException e) {
            LOG.warn("Failed to save state: {}", e.getMessage());
Comment on lines +121 to +128
    List<InventoryObject> inventory = new ArrayList<>();
    for (String name : names) {
        addRouter(inventory, name, DEFAULT_INTERFACES_PER_ROUTER);
    }
    for (int i = 0; i < names.length; i++) {
        int next = (i + 1) % names.length;
        addLink(inventory, names[i], 1, names[next], 0);
    }
Comment on lines +73 to +79
    public void tearDown() {
        if (state != null) {
            try {
                new DemoCleanup(client, state, stateFile).cleanup();
            } catch (Exception e) {
                System.err.println("Cleanup failed: " + e.getMessage());
            }
joseanesONMS and others added 2 commits May 4, 2026 15:00
Co-authored-by: Copilot Autofix powered by AI <[email protected]>
Demo scoping (so the demo is safe to run on a non-empty OpenNMS):
- AlarmInjector.waitForAlarms now counts only alarms whose nodeId is
  in DemoState's tracked nodes; previously the wait could complete
  immediately if the system already had >= expectedCount alarms from
  unrelated activity.
- SituationVerifier now takes the demo's node IDs and only counts
  situations that involve at least one of them (own nodeId or any
  related-alarm nodeId). Without this, an unrelated pre-existing
  situation could satisfy the wait.
- DemoCleanup now clears alarms only on the demo's own nodes; it no
  longer touches alarms outside the demo. Cleanup also tracks per-step
  failures: if any deletion fails, the state file is intentionally
  kept so cleanup can be retried instead of stranding demo resources.
- DemoRunner.doNuke now still runs its brute-force pass when the
  targeted DemoCleanup returns a partial-failure status (previously
  it short-circuited and never ran the brute-force branch).
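
A rough sketch of the scoping idea for the alarm wait: count only alarms whose node ID belongs to the demo, so that unrelated alarms cannot satisfy the wait. `AlarmStub` and both method names are hypothetical stand-ins and not the PR's Alarm model or AlarmInjector API.

```java
import java.util.List;
import java.util.Set;

class ScopedAlarmCountSketch {
    // Hypothetical stand-in for an alarm; nodeId may be null for alarms
    // that are not attached to a node.
    static final class AlarmStub {
        final int id;
        final Integer nodeId;

        AlarmStub(int id, Integer nodeId) {
            this.id = id;
            this.nodeId = nodeId;
        }
    }

    // Counts only the alarms raised on one of the demo's tracked nodes.
    static long countDemoAlarms(List<AlarmStub> alarms, Set<Integer> demoNodeIds) {
        return alarms.stream()
                .filter(a -> a.nodeId != null && demoNodeIds.contains(a.nodeId))
                .count();
    }

    // The condition a polling wait would evaluate on each poll.
    static boolean hasEnoughDemoAlarms(List<AlarmStub> alarms,
                                       Set<Integer> demoNodeIds,
                                       int expectedCount) {
        return countDemoAlarms(alarms, demoNodeIds) >= expectedCount;
    }
}
```

With this predicate, four pre-existing alarms on other nodes contribute nothing toward an expected count of three.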

Test expectations / waits:
- DemoRunner waits for the scenario-correct alarm count (3, 6, or 15)
  rather than a hard-coded 3, so verify doesn't race the rest of
  the scenario's alarms.
- RegressionIT does the same and passes the demo node-id set into
  SituationVerifier, so the assertion reflects the demo's situations
  rather than any pre-existing one.
- RegressionIT.tearDown no longer swallows cleanup failures —
  failing fast keeps the next test from running on a polluted instance.

Topology / API error handling:
- RouterTopologyBuilder.createLink no longer downgrades a UDL creation
  failure to a warning; UDL lookup retries with backoff and throws if
  the link cannot be located after creation, so the demo state
  accurately tracks every link for cleanup.
- saveState now throws on IOException; silently failing to persist
  state would let setup keep creating resources that cleanup has no
  record of.
- OpenNMSClient.createSnmpInterface no longer treats every HTTP 500 as
  "already exists" — it now only does so when the response body looks
  like a unique-constraint violation, otherwise it fails loudly.
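
For illustration only, the two behaviors described above might look roughly like this. The class, both method names, the doubling delay, and the matched substrings are assumptions; the PR's actual retry policy and the exact text OpenNMS returns in a 500 body are not shown in this conversation.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.function.Supplier;

class RetrySketch {
    // Calls lookup up to maxAttempts times, doubling the delay after each
    // miss, and throws if the value never appears.
    static <T> T retryWithBackoff(Supplier<Optional<T>> lookup,
                                  int maxAttempts,
                                  long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> found = lookup.get();
            if (found.isPresent()) {
                return found.get();
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while retrying", e);
                }
                delay *= 2;
            }
        }
        throw new IllegalStateException("Not found after " + maxAttempts + " attempts");
    }

    // Heuristic (assumed substrings): treat an HTTP 500 as "already exists"
    // only when the body looks like a unique-constraint violation.
    static boolean looksLikeUniqueViolation(int status, String body) {
        if (status != 500 || body == null) {
            return false;
        }
        String lower = body.toLowerCase(Locale.ROOT);
        return lower.contains("unique constraint") || lower.contains("duplicate key");
    }
}
```

Throwing after the last attempt, rather than logging and continuing, is what keeps the recorded state in step with what was actually created.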

engine/itest:
- RouterTopology class-level docstring updated to match the
  PEER_WEIGHT=40 actually used by the constants below it.
- New RouterScenarioTest.linkFailureInRingTopology exercises
  RouterTopology.ring(), which previously had no scenario coverage.
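
As a simplified illustration of how a ring is closed, the sketch below mirrors the modulo loop quoted in the review thread: interface 1 of each router links to interface 0 of the next, and the last router wraps around to the first. `RingLinksSketch` and its string output are hypothetical; the real builder adds inventory objects rather than strings.

```java
import java.util.ArrayList;
import java.util.List;

class RingLinksSketch {
    // Returns one "from:1 -> to:0" entry per router. The modulo makes the
    // last router link back to the first, which closes the ring.
    static List<String> ringLinks(String[] names) {
        List<String> links = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            int next = (i + 1) % names.length;
            links.add(names[i] + ":1 -> " + names[next] + ":0");
        }
        return links;
    }
}
```

Note that a single-element array yields a self-link in this sketch, so a caller would likely want to require at least three routers.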
Contributor

cgorantla left a comment

This looks fine.

I think RegressionIT needs a live OpenNMS instance to run its tests, so it may fail in our CI pipeline: https://app.circleci.com/pipelines/github/OpenNMS-Plugins/alec/1541/workflows/1e81cf0e-5ffd-43d6-b8e9-1c88920f56d3/jobs/8600/tests

The job did not report any test failures, but some tests did fail.

Ideally, we would gate the demo module's tests behind a custom Maven profile that does not run by default.
