ALEC-295: Add demo module and router scenario integration tests #153
joseanesONMS wants to merge 3 commits into OpenNMS-Plugins:develop
Create a demo that:
- Populates realistic devices with a realistic topology into OpenNMS.
- Injects alarms that should generate situations.
- Allows us to show the ALEC functionality.
- Has cleanup scripts to remove the demo afterwards, so that we can run the whole demo again next time.

This should also benefit regression testing.
Pull request overview
Adds a new standalone demo module for provisioning demo topologies in OpenNMS, injecting alarms, and verifying ALEC situations, alongside new engine-level router scenario tests. It extends the repo with both live-system demo tooling and synthetic regression coverage for DBSCAN correlation behavior.
Changes:
- Registers a new Maven `demo` module with a CLI runner, REST client, state tracking, setup/cleanup flow, and runtime logging.
- Adds live integration tests in `demo` and engine-level router scenario tests/helpers under `engine/itest`.
- Adds supporting docs/config for running the demo and ignoring generated state.
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| `pom.xml` | Registers the new demo module in the reactor build. |
| `engine/itest/src/test/java/org/opennms/alec/engine/itest/RouterTopology.java` | Adds reusable synthetic router topology builders for engine tests. |
| `engine/itest/src/test/java/org/opennms/alec/engine/itest/RouterScenarioTest.java` | Adds DBSCAN scenario tests for single, chain, and star failures. |
| `engine/itest/src/test/java/org/opennms/alec/engine/itest/MibIIAlarms.java` | Adds helper methods for generating synthetic router alarm patterns. |
| `demo/src/test/java/org/opennms/alec/demo/regression/RegressionIT.java` | Adds live OpenNMS integration tests for demo scenarios. |
| `demo/src/main/resources/log4j2.xml` | Adds logging config for the demo module. |
| `demo/src/main/java/org/opennms/alec/demo/verify/SituationVerifier.java` | Adds polling/summary logic for ALEC situations. |
| `demo/src/main/java/org/opennms/alec/demo/topology/RouterTopologyBuilder.java` | Adds OpenNMS topology provisioning for single, chain, and star scenarios. |
| `demo/src/main/java/org/opennms/alec/demo/state/DemoState.java` | Adds persisted state tracking for cleanup. |
| `demo/src/main/java/org/opennms/alec/demo/inject/AlarmInjector.java` | Adds alarm/event injection helpers and alarm polling. |
| `demo/src/main/java/org/opennms/alec/demo/DemoRunner.java` | Adds the CLI entry point for setup/inject/verify/cleanup flows. |
| `demo/src/main/java/org/opennms/alec/demo/client/OpenNMSClient.java` | Adds the OpenNMS REST/v2 client used by the demo tooling. |
| `demo/src/main/java/org/opennms/alec/demo/client/model/SnmpInterfaceDef.java` | Adds SNMP interface request model/XML serialization. |
| `demo/src/main/java/org/opennms/alec/demo/client/model/NodeDef.java` | Adds requisition node request model/XML serialization. |
| `demo/src/main/java/org/opennms/alec/demo/client/model/Event.java` | Adds event payload model/factories for injected alarms. |
| `demo/src/main/java/org/opennms/alec/demo/client/model/AlarmList.java` | Adds REST alarm list deserialization model. |
| `demo/src/main/java/org/opennms/alec/demo/client/model/Alarm.java` | Adds REST alarm/situation deserialization model. |
| `demo/src/main/java/org/opennms/alec/demo/cleanup/DemoCleanup.java` | Adds cleanup flow for UDLs, nodes, requisitions, alarms, and state. |
| `demo/README.md` | Documents demo usage, scenarios, expectations, and test commands. |
| `demo/pom.xml` | Defines build, test, packaging, and runtime dependencies for the demo module. |
| `demo/FINDINGS.md` | Documents investigation notes about default ALEC clustering behavior. |
| `.gitignore` | Ignores the runtime-generated demo state file. |
    // 7. Remove state file
    try {
        state.delete(stateFile);
        LOG.info("Removed state file: {}", stateFile);

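The snippet above removes the state file as the last cleanup step. A follow-up commit on this PR describes keeping the file whenever any deletion fails, so that cleanup can be retried. A minimal sketch of that idea follows; `CleanupSketch`, `Step`, `runAll`, and `shouldDeleteStateFile` are hypothetical names for illustration and are not taken from `DemoCleanup`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these names are illustrative, not the PR's DemoCleanup API.
class CleanupSketch {

    /** A cleanup step that may fail, for example a REST DELETE call. */
    interface Step {
        void run() throws Exception;
    }

    /**
     * Runs every step, even if an earlier one fails, and returns the names
     * of the failed steps in the map's iteration order.
     */
    static List<String> runAll(Map<String, Step> steps) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Step> entry : steps.entrySet()) {
            try {
                entry.getValue().run();
            } catch (Exception e) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    /** The state file is only safe to delete when no step failed. */
    static boolean shouldDeleteStateFile(List<String> failed) {
        return failed.isEmpty();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the steps in insertion order.
        Map<String, Step> steps = new LinkedHashMap<>();
        steps.put("links", () -> { });
        steps.put("nodes", () -> { throw new IllegalStateException("simulated failure"); });
        List<String> failed = runAll(steps);
        System.out.println("failed=" + failed + ", deleteState=" + shouldDeleteStateFile(failed));
    }
}
```

Collecting failures instead of aborting on the first one lets the remaining resources still be removed, while the retained state file records what a retry has to revisit.
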
    LOG.info("Waiting for at least {} alarms (timeout: {}s)...", expectedCount, timeout.getSeconds());
    await().atMost(timeout.toMillis(), TimeUnit.MILLISECONDS)
            .pollInterval(3, TimeUnit.SECONDS)
            .until(() -> client.getAlarms().size() >= expectedCount);
    LOG.info("Found {} alarms", client.getAlarms().size());

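The wait above counts every alarm the client returns, so it can complete immediately on a system that already has unrelated alarms. A follow-up commit on this PR scopes the count to alarms whose `nodeId` is among the demo's tracked nodes. A minimal sketch of such a filter, assuming a simplified `Alarm` stand-in with a nullable `nodeId` field (the real model class and its accessors may differ; `ScopedAlarmCount` and `countDemoAlarms` are hypothetical names):

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a stand-in for the demo's alarm model and a scoped counter.
class ScopedAlarmCount {

    /** Simplified stand-in for the demo's Alarm model; only nodeId matters here. */
    static final class Alarm {
        final Integer nodeId; // null for alarms that are not tied to a node

        Alarm(Integer nodeId) {
            this.nodeId = nodeId;
        }
    }

    /** Counts only the alarms raised on nodes that the demo created. */
    static long countDemoAlarms(List<Alarm> alarms, Set<Integer> demoNodeIds) {
        return alarms.stream()
                .filter(a -> a.nodeId != null && demoNodeIds.contains(a.nodeId))
                .count();
    }
}
```

Under these assumptions, the polling condition would compare `countDemoAlarms(...)` with `expectedCount` instead of using the size of the full alarm list.
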
    public List<Alarm> waitForSituation(Duration timeout) {
        LOG.info("Waiting for at least 1 active situation (timeout: {}s)...", timeout.getSeconds());
        await().atMost(timeout.toMillis(), TimeUnit.MILLISECONDS)
                .pollInterval(5, TimeUnit.SECONDS)
                .until(() -> !client.getActiveSituations().isEmpty());
        List<Alarm> situations = client.getActiveSituations();

    LOG.info("Alarm injection complete. Waiting for alarms to be processed...");
    injector.waitForAlarms(3, Duration.ofMinutes(1));

    AlarmInjector injector = new AlarmInjector(client, state);
    injector.injectLinearChainFailure();
    injector.waitForAlarms(3, Duration.ofMinutes(1));

    AlarmInjector injector = new AlarmInjector(client, state);
    injector.injectStarFailure();
    injector.waitForAlarms(3, Duration.ofMinutes(1));

    private void saveState() {
        try {
            state.save(stateFile);
        } catch (IOException e) {
            LOG.warn("Failed to save state: {}", e.getMessage());

    List<InventoryObject> inventory = new ArrayList<>();
    for (String name : names) {
        addRouter(inventory, name, DEFAULT_INTERFACES_PER_ROUTER);
    }
    for (int i = 0; i < names.length; i++) {
        int next = (i + 1) % names.length;
        addLink(inventory, names[i], 1, names[next], 0);
    }

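The second loop above closes the ring by wrapping the index with a modulo, so the last router links back to the first. A minimal standalone sketch of the same wiring, assuming the numeric arguments to `addLink` are interface indices (the snippet does not confirm this); `RingLinks` and `ringLinks` are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: describes ring links as strings instead of inventory objects.
class RingLinks {

    /** Returns "a:1->b:0" style link descriptions for a ring over the given names. */
    static List<String> ringLinks(String... names) {
        List<String> links = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            // The modulo wraps the last router back to the first one.
            int next = (i + 1) % names.length;
            links.add(names[i] + ":1->" + names[next] + ":0");
        }
        return links;
    }
}
```

With fewer than three names this loop produces a self-link (one name) or two parallel links between the same pair (two names), so a size check may be worth considering.
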
    public void tearDown() {
        if (state != null) {
            try {
                new DemoCleanup(client, state, stateFile).cleanup();
            } catch (Exception e) {
                System.err.println("Cleanup failed: " + e.getMessage());
            }

Co-authored-by: Copilot Autofix powered by AI <[email protected]>
Demo scoping (so the demo is safe to run on a non-empty OpenNMS):
- AlarmInjector.waitForAlarms now counts only alarms whose nodeId is in DemoState's tracked nodes; previously the wait could complete immediately if the system already had >= expectedCount alarms from unrelated activity.
- SituationVerifier now takes the demo's node IDs and only counts situations that involve at least one of them (own nodeId or any related-alarm nodeId). Without this, an unrelated pre-existing situation could satisfy the wait.
- DemoCleanup now clears alarms only on the demo's own nodes; it no longer touches alarms outside the demo. Cleanup also tracks per-step failures: if any deletion fails, the state file is intentionally kept so cleanup can be retried instead of stranding demo resources.
- DemoRunner.doNuke now still runs its brute-force pass when the targeted DemoCleanup returns a partial-failure status (previously it short-circuited and never ran the brute-force branch).

Test expectations / waits:
- DemoRunner waits for the scenario-correct alarm count (3, 6, or 15) rather than a hard-coded 3, so verify doesn't race the rest of the scenario's alarms.
- RegressionIT does the same and passes the demo node-id set into SituationVerifier, so the assertion reflects the demo's situations rather than any pre-existing one.
- RegressionIT.tearDown no longer swallows cleanup failures; failing fast keeps the next test from running on a polluted instance.

Topology / API error handling:
- RouterTopologyBuilder.createLink no longer downgrades a UDL creation failure to a warning; UDL lookup retries with backoff and throws if the link cannot be located after creation, so the demo state accurately tracks every link for cleanup.
- saveState now throws on IOException; silently failing to persist state would let setup keep creating resources that cleanup has no record of.
- OpenNMSClient.createSnmpInterface no longer treats every HTTP 500 as "already exists"; it now only does so when the response body looks like a unique-constraint violation, otherwise it fails loudly.

engine/itest:
- RouterTopology class-level docstring updated to match the PEER_WEIGHT=40 actually used by the constants below it.
- New RouterScenarioTest.linkFailureInRingTopology exercises RouterTopology.ring(), which previously had no scenario coverage.
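The commit message says the UDL lookup retries with backoff and throws if the link cannot be located. A minimal sketch of that pattern, assuming a delay that doubles after each miss (the PR does not state the actual schedule); `RetryLookup`, `findWithBackoff`, and their parameters are hypothetical names:

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch: a generic lookup that retries with exponential backoff.
class RetryLookup {

    /**
     * Calls the lookup up to maxAttempts times, sleeping between misses and
     * doubling the delay each time. Throws if nothing is found.
     */
    static <T> T findWithBackoff(Supplier<Optional<T>> lookup, int maxAttempts, long initialDelayMs) {
        long delay = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> found = lookup.get();
            if (found.isPresent()) {
                return found.get();
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", e);
                }
                delay *= 2;
            }
        }
        throw new IllegalStateException("Not found after " + maxAttempts + " attempts");
    }
}
```

Throwing once the attempts are exhausted, rather than logging a warning, matches the stated goal that every created link is recorded for cleanup.
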
cgorantla left a comment
This looks fine.
I think RegressionIT needs a live OpenNMS instance to run its tests, so it may fail in our CI pipeline: https://app.circleci.com/pipelines/github/OpenNMS-Plugins/alec/1541/workflows/1e81cf0e-5ffd-43d6-b8e9-1c88920f56d3/jobs/8600/tests
Although the job didn't report test failures, there were test failures.
Ideally, we can gate the demo module's tests behind a custom Maven profile that doesn't run by default.
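The Maven profile suggested above is a build-level gate. As a different, complementary technique (not what the reviewer proposed), the tests could also check an opt-in flag in code. A minimal sketch, assuming a hypothetical system property named `demo.live`; `LiveTestGate` and `liveTestsEnabled` are illustrative names:

```java
// Hypothetical sketch: an opt-in guard for tests that need a live OpenNMS instance.
class LiveTestGate {

    /** Hypothetical property name; pass -Ddemo.live=true to opt in. */
    static final String PROPERTY = "demo.live";

    /** Returns true only when the property is set to "true" (case-insensitive). */
    static boolean liveTestsEnabled() {
        return Boolean.getBoolean(PROPERTY);
    }
}
```

A test setup method could feed this result into the test framework's assumption mechanism so that the live tests are skipped, rather than failed, when the flag is absent.
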
Summary
Adds a new `demo` module and supporting integration tests. This should also benefit regression testing.
Ticket: ALEC-295
Changes
- `demo` Maven module registered in the parent `pom.xml`
- `DemoRunner` with `setup`/`cleanup` subcommands; cleanup uses `demo-state.json` to remove exactly what setup created
- `RouterTopologyBuilder`, `AlarmInjector`, `SituationVerifier`, and an `OpenNMSClient` for REST interactions
- `engine/itest`: `MibIIAlarms`, `RouterScenarioTest`, `RouterTopology`
- `.gitignore` updated to exclude the runtime-generated `demo-state.json`

Test plan
- `mvn -pl demo -am verify` builds the new module
- `mvn -pl engine/itest verify` passes the new integration tests
- `DemoRunner setup` against a local OpenNMS, verify nodes/alarms/situations appear
- `DemoRunner cleanup` and confirm everything created by setup is removed