fix: Fix the result disorder caused by multiprocessing queue. #12

Open
tripletpoi wants to merge 2 commits into main from fix-multiprocess-order

@tripletpoi
Collaborator

Fix misordered results caused by multiprocessing queue

Background

When running analyzers in parallel, results were collected from a multiprocessing.Queue.
However, Queue.get() retrieves items in the order they arrive, not in the order tasks were submitted.
This caused inconsistent or misordered results when multiple worker processes finished at different speeds.
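
The arrival-order behaviour can be reproduced with a minimal sketch like the one below. It is not code from this PR: `worker`, `drain`, and the delay values are assumptions chosen for illustration, with the delays staggered so that the first-submitted task tends to finish last.

```python
import multiprocessing as mp
import time


def worker(task_id, delay, queue):
    """Hypothetical worker: sleep for `delay` seconds, then report its task id."""
    time.sleep(delay)
    queue.put(task_id)


def drain(queue, n_items, timeout=30.0):
    """Return n_items from the queue in the order Queue.get() yields them."""
    return [queue.get(timeout=timeout) for _ in range(n_items)]


if __name__ == "__main__":
    delays = [0.6, 0.3, 0.0]  # task 0 is submitted first but sleeps longest
    queue = mp.Queue()
    procs = [
        mp.Process(target=worker, args=(task_id, delay, queue))
        for task_id, delay in enumerate(delays)
    ]
    for proc in procs:
        proc.start()
    # Drain before joining so no worker is left blocked on unflushed data.
    arrival_order = drain(queue, len(procs))
    for proc in procs:
        proc.join()
    print("submitted:", list(range(len(delays))))
    print("arrived:  ", arrival_order)  # typically differs from the submitted order
```

Nothing in the queue records which task produced an item, so the parent cannot recover the submission order from `get()` alone.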

What was happening

  • Each analyzer task was submitted with an implicit index
  • Worker processes may complete at different times
  • Results were pushed into the queue in a non‑deterministic order
  • The misordering showed up in cv.log, where replicas' CVs were not paired with the correct labels

This led to incorrect or unstable output in several analyzer modules.

What this PR does

  • Introduces explicit task indexing
  • Ensures each worker returns (task_id, result)
  • Collects results into a list and sorts by task_id before further processing
  • Guarantees deterministic ordering regardless of process execution timing
  • Applies consistent fixes across:
    • a_d.py
    • association.py
    • dissociation.py
    • ee.py
    • rmsd.py
    • superAnalyzer.py
    • target.py
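
The pattern in the list above might look roughly like the following sketch. It assumes a simplified worker; `run_task`, `collect_ordered`, and the string payloads are hypothetical names for illustration and are not taken from the analyzer modules.

```python
import multiprocessing as mp


def run_task(task_id, payload, queue):
    """Hypothetical worker: tag the result with the id of the task that produced it."""
    result = payload.upper()  # placeholder for the real analysis
    queue.put((task_id, result))


def collect_ordered(queue, n_tasks, timeout=30.0):
    """Collect (task_id, result) pairs and return the results in task_id order."""
    pairs = [queue.get(timeout=timeout) for _ in range(n_tasks)]
    # Sort on the id only, so results never need to be comparable with each other.
    pairs.sort(key=lambda pair: pair[0])
    return [result for _task_id, result in pairs]


if __name__ == "__main__":
    payloads = ["alpha", "beta", "gamma"]
    queue = mp.Queue()
    procs = [
        mp.Process(target=run_task, args=(task_id, payload, queue))
        for task_id, payload in enumerate(payloads)
    ]
    for proc in procs:
        proc.start()
    results = collect_ordered(queue, len(procs))  # drain before joining
    for proc in procs:
        proc.join()
    print(results)  # ['ALPHA', 'BETA', 'GAMMA']
```

Sorting with an explicit key rather than sorting the tuples directly avoids comparing results when two ids tie or when the result type (for example an array) does not support ordering.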

Testing

  • Inserted random time.sleep() calls inside calculate_cv() to simulate unpredictable worker timing
  • Under the old implementation, this reliably reproduced misordered results due to nondeterministic queue retrieval
  • With the fix applied, results remain correctly ordered even under randomized delays
  • Verified stable output across multiple runs
  • No changes to public API or user‑facing behavior
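
A check along these lines could be scripted as below. This is a sketch under assumptions, not the test used for this PR: the `calculate_cv` here is a stand-in with a simplified signature, and `gather`, `run_once`, and the result strings are hypothetical.

```python
import multiprocessing as mp
import random
import time


def calculate_cv(replica, queue, max_delay=0.2):
    """Stand-in for an analyzer's calculate_cv with a random delay injected."""
    rng = random.Random()  # fresh OS-seeded generator in each worker
    time.sleep(rng.uniform(0.0, max_delay))  # simulate unpredictable timing
    queue.put((replica, f"cv-of-replica-{replica}"))


def gather(queue, n_replicas, timeout=30.0):
    """Return the pairs as they arrived and the same pairs sorted by replica."""
    arrived = [queue.get(timeout=timeout) for _ in range(n_replicas)]
    ordered = sorted(arrived, key=lambda pair: pair[0])
    return arrived, ordered


def run_once(n_replicas):
    """Run one worker process per replica and gather their results."""
    queue = mp.Queue()
    procs = [
        mp.Process(target=calculate_cv, args=(replica, queue))
        for replica in range(n_replicas)
    ]
    for proc in procs:
        proc.start()
    arrived, ordered = gather(queue, n_replicas)
    for proc in procs:
        proc.join()
    return arrived, ordered


if __name__ == "__main__":
    n_replicas, n_runs = 4, 5
    expected = [(r, f"cv-of-replica-{r}") for r in range(n_replicas)]
    misordered = 0
    for _ in range(n_runs):
        arrived, ordered = run_once(n_replicas)
        assert ordered == expected  # sorted collection is stable on every run
        misordered += arrived != expected
    print(f"raw arrival order was wrong in {misordered} of {n_runs} runs")
```

The raw arrival order stands in for the old behaviour and the sorted list for the fixed one, so a single run exercises both.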

@Hase1534

Hase1534 commented Dec 13, 2025

In superAnalyzer.py, the current multiprocessing result collection looks incorrect and can still cause replica mix-ups when n_parallel > 1.

  • Each worker calls queue.put((replica, ret)), but the parent loop appends queue.get() directly into cv_arr and then sorts tmp (which stays empty), so the “sort by replica” never happens.

  • Update the abstract method signature of SuperAnalyzer.calculate_cv to match the actual call site (settings, cycle, replica, queue) to avoid future analyzers implementing the wrong signature.
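
One possible shape for both points is sketched below, under assumptions. Only `calculate_cv`, `cv_arr`, `tmp`, and the `(settings, cycle, replica, queue)` argument list come from the comment above; the `self` parameter, `collect_cvs`, and `EchoAnalyzer` are hypothetical, and the real class in superAnalyzer.py certainly contains more than this.

```python
import multiprocessing as mp
from abc import ABC, abstractmethod


class SuperAnalyzer(ABC):
    """Sketch only; not the actual class from superAnalyzer.py."""

    @abstractmethod
    def calculate_cv(self, settings, cycle, replica, queue):
        """Compute the CV for one replica and call queue.put((replica, ret))."""

    def collect_cvs(self, queue, n_replicas, timeout=30.0):
        """Hypothetical helper: gather results, then order them by replica."""
        tmp = []
        for _ in range(n_replicas):
            tmp.append(queue.get(timeout=timeout))  # fill tmp, not cv_arr
        tmp.sort(key=lambda pair: pair[0])  # the sort now sees every result
        cv_arr = [ret for _replica, ret in tmp]
        return cv_arr


class EchoAnalyzer(SuperAnalyzer):
    """Hypothetical analyzer used only to exercise the sketch."""

    def calculate_cv(self, settings, cycle, replica, queue):
        queue.put((replica, f"cycle{cycle}-replica{replica}"))


if __name__ == "__main__":
    analyzer = EchoAnalyzer()
    queue = mp.Queue()
    for replica in (2, 0, 1):  # emulate workers finishing out of order
        analyzer.calculate_cv(None, 0, replica, queue)
    print(analyzer.collect_cvs(queue, 3))
    # ['cycle0-replica0', 'cycle0-replica1', 'cycle0-replica2']
```

The abstract method documents the full argument list in one place. Note that `ABC` only forces subclasses to define a method named `calculate_cv`; it does not check the parameters, so the declared signature works as a contract for implementers and reviewers rather than a runtime guard.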



2 participants