Skip to content

Revamp of the I3Calorimetry extractor. #866

Open
Aske-Rosted wants to merge 11 commits intographnet-team:mainfrom
Aske-Rosted:revert_calo
Open

Revamp of the I3Calorimetry extractor. #866
Aske-Rosted wants to merge 11 commits intographnet-team:mainfrom
Aske-Rosted:revert_calo

Conversation

@Aske-Rosted
Copy link
Copy Markdown
Collaborator

The changes made to the Calorimetric introduced by #819 were developed using low energy events. While these changes did improve the correctness of the labels they unfortunately were not written in a way that made them suitable for high energy files.

This PR Rethinks the approach by having adding a hierarchy in the energy counting, that is tracks energies take priority over cascade energies.

If a track is found the deposit energy inside the detector volume it is recorded and then the particle along with the sub-tree is removed by using the frame[self.mctree].erase(track.id). The cascades are then counted by looking at only the leaf-nodes in the mctree which ensures that we are not double inside the cascades themselves and that we are only cascades terminating inside the detector volume.

This should ensure that we are eliminating both of the double counting scenarios explained in #816.

@Aske-Rosted Aske-Rosted requested a review from sevmag March 17, 2026 08:36
@sevmag
Copy link
Copy Markdown
Collaborator

sevmag commented Apr 20, 2026

Hi @Aske-Rosted, I'm currently reviewing the changes. Just a small question. What do you mean by

were not written in a way that made them suitable for high energy files.

Are you talking about the correctness of the labels or performance issues?

@sevmag
Copy link
Copy Markdown
Collaborator

sevmag commented Apr 20, 2026

I have some concerns regarding whether the current track handling produces the intended outcome. 🧐

Hypothetical Scenario 🧪

Consider a NuTau CC interaction that creates a tau. The tau enters the detector volume and decays into a muon mid-detector, which then exits the volume.

This results in a track_list like this:

track_list = [tau, mu]

In this case, the mu is a child in the tau's subtree.

When iterating through the list as seen here, the result seems highly dependent on the ordering of track_list:


Scenario 1: The tau is processed first 📍

In this case, we only iterate over the tau. Because the muon is in the tau's subtree, it is erased after the tau is processed.

  • e_deposited: Only includes energy from the tau (entry to decay). It misses the energy deposited by the muon. ❌
  • e_entrance: Correctly identifies the tau energy upon entering the detector. ✅

Scenario 2: The muon is processed first 📍

Here, we iterate over the muon first, then the tau.

  • e_deposited: Correct; it sums the energy of both the muon and the tau. ✅
  • e_entrance: Incorrect; the entrance energy of both particles is counted, even though only the tau should be considered the primary entering particle. ❌

Since the behavior changes based on list order, we might need to implement a more robust check for parent-child relationships before processing. What do you think?

Copy link
Copy Markdown
Collaborator

@sevmag sevmag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the direction this is going, and I think we can increase its speed quite a bit by leveraging the tree-like structure. However, I do still have some concerns about the handling of the track particles (see my comment). Let me know if there are any questions!

primary_energy = sum(
[
p.energy
for p in self.check_primary_energy(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You use the check_primary_energy function from the I3Extractor. Could you explain in which scenarios the energy of the primary particle is Nan? The remedy in these scenarios is to take the daughters, are we entirely sure that the daughters always have a non NaN energy? Might be worth refining the docstring for this function in

since I actually don't know why

p.energy
for p in self.check_primary_energy(
frame,
self.get_primaries(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.get_primaries is called at three locations. Maybe lets call it once at the beginning of call and save the primary variable and pass it to total_track_energy and total_cascade_energy

"""Get the total energy of track particles on entrance."""
e_entrance = 0
e_deposited = 0
primaries = self.get_primaries(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above for get_primaries usage

) -> float:
"""Get the total energy of cascade particles on entrance."""
particles = deque(
self.get_primaries(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above for get_primaries usage

# Sanity check ensuring no double counting
if self.daughters:
assert e_entrance <= sum(
[p.energy for p in primaries]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sum of primary energies has already been calculated on line 70. Reuse this value for consistency.

assert e_entrance <= sum(
[p.energy for p in primaries]
), "Energy on entrance is greater than primary energy"
assert e_deposited <= sum(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sum of primary energies has already been calculated on line 70. Reuse this value for consistency.

except RuntimeError as e:
if "particle not found" in str(e):
# log warning with event header
self.warning(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can it be that the primary particle is not found anymore? Are we sure it is safe to keep this as a warning?


return e_cascade, e_dep_track, e_ent_track
# Check if the track actually enters the volume
if not (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment on why this check also works for starting events. (Then the intersection.first is negative, correct?)

pos = pos + direc * length
except RuntimeError as e:
if (
"sum of losses is smaller than "
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure what the reason for this MuonGun error is, but I think we should not just ignore it and throw a warning, but rather set the energies to NaN to make sure that we are not just using a corrupt ground truth that might not resemble reality at all. What do you think @Aske-Rosted ?

)

return self.hull.point_in_hull(pos)
if len(particles) == 0:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scenario should never happen, should it? Maybe we should throw an error here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants