refactor(destination): implement shared-filtering #15166

Open
adleong wants to merge 15 commits into main from alex/shared-filtering
@adleong commented Apr 13, 2026

In our testing and in reports from users, we have found that destination controller memory use scales linearly with the total number of concurrent subscribers that it serves. In one load test, the scaling rate was approximately 330 KiB per subscriber, so the destination controller uses very large amounts of memory when there are many subscribers. A cluster with 10,000 subscribers might expect the destination controller to use 3.4 GiB of memory.

Memory profiling revealed that the majority of this memory is consumed by the per-subscriber endpoint translator, specifically by the endpoint snapshot that it holds in order to calculate filtering and the diffs to send to the client.

We reduce the memory scaling factor by noticing that this filtering and diff state can be shared between multiple subscribers if they have the same filtering conditions. In other words, subscribers only need their own state if they have different node-local or zone-local filtering. Therefore, we can group subscribers by a filter key, where all subscribers with the same filter key share state.
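The grouping idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `filterKey` fields, the `subscriber` type, and `groupByFilterKey` are hypothetical names chosen for the example, assuming that node name and zone are what distinguish one subscriber's filtering from another's.

```go
package main

import "fmt"

// filterKey is a hypothetical key: two subscribers with equal keys see the
// same filtered endpoint set and can therefore share one snapshot.
type filterKey struct {
	nodeName string // assumed input to node-local filtering
	zone     string // assumed input to zone-local filtering
}

// subscriber is a stand-in for a single destination stream.
type subscriber struct {
	id  string
	key filterKey
}

// groupByFilterKey buckets subscribers so that per-key state is held once
// per bucket instead of once per subscriber.
func groupByFilterKey(subs []subscriber) map[filterKey][]subscriber {
	groups := make(map[filterKey][]subscriber)
	for _, s := range subs {
		groups[s.key] = append(groups[s.key], s)
	}
	return groups
}

func main() {
	subs := []subscriber{
		{id: "a", key: filterKey{nodeName: "node-1", zone: "east"}},
		{id: "b", key: filterKey{nodeName: "node-1", zone: "east"}},
		{id: "c", key: filterKey{nodeName: "node-2", zone: "west"}},
	}
	groups := groupByFilterKey(subs)
	fmt.Printf("%d subscribers share %d snapshots\n", len(subs), len(groups))
}
```

Because the key is a comparable struct, it can be used directly as a Go map key. The memory saving then depends on how many distinct keys exist relative to the number of subscribers.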

To implement this, we break the unwieldy endpoints_watcher.go into a number of different files, aligned around the major types defined in this file:

  • endpoints_watcher.go: the top level EndpointsWatcher type
  • service_publisher.go: the servicePublisher type, per service
  • port_publisher.go: the portPublisher type, per service port
  • filtered_listener_group.go: a new filteredListenerGroup type, per filter key

We move the filtering and diffing logic out of endpointTranslator and into filteredListenerGroup so that it can be shared. In this way, endpointTranslator becomes a much simpler translation layer.
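As a rough sketch of what sharing the diff state could look like, the example below keeps one snapshot per group, computes the diff once, and fans it out to every listener. All names here (`snapshot`, `listener`, `group`, `publish`, `recorder`) are assumptions made for illustration and are not taken from the PR.

```go
package main

import "fmt"

// snapshot maps an endpoint ID to its address (a simplified stand-in for a
// real address set).
type snapshot map[string]string

// listener receives diffs; it only translates and forwards them.
type listener interface {
	Add(added snapshot)
	Remove(removed []string)
}

// group holds ONE snapshot for every listener that shares a filter key.
type group struct {
	current   snapshot
	listeners []listener
}

// publish diffs next against the shared snapshot once, then fans the result
// out to every listener in the group.
func (g *group) publish(next snapshot) {
	added := snapshot{}
	for id, addr := range next {
		if old, ok := g.current[id]; !ok || old != addr {
			added[id] = addr
		}
	}
	var removed []string
	for id := range g.current {
		if _, ok := next[id]; !ok {
			removed = append(removed, id)
		}
	}
	g.current = next
	for _, l := range g.listeners {
		if len(added) > 0 {
			l.Add(added)
		}
		if len(removed) > 0 {
			l.Remove(removed)
		}
	}
}

// recorder is a toy listener that counts what it was sent.
type recorder struct{ adds, removes int }

func (r *recorder) Add(added snapshot)      { r.adds += len(added) }
func (r *recorder) Remove(removed []string) { r.removes += len(removed) }

func main() {
	a, b := &recorder{}, &recorder{}
	g := &group{current: snapshot{}, listeners: []listener{a, b}}
	g.publish(snapshot{"pod-1": "10.0.0.1", "pod-2": "10.0.0.2"})
	g.publish(snapshot{"pod-2": "10.0.0.2", "pod-3": "10.0.0.3"})
	fmt.Println(a.adds, a.removes, b.adds, b.removes) // prints "3 1 3 1"
}
```

In this sketch the listeners hold no snapshot of their own, which is the property that would make per-subscriber memory small.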

With these changes, our load test measures up to an 84% reduction in destination controller memory use, depending on the distribution of subscriber filter keys.

adleong added 15 commits March 11, 2026 19:34
Signed-off-by: Alex Leong <alex@buoyant.io>
@adleong changed the title from "[WIP] refactor(destination): implement shared-filtering" to "refactor(destination): implement shared-filtering" Apr 16, 2026
@adleong marked this pull request as ready for review April 16, 2026 22:28
@adleong requested a review from a team as a code owner April 16, 2026 22:28
@alpeb left a comment

Great changes, the Destination code definitely looks less entangled now 👍
Do you think it would be useful to expose metrics about the different filteredListenerGroups, or is that too much of an implementation detail?

Comment on lines +48 to +55
for k, v := range addr.Addresses {
	addresses[k] = v
}

labels := make(map[string]string)
for k, v := range addr.Labels {
	labels[k] = v
}

This is old code, but a good opportunity to modernize these idioms with maps.Copy()
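For reference, a minimal sketch of the suggested idiom: `maps.Copy(dst, src)` from the standard library `maps` package (Go 1.21+) replaces the manual key-by-key loop. The `copyLabels` helper is a hypothetical name used only for this example.

```go
package main

import (
	"fmt"
	"maps"
)

// copyLabels is a hypothetical helper that returns a shallow copy of src.
// maps.Copy(dst, src) copies every key/value pair from src into dst,
// overwriting keys that already exist in dst.
func copyLabels(src map[string]string) map[string]string {
	dst := make(map[string]string, len(src))
	maps.Copy(dst, src)
	return dst
}

func main() {
	src := map[string]string{"app": "web", "zone": "east"}

	// Manual copy, in the style of the loop under review.
	manual := make(map[string]string)
	for k, v := range src {
		manual[k] = v
	}

	fmt.Println(maps.Equal(manual, copyLabels(src))) // prints "true"
}
```

Like the loop it replaces, the copy is shallow: values are copied by assignment, not cloned.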

Comment on lines +44 to +47
/////////////////////
/// portPublisher ///
/////////////////////

Can be removed now that this has its own file

Suggested change
/////////////////////
/// portPublisher ///
/////////////////////

Comment on lines +45 to +48
////////////////////////
/// servicePublisher ///
////////////////////////

Can be removed now that this has its own file

Suggested change
////////////////////////
/// servicePublisher ///
////////////////////////

Comment on lines +94 to +96
for id, address := range pp.addresses.Addresses {
	updatedAddressSet.Addresses[id] = address
}

Can use maps.Copy

Comment on lines +103 to +105
for id, address := range newAddressSet.Addresses {
	updatedAddressSet.Addresses[id] = address
}

Can use maps.Copy

Comment on lines +474 to +476
group.updateLocalTrafficPolicy(localTrafficPolicy)
}
pp.publishFilteredSnapshots()

Both group.updateLocalTrafficPolicy and pp.publishFilteredSnapshots are calling publishDiff, so this might result in duplicate responses
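One way a double publish could be made harmless, sketched under the assumption that the group remembers what it last sent: if the publish step compares the new state against the last published snapshot and returns early when nothing changed, a second call with the same state sends nothing. The `group` type, its fields, and this `publishDiff` are hypothetical and do not describe how the PR's `publishDiff` actually behaves.

```go
package main

import (
	"fmt"
	"maps"
)

// group is a hypothetical stand-in for a filtered listener group.
type group struct {
	published map[string]string // last snapshot sent to listeners
	sends     int               // number of updates actually sent
}

// publishDiff sends an update only when next differs from what was last
// published, so calling it twice with the same state sends once.
func (g *group) publishDiff(next map[string]string) {
	if maps.Equal(g.published, next) {
		return // nothing changed since the last publish
	}
	g.published = maps.Clone(next)
	g.sends++
}

func main() {
	g := &group{}
	state := map[string]string{"pod-1": "10.0.0.1"}
	g.publishDiff(state) // e.g. triggered by a traffic policy update
	g.publishDiff(state) // e.g. triggered by a later publish of all groups
	fmt.Println(g.sends) // prints "1"
}
```

The alternative is to drop one of the two calls at the call site, which avoids the extra comparison but relies on every caller knowing which functions already publish.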

Are the tested behaviors that were removed here recovered elsewhere?


log := sp.log.WithField("port", srcPort)

metrics, err := endpointsVecs.newEndpointsMetrics(sp.metricsLabels(srcPort))

Why remove the hostname from the labels? The caller (subscribe()) has access to it, so it can pass it along.
