refactor(destination): implement shared-filtering#15166
Conversation
Signed-off-by: Alex Leong <alex@buoyant.io>
alpeb left a comment

Great changes, the Destination code definitely looks less entangled now 👍

Do you think it would be useful to expose metrics about the different filteredListenerGroups, or is that too much of an implementation detail?
```go
for k, v := range addr.Addresses {
	addresses[k] = v
}

labels := make(map[string]string)
for k, v := range addr.Labels {
	labels[k] = v
}
```
This is old code, but a good opportunity to modernize these idioms with maps.Copy()
```go
/////////////////////
/// portPublisher ///
/////////////////////
```

Can be removed now this has its own file
```go
////////////////////////
/// servicePublisher ///
////////////////////////
```

Can be removed now this has its own file
```go
for id, address := range pp.addresses.Addresses {
	updatedAddressSet.Addresses[id] = address
}

for id, address := range newAddressSet.Addresses {
	updatedAddressSet.Addresses[id] = address
}
```
```go
	group.updateLocalTrafficPolicy(localTrafficPolicy)
}
pp.publishFilteredSnapshots()
```
Both group.updateLocalTrafficPolicy and pp.publishFilteredSnapshots are calling publishDiff, so this might result in duplicate responses
Are the tested behaviors that were removed here recovered elsewhere?
```go
log := sp.log.WithField("port", srcPort)

metrics, err := endpointsVecs.newEndpointsMetrics(sp.metricsLabels(srcPort))
```
Why remove the hostname from the labels? The caller (subscribe()) has access to it, so it can pass it along.
In our testing and in reports from users, we have found that destination controller memory use scales linearly with the total number of concurrent subscribers that it serves. In one load test, we found the scaling rate to be approximately 330 kiB per subscriber. This causes the destination controller to use very large amounts of memory when there are a large number of subscribers: a cluster with 10,000 subscribers might expect the destination controller to use 3.4 GiB of memory.
Memory profiling revealed that the majority of this memory is consumed by the per-subscriber endpoint translator. More specifically, by the endpoint snapshot that it holds in order to calculate filtering and diffs to send to the client.
We reduce the memory scaling factor by noticing that this filtering and diff state can be shared between multiple subscribers if they have the same filtering conditions. In other words, subscribers only need their own state if they have different node-local or zone-local filtering. Therefore, we can group subscribers by a filter key, where all subscribers with the same filtering can share state.
To implement this, we break the unwieldy `endpoints_watcher.go` into a number of different files, aligned around the major types defined in this file:

- `endpoints_watcher.go`: the top level EndpointsWatcher type
- `service_publisher.go`: the servicePublisher type, per service
- `port_publisher.go`: the portPublisher type, per service port
- `filtered_listener_group.go`: a new filteredListenerGroup type, per filter key

We move the filtering and diffing logic out of `endpointTranslator` and into `filteredListenerGroup` so that it can be shared. In this way, `endpointTranslator` becomes a much simpler translation layer.

With these changes, our load test measures up to an 84% reduction in destination controller memory use, depending on the distribution of subscriber filter keys.
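The grouping idea can be sketched as follows. This is a minimal illustration, not the PR's implementation: the `filterKey` fields, `publisher` type, and method signatures are all assumptions; only the `filteredListenerGroup` name comes from the PR. The point is that the filtered snapshot and fan-out live once per filter key, not once per subscriber:

```go
package main

import "fmt"

// filterKey identifies a subscriber's filtering conditions. Subscribers
// with the same key can share one filtered snapshot and diff state.
// (Illustrative fields; the real key derives from node/zone filtering.)
type filterKey struct {
	node string
	zone string
}

// listener is a stand-in for a subscriber's update callback/stream.
type listener func(update string)

// filteredListenerGroup holds one shared filtered snapshot and fans
// updates out to every listener in the group.
type filteredListenerGroup struct {
	snapshot  string // shared filtered state (simplified to a string)
	listeners []listener
}

// publisher is a simplified stand-in for portPublisher.
type publisher struct {
	groups map[filterKey]*filteredListenerGroup
}

// subscribe attaches a listener to the group for its filter key,
// creating the group only if no subscriber with that key exists yet.
func (p *publisher) subscribe(key filterKey, l listener) {
	g, ok := p.groups[key]
	if !ok {
		g = &filteredListenerGroup{}
		p.groups[key] = g
	}
	g.listeners = append(g.listeners, l)
}

// publish recomputes the filtered snapshot once per group, not once per
// subscriber; this sharing is where the memory reduction comes from.
func (p *publisher) publish(update string) {
	for key, g := range p.groups {
		g.snapshot = fmt.Sprintf("%s (node=%s zone=%s)", update, key.node, key.zone)
		for _, l := range g.listeners {
			l(g.snapshot)
		}
	}
}

func main() {
	p := &publisher{groups: map[filterKey]*filteredListenerGroup{}}
	key := filterKey{node: "n1", zone: "z1"}
	// Two subscribers with identical filtering share one group.
	p.subscribe(key, func(u string) { fmt.Println("sub1:", u) })
	p.subscribe(key, func(u string) { fmt.Println("sub2:", u) })
	p.publish("endpoints-v2")
	fmt.Println("groups:", len(p.groups)) // one shared group, not two
}
```

Under this scheme, per-subscriber cost shrinks to roughly the listener itself, while the heavyweight snapshot/diff state scales with the number of distinct filter keys, which explains why the measured savings depend on how subscriber filter keys are distributed.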