Skip to content

Commit efc19ec

Browse files
authored
feat(starfish): evict pending commit votes via quorum commit index (#11392)
# Description of change Bounds the in-memory `pending_commit_votes` tracker in `DagState` and tightens the proposal skip paths in `Core`. ### Eviction (issue #11391) - Switch `pending_commit_votes` from `VecDeque<CommitVote>` to `BTreeSet<CommitVote>` so eviction uses `split_off`. Insertion order is preserved (the linearizer produces commits monotonically); insertion dedup is a free side benefit. - New field on `DagState`: `last_known_quorum_commit_index: CommitIndex` (no `Arc<CommitVoteMonitor>` reference). `Core` is the sole holder of the monitor and pushes the value via a setter. - `evict_pending_commit_votes()` drops votes with `index <= last_known_quorum_commit_index - gc_depth`. Gated on `consensus_block_restrictions`; no-op otherwise. - Eviction is called both from `take_commit_votes` (proposal path) and from `flush()` (mirrors the existing `pending_acknowledgments` pattern). - `Core` refreshes `DagState::last_known_quorum_commit_index` from `CommitVoteMonitor::quorum_commit_index()` at three places: `Core::try_commit` after every successful `commit_observer.handle_committed_leaders` (regular consensus path); `Core::handle_committed_sub_dags_from_fast_sync` (fast-sync path); `Core::should_propose` (right before we propose). ### `should_propose` cleanup (issue #11400) - Centralised every skip reason behind a new `SkipProposalReason` enum + `skip_proposal` helper, so the `core_skipped_proposals` metric label space is disjoint and visible in one place. - New skip reason `behind_quorum_commit_round` — the `consensus_block_restrictions` gate (formerly inside `try_new_block`) moves up to `should_propose`. - The `clock_round <= last_proposed_round` recheck moves up too. It is silent (no metric, no debug log) since it fires on every accepted block in a round we have already proposed in — counting it would drown the genuinely interesting reasons. - `try_new_block` now reads `clock_round` only; all gates are upstream. ## Links to any relevant issues Fixes #11391 Fixes #11400 ## How the change has been tested - [x] Basic tests (linting, compilation, formatting, unit/integration tests) - [x] New unit test `dag_state::test::test_evict_pending_commit_votes` covering: quorum-driven eviction with the flag on; default-field no-op (no setter call); flag-off no-op even with a non-zero field - [x] Existing 280 starfish-core lib tests still pass - [x] `cargo ci-clippy` clean across the workspace - [ ] Patch-specific tests (correctness, functionality coverage)
1 parent b0db814 commit efc19ec

2 files changed

Lines changed: 238 additions & 60 deletions

File tree

crates/starfish/core/src/core.rs

Lines changed: 135 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -155,6 +155,34 @@ impl ReasonToCreateBlock {
155155
}
156156
}
157157

158+
/// Reason a `Core::should_propose` call returned `false`. Used as the `reason`
159+
/// label on the `core_skipped_proposals` metric. Keeping the variants in one
160+
/// enum makes the label space disjoint and centrally visible. Variant data is
161+
/// the per-reason context interpolated into the corresponding `debug!` line.
162+
///
163+
/// The "already proposed at this round" branch is intentionally not modeled
164+
/// here: it fires on every block accepted in a round we have already proposed
165+
/// in, which is a normal high-rate event. Counting and logging it would drown
166+
/// the genuinely interesting reasons below.
167+
#[derive(Clone, Copy)]
168+
pub(crate) enum SkipProposalReason {
169+
NoQuorumSubscriber,
170+
NoLastKnownProposedRound,
171+
HigherLastKnownProposedRound { last_known: Round },
172+
BehindQuorumCommitRound { approx_quorum: Round },
173+
}
174+
175+
impl SkipProposalReason {
176+
fn label(self) -> &'static str {
177+
match self {
178+
Self::NoQuorumSubscriber => "no_quorum_subscriber",
179+
Self::NoLastKnownProposedRound => "no_last_known_proposed_round",
180+
Self::HigherLastKnownProposedRound { .. } => "higher_last_known_proposed_round",
181+
Self::BehindQuorumCommitRound { .. } => "behind_quorum_commit_round",
182+
}
183+
}
184+
}
185+
158186
impl Core {
159187
pub(crate) fn new(
160188
context: Arc<Context>,
@@ -588,10 +616,18 @@ impl Core {
588616
dag_state.set_fast_sync_ongoing_flag(true);
589617
}
590618

619+
// After a positive commit advance, refresh the quorum commit index
620+
// on `DagState` so the eviction inside `flush()` is bounded.
621+
let quorum_commit_index = self.commit_vote_monitor.quorum_commit_index();
622+
591623
// Flush commits to storage so they're available for
592624
// get_block_refs_for_recent_commits when close-to-quorum mode
593625
// triggers header fetching.
594-
self.dag_state.write().flush();
626+
{
627+
let mut dag_state = self.dag_state.write();
628+
dag_state.set_last_known_quorum_commit_index(quorum_commit_index);
629+
dag_state.flush();
630+
}
595631

596632
// Then process subdags as usual
597633
self.commit_observer.finalize_and_send_solid_subdags(
@@ -738,9 +774,9 @@ impl Core {
738774
Ok((None, BTreeMap::new()))
739775
}
740776

741-
/// Attempts to propose a new block for the next round. If a block has
742-
/// already proposed for latest or earlier round, then no block is
743-
/// created and None is returned.
777+
/// Attempts to propose a new block at the current clock round. Eligibility
778+
/// (round bounds, block restrictions) is enforced upstream in
779+
/// `should_propose`.
744780
#[instrument(level = "trace", skip_all)]
745781
fn try_new_block(&mut self, reason: ReasonToCreateBlock) -> Option<VerifiedBlock> {
746782
let _s = self
@@ -751,31 +787,9 @@ impl Core {
751787
.with_label_values(&["Core::try_new_block"])
752788
.start_timer();
753789

754-
// Ensure the new block has a higher round than the last proposed block
755-
// and, under `consensus_block_restrictions`, also the approximate quorum commit
756-
// round. Blocks at or below the quorum commit round will not improve
757-
// the commit rule.
758-
let clock_round = {
759-
let dag_state = self.dag_state.read();
760-
let clock_round = dag_state.threshold_clock_round();
761-
let last_proposed_round = dag_state.get_last_proposed_block_header().round();
762-
let min_round = if self.context.protocol_config.consensus_block_restrictions() {
763-
let quorum_commit_index = self.commit_vote_monitor.quorum_commit_index();
764-
let local_commit_index = dag_state.last_commit_index();
765-
let local_commit_round = dag_state.last_commit_round();
766-
// Lower bound on the quorum commit round: at least 1 round per
767-
// commit, so the gap in indices maps to at least that many rounds.
768-
let approx_quorum_round =
769-
local_commit_round + quorum_commit_index.saturating_sub(local_commit_index);
770-
last_proposed_round.max(approx_quorum_round)
771-
} else {
772-
last_proposed_round
773-
};
774-
if clock_round <= min_round {
775-
return None;
776-
}
777-
clock_round
778-
};
790+
// Eligibility checks (round bounds, block restrictions) are enforced
791+
// upstream in `should_propose`.
792+
let clock_round = self.dag_state.read().threshold_clock_round();
779793

780794
// There must be a quorum of blocks from the previous round.
781795
let quorum_round = clock_round.saturating_sub(1);
@@ -1123,10 +1137,16 @@ impl Core {
11231137
);
11241138
all_missing_committed_txns.extend(missing_transactions_refs);
11251139

1140+
// After a positive commit advance, refresh the quorum commit index
1141+
// on `DagState` so the eviction of `pending_commit_votes` is bounded.
11261142
// Both pending and solid sub DAGs should be added to scoring subdags.
1127-
self.dag_state
1128-
.write()
1129-
.add_scoring_subdags(subdags.iter().map(|s| s.base.clone()).collect());
1143+
{
1144+
let mut dag_state = self.dag_state.write();
1145+
dag_state.set_last_known_quorum_commit_index(
1146+
self.commit_vote_monitor.quorum_commit_index(),
1147+
);
1148+
dag_state.add_scoring_subdags(subdags.iter().map(|s| s.base.clone()).collect());
1149+
}
11301150

11311151
committed_sub_dags.extend(subdags);
11321152

@@ -1198,41 +1218,104 @@ impl Core {
11981218
info!("Last known proposed round set to {round}");
11991219
}
12001220

1201-
/// Whether the core should propose new blocks.
1221+
/// Returns true when Core should propose at the current clock round. As a
1222+
/// side effect, when proposal is greenlit under
1223+
/// `consensus_block_restrictions`, refreshes `DagState`'s last-known
1224+
/// quorum commit index to enable eviction for commit votes
12021225
pub(crate) fn should_propose(&self) -> bool {
1203-
let clock_round = self.dag_state.read().threshold_clock_round();
1204-
let core_skipped_proposals = &self.context.metrics.node_metrics.core_skipped_proposals;
1226+
let (clock_round, last_proposed_round, local_commit_index, local_commit_round) = {
1227+
let dag_state = self.dag_state.read();
1228+
(
1229+
dag_state.threshold_clock_round(),
1230+
dag_state.get_last_proposed_block_header().round(),
1231+
dag_state.last_commit_index(),
1232+
dag_state.last_commit_round(),
1233+
)
1234+
};
12051235

12061236
if !self.quorum_subscribers_exists {
1207-
debug!("Skip proposing for round {clock_round}, don't have a quorum of subscribers.");
1208-
core_skipped_proposals
1209-
.with_label_values(&["no_quorum_subscriber"])
1210-
.inc();
1211-
return false;
1237+
return self.skip_proposal(clock_round, SkipProposalReason::NoQuorumSubscriber);
12121238
}
12131239

12141240
let Some(last_known_proposed_round) = self.last_known_proposed_round else {
1215-
debug!(
1216-
"Skip proposing for round {clock_round}, last known proposed round has not been synced yet."
1217-
);
1218-
core_skipped_proposals
1219-
.with_label_values(&["no_last_known_proposed_round"])
1220-
.inc();
1221-
return false;
1241+
return self.skip_proposal(clock_round, SkipProposalReason::NoLastKnownProposedRound);
12221242
};
12231243
if clock_round <= last_known_proposed_round {
1224-
debug!(
1225-
"Skip proposing for round {clock_round} as last known proposed round is {last_known_proposed_round}"
1244+
return self.skip_proposal(
1245+
clock_round,
1246+
SkipProposalReason::HigherLastKnownProposedRound {
1247+
last_known: last_known_proposed_round,
1248+
},
12261249
);
1227-
core_skipped_proposals
1228-
.with_label_values(&["higher_last_known_proposed_round"])
1229-
.inc();
1250+
}
1251+
1252+
// Silently skip when we already proposed at or above `clock_round`.
1253+
// This branch fires on every accepted block within the same clock round
1254+
if clock_round <= last_proposed_round {
12301255
return false;
12311256
}
12321257

1258+
// Under `consensus_block_restrictions`, skip if the candidate round
1259+
// does not exceed an approximation of the quorum commit round. Blocks
1260+
// at or below it cannot improve the commit rule.
1261+
if self.context.protocol_config.consensus_block_restrictions() {
1262+
let quorum_commit_index = self.commit_vote_monitor.quorum_commit_index();
1263+
let approx_quorum_round =
1264+
local_commit_round + quorum_commit_index.saturating_sub(local_commit_index);
1265+
if clock_round <= approx_quorum_round {
1266+
return self.skip_proposal(
1267+
clock_round,
1268+
SkipProposalReason::BehindQuorumCommitRound {
1269+
approx_quorum: approx_quorum_round,
1270+
},
1271+
);
1272+
}
1273+
1274+
// We are about to propose: refresh DagState's known quorum commit
1275+
// index so the eviction of `pending_commit_votes` is bounded.
1276+
self.dag_state
1277+
.write()
1278+
.set_last_known_quorum_commit_index(quorum_commit_index);
1279+
}
1280+
12331281
true
12341282
}
12351283

1284+
/// Records a skipped proposal: emits the per-reason `debug!` line and
1285+
/// increments `core_skipped_proposals` with the matching label. Always
1286+
/// returns `false` so call sites can `return self.skip_proposal(...)`.
1287+
fn skip_proposal(&self, clock_round: Round, reason: SkipProposalReason) -> bool {
1288+
match reason {
1289+
SkipProposalReason::NoQuorumSubscriber => {
1290+
debug!(
1291+
"Skip proposing for round {clock_round}, don't have a quorum of subscribers."
1292+
);
1293+
}
1294+
SkipProposalReason::NoLastKnownProposedRound => {
1295+
debug!(
1296+
"Skip proposing for round {clock_round}, last known proposed round has not been synced yet."
1297+
);
1298+
}
1299+
SkipProposalReason::HigherLastKnownProposedRound { last_known } => {
1300+
debug!(
1301+
"Skip proposing for round {clock_round} as last known proposed round is {last_known}"
1302+
);
1303+
}
1304+
SkipProposalReason::BehindQuorumCommitRound { approx_quorum } => {
1305+
debug!(
1306+
"Skip proposing for round {clock_round}, behind approximate quorum commit round {approx_quorum}"
1307+
);
1308+
}
1309+
}
1310+
self.context
1311+
.metrics
1312+
.node_metrics
1313+
.core_skipped_proposals
1314+
.with_label_values(&[reason.label()])
1315+
.inc();
1316+
false
1317+
}
1318+
12361319
/// Retrieves the next ancestors to propose to form a block at `clock_round`
12371320
/// round.
12381321
fn ancestors_to_propose(&mut self, clock_round: Round) -> Vec<VerifiedBlockHeader> {

0 commit comments

Comments
 (0)