Summary
The current production matchmaker (base_mmr_delta_pct = 0.50) optimises hard for the populated middle of the MMR distribution. The pair-gate window is generous enough that 411-MMR players match in roughly a minute. The cost is paid by extreme-MMR players: the 0–199 MMR cohort hits a 71-minute max wait and an 84% match rate; the 1000+ MMR cohort only matches at 88%. Both ends suffer because their percent-of-MMR window is small in absolute terms relative to the populated band, and there aren't enough peers to fill a lobby.
Tighter delta makes balance look better on paper. It also kills the cross-tier exposure combines need.
The naive fix — tighten base_mmr_delta_pct — produces matches that are uniform within tier (98%+ TIGHT spread) but eliminates the cross-tier exposure that combines exist to provide. Calibration data dries up. The right fix is to add the MMR clamp shipped in PR #300, which lets extreme-MMR players "see themselves" at a populated MMR for the pair gate while preserving real MMR for snake-draft balance. With the clamp in place, the wide pair-gate window can stay wide, MIXED-spread lobbies (the calibration sweet spot) stay frequent, and the tails get matched faster.
The hard part is choosing how aggressive the clamp should be. A 100-MMR Recruit clamped to 150 stays inside Recruit; clamped to 300 they're queued into matches against Contenders. The simulator shows the latter is dramatically faster on tail wait but stretches tier fairness. One night of arrival data is too thin to commit to a single answer, so this report presents three candidate configurations and recommends starting with the most conservative one — the codification of what your operators are already doing manually.
Codify the manual MMR-bump-to-150 as mmr_floor.
    base_mmr_delta_pct = 0.50   # unchanged
    max_mmr_delta_pct  = 0.55   # marginal headroom
    mmr_floor          = 150    # NEW: automates the manual bump you do today
    mmr_ceiling        = 1100   # NEW: covers the populated edge of Premier
    # everything else unchanged
This is the most conservative possible change: the operator team has been manually editing low-MMR players up to ~150 to make matches happen under the existing config. Setting mmr_floor = 150 automates exactly that. A 100-MMR Recruit clamped to 150 with a 50% delta matches in raw range 75–225 — entirely inside the Recruit tier (100–254), so no cross-tier mismatching.
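The range arithmetic above can be reproduced with a minimal sketch. The helper name clamped_window is hypothetical (it is not a function in the codebase), and the sketch assumes the window at queue-join time, before any time-based expansion.

```python
def clamped_window(raw_mmr, base_pct, mmr_floor=None, mmr_ceiling=None):
    """Return (low, high) of the pair-gate window for one player.

    Hypothetical helper for illustration only. It mirrors the clamp
    described in this report: floor and ceiling are mutually exclusive
    (if/elif), and the delta is a fraction of the clamped MMR.
    """
    effective_mmr = raw_mmr
    if mmr_floor is not None and effective_mmr < mmr_floor:
        effective_mmr = mmr_floor
    elif mmr_ceiling is not None and effective_mmr > mmr_ceiling:
        effective_mmr = mmr_ceiling
    delta = base_pct * abs(effective_mmr)
    return effective_mmr - delta, effective_mmr + delta


# A 100-MMR Recruit under the recommended config.
print(clamped_window(100, 0.50, mmr_floor=150, mmr_ceiling=1100))  # (75.0, 225.0)
```

The upper bound of 225 sits below the Recruit ceiling of 254, which is why this setting keeps a 100-MMR player's window inside their own tier.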
There are stronger candidates in the simulator (see §9), but they aggressively cross tier boundaries — a 100-MMR player clamped to 300 matches in 150–450, spanning Recruit through Contender. The simulator can't fully resolve which side of that trade-off matters more for your league because we have one night of arrival data and the simulator's 5-second tick under-counts production's faster firing. So: ship the safe change, observe one combine night under the new config, then escalate to a more aggressive clamp if the bottom tail is still waiting too long.
Three concrete candidates are tabulated in §9 — they span the tier-fairness ↔ tail-recovery trade-off. Pick whichever level of aggression matches your appetite for risk.
Data sources
Jan 5 – Apr 29 (2026)
- Bot interaction logs from Loki ({service_name="csc-bot"} |~ "created an interaction with command (queue|leavequeue)"). Each row carries a Discord user ID and millisecond timestamp. Loki retention is approximately six weeks, which means only the most recent combine night supplies replay-grade arrival data.
- CombineMatches rows from core's PostgreSQL. scheduled_date is the de-facto pop timestamp (set to tz_now() in _pop_mmr_queue_sync). M2M relations home/away give the rosters. PR #300's CombineQueueMatchLog would give exact per-player wait times directly, but those rows didn't exist for the analysed night because the PR shipped afterwards.
- Player snapshots: current MMR, type (DE/FA/Signed), tier_id. MMR is treated as stable retroactively; drift since the queue event is small relative to the population spread.
- Historical match rosters from January 5–19 (a 14-night season) and April 27–29. The January nights have no Loki coverage, so they contribute production-side balance and skill-spread metrics only; no replay analysis is possible without arrival timestamps.
The algorithm
What _pop_mmr_queue_sync actually does.
The combine matchmaker lives in apps/matches/mutations.py:_pop_mmr_queue_sync. Each invocation runs as a single transaction with row-level locks on every CombinesQueue row, which is what lets concurrent callers safely race for the same pop. The five-step recipe:
- Snapshot under lock. CombinesQueue.objects.select_for_update().filter(…) pulls every queue row into memory. Each row carries its created_at, the player's most recent requeue_reason (NONE/CANCELLED/FINISHED), and a requeue_count.
- Compute per-player priority.
    type_weight ∈ {DE: 1.0, FA: 0.6, Signed: 0.3}
    time_factor = log(time_in_queue + 1) / 10
    requeue_adj ∈ {NONE: 0, CANCELLED: +0.5, FINISHED: −0.3}
    priority_score = type_weight + time_factor + requeue_adj
  Scores break ties between otherwise-compatible candidates and decide who gets anchored first. DEs and long-waiters anchor.
- Compute the per-player MMR window.
    effective_pct = clamp(base_mmr_delta_pct + integrated_pct, min_mmr_delta_pct, max_mmr_delta_pct)
    effective_mmr = clamp(player.mmr, mmr_floor, mmr_ceiling)   # if/elif: never clamps twice
    effective_delta = effective_pct * abs(effective_mmr)
  integrated_pct grows with time-in-queue via the warmup ramp + linear expansion in _compute_effective_delta_pct. The clamp only participates in effective_delta and the pair-gate distance below; snake-draft balance and tier resolution always read raw player.mmr.
- Greedy 10-player group formation. For each anchor (highest priority first), iterate candidates and keep those whose |c.effective_mmr − a.effective_mmr| ≤ min(a.effective_delta, c.effective_delta). Score each compatible candidate by priority + tier_proximity_adj (same tier: +tier_proximity_bonus; ±1 tier: +tier_proximity_bonus*0.5; beyond: −max_tier_difference_penalty * (tier_diff − 1)). Take the top 9, validate that every pair in the resulting 10 is mutually within both deltas, lock in.
- Snake-draft and persist. Sort the 10-player group by raw MMR descending, deal them home/away on pattern [0,1,1,0,0,1,1,0,0,1]. Create the CombineMatches row with scheduled_date=tz_now(), write a CombineQueueMatchLog audit row (post-#300), observe combines_queue_wait_seconds per player, request a Dathost server.
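Steps 2 and 5 are compact enough to sketch directly. The weights and the deal pattern come from the recipe above; the function names, the use of seconds as the unit for time_in_queue, the natural-log base, and the reading of 0 as home and 1 as away are assumptions made for illustration, not confirmed details of the production code.

```python
import math

TYPE_WEIGHT = {"DE": 1.0, "FA": 0.6, "Signed": 0.3}
REQUEUE_ADJ = {"NONE": 0.0, "CANCELLED": 0.5, "FINISHED": -0.3}
SNAKE_PATTERN = [0, 1, 1, 0, 0, 1, 1, 0, 0, 1]


def priority_score(player_type, seconds_in_queue, requeue_reason="NONE"):
    """type_weight + time_factor + requeue_adj, as listed in step 2.

    Assumes time_in_queue is measured in seconds and log is the natural log.
    """
    time_factor = math.log(seconds_in_queue + 1) / 10
    return TYPE_WEIGHT[player_type] + time_factor + REQUEUE_ADJ[requeue_reason]


def snake_draft(raw_mmrs):
    """Deal a 10-player lobby into two teams by raw MMR, as in step 5.

    Assumes pattern value 0 means home and 1 means away.
    """
    if len(raw_mmrs) != len(SNAKE_PATTERN):
        raise ValueError("snake draft expects exactly 10 players")
    ordered = sorted(raw_mmrs, reverse=True)
    home = [mmr for mmr, side in zip(ordered, SNAKE_PATTERN) if side == 0]
    away = [mmr for mmr, side in zip(ordered, SNAKE_PATTERN) if side == 1]
    return home, away


home, away = snake_draft([100, 200, 300, 400, 500, 600, 700, 800, 900, 1000])
print(home, sum(home))
print(away, sum(away))
```

The pattern alternates in pairs after the first pick, so the team that receives the top player also receives the fourth and fifth rather than the second and third.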
Two parameters dominate observed behaviour: base_mmr_delta_pct (initial pair-gate width as a fraction of effective_mmr) and the optional mmr_floor/mmr_ceiling clamp. Everything else is around the edges — tier proximity is a soft preference within the candidate set, time-based expansion just grows the window over a multi-minute timescale.
3.1 Try it: pair-gate window
Pick a player, adjust the config knobs, watch the pair-gate window move across the eligible population. The histogram below is the full 635-player eligible roster (every Player with mmr > 0 and type ≠ Spec in the DB right now), not just one night's queue. Tier bands are tinted in the background so you can see when a window crosses tier boundaries. The shaded green band is the candidate range that falls inside [effective_mmr ± effective_delta]. Compatible peers counts how many of the 635 also have this player inside their window, because the pair gate is mutual.
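The mutual check behind the compatible-peers count can be sketched as below. The helper names are hypothetical, the population is a toy list rather than the real 635-player roster, and the sketch uses a fixed percentage with no time-based expansion.

```python
def effective(raw_mmr, pct, mmr_floor=None, mmr_ceiling=None):
    """Return (effective_mmr, effective_delta) for one player."""
    eff = raw_mmr
    if mmr_floor is not None and eff < mmr_floor:
        eff = mmr_floor
    elif mmr_ceiling is not None and eff > mmr_ceiling:
        eff = mmr_ceiling
    return eff, pct * abs(eff)


def compatible_peers(player_mmr, others, pct, mmr_floor=None, mmr_ceiling=None):
    """Count peers that pass the pair gate in both directions.

    A peer counts only if the clamped distance fits inside the smaller
    of the two windows, i.e. min(player_delta, peer_delta).
    """
    p_eff, p_delta = effective(player_mmr, pct, mmr_floor, mmr_ceiling)
    count = 0
    for raw in others:
        c_eff, c_delta = effective(raw, pct, mmr_floor, mmr_ceiling)
        if abs(c_eff - p_eff) <= min(p_delta, c_delta):
            count += 1
    return count


toy_population = [120, 150, 200, 300, 450]  # illustrative values only
for floor in (None, 150, 300):
    print(floor, compatible_peers(100, toy_population, 0.50, mmr_floor=floor))
```

Mutuality matters most for pairs of unequal MMR: a 350-MMR player sits inside a 600-MMR player's 300–900 window, but 600 is outside the 350-MMR player's own 175–525 window, so the pair is rejected.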
SkillUnits
The simulator's existing balance metric is |sum(home.mmr) − sum(away.mmr)|, with bands at 50/100/200. That metric is fine when MMR is linear in skill, which it isn't on this ladder — tier widths vary by an order of magnitude:
| # | Tier | MMR min | MMR max | Span |
|---|---|---|---|---|
A 200-MMR gap inside Recruit (span 154) is the entire tier and then some. The same 200-MMR gap inside Premier (span 1551) is 13% of one tier — barely a ranking distinction. Treating them as equivalent inflates "good balance" numbers in the high-MMR end of the distribution and obscures real mismatches at the low end.
The replacement metric, used throughout this report:
SkillUnits(player) = tier_index + (mmr − tier.mmrMin) / (tier.mmrMax − tier.mmrMin)
1.0 SkillUnit always means "one tier", regardless of where on the ladder the comparison happens. Two derived match metrics:
- team_skill_balance: |sum(home_SU) − sum(away_SU)|. Does the snake draft produce even teams? Bands at 0.5 / 1.0 / 2.0 SU.
- match_skill_spread: max(SU) − min(SU) across all 10 players. How wide is the lobby's skill range? Bands at 1 / 2 / 3 SU.
match_skill_spread is the load-bearing metric for the rest of this report. The team-balance metric reports zero for a perfectly-balanced lobby with a 1500-MMR player on each side flanked by 100-MMR players — the totals cancel — even though that match will play terribly. match_skill_spread catches that case directly.
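The blind spot described above is easy to reproduce in a short sketch. Only Recruit's 100–254 range is taken from this report; every other tier boundary in the ladder below is a placeholder invented so the example can run, and the 0-based tier_index and the inclusive band edges are likewise assumptions.

```python
# Placeholder ladder: (name, mmr_min, mmr_max). Only the Recruit bounds
# come from this report; the rest are illustrative, not the real tiers.
TIERS = [
    ("Recruit", 100, 254),
    ("Prospect", 255, 399),
    ("Contender", 400, 699),
    ("Elite", 700, 999),
    ("Premier", 1000, 2551),
]


def skill_units(mmr, tiers=TIERS):
    """tier_index + (mmr - mmrMin) / (mmrMax - mmrMin), 0-based index assumed."""
    index = 0
    for i, (_name, tier_min, _tier_max) in enumerate(tiers):
        if mmr >= tier_min:
            index = i  # keep the highest tier whose minimum is reached
    _name, tier_min, tier_max = tiers[index]
    return index + (mmr - tier_min) / (tier_max - tier_min)


def team_skill_balance(home, away):
    return abs(sum(skill_units(m) for m in home) - sum(skill_units(m) for m in away))


def match_skill_spread(players):
    units = [skill_units(m) for m in players]
    return max(units) - min(units)


def spread_band(spread):
    """Bands at 1 / 2 / 3 SU; treating each edge as inclusive is an assumption."""
    if spread <= 1.0:
        return "TIGHT"
    if spread <= 2.0:
        return "MIXED"
    if spread <= 3.0:
        return "WIDE"
    return "BLOWOUT"


# One 1500-MMR player per side, flanked by four 100-MMR players each.
home = [1500, 100, 100, 100, 100]
away = [1500, 100, 100, 100, 100]
print(team_skill_balance(home, away))
print(spread_band(match_skill_spread(home + away)))
```

With these placeholder tiers the two teams total exactly the same SkillUnits, so team_skill_balance is zero, while the spread between the top and bottom player exceeds 3 SU and lands in BLOWOUT.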
What "good" looks like
It would be tempting to treat low match_skill_spread as the optimisation target — pick the config that produces the most TIGHT lobbies. That treatment is wrong for combines specifically. Combines exist to generate cross-tier observation data so that the placement system can correctly bucket players. A combine night where every match is 10 same-tier players gives nearly zero placement signal: the algorithm only learns that Elite plays Elite.
So the optimisation target is not "minimise spread." It's maximise MIXED, hold WIDE within tolerance, drive BLOWOUT to zero. TIGHT is acceptable but not desirable — it means the night burned a match opportunity producing no calibration evidence. WIDE is the noisy edge of the useful range — fine in moderation, costly in volume because gameplay quality drops.
The pair-gate base_pct controls the spread distribution directly: tight delta produces TIGHT matches, wide delta produces MIXED-and-up. The clamp shifts that distribution rightward at the edges — by mapping extreme players into the populated band, it converts what would be "no match" into either a MIXED or WIDE lobby. Both outcomes feed the placement signal.
The night, in detail
6.1 The eligible population (635 players)
Before the queue-specific view, here's the population that could queue: every Player with mmr > 0 and type ≠ Spec. This frames the floor/ceiling choice: clamping a 100-MMR player to 200 is meaningful only if the 200-MMR band has actual peers.
You'll see 34 players in the 150–199 band and zero below 150. The system's true MMR floor is 100; what looks like a thin band starting at 150 is the result of manual MMR adjustments made on previous combine nights to give bottom-tier players a chance at matches under the existing base_mmr_delta_pct = 0.50 config. Those manual bumps are exactly the operational pain the algorithmic clamp is meant to eliminate. Read the histogram as: there are players who belong at MMR 100, but they've been temporarily lifted to ~150 to function inside the current matchmaker.
6.2 MMR distribution of joiners (one night)
Median joiner MMR is 411, with 78% of joiners between 200–700 MMR. Only 25 unique joiners below 200; 17 above 1000. Of the 635 eligible players, 384 (60%) actually queued on this night — the queue subset is not the population.
6.3 Wait time, by MMR cohort
Wait times below come from production data — each matched player's most recent JOIN before the POP that included them. The bottom and top cohorts dominate the long-wait list:
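The pairing rule (most recent JOIN before the POP) can be sketched with a binary search over each player's join timestamps. The function name and the sample timestamps are hypothetical, and counting a JOIN that lands exactly on the POP timestamp is an assumption.

```python
from bisect import bisect_right


def wait_ms(join_times_ms, pop_ms):
    """Wait between a player's latest JOIN at or before the POP and the POP.

    join_times_ms must be sorted ascending. Returns None when the player
    has no JOIN at or before the POP, which would indicate missing logs.
    """
    idx = bisect_right(join_times_ms, pop_ms)
    if idx == 0:
        return None
    return pop_ms - join_times_ms[idx - 1]


# A player who joined, left, rejoined, and joined again after the match popped.
joins = [1_000, 61_000, 400_000]
print(wait_ms(joins, pop_ms=250_000))
```

Using the latest JOIN rather than the first one means a player who left and rejoined is charged only for the final stretch in queue.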
| MMR band | joiners | matched | match % | p50 | p95 | max | >30m | >60m |
|---|---|---|---|---|---|---|---|---|
6.4 Match outcomes: both balance metrics
Each match is one point below. Horizontal axis: team_skill_balance (snake-draft fairness). Vertical axis: match_skill_spread (lobby skill range). The tightly-clustered points along the bottom are TIGHT same-tier matches; the points climbing the y-axis are the calibration matches we actually want, with BLOWOUT territory above 3.0.
Simulator A/B
The replay command (core/apps/matches/management/commands/replay_combine_night.py) drives the production simulate_pop() function — which is the exact _pop_mmr_queue_sync body extracted in simulate_matchmaking.py — with the night's real arrival/departure stream from Loki. The matchmaker decides pop timing per config; we observe outcomes.
The table below ranks scenarios by MIXED % (the calibration metric), not by TIGHT. Top of the table = configs that produce the most cross-tier exposure while staying compatible with reasonable gameplay quality.
| scenario | matches | vs actual | tight % | mixed % | wide % | blowout % | spread p95 |
|---|---|---|---|---|---|---|---|
7.1 Per-cohort wait times across scenarios
Heatmap rows are MMR cohorts, columns are scenarios, cells are p95 wait in minutes. The wide-delta + clamp scenarios (snapshot_300_1000, 40pct_300_1000) keep the middle rows green and collapse the tail rows — the recommended trade.
7.2 Match-skill-spread distribution per scenario
Stacked bars show the share of matches in each spread band. Read this looking for the largest green + amber stack at the bottom, with the smallest red sliver on top. snapshot looks great here on calibration share but produces the worst tail wait. Adding the clamp shifts distribution rightward (more WIDE, slightly more BLOWOUT) but rescues the tails.
7.3 Calibration vs throughput
Each point is one configuration. X-axis: matches formed. Y-axis: MIXED + WIDE share (combined "produces calibration data"). The dashed vertical line marks production's actual match count. Top-right is the goal: more matches and more calibration evidence.
Historical season
Loki retention doesn't cover the January 5–19 stretch, so the simulator has nothing to replay there. We can compute SkillUnits-aware metrics on the actual matches that ran — useful as a baseline for what historical algorithms produced.
8.1 Match volume per night
8.2 Skill-spread distribution under historical algorithms
The histogram below covers all 1 044 historical matches. Note the right tail: the historical algorithms (a mix of the old tier-locked path and the new MMR-based one) produced a meaningful BLOWOUT share that the new MMR-based pair gate eliminates almost entirely in our simulator runs.
8.3 Per-night skill-spread trajectory
p50 / p95 / max match_skill_spread per night. Premier-heavy nights produce wider spreads; this confirms the SkillUnits choice over raw-MMR balance — the same algorithm produces visibly different skill-spread profiles depending on who showed up.
Recommendation
One night of arrival data is not enough to commit to a single configuration with confidence. The simulator output below maps the trade-off space; the pick depends on which axis matters more right now — tier fairness (how far across tier boundaries the clamp drags low-MMR players for matchmaking) or tail recovery (how aggressively bottom-tail wait time drops).
The constraint to respect: a 100-MMR Recruit (raw tier 100–254) clamped to 300 matches in raw range 150–450, which spans Recruit, Prospect, and Contender. Snake-draft balance still uses raw MMR — they end up on a team with strong Contender teammates against a similar mix on the other side. The match happens, but their gameplay experience is "I am the lowest skill in this lobby by a lot." Codifying that as policy is a real design decision, not a free win.
9.1 Three candidates
| | Tier-fair | Balanced | Aggressive |
|---|---|---|---|
| base_mmr_delta_pct | 0.50 | 0.40 | 0.50 |
| mmr_floor | 150 | 250 | 300 |
| mmr_ceiling | 1100 | 1100 | 1000 |
| 100-MMR player matches in raw range | 75–225 | 175–325 | 150–450 |
| Tier crossing for a 100-MMR player | none (Recruit) | 1 tier (Recruit→Prospect) | 2 tiers (R→P→Contender) |
| Bottom-tail (0–199) p95 wait, sim | 21.5m | 9.6m | 5.4m |
| Top-tail (1000+) p95 wait, sim | 23.3m | 24.9m | 20.9m |
| Middle (300–799) p50 wait, sim | 1.0m | 1.9m | 1.0m |
| Match-skill bands TIGHT / MIXED / WIDE / BLOWOUT | 43 / 57 / 0 / 0 | 30 / 68 / 2 / 0 | 13 / 55 / 31 / 1 |
| Total matches formed (vs 118 actual) | 127 | 129 | 135 |
9.2 How to read the table
The Tier-fair candidate (floor=150) is the smallest possible algorithmic change: it does in code exactly what your operators have been doing manually — bumping low-MMR players up to ~150 to make matches form. A 100-MMR player matches inside Recruit only. The simulator shows a 21.5m bottom-tail p95, worse than current production, but this is almost certainly an artifact of the simulator's 5-second tick. Production fires the matchmaker on every queue mutation, which is faster than once-per-5s. The fact that the manual-bump-to-150 has been working in production tells us this configuration is at-or-better than current reality, just without the operator overhead.
The Balanced candidate (floor=250, base_pct=0.40) tightens the delta and lifts the floor to where the populated middle starts. A 100-MMR player matches in 175–325 (Recruit + Prospect). Tier-crossing happens but stays at one tier. Best Pareto on this night's simulator data: 9.6m bottom-tail, 0% BLOWOUT, 68% MIXED.
The Aggressive candidate (floor=300, base_pct=0.50) optimises hard for tail wait time at the cost of tier fairness. 5.4m bottom-tail p95 — by far the fastest — at the cost of dragging Recruits into matches against Contenders for the purpose of the pair gate. Snake draft still balances by raw MMR, but the lowest-skill player feels the spread.
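One way to line the three candidates up on the calibration axis used in §7.3 is to add their MIXED and WIDE shares. The band percentages below are copied from the table in §9.1; the function name is hypothetical.

```python
# Match-skill bands per candidate, in percent, from the table in section 9.1.
CANDIDATES = {
    "Tier-fair": {"TIGHT": 43, "MIXED": 57, "WIDE": 0, "BLOWOUT": 0},
    "Balanced": {"TIGHT": 30, "MIXED": 68, "WIDE": 2, "BLOWOUT": 0},
    "Aggressive": {"TIGHT": 13, "MIXED": 55, "WIDE": 31, "BLOWOUT": 1},
}


def calibration_share(bands):
    """MIXED + WIDE: the share of matches that yield cross-tier evidence."""
    return bands["MIXED"] + bands["WIDE"]


for name, bands in CANDIDATES.items():
    print(name, calibration_share(bands), "MIXED alone:", bands["MIXED"])
```

Aggressive leads on the combined share, but most of that lead comes from WIDE lobbies; on MIXED alone, Balanced is ahead.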
9.3 What not to do
The earlier draft of this report recommended base_pct = 0.10–0.15 with various clamps. With the calibration framing established in §5 (combines exist to generate cross-tier observation data), those configs are wrong: they produce 70%+ TIGHT same-tier-only lobbies and starve the placement system of signal. Don't ship them.
9.4 Rollout
- Pick a candidate. Recommended starting point: Tier-fair. It formalises what's already working in production manually, and risk of regression is minimal.
- Update the active MatchmakingConfig row via the updateMatchmakingConfig GraphQL mutation, or in the Django admin. Both mmr_floor and mmr_ceiling are nullable FloatFields; full_clean() enforces mmr_floor < mmr_ceiling.
- Watch combines_queue_wait_seconds in Grafana during the night. The histogram is labelled by player_type and requeue_reason; tail-cohort effects show up most clearly when filtered to the player_type matching the typical tail composition.
- After one full combine night, re-export with scripts/fetch_queue_timeline.py + export_combine_night, re-run this analysis, and compare predicted vs observed wait p95 per cohort. If bottom-tail p95 is still >15 minutes, escalate to Balanced. If it's already <10 minutes, hold at the current candidate.
- Iterate floor/ceiling values seasonally as MMR drifts. The eligible-population histogram in §6.1 is the correct guide: set the floor at the lower edge of the populated middle (where bucket density crosses ~10 players per 100-MMR band), and the ceiling at the corresponding upper edge.
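The density rule in the last step can be sketched as a simple histogram scan. The function name, the sample roster, and the choice to take the first and last band that reach the threshold (ignoring any sparse bands between them) are assumptions for illustration; the 100-MMR band width and the threshold of 10 come from the step above.

```python
from collections import Counter


def suggest_clamp(mmrs, band_width=100, min_players=10):
    """Suggest (mmr_floor, mmr_ceiling) from the eligible-population histogram.

    Floor is the lower edge of the lowest band holding at least
    min_players; ceiling is the upper edge of the highest such band.
    Returns None if no band is dense enough.
    """
    counts = Counter(int(mmr // band_width) for mmr in mmrs)
    dense = sorted(band for band, n in counts.items() if n >= min_players)
    if not dense:
        return None
    return dense[0] * band_width, (dense[-1] + 1) * band_width


# Toy roster, not real data: thin tails around a populated middle.
roster = [150] * 3 + [250] * 12 + [450] * 30 + [950] * 11 + [1400] * 2
print(suggest_clamp(roster))
```

The output is a starting point for discussion rather than a setting to apply blindly, since the tier-fairness trade-off in §9.1 still has to be weighed against whatever floor the histogram suggests.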
Caveats
- Single-night replay sample. Only the 2026-04-28 EDT session has both Loki bot logs and DB rosters in the analysis. The 14 January nights contribute production-side match_skill_spread distributions only; we can't replay simulator scenarios there. Confidence in the recommended config is bounded by that sample size; soft-canary the change and re-run this analysis after one combine night under the new config.
- Simulator approximations. The replay loop ticks every 5 seconds for pop attempts; production is event-driven via the pop_mmr_queue mutation, which fires on every queue mutation that could form a lobby. The simulator under-counts FINISHED requeues (without match-completion data we can't distinguish post-game requeues from a simple JOIN) and models cancellations as CANCELLED requeues whenever a player rejoins after a sim-pop. Server availability is unmodelled (the sim assumes infinite Dathost capacity).
- MMR snapshot is current, not historical. A player's MMR moves slightly after each match. The "MMR at queue time" used by the simulator is approximated as "MMR right now." Effect should be O(±20 MMR) for active players, small relative to the 50-MMR histogram bins and 100-MMR cohort buckets.
- SkillUnits depends on stable tier boundaries. If Tiers.mmrMin/Tiers.mmrMax change, the metric needs re-computing. Current bands:
Generated end-to-end by scripts/fetch_queue_timeline.py, core/apps/matches/management/commands/{export_combine_night, replay_combine_night, sweep_combine_night}.py, and an inline data bundler. All raw data is embedded in this document — see window.REPORT_DATA in the JS console for the full bundle.