Mercurius — Strategy Changelog

25 MAY 2025 MAJOR

Mercurius v7.1: Self-Improving ML Pipeline & Shadow P&L

Closed the loop from data collection to model improvement. The ML filter now retrains weekly on all historical data, tracks what would have happened on blocked trades, and reports results via Telegram. Pair confluence features from combo analysis add 5 new predictive signals.

ML Features: 15 → 20 (+5 pair confluence)
AUC: 0.707 → 0.761 (+0.054 from pairs)
Blocked WR: 2.6% (ML correctly rejects)
Boosted WR: 72.0% (ML correctly boosts)

1. ML Threshold Calibration

Block threshold calibrated to base rate — Original thresholds (block <40%, boost >70%) would have blocked every trade since the base win rate is 30% (system profits via 3.25:1 payoff ratio, not win rate). Recalibrated: block <15% (half base rate), boost >40%.
Validation — Blocked trades had 4.3% actual WR, boosted trades had 72.2% WR. Strong discrimination.
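
The calibration argument can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the gate is a plain comparison against the two thresholds quoted above; the function names are hypothetical and not taken from the Mercurius codebase.

```python
def breakeven_win_rate(payoff_ratio: float) -> float:
    """Win rate at which expected value is zero for a given win/loss payoff ratio."""
    return 1.0 / (1.0 + payoff_ratio)


def ml_decision(win_prob: float, block_below: float = 0.15, boost_above: float = 0.40) -> str:
    """Map a predicted win probability to 'block', 'boost', or 'pass'."""
    if win_prob < block_below:
        return "block"
    if win_prob > boost_above:
        return "boost"
    return "pass"


# At a 3.25:1 payoff the system breaks even near a 23.5% win rate,
# so a 30% base rate is profitable even though it sits below the old 40% block line.
print(round(breakeven_win_rate(3.25), 3))                     # 0.235
print(ml_decision(0.30, block_below=0.40, boost_above=0.70))  # block (old thresholds)
print(ml_decision(0.30))                                      # pass (recalibrated)
```

A typical trade scored at the 30% base rate is rejected under the original thresholds and passes under the recalibrated ones.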

2. Self-Improving ML Pipeline

Weekly retrain with model comparison — Sunday 23:00 UTC retrain compares accuracy and AUC delta against previous model. Feature importance tracking identifies which signals matter most.
Live prediction audit — Joins ml_predictions with closed positions to verify ML discrimination is maintained in production (blocked vs boosted actual win rates).
Telegram retrain report — Comprehensive report sent after each retrain: training stats, model comparison, threshold calibration, top 5 features, live audit, shadow P&L summary.
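
As an illustration of the model comparison step, the sketch below computes AUC for the previous and the retrained model on the same labelled trades and reports the delta. It is a dependency-free approximation of the idea; the function names and the choice to score both models on one shared evaluation set are assumptions, not the pipeline's actual code.

```python
def auc(labels, scores):
    """AUC as P(score of a winner > score of a loser); ties count as half."""
    winners = [s for y, s in zip(labels, scores) if y == 1]
    losers = [s for y, s in zip(labels, scores) if y == 0]
    if not winners or not losers:
        raise ValueError("need at least one winning and one losing trade")
    ranked_correctly = sum((w > l) + 0.5 * (w == l) for w in winners for l in losers)
    return ranked_correctly / (len(winners) * len(losers))


def compare_models(labels, old_scores, new_scores):
    """Summarise how the retrained model ranks trades relative to the previous one."""
    old_auc = auc(labels, old_scores)
    new_auc = auc(labels, new_scores)
    return {"old_auc": old_auc, "new_auc": new_auc, "auc_delta": new_auc - old_auc}


# Hypothetical evaluation set: 1 = trade won, 0 = trade lost.
labels = [1, 0, 1, 0]
report = compare_models(labels, [0.6, 0.7, 0.4, 0.3], [0.8, 0.2, 0.6, 0.4])
print(report)
```

A positive auc_delta means the retrained model separates winners from losers better than its predecessor on the same trades.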

3. Shadow P&L for Blocked Trades

Counterfactual tracking — Every blocked trade (ML filter, event guard, anti-stack, regime gate, etc.) logs its entry price to the shadow_pnl table. The price is checked 4h later to determine whether the trade would have won.
Guard effectiveness validation — Shadow P&L by block reason shows whether each guard is filtering good or bad trades. Reported weekly in the ML retrain Telegram report.
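
The counterfactual bookkeeping might look roughly like the sketch below. The record fields, the in-memory list standing in for the shadow_pnl table, and the rule that a "win" means the price moved in the trade's direction after 4h are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

CHECK_AFTER = timedelta(hours=4)


@dataclass
class ShadowTrade:
    instrument: str
    direction: str          # "BUY" or "SELL"
    block_reason: str       # e.g. "ml_filter", "regime_gate"
    entry_price: float
    blocked_at: datetime
    would_have_won: Optional[bool] = None


def resolve(trade: ShadowTrade, price_now: float, now: datetime) -> bool:
    """Fill in would_have_won once 4h have elapsed; return True if resolved."""
    if now - trade.blocked_at < CHECK_AFTER:
        return False
    move = price_now - trade.entry_price
    trade.would_have_won = move > 0 if trade.direction == "BUY" else move < 0
    return True


def win_rate_by_reason(trades):
    """Shadow win rate per block reason, counting resolved trades only."""
    totals = {}
    for t in trades:
        if t.would_have_won is None:
            continue
        wins, count = totals.get(t.block_reason, (0, 0))
        totals[t.block_reason] = (wins + int(t.would_have_won), count + 1)
    return {reason: wins / count for reason, (wins, count) in totals.items()}


t0 = datetime(2025, 5, 26, 8, 0, tzinfo=timezone.utc)
shadow = [
    ShadowTrade("XAU/USD", "BUY", "ml_filter", 100.0, t0),
    ShadowTrade("FTSE100", "SELL", "regime_gate", 200.0, t0),
]
resolve(shadow[0], 99.0, t0 + CHECK_AFTER)    # BUY that fell: a good block
resolve(shadow[1], 195.0, t0 + CHECK_AFTER)   # SELL that fell: a missed winner
print(win_rate_by_reason(shadow))
```

A guard whose shadow win rate is high is discarding trades that would have won, which is the signal the weekly report looks for.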

4. Pair Confluence Features

Combo analysis on 652 historical trades — Atlas+Pulse pair: 62.2% WR (+£1,017, 45 trades). Oracle+Pulse: 8.3% WR. Cipher+Oracle: 7.7% WR. Strong predictive signal in which agents agree.
5 binary pair features in ML model — has_atlas_pulse, has_cipher_oracle, has_oracle_pulse, has_atlas_cipher, has_atlas_sentinel. Top feature after retrain: has_atlas_sentinel at 33.2% importance.
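
A minimal sketch of how the five binary features could be derived from the set of agents that agree on a trade. The feature names come from the entry above; the mapping table and the rule that a feature is 1 when both agents of the pair are in the agreeing set are assumptions.

```python
# Feature name -> the two agents whose joint agreement it encodes.
PAIR_FEATURES = {
    "has_atlas_pulse": ("Atlas", "Pulse"),
    "has_cipher_oracle": ("Cipher", "Oracle"),
    "has_oracle_pulse": ("Oracle", "Pulse"),
    "has_atlas_cipher": ("Atlas", "Cipher"),
    "has_atlas_sentinel": ("Atlas", "Sentinel"),
}


def pair_features(agreeing_agents):
    """Return a 0/1 value per pair feature for one consensus decision."""
    agents = set(agreeing_agents)
    return {name: int(set(pair) <= agents) for name, pair in PAIR_FEATURES.items()}


print(pair_features(["Atlas", "Pulse", "Oracle"]))
```

For a consensus formed by Atlas, Pulse, and Oracle, only has_atlas_pulse and has_oracle_pulse are set.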

5. Weekly Anomaly Flags

7 automated flags in Telegram digest — Trade drought (5+ days), win rate drop (>15% vs prior week), agent accuracy below 30%, concentration risk (single instrument >60% of P&L), high ML block rate (>50%), shadow P&L showing over-aggressive guards, large weekly drawdown (>£50).
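
The numeric flags can be expressed as a table of boolean checks. This sketch covers six of the seven; the shadow P&L flag is left out because the entry gives no numeric threshold for it. The input keys are hypothetical, and the win rate drop is assumed to be measured in percentage points.

```python
def weekly_anomaly_flags(stats):
    """Return the names of the flags raised by one week's statistics."""
    checks = {
        "trade_drought": stats["days_since_last_trade"] >= 5,
        "win_rate_drop": stats["prior_win_rate"] - stats["win_rate"] > 0.15,
        "agent_accuracy_low": min(stats["agent_accuracy"].values()) < 0.30,
        "concentration_risk": stats["top_instrument_pnl_share"] > 0.60,
        "high_ml_block_rate": stats["ml_block_rate"] > 0.50,
        "large_drawdown": stats["weekly_drawdown_gbp"] > 50,
    }
    return [name for name, raised in checks.items() if raised]


week = {
    "days_since_last_trade": 1,
    "prior_win_rate": 0.50,
    "win_rate": 0.30,
    "agent_accuracy": {"Atlas": 0.55, "Sentinel": 0.28},
    "top_instrument_pnl_share": 0.40,
    "ml_block_rate": 0.20,
    "weekly_drawdown_gbp": 75,
}
print(weekly_anomaly_flags(week))
```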

25 MAY 2025 MAJOR

Mercurius v7.0: Edge, ML & Market Structure Upgrade

Comprehensive upgrade across six areas: execution quality, mathematical edge and position sizing, agent weighting, market structure analysis, congressional trading data, and ML-based trade filtering. Built on the +£1,102 all-time paper performance (671 trades, 70% win rate last week).

Voting Agents: 5 → 6 (+Structure agent)
Guard Layers: 7 → 9 (+Market Hours, R:R, ML)
Edge Formula: Kelly (proper (p×b - q)/b)
Data Sources: +2 (Congress trades, Wyckoff)

1. Market Hours & Execution Fixes

Market hours guard (Layer 0) — Skips instruments where the market is closed. Parses all IG config formats (24/7, 24/5, 08-21, 09:30-18:30). Prevents wasteful API calls and IG MARKET_CLOSED rejections.
Extended rejection memory — MARKET_CLOSED and MARKET_ROLLED rejections now tracked with longer cooldown. Instruments aren't retried until is_market_open() returns True.
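
The four hour formats named above can be handled with a small parser. The sketch below is illustrative only: the signature is hypothetical (the real is_market_open() may take different arguments), times are assumed to be UTC, and "24/5" and the hour ranges are assumed to apply Monday to Friday.

```python
from datetime import datetime, time, timezone


def _parse_clock(text: str) -> time:
    """'08' -> 08:00, '09:30' -> 09:30."""
    hour, _, minute = text.partition(":")
    return time(int(hour), int(minute or 0))


def is_market_open(spec: str, now: datetime) -> bool:
    """Check a trading-hours spec such as '24/7', '24/5', '08-21', '09:30-18:30'."""
    is_weekday = now.weekday() < 5          # Monday=0 ... Friday=4
    if spec == "24/7":
        return True
    if spec == "24/5":
        return is_weekday
    start, end = (_parse_clock(part) for part in spec.split("-"))
    return is_weekday and start <= now.time() < end


monday_10 = datetime(2025, 5, 26, 10, 0, tzinfo=timezone.utc)
sunday_10 = datetime(2025, 5, 25, 10, 0, tzinfo=timezone.utc)
print(is_market_open("09:30-18:30", monday_10))  # True
print(is_market_open("24/5", sunday_10))         # False
```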

2. Kelly Criterion Edge & Position Sizing

Edge formula replaced — Old: expected_value / avg_loss × conviction_ratio. New: (p × b - q) / b × conviction_ratio × regime_mult. Proper Kelly criterion with regime multipliers (trending=1.0, volatile=0.7, ranging=0.8).
Position sizing from edge — kelly_raw is now set to edge rather than avg_confidence × 0.5. Size now scales with actual mathematical edge, not just confidence.
R:R enforcement (Layer 8) — Rejects trades where limit_distance / stop_distance < 1.5. Every trade must have asymmetric payoff.
Pre-trade margin check — Skips if estimated margin exceeds 90% of available balance.
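
The new edge formula and the two pre-trade checks can be sketched as follows. Here p is the win rate, q = 1 - p, and b is the payoff ratio; the function names and the example inputs are assumptions, and conviction_ratio is treated as an opaque multiplier because its definition is not given above.

```python
REGIME_MULT = {"trending": 1.0, "volatile": 0.7, "ranging": 0.8}


def kelly_edge(win_rate: float, payoff_ratio: float,
               conviction_ratio: float, regime: str) -> float:
    """(p*b - q) / b, scaled by conviction and a regime multiplier."""
    p, q, b = win_rate, 1.0 - win_rate, payoff_ratio
    return (p * b - q) / b * conviction_ratio * REGIME_MULT[regime]


def passes_rr(limit_distance: float, stop_distance: float, min_rr: float = 1.5) -> bool:
    """Layer 8: reject trades whose reward-to-risk ratio is below 1.5."""
    return limit_distance / stop_distance >= min_rr


def margin_ok(estimated_margin: float, available_balance: float) -> bool:
    """Skip the trade if margin would exceed 90% of the available balance."""
    return estimated_margin <= 0.9 * available_balance


# Illustrative inputs: 30% win rate at a 3.25:1 payoff, full conviction.
print(round(kelly_edge(0.30, 3.25, 1.0, "trending"), 4))  # 0.0846
print(round(kelly_edge(0.30, 3.25, 1.0, "volatile"), 4))  # 0.0592
```

The Kelly fraction stays positive at a 30% win rate only because the payoff ratio is high; at b = 2 the same win rate gives a negative edge.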

3. Non-Linear Agent Weights

Weight formula — Old: accuracy × 2 (range 0.3–2.5). New: accuracy^1.5 × 3 (range 0.2–3.0). Cipher (83.7% acc) now gets 2.30 weight vs Sentinel (36.6%) at 0.66 — a 3.5x gap.
EMA decay alpha — 0.10 → 0.15 for faster adaptation to recent performance.
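
The weight figures quoted above can be reproduced directly. This is a small sketch: the clamping to the stated 0.2 to 3.0 range and the use of the EMA on accuracy observations are assumptions about how the pieces fit together.

```python
def agent_weight(accuracy: float, lo: float = 0.2, hi: float = 3.0) -> float:
    """Non-linear weight: accuracy^1.5 * 3, clamped to the stated range."""
    return min(hi, max(lo, accuracy ** 1.5 * 3))


def ema_update(previous: float, observation: float, alpha: float = 0.15) -> float:
    """Exponential moving average; a larger alpha reacts faster to new data."""
    return alpha * observation + (1 - alpha) * previous


cipher = agent_weight(0.837)
sentinel = agent_weight(0.366)
print(round(cipher, 2), round(sentinel, 2), round(cipher / sentinel, 1))  # 2.3 0.66 3.5
```

Under the old linear formula the same two agents would have been weighted 1.67 and 0.73, a gap of about 2.3x.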

4. Structure Agent (6th Voting Agent)

Wyckoff phase detection — Identifies accumulation, distribution, markup, and markdown phases from 20-bar price/volume windows.
Volume Profile (POC/Value Area) — Point of Control and Value Area computed from candle data. Signal: price above/below POC.
Market structure (swing analysis) — Detects HH+HL (uptrend) and LH+LL (downtrend) patterns from swing highs and lows.
Support/Resistance levels — Key levels from price clustering. Signal from proximity and reaction.
Conviction sizing updated — {3: 1.0, 4: 1.5, 5: 2.5, 6: 3.5} to accommodate the 6th agent.
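
The swing analysis rule (HH+HL for an uptrend, LH+LL for a downtrend) reduces to comparing the two most recent swing highs and swing lows. The sketch below assumes the swing points have already been extracted from the candles; the function name and the "mixed" and "undetermined" labels are illustrative.

```python
def classify_structure(swing_highs, swing_lows):
    """Classify market structure from the last two swing highs and swing lows."""
    if len(swing_highs) < 2 or len(swing_lows) < 2:
        return "undetermined"
    higher_high = swing_highs[-1] > swing_highs[-2]
    higher_low = swing_lows[-1] > swing_lows[-2]
    lower_high = swing_highs[-1] < swing_highs[-2]
    lower_low = swing_lows[-1] < swing_lows[-2]
    if higher_high and higher_low:
        return "uptrend"      # HH + HL
    if lower_high and lower_low:
        return "downtrend"    # LH + LL
    return "mixed"


print(classify_structure([100, 105, 110], [95, 98, 103]))  # uptrend
print(classify_structure([110, 105], [100, 96]))           # downtrend
```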

5. Congressional Trading Data

House Stock Watcher collector — Fetches congressional stock transactions from public S3 data. Maps tickers to instruments (NVDA→NAS100, XOM→OIL_BRENT, GLD→XAU/USD). Daily at 10:00 UTC.
Oracle integration — Congressional purchase/sale signals feed into Oracle's fundamental convergence logic for SP500, NAS100, FTSE100, OIL_BRENT, XAU/USD, coffee.

6. XGBoost ML Trade Filter (Layer 9)

15-feature binary classifier — Features: conviction, avg_confidence, num_dissenting, RSI, BB %B, ATR ratio, composite signal, Fear & Greed, ADX, BB width pctl, regime one-hot (3), hour of day, day of week.
Win probability gate — Block if win_prob < 15% (well below 30% base rate). Boost conviction +1 if win_prob > 40%. Thresholds calibrated to base rate since system profits via R:R, not win rate. Graceful degradation: returns neutral 0.5 until 50+ training samples.
Weekly retrain — Walk-forward split (80/20 chronological). Sunday 23:00 UTC. Model saved to data/xgb_trade_filter.pkl.
Audit trail — Every ML prediction logged to ml_predictions table with features, win probability, and decision.
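
Two pieces of this layer are easy to get wrong: the chronological split and the behaviour before the model has enough data. The sketch below shows one way to handle both without depending on XGBoost. The row format and function names are hypothetical, and the untrained model is assumed to act as a no-op (neither block nor boost).

```python
MIN_TRAINING_SAMPLES = 50
BLOCK_BELOW, BOOST_ABOVE = 0.15, 0.40


def chronological_split(rows, train_fraction=0.8):
    """Oldest 80% for training, newest 20% for validation; never shuffled."""
    ordered = sorted(rows, key=lambda row: row["closed_at"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def apply_ml_gate(conviction, win_prob, n_training_samples):
    """Return (allowed, conviction) after the ML layer."""
    if n_training_samples < MIN_TRAINING_SAMPLES:
        return True, conviction          # assumption: untrained model changes nothing
    if win_prob < BLOCK_BELOW:
        return False, conviction
    if win_prob > BOOST_ABOVE:
        return True, conviction + 1
    return True, conviction


rows = [{"closed_at": t} for t in (5, 1, 9, 3, 7, 2, 10, 4, 8, 6)]
train, validation = chronological_split(rows)
print(len(train), len(validation))  # 8 2
print(apply_ml_gate(3, 0.45, 200))  # (True, 4)
```

Splitting by time rather than at random keeps future trades out of the training set, so the validation score reflects how the model would have performed going forward.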

20 MAY 2025 MAJOR

Mercurius v2.1: £100K Virtual Bankroll + CFD-Only Operation

Account consolidation. Disabled the spreadbet account entirely — Mercurius now operates CFD-only. All UI simplified: removed the account selector dropdown, hardcoded to the CFD account. One account, one mode, less surface area for confusion.

Virtual bankroll. Set the virtual bankroll to £100,000 for all position sizing calculations. The IG demo account maintains ~£10K actual balance, but all sizing, risk parameters, and P&L tracking use the £100K virtual figure. This decouples position sizing from the demo balance constraint and lets the system trade at the scale it was designed for.

Risk parameters scaled to match:

Before: bankroll = £8,000; max_per_trade = £50; max_daily_loss = £500; asset max_gbp = £150
After: bankroll = £100,000; max_per_trade = £500; max_daily_loss = £5,000; asset max_gbp = £1,500

Strategy presets scaled. Per-trade sizing across all presets updated to reflect the new bankroll:

Aggressive: £500 per trade
Balanced: £1,000 per trade
Conservative: £1,500 per trade
Sniper: £3,000 per trade

P&L tracking decoupled from IG balance. The dashboard now uses DB-based P&L (sum of realized + unrealized from the positions table) instead of comparing the IG balance against starting capital. The header shows virtual portfolio value (bankroll + DB P&L), making performance reporting independent of IG account fluctuations.

Trade history archived. All pre-v2 data (638 positions, 1,868 trades) moved to archive tables (positions_archive, trades_archive). V2_CUTOFF set to May 20, 2025 — all stats, benchmarks, and performance tracking start fresh from this date.

Dashboard visual uplift. Refined CSS across all pages — improved card layouts, table styling, navigation consistency, typography spacing, and custom scrollbars. The system looks as serious as its ambitions.

20 MAY 2025 ADDITION

Dynamic Governance + Trading Config API

New /api/trading-config endpoint — Exposes all trading parameters, conviction sizing multipliers, hold periods, and governance configuration dynamically. No more hardcoded values scattered across the frontend.
Governance page fetches params from API — The governance page now pulls all trading rules, thresholds, and guard configurations from the live API instead of hardcoding them in HTML. Changes to config propagate instantly.
Strategy presets visible and switchable — All four strategy presets (aggressive, balanced, conservative, sniper) are now displayed on the governance page with their full parameter sets, and can be switched from the UI.

16 MAY 2025 HOTFIX

Regime Gate Exemption & Volatile Guard Relaxation

Post-deployment analysis of the first hotfix revealed FTSE100 SELL consensus was still being blocked (59 blocked decisions in 7 days) by the regime gate. Despite being restricted to SELL-only (which historically worked in all regimes), the asset-class-level gate for ("indices", "ranging") blocked every signal. Additionally, XAG/USD in volatile regime required conviction ≥4, which is effectively 80% of all agents agreeing — too high for the current market environment.

FTSE100 exempt from regime gate — Added REGIME_GATE_EXEMPTIONS set. FTSE100 passes through regardless of regime since it already has the SELL-only directional restriction as a guardrail.
Volatile regime guard relaxed — Was: conviction ≥4 required. Now: conviction 3 allowed through if avg_confidence ≥ 0.55. This means strong consensus at lower conviction can still trade in volatile markets.
Governance docs updated to v5.0 — Full rewrite: 5-agent table, 7-layer guard system documented, regime gate rules, conviction sizing updated, instrument restrictions table, historical edge formula, trading parameters corrected.
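
The two rule changes in this hotfix amount to a pair of small predicates. The REGIME_GATE_EXEMPTIONS name comes from the entry above; the function names and the blocked-combination table (taken from the regime gate rule of blocking ranging markets for indices, forex, and crypto) are a sketch rather than the production code.

```python
REGIME_GATE_EXEMPTIONS = {"FTSE100"}
BLOCKED_REGIMES = {("indices", "ranging"), ("forex", "ranging"), ("crypto", "ranging")}


def regime_gate_allows(instrument: str, asset_class: str, regime: str) -> bool:
    """Exempt instruments bypass the asset-class-level regime block."""
    if instrument in REGIME_GATE_EXEMPTIONS:
        return True
    return (asset_class, regime) not in BLOCKED_REGIMES


def volatile_guard_allows(conviction: int, avg_confidence: float) -> bool:
    """Conviction 4+ always passes; conviction 3 needs avg_confidence >= 0.55."""
    return conviction >= 4 or (conviction == 3 and avg_confidence >= 0.55)


print(regime_gate_allows("FTSE100", "indices", "ranging"))  # True
print(regime_gate_allows("SP500", "indices", "ranging"))    # False
print(volatile_guard_allows(3, 0.60))                       # True
```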

16 MAY 2025 HOTFIX

Unblock Trading: Confidence Threshold & Sizing Corrections

Four-day post-overhaul review revealed the system was completely paralyzed — zero new trades executed since May 12. The guards worked too well: every instrument was blocked by at least one gate. The system went from over-trading (31/day) to not trading at all (0/day).

Trades Since Overhaul: 0 (Target: 5-8/day)
Consensus Formed: 2,585 (All blocked by gates)
Guards Blocked: 373 (Regime + restriction + stacking)
SP500 Rejected: INSUF FUNDS (IG margin too low at £150/trade)

Root cause analysis:

MIN_VOTE_CONFIDENCE too high (0.35) — Sentinel votes at 0.30 and Pulse at 0.10-0.29. Both were filtered on every single vote, leaving only Atlas (0.40) + Oracle (0.71) + sometimes Cipher as eligible voters. With only 2-3 qualifying agents, consensus rarely formed.
FTSE100 min_conviction=4 unreachable — With 5 agents, getting 4 to agree on SELL is extremely rare. Every FTSE consensus was SELL with conviction=3, then blocked by the restriction. Hundreds of legitimate SELL signals were discarded.
SP500 INSUFFICIENT_FUNDS — The only instrument that passed all guards (trending regime, BUY consensus). But £150/trade at 20:1 leverage exceeded remaining margin on the demo account (balance ~£8K after £2K in losses). IG rejected every order.
Commodities not blocked, but no consensus — OIL_BRENT, coffee, XAU, XAG all had regime gate set to ALLOWED for commodities. The issue was upstream: too few agents passed the confidence filter to form consensus.

Before (Paralyzed): MIN_VOTE_CONFIDENCE = 0.35; FTSE100 min_conviction = 4; max_per_trade = £150; bankroll_gbp = £10,000. Result: 0 trades in 4 days.
After (Fixed): MIN_VOTE_CONFIDENCE = 0.25; FTSE100 min_conviction = 3; max_per_trade = £50; bankroll_gbp = £8,000. All 5 agents can participate.

Lesson: When multiple independent guards each have a 60-80% pass rate, the combined pass rate is the product of the individual rates. Seven guards at 80% each give 0.8⁷ ≈ 21%; at 60% each, the combined rate falls below 3%. Each individual gate must be permissive enough that the combination still lets quality trades through.
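
The arithmetic behind this lesson, as a short sketch. It assumes the guards are statistically independent, as the lesson does; the helper names are illustrative.

```python
import math


def combined_pass_rate(rates):
    """Probability a trade clears every guard, assuming independent guards."""
    return math.prod(rates)


def required_per_guard_rate(target: float, n_guards: int) -> float:
    """Per-guard pass rate needed for n identical guards to hit a combined target."""
    return target ** (1.0 / n_guards)


print(round(combined_pass_rate([0.8] * 7), 3))   # 0.21
print(round(combined_pass_rate([0.6] * 7), 3))   # 0.028
print(round(required_per_guard_rate(0.5, 7), 3)) # 0.906
```

To let half of all consensus decisions through seven guards, each guard would need to pass roughly 91% of what it sees.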

12 MAY 2025 MAJOR OVERHAUL

Mercurius v2: Quality Over Quantity

After 21 days of live demo trading (651 closed positions), a comprehensive analysis revealed that the system was massively over-trading with inverted conviction signals. This overhaul restructures the entire trading pipeline.

Win Rate: 29.6% (Target: 50%+)
Trades/Day: 31 (Target: 5-8)
Total P&L: +$530 (21 days)
BUY Bias: 88% (Target: <65%)

12 MAY 2025 AGENTS REMOVED

Agent Council Pruned: 8 → 5 Voting Agents

Three agents consistently produced noise rather than signal. Removing them raises the consensus bar from 37.5% (3/8) to 60% (3/5), meaning every trade now requires genuine majority agreement.

Astral — 20-30% accuracy. Moon phases and seasonal patterns had no predictive value.
Contrarian — 4-29% accuracy. Overlapped with Pulse's contrarian logic, added confusion.
Correlation — 11-27% accuracy. Intermarket divergence signals were consistently wrong.

Before: 8 voting agents; 3/8 = 37.5% consensus; Conviction 3-7 sizing; 5-agent consensus = 0% WR
After: 5 voting agents; 3/5 = 60% consensus; Conviction 3-5 sizing; Higher bar = higher quality

12 MAY 2025 INSTRUMENTS REMOVED

Instrument Set Refocused: Forex Eliminated, Commodities Core

Forex pairs consumed 47% of all trades but produced no meaningful P&L. The system's genuine edge is in commodities where CFTC positioning, weather data, and fundamental analysis provide information advantages.

GBP/USD — 114 trades, exactly $0 P&L. Every single trade was breakeven.
AUD/USD — 191 trades, $2 P&L. 9 stacked positions at time of removal.
EUR/USD, USD/JPY, USD/CHF — No edge, low conviction, removed.
BTC/USD, ETH/USD — 1:1 leverage, no data advantage over crypto-native traders.
DAX40, Sugar, NATGAS — Low conviction or too volatile.

Before (9 instruments): 5 forex pairs; 3 indices; 1 commodity
After (7 instruments): 4 commodities (Oil, Coffee, Gold, Silver); 3 indices (FTSE, S&P, NASDAQ); 0 forex, 0 crypto

12 MAY 2025 GUARDS ADDED

Seven-Layer Guard System in Arbiter

The Arbiter now runs every consensus decision through seven sequential guards before creating a trade opportunity. Each blocked decision is stored in the database with full reasoning for audit and analysis.

Event Guard — Block trades near high-impact economic events (existing)
Anti-Stacking — No duplicate positions in same direction. Root cause of 9 stacked AUD/USD positions.
Daily Trade Cap — Max 8 trades/day (was averaging 31/day)
Instrument Cooldown — 4-6 hour cooldown per instrument after any position
Regime Gate — Block ranging markets for indices/forex/crypto. Ranging regime had 8.3% WR vs 57.5% for trending.
Volatile Regime — Require conviction ≥ 4 in volatile markets
Instrument Restrictions — FTSE100 SELL-only (BUY lost heavily), S&P/NASDAQ trending-only
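
A sequential guard chain of this kind can be sketched as a list of functions that each return a block reason or None. Only two of the seven guards are shown, and the decision fields and function names are hypothetical; the real Arbiter also persists every blocked decision to the database.

```python
from typing import Callable, Optional

Guard = Callable[[dict], Optional[str]]


def anti_stacking(decision: dict) -> Optional[str]:
    """Block a second position in the same instrument and direction."""
    if (decision["instrument"], decision["direction"]) in decision["open_positions"]:
        return "position already open in this direction"
    return None


def daily_trade_cap(decision: dict, cap: int = 8) -> Optional[str]:
    """Block once the daily trade count reaches the cap."""
    if decision["trades_today"] >= cap:
        return f"daily cap of {cap} trades reached"
    return None


def run_guards(decision: dict, guards) -> dict:
    """Run guards in order and stop at the first one that blocks."""
    for name, guard in guards:
        reason = guard(decision)
        if reason is not None:
            return {"allowed": False, "blocked_by": name, "reason": reason}
    return {"allowed": True, "blocked_by": None, "reason": None}


GUARDS = [("anti_stacking", anti_stacking), ("daily_trade_cap", daily_trade_cap)]
decision = {
    "instrument": "AUD/USD",
    "direction": "BUY",
    "open_positions": {("AUD/USD", "BUY")},
    "trades_today": 3,
}
print(run_guards(decision, GUARDS))
```

Returning the name of the blocking guard along with its reason is what makes per-guard audit and later effectiveness analysis possible.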

12 MAY 2025 EDGE OVERHAUL

Historical Performance-Based Edge Calculation

The old edge formula (conviction / N) * avg_confidence was synthetic — it always produced "tradeable" edges regardless of whether the system actually made money on that instrument. A 3/5 consensus at 0.50 confidence gave 0.30 edge, well above the 0.05 threshold.

Old Formula: edge = (conviction / N) * confidence; Always produces positive edge; 5% threshold (too low); No connection to actual P&L
New Formula: 30-day historical win rate + avg P&L; Losing instruments get negative edge; 8% threshold (raised); Cold start: 50% haircut + BUY penalty

This means instruments that actually lose money will self-correct — their edge drops below threshold and trading halts until performance improves.
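
The weakness of the old formula is visible as soon as it is evaluated. The sketch below reproduces the example from this entry and adds the weakest consensus that could form under a 0.25 vote-confidence floor; the function name is illustrative. The new formula is not sketched because its exact form is not given here.

```python
OLD_THRESHOLD = 0.05


def synthetic_edge(conviction: int, n_agents: int, avg_confidence: float) -> float:
    """Old formula: (conviction / N) * avg_confidence. Uses no P&L history."""
    return (conviction / n_agents) * avg_confidence


typical = synthetic_edge(3, 5, 0.50)   # the example from the text
weakest = synthetic_edge(3, 5, 0.25)   # minimum consensus at a 0.25 confidence floor
print(round(typical, 2), typical > OLD_THRESHOLD)  # 0.3 True
print(round(weakest, 2), weakest > OLD_THRESHOLD)  # 0.15 True
```

Even the weakest consensus clears the 0.05 threshold by a factor of three, so the threshold never filtered anything.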

12 MAY 2025 SELF-IMPROVEMENT

Closing the Self-Improvement Loop

Previously, the strategy review system generated 24 recommendations ("halt ranging trading", "blacklist GBP/USD") but none were actually implemented. The system was read-only.

Auto-cap underperformers — Daily 06:00 UTC job finds agent-instrument combos with 20+ evaluations and <30% accuracy, caps their weight to 0.3
Combo performance tracking — New agent_combo_performance table records which agent combinations produce winning trades
Performance benchmarks API — /api/benchmarks endpoint tracks win rate, trades/day, weekly P&L against targets
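
The auto-cap rule can be sketched as a pure function over per-combo statistics. The thresholds (20+ evaluations, below 30% accuracy, weight capped at 0.3) come from the entry above; the data shapes and the function name are assumptions.

```python
def cap_underperformers(stats, weights, min_evals=20, min_accuracy=0.30, cap=0.3):
    """Return a copy of weights with persistently inaccurate combos capped.

    stats maps (agent, instrument) -> (evaluations, correct_calls).
    """
    capped = dict(weights)
    for key, (evaluations, correct) in stats.items():
        if evaluations >= min_evals and correct / evaluations < min_accuracy:
            capped[key] = min(capped.get(key, cap), cap)
    return capped


stats = {
    ("Sentinel", "FTSE100"): (40, 8),    # 20% accuracy over 40 evaluations
    ("Atlas", "XAU/USD"): (40, 24),      # 60% accuracy
    ("Pulse", "OIL_BRENT"): (10, 1),     # too few evaluations to judge
}
weights = {key: 1.0 for key in stats}
print(cap_underperformers(stats, weights))
```

Requiring a minimum number of evaluations keeps a combo from being capped on the strength of a short unlucky run.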

12 MAY 2025 BUG FIXES

Hardcoded Agent Count & Confidence Threshold

Position manager edge formula — Was hardcoded conviction / 8, now uses conviction / len(VOTING_AGENTS)
MIN_VOTE_CONFIDENCE — Raised from 0.25 to 0.35 to filter low-conviction agent noise

10 MAY 2025 FIX

Edge Evaporation Killing Positions Instantly

Positions were being closed immediately after opening because the edge decay formula was too aggressive. Fixed the decay curve to use quadratic rather than linear decay.

10 MAY 2025 FIX

Trade Drought & Instant Position Closure

Four critical issues were causing a trade drought and instant position closures. Fixed consensus edge calculation, position sizing, and stale thesis detection.

09 MAY 2025 FIX

Pulse Base Confidence Too Low

Raised Pulse agent base confidence from 0.30 to 0.35 so moderate IG sentiment signals could pass the MIN_VOTE_CONFIDENCE filter and participate in consensus.

09 MAY 2025 FIX

IG Sentiment Market IDs Corrected

Fixed incorrect IG API market IDs for sentiment data collection. Updated to use correct IG format (e.g., UK100, USTEC) instead of generic identifiers.