A record of every strategic decision, what the data showed, what we changed, and why. The self-improvement loop made visible.
Closed the loop from data collection to model improvement. The ML filter now retrains weekly on all historical data, tracks what would have happened on blocked trades, and reports results via Telegram. Pair confluence features from combo analysis add 5 new predictive signals.
1. ML Threshold Calibration
2. Self-Improving ML Pipeline
ml_predictions with closed positions to verify ML discrimination is maintained in production (blocked vs boosted actual win rates).
3. Shadow P&L for Blocked Trades
shadow_pnl table. Price checked 4h later to determine if the trade would have won.
4. Pair Confluence Features
has_atlas_pulse, has_cipher_oracle, has_oracle_pulse, has_atlas_cipher, has_atlas_sentinel. Top feature after retrain: has_atlas_sentinel at 33.2% importance.
5. Weekly Anomaly Flags
Comprehensive upgrade across six areas: mathematical edge, position sizing, agent intelligence, execution quality, and ML-based trade filtering. Built on the +£1,102 all-time paper performance (671 trades, 70% win rate last week).
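Two formulas from this upgrade, the Kelly-based edge and the non-linear agent weight (both appear in the numbered items that follow), are compact enough to sketch in Python. This is a minimal illustration under stated assumptions, not the production code. It assumes p is the historical win rate, q = 1 - p, b is the average win divided by the average loss, and conviction_ratio is the share of voting agents that agree. The function names and the flooring of a negative edge at zero are hypothetical.

```python
# Regime multipliers as listed in the changelog.
REGIME_MULT = {"trending": 1.0, "volatile": 0.7, "ranging": 0.8}


def kelly_edge(p: float, b: float, conviction_ratio: float, regime: str) -> float:
    """Compute (p*b - q) / b * conviction_ratio * regime_mult.

    Assumed meanings (not defined in the changelog): p = win rate,
    b = average win / average loss, conviction_ratio = agreeing / voting agents.
    """
    if b <= 0:
        raise ValueError("payoff ratio b must be positive")
    q = 1.0 - p
    kelly = (p * b - q) / b
    # Assumption: a negative Kelly fraction is treated as "no edge".
    return max(kelly, 0.0) * conviction_ratio * REGIME_MULT[regime]


def agent_weight(accuracy: float) -> float:
    """Compute accuracy^1.5 * 3, clamped to the 0.2-3.0 range."""
    return min(max(accuracy ** 1.5 * 3.0, 0.2), 3.0)


if __name__ == "__main__":
    # Hypothetical inputs: 60% win rate, 1.5 payoff ratio, 4 of 5 agents agree.
    print(round(kelly_edge(0.60, 1.5, 0.8, "volatile"), 4))
    # Accuracies quoted in the changelog for Cipher and Sentinel.
    print(round(agent_weight(0.837), 2), round(agent_weight(0.366), 2))
```

With the accuracies quoted for Cipher and Sentinel, agent_weight reproduces the 2.30 and 0.66 weights given in the changelog.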
1. Market Hours & Execution Fixes
is_market_open() returns True.
2. Kelly Criterion Edge & Position Sizing
expected_value / avg_loss × conviction_ratio.
New: (p × b - q) / b × conviction_ratio × regime_mult. Proper Kelly criterion with regime multipliers (trending=1.0, volatile=0.7, ranging=0.8).
kelly_raw = edge instead of avg_confidence × 0.5. Size now scales with actual mathematical edge, not just confidence.
limit_distance / stop_distance < 1.5. Every trade must have asymmetric payoff.
3. Non-Linear Agent Weights
accuracy × 2 (range 0.3–2.5).
New: accuracy^1.5 × 3 (range 0.2–3.0). Cipher (83.7% acc) now gets 2.30 weight vs Sentinel (36.6%) at 0.66, a 3.5x gap.
4. Structure Agent (6th Voting Agent)
5. Congressional Trading Data
6. XGBoost ML Trade Filter (Layer 9)
data/xgb_trade_filter.pkl.
ml_predictions table with features, win probability, and decision.
Account consolidation. Disabled the spreadbet account entirely; Mercurius now operates CFD-only. All UI simplified: removed the account selector dropdown, hardcoded to the CFD account. One account, one mode, less surface area for confusion.
Virtual bankroll. Set the virtual bankroll to £100,000 for all position sizing calculations. The IG demo account maintains ~£10K actual balance, but all sizing, risk parameters, and P&L tracking use the £100K virtual figure. This decouples position sizing from the demo balance constraint and lets the system trade at the scale it was designed for.
Risk parameters scaled to match:
Strategy presets scaled. Per-trade sizing across all presets updated to reflect the new bankroll:
P&L tracking decoupled from IG balance. The dashboard now uses DB-based P&L (sum of realized + unrealized from the positions table) instead of comparing the IG balance against starting capital. The header shows virtual portfolio value (bankroll + DB P&L), making performance reporting independent of IG account fluctuations.
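A minimal sketch of the DB-based calculation described above, assuming the positions table exposes realized_pnl and unrealized_pnl columns. Those column names, the function name, and the use of SQLite are assumptions made for illustration only.

```python
import sqlite3

VIRTUAL_BANKROLL = 100_000.0  # GBP, the virtual bankroll stated above


def virtual_portfolio_value(conn: sqlite3.Connection) -> float:
    """Return bankroll plus realized and unrealized P&L summed from positions."""
    (db_pnl,) = conn.execute(
        "SELECT COALESCE(SUM(realized_pnl), 0) + COALESCE(SUM(unrealized_pnl), 0) "
        "FROM positions"
    ).fetchone()
    return VIRTUAL_BANKROLL + db_pnl


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the real positions table is not shown in this log.
    conn.execute(
        "CREATE TABLE positions ("
        "id INTEGER PRIMARY KEY, realized_pnl REAL, unrealized_pnl REAL)"
    )
    conn.executemany(
        "INSERT INTO positions (realized_pnl, unrealized_pnl) VALUES (?, ?)",
        [(120.0, None), (-45.5, None), (None, 30.25)],
    )
    print(virtual_portfolio_value(conn))
```

Because the IG balance is never read, deposits, resets, or other fluctuations on the demo account leave the reported figure unchanged.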
Trade history archived. All pre-v2 data (638 positions, 1,868 trades) moved to
archive tables (positions_archive, trades_archive). V2_CUTOFF set to
May 20, 2025 — all stats, benchmarks, and performance tracking start fresh from this date.
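One way such an archive step could look, sketched with SQLite. The opened_at column, the function name, and the assumption that each archive table has the same schema as its source are all hypothetical; only the table names and the cutoff date come from the text.

```python
import sqlite3

V2_CUTOFF = "2025-05-20"  # ISO date string, so text comparison sorts correctly

ARCHIVABLE = {"positions", "trades"}


def archive_before_cutoff(conn: sqlite3.Connection, table: str,
                          cutoff: str = V2_CUTOFF) -> int:
    """Move rows opened before the cutoff into <table>_archive; return the count."""
    if table not in ARCHIVABLE:
        # Table names are interpolated into SQL, so restrict them to a whitelist.
        raise ValueError(f"unexpected table: {table}")
    with conn:  # single transaction: copy, then delete
        conn.execute(
            f"INSERT INTO {table}_archive SELECT * FROM {table} WHERE opened_at < ?",
            (cutoff,),
        )
        cur = conn.execute(f"DELETE FROM {table} WHERE opened_at < ?", (cutoff,))
    return cur.rowcount
```

Running the copy and the delete in one transaction means a failure part-way through leaves both tables as they were.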
Dashboard visual uplift. Refined CSS across all pages — improved card layouts, table styling, navigation consistency, typography spacing, and custom scrollbars. The system looks as serious as its ambitions.
/api/trading-config endpoint: exposes all trading parameters, conviction sizing multipliers, hold periods, and governance configuration dynamically. No more hardcoded values scattered across the frontend.
Post-deployment analysis of the first hotfix revealed FTSE100 SELL consensus was still being blocked
(59 blocked decisions in 7 days) by the regime gate. Despite being restricted to SELL-only (which
historically worked in all regimes), the asset-class-level gate for ("indices", "ranging")
blocked every signal. Additionally, XAG/USD in volatile regime required conviction ≥4, which is
effectively 80% of all agents agreeing — too high for the current market environment.
REGIME_GATE_EXEMPTIONS set. FTSE100 passes through regardless of regime since it already has the SELL-only directional restriction as a guardrail.
avg_confidence ≥ 0.55. This means strong consensus at lower conviction can still trade in volatile markets.
Four-day post-overhaul review revealed the system was completely paralyzed, with zero new trades executed since May 12. The guards worked too well: every instrument was blocked by at least one gate. The system went from over-trading (31/day) to not trading at all (0/day).
Root cause analysis:
Lesson: When multiple independent guards each have a 60-80% pass rate, the combined pass rate is multiplicative. Seven guards at 80% each = 0.8^7 ≈ 21% pass rate. The system needs each individual gate to be permissive enough that the combination still allows quality trades through.
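The arithmetic behind this lesson can be checked with a short sketch. It assumes the gates are statistically independent, as the lesson states; the function names are illustrative.

```python
import math


def combined_pass_rate(gate_rates):
    """Probability that a signal clears every gate, assuming independent gates."""
    return math.prod(gate_rates)


def required_gate_rate(target: float, n_gates: int) -> float:
    """Uniform per-gate pass rate needed for n independent gates to reach a target."""
    return target ** (1.0 / n_gates)


if __name__ == "__main__":
    # Seven gates that each pass 80% of signals.
    print(f"{combined_pass_rate([0.8] * 7):.3f}")
    # Per-gate rate needed if half of all signals should survive seven gates.
    print(f"{required_gate_rate(0.5, 7):.3f}")
```

The second function shows the inverse view: for half of all signals to survive seven gates, each gate would need to pass roughly 91% of what it sees.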
After 21 days of live demo trading (651 closed positions), a comprehensive analysis revealed that the system was massively over-trading with inverted conviction signals. This overhaul restructures the entire trading pipeline.
Three agents consistently produced noise rather than signal. Removing them raises the consensus bar from 37.5% (3/8) to 60% (3/5), meaning every trade now requires genuine majority agreement.
Forex pairs consumed 47% of all trades but produced no meaningful P&L. The system's genuine edge is in commodities where CFTC positioning, weather data, and fundamental analysis provide information advantages.
The Arbiter now runs every consensus decision through seven sequential guards before creating a trade opportunity. Each blocked decision is stored in the database with full reasoning for audit and analysis.
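A sequential guard chain with an audit trail might be structured as follows. This is a sketch only: the two guards, their thresholds, the blocked_decisions table, and the decision dictionary layout are hypothetical placeholders, not the Arbiter's actual seven guards.

```python
import sqlite3
from typing import Callable, Optional

# A guard returns None to pass, or a reason string to block the decision.
Guard = Callable[[dict], Optional[str]]


def min_conviction_guard(decision: dict) -> Optional[str]:
    # Placeholder threshold for illustration.
    return None if decision["conviction"] >= 3 else "conviction below 3"


def min_confidence_guard(decision: dict) -> Optional[str]:
    # Placeholder threshold for illustration.
    if decision["avg_confidence"] >= 0.55:
        return None
    return "avg_confidence below 0.55"


GUARDS = [
    ("min_conviction", min_conviction_guard),
    ("min_confidence", min_confidence_guard),
]


def run_guards(conn: sqlite3.Connection, decision: dict, guards=GUARDS) -> bool:
    """Run guards in order; on the first failure, store the block and stop."""
    for name, guard in guards:
        reason = guard(decision)
        if reason is not None:
            with conn:  # commit the audit row
                conn.execute(
                    "INSERT INTO blocked_decisions (instrument, guard, reason) "
                    "VALUES (?, ?, ?)",
                    (decision["instrument"], name, reason),
                )
            return False
    return True
```

Recording the guard name alongside the reason makes it possible to count blocks per guard later, which is the kind of analysis needed to spot a gate that rejects too much.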
The old edge formula (conviction / N) * avg_confidence was synthetic — it always
produced "tradeable" edges regardless of whether the system actually made money on that instrument.
A 3/5 consensus at 0.50 confidence gave 0.30 edge, well above the 0.05 threshold.
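The old formula is simple enough to reproduce directly. In this sketch the function name is illustrative; the formula, the 0.05 threshold, and the example inputs come from the text above.

```python
EDGE_THRESHOLD = 0.05  # tradeable-edge threshold quoted above


def synthetic_edge(conviction: int, n_agents: int, avg_confidence: float) -> float:
    """Old edge formula: (conviction / N) * avg_confidence."""
    return (conviction / n_agents) * avg_confidence


if __name__ == "__main__":
    edge = synthetic_edge(3, 5, 0.50)
    print(edge, edge > EDGE_THRESHOLD)
```

The function takes no realized-performance input at all, so its result cannot reflect whether the instrument has actually been profitable.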
This means instruments that actually lose money will self-correct — their edge drops below threshold and trading halts until performance improves.
Previously, the strategy review system generated 24 recommendations ("halt ranging trading", "blacklist GBP/USD") but none were actually implemented. The system was read-only.
agent_combo_performance table records which agent combinations produce winning trades
/api/benchmarks endpoint tracks win rate, trades/day, weekly P&L against targets
conviction / 8, now uses conviction / len(VOTING_AGENTS)
Positions were being closed immediately after opening because the edge decay formula was too aggressive. Fixed the decay curve to use quadratic rather than linear decay.
Four critical issues were causing a trade drought and instant position closures. Fixed consensus edge calculation, position sizing, and stale thesis detection.
Raised Pulse agent base confidence from 0.30 to 0.35 so moderate IG sentiment signals could pass the MIN_VOTE_CONFIDENCE filter and participate in consensus.
Fixed incorrect IG API market IDs for sentiment data collection. Updated to use correct IG format (e.g., UK100, USTEC) instead of generic identifiers.