MERCURIUS

GOVERNANCE & METHODOLOGY
1. Agent Council Overview

Six named intelligence agents independently analyze every instrument and cast directional votes. The Arbiter aggregates votes through a 9-layer guard system (including a market hours check, R:R enforcement, and an ML trade filter) and requires a minimum consensus of 3 agents (50%) before any trade is entered. No single agent can trigger a trade alone. All execution is via an IG Group CFD account, with a £100,000 virtual bankroll used for position sizing.

Agent | Role | Description | Data Sources | Cycle
Sentinel | Market Scanner | Monitors real-time prices, spreads, and volume across all instruments. Detects unusual price movements, spread widening, and momentum shifts. First line of market awareness. | IG live prices, volume, spread | 5 min
Atlas | Macro Strategist | Analyses the macroeconomic environment. Detects regime changes using yield curves, inflation data, employment figures, and central bank signals. Maps macro conditions to asset class positioning. | FRED economics, economic calendar | 30 min
Cipher | Technical Analyst & Gate | Runs multi-timeframe technical analysis using 6 indicators (Ichimoku, Bollinger, RSI, EMA, MACD, VWAP) on all active instruments. Produces composite directional signals across DAY, 4H, and 15M timeframes. Also detects market regime (trending/ranging/volatile) used by the Regime Gate. | IG candles (DAY, 4H, 15M) | 15 min
Oracle | Fundamental Analyst | Evaluates supply/demand fundamentals, positioning data, crowd probabilities, and geopolitical events. Provides the longest-horizon view with deep fundamental reasoning. Our strongest edge: CFTC + weather + USDA data provides genuine information advantage. | CFTC, USDA, weather, news, Polymarket | 1 hour
Pulse | Sentiment Reader | Reads market mood from multiple angles: crowd fear/greed, retail positioning via IG client sentiment, and news sentiment. Acts as a contrarian filter when sentiment reaches extremes. Incorporates crowd-fading logic from the retired Contrarian agent. | Fear & Greed, IG sentiment, news | 15 min
Structure | Market Microstructure | Analyses market structure using Wyckoff phase detection, volume profile (POC/Value Area), swing high/low patterns (HH/HL/LH/LL), and support/resistance levels from price clustering. Provides a fundamentally different perspective from indicator-based agents. | IG candles (DAY, 4H) | 15 min
Arbiter | Consensus Judge | Collects all 6 voting agent votes within the vote window. Runs 9-layer guard system (Market Hours, Event Guard, Anti-stacking, Daily cap, Cooldown, Regime gate, Volatile regime, Instrument restrictions, R:R enforcement). Filters low-confidence votes (below 0.25). Requires 3+ agents consensus. Calculates Kelly criterion edge with regime multiplier. ML filter gate blocks low-probability trades (<15%) and boosts high-probability ones (>40%). Self-improving: weekly retrain compares accuracy/AUC delta, recalibrates thresholds, and sends Telegram report. | All agent votes + guards + ML filter + historical performance | 15 min
2. Consensus Mechanism

The Agent Council filters out noise by requiring agreement from multiple independent perspectives before entering any position. A single agent's conviction, no matter how strong, cannot trigger a trade. With 6 voting agents and a minimum of 3 required, this means 50% agreement is needed.

VOTE WINDOW: 30 minutes
CONSENSUS THRESHOLD: 3/6 agents (50%)
MIN VOTE CONFIDENCE: 0.25
WEIGHTED THRESHOLD: ≥2.5 combined
VOTE TYPES: BUY / SELL / NEUTRAL
DISSENT TRACKING: consensus_decisions

Conviction scaling directly determines position size. More agents in agreement means higher conviction and a larger allocation (out of 6 voting agents):

3/6: 1.0x size
4/6: 1.5x size
5/6: 2.5x size
6/6: 3.5x size

How consensus is formed:

Sentinel: BUY
Atlas: BUY
Cipher: SELL
Oracle: BUY
Pulse: NEUTRAL
Structure: BUY
ARBITER: BUY (4/6 · 1.5x)

In this example, Sentinel, Atlas, Oracle, and Structure agree on BUY (4/6) while Cipher dissents with SELL and Pulse is neutral. The Arbiter records a BUY consensus at conviction level 4 with a 1.5x position size multiplier. Dissenting views are stored in the consensus_decisions table.
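The tally described above can be sketched in a few lines of Python. This is a minimal illustration, not the Arbiter's actual code: the function name `tally`, the confidence values in `example`, and the treatment of a 3-3 split as "no consensus" are assumptions, and the weighted-score path is left out.

```python
from collections import Counter

MIN_CONFIDENCE = 0.25  # votes below this confidence are filtered out
MIN_AGENTS = 3         # minimum number of agents agreeing on one direction
SIZE_MULTIPLIER = {3: 1.0, 4: 1.5, 5: 2.5, 6: 3.5}

def tally(votes):
    """votes maps agent name -> (direction, confidence).

    Returns (direction, conviction, size_multiplier), or None without consensus.
    """
    counts = Counter(
        direction
        for direction, confidence in votes.values()
        if direction in ("BUY", "SELL") and confidence >= MIN_CONFIDENCE
    )
    # Assumption: an even BUY/SELL split is treated as no consensus.
    if counts["BUY"] == counts["SELL"]:
        return None
    direction, conviction = counts.most_common(1)[0]
    if conviction < MIN_AGENTS:
        return None
    return direction, conviction, SIZE_MULTIPLIER[conviction]

# The example above, with hypothetical confidence values.
example = {
    "Sentinel": ("BUY", 0.6), "Atlas": ("BUY", 0.5), "Cipher": ("SELL", 0.7),
    "Oracle": ("BUY", 0.8), "Pulse": ("NEUTRAL", 0.3), "Structure": ("BUY", 0.4),
}
result = tally(example)
```

Here `result` is `('BUY', 4, 1.5)`, matching the 4/6 consensus and 1.5x multiplier in the example.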

9-Layer Guard System: Even when consensus is achieved, the trade must pass through all guards before execution:

0 Market Hours: Skips instruments where the market is currently closed. Prevents wasteful API calls and IG rejections on weekends/off-hours.
1 Event Guard: Blocks during high-impact economic events (NFP, FOMC, CPI, etc.). 2h before to 1h after blocking window.
2 Anti-Stacking: Blocks if an open position already exists in the same instrument and direction. Reversals (opposite direction) are allowed.
3 Daily Trade Cap: Maximum 8 trades per day. Prevents over-trading that historically degraded performance.
4 Instrument Cooldown: After closing a position, cooldown period before re-entry (forex: 4h, indices: 4h, commodity: 6h).
5 Regime Gate: Blocks trades in ranging regime for indices, forex, and crypto. Commodities allowed through (fundamental edge persists).
6 Volatile Regime Guard: In volatile markets, requires conviction ≥4 to proceed. Volatile markets are tradeable but need stronger consensus.
7 Instrument Restrictions: Per-instrument rules (e.g., FTSE100 is SELL-only, SP500/NAS100 require trending regime).
8 R:R Enforcement: Rejects trades where the reward-to-risk ratio (limit_distance / stop_distance) is below 1.5. Ensures every trade has asymmetric payoff.
9 ML Trade Filter: XGBoost classifier predicts win probability from 20 features (consensus, technical, sentiment, regime, time, and 5 pair confluence signals). Blocks if <15% (well below 30% base rate), boosts conviction by 1 level if >40%. Thresholds calibrated to base win rate since the system profits via R:R, not win rate. Self-improving: weekly retrain on all closed positions, walk-forward split, compares accuracy/AUC delta, and sends Telegram report. Shadow P&L tracks blocked trades to validate guard effectiveness.

Blocked trades are still recorded in consensus_decisions with status='blocked' and the specific guard that rejected them. This provides an audit trail and allows post-hoc analysis of whether blocked trades would have been profitable.
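One way to picture the guard chain is as an ordered list of checks in which the first failure names the rejecting guard. The sketch below covers only three of the guards (daily cap, volatile regime, R:R). The `Candidate` fields, the guard identifiers, and the `approved` status are hypothetical; only `status='blocked'` and the thresholds come from the description above.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    instrument: str
    direction: str
    conviction: int
    regime: str
    limit_distance: float
    stop_distance: float
    trades_today: int

def daily_cap(c):
    # Layer 3: maximum 8 trades per day
    return c.trades_today < 8

def volatile_regime(c):
    # Layer 6: volatile markets need conviction >= 4
    return c.regime != "volatile" or c.conviction >= 4

def rr_enforcement(c):
    # Layer 8: reward-to-risk must be at least 1.5
    return c.stop_distance > 0 and c.limit_distance / c.stop_distance >= 1.5

GUARDS = [
    ("daily_trade_cap", daily_cap),
    ("volatile_regime", volatile_regime),
    ("rr_enforcement", rr_enforcement),
]

def run_guards(candidate):
    """Run guards in order; report the first guard that rejects the trade."""
    for name, check in GUARDS:
        if not check(candidate):
            return {"status": "blocked", "guard": name}
    return {"status": "approved", "guard": None}
```

Returning the guard name alongside the status is what would make the audit trail queryable per guard.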

3. Instruments

Seven active instruments are traded via an IG Group CFD account. Focus is on commodities (proven edge via CFTC/weather/USDA data) and select indices with restrictions. All forex has been removed after 305 combined trades produced £2.29 total PnL. The table below is fetched live from the API.

Instrument Asset Class Leverage Spread (pts) Hours
4. Position Sizing

Positions are sized using the fractional Kelly criterion at 35% of the full Kelly recommendation, combined with conviction-based multipliers from the Agent Council. The edge calculation is now based on historical performance (30-day rolling) rather than a synthetic formula. Multiple hard caps ensure no single trade can cause outsized damage.

KELLY FRACTION: 35%
BANKROLL: £100,000
MAX PER TRADE: £1,000
MAX DAILY LOSS: £5,000
MAX POSITION SIZE: 5% of bankroll (£5,000)
EDGE THRESHOLD: 8%
MAX DAILY TRADES: 8
MAX OPEN POSITIONS: 10

Conviction multipliers scale the base position size (out of 6 voting agents):

Agents Agree | Conviction | Multiplier | Max Per Trade (balanced)
3 of 6 | Standard (50%) | 1.0x | up to £1,000
4 of 6 | Moderate (67%) | 1.5x | up to £1,500
5 of 6 | High (83%) | 2.5x | up to £2,500
6 of 6 | Maximum (100%) | 3.5x | up to £3,500

All sizes are still capped at 5% of bankroll margin (£5,000). The “balanced” strategy preset is active on the CFD account.

Instrument restrictions provide additional per-instrument guardrails:

Instrument | Restriction | Rationale
FTSE100 | SELL-only, min conviction 3 | Historical BUY trades lost heavily; SELL direction had best wins
SP500 | Trending regime only | New instrument; protected with regime gate until track record established
NAS100 | Trending regime only | New instrument; protected with regime gate until track record established
KELLY CRITERION EDGE (v7)
// Query last 30 days of closed positions for this instrument
p = win_rate = wins / total_trades
b = avg_win / avg_loss   // payoff ratio
q = 1 - p

// Proper Kelly edge with regime and conviction scaling
edge = (p × b - q) / b × conviction_ratio × regime_mult

// Regime multipliers: trending = 1.0, volatile = 0.7, ranging = 0.8
// Cold start (<5 historical trades): synthetic formula with 50% haircut
// Direction bias: SELL historically outperformed BUY (39.4% vs 27.7% WR)

// ML filter gate (when active, 50+ training samples, 20 features):
//   win_prob < 0.15 → BLOCK trade (well below 30% base rate)
//   win_prob > 0.40 → BOOST conviction by 1 level
// Pair confluence: Atlas+Pulse (62% WR), Cipher+Oracle (8% WR) encoded as features
// Shadow P&L: blocked trades tracked for 4h to validate guard effectiveness
FINAL POSITION SIZE
kelly_raw = edge   // position size scales with actual edge
base_size = kelly_raw × kelly_fraction × bankroll
conviction_mult = {3: 1.0, 4: 1.5, 5: 2.5, 6: 3.5}[conviction]
raw_size = base_size × conviction_mult
position_size = min(raw_size, max_per_trade_gbp)   // preset cap (balanced: £1,000)
position_size = min(position_size, bankroll × max_position_pct)   // 5% of £100K = £5,000

// R:R enforcement: reject if limit_distance / stop_distance < 1.5
// Margin check: skip if estimated margin > 90% of available balance

IG sizing: £/point = position_size / stop_distance
Minimum: £0.50/point
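The two formulas above translate fairly directly into Python. The sketch below is illustrative only. The function names are hypothetical, and `conviction_ratio` is assumed to be the number of agreeing agents divided by the six voters, which the formula does not spell out. The cold-start path, ML gate, margin check, and £/point minimum are omitted.

```python
REGIME_MULT = {"trending": 1.0, "volatile": 0.7, "ranging": 0.8}
CONVICTION_MULT = {3: 1.0, 4: 1.5, 5: 2.5, 6: 3.5}

def kelly_edge(wins, losses, avg_win, avg_loss, conviction, regime, voters=6):
    """Kelly edge from historical stats, scaled by conviction and regime."""
    p = wins / (wins + losses)             # win rate
    q = 1 - p
    b = avg_win / avg_loss                 # payoff ratio
    conviction_ratio = conviction / voters # assumption: agreeing agents / voters
    return (p * b - q) / b * conviction_ratio * REGIME_MULT[regime]

def position_size(edge, conviction, bankroll=100_000, kelly_fraction=0.35,
                  max_per_trade=1_000, max_position_pct=0.05):
    """Fractional-Kelly size in GBP, clamped by the preset and bankroll caps."""
    if edge <= 0:
        return 0.0
    raw = edge * kelly_fraction * bankroll * CONVICTION_MULT[conviction]
    return min(raw, max_per_trade, bankroll * max_position_pct)

def pounds_per_point(size, stop_distance):
    """IG sizing: stake per point for a given stop distance."""
    return size / stop_distance

# Hypothetical history: 12 wins, 18 losses, average win 3x the average loss.
edge = kelly_edge(12, 18, 300.0, 100.0, conviction=4, regime="trending")
size = position_size(edge, conviction=4)
```

With the default parameters, any edge at or above the 8% threshold yields a raw size of at least £2,800 (0.08 × 0.35 × £100,000), so the £1,000 balanced preset cap is the binding limit for those trades.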
5. Technical Indicators

Cipher runs multi-timeframe technical analysis on every active instrument using 6 indicators from the ta library. Each indicator produces a directional signal between -1.0 (strong sell) and +1.0 (strong buy). ATR is used as a modifier for confidence and stop distances, not as a directional signal. Cipher also performs regime detection (trending/ranging/volatile) used by the Arbiter's Regime Gate.

Indicator | Weight | Role | Implementation
Ichimoku Cloud | 30% | Trend direction (price vs kumo, tenkan/kijun cross, chikou span) | ta.trend.IchimokuIndicator
Bollinger Bands | 15% | Mean reversion (%B position, band width, squeeze detection) | ta.volatility.BollingerBands
RSI (14) | 15% | Momentum (oversold/overbought, divergence from price) | ta.momentum.RSIIndicator
EMA Cross (9/21) | 15% | Trend confirmation (fast/slow EMA crossover and spread) | ta.trend.EMAIndicator
MACD | 15% | Momentum + divergence (MACD line vs signal, histogram direction) | ta.trend.MACD
VWAP | 10%* | Institutional flow (price position relative to volume-weighted average) | ta.volume.VolumeWeightedAveragePrice
ATR | modifier | Adjusts confidence ±0.1 and stop distances. Not directional. | ta.volatility.AverageTrueRange

* When volume data is unavailable, VWAP weight drops to 0% and the remaining 5 indicators are redistributed proportionally (33% / 17% / 17% / 17% / 16%).

COMPOSITE SIGNAL (WITH VOLUME)
composite = 0.30 × ichimoku + 0.15 × bollinger + 0.15 × rsi + 0.15 × ema_cross + 0.15 × macd + 0.10 × vwap
COMPOSITE SIGNAL (NO VOLUME)
composite = 0.33 × ichimoku + 0.17 × bollinger + 0.17 × rsi + 0.17 × ema_cross + 0.16 × macd
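The redistribution when volume is missing can be expressed as a renormalisation of the remaining weights. The sketch below is an assumption about how this might be coded: the `composite` function and the use of `None` for a missing indicator are hypothetical. Exact renormalisation gives 33.3% and 16.7% rather than the rounded 33/17/17/17/16 figures shown above.

```python
WEIGHTS = {
    "ichimoku": 0.30, "bollinger": 0.15, "rsi": 0.15,
    "ema_cross": 0.15, "macd": 0.15, "vwap": 0.10,
}

def composite(signals):
    """Weighted average of indicator signals in [-1.0, +1.0].

    Indicators whose signal is missing (None) are dropped and the remaining
    weights are rescaled so they still sum to 1.
    """
    available = {k: w for k, w in WEIGHTS.items() if signals.get(k) is not None}
    total = sum(available.values())
    return sum(signals[k] * w / total for k, w in available.items())

# No volume data: only Ichimoku is bullish, so the composite is 0.30 / 0.90.
no_volume = {"ichimoku": 1.0, "bollinger": 0.0, "rsi": 0.0,
             "ema_cross": 0.0, "macd": 0.0, "vwap": None}
value = composite(no_volume)
```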

Multi-timeframe weights: Cipher analyses three timeframes and combines them into a single directional signal.

Timeframe | IG Resolution | Weight | Role
DAY | Daily candles | 40% | Trend direction: sets the overall bias
HOUR_4 | 4-hour candles | 30% | Structure: confirms or refutes daily trend
MINUTE_15 | 15-minute candles | 30% | Entry timing: fine-grained momentum for entries
MULTI-TIMEFRAME COMPOSITE
final_signal = 0.40 × DAY_composite + 0.30 × HOUR_4_composite + 0.30 × MINUTE_15_composite

direction = BUY      if final_signal > +threshold   // per asset class
            SELL     if final_signal < -threshold   // forex: 0.08, indices: 0.10, commodity: 0.10
            NEUTRAL  otherwise

confidence = f(timeframe_agreement, indicator_agreement, ATR_modifier)
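As a rough illustration of the multi-timeframe rule, the following Python sketch combines three per-timeframe composites and applies the per-asset-class threshold. The function name `direction` and the sample composite values are hypothetical. The confidence function is not specified above, so it is left out.

```python
TIMEFRAME_WEIGHTS = {"DAY": 0.40, "HOUR_4": 0.30, "MINUTE_15": 0.30}
THRESHOLDS = {"forex": 0.08, "indices": 0.10, "commodity": 0.10}

def direction(composites, asset_class):
    """Return (vote, final_signal) from per-timeframe composite signals."""
    final_signal = sum(
        weight * composites[timeframe]
        for timeframe, weight in TIMEFRAME_WEIGHTS.items()
    )
    threshold = THRESHOLDS[asset_class]
    if final_signal > threshold:
        return "BUY", final_signal
    if final_signal < -threshold:
        return "SELL", final_signal
    return "NEUTRAL", final_signal

# Daily and 4H agree on a mild uptrend while the 15M is slightly negative:
# 0.40 * 0.4 + 0.30 * 0.2 + 0.30 * (-0.1) = 0.19, above the 0.10 threshold.
vote, signal = direction({"DAY": 0.4, "HOUR_4": 0.2, "MINUTE_15": -0.1}, "commodity")
```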
6. Time Decay

Edge naturally decays as a position ages. The system uses a quadratic decay function that starts slow and accelerates — matching the intuition that information becomes stale faster over time. When effective edge drops below the threshold, the position is exited.

QUADRATIC DECAY FUNCTION
age_ratio = hours_held / max_hold_hours
decay_factor = 1.0 - (age_ratio²)
effective_edge = raw_edge × decay_factor
Exit when effective_edge < 8%

Maximum hold periods vary by asset class, reflecting the typical information half-life for each market:

Asset Class | Max Hold | Rationale
forex | 96 hours (4d) | Macro themes shift quickly; short-term mean reversion dominates
indices | 144 hours (6d) | Moderate holding period; earnings and data releases drive regime
commodity | 336 hours (14d) | Supply/demand fundamentals evolve over days to weeks

Decay examples for a 10% raw edge across different hold durations:

Hold Time | Age Ratio | Decay Factor | Effective Edge | Status
0h (entry) | 0.00 | 100% | 10.0% | Active
25% of max | 0.25 | 93.8% | 9.4% | Active
50% of max | 0.50 | 75.0% | 7.5% | Below threshold → EXIT
75% of max | 0.75 | 43.8% | 4.4% | Below threshold → EXIT
100% of max | 1.00 | 0.0% | 0.0% | Max hold → FORCE EXIT
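The decay function is small enough to check by hand or in code. The sketch below follows the formula in this section. The function names are hypothetical, and clamping the age ratio at 1.0 is an added assumption so that the factor never goes negative past the maximum hold.

```python
import math

MAX_HOLD_HOURS = {"forex": 96, "indices": 144, "commodity": 336}
EDGE_THRESHOLD = 0.08

def effective_edge(raw_edge, hours_held, asset_class):
    """Quadratic time decay: slow at first, accelerating towards max hold."""
    age_ratio = min(hours_held / MAX_HOLD_HOURS[asset_class], 1.0)
    return raw_edge * (1.0 - age_ratio ** 2)

def should_exit(raw_edge, hours_held, asset_class):
    return effective_edge(raw_edge, hours_held, asset_class) < EDGE_THRESHOLD

def exit_age_ratio(raw_edge, threshold=EDGE_THRESHOLD):
    """Age ratio at which the effective edge first reaches the threshold."""
    return math.sqrt(1.0 - threshold / raw_edge)

quarter = effective_edge(0.10, 84, "commodity")   # 25% of the 336h max hold
half = effective_edge(0.10, 168, "commodity")     # 50% of the 336h max hold
```

For a 10% raw edge, `exit_age_ratio(0.10)` is about 0.447, so a commodity position would cross the 8% threshold after roughly 150 of its 336 hours.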
7. Circuit Breakers

Hard limits that halt all trading when triggered. These are non-negotiable safety mechanisms — the autonomous system cannot override them. Each breaker independently blocks new trade execution.

MAX DAILY LOSS: £5,000
MAX OPEN POSITIONS: 10
MAX PER TRADE: £1,000
MAX DAILY TRADES: 8

What happens when each breaker triggers:

1 Daily loss limit (£5,000): All pending trade opportunities are marked as skipped. No new positions are opened until the next trading day (UTC midnight reset). Existing positions remain open with normal exit rules.
2 Open positions limit (10): New trades are blocked until at least one existing position is closed (via take-profit, stop-loss, decay exit, or manual close). The Arbiter continues voting but the executor queues rather than executes.
3 Per-trade cap (£1,000 balanced): Applied at sizing time. If the Kelly + conviction calculation produces a size exceeding the preset cap, it is clamped. The active “balanced” preset caps at £1,000 per trade; other presets vary (aggressive: £500, conservative: £1,500, sniper: £3,000).
4 Daily trade cap (8): No more than 8 trades per calendar day. Prevents over-trading that historically produced 31 trades/day with 29.6% win rate. Target: 5-8 quality trades.

Circuit breaker status is visible in real-time on the dashboard's POSITIONS tab, with bar charts showing current usage against limits.
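A breaker check of this kind reduces to comparing a few counters against fixed limits. The sketch below is a hypothetical illustration: the function names and breaker identifiers are not taken from the system, and treating a loss of exactly £5,000 as a trigger is an assumption. The per-trade cap is applied at sizing time, so it is not included here.

```python
LIMITS = {"max_daily_loss": 5_000, "max_open_positions": 10, "max_daily_trades": 8}

def tripped_breakers(daily_pnl, open_positions, trades_today):
    """Return the names of all breakers currently blocking new trades."""
    tripped = []
    if daily_pnl <= -LIMITS["max_daily_loss"]:
        tripped.append("max_daily_loss")
    if open_positions >= LIMITS["max_open_positions"]:
        tripped.append("max_open_positions")
    if trades_today >= LIMITS["max_daily_trades"]:
        tripped.append("max_daily_trades")
    return tripped

def can_open_trade(daily_pnl, open_positions, trades_today):
    """Each breaker blocks independently, so any single trip is enough."""
    return not tripped_breakers(daily_pnl, open_positions, trades_today)
```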

8. Reasoning Chains

Every trade executed by Mercurius has a complete reasoning chain stored in the database and visible in the Trade Log. This ensures full transparency — no black box decisions. The chain records the path from raw data to trade execution.

1 Agent votes: Each agent's directional view (BUY / SELL / NEUTRAL), confidence level (0.0 - 1.0), and the specific reasoning behind the vote. Stored in agent_votes table.
2 Consensus: Which agents agreed on direction, the conviction level (3/4/5/6), any dissenting views and their reasoning. Stored in consensus_decisions table.
3 Entry / sizing: Full Kelly criterion calculation with edge, odds, conviction multiplier, asset class weight. Final position size in £/point. Stored with the trade record.
4 Invalidation: Specific conditions that would break the trade thesis, e.g., "Consensus reversed to SELL" or "Edge decayed below 8% after 36h hold". Recorded as exit reason.
5 Exit rules: Which exit condition was triggered (take-profit, stop-loss, stale thesis, consensus reversal, max hold, edge decay). P&L and duration recorded.

Click any trade in the dashboard Trade Log to expand its full reasoning chain. Every field is queryable via the Intelligence Terminal using natural language.

9. Exit Rules

Positions are monitored continuously and exited when any of the following conditions are met. The position manager checks these rules on every cycle (every 5 minutes for active positions).

Exit Rule | Condition | Priority | Description
Take Profit | 85% of target | 1 | Lock in profits at 85% of the original target move. Research from Polymarket analysis shows that waiting for 100% leaves significant profit on the table due to mean reversion.
Stale Thesis | 24h + <2% move | 2 | If the position has been open for 24 hours and price has moved less than 2% from entry, the thesis is considered stale. The market has not confirmed the edge and capital should be freed.
Consensus Reversal | Direction flips | 3 | If the Agent Council reaches a new consensus in the opposite direction of the open position, the position is exited immediately. The collective intelligence has changed its view.
Max Hold Period | Per asset class | 4 | Forex: 96h (4d), Indices: 144h (6d), Commodity: 336h (14d). Forces capital rotation and prevents indefinite holds on decaying edges.
Edge Threshold | Effective edge < 8% | 5 | When the time-decayed effective edge drops below 8% (the entry threshold), the position no longer justifies the risk. Exit regardless of P&L status.

Exit priority: When multiple exit conditions trigger simultaneously, the highest priority rule (lowest number) determines the recorded exit reason. All applicable conditions are still logged for analysis.
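The priority rule amounts to picking the first triggered condition from a fixed ordering while keeping the full list for logging. A minimal sketch, with hypothetical identifier names for the five rules:

```python
# Ordered from highest priority (1) to lowest (5).
EXIT_PRIORITY = [
    "take_profit", "stale_thesis", "consensus_reversal",
    "max_hold", "edge_threshold",
]

def resolve_exit(triggered):
    """Return (recorded_reason, all_triggered_in_priority_order).

    recorded_reason is None when no exit condition has fired.
    """
    ordered = [rule for rule in EXIT_PRIORITY if rule in triggered]
    recorded = ordered[0] if ordered else None
    return recorded, ordered

# Edge decay and a stale thesis fire on the same cycle:
reason, logged = resolve_exit({"edge_threshold", "stale_thesis"})
```

Here `reason` is `"stale_thesis"` (priority 2 beats priority 5), while `logged` keeps both conditions for later analysis.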

TAKE-PROFIT CALCULATION
target_move = entry_price × edge × direction_sign
take_profit_price = entry_price + (target_move × 0.85)

Example: BUY EUR/USD at 1.0850 with 7% edge
target_move = 1.0850 × 0.07 = 0.0760
take_profit = 1.0850 + (0.0760 × 0.85) = 1.1496
STALE THESIS CHECK
hours_held = (now - entry_time).total_seconds() / 3600
price_change_pct = abs(current_price - entry_price) / entry_price
if hours_held >= 24 and price_change_pct < 0.02:
    exit(reason="stale_thesis")
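For reference, both calculations can be written as small self-contained Python functions. This is a sketch under assumptions: the function names and the sample timestamps are hypothetical, and `direction_sign` is taken to be +1 for BUY and -1 for SELL.

```python
from datetime import datetime, timedelta

def take_profit_price(entry_price, edge, direction, capture=0.85):
    """Price at which 85% of the target move has been realised."""
    direction_sign = 1 if direction == "BUY" else -1
    target_move = entry_price * edge * direction_sign
    return entry_price + target_move * capture

def is_stale(entry_time, now, entry_price, current_price):
    """True after 24h or more with less than a 2% move from entry."""
    hours_held = (now - entry_time).total_seconds() / 3600
    price_change_pct = abs(current_price - entry_price) / entry_price
    return hours_held >= 24 and price_change_pct < 0.02

# Reproduces the worked example: BUY at 1.0850 with a 7% edge.
tp = round(take_profit_price(1.0850, 0.07, "BUY"), 4)

# Hypothetical position held 25 hours with a 1% move.
opened = datetime(2025, 1, 6, 9, 0)
stale = is_stale(opened, opened + timedelta(hours=25), 100.0, 101.0)
```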
10. Agent Rating System

Every agent is scored on accuracy using an Exponential Moving Average (EMA) that weights recent performance more heavily. Scores are tracked per agent, per instrument, and per market regime. Weights influence the Arbiter's consensus calculation.

EMA ALPHA: 0.15
WEIGHT RANGE: 0.2 to 3.0
COLD START: 1.0 (equal)
MIN EVALUATIONS: 10
EMA ACCURACY UPDATE
new_accuracy = alpha × outcome + (1 - alpha) × old_accuracy
  where outcome = 1.0 if vote direction matched price move, else 0.0
        alpha = 0.15 (recent results weighted ~6x more than distant past)

weight = (accuracy ^ 1.5) × 3   // non-linear: rewards top performers
weight is clamped to [0.2, 3.0] range

// Example weights (current agents):
//   Cipher   (83.7% acc) → 2.30 weight
//   Pulse    (44.3% acc) → 0.89 weight
//   Sentinel (36.6% acc) → 0.66 weight
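The update and weight formulas can be checked with a short Python sketch. The function names are hypothetical; the constants come from the figures in this section.

```python
EMA_ALPHA = 0.15
MIN_WEIGHT, MAX_WEIGHT = 0.2, 3.0

def update_accuracy(old_accuracy, vote_was_correct, alpha=EMA_ALPHA):
    """Exponential moving average of a 0/1 outcome series."""
    outcome = 1.0 if vote_was_correct else 0.0
    return alpha * outcome + (1 - alpha) * old_accuracy

def agent_weight(accuracy):
    """Non-linear weight, clamped to the allowed range."""
    raw = (accuracy ** 1.5) * 3
    return min(MAX_WEIGHT, max(MIN_WEIGHT, raw))

# One correct vote lifts a 50% agent to 0.15 * 1.0 + 0.85 * 0.5 = 0.575.
after_win = update_accuracy(0.50, True)
cipher_weight = agent_weight(0.837)
```

Because the published accuracies are rounded to one decimal place, recomputed weights may differ from the listed examples in the second decimal (for instance, 44.3% gives about 0.88 rather than 0.89).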

Weight lookup priority: The Arbiter looks up agent weights in this order: (1) agent + instrument + regime, (2) agent + instrument, (3) agent global, (4) default 1.0. This ensures that an agent proven accurate on a specific instrument in a specific regime gets the highest influence.
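The lookup order is a most-specific-first fallback. A minimal sketch, assuming weights are held in a dictionary keyed by `(agent, instrument, regime)` tuples with `None` as a wildcard; that storage layout and the sample values are hypothetical.

```python
DEFAULT_WEIGHT = 1.0

def lookup_weight(weights, agent, instrument, regime):
    """Return the most specific weight available for this agent."""
    candidates = (
        (agent, instrument, regime),  # 1. agent + instrument + regime
        (agent, instrument, None),    # 2. agent + instrument
        (agent, None, None),          # 3. agent global
    )
    for key in candidates:
        if key in weights:
            return weights[key]
    return DEFAULT_WEIGHT             # 4. default

# Hypothetical weights table.
weights = {
    ("Oracle", "XAU/USD", "trending"): 2.4,
    ("Oracle", "XAU/USD", None): 1.8,
    ("Oracle", None, None): 1.3,
}
specific = lookup_weight(weights, "Oracle", "XAU/USD", "trending")
```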

Weighted consensus: In addition to the raw vote count (≥3 of 6 agents), a consensus can also trigger when 3+ agents agree and their combined weighted score ≥ 2.5. This rewards agents with strong track records. Votes below 0.25 confidence are filtered out before consensus counting.

Self-improvement loop: A daily job (06:00 UTC) scans agent_scores for agent-instrument combos with 20+ evaluations and <30% accuracy, and caps their weight at MIN_WEIGHT (0.2). This closes a feedback loop: strategy reviews had repeatedly recommended these caps, but they were never implemented.

11. Regime Detection

Every 15 minutes, each instrument is classified into one of three market regimes using a combination of ADX (trend strength), ATR (volatility), and Bollinger Band width. The detected regime directly gates trade execution via the Regime Gate (Layer 5 of the guard system) and influences agent weights.

Regime | Detection | Effect
Trending | ADX > 25 | All trades allowed. Trend-following agents (Cipher, Sentinel) get higher weight. Wider stops allowed. SP500/NAS100 only trade in this regime.
Volatile | ATR above 1.5× 20-period average OR BB width in top quartile | Requires conviction ≥4. Tradeable but needs stronger consensus. Position sizes reduced.
Ranging | Default (neither trending nor volatile) | BLOCKED for indices. Commodities pass through (fundamental edge persists regardless of technical regime). FTSE100 is exempt from the regime gate. Historical ranging WR: 8.3%.
REGIME CLASSIFIER
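import ta  # high, low, close are pandas Series of candle data

adx = ta.trend.ADXIndicator(high, low, close, window=14).adx()
atr = ta.volatility.AverageTrueRange(high, low, close, window=14).average_true_range()
bb_width = ta.volatility.BollingerBands(close, window=20).bollinger_wband()

# Compare the latest values against their own history
atr_ratio = atr.iloc[-1] / atr.rolling(20).mean().iloc[-1]
bb_top_quartile = bb_width.iloc[-1] > bb_width.quantile(0.75)

if adx.iloc[-1] > 25:
    regime = "trending"
elif atr_ratio > 1.5 or bb_top_quartile:
    regime = "volatile"
else:
    regime = "ranging"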
12. Live Safety

Before deploying to live trading, Mercurius enforces a preflight checklist and graduated safety measures. These ensure the system has been thoroughly validated in paper mode before risking real capital.

GRADUATED SIZING: 50% first week
MAX DRAWDOWN: 15% breaker
EMERGENCY STOP: Closes all
PREFLIGHT CHECKS: 8 required

Preflight checklist (all 8 must pass before live mode is activated):

1 IG authentication: Valid API credentials and session established.
2 Account balance: Sufficient funds in IG account for minimum position sizing.
3 Agent votes: At least one recent vote cycle completed successfully.
4 Paper trading days: Minimum number of paper trading days completed.
5 Paper trade count: Minimum number of paper trades executed.
6 Paper win rate: Win rate meets minimum threshold in paper mode.
7 Circuit breakers: All breakers in normal state (not triggered).
8 Database integrity: All required tables exist and are populated.

Emergency stop can be triggered via the dashboard kill switch or CLI (python -m mercurius stop). It immediately closes all open positions at market price and halts the scheduler. Available via POST /api/emergency-stop.

13. Event Guard Mechanism

The Event Guard is a gatekeeper agent that does not vote directionally. Instead, it votes BLOCK or CLEAR per instrument. When a BLOCK is active, the Arbiter will not execute trades even if consensus exists. The trade is recorded as "blocked" and deferred.

BLOCK BEFORE: 2 hours
BLOCK AFTER: 1 hour
CYCLE: 30 min

Data sources: Economic calendar (high-impact events), Nager.Date API (market holidays), and NFP first-Friday pattern detection.

Currency mapping: USD events block SP500, NAS100, XAU/USD, XAG/USD, OIL_BRENT. GBP events block FTSE100. Events are mapped to active instruments only.

1 Event Guard scans economic calendar + Nager.Date + NFP pattern
2 Maps events to affected instruments via currency impact table
3 If instrument is within blocking window (2h before → 1h after) → BLOCK
4 Arbiter checks Event Guard before executing → blocked trades are recorded but not placed
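The four steps above can be sketched as a window check over a list of scheduled events. This is an illustrative sketch only: the function name, the event tuple format, the sample event time, and the inclusive window boundaries are assumptions. The currency mapping follows the one listed in this section.

```python
from datetime import datetime, timedelta, timezone

BLOCK_BEFORE = timedelta(hours=2)
BLOCK_AFTER = timedelta(hours=1)

# Currency impact table for active instruments.
CURRENCY_IMPACT = {
    "USD": {"SP500", "NAS100", "XAU/USD", "XAG/USD", "OIL_BRENT"},
    "GBP": {"FTSE100"},
}

def event_guard(instrument, now, events):
    """Return "BLOCK" if any mapped high-impact event's window covers now.

    events is an iterable of (currency, event_time) pairs.
    """
    for currency, event_time in events:
        if instrument not in CURRENCY_IMPACT.get(currency, set()):
            continue
        if event_time - BLOCK_BEFORE <= now <= event_time + BLOCK_AFTER:
            return "BLOCK"
    return "CLEAR"

# Hypothetical USD release at 12:30 UTC, checked 90 minutes beforehand.
release = datetime(2025, 6, 6, 12, 30, tzinfo=timezone.utc)
events = [("USD", release)]
status = event_guard("SP500", release - timedelta(minutes=90), events)
```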
14. Intermarket Awareness

Intermarket correlations are monitored as context for the remaining agents (particularly Oracle and Atlas) rather than as a standalone voting agent. The dedicated Correlation agent was retired in May 2025 due to 11-27% accuracy, but the relationships remain relevant reference data.

Reference | Instrument | Expected Correlation | Used By
Dollar Index (DXY) | XAU/USD | Negative | Oracle (fundamental context)
VIX | SP500 | Negative | Atlas (macro regime)
Dollar Index (DXY) | OIL_BRENT | Negative | Oracle (fundamental context)
Gold (GC=F) | XAG/USD | Positive | Oracle (precious metals)

Contrarian logic is now embedded within the Pulse agent rather than operating as a standalone agent. When IG client sentiment exceeds 75% one-sided, or Fear & Greed hits extremes (>85 greed or <15 fear), Pulse factors this as a contrarian signal in its vote. This consolidation reduced noise while preserving the signal.
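The extremes described above can be expressed as simple threshold checks. The sketch below is an assumption about the shape of this logic, not Pulse's actual code. The function name and return format are hypothetical, as is the way the two signals are reported separately rather than blended into a vote. "75% one-sided" is read as more than 75% of clients either long or short.

```python
def contrarian_signals(ig_long_pct, fear_greed):
    """Return (source, fade_direction) pairs for sentiment extremes.

    ig_long_pct: percentage of IG clients holding long positions (0-100).
    fear_greed:  Fear & Greed index value (0-100).
    """
    signals = []
    if ig_long_pct > 75:          # crowd heavily long: fade by leaning SELL
        signals.append(("ig_sentiment", "SELL"))
    elif ig_long_pct < 25:        # crowd heavily short: fade by leaning BUY
        signals.append(("ig_sentiment", "BUY"))
    if fear_greed > 85:           # extreme greed
        signals.append(("fear_greed", "SELL"))
    elif fear_greed < 15:         # extreme fear
        signals.append(("fear_greed", "BUY"))
    return signals

crowded_long = contrarian_signals(ig_long_pct=82, fear_greed=60)
```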

15. Strategy Review System

The Reviewer agent is NOT a voting agent. It runs on a separate schedule and provides two types of analysis:

A Daily Performance Report (22:00 UTC): P&L breakdown, win rate, agent accuracy, best/worst trades, conviction efficiency, risk metrics. Claude Haiku generates narrative synthesis.
B Weekly Strategy Review (Sunday 20:00 UTC): Agent combination analysis, regime-specific performance, instrument ranking, exit analysis, week-over-week comparison. Full Haiku synthesis with actionable recommendations.

Metrics computed: Total P&L, win rate, profit factor, Sharpe ratio, max drawdown, expectancy, average winner/loser, R-multiple, best/worst streaks, time-in-trade distribution, and agent accuracy by regime.

All reviews are stored in the strategy_reviews table and visible in the REVIEW tab of the dashboard. The review provides a continuous feedback loop for the trading system, identifying what's working, what isn't, and what to adjust.

Self-improving ML pipeline: The XGBoost trade filter retrains weekly (Sunday 23:00 UTC) on all closed positions. Each retrain cycle: (1) compares accuracy and AUC against the previous model, (2) recalibrates block/boost thresholds relative to base win rate, (3) audits live predictions against actual outcomes, (4) reports shadow P&L stats for blocked trades, and (5) sends a comprehensive Telegram report with all metrics.

Weekly Telegram digest anomaly flags: The weekly performance report includes automated anomaly detection — flags for trade drought (5+ days without a trade), win rate drops (>15% vs prior week), agent accuracy falling below 30%, concentration risk (single instrument >60% of P&L), high ML block rate (>50%), shadow P&L suggesting over-aggressive guards, and significant drawdowns.