Research Work in Progress

Portfolio Allocation as Layered System Optimization

A research exploration into reframing portfolio construction not as a single prediction problem, but as a stacked optimization architecture — where candidate screening, regime detection, position sizing, and risk controls operate as independent tunable layers with walk-forward provenance tracking.

⚠ Work in Progress

This page outlines the research direction, methodology, and preliminary findings. The study is actively evolving — additional pilot universes, regime classifiers, and PufferLib-based RL allocation layers are in development. The current implementation runs screened walk-forward backtests with provenance ledgers tracked per experiment run.

Why Layered Optimization?

Traditional portfolio optimization typically treats the problem as a single-stage mathematical program: given a universe of assets and some return/covariance estimates, find the weight vector that optimizes a risk-adjusted objective (maximum Sharpe or Sortino, minimum variance, etc.). This works cleanly in theory but struggles in practice because real markets exhibit regime shifts, non-stationary correlations, and survivorship effects that a flat optimization cannot anticipate.

The key insight is that portfolio construction naturally decomposes into distinct decisions — which assets to consider, what market regime we're in, how to size positions, and what risk constraints to enforce. Each of these can be treated as a separate optimization layer with its own objective, state representation, and feedback signal. The layers are stacked: outputs from one layer become inputs or constraints for the next.

Layer 1 — Pilot Screening

Filters the investable universe using fundamental and technical criteria. Produces a screened candidate set per rebalance period.

Layer 2 — Attractiveness Scoring

Assigns a composite score to each candidate using multi-indicator fusion: momentum, mean-reversion, and volatility-normalized signals.

Layer 3 — Regime Detection

Classifies the current market regime (trending, mean-reverting, high-vol, crisis) to modulate risk budgets and strategy weights.

Layer 4 — Allocation

Converts scores and regime into position weights. Supports rule-based sizing, mean-variance optimization, and RL policy approaches.
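The four layers above can be sketched as a simple pipeline in which each layer's output feeds the next. This is a toy illustration with made-up criteria and thresholds, not the study's actual screening rules or allocation logic:

```python
import numpy as np

def screen(universe, returns):
    """Layer 1: keep assets with positive trailing mean return (toy criterion)."""
    return [a for a in universe if returns[a].mean() > 0]

def score(candidates, returns):
    """Layer 2: momentum over volatility, a crude stand-in for multi-indicator fusion."""
    return {a: returns[a].sum() / (returns[a].std() + 1e-9) for a in candidates}

def detect_regime(market_returns):
    """Layer 3: crude volatility-threshold regime label."""
    return "high_vol" if market_returns.std() > 0.02 else "calm"

def allocate(scores, regime, top_n=2):
    """Layer 4: equal-weight top-N, halving gross exposure in high-vol regimes."""
    top = sorted(scores, key=scores.get, reverse=True)[:top_n]
    if not top:
        return {}
    gross = 0.5 if regime == "high_vol" else 1.0
    return {a: gross / len(top) for a in top}

# Toy daily returns for three assets; each layer's output becomes the next layer's input.
rng = np.random.default_rng(0)
returns = {a: rng.normal(0.001, 0.01, 60) for a in ["AAA", "BBB", "CCC"]}
market = np.mean(list(returns.values()), axis=0)

candidates = screen(list(returns), returns)
weights = allocate(score(candidates, returns), detect_regime(market))
```

The point of the decomposition is visible even in this sketch: each function has its own narrow input/output contract, so any layer can be swapped out (e.g., the rule-based `allocate` for an RL policy) without touching the others.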

Methodology

Walk-Forward Validation

All backtests use a strict walk-forward (anchored expanding window) framework to avoid look-ahead bias. Each rebalance period trains on all data up to that point, generates a forward allocation, and records out-of-sample returns. No future information leaks into any layer's parameters.
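The anchored expanding-window scheme can be sketched as a split generator. The split sizes here are illustrative; the key property is that every training window is anchored at period 0 and strictly precedes its test window:

```python
def walk_forward_splits(n_periods, initial_train, test_size=1):
    """Anchored expanding window: each split trains on all periods from 0 up to
    the split point, then evaluates out-of-sample on the next `test_size` periods."""
    splits = []
    start = initial_train
    while start + test_size <= n_periods:
        train_idx = list(range(0, start))            # anchored at period 0, grows each split
        test_idx = list(range(start, start + test_size))
        splits.append((train_idx, test_idx))
        start += test_size
    return splits

# 10 rebalance periods, first 6 reserved for the initial fit
for train, test in walk_forward_splits(10, 6):
    print(len(train), test)
```

Because every test index is strictly greater than every training index, no layer's parameters can be fit on information from its own evaluation window.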

Provenance Ledgers

Every experiment run produces a provenance ledger: a structured log of which screening criteria were active, what regime was detected per period, which signals contributed to the attractiveness score, and the resulting allocation vector. This makes it possible to trace any portfolio outcome back to the specific decisions made at each layer — essential for debugging and understanding what drives performance.
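A ledger entry of this kind might look like the following, serialized as one JSON line per rebalance period. The field names and values here are hypothetical, chosen only to mirror the four pieces of information the text describes:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class LedgerEntry:
    period: str                  # rebalance date
    screen_criteria: list        # which Layer-1 filters were active
    regime: str                  # Layer-3 label for this period
    signal_contributions: dict   # signal name -> contribution to the Layer-2 score
    allocation: dict             # ticker -> final Layer-4 weight

entry = LedgerEntry(
    period="2021-03-31",
    screen_criteria=["min_liquidity", "positive_momentum"],
    regime="trending",
    signal_contributions={"momentum": 0.6, "mean_reversion": 0.25, "vol_norm": 0.15},
    allocation={"AAA": 0.5, "BBB": 0.5},
)
# Appending one such line per period yields the run's full provenance ledger
line = json.dumps(asdict(entry))
print(line)
```

An append-only, line-delimited format like this makes each portfolio outcome greppable back to the exact layer decisions that produced it.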

Pilot Universe Design

Multiple screened pilot universes are tested in parallel.

Regime Detection Approaches

Several regime classifiers are compared.
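As one simple baseline in this family, a rolling volatility threshold combined with a trend test can separate the four regime labels used above. This classifier and its thresholds are illustrative assumptions, not the study's calibrated models:

```python
import numpy as np

def classify_regime(returns, vol_window=20, vol_threshold=0.02, trend_window=60):
    """Toy rule-based classifier: label by recent volatility first, then by trend.
    Thresholds are illustrative, not calibrated values from the study."""
    vol = returns[-vol_window:].std()
    if vol > 2 * vol_threshold:
        return "crisis"
    if vol > vol_threshold:
        return "high_vol"
    # Trend is significant if the cumulative move exceeds a vol-scaled noise band
    trend = returns[-trend_window:].sum()
    return "trending" if abs(trend) > vol * np.sqrt(trend_window) else "mean_reverting"

flat, drift, choppy = np.zeros(120), np.full(120, 0.002), np.tile([0.05, -0.05], 60)
print(classify_regime(flat), classify_regime(drift), classify_regime(choppy))
# → mean_reverting trending crisis
```

More sophisticated alternatives (hidden Markov models, clustering on rolling features) fit the same one-label-per-period interface, which is what lets classifiers be compared head-to-head within the layer.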

RL Allocation Layer (In Development)

The current rule-based allocation layer (equal-weight top-N, volatility-weighted) serves as a strong baseline. The next phase replaces this with a reinforcement learning policy trained using PufferLib, where the state space includes the attractiveness scores, regime probabilities, and portfolio-level risk metrics. The action space is continuous position sizing, with the reward function defined as risk-adjusted return over a forward window.

This is where the layered approach shines: the RL agent doesn't need to learn screening or regime detection from raw price data — those are already handled by lower layers. Its job is strictly to answer: "given these signals and this regime, how much should I allocate?" This reduces the state space dramatically and should lead to faster, more stable training.
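The state/action/reward interface described above can be sketched as a gym-style environment. This is a minimal stand-in, not the project's PufferLib integration: the observation layout, the softmax position sizing, and the per-step reward definition are all illustrative assumptions (PufferLib can wrap environments with this reset/step shape, but the actual wiring is still in development):

```python
import numpy as np

class AllocationEnv:
    """Minimal allocation-layer environment (gym-style reset/step interface).
    Observations are the lower layers' outputs: per-asset attractiveness
    scores, regime probabilities, and a portfolio-level risk metric."""

    def __init__(self, scores, regime_probs, forward_returns):
        self.scores = scores                    # (T, n_assets)
        self.regime_probs = regime_probs        # (T, n_regimes)
        self.forward_returns = forward_returns  # (T, n_assets)
        self.t = 0

    def _obs(self):
        risk = np.array([self.forward_returns[: self.t + 1].std()])
        return np.concatenate([self.scores[self.t], self.regime_probs[self.t], risk])

    def reset(self):
        self.t = 0
        return self._obs()

    def step(self, action):
        weights = np.exp(action) / np.exp(action).sum()     # softmax -> long-only weights
        pnl = float(weights @ self.forward_returns[self.t])  # one-period portfolio return
        vol = self.forward_returns[self.t].std() + 1e-9
        reward = pnl / vol                                   # crude risk-adjusted reward
        self.t += 1
        done = self.t >= len(self.scores)
        obs = None if done else self._obs()
        return obs, reward, done, {}

rng = np.random.default_rng(2)
env = AllocationEnv(rng.normal(size=(5, 3)), rng.dirichlet(np.ones(4), 5),
                    rng.normal(0.001, 0.01, (5, 3)))
obs = env.reset()
obs, reward, done, _ = env.step(np.zeros(3))  # zero logits = equal-weight action
```

Note how small the observation is (3 scores + 4 regime probabilities + 1 risk scalar) compared to an agent that would have to ingest raw price histories — this is the state-space reduction the layering buys.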

Current Findings

Preliminary walk-forward results across multiple pilot universes show several consistent patterns.

Next steps: Complete PufferLib RL integration for the allocation layer, add cross-asset-class pilot universes, implement transaction cost modeling, and produce a formal paper with full statistical significance testing across all layer combinations.

📁 Fractal Research Repo ↗