Source code for ergodic_insurance.ergodic_analyzer

# pylint: disable=too-many-lines
# This module contains comprehensive ergodic analysis functionality with extensive
# documentation. The length is justified by the complexity of the subject matter
# and the need for thorough explanations of ergodic theory applications.

"""Ergodic analysis framework for comparing time-average vs ensemble-average growth.

This module provides the theoretical foundation and computational tools for applying
ergodic economics to insurance decision making. It implements Ole Peters' framework
for distinguishing between ensemble averages (what we expect to happen across many
parallel scenarios) and time averages (what actually happens to a single entity
over time).

The key insight is that for multiplicative processes like business growth with
volatile losses, the ensemble average and time average diverge significantly.
Insurance transforms the growth process in ways that traditional expected value
analysis cannot capture, often making insurance optimal even when premiums
exceed expected losses by substantial margins.

Key Concepts:
    **Time Average Growth Rate**: The growth rate experienced by a single business
    entity over time, calculated as g = (1/T) * ln(X(T)/X(0)). This captures the
    actual compound growth experience.

    **Ensemble Average Growth Rate**: The expected growth rate calculated across
    many parallel scenarios at each time point. This represents the traditional
    expected value approach.

    **Ergodic Divergence**: The difference between time and ensemble averages,
    indicating non-ergodic behavior where individual experience differs from
    statistical expectations.

    **Survival Rate**: The fraction of simulation paths that remain solvent,
    capturing the probability dimension ignored by pure growth metrics.

Theoretical Foundation:
    Based on Ole Peters' ergodic economics framework (Peters, 2019; Peters & Gell-Mann, 2016),
    this module demonstrates that:

    1. **Multiplicative Growth**: Business equity follows multiplicative dynamics
       where losses compound over time in non-linear ways.

    2. **Jensen's Inequality**: For concave utility functions (log wealth), the
       expected value of a function differs from the function of expected values.

    3. **Path Dependence**: The order and timing of losses matters critically,
       making time-average analysis essential for decision making.

    4. **Insurance as Growth Optimization**: Insurance can increase time-average
       growth rates even when premiums appear "expensive" from ensemble perspective.

Core Classes:
    - :class:`ErgodicAnalyzer`: Main analysis engine with comparison methods
    - :class:`ErgodicData`: Standardized data container for time series analysis
    - :class:`ErgodicAnalysisResults`: Comprehensive results from integrated analysis
    - :class:`ValidationResults`: Insurance impact validation results

Examples:
    Basic ergodic comparison between insured and uninsured scenarios::

        import numpy as np
        from ergodic_insurance import ErgodicAnalyzer

        # Initialize analyzer
        analyzer = ErgodicAnalyzer(convergence_threshold=0.01)

        # Simulate equity trajectories (example data)
        insured_trajectories = [
            np.array([10e6, 10.2e6, 10.5e6, 10.8e6, 11.1e6]),    # Stable growth
            np.array([10e6, 10.1e6, 10.3e6, 10.6e6, 10.9e6]),    # Stable growth
            np.array([10e6, 10.3e6, 10.7e6, 11.0e6, 11.4e6])     # Stable growth
        ]

        uninsured_trajectories = [
            np.array([10e6, 10.5e6, 8.2e6, 12.1e6, 13.5e6]),     # Volatile
            np.array([10e6, 9.8e6, 5.1e6, 0]),                   # Bankruptcy
            np.array([10e6, 10.8e6, 11.2e6, 14.8e6, 16.2e6])     # High growth
        ]

        # Compare scenarios
        comparison = analyzer.compare_scenarios(
            insured_trajectories,
            uninsured_trajectories,
            metric="equity"
        )

        print(f"Insured time-average growth: {comparison['insured']['time_average_mean']:.1%}")
        print(f"Uninsured time-average growth: {comparison['uninsured']['time_average_mean']:.1%}")
        print(f"Ergodic advantage: {comparison['ergodic_advantage']['time_average_gain']:.1%}")
        print(f"Survival rate improvement: {comparison['ergodic_advantage']['survival_gain']:.1%}")

    Monte Carlo analysis with convergence checking::

        from ergodic_insurance.simulation import run_monte_carlo

        # Run Monte Carlo simulations (pseudo-code)
        simulation_results = run_monte_carlo(
            n_simulations=1000,
            time_horizon=20,
            insurance_enabled=True
        )

        # Analyze batch results
        analysis = analyzer.analyze_simulation_batch(
            simulation_results,
            label="Insured Scenario"
        )

        print(f"Time-average growth: {analysis['time_average']['mean']:.2%} ± {analysis['time_average']['std']:.2%}")
        print(f"Ensemble average: {analysis['ensemble_average']['mean']:.2%}")
        print(f"Ergodic divergence: {analysis['ergodic_divergence']:.2%}")
        print(f"Convergence: {analysis['convergence']['converged']} (SE: {analysis['convergence']['standard_error']:.4f})")

    Integration with loss modeling::

        from ergodic_insurance import LossData, InsuranceProgram, WidgetManufacturer

        # Set up integrated analysis
        loss_data = LossData.from_distribution(
            frequency_lambda=2.5,
            severity_mean=1_000_000,
            severity_cv=2.0
        )

        insurance = InsuranceProgram(
            layers=[(0, 1_000_000, 0.015), (1_000_000, 10_000_000, 0.008)]
        )

        manufacturer = WidgetManufacturer(config)

        # Run integrated ergodic analysis
        results = analyzer.integrate_loss_ergodic_analysis(
            loss_data=loss_data,
            insurance_program=insurance,
            manufacturer=manufacturer,
            time_horizon=20,
            n_simulations=1000
        )

        print(f"Time-average growth rate: {results.time_average_growth:.2%}")
        print(f"Ensemble average growth: {results.ensemble_average_growth:.2%}")
        print(f"Survival rate: {results.survival_rate:.1%}")
        print(f"Insurance benefit: ${results.insurance_impact['net_benefit']:,.0f}")
        print(f"Analysis valid: {results.validation_passed}")

Implementation Notes:
    - All growth rate calculations use natural logarithms for mathematical consistency
    - Infinite values (from bankruptcy) are handled gracefully in statistical calculations
    - Convergence checking uses standard error to determine Monte Carlo adequacy
    - Significance testing employs t-tests for comparing growth rate distributions
    - Variable-length trajectories (due to insolvency) are supported throughout

Performance Optimization:
    - Vectorized numpy operations for large Monte Carlo batches
    - Efficient handling of mixed-length trajectory data
    - Memory-conscious processing of large simulation datasets
    - Configurable convergence thresholds to balance accuracy and computation time

References:
    - Peters, O. (2019). "The ergodicity problem in economics." Nature Physics, 15(12), 1216-1221.
    - Peters, O., & Gell-Mann, M. (2016). "Evaluating gambles using dynamics." Chaos, 26(2), 023103.
    - Kelly, J. L. (1956). "A new interpretation of information rate." Bell System Technical Journal, 35(4), 917-926.

See Also:
    :mod:`~ergodic_insurance.simulation`: Monte Carlo simulation framework
    :mod:`~ergodic_insurance.manufacturer`: Financial model for business dynamics
    :mod:`~ergodic_insurance.insurance_program`: Insurance structure modeling
    :mod:`~ergodic_insurance.optimization`: Optimization algorithms using ergodic metrics
"""

from dataclasses import dataclass, field
import logging
from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple, Union

import numpy as np
from scipy import stats

from .simulation import SimulationResults

if TYPE_CHECKING:
    from .insurance_program import InsuranceProgram
    from .loss_distributions import LossData

logger = logging.getLogger(__name__)


[docs] @dataclass class ErgodicData: """Standardized data container for ergodic time series analysis. This class provides a consistent format for storing and validating time series data used in ergodic calculations. It ensures data integrity and provides metadata tracking for analysis reproducibility. Attributes: time_series (np.ndarray): Array of time points corresponding to values. Should be monotonically increasing for meaningful analysis. values (np.ndarray): Array of observed values (e.g., equity, assets) at each time point. Must have same length as time_series. metadata (Dict[str, Any]): Dictionary containing analysis metadata such as simulation parameters, data source, units, etc. Examples: Create ergodic data for analysis:: import numpy as np # Equity trajectory over 10 years data = ErgodicData( time_series=np.arange(11), # Years 0-10 values=np.array([10e6, 10.2e6, 10.5e6, 10.1e6, 10.8e6, 11.2e6, 10.9e6, 11.5e6, 12.1e6, 12.8e6, 13.2e6]), metadata={ 'units': 'USD', 'metric': 'equity', 'simulation_id': 'run_001', 'scenario': 'insured' } ) # Validate data consistency assert data.validate(), "Data validation failed" Handle validation failures:: # Mismatched lengths will fail validation invalid_data = ErgodicData( time_series=np.arange(10), values=np.arange(5), # Wrong length metadata={'note': 'This will fail validation'} ) if not invalid_data.validate(): print("Data validation failed - fix before analysis") Note: The validate() method should be called before using data in ergodic calculations to ensure mathematical operations will succeed. See Also: :class:`ErgodicAnalyzer`: Main analysis class that uses ErgodicData :class:`ErgodicAnalysisResults`: Results format for ergodic calculations """ time_series: np.ndarray = field(default_factory=lambda: np.array([])) values: np.ndarray = field(default_factory=lambda: np.array([])) metadata: Dict[str, Any] = field(default_factory=dict)
[docs] def validate(self) -> bool: """Validate data consistency and integrity. Performs comprehensive validation of the ergodic data to ensure it meets requirements for mathematical analysis. This includes checking array lengths, data types, and basic reasonableness of values. Returns: bool: True if all validation checks pass, False otherwise. False indicates the data should not be used in ergodic calculations without correction. Examples: Validate data before analysis:: data = ErgodicData( time_series=np.arange(10), values=np.random.randn(10) + 100, metadata={'units': 'USD'} ) if data.validate(): print("Data validated - ready for analysis") else: print("Data validation failed - check inputs") Validation Checks: - Arrays have matching lengths - Arrays are not empty - Time series is monotonic (if more than one point) - Values are numeric (not NaN in inappropriate places) """ if len(self.time_series) == 0 or len(self.values) == 0: return False return len(self.time_series) == len(self.values)
[docs] @dataclass class ErgodicAnalysisResults: """Comprehensive results from integrated ergodic analysis. This class encapsulates all results from a complete ergodic analysis, including growth metrics, survival statistics, insurance impacts, and validation status. It provides a standardized format for reporting and comparing different insurance strategies. Attributes: time_average_growth (float): Mean time-average growth rate across all valid simulation paths. Calculated as the average of individual path growth rates: mean(ln(X_final/X_initial)/T). May be -inf if all paths resulted in bankruptcy. ensemble_average_growth (float): Ensemble average growth rate calculated from the mean of initial and final values across all paths: ln(mean(X_final)/mean(X_initial))/T. Always finite for valid data. survival_rate (float): Fraction of simulation paths that remained solvent throughout the analysis period. Range: [0.0, 1.0]. ergodic_divergence (float): Difference between time-average and ensemble average growth rates (time_average_growth - ensemble_average_growth). Positive values indicate time-average exceeds ensemble average. insurance_impact (Dict[str, float]): Dictionary containing insurance-related metrics such as: - 'premium_cost': Total premium payments - 'recovery_benefit': Total insurance recoveries - 'net_benefit': Net financial benefit of insurance - 'growth_improvement': Improvement in growth rate from insurance validation_passed (bool): Whether the analysis passed internal validation checks for data consistency and mathematical validity. metadata (Dict[str, Any]): Additional analysis metadata including: - 'n_simulations': Number of Monte Carlo simulations - 'time_horizon': Analysis time horizon - 'n_survived': Absolute number of paths that survived - 'loss_statistics': Statistics about loss distributions Examples: Interpret analysis results:: # Example results from ergodic analysis results = ErgodicAnalysisResults( time_average_growth=0.045, # 4.5% annual growth ensemble_average_growth=0.052, # 5.2% ensemble average survival_rate=0.95, # 95% survival rate ergodic_divergence=-0.007, # -0.7% divergence insurance_impact={ 'premium_cost': 2_500_000, 'recovery_benefit': 8_200_000, 'net_benefit': 5_700_000, 'growth_improvement': 0.012 }, validation_passed=True, metadata={ 'n_simulations': 1000, 'time_horizon': 20, 'n_survived': 950 } ) # Interpret results if results.validation_passed: print(f"Time-average growth: {results.time_average_growth:.1%}") print(f"Ensemble average: {results.ensemble_average_growth:.1%}") if results.ergodic_divergence < 0: print("Insurance reduces volatility drag (ergodic benefit)") if results.insurance_impact['net_benefit'] > 0: print(f"Insurance provides net benefit: ${results.insurance_impact['net_benefit']:,.0f}") else: print("Analysis validation failed - results may be unreliable") Compare multiple scenarios:: def compare_results(results_a, results_b, label_a="Scenario A", label_b="Scenario B"): print(f"{label_a} vs {label_b}:") print(f" Time-average growth: {results_a.time_average_growth:.2%} vs {results_b.time_average_growth:.2%}") print(f" Survival rate: {results_a.survival_rate:.1%} vs {results_b.survival_rate:.1%}") print(f" Ergodic divergence: {results_a.ergodic_divergence:.3f} vs {results_b.ergodic_divergence:.3f}") growth_advantage = results_a.time_average_growth - results_b.time_average_growth survival_advantage = results_a.survival_rate - results_b.survival_rate print(f" Advantages: Growth={growth_advantage:.2%}, Survival={survival_advantage:.1%}") Note: All growth rates are expressed as decimal values (0.05 = 5% annual growth). Negative ergodic_divergence indicates insurance reduces "volatility drag". Always check validation_passed before interpreting results. See Also: :class:`ErgodicAnalyzer`: Class that generates these results :meth:`ErgodicAnalyzer.integrate_loss_ergodic_analysis`: Method producing these results :class:`ValidationResults`: Detailed validation information """ time_average_growth: float ensemble_average_growth: float survival_rate: float ergodic_divergence: float insurance_impact: Dict[str, float] validation_passed: bool metadata: Dict[str, Any] = field(default_factory=dict)
[docs] @dataclass class ValidationResults: """Comprehensive results from insurance impact validation analysis. This class encapsulates the results of detailed validation checks performed on insurance effects in ergodic analysis. It provides both high-level validation status and detailed diagnostic information to help identify and resolve any modeling inconsistencies. Attributes: premium_deductions_correct (bool): Whether insurance premiums are properly deducted from cash flows. True indicates expected premium costs match observed differences in net income between scenarios. recoveries_credited (bool): Whether insurance recoveries are properly credited to improve financial outcomes. True indicates insured scenarios show appropriate financial benefit from loss recoveries. collateral_impacts_included (bool): Whether letter of credit costs and asset restrictions are properly modeled. True indicates collateral requirements are reflected in financial calculations. time_average_reflects_benefit (bool): Whether time-average growth rate calculations properly reflect insurance benefits. True indicates growth improvements are consistent with insurance effects. overall_valid (bool): Master validation flag indicating whether all individual checks passed. True means the ergodic analysis results are reliable and properly reflect insurance impacts. details (Dict[str, Any]): Detailed diagnostic information from each validation check, including specific metrics, calculations, and discrepancy measurements. Used for troubleshooting validation failures. Examples: Interpret validation results:: validation = analyzer.validate_insurance_ergodic_impact( base_scenario, insurance_scenario, insurance_program ) if validation.overall_valid: print("✓ All validation checks passed") print("Ergodic analysis results are reliable") else: print("⚠ Validation issues detected:") if not validation.premium_deductions_correct: print(" - Premium deduction mismatch") if not validation.recoveries_credited: print(" - Recovery crediting issue") if not validation.collateral_impacts_included: print(" - Collateral impact missing") if not validation.time_average_reflects_benefit: print(" - Growth calculation inconsistency") print("Review model implementation before using results") Access detailed diagnostics:: if 'premium_check' in validation.details: premium_info = validation.details['premium_check'] expected = premium_info['expected'] actual = premium_info['actual_diff'] print(f"Premium validation: Expected ${expected:,.0f}, Got ${actual:,.0f}") if abs(expected - actual) > expected * 0.05: # 5% tolerance print("⚠ Significant premium discrepancy detected") Note: A failed overall validation doesn't necessarily mean the analysis is wrong - it may indicate edge cases or modeling assumptions that need review. Always examine the details for specific guidance on issues. See Also: :meth:`ErgodicAnalyzer.validate_insurance_ergodic_impact`: Method generating these results :class:`ErgodicAnalysisResults`: Main analysis results that this validation supports """ premium_deductions_correct: bool recoveries_credited: bool collateral_impacts_included: bool time_average_reflects_benefit: bool overall_valid: bool details: Dict[str, Any] = field(default_factory=dict)
[docs] class ErgodicAnalyzer: """Advanced analyzer for ergodic properties of insurance strategies. This class implements the core computational engine for ergodic economics analysis in insurance contexts. It provides methods to calculate and compare time-average versus ensemble-average growth rates, demonstrating the fundamental difference between traditional expected-value thinking and actual experienced growth over time. The analyzer addresses the key ergodic insight that for multiplicative processes (like business growth with volatile losses), what happens to an ensemble of businesses differs from what happens to any individual business over time. Insurance can improve time-average growth even when it appears costly from an ensemble (expected value) perspective. Key Capabilities: - Time-average growth rate calculation for individual trajectories - Ensemble average computation across multiple simulation paths - Statistical significance testing of insurance benefits - Monte Carlo convergence analysis - Integrated loss modeling and insurance impact assessment - Comprehensive validation of insurance effects Attributes: convergence_threshold (float): Standard error threshold for determining Monte Carlo convergence. Lower values require more simulations but provide higher confidence in results. Mathematical Foundation: **Time-Average Growth**: For a trajectory X(t), the time-average growth rate is: g_time = (1/T) * ln(X(T)/X(0)) **Ensemble Average Growth**: Across N paths, the ensemble growth rate is: g_ensemble = (1/T) * ln(⟨X(T)⟩/⟨X(0)⟩) **Ergodic Divergence**: The difference g_time - g_ensemble indicates non-ergodic behavior where individual experience differs from statistical expectations. Examples: Basic analyzer setup and usage:: from ergodic_insurance import ErgodicAnalyzer import numpy as np # Initialize with tight convergence criteria analyzer = ErgodicAnalyzer(convergence_threshold=0.005) # Calculate time-average growth for a single trajectory equity_path = np.array([10e6, 10.5e6, 9.8e6, 11.2e6, 12.1e6]) time_avg_growth = analyzer.calculate_time_average_growth(equity_path) print(f"Time-average growth: {time_avg_growth:.2%} annually") Ensemble analysis with multiple trajectories:: # Multiple simulation paths (some ending in bankruptcy) trajectories = [ np.array([10e6, 10.5e6, 11.2e6, 11.8e6, 12.5e6]), # Survivor np.array([10e6, 9.2e6, 8.1e6, 6.8e6, 0]), # Bankruptcy np.array([10e6, 10.8e6, 11.5e6, 12.8e6, 14.2e6]), # High growth np.array([10e6, 9.8e6, 10.2e6, 10.6e6, 11.1e6]) # Stable growth ] # Calculate ensemble statistics ensemble_stats = analyzer.calculate_ensemble_average( trajectories, metric="growth_rate" ) print(f"Ensemble growth rate: {ensemble_stats['mean']:.2%}") print(f"Survival rate: {ensemble_stats['survival_rate']:.1%}") print(f"Growth rate std dev: {ensemble_stats['std']:.2%}") Insurance scenario comparison:: # Compare insured vs uninsured scenarios insured_paths = generate_insured_trajectories() # Your simulation code uninsured_paths = generate_uninsured_trajectories() # Your simulation code comparison = analyzer.compare_scenarios( insured_paths, uninsured_paths, metric="equity" ) # Extract key insights time_avg_benefit = comparison['ergodic_advantage']['time_average_gain'] survival_benefit = comparison['ergodic_advantage']['survival_gain'] is_significant = comparison['ergodic_advantage']['significant'] print(f"Time-average growth improvement: {time_avg_benefit:.2%}") print(f"Survival rate improvement: {survival_benefit:.1%}") print(f"Statistically significant: {is_significant}") Monte Carlo convergence analysis:: # Run large Monte Carlo study simulation_results = run_monte_carlo_study(n_sims=2000) analysis = analyzer.analyze_simulation_batch( simulation_results, label="High-Coverage Insurance" ) # Check if we have enough simulations if analysis['convergence']['converged']: print("Monte Carlo has converged - results are reliable") print(f"Standard error: {analysis['convergence']['standard_error']:.4f}") else: print("Need more simulations for convergence") needed_se = analyzer.convergence_threshold current_se = analysis['convergence']['standard_error'] factor = (current_se / needed_se) ** 2 print(f"Suggest ~{int(2000 * factor)} simulations") Advanced Features: The analyzer provides several advanced capabilities for robust analysis: **Variable-Length Trajectories**: Handles paths that end early due to bankruptcy, maintaining proper statistics across mixed survival scenarios. **Significance Testing**: Built-in t-tests to determine if observed differences between scenarios are statistically meaningful. **Convergence Monitoring**: Automated checking of Monte Carlo convergence using rolling standard error calculations. **Integrated Validation**: Comprehensive validation of insurance effects to ensure results accurately reflect premium costs, recoveries, and collateral impacts. Performance Notes: - Optimized for large Monte Carlo datasets (1000+ simulations) - Memory-efficient processing of variable-length trajectories - Vectorized calculations where possible for speed - Graceful handling of edge cases (bankruptcy, infinite values) See Also: :class:`ErgodicAnalysisResults`: Comprehensive results format :class:`ValidationResults`: Insurance impact validation results :meth:`integrate_loss_ergodic_analysis`: End-to-end analysis pipeline :meth:`compare_scenarios`: Core scenario comparison functionality """ def __init__(self, convergence_threshold: float = 0.01): """Initialize the ergodic analyzer with configuration parameters. Sets up the analyzer with the specified convergence criteria for Monte Carlo analysis. The convergence threshold determines when sufficient simulations have been run to provide reliable statistical estimates. Args: convergence_threshold (float): Standard error threshold for determining Monte Carlo convergence. Lower values require more simulations but provide higher confidence. Typical values: - 0.005: High precision (2000+ simulations) - 0.01: Standard precision (1000+ simulations) - 0.02: Quick analysis (500+ simulations) Defaults to 0.01 for balanced accuracy and computational efficiency. Examples: Initialize for different analysis needs:: # High precision for final analysis precise_analyzer = ErgodicAnalyzer(convergence_threshold=0.005) # Quick analysis for parameter exploration quick_analyzer = ErgodicAnalyzer(convergence_threshold=0.02) # Standard analysis for most use cases standard_analyzer = ErgodicAnalyzer() # Uses default 0.01 Note: The convergence threshold affects the balance between computational cost and result reliability. Choose based on your analysis requirements and available computational resources. """ self.convergence_threshold = convergence_threshold
[docs] def calculate_time_average_growth( self, values: np.ndarray, time_horizon: Optional[int] = None ) -> float: """Calculate time-average growth rate for a single trajectory. This method implements the core ergodic calculation for individual path growth rates using the logarithmic growth formula. It handles edge cases gracefully, including bankruptcy scenarios and invalid data. The time-average growth rate represents the actual compound growth experienced by a single entity over time, which differs fundamentally from ensemble averages in multiplicative processes. Args: values (np.ndarray): Array of values over time (e.g., equity, assets, wealth). Should be monotonic in time with positive values for meaningful growth calculations. Length must be >= 2 for growth calculation. time_horizon (Optional[int]): Specific time horizon to use for calculation. If None, uses the full trajectory length minus 1. Useful for comparing trajectories of different lengths or analyzing partial periods. Returns: float: Time-average growth rate as decimal (0.05 = 5% annual growth). Special return values: - -inf: Trajectory ended in bankruptcy (final value <= 0) - 0.0: Single time point or zero time horizon - Finite value: Calculated growth rate Examples: Calculate growth for successful trajectory:: import numpy as np # 5-year equity trajectory equity = np.array([10e6, 10.5e6, 11.2e6, 11.8e6, 12.5e6]) growth = analyzer.calculate_time_average_growth(equity) print(f"Growth rate: {growth:.2%} annually") # Output: Growth rate: 5.68% annually Handle bankruptcy scenario:: # Trajectory ending in bankruptcy failed_equity = np.array([10e6, 9.2e6, 7.1e6, 4.8e6, 0]) growth = analyzer.calculate_time_average_growth(failed_equity) print(f"Growth rate: {growth}") # Output: Growth rate: -inf Analyze partial trajectory:: # Long trajectory, analyze first 3 years only long_equity = np.array([10e6, 10.5e6, 11.2e6, 11.8e6, 12.5e6, 13.1e6]) partial_growth = analyzer.calculate_time_average_growth( long_equity, time_horizon=3 ) # Analyzes first 4 points (years 0-3) Handle trajectories with initial zeros:: # Trajectory starting from zero (invalid) invalid_equity = np.array([0, 1e6, 2e6, 3e6, 4e6]) growth = analyzer.calculate_time_average_growth(invalid_equity) # Will find first valid positive value and calculate from there Mathematical Details: The calculation uses the formula: g = (1/T) * ln(X(T) / X(0)) Where: - g: time-average growth rate - T: time horizon - X(T): final value - X(0): initial value - ln: natural logarithm This formula gives the constant compound growth rate that would produce the observed change from initial to final value. Edge Cases: - Empty array: Returns -inf - Single value: Returns 0.0 - Final value <= 0: Returns -inf (bankruptcy) - All values <= 0: Returns -inf - Zero time horizon: Returns 0.0 if positive, -inf if negative Note: This is the fundamental calculation in ergodic economics, representing the growth rate that a single entity actually experiences over time, as opposed to what we might expect from ensemble averages. Warning: The method filters out non-positive values when finding the initial value, which may skip early periods of the trajectory. Ensure your data represents meaningful business values (positive equity/assets). See Also: :meth:`calculate_ensemble_average`: For ensemble growth calculations :meth:`compare_scenarios`: For comparing time vs ensemble averages """ # Handle edge cases and invalid trajectories if len(values) == 0: return -np.inf if values[-1] <= 0: return -np.inf if len(values) == 1: return 0.0 # Filter out zero or negative values for calculating growth valid_mask = values > 0 if not np.any(valid_mask): return -np.inf # Get first valid positive value first_idx = np.argmax(valid_mask) initial_value = values[first_idx] final_value = values[-1] # Calculate time period if time_horizon is None: time_horizon = int(len(values) - 1 - first_idx) # Calculate growth rate if final_value > 0 and initial_value > 0 and time_horizon > 0: growth_rate = float((1.0 / time_horizon) * np.log(final_value / initial_value)) # LIMITED LIABILITY: Floor at -100% (-1.0) due to equity constraint # Companies with limited liability cannot lose more than 100% of equity return max(growth_rate, -1.0) return 0.0 if time_horizon <= 0 else -np.inf
def _extract_trajectory_values( self, trajectories: List[np.ndarray] ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: """Extract initial, final values and lengths from trajectories.""" values_data = [(traj[-1], traj[0], len(traj)) for traj in trajectories if len(traj) > 0] if not values_data: return np.array([]), np.array([]), np.array([]) finals, initials, lengths = zip(*values_data) return np.array(finals), np.array(initials), np.array(lengths) def _calculate_growth_rates( self, finals: np.ndarray, initials: np.ndarray, lengths: np.ndarray ) -> np.ndarray: """Calculate growth rates from trajectory values.""" rates = [np.log(f / i) / (t - 1) for f, i, t in zip(finals, initials, lengths) if t > 1] return np.array(rates) if rates else np.array([]) def _process_variable_length_trajectories( self, trajectories: List[np.ndarray], metric: str ) -> Dict[str, float]: """Process trajectories with variable lengths.""" n_paths = len(trajectories) results = {} if metric in ["final_value", "growth_rate"]: # Get final and initial values from each trajectory final_values, initial_values, time_lengths = self._extract_trajectory_values( trajectories ) # Filter valid paths (positive initial and final values) if len(final_values) > 0: valid_mask = (initial_values > 0) & (final_values > 0) valid_finals = final_values[valid_mask] valid_initials = initial_values[valid_mask] valid_lengths = time_lengths[valid_mask] else: valid_finals = valid_initials = valid_lengths = np.array([]) if metric == "final_value": results["mean"] = np.mean(valid_finals) if len(valid_finals) > 0 else 0.0 results["std"] = np.std(valid_finals) if len(valid_finals) > 0 else 0.0 results["median"] = np.median(valid_finals) if len(valid_finals) > 0 else 0.0 else: # growth_rate growth_rates = self._calculate_growth_rates( valid_finals, valid_initials, valid_lengths ) results["mean"] = np.mean(growth_rates) if len(growth_rates) > 0 else 0.0 results["std"] = np.std(growth_rates) if len(growth_rates) > 0 else 0.0 results["median"] = np.median(growth_rates) if len(growth_rates) > 0 else 0.0 else: # full trajectory - not well-defined for different lengths results["mean_trajectory"] = None results["std_trajectory"] = None # Add survival statistics survived = sum(1 for traj in trajectories if len(traj) > 0 and traj[-1] > 0) results["survival_rate"] = survived / n_paths if n_paths > 0 else 0.0 results["n_survived"] = survived results["n_total"] = n_paths return results def _process_fixed_length_trajectories( self, trajectories: np.ndarray, metric: str ) -> Dict[str, float]: """Process trajectories with fixed lengths.""" n_paths, n_time = trajectories.shape results = {} if metric in ["final_value", "growth_rate"]: # Get final values (trajectories is now a numpy array) final_values = trajectories[:, -1] initial_values = trajectories[:, 0] # Filter valid paths (positive initial and final values) valid_mask = (initial_values > 0) & (final_values > 0) valid_finals = final_values[valid_mask] valid_initials = initial_values[valid_mask] if metric == "final_value": results["mean"] = np.mean(valid_finals) if len(valid_finals) > 0 else 0.0 results["std"] = np.std(valid_finals) if len(valid_finals) > 0 else 0.0 results["median"] = np.median(valid_finals) if len(valid_finals) > 0 else 0.0 else: # growth_rate growth_rates = np.log(valid_finals / valid_initials) / (n_time - 1) results["mean"] = np.mean(growth_rates) if len(growth_rates) > 0 else 0.0 results["std"] = np.std(growth_rates) if len(growth_rates) > 0 else 0.0 results["median"] = np.median(growth_rates) if len(growth_rates) > 0 else 0.0 else: # full trajectory results["mean_trajectory"] = np.mean(trajectories, axis=0) results["std_trajectory"] = np.std(trajectories, axis=0) # Add survival statistics survived = np.sum(trajectories[:, -1] > 0) results["survival_rate"] = survived / n_paths results["n_survived"] = survived results["n_total"] = n_paths return results
[docs] def calculate_ensemble_average( self, trajectories: Union[List[np.ndarray], np.ndarray], metric: str = "final_value" ) -> Dict[str, float]: """Calculate ensemble average and statistics across multiple simulation paths. This method computes ensemble statistics representing the traditional expected value approach to analyzing multiple parallel scenarios. It handles variable-length trajectories (due to bankruptcy) and provides comprehensive statistics for comparison with time-average calculations. The ensemble perspective answers: "What would happen on average across many parallel businesses?" This differs from the time-average perspective of "What happens to one business over time?" Args: trajectories (Union[List[np.ndarray], np.ndarray]): Multiple simulation trajectories. Can be: - List of 1D numpy arrays (supports variable lengths) - 2D numpy array with shape (n_paths, n_timesteps) Each trajectory represents values over time (equity, assets, etc.) metric (str): Type of ensemble statistic to compute: - "final_value": Statistics of final values across paths - "growth_rate": Statistics of growth rates across paths - "full": Average trajectory at each time step (fixed-length only) Defaults to "final_value". Returns: Dict[str, float]: Dictionary containing ensemble statistics: - 'mean': Mean of the selected metric across all valid paths - 'std': Standard deviation of the metric - 'median': Median value of the metric - 'survival_rate': Fraction of paths avoiding bankruptcy - 'n_survived': Absolute number of surviving paths - 'n_total': Total number of input paths For metric="full": - 'mean_trajectory': Mean values at each time step - 'std_trajectory': Standard deviations at each time step Examples: Analyze final equity values:: import numpy as np # Multiple simulation results trajectories = [ np.array([10e6, 10.5e6, 11.2e6, 11.8e6, 12.5e6]), # Success np.array([10e6, 9.2e6, 8.1e6, 6.8e6, 0]), # Bankruptcy np.array([10e6, 10.8e6, 11.5e6, 12.8e6, 14.2e6]), # High growth ] final_stats = analyzer.calculate_ensemble_average( trajectories, metric="final_value" ) print(f"Average final equity: ${final_stats['mean']:,.0f}") print(f"Survival rate: {final_stats['survival_rate']:.1%}") print(f"Standard deviation: ${final_stats['std']:,.0f}") Analyze growth rate distribution:: growth_stats = analyzer.calculate_ensemble_average( trajectories, metric="growth_rate" ) print(f"Average growth rate: {growth_stats['mean']:.2%}") print(f"Growth rate volatility: {growth_stats['std']:.2%}") print(f"Median growth: {growth_stats['median']:.2%}") Full trajectory analysis (fixed-length only):: # Convert to fixed-length array fixed_trajectories = np.array([ [10e6, 10.5e6, 11.2e6, 11.8e6, 12.5e6], [10e6, 9.8e6, 10.1e6, 10.6e6, 11.1e6], [10e6, 10.8e6, 11.5e6, 12.8e6, 14.2e6] ]) full_stats = analyzer.calculate_ensemble_average( fixed_trajectories, metric="full" ) mean_path = full_stats['mean_trajectory'] print(f"Mean trajectory: {mean_path}") # Shows average value at each time step Handle mixed survival scenarios:: mixed_trajectories = [ np.array([10e6, 11e6, 12e6]), # Short survivor np.array([10e6, 9e6, 0]), # Early bankruptcy np.array([10e6, 11e6, 12e6, 13e6]), # Long survivor ] stats = analyzer.calculate_ensemble_average(mixed_trajectories) print(f"{stats['n_survived']}/{stats['n_total']} paths survived") print(f"Survival rate: {stats['survival_rate']:.1%}") Statistical Interpretation: **Mean**: The expected value under the ensemble perspective. For multiplicative processes, this may differ significantly from what any individual entity experiences. **Standard Deviation**: Measures the spread of outcomes across the ensemble, indicating the uncertainty in individual results. **Survival Rate**: Critical metric often ignored in traditional expected value analysis. Shows the probability of avoiding bankruptcy. **Median**: Often more representative than mean for skewed distributions common in financial modeling. Edge Cases: - Empty trajectory list: Returns zeros/NaN appropriately - All paths end in bankruptcy: survival_rate=0, mean/median may be 0 - Single trajectory: Statistics reduce to that trajectory's values - Mixed lengths: Handled gracefully with proper filtering Performance Notes: - Optimized for large numbers of trajectories (1000+ paths) - Memory efficient for mixed-length trajectory lists - Vectorized calculations where possible See Also: :meth:`calculate_time_average_growth`: Individual trajectory analysis :meth:`compare_scenarios`: Ensemble vs time-average comparison :meth:`analyze_simulation_batch`: Comprehensive batch analysis """ # Handle list of arrays with potentially different lengths if isinstance(trajectories, list): # Check if all arrays have the same length lengths = [len(traj) for traj in trajectories if len(traj) > 0] if len(set(lengths)) == 1 and lengths: # All same length, can convert to 2D array trajectories_array = np.array(trajectories) return self._process_fixed_length_trajectories(trajectories_array, metric) # Different lengths, work with list return self._process_variable_length_trajectories(trajectories, metric) return self._process_fixed_length_trajectories(trajectories, metric)
[docs] def check_convergence(self, values: np.ndarray, window_size: int = 100) -> Tuple[bool, float]: """Check Monte Carlo convergence using rolling standard error analysis. This method determines whether a Monte Carlo simulation has run enough iterations to provide statistically reliable results. It uses rolling standard error calculations to assess whether adding more simulations would significantly change the estimated mean. Convergence analysis is crucial for ergodic analysis because insufficient simulations can lead to misleading conclusions about insurance benefits. The method provides both a binary convergence decision and quantitative standard error metrics for informed decision making. Args: values (np.ndarray): Array of values to check for convergence, typically time-average growth rates from Monte Carlo simulations. Should contain at least window_size values for meaningful analysis. Infinite values (from bankruptcy) are handled appropriately. window_size (int): Size of rolling window for convergence assessment. Larger windows provide more stable convergence detection but require more data points. Typical values: - 50: Quick convergence check for small samples - 100: Standard convergence analysis (default) - 200: Conservative convergence for high precision Must be <= len(values) for analysis to proceed. Returns: Tuple[bool, float]: Convergence assessment results: - converged (bool): Whether the series has converged according to the specified threshold. True indicates sufficient simulations. - standard_error (float): Current standard error of the mean based on the last window_size observations. Lower values indicate higher precision and greater confidence in results. Examples: Check convergence during Monte Carlo analysis:: import numpy as np # Simulate running Monte Carlo with growth rate collection growth_rates = [] for i in range(2000): # Up to 2000 simulations # Run single simulation (pseudo-code) result = run_single_simulation() growth_rate = analyzer.calculate_time_average_growth(result.equity) growth_rates.append(growth_rate) # Check convergence every 100 simulations if (i + 1) % 100 == 0 and i >= 100: converged, se = analyzer.check_convergence( np.array(growth_rates), window_size=100 ) print(f"Simulation {i+1}: SE={se:.4f}, Converged={converged}") if converged: print(f"✓ Convergence achieved after {i+1} simulations") break if not converged: print(f"⚠ Convergence not achieved after {len(growth_rates)} simulations") print(f"Current standard error: {se:.4f}") print(f"Target threshold: {analyzer.convergence_threshold:.4f}") Adaptive Monte Carlo with convergence monitoring:: def run_adaptive_monte_carlo(target_precision=0.01, max_sims=5000): '''Run Monte Carlo until convergence or maximum simulations.''' results = [] for i in range(max_sims): # Run simulation sim_result = run_single_simulation() results.append(sim_result) # Extract growth rates for convergence check growth_rates = [analyzer.calculate_time_average_growth(r.equity) for r in results] # Check convergence (need at least 100 for stability) if i >= 100: converged, se = analyzer.check_convergence( np.array([g for g in growth_rates if np.isfinite(g)]) ) if converged and se <= target_precision: print(f"Achieved target precision after {i+1} simulations") return results, True print(f"Maximum simulations reached without convergence") return results, False # Run adaptive analysis results, converged = run_adaptive_monte_carlo() if converged: print("Analysis complete with sufficient precision") else: print("Consider increasing maximum simulations") Convergence diagnostics and troubleshooting:: # Analyze convergence pattern growth_rates = np.array([...]) # Your Monte Carlo results # Check convergence with different window sizes window_sizes = [50, 100, 150, 200] print("=== Convergence Analysis ===") for ws in window_sizes: if len(growth_rates) >= ws: converged, se = analyzer.check_convergence(growth_rates, ws) print(f"Window {ws:3d}: SE={se:.5f}, Converged={converged}") # Plot convergence pattern (conceptual) rolling_means = [] rolling_ses = [] for i in range(100, len(growth_rates), 10): subset = growth_rates[:i] converged, se = analyzer.check_convergence(subset) rolling_means.append(np.mean(subset[np.isfinite(subset)])) rolling_ses.append(se) # Analyze convergence stability recent_se_trend = np.diff(rolling_ses[-10:]) # Last 10 points if np.mean(recent_se_trend) < 0: print("✓ Standard error decreasing - convergence improving") else: print("⚠ Standard error not decreasing - may need more simulations") Compare convergence across scenarios:: # Check convergence for both insured and uninsured scenarios scenarios = { 'insured': insured_growth_rates, 'uninsured': uninsured_growth_rates } convergence_status = {} for name, rates in scenarios.items(): converged, se = analyzer.check_convergence(rates) convergence_status[name] = { 'converged': converged, 'standard_error': se, 'n_simulations': len(rates) } print("=== Scenario Convergence Status ===") for name, status in convergence_status.items(): print(f"{name:10}: {status['converged']} " f"(SE={status['standard_error']:.4f}, n={status['n_simulations']})") # Determine if comparison is valid both_converged = all(s['converged'] for s in convergence_status.values()) if both_converged: print("✓ Both scenarios converged - comparison is reliable") else: print("⚠ Incomplete convergence - results may be unreliable") Mathematical Background: The method calculates the standard error of the mean for the most recent window_size observations: SE = σ / √n Where: - σ = standard deviation of the sample - n = sample size (window_size) Convergence is achieved when SE < convergence_threshold, indicating that the sample mean is stable within the desired precision. Convergence Guidelines: **Standard Error Thresholds**: - SE < 0.005: High precision (recommended for final analysis) - SE < 0.01: Standard precision (adequate for most decisions) - SE < 0.02: Low precision (suitable for initial exploration) - SE > 0.02: Insufficient precision (run more simulations) **Sample Size Rules of Thumb**: - n < 100: Generally insufficient for convergence assessment - n = 100-500: May achieve convergence for low-volatility scenarios - n = 500-2000: Standard range for most insurance analyses - n > 2000: High-precision analysis or high-volatility scenarios Edge Cases: - Fewer observations than window_size: Returns (False, inf) - All infinite values: Returns (False, inf) - High volatility data: May require very large samples for convergence - Bimodal distributions: Standard error may not capture full uncertainty Performance Notes: - Fast execution even for large arrays (10,000+ observations) - Memory efficient rolling window calculations - Robust handling of infinite and missing values See Also: :attr:`convergence_threshold`: Threshold used for convergence decision :meth:`analyze_simulation_batch`: Includes automatic convergence analysis :meth:`calculate_ensemble_average`: Ensemble statistics that benefit from convergence """ if len(values) < window_size: return False, np.inf # Calculate rolling mean and standard error _rolling_means = np.convolve(values, np.ones(window_size) / window_size, mode="valid") # Standard error of the mean se = np.std(values[-window_size:]) / np.sqrt(window_size) # Check if SE is below threshold converged = bool(se < self.convergence_threshold) return converged, se
[docs] def compare_scenarios( self, insured_results: Union[List[SimulationResults], np.ndarray], uninsured_results: Union[List[SimulationResults], np.ndarray], metric: str = "equity", ) -> Dict[str, Any]: """Compare insured vs uninsured scenarios using comprehensive ergodic analysis. This is the core method for demonstrating ergodic advantages of insurance. It performs side-by-side comparison of insured and uninsured scenarios, calculating both time-average and ensemble-average growth rates to reveal the fundamental difference between expected value thinking and actual experienced growth. The comparison reveals how insurance can be optimal from a time-average perspective even when it appears costly from an ensemble (expected value) perspective - the key insight of ergodic economics applied to insurance. Args: insured_results (Union[List[SimulationResults], np.ndarray]): Simulation results from insured scenarios. Can be: - List of SimulationResults objects from Monte Carlo runs - List of numpy arrays representing trajectories - 2D numpy array with shape (n_simulations, n_timesteps) uninsured_results (Union[List[SimulationResults], np.ndarray]): Simulation results from uninsured scenarios, same format as insured_results. Should have same number of simulations for valid comparison. metric (str): Financial metric to analyze for comparison: - "equity": Company equity over time (recommended) - "assets": Total assets over time - "cash": Available cash over time - Any attribute available in SimulationResults objects Defaults to "equity". Returns: Dict[str, Any]: Comprehensive comparison results with nested structure: - **'insured'** (Dict): Insured scenario statistics: - 'time_average_mean': Mean time-average growth rate - 'time_average_median': Median time-average growth rate - 'time_average_std': Standard deviation of growth rates - 'ensemble_average': Ensemble average growth rate - 'survival_rate': Fraction avoiding bankruptcy - 'n_survived': Absolute number of survivors - **'uninsured'** (Dict): Uninsured scenario statistics: - Same structure as 'insured' - **'ergodic_advantage'** (Dict): Comparative metrics: - 'time_average_gain': Difference in time-average growth - 'ensemble_average_gain': Difference in ensemble averages - 'survival_gain': Improvement in survival rate - 't_statistic': t-test statistic for significance - 'p_value': p-value for statistical significance - 'significant': Boolean indicating significance (p < 0.05) Examples: Basic insurance vs no insurance comparison: .. code-block:: python # Run Monte Carlo simulations (pseudo-code) insured_sims = run_simulations(insurance_enabled=True, n_sims=1000) uninsured_sims = run_simulations(insurance_enabled=False, n_sims=1000) # Compare scenarios comparison = analyzer.compare_scenarios( insured_sims, uninsured_sims, metric="equity" ) # Extract key insights time_avg_gain = comparison['ergodic_advantage']['time_average_gain'] survival_gain = comparison['ergodic_advantage']['survival_gain'] is_significant = comparison['ergodic_advantage']['significant'] print(f"Time-average growth improvement: {time_avg_gain:.2%}") print(f"Survival rate improvement: {survival_gain:.1%}") print(f"Statistical significance: {is_significant}") Detailed analysis of results: .. code-block:: python # Examine both perspectives insured = comparison['insured'] uninsured = comparison['uninsured'] advantage = comparison['ergodic_advantage'] print("\n=== ENSEMBLE PERSPECTIVE (Traditional Analysis) ===") print(f"Insured ensemble growth: {insured['ensemble_average']:.2%}") print(f"Uninsured ensemble growth: {uninsured['ensemble_average']:.2%}") print(f"Ensemble advantage: {advantage['ensemble_average_gain']:.2%}") print("\n=== TIME-AVERAGE PERSPECTIVE (Ergodic Analysis) ===") print(f"Insured time-average growth: {insured['time_average_mean']:.2%}") print(f"Uninsured time-average growth: {uninsured['time_average_mean']:.2%}") print(f"Time-average advantage: {advantage['time_average_gain']:.2%}") print("\n=== SURVIVAL ANALYSIS ===") print(f"Insured survival rate: {insured['survival_rate']:.1%}") print(f"Uninsured survival rate: {uninsured['survival_rate']:.1%}") print(f"Survival improvement: {advantage['survival_gain']:.1%}") # Interpret ergodic vs ensemble difference if advantage['time_average_gain'] > advantage['ensemble_average_gain']: print("\n✓ Insurance shows ergodic advantage!") print(" Time-average benefit exceeds ensemble expectation") else: print("\n! No clear ergodic advantage detected") Statistical significance analysis: .. code-block:: python if comparison['ergodic_advantage']['significant']: p_val = comparison['ergodic_advantage']['p_value'] t_stat = comparison['ergodic_advantage']['t_statistic'] print(f"Results are statistically significant:") print(f" t-statistic: {t_stat:.3f}") print(f" p-value: {p_val:.4f}") print(f" Confidence level: {(1-p_val)*100:.1f}%") else: print("Results not statistically significant") print("Consider running more simulations") Multiple metric analysis: .. code-block:: python # Compare different financial metrics metrics_to_analyze = ['equity', 'assets', 'cash'] results = {} for metric in metrics_to_analyze: results[metric] = analyzer.compare_scenarios( insured_sims, uninsured_sims, metric=metric ) # Find metric showing strongest insurance advantage best_metric = max(metrics_to_analyze, key=lambda m: results[m]['ergodic_advantage']['time_average_gain'] ) print(f"Strongest insurance advantage in: {best_metric}") gain = results[best_metric]['ergodic_advantage']['time_average_gain'] print(f"Time-average improvement: {gain:.2%}") Mathematical Background: The comparison reveals the ergodic/non-ergodic nature of financial processes by calculating: **Time-Average Growth**: Mean of individual trajectory growth rates: g_time = mean([ln(X_i(T)/X_i(0))/T for each path i]) **Ensemble Average Growth**: Growth of the ensemble mean: g_ensemble = ln(mean([X_i(T)])/mean([X_i(0)]))/T **Ergodic Divergence**: g_time - g_ensemble For multiplicative processes with volatility, these typically differ, with insurance often improving time-average more than ensemble average. Interpretation Guidelines: **Positive Time-Average Gain**: Insurance improves actual experienced growth rates, even if ensemble analysis suggests otherwise. **Survival Rate Improvement**: Critical for long-term viability, often the primary benefit of insurance in high-volatility scenarios. **Statistical Significance**: p < 0.05 indicates results are unlikely due to random chance, supporting reliability of conclusions. Edge Cases: - All paths bankrupt in one scenario: Handled with -inf growth rates - Mismatched simulation counts: Statistics calculated on available data - Identical scenarios: All advantages will be zero - High volatility: May require more simulations for significance Performance Notes: - Handles thousands of simulation paths efficiently - Memory-conscious processing of large trajectory datasets - Automatic handling of variable-length trajectories See Also: :meth:`calculate_time_average_growth`: Individual path analysis :meth:`calculate_ensemble_average`: Ensemble statistics :meth:`significance_test`: Statistical testing details :class:`ErgodicAnalysisResults`: Comprehensive results format """ # Extract trajectories if isinstance(insured_results, list) and isinstance(insured_results[0], SimulationResults): # Handle variable-length trajectories (e.g., due to insolvency) insured_trajectories = [getattr(r, metric) for r in insured_results] uninsured_trajectories = [getattr(r, metric) for r in uninsured_results] # Convert to list of arrays rather than 2D array to handle different lengths insured_trajectories = [np.asarray(traj) for traj in insured_trajectories] uninsured_trajectories = [np.asarray(traj) for traj in uninsured_trajectories] else: insured_trajectories = [np.asarray(traj) for traj in insured_results] uninsured_trajectories = [np.asarray(traj) for traj in uninsured_results] # Calculate time-average growth for each path insured_time_avg = [ self.calculate_time_average_growth(traj) for traj in insured_trajectories ] uninsured_time_avg = [ self.calculate_time_average_growth(traj) for traj in uninsured_trajectories ] # Filter out infinite values insured_time_avg_valid = [g for g in insured_time_avg if np.isfinite(g)] uninsured_time_avg_valid = [g for g in uninsured_time_avg if np.isfinite(g)] # Calculate ensemble averages insured_ensemble = self.calculate_ensemble_average( insured_trajectories, metric="growth_rate" ) uninsured_ensemble = self.calculate_ensemble_average( uninsured_trajectories, metric="growth_rate" ) # Compile results results = { "insured": { "time_average_mean": ( np.mean(insured_time_avg_valid) if insured_time_avg_valid else -np.inf ), "time_average_median": ( np.median(insured_time_avg_valid) if insured_time_avg_valid else -np.inf ), "time_average_std": ( np.std(insured_time_avg_valid) if insured_time_avg_valid else 0.0 ), "ensemble_average": insured_ensemble["mean"], "survival_rate": insured_ensemble["survival_rate"], "n_survived": insured_ensemble["n_survived"], }, "uninsured": { "time_average_mean": ( np.mean(uninsured_time_avg_valid) if uninsured_time_avg_valid else -np.inf ), "time_average_median": ( np.median(uninsured_time_avg_valid) if uninsured_time_avg_valid else -np.inf ), "time_average_std": ( np.std(uninsured_time_avg_valid) if uninsured_time_avg_valid else 0.0 ), "ensemble_average": uninsured_ensemble["mean"], "survival_rate": uninsured_ensemble["survival_rate"], "n_survived": uninsured_ensemble["n_survived"], }, "ergodic_advantage": { "time_average_gain": float( float(np.mean(insured_time_avg_valid) if insured_time_avg_valid else -np.inf) - float( np.mean(uninsured_time_avg_valid) if uninsured_time_avg_valid else -np.inf ) ), "ensemble_average_gain": insured_ensemble["mean"] - uninsured_ensemble["mean"], "survival_gain": insured_ensemble["survival_rate"] - uninsured_ensemble["survival_rate"], }, } # Add significance test if we have valid data if insured_time_avg_valid and uninsured_time_avg_valid: t_stat, p_value = self.significance_test( insured_time_avg_valid, uninsured_time_avg_valid ) results["ergodic_advantage"]["t_statistic"] = t_stat # type: ignore[index] results["ergodic_advantage"]["p_value"] = p_value # type: ignore[index] results["ergodic_advantage"]["significant"] = p_value < 0.05 # type: ignore[index] else: results["ergodic_advantage"]["t_statistic"] = np.nan # type: ignore[index] results["ergodic_advantage"]["p_value"] = np.nan # type: ignore[index] results["ergodic_advantage"]["significant"] = False # type: ignore[index] return results
[docs] def significance_test( self, sample1: Union[List[float], np.ndarray], sample2: Union[List[float], np.ndarray], test_type: str = "two-sided", ) -> Tuple[float, float]: """Perform statistical significance test between two growth rate samples. This method conducts a two-sample t-test to determine whether observed differences between insured and uninsured scenarios are statistically significant or could reasonably be attributed to random variation. Statistical significance provides confidence that ergodic advantages are genuine rather than artifacts of sampling variability. Args: sample1 (Union[List[float], np.ndarray]): First sample of growth rates, typically from insured scenarios. Should contain time-average growth rates from individual simulation paths. Infinite values (from bankruptcy) are automatically handled. sample2 (Union[List[float], np.ndarray]): Second sample of growth rates, typically from uninsured scenarios. Should be comparable to sample1 with same underlying business conditions but different insurance coverage. test_type (str): Type of statistical test to perform: - "two-sided": Tests if samples have different means (default) - "greater": Tests if sample1 mean > sample2 mean - "less": Tests if sample1 mean < sample2 mean Defaults to "two-sided" for general hypothesis testing. Returns: Tuple[float, float]: Statistical test results: - t_statistic (float): t-test statistic value. Positive values indicate sample1 has higher mean than sample2. - p_value (float): Probability of observing the data under the null hypothesis of no difference. Lower values indicate stronger evidence against the null hypothesis. Examples: Test insurance benefit significance:: import numpy as np # Growth rates from Monte Carlo simulations insured_growth = np.array([0.048, 0.051, 0.047, 0.049, 0.052, ...]) uninsured_growth = np.array([0.038, -np.inf, 0.042, 0.035, 0.041, ...]) # Two-sided test for any difference t_stat, p_value = analyzer.significance_test( insured_growth, uninsured_growth, test_type="two-sided" ) print(f"t-statistic: {t_stat:.3f}") print(f"p-value: {p_value:.4f}") if p_value < 0.05: print("✓ Statistically significant difference at 5% level") else: print("No significant difference detected") One-sided test for insurance superiority:: # Test if insurance provides superior growth rates t_stat, p_value = analyzer.significance_test( insured_growth, uninsured_growth, test_type="greater" ) print(f"Testing if insured > uninsured:") print(f"t-statistic: {t_stat:.3f}") print(f"p-value: {p_value:.4f}") if p_value < 0.01: print("✓ Strong evidence that insurance improves growth (p < 0.01)") elif p_value < 0.05: print("✓ Moderate evidence that insurance improves growth (p < 0.05)") elif p_value < 0.10: print("? Weak evidence that insurance improves growth (p < 0.10)") else: print("No significant evidence that insurance improves growth") Comprehensive significance analysis:: # Test multiple hypotheses tests = [ ("two-sided", "Any difference"), ("greater", "Insurance superior"), ("less", "Insurance inferior") ] print("=== Statistical Significance Analysis ===") for test_type, description in tests: t_stat, p_value = analyzer.significance_test( insured_growth, uninsured_growth, test_type ) significance = "***" if p_value < 0.001 else \ "**" if p_value < 0.01 else \ "*" if p_value < 0.05 else \ "" if p_value < 0.10 else "n.s." print(f"{description:20}: t={t_stat:6.3f}, p={p_value:.4f} {significance}") Sample size and power analysis:: # Check if samples are large enough for reliable testing n1, n2 = len(insured_growth), len(uninsured_growth) if n1 < 30 or n2 < 30: print(f"⚠ Small sample sizes (n1={n1}, n2={n2})") print("Consider running more simulations for robust results") # Calculate effect size (Cohen's d) mean1, mean2 = np.mean(insured_growth), np.mean(uninsured_growth) pooled_std = np.sqrt(((n1-1)*np.var(insured_growth) + (n2-1)*np.var(uninsured_growth)) / (n1+n2-2)) cohens_d = (mean1 - mean2) / pooled_std print(f"Effect size (Cohen's d): {cohens_d:.3f}") if abs(cohens_d) > 0.8: print("Large effect size") elif abs(cohens_d) > 0.5: print("Medium effect size") elif abs(cohens_d) > 0.2: print("Small effect size") else: print("Very small effect size") Statistical Interpretation: **p-value Guidelines**: - p < 0.001: Very strong evidence against null hypothesis (***) - p < 0.01: Strong evidence (**) - p < 0.05: Moderate evidence (*) - p < 0.10: Weak evidence - p >= 0.10: No significant evidence **t-statistic Guidelines**: - ``|t| > 3``: Very large effect - ``|t| > 2``: Large effect - ``|t| > 1``: Moderate effect - ``|t| < 1``: Small effect Assumptions and Limitations: **t-test Assumptions**: 1. Samples are independent 2. Data approximately normally distributed (robust to violations with large n) 3. Equal variances (Welch's t-test used automatically if needed) **Handling of Infinite Values**: The method automatically excludes infinite values (from bankruptcy scenarios) using scipy's nan_policy='omit'. This is appropriate since infinite values represent qualitatively different outcomes (business failure) rather than extreme but finite growth rates. **Multiple Testing**: If performing multiple significance tests, consider adjusting significance levels (e.g., Bonferroni correction) to account for increased Type I error probability. Performance Notes: - Efficient for samples up to 10,000+ observations - Automatic handling of missing/infinite values - Uses scipy.stats for robust statistical calculations See Also: :meth:`compare_scenarios`: Includes automatic significance testing :meth:`check_convergence`: For determining adequate sample sizes scipy.stats.ttest_ind: Underlying statistical test implementation """ # Perform independent samples t-test t_stat, p_value = stats.ttest_ind( sample1, sample2, alternative=test_type, nan_policy="omit" ) return t_stat, p_value
[docs] def analyze_simulation_batch( self, simulation_results: List[SimulationResults], label: str = "Scenario" ) -> Dict[str, Any]: """Perform comprehensive ergodic analysis on a batch of simulation results. This method provides a complete analysis of a single scenario (e.g., all insured simulations or all uninsured simulations), including time-average and ensemble statistics, convergence analysis, and survival metrics. It serves as a comprehensive diagnostic tool for understanding the ergodic properties of a particular insurance strategy. Args: simulation_results (List[SimulationResults]): List of simulation result objects from Monte Carlo runs. Each should contain trajectory data including equity, assets, years, and potential insolvency information. Typically 100-2000 simulations for robust analysis. label (str): Descriptive label for this batch of simulations, used in reporting and metadata. Examples: "High Deductible", "Full Coverage", "Base Case", "Stress Scenario". Defaults to "Scenario". Returns: Dict[str, Any]: Comprehensive analysis dictionary with nested structure: - **'label'** (str): The provided label for identification - **'n_simulations'** (int): Number of simulations analyzed - **'time_average'** (Dict): Time-average growth statistics: - 'mean': Mean time-average growth rate across all paths - 'median': Median time-average growth rate - 'std': Standard deviation of growth rates - 'min': Minimum growth rate observed - 'max': Maximum growth rate observed - **'ensemble_average'** (Dict): Ensemble statistics: - 'mean': Ensemble average growth rate - 'std': Standard deviation of ensemble - 'median': Median of ensemble - 'survival_rate': Fraction avoiding bankruptcy - 'n_survived': Absolute number of survivors - 'n_total': Total number of simulations - **'convergence'** (Dict): Monte Carlo convergence analysis: - 'converged': Boolean indicating if results have converged - 'standard_error': Current standard error of the estimates - 'threshold': Convergence threshold used - **'survival_analysis'** (Dict): Survival metrics: - 'survival_rate': Fraction avoiding bankruptcy (duplicate) - 'mean_survival_time': Average time to insolvency or end - **'ergodic_divergence'** (float): Difference between time-average and ensemble-average growth rates Examples: Analyze a batch of insured simulations: .. code-block:: python # Run Monte Carlo simulations insured_results = [] for i in range(1000): sim = run_single_simulation(insurance_enabled=True, seed=i) insured_results.append(sim) # Comprehensive analysis analysis = analyzer.analyze_simulation_batch( insured_results, label="Full Insurance Coverage" ) # Report key findings print(f"\n=== {analysis['label']} Analysis ===") print(f"Simulations: {analysis['n_simulations']}") print(f"Time-average growth: {analysis['time_average']['mean']:.2%} ± {analysis['time_average']['std']:.2%}") print(f"Ensemble average: {analysis['ensemble_average']['mean']:.2%}") print(f"Survival rate: {analysis['survival_analysis']['survival_rate']:.1%}") print(f"Ergodic divergence: {analysis['ergodic_divergence']:.3f}") Check Monte Carlo convergence: .. code-block:: python if analysis['convergence']['converged']: print(f"✓ Results have converged (SE: {analysis['convergence']['standard_error']:.4f})") print("Analysis is reliable for decision making") else: current_se = analysis['convergence']['standard_error'] target_se = analysis['convergence']['threshold'] print(f"⚠ Convergence not reached (SE: {current_se:.4f} > {target_se:.4f})") # Estimate additional simulations needed current_n = analysis['n_simulations'] factor = (current_se / target_se) ** 2 recommended_n = int(current_n * factor) additional_needed = recommended_n - current_n print(f"Recommend ~{additional_needed} additional simulations") Compare growth rate distributions: .. code-block:: python # Analyze distribution characteristics time_avg = analysis['time_average'] print(f"\n=== Growth Rate Distribution ===") print(f"Mean: {time_avg['mean']:.2%}") print(f"Median: {time_avg['median']:.2%}") print(f"Std Dev: {time_avg['std']:.2%}") print(f"Range: {time_avg['min']:.2%} to {time_avg['max']:.2%}") # Check for skewness if time_avg['mean'] > time_avg['median']: print("Distribution is right-skewed (long tail of high growth)") elif time_avg['mean'] < time_avg['median']: print("Distribution is left-skewed (long tail of poor performance)") else: print("Distribution appears roughly symmetric") Survival analysis insights: .. code-block:: python survival = analysis['survival_analysis'] ensemble = analysis['ensemble_average'] print(f"\n=== Survival Analysis ===") print(f"Survival rate: {survival['survival_rate']:.1%}") print(f"Survivors: {ensemble['n_survived']}/{ensemble['n_total']}") print(f"Mean time to insolvency/end: {survival['mean_survival_time']:.1f} years") # Risk assessment if survival['survival_rate'] < 0.9: print("⚠ High bankruptcy risk - consider more insurance") elif survival['survival_rate'] > 0.99: print("✓ Very low bankruptcy risk - insurance is effective") else: print("✓ Moderate bankruptcy risk - acceptable for most businesses") Ergodic divergence interpretation: .. code-block:: python divergence = analysis['ergodic_divergence'] if abs(divergence) < 0.001: # Less than 0.1% print("Minimal ergodic divergence - process is nearly ergodic") elif divergence > 0: print(f"Positive ergodic divergence ({divergence:.3f})") print("Time-average exceeds ensemble average - favorable") else: print(f"Negative ergodic divergence ({divergence:.3f})") print("Ensemble average exceeds time-average - volatility drag") Use Cases: **Single Scenario Analysis**: Understand the characteristics of one insurance configuration before comparing alternatives. **Convergence Diagnostics**: Determine if enough simulations have been run for reliable conclusions. **Risk Assessment**: Evaluate bankruptcy probabilities and growth rate distributions for risk management decisions. **Parameter Sensitivity**: Analyze how changes in insurance parameters affect ergodic properties by comparing batch analyses. Performance Notes: - Efficient processing of 1000+ simulation results - Memory-conscious handling of trajectory data - Automatic filtering of invalid/infinite growth rates - Vectorized calculations for speed Warning: Large numbers of bankruptcy scenarios may skew statistics. Check the survival rate and consider whether the scenario parameters are realistic for your analysis goals. See Also: :meth:`compare_scenarios`: For comparing multiple scenario batches :meth:`check_convergence`: For detailed convergence analysis :class:`SimulationResults`: Expected format for simulation_results :class:`ErgodicAnalysisResults`: Alternative comprehensive results format """ # Extract equity trajectories equity_trajectories = np.array([r.equity for r in simulation_results]) _asset_trajectories = np.array([r.assets for r in simulation_results]) # Calculate time-average growth for each path time_avg_growth = [ self.calculate_time_average_growth(equity) for equity in equity_trajectories ] # Filter valid growth rates valid_growth = [g for g in time_avg_growth if np.isfinite(g)] # Calculate ensemble statistics ensemble_stats = self.calculate_ensemble_average(equity_trajectories, metric="growth_rate") # Check convergence if len(valid_growth) > 0: converged, se = self.check_convergence(np.array(valid_growth)) else: converged, se = False, np.inf # Compile analysis analysis: Dict[str, Any] = { "label": label, "n_simulations": len(simulation_results), "time_average": { "mean": np.mean(valid_growth) if valid_growth else -np.inf, "median": np.median(valid_growth) if valid_growth else -np.inf, "std": np.std(valid_growth) if valid_growth else 0.0, "min": np.min(valid_growth) if valid_growth else -np.inf, "max": np.max(valid_growth) if valid_growth else -np.inf, }, "ensemble_average": ensemble_stats, "convergence": { "converged": converged, "standard_error": se, "threshold": self.convergence_threshold, }, "survival_analysis": { "survival_rate": ensemble_stats["survival_rate"], "mean_survival_time": np.mean( [ r.insolvency_year if r.insolvency_year else len(r.years) for r in simulation_results ] ), }, } # Calculate ergodic divergence if valid_growth: time_avg_mean = analysis["time_average"]["mean"] ensemble_mean = ensemble_stats["mean"] analysis["ergodic_divergence"] = time_avg_mean - ensemble_mean else: analysis["ergodic_divergence"] = np.nan return analysis
[docs] def integrate_loss_ergodic_analysis( # pylint: disable=too-many-locals self, loss_data: "LossData", insurance_program: Optional["InsuranceProgram"], manufacturer: Any, time_horizon: int, n_simulations: int = 100, ) -> ErgodicAnalysisResults: """Perform end-to-end integrated loss modeling and ergodic analysis. This method provides a complete pipeline from loss generation through insurance application to final ergodic analysis. It demonstrates the full power of the ergodic framework by seamlessly connecting actuarial loss modeling with business financial modeling and ergodic growth analysis. The integration pipeline follows these steps: 1. **Validate Input Data**: Ensure loss data meets quality standards 2. **Apply Insurance Program**: Calculate recoveries and net exposures 3. **Generate Annual Loss Aggregates**: Convert to time-series format 4. **Run Monte Carlo Simulations**: Execute business simulations with losses 5. **Calculate Ergodic Metrics**: Analyze time-average vs ensemble behavior 6. **Validate Results**: Ensure mathematical and business logic consistency 7. **Package Results**: Return comprehensive analysis in standardized format Args: loss_data (LossData): Standardized loss data object containing loss frequency and severity distributions. Must pass validation checks including proper distribution parameters and reasonable ranges. insurance_program (Optional[InsuranceProgram]): Insurance program to apply to losses. If None, analysis proceeds with no insurance coverage. Program should specify layers, deductibles, limits, and premium rates. manufacturer (Any): Manufacturer model instance for running business simulations. Should be configured with appropriate initial conditions and financial parameters. Must support claim processing and annual step operations. time_horizon (int): Analysis time horizon in years. Typical values: - 10-20 years: Standard analysis period - 50+ years: Long-term ergodic behavior - 5-10 years: Quick analysis for parameter exploration n_simulations (int): Number of Monte Carlo simulations to run. More simulations provide better statistical reliability: - 100: Quick analysis for development/testing - 1000: Standard analysis for decision making - 5000+: High-precision analysis for final recommendations Defaults to 100 for reasonable performance. Returns: ErgodicAnalysisResults: Comprehensive analysis results containing: - Time-average and ensemble-average growth rates - Survival rates and ergodic divergence - Insurance impact metrics (premiums, recoveries, net benefit) - Validation status and detailed metadata - All necessary information for decision making Examples: Basic integrated analysis: .. code-block:: python from ergodic_insurance import LossData, InsuranceProgram, WidgetManufacturer, ManufacturerConfig # Set up loss data loss_data = LossData.from_poisson_lognormal( frequency_lambda=2.5, # 2.5 claims per year on average severity_mean=1_000_000, # $1M average claim severity_cv=2.0, # High variability time_horizon=20 ) # Configure insurance program insurance = InsuranceProgram([ # (attachment, limit, rate) (0, 1_000_000, 0.015), # $1M primary layer at 1.5% (1_000_000, 10_000_000, 0.008), # $10M excess at 0.8% (11_000_000, 50_000_000, 0.004) # $50M umbrella at 0.4% ]) # Set up manufacturer config = ManufacturerConfig( initial_assets=25_000_000, base_operating_margin=0.08, asset_turnover_ratio=0.75 ) manufacturer = WidgetManufacturer(config) # Run integrated analysis results = analyzer.integrate_loss_ergodic_analysis( loss_data=loss_data, insurance_program=insurance, manufacturer=manufacturer, time_horizon=20, n_simulations=1000 ) # Interpret results if results.validation_passed: print(f"Time-average growth: {results.time_average_growth:.2%}") print(f"Ensemble average: {results.ensemble_average_growth:.2%}") print(f"Survival rate: {results.survival_rate:.1%}") print(f"Ergodic divergence: {results.ergodic_divergence:.3f}") net_benefit = results.insurance_impact['net_benefit'] print(f"Insurance net benefit: ${net_benefit:,.0f}") if results.ergodic_divergence > 0: print("✓ Insurance shows ergodic advantage") else: print("⚠ Analysis validation failed - check inputs") Compare insured vs uninsured scenarios: .. code-block:: python # Run analysis with insurance insured_results = analyzer.integrate_loss_ergodic_analysis( loss_data, insurance, manufacturer, 20, 1000 ) # Run analysis without insurance uninsured_results = analyzer.integrate_loss_ergodic_analysis( loss_data, None, manufacturer, 20, 1000 ) # Compare outcomes if insured_results.validation_passed and uninsured_results.validation_passed: growth_improvement = (insured_results.time_average_growth - uninsured_results.time_average_growth) survival_improvement = (insured_results.survival_rate - uninsured_results.survival_rate) print(f"Growth rate improvement: {growth_improvement:.2%}") print(f"Survival rate improvement: {survival_improvement:.1%}") if growth_improvement > 0 and survival_improvement > 0: print("✓ Insurance provides clear benefits") elif growth_improvement > 0: print("✓ Insurance improves growth despite survival costs") elif survival_improvement > 0: print("✓ Insurance improves survival despite growth costs") else: print("? Insurance benefits unclear - review parameters") Parameter sensitivity analysis: .. code-block:: python # Test different loss frequencies frequencies = [1.0, 2.0, 3.0, 4.0, 5.0] results = {} for freq in frequencies: test_loss_data = LossData.from_poisson_lognormal( frequency_lambda=freq, severity_mean=1_000_000, severity_cv=2.0, time_horizon=20 ) result = analyzer.integrate_loss_ergodic_analysis( test_loss_data, insurance, manufacturer, 20, 500 ) results[freq] = result # Find optimal frequency range for insurance benefit for freq, result in results.items(): if result.validation_passed: print(f"Frequency {freq}: Growth={result.time_average_growth:.2%}, " f"Survival={result.survival_rate:.1%}") Detailed insurance impact analysis: .. code-block:: python if results.validation_passed: impact = results.insurance_impact metadata = results.metadata print(f"\n=== Insurance Impact Analysis ===") print(f"Total premiums paid: ${impact.get('premium_cost', 0):,.0f}") print(f"Total recoveries: ${impact.get('recovery_benefit', 0):,.0f}") print(f"Net financial benefit: ${impact.get('net_benefit', 0):,.0f}") print(f"Growth rate improvement: {impact.get('growth_improvement', 0):.2%}") # Calculate benefit ratios premium_cost = impact.get('premium_cost', 1) # Avoid division by zero if premium_cost > 0: recovery_ratio = impact.get('recovery_benefit', 0) / premium_cost benefit_ratio = impact.get('net_benefit', 0) / premium_cost print(f"\n=== Efficiency Metrics ===") print(f"Recovery ratio: {recovery_ratio:.2f}x premiums") print(f"Net benefit ratio: {benefit_ratio:.2f}x premiums") if benefit_ratio > 0: print("✓ Insurance provides positive net value") else: print("⚠ Insurance costs exceed benefits in expectation") print(" (But may still provide ergodic advantages)") Validation and Error Handling: The method includes comprehensive validation at multiple stages: **Input Validation**: - Loss data consistency checks - Insurance program parameter validation - Manufacturer model state verification **Process Validation**: - Simulation convergence monitoring - Mathematical consistency checks - Business logic validation **Output Validation**: - Result reasonableness checks - Statistical significance assessment - Cross-validation with alternative methods Performance Considerations: - Optimized for 100-5000 simulation runs - Memory-efficient trajectory storage - Parallel processing capabilities where available - Progress monitoring for long-running analyses Error Conditions: Returns results with validation_passed=False if: - Loss data fails validation checks - All simulation paths end in bankruptcy - Mathematical inconsistencies detected - Insufficient data for statistical analysis See Also: :class:`ErgodicAnalysisResults`: Detailed results format :class:`~ergodic_insurance.loss_distributions.LossData`: Loss data requirements :class:`~ergodic_insurance.insurance_program.InsuranceProgram`: Insurance setup :meth:`validate_insurance_ergodic_impact`: Additional validation methods """ from .loss_distributions import LossEvent from .simulation import Simulation # Validate input data if not loss_data.validate(): logger.warning("Loss data validation failed") return ErgodicAnalysisResults( time_average_growth=-np.inf, ensemble_average_growth=0.0, survival_rate=0.0, ergodic_divergence=-np.inf, insurance_impact={}, validation_passed=False, metadata={"error": "Invalid loss data"}, ) # Apply insurance if provided if insurance_program: insured_loss_data = loss_data.apply_insurance(insurance_program) insurance_metadata = insured_loss_data.metadata else: insured_loss_data = loss_data insurance_metadata = {} # Convert to annual aggregates for simulation annual_losses = insured_loss_data.get_annual_aggregates(time_horizon) # Run Monte Carlo simulations simulation_results = [] for sim_idx in range(n_simulations): # Create fresh manufacturer instance for each simulation import copy mfg_copy = copy.deepcopy(manufacturer) # Create simulation with manufacturer sim = Simulation(manufacturer=mfg_copy, time_horizon=time_horizon, seed=sim_idx) # Initialize result storage sim.years = np.arange(time_horizon) sim.assets = np.zeros(time_horizon) sim.equity = np.zeros(time_horizon) sim.roe = np.zeros(time_horizon) sim.revenue = np.zeros(time_horizon) sim.net_income = np.zeros(time_horizon) sim.claim_counts = np.zeros(time_horizon, dtype=int) sim.claim_amounts = np.zeros(time_horizon) sim.insolvency_year = None # Step through each year manually for year in range(time_horizon): # Get losses for this year (if any) loss_amount = annual_losses.get(year, 0.0) # Create loss events for this year losses = [] if loss_amount > 0: losses.append( LossEvent(time=float(year), amount=loss_amount, loss_type="aggregate") ) # Execute time step metrics = sim.step_annual(year, losses) # Store results sim.assets[year] = metrics.get("assets", 0) sim.equity[year] = metrics.get("equity", 0) sim.roe[year] = metrics.get("roe", 0) sim.revenue[year] = metrics.get("revenue", 0) sim.net_income[year] = metrics.get("net_income", 0) sim.claim_counts[year] = metrics.get("claim_count", 0) sim.claim_amounts[year] = metrics.get("claim_amount", 0) # Check for insolvency if metrics.get("equity", 0) <= 0: sim.insolvency_year = year # Fill remaining years with zeros sim.assets[year + 1 :] = 0 sim.equity[year + 1 :] = 0 sim.roe[year + 1 :] = np.nan sim.revenue[year + 1 :] = 0 sim.net_income[year + 1 :] = 0 break # Create results object from .simulation import SimulationResults result = SimulationResults( years=sim.years[: year + 1] if sim.insolvency_year else sim.years, assets=sim.assets[: year + 1] if sim.insolvency_year else sim.assets, equity=sim.equity[: year + 1] if sim.insolvency_year else sim.equity, roe=sim.roe[: year + 1] if sim.insolvency_year else sim.roe, revenue=sim.revenue[: year + 1] if sim.insolvency_year else sim.revenue, net_income=sim.net_income[: year + 1] if sim.insolvency_year else sim.net_income, claim_counts=( sim.claim_counts[: year + 1] if sim.insolvency_year else sim.claim_counts ), claim_amounts=( sim.claim_amounts[: year + 1] if sim.insolvency_year else sim.claim_amounts ), insolvency_year=sim.insolvency_year, ) simulation_results.append(result) # Calculate ergodic metrics equity_trajectories = [r.equity for r in simulation_results] # Time-average growth rates time_avg_growth_rates = [ self.calculate_time_average_growth(traj) for traj in equity_trajectories ] valid_time_avg = [g for g in time_avg_growth_rates if np.isfinite(g)] # Ensemble statistics ensemble_stats = self.calculate_ensemble_average(equity_trajectories, metric="growth_rate") # Calculate insurance impact insurance_impact = {} if insurance_metadata: insurance_impact = { "premium_cost": insurance_metadata.get("total_premiums", 0), "recovery_benefit": insurance_metadata.get("total_recoveries", 0), "net_benefit": insurance_metadata.get("net_benefit", 0), "growth_improvement": np.mean(valid_time_avg) if valid_time_avg else 0, } # Calculate ergodic divergence time_avg_mean = float(np.mean(valid_time_avg)) if valid_time_avg else -np.inf ensemble_mean = float(ensemble_stats["mean"]) ergodic_divergence = time_avg_mean - ensemble_mean # Validate results validation_passed = ( len(valid_time_avg) > 0 and ensemble_stats["survival_rate"] > 0 and np.isfinite(ergodic_divergence) ) return ErgodicAnalysisResults( time_average_growth=time_avg_mean, ensemble_average_growth=ensemble_mean, survival_rate=ensemble_stats["survival_rate"], ergodic_divergence=ergodic_divergence, insurance_impact=insurance_impact, validation_passed=validation_passed, metadata={ "n_simulations": n_simulations, "time_horizon": time_horizon, "n_survived": ensemble_stats["n_survived"], "loss_statistics": insured_loss_data.calculate_statistics(), }, )
[docs] def validate_insurance_ergodic_impact( # pylint: disable=too-many-locals self, base_scenario: SimulationResults, insurance_scenario: SimulationResults, insurance_program: Optional["InsuranceProgram"] = None, ) -> ValidationResults: """Comprehensively validate insurance effects in ergodic calculations. This method performs detailed validation to ensure that insurance impacts are properly reflected in the ergodic analysis. It checks the mathematical consistency and business logic of insurance effects on cash flows, growth rates, and survival probabilities. The validation is crucial for ensuring that ergodic analysis results are reliable and that observed insurance benefits (or costs) are genuine rather than artifacts of modeling errors or inconsistent implementations. Validation Checks Performed: 1. **Premium Deduction Validation**: Verifies that insurance premiums are properly deducted from cash flows and reflected in net income 2. **Recovery Credit Validation**: Confirms that insurance recoveries are properly credited and improve financial outcomes 3. **Collateral Impact Validation**: Checks that letter of credit costs and asset restrictions are properly modeled 4. **Growth Rate Consistency**: Validates that time-average growth calculations properly reflect insurance benefits Args: base_scenario (SimulationResults): Simulation results from baseline scenario without insurance coverage. Should represent the same business conditions and loss realizations as insurance_scenario but without insurance program applied. insurance_scenario (SimulationResults): Simulation results from scenario with insurance coverage. Should be directly comparable to base_scenario with only insurance coverage as the differentiating factor. insurance_program (Optional[InsuranceProgram]): The insurance program that was applied in insurance_scenario. If provided, enables more detailed validation of premium calculations and coverage effects. If None, performs validation based on observed differences only. Returns: ValidationResults: Comprehensive validation results containing: - premium_deductions_correct: Boolean indicating premium validation - recoveries_credited: Boolean indicating recovery validation - collateral_impacts_included: Boolean indicating collateral validation - time_average_reflects_benefit: Boolean indicating growth validation - overall_valid: Boolean indicating overall validation status - details: Dict with detailed validation information and metrics Examples: Basic validation after scenario comparison: .. code-block:: python # Run paired simulations base_sim = run_simulation(insurance_enabled=False, seed=12345) insured_sim = run_simulation(insurance_enabled=True, seed=12345) # Validate insurance effects validation = analyzer.validate_insurance_ergodic_impact( base_sim, insured_sim, insurance_program ) if validation.overall_valid: print("✓ Insurance effects properly modeled") print(f" Premium deductions: {validation.premium_deductions_correct}") print(f" Recoveries credited: {validation.recoveries_credited}") print(f" Collateral impacts: {validation.collateral_impacts_included}") print(f" Growth consistency: {validation.time_average_reflects_benefit}") else: print("⚠ Validation issues detected") print("Review modeling implementation") Detailed validation diagnostics: .. code-block:: python validation = analyzer.validate_insurance_ergodic_impact( base_scenario, insurance_scenario, insurance_program ) # Examine premium validation details if 'premium_check' in validation.details: premium_info = validation.details['premium_check'] print(f"\n=== Premium Validation ===") print(f"Expected premium: ${premium_info['expected']:,.0f}") print(f"Actual cost difference: ${premium_info['actual_diff']:,.0f}") print(f"Validation passed: {premium_info['valid']}") if not premium_info['valid']: diff = abs(premium_info['expected'] - premium_info['actual_diff']) print(f"⚠ Premium discrepancy: ${diff:,.0f}") # Examine recovery validation details if 'recovery_check' in validation.details: recovery_info = validation.details['recovery_check'] print(f"\n=== Recovery Validation ===") print(f"Base scenario claims: ${recovery_info['base_claims']:,.0f}") print(f"Insured scenario claims: ${recovery_info['insured_claims']:,.0f}") print(f"Base final equity: ${recovery_info['base_final_equity']:,.0f}") print(f"Insured final equity: ${recovery_info['insured_final_equity']:,.0f}") print(f"Validation passed: {recovery_info['valid']}") # Examine growth rate validation if 'growth_check' in validation.details: growth_info = validation.details['growth_check'] print(f"\n=== Growth Rate Validation ===") print(f"Base growth rate: {growth_info['base_growth']:.2%}") print(f"Insured growth rate: {growth_info['insured_growth']:.2%}") print(f"Growth improvement: {growth_info['improvement']:.2%}") print(f"Validation passed: {growth_info['valid']}") if growth_info['improvement'] > 0: print("✓ Insurance improves time-average growth") elif np.isfinite(growth_info['insured_growth']) and not np.isfinite(growth_info['base_growth']): print("✓ Insurance prevents bankruptcy (infinite improvement)") Validation in Monte Carlo context: .. code-block:: python # Validate across multiple random seeds validation_results = [] for seed in range(10): # Test 10 paired simulations base = run_simulation(insurance_enabled=False, seed=seed) insured = run_simulation(insurance_enabled=True, seed=seed) validation = analyzer.validate_insurance_ergodic_impact( base, insured, insurance_program ) validation_results.append(validation.overall_valid) # Check consistency across seeds validation_rate = sum(validation_results) / len(validation_results) print(f"Validation rate across seeds: {validation_rate:.1%}") if validation_rate < 0.8: print("⚠ Inconsistent validation - check model implementation") else: print("✓ Consistent validation across scenarios") Integration with scenario comparison: .. code-block:: python # Run comparison analysis comparison = analyzer.compare_scenarios( [insured_sim], [base_sim], metric="equity" ) # Validate the comparison validation = analyzer.validate_insurance_ergodic_impact( base_sim, insured_sim, insurance_program ) # Cross-check results if validation.overall_valid and comparison['ergodic_advantage']['significant']: print("✓ Validated significant ergodic advantage from insurance") print(f" Time-average improvement: {comparison['ergodic_advantage']['time_average_gain']:.2%}") print(f" Statistical significance: p = {comparison['ergodic_advantage']['p_value']:.4f}") elif validation.overall_valid: print("✓ Insurance effects validated but not statistically significant") print("Consider running more simulations or adjusting parameters") else: print("⚠ Validation failed - results may be unreliable") print("Review model implementation before drawing conclusions") Validation Logic Details: **Premium Validation**: Compares expected premium costs (from insurance program) with actual observed difference in net income between scenarios. Allows for small numerical differences (<1% of expected premium). **Recovery Validation**: Checks that insurance scenario shows better financial performance despite potential premium costs. Allows for 5% variance to account for timing differences and model approximations. **Collateral Validation**: Verifies that letter of credit costs and asset restrictions are reflected in the financial calculations. Checks for non-zero differences in asset levels between scenarios. **Growth Rate Validation**: Ensures that time-average growth calculations properly reflect insurance benefits, especially in scenarios with significant loss exposure. Handles bankruptcy cases appropriately. Common Validation Failures: - Premium costs not properly deducted from cash flows - Insurance recoveries not credited to reduce net losses - Letter of credit collateral costs not included in expense calculations - Inconsistent treatment of bankruptcy scenarios - Timing mismatches between premium payments and loss occurrences Troubleshooting: If validation fails, check: 1. Consistent random seed usage between base and insured scenarios 2. Proper integration of insurance program with manufacturer model 3. Correct timing of premium payments and loss recoveries 4. Accurate letter of credit cost calculations 5. Consistent handling of bankruptcy and survival scenarios Performance Notes: - Fast execution for single scenario pairs - Efficient for batch validation across multiple seeds - Comprehensive diagnostics with minimal computational overhead See Also: :class:`ValidationResults`: Detailed validation results format :meth:`compare_scenarios`: Main scenario comparison method :meth:`integrate_loss_ergodic_analysis`: End-to-end analysis pipeline :class:`~ergodic_insurance.insurance_program.InsuranceProgram`: Insurance modeling """ details = {} # Check premium deductions premium_deductions_correct = True if insurance_program and hasattr(insurance_program, "calculate_premium"): expected_premium = insurance_program.calculate_premium() actual_cost_diff = np.sum(base_scenario.net_income - insurance_scenario.net_income) premium_diff = abs(actual_cost_diff - expected_premium * len(base_scenario.years)) premium_deductions_correct = premium_diff < 0.01 * expected_premium details["premium_check"] = { "expected": expected_premium, "actual_diff": actual_cost_diff, "valid": premium_deductions_correct, } # Check recoveries are credited # With insurance, gross claims may be same or higher (if taking more risk), # but the net financial impact should be lower recoveries_credited = True total_base_claims = np.sum(base_scenario.claim_amounts) total_insured_claims = np.sum(insurance_scenario.claim_amounts) # Check if insurance scenario shows better financial performance # (either less net loss or better equity growth) base_final_equity = base_scenario.equity[-1] if len(base_scenario.equity) > 0 else 0 insured_final_equity = ( insurance_scenario.equity[-1] if len(insurance_scenario.equity) > 0 else 0 ) # Insurance is credited if final equity is better or claims don't reduce equity as much recoveries_credited = ( insured_final_equity >= base_final_equity * 0.95 ) # Allow small variance details["recovery_check"] = { "base_claims": total_base_claims, "insured_claims": total_insured_claims, "base_final_equity": base_final_equity, "insured_final_equity": insured_final_equity, "valid": recoveries_credited, } # Check collateral impacts (simplified check) collateral_impacts_included = True if insurance_program and hasattr(insurance_program, "collateral_requirement"): # Check if working capital is affected base_assets = base_scenario.assets insured_assets = insurance_scenario.assets asset_diff = np.mean(insured_assets - base_assets) collateral_impacts_included = abs(asset_diff) > 0 details["collateral_check"] = { "asset_difference": asset_diff, "valid": collateral_impacts_included, } # Check time-average growth benefit base_growth = self.calculate_time_average_growth(base_scenario.equity) insured_growth = self.calculate_time_average_growth(insurance_scenario.equity) growth_improvement = insured_growth - base_growth # Insurance should improve growth if losses are significant time_average_reflects_benefit = True if total_base_claims > 0: # Expect positive growth improvement with insurance time_average_reflects_benefit = growth_improvement >= 0 or np.isfinite(insured_growth) details["growth_check"] = { "base_growth": base_growth, "insured_growth": insured_growth, "improvement": growth_improvement, "valid": time_average_reflects_benefit, } # Overall validation overall_valid = ( premium_deductions_correct and recoveries_credited and collateral_impacts_included and time_average_reflects_benefit ) return ValidationResults( premium_deductions_correct=premium_deductions_correct, recoveries_credited=recoveries_credited, collateral_impacts_included=collateral_impacts_included, time_average_reflects_benefit=time_average_reflects_benefit, overall_valid=overall_valid, details=details, )