MCMC¶
The main result of running rsdfit mcmc ...
is a *.npz
file saved to the
output directory for each MCMC chain that was run. These results can be
loaded from file using the pyRSD.rsdfit.results.EmceeResults
class.
The EmceeResults
stores the entire MCMC chain
for each free parameter and can return statistics based on this chain for
each parameter, i.e., the median, \(1\sigma\) and \(2\sigma\)
errors, etc. This object also computes the corresponding MCMC chain
for the constrained parameter values and can return information
for each constrained parameter.
The median parameter values, as well as the 68% and 95% confidence
intervals can be quickly displayed by printing the
EmceeResults
object. For example,
In [1]: from pyRSD.rsdfit.results import EmceeResults
In [2]: results = EmceeResults.from_npz('mcmc_result.npz')
# print out a summary of the parameters, with mean values and 68% and 95% intervals
In [3]: print(results)
Free parameters [ median (+/-68%) (+/-95%) ]
_______________
<Parameter Nsat_mult: 2.423 (+0.2082 -0.1809) (+0.4361 -0.3432)>
<Parameter alpha_par: 1.009 (+0.00682 -0.007681) (+0.01353 -0.01379)>
<Parameter alpha_perp: 1.005 (+0.004241 -0.004049) (+0.008281 -0.008406)>
<Parameter b1_cA: 1.999 (+0.05749 -0.05801) (+0.1126 -0.1265)>
<Parameter f: 0.867 (+0.02897 -0.02502) (+0.05642 -0.05073)>
<Parameter f1h_sBsB: 3.569 (+0.5521 -0.6841) (+1.224 -1.396)>
<Parameter fs: 0.1434 (+0.007553 -0.008046) (+0.01554 -0.01544)>
<Parameter fsB: 0.4662 (+0.08146 -0.0792) (+0.1685 -0.1606)>
<Parameter gamma_b1sA: 1.31 (+0.1001 -0.1032) (+0.2119 -0.2205)>
<Parameter gamma_b1sB: 2.343 (+0.1661 -0.181) (+0.3225 -0.3636)>
<Parameter sigma8_z: 0.5411 (+0.01224 -0.01362) (+0.02611 -0.02452)>
<Parameter sigma_c: 0.9297 (+0.06205 -0.06473) (+0.1126 -0.1177)>
<Parameter sigma_sA: 3.443 (+0.2779 -0.2702) (+0.5436 -0.5378)>
Constrained parameters [ median (+/-68%) (+/-95%) ]
______________________
<Parameter NsBsB: 5.178e+04 (+1.721e+04 -1.36e+04) (+4.16e+04 -2.44e+04)>
<Parameter b1_sA: 2.617 (+0.169 -0.1761) (+0.3037 -0.3725)>
<Parameter sigma_sB: 5.71 (+0.3067 -0.2659) (+0.6624 -0.4928)>
<Parameter NcBs: 2.261e+04 (+2900 -2096) (+6130 -4001)>
<Parameter b1_sB: 4.686 (+0.2679 -0.3235) (+0.5596 -0.672)>
<Parameter b1_cB: 3.175 (+0.1504 -0.1736) (+0.2942 -0.3522)>
<Parameter fcB: 0.122 (+0.01336 -0.0149) (+0.02856 -0.02761)>
<Parameter b1_c: 2.141 (+0.04721 -0.04684) (+0.08834 -0.08337)>
<Parameter b1: 2.344 (+0.05369 -0.04764) (+0.1051 -0.1053)>
<Parameter fsigma8: 0.4689 (+0.009673 -0.009076) (+0.01866 -0.01848)>
<Parameter b1_s: 3.575 (+0.1907 -0.2037) (+0.3854 -0.4186)>
<Parameter alpha: 1.006 (+0.004107 -0.004331) (+0.007706 -0.0083)>
<Parameter epsilon: 0.00125 (+0.002298 -0.002462) (+0.004696 -0.004652)>
<Parameter b1sigma8: 1.268 (+0.007575 -0.008006) (+0.01531 -0.01495)>
<Parameter F_AP: 1.004 (+0.006927 -0.007386) (+0.01419 -0.01393)>
The EmceeParameter Object¶
The EmceeResults
object provides
access to the parameters via a dictionary-like behavior. When accessing
parameters using the name as the key, a EmceeParameter
object
is returned. This object has several useful attributes holding information
about the parameter:
- flat_trace : the flattened MCMC chain for this parameter
- median : the median of the trace
- mean: the average value of the trace
- one_sigma : tuple of the lower and upper \(1\sigma\) errors
- two_sigma : tuple of the lower and upper \(2\sigma\) errors
- three_sigma : tuple of the lower and upper \(3\sigma\) errors
- stderr : the average of the upper and lower \(1\sigma\) error
For example,
In [4]: f = results['f']
In [5]: print(f.median, f.one_sigma, f.two_sigma)
0.8669514683620745 [-0.02501616327798617, 0.02896606398246815] [-0.05073347536029271, 0.05641800142862763]
In [6]: sigma8 = results['sigma8_z']
In [7]: print(sigma8.median, sigma8.one_sigma, sigma8.two_sigma)
0.5410678224052163 [-0.013623734433715451, 0.01224434253786777] [-0.02452235949056536, 0.026109037129819712]
# access to constrained parameters too
In [8]: fs8 = results['fsigma8']
In [9]: print(fs8)
<Parameter fsigma8: 0.4689 (+0.009673 -0.009076) (+0.01866 -0.01848)>
Specifying the Burn-in Period¶
The user can specify a “burn-in” period for a given results object, by
changing the burnin
attribute of the EmceeResults
object.
The value of this attribute specifies the number of steps to ignore, starting
from the beginning of the MCMC chain. The ignored steps are not included when
computing any statistics from the parameter traces.
The attributes of the EmceeParameter
object automatically take into
account the value of the burnin attribute. Thus, the user just needs to set
the burnin
and the parameter values will automatically adjust
accordingly.
For example,
In [10]: results.burnin = 0 # ignore 0 steps
In [11]: print(results['fsigma8'].median)
0.4689141809940338
In [12]: results.burnin = 500 # ignore the first 500 steps
In [13]: print(results['fsigma8'].median) # slight change in value
0.468679279088974
Ideally, if the chain has converged for a given parameter, the user should see little change to the parameter’s value when adjusting the burnin period.
The Best-fit Values¶
The user can quickly access the best-fit parameter vector using the
values
attribute, which returns the median value of each free
parameter. Similarly, the constrained_values
attribute returns
the median value of each constrained parameter.
Alternatively, the user can access the parameter vector that maximizes
the log probability by accessing the max_lnprob_values
and
max_lnprob_constrained_values
attributes.