Quickstart¶
The pyRSD package provides a number of examples in order to get users
up and running quickly. Once these examples have been downloaded, the
user can start running their own parameter fits using the example data
and parameter files. The executable responsible for downloading the pyRSD
examples is pyrsd-quickstart
, which has the following calling sequence:
$ pyrsd-quickstart -h
usage: pyrsd-quickstart [-h]
{galaxy/survey-poles,galaxy/periodic-poles,galaxy/periodic-pkmu}
dirname
download example data and parameter files to get up and running with pyRSD
positional arguments:
{galaxy/survey-poles,galaxy/periodic-poles,galaxy/periodic-pkmu}
which example to download
dirname the output directory to save the downloaded files; if
it doesn't exist it will be created
optional arguments:
-h, --help show this help message and exit
The first argument provided to the command is the name of the example to download, which should be of the following:
Example Name | Description |
galaxy/periodic-pkmu | Fitting \(P(k,\mu)\) galaxy data from a periodic box simulation |
galaxy/periodic-poles | Fitting \(P_\ell(k)\) galaxy data from a periodic box simulation |
galaxy/survey-poles | Fitting \(P_\ell(k)\) galaxy data from a simulation with a realistic survey window function |
And the second argument is the name of the directory to download the files too.
So, for example, to download the files from the periodic-poles
examples to
a directory called pyRSD-example
simply do
pyrsd-quickstart galaxy/periodic-poles pyRSD-example
Running Parameter Fits¶
With the example downloaded, the user can run MCMC or nonlinear optimization
fits using the default model parametrization with the rsdfit
command.
See The Free Parameters for a table of the default free parameters and
The Constrained Parameters for a table of the default constrained parameters.
For example, to run 10 nonlinear optimization steps (using the LBFGS optimization algorithm), you can do
rsdfit nlopt -m pyRSD-example/model.npy -p pyRSD-example/params.dat -o pyRSD-example-results -i 10
or to run 10 MCMC iterations (using 30 emcee
walkers), you can do
rsdfit mcmc -m pyRSD-example/model.npy -p pyRSD-example/params.dat -o pyRSD-example-results -i 10 -w 30
The number of iterations has been set to 10 here just for illustration purporse. Typically, the LBFGS algoritm will take \(\sim100-200\) steps to converge, and the mcmc algorithm will need 1000s of iterations to fully explore the posterior distributions of the parameters.
Analyzing the Fit Results¶
The rsdfit
saves the best-fit parameter set to a numpy .npz
file in
the directory specified via the -o
output, which is pyRSD-example-results
in the example above. There are two Python objects in pyRSD that can read these
files, depending on the type of fit that was run. For mcmc
fits, use the
pyRSD.rsdfit.results.EmceeResults
class and for nlopt
fits, use
the pyRSD.rsdfit.results.LBFGSResults
class.
For example, to explore the fitting results from a mcmc
fit
In [1]: from pyRSD.rsdfit.results import EmceeResults, LBFGSResults
In [2]: mcmc_results = EmceeResults.from_npz('mcmc_result.npz')
# print out a summary of the parameters, with mean values and 68% and 95% intervals
In [3]: print(mcmc_results)
Free parameters [ median (+/-68%) (+/-95%) ]
_______________
<Parameter Nsat_mult: 2.423 (+0.2082 -0.1809) (+0.4361 -0.3432)>
<Parameter alpha_par: 1.009 (+0.00682 -0.007681) (+0.01353 -0.01379)>
<Parameter alpha_perp: 1.005 (+0.004241 -0.004049) (+0.008281 -0.008406)>
<Parameter b1_cA: 1.999 (+0.05749 -0.05801) (+0.1126 -0.1265)>
<Parameter f: 0.867 (+0.02897 -0.02502) (+0.05642 -0.05073)>
<Parameter f1h_sBsB: 3.569 (+0.5521 -0.6841) (+1.224 -1.396)>
<Parameter fs: 0.1434 (+0.007553 -0.008046) (+0.01554 -0.01544)>
<Parameter fsB: 0.4662 (+0.08146 -0.0792) (+0.1685 -0.1606)>
<Parameter gamma_b1sA: 1.31 (+0.1001 -0.1032) (+0.2119 -0.2205)>
<Parameter gamma_b1sB: 2.343 (+0.1661 -0.181) (+0.3225 -0.3636)>
<Parameter sigma8_z: 0.5411 (+0.01224 -0.01362) (+0.02611 -0.02452)>
<Parameter sigma_c: 0.9297 (+0.06205 -0.06473) (+0.1126 -0.1177)>
<Parameter sigma_sA: 3.443 (+0.2779 -0.2702) (+0.5436 -0.5378)>
Constrained parameters [ median (+/-68%) (+/-95%) ]
______________________
<Parameter NsBsB: 5.178e+04 (+1.721e+04 -1.36e+04) (+4.16e+04 -2.44e+04)>
<Parameter b1_sA: 2.617 (+0.169 -0.1761) (+0.3037 -0.3725)>
<Parameter sigma_sB: 5.71 (+0.3067 -0.2659) (+0.6624 -0.4928)>
<Parameter NcBs: 2.261e+04 (+2900 -2096) (+6130 -4001)>
<Parameter b1_sB: 4.686 (+0.2679 -0.3235) (+0.5596 -0.672)>
<Parameter b1_cB: 3.175 (+0.1504 -0.1736) (+0.2942 -0.3522)>
<Parameter fcB: 0.122 (+0.01336 -0.0149) (+0.02856 -0.02761)>
<Parameter b1_c: 2.141 (+0.04721 -0.04684) (+0.08834 -0.08337)>
<Parameter b1: 2.344 (+0.05369 -0.04764) (+0.1051 -0.1053)>
<Parameter fsigma8: 0.4689 (+0.009673 -0.009076) (+0.01866 -0.01848)>
<Parameter b1_s: 3.575 (+0.1907 -0.2037) (+0.3854 -0.4186)>
<Parameter alpha: 1.006 (+0.004107 -0.004331) (+0.007706 -0.0083)>
<Parameter epsilon: 0.00125 (+0.002298 -0.002462) (+0.004696 -0.004652)>
<Parameter b1sigma8: 1.268 (+0.007575 -0.008006) (+0.01531 -0.01495)>
<Parameter F_AP: 1.004 (+0.006927 -0.007386) (+0.01419 -0.01393)>
# access parameters like a dictionary
In [4]: fsat = mcmc_results['fs']
In [5]: print(fsat.median)
0.14339263725236592
and to explore the fitting results from a nlopt
fit
In [6]: nlopt_results = LBFGSResults.from_npz('nlopt_result.npz')
# print out a summary of the parameters, with best-fit values
In [7]: print(nlopt_results)
minimum chi2 = 120.57745038002743
Free parameters [ mean ]
_______________
Nsat_mult : 2.4324089012014207
alpha_par : 1.0072991239946898
alpha_perp : 1.0048333693698863
b1_cA : 2.0068377074748778
f : 0.8735346371321935
f1h_sBsB : 3.6140366462767153
fs : 0.14175570928456935
fsB : 0.4782779433107577
gamma_b1sA : 1.31213984320964
gamma_b1sB : 2.3651886897525896
sigma8_z : 0.5363993024676117
sigma_c : 0.9194602113178405
sigma_sA : 3.420304679802984
Constrained parameters [ mean ]
_________________________
NsBsB : 51736.1
b1_sA : 2.6332517
NcBs : 23183.68
fcB : 0.11864934
sigma_sB : 5.739127
b1_sB : 4.74655
b1_cB : 3.2117057
epsilon : 0.0008172965
b1_c : 2.1497946
b1sigma8 : 1.2667638
alpha : 1.0056546
F_AP : 1.0024539
b1_s : 3.6439955
fsigma8 : 0.46856338
b1 : 2.3616061
# access best-fit values like a dictionary
In [8]: fsat = nlopt_results['fs']
In [9]: print(fsat)
0.14175570928456935
Comparing the Best-fit Model to Data¶
Users can compare the best-fitting model to the data by loading the
results of a fitting run using the pyRSD.rsdfit.FittingDriver
.
We can easily initialize this object by passing the directory where the results
were written to the pyRSD.rsdfit.FittingDriver.from_directory
function.
For the example data downloaded above, we can explore both the data
and theory simulataneously using the included result file
nlopt_result.npz
:
from pyRSD.rsdfit import FittingDriver
# load the model and results into one object
d = FittingDriver.from_directory('pyRSD-example', model_file='pyRSD-example/model.npy', results_file='pyRSD-example/nlopt_result.npz')
# set the fit results
d.set_fit_results()
# the best-fit log probability (likelihood + priors)
print(d.lnprob())
# the best-fit chi2
print(d.chi2())
# the best-fit reduced chi2
print(d.reduced_chi2())
# make a plot of the data vs the theory
d.plot()
show()
In this plot, we show the monopole, quadrupole, and hexadecapole normalized
by the smooth, no-wiggle Eisenstein and Hu
monopole. All of the above steps are identical if we are analyzing \(P(k,\mu)\)
data rather than \(P_\ell(k)\) data. For example, if the periodic-pkmu
example is downloaded, running the function FittingDriver.plot()
using
the included result file nlopt_result.npz
produces the following figure:
This plot shows the best-fit theory and data for 5 wide \(\mu\) bins, normalized by the linear Kaiser \(P(k,\mu)\), using the no-wiggle Eisenstein and Hu linear power spectrum.