API

The pyRSD.rsdfit.FittingDriver class is responsible for running parameter fits. It combines data and theory to run a Bayesian likelihood analysis.

The properties that describe the data and theory configuration are:

Nb The number of data points
Np The number of free parameters
dof The number of degrees of freedom
model The model object which returns the P(k,mu) or multipoles theory
results The results object storing the fitting results

The functions that evaluate the likelihood, its derivatives, and associated statistics are:

lnprob([theta]) Set the theory free parameters, update the model, and return the log of the posterior probability function
lnlike([theta]) The log of the likelihood, equal to -0.5 * chi2()
minus_lnlike([theta, use_priors]) Return the negative log-likelihood, optionally including priors
grad_minus_lnlike([theta, epsilon, pool, …]) Return the vector of the gradient of the negative log likelihood, with respect to the free parameters
chi2([theta]) The chi-squared for the specified model function
reduced_chi2() The reduced chi squared value, using the current values of the free parameters
fisher([theta, epsilon, pool, use_priors, …]) Return the Fisher information matrix
marginalized_errors([params, theta, fixed]) Return the marginalized errors on the specified parameters, as computed from the Fisher matrix
run(solver_type[, pool, chains_comm]) Run the whole fitting analysis, from start to finish

The user can initialize a FittingDriver object from a results directory using

from_directory(dirname[, results_file, …]) Load a FittingDriver from a results directory

The best-fit parameters can be set and visualized using

set_fit_results([method]) Set the free parameters from the results objects and update the model
plot([usetex, ax, colors, use_labels]) Plot the best-fit theory and data points
plot_residuals() Plot the residuals of the best-fit theory and data
class pyRSD.rsdfit.FittingDriver(param_file, init_model=True, **kwargs)

A driver to run the parameter fitting pipeline, combining the data, the theory model, and a fitting algorithm

Parameters:

param_file : str

a string specifying the name of the main parameter file

init_model : bool, optional

if True, initialize the theoretical model upon initialization; default is True

Nb

The number of data points

Np

The number of free parameters

chi2(theta=None)

The chi-squared for the specified model function

This returns

\[\chi^2 = (\mathcal{M} - \mathcal{D})^T C^{-1} (\mathcal{M} - \mathcal{D})\]
Parameters:

theta : array_like, optional

an array of the free parameters to evaluate the statistic at; if None, the current values of the free parameters in theory.fit_params are used
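As an illustration of the formula above, here is a minimal pure-Python sketch of the quadratic form. The helper name `chi2_statistic` and the toy numbers are assumptions made for this example; they are not part of pyRSD, which evaluates the statistic from its own data and theory objects.

```python
def chi2_statistic(model, data, inv_cov):
    """Return (M - D)^T C^{-1} (M - D) for plain Python lists."""
    diff = [m - d for m, d in zip(model, data)]
    n = len(diff)
    return sum(diff[i] * inv_cov[i][j] * diff[j]
               for i in range(n) for j in range(n))

# Toy inputs (assumed): three data points with independent errors
# sigma = 0.5, so C^{-1} is diagonal with entries 1 / sigma^2 = 4.
data = [1.0, 2.0, 3.0]
model = [1.5, 2.0, 2.0]
inv_cov = [[4.0, 0.0, 0.0],
           [0.0, 4.0, 0.0],
           [0.0, 0.0, 4.0]]

value = chi2_statistic(model, data, inv_cov)
print(value)
```

With a diagonal covariance the sum reduces to the familiar sum of squared residuals divided by the variances.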

dof

The number of degrees of freedom

This is equal to the number of data points minus the number of free parameters
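The relation between Nb, Np, and dof can be sketched in a few lines. The reduced chi-squared is shown here under the standard definition, chi2 divided by dof, which is assumed rather than taken from the pyRSD source; the function names are hypothetical.

```python
def degrees_of_freedom(n_data, n_free):
    """Number of data points minus number of free parameters."""
    return n_data - n_free

def reduced_chi2_value(chi2_value, n_data, n_free):
    """Chi-squared per degree of freedom (standard definition, assumed)."""
    return chi2_value / degrees_of_freedom(n_data, n_free)

# Toy numbers (assumed): 50 data points, 6 free parameters, chi2 = 55.
dof = degrees_of_freedom(50, 6)
red = reduced_chi2_value(55.0, 50, 6)
print(dof, red)
```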

fisher(theta=None, epsilon=0.0001, pool=None, use_priors=True, numerical=False, numerical_from_lnlike=False)

Return the Fisher information matrix

This is defined as the negative Hessian of the log likelihood with respect to the parameter vector

\[F_{ij} = - \frac{\partial^2 \log \mathcal{L}}{\partial \theta_i \partial \theta_j}\]

This uses a central-difference finite-difference approximation to compute the numerical derivatives

Parameters:

theta : array_like, optional

if provided, the values of the free parameters to evaluate the Fisher matrix at; if not provided, the current values of free parameters from theory.fit_params will be used

epsilon : float or array_like, optional

the step size to use in the finite-difference derivative calculation; default is 1e-4. An array can be passed to use a different step size for each parameter

pool : MPIPool, optional

an MPI pool object to distribute the calculations of derivatives to multiple processes in parallel

use_priors : bool, optional

whether to include the log priors in the objective function when minimizing the negative log probability

numerical : bool, optional

if True, evaluate gradients of P(k,mu) numerically using finite difference

numerical_from_lnlike : bool, optional

if True, evaluate the gradient by taking the numerical derivative of minus_lnlike()
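To make the definition concrete, the sketch below estimates the Hessian of a toy negative log-likelihood with central differences. It is only an illustration of the formula: the function `hessian_central` and the toy likelihood are assumptions, and pyRSD may assemble the Fisher matrix differently (for example from gradients of the model, as the `numerical` flag suggests).

```python
def toy_minus_lnlike(theta):
    # Toy Gaussian case (assumed): -ln L = 0.5 * theta^T A theta
    # with A = [[4, 1], [1, 2]], so the exact Fisher matrix is A.
    a, b = theta
    return 2.0 * a * a + a * b + b * b

def hessian_central(f, theta, eps=1e-4):
    """Central-difference estimate of d^2 f / (d theta_i d theta_j)."""
    n = len(theta)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                t = list(theta)
                t[i] += si * eps
                t[j] += sj * eps
                return f(t)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * eps * eps)
    return hess

# The Hessian of -ln L equals the negative Hessian of ln L.
F = hessian_central(toy_minus_lnlike, [0.3, -0.2])
print(F)
```

For a quadratic log-likelihood the estimate agrees with the exact matrix up to round-off, which is why the step size matters mainly for non-Gaussian cases.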

classmethod from_directory(dirname, results_file=None, model_file=None, init_model=True, **kwargs)

Load a FittingDriver from a results directory

This reads the params.dat file, optionally loading a pickled model and a results object from file

Parameters:

dirname : str

the name of the directory holding the results

results_file : str, optional

the name of the file holding the results. Default is None

model_file : str, optional

the name of the file holding the model to load. Default is None

init_model : bool, optional

whether to initialize the RSD model upon loading. If a model file exists in the specified directory, the model is loaded and no new model is initialized

grad_minus_lnlike(theta=None, epsilon=0.0001, pool=None, use_priors=True, numerical=False, numerical_from_lnlike=False)

Return the vector of the gradient of the negative log likelihood, with respect to the free parameters

This uses a central-difference finite-difference approximation to compute the numerical derivatives

Parameters:

theta : array_like, optional

if provided, the values of the free parameters to compute the gradient at; if not provided, the current values of free parameters from theory.fit_params will be used

epsilon : float or array_like, optional

the step size to use in the finite-difference derivative calculation; default is 1e-4. An array can be passed to use a different step size for each parameter

pool : MPIPool, optional

an MPI pool object to distribute the calculations of derivatives to multiple processes in parallel

use_priors : bool, optional

whether to include the log priors in the objective function when minimizing the negative log probability

numerical : bool, optional

if True, evaluate gradients of P(k,mu) numerically using finite difference

numerical_from_lnlike : bool, optional

if True, evaluate the gradient by taking the numerical derivative of minus_lnlike()
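The central-difference scheme mentioned above can be sketched as follows, including a per-parameter step size as allowed by `epsilon`. The helper `central_gradient` and the toy objective are hypothetical and stand in for the real minus_lnlike(); this is not the pyRSD implementation.

```python
def toy_minus_lnlike(theta):
    # Toy objective (assumed): 2 a^2 + a b + b^2,
    # with exact gradient (4 a + b, a + 2 b).
    a, b = theta
    return 2.0 * a * a + a * b + b * b

def central_gradient(f, theta, epsilon=1e-4):
    """Estimate df/dtheta_i as [f(theta + h_i) - f(theta - h_i)] / (2 h_i)."""
    n = len(theta)
    # Accept either one step size for all parameters or one per parameter.
    steps = [epsilon] * n if isinstance(epsilon, (int, float)) else list(epsilon)
    grad = []
    for i in range(n):
        up, down = list(theta), list(theta)
        up[i] += steps[i]
        down[i] -= steps[i]
        grad.append((f(up) - f(down)) / (2.0 * steps[i]))
    return grad

grad = central_gradient(toy_minus_lnlike, [0.3, -0.2], epsilon=[1e-4, 1e-3])
print(grad)
```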

lnlike(theta=None)

The log of the likelihood, equal to -0.5 * chi2()

Parameters:

theta : array_like, optional

an array of the free parameters to evaluate the statistic at; if None, the current values of the free parameters in theory.fit_params are used

lnprob(theta=None)

Set the theory free parameters, update the model, and return the log of the posterior probability function

This returns -0.5 * chi2() + lnprior()

Parameters:

theta : array_like, optional

an array of the free parameters to evaluate the statistic at; if None, the current values of the free parameters in theory.fit_params are used
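The sum of log-likelihood and log-prior can be illustrated with a small stand-alone sketch. The flat prior, the bounds, and all function names below are assumptions chosen for the example; the priors actually used by pyRSD are configured in the parameter file and may differ.

```python
import math

def toy_chi2(theta):
    # Toy statistic (assumed): one data point at 2.0 with error 0.5.
    return ((theta[0] - 2.0) / 0.5) ** 2

def lnprior_flat(theta, bounds):
    """Flat prior: 0 inside the bounds, -inf outside."""
    for value, (low, high) in zip(theta, bounds):
        if not low <= value <= high:
            return -math.inf
    return 0.0

def log_posterior(theta, chi2_func, bounds):
    """Return -0.5 * chi2 + lnprior, skipping the model if the prior is zero."""
    lp = lnprior_flat(theta, bounds)
    if not math.isfinite(lp):
        return -math.inf
    return -0.5 * chi2_func(theta) + lp

bounds = [(0.0, 4.0)]
inside = log_posterior([1.5], toy_chi2, bounds)
outside = log_posterior([5.0], toy_chi2, bounds)
print(inside, outside)
```

Returning early when the prior vanishes avoids evaluating the theory at parameter values that would be rejected anyway.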

marginalized_errors(params=None, theta=None, fixed=[], **kws)

Return the marginalized errors on the specified parameters, as computed from the Fisher matrix

This is given by: \(\sigma_i = \left[(F^{-1})_{ii}\right]^{1/2}\)

Optionally, we can fix certain parameters, specified in fixed

Parameters:

params : list, optional

return errors for these parameters; default is all free parameters

theta : array_like, optional

the array of free parameters to evaluate the best-fit model at

fixed : list, optional

list of free parameters to fix when evaluating marginalized errors

**kws :

additional keywords passed to fisher()

Returns:

errors : array_like

the marginalized errors as computed from the inverse of the Fisher matrix
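The sketch below shows one common way to obtain marginalized errors from a Fisher matrix, and how fixing a parameter changes the result. It assumes that fixing a parameter means removing its row and column before inversion; the helper names, parameter names `a` and `b`, and the toy matrix are hypothetical, not taken from pyRSD.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def errors_from_fisher(fisher, names, params=None, fixed=()):
    """Square root of the diagonal of the inverse Fisher matrix."""
    keep = [i for i, name in enumerate(names) if name not in fixed]
    cov = invert([[fisher[i][j] for j in keep] for i in keep])
    errors = {names[i]: math.sqrt(cov[k][k]) for k, i in enumerate(keep)}
    wanted = params if params is not None else [names[i] for i in keep]
    return [errors[name] for name in wanted]

fisher = [[4.0, 1.0], [1.0, 2.0]]
names = ["a", "b"]
marginal = errors_from_fisher(fisher, names)
a_with_b_fixed = errors_from_fisher(fisher, names, params=["a"], fixed=["b"])
print(marginal, a_with_b_fixed)
```

In this toy case the error on `a` shrinks when `b` is fixed, because the correlation between the two parameters no longer contributes.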

minus_lnlike(theta=None, use_priors=False)

Return the negative log-likelihood, optionally including priors

Parameters:

theta : array_like, optional

an array of the free parameters to evaluate the statistic at; if None, the current values of the free parameters in theory.fit_params are used

use_priors : bool, optional

whether to include the log priors in the objective function when minimizing the negative log probability
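A rough sketch of how the `use_priors` switch could act on the objective function is given below. It assumes that including priors means subtracting the log prior from the negative log-likelihood; the function `objective`, the Gaussian prior, and the toy likelihood are hypothetical illustrations.

```python
def toy_lnlike(theta):
    # Toy log-likelihood (assumed): one data point at 2.0 with error 0.5.
    return -0.5 * ((theta[0] - 2.0) / 0.5) ** 2

def toy_lnprior(theta):
    # Toy Gaussian prior (assumed): mean 1.0, width 1.0, constant dropped.
    return -0.5 * (theta[0] - 1.0) ** 2

def objective(theta, lnlike, lnprior, use_priors=False):
    """Negative log-likelihood, minus the log prior if requested."""
    value = -lnlike(theta)
    if use_priors:
        value -= lnprior(theta)
    return value

without_priors = objective([1.5], toy_lnlike, toy_lnprior)
with_priors = objective([1.5], toy_lnlike, toy_lnprior, use_priors=True)
print(without_priors, with_priors)
```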

model

The model object which returns the P(k,mu) or multipoles theory

plot(usetex=False, ax=None, colors=None, use_labels=True, **kws)

Plot the best-fit theory and data points

plot_residuals()

Plot the residuals of the best-fit theory and data

reduced_chi2()

The reduced chi squared value, using the current values of the free parameters

results

The results object storing the fitting results

run(solver_type, pool=None, chains_comm=None)

Run the whole fitting analysis, from start to finish

Parameters:

solver_type : {'mcmc', 'nlopt'}

either run an MCMC fit with emcee or a nonlinear optimization using L-BFGS

pool : MPIPool, optional

an MPI pool object to distribute tasks to

chains_comm : MPI communicator, optional

a communicator for communicating between multiple MCMC chains

set_fit_results(method='median')

Set the free parameters from the results objects and update the model
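As a rough usage sketch, the helper below collects goodness-of-fit numbers from any object exposing the attributes and methods documented on this page. The function `summarize_fit` and the `StubDriver` class are hypothetical; the stub only mimics the documented interface with made-up numbers so the example runs without pyRSD or a results directory.

```python
def summarize_fit(driver, method='median'):
    """Set the best-fit parameters, then gather the fit statistics."""
    driver.set_fit_results(method=method)
    return {
        "Nb": driver.Nb,
        "Np": driver.Np,
        "dof": driver.dof,
        "chi2": driver.chi2(),
        "reduced_chi2": driver.reduced_chi2(),
    }

class StubDriver:
    """Stand-in for a FittingDriver (assumed values, not real results)."""
    Nb = 50
    Np = 6

    @property
    def dof(self):
        return self.Nb - self.Np

    def set_fit_results(self, method='median'):
        self.method = method

    def chi2(self, theta=None):
        return 55.0

    def reduced_chi2(self):
        return self.chi2() / self.dof

summary = summarize_fit(StubDriver())
print(summary)
```

With pyRSD installed, the same helper could presumably be applied to a driver obtained from `FittingDriver.from_directory(dirname)`, where `dirname` is an existing results directory.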