API

The pyRSD.rsdfit.FittingDriver class is responsible for running parameter fits. It combines data and theory to run a Bayesian likelihood analysis.

The properties that describe the data and theory configuration are:

`Nb`	The number of data points
`Np`	The number of free parameters
`dof`	The number of degrees of freedom
`model`	The model object which returns the P(k,mu) or multipoles theory
`results`	The results object storing the fitting results

The functions that evaluate the likelihood, its derivatives, and associated statistics are:

`lnprob`([theta])	Set the theory free parameters, update the model, and return the log of the posterior probability function
`lnlike`([theta])	The log of the likelihood, equal to -0.5 * `chi2()`
`minus_lnlike`([theta, use_priors])	Return the negative log-likelihood, optionally including priors
`grad_minus_lnlike`([theta, epsilon, pool, …])	Return the vector of the gradient of the negative log likelihood, with respect to the free parameters
`chi2`([theta])	The chi-squared for the specified model function
`reduced_chi2`()	The reduced chi squared value, using the current values of the free parameters
`fisher`([theta, epsilon, pool, use_priors, …])	Return the Fisher information matrix
`marginalized_errors`([params, theta, fixed])	Return the marginalized errors on the specified parameters, as computed from the Fisher matrix
`run`(solver_type[, pool, chains_comm])	Run the whole fitting analysis, from start to finish

The user can initialize a FittingDriver object from a results directory using

`from_directory`(dirname[, results_file, …])	Load a FittingDriver from a results directory

The best-fit parameters can be set and visualized using

`set_fit_results`([method])	Set the free parameters from the results objects and update the model
`plot`([usetex, ax, colors, use_labels])	Plot the best-fit theory and data points
`plot_residuals`()	Plot the residuals of the best-fit theory and data

class pyRSD.rsdfit.FittingDriver(param_file, init_model=True, **kwargs)

A driver to run the parameter fitting pipeline, merging together a model, theory, and fitting algorithm

Parameters:

param_file : str

a string specifying the name of the main parameter file

init_model : bool, optional

if True, initialize the theoretical model upon initialization; default is True

Nb: The number of data points

Np: The number of free parameters

chi2(theta=None)

The chi-squared for the specified model function

This returns

\[\chi^2 = (\mathcal{M} - \mathcal{D})^T C^{-1} (\mathcal{M} - \mathcal{D})\]

Parameters:

theta : array_like, optional

an array of the free parameters at which to evaluate the statistic; if None, the current values of the free parameters in theory.fit_params are used
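
The quadratic form above can be illustrated with a minimal NumPy sketch. This is not the pyRSD implementation; the function name `chi2_statistic` and the toy numbers are hypothetical, chosen only to show how a model vector, a data vector, and a covariance matrix combine.

```python
import numpy as np

def chi2_statistic(model, data, cov):
    """Return (M - D)^T C^{-1} (M - D) for 1D model and data vectors."""
    resid = np.asarray(model, dtype=float) - np.asarray(data, dtype=float)
    # Solve C x = r rather than forming the explicit inverse of C
    return float(resid @ np.linalg.solve(cov, resid))

# Toy inputs (made-up numbers): three data points, diagonal covariance
data = np.array([1.0, 2.0, 3.0])
model = np.array([1.5, 2.0, 2.0])
cov = np.diag([0.25, 1.0, 4.0])

chi2_value = chi2_statistic(model, data, cov)
lnlike_value = -0.5 * chi2_value  # log-likelihood, as defined for lnlike()
```

Solving the linear system avoids an explicit matrix inverse, which is usually the more stable choice when the covariance matrix is poorly conditioned.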

dof

The number of degrees of freedom

This is equal to the number of data points minus the number of free parameters

fisher(theta=None, epsilon=0.0001, pool=None, use_priors=True, numerical=False, numerical_from_lnlike=False)

Return the Fisher information matrix

This is defined as the negative Hessian of the log likelihood with respect to the parameter vector

\[F_{ij} = - \frac{\partial^2 \log \mathcal{L}}{\partial \theta_i \partial \theta_j}\]

This uses a central-difference finite-difference approximation to compute the numerical derivatives

Parameters:

theta : array_like, optional

if provided, the values of the free parameters at which to evaluate the Fisher matrix; if not provided, the current values of the free parameters from theory.fit_params will be used

epsilon : float or array_like, optional

the step size to use in the finite-difference derivative calculation; default is 1e-4. An array can be passed to use a different step size for each parameter

pool : MPIPool, optional

an MPI pool object to distribute the calculations of derivatives to multiple processes in parallel

use_priors : bool, optional

whether to include the log priors in the objective function when minimizing the negative log probability

numerical : bool, optional

if True, evaluate gradients of P(k,mu) numerically using finite difference

numerical_from_lnlike : bool, optional

if True, evaluate the gradient by taking the numerical derivative of minus_lnlike()
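
As an illustration of the definition above, the following sketch estimates the negative Hessian of a log-likelihood with central differences. It is a generic NumPy example under stated assumptions, not the code path pyRSD uses: the name `fisher_central_diff` and the toy quadratic log-likelihood are hypothetical.

```python
import numpy as np

def fisher_central_diff(lnlike, theta, epsilon=1e-4):
    """Estimate F_ij = -d^2 lnL / (d theta_i d theta_j) by central differences."""
    theta = np.asarray(theta, dtype=float)
    n = theta.size
    # A scalar epsilon is broadcast to one step size per parameter
    eps = np.broadcast_to(np.asarray(epsilon, dtype=float), (n,))
    F = np.empty((n, n))
    for i in range(n):
        ei = np.zeros(n)
        ei[i] = eps[i]
        for j in range(i, n):
            ej = np.zeros(n)
            ej[j] = eps[j]
            d2 = (lnlike(theta + ei + ej) - lnlike(theta + ei - ej)
                  - lnlike(theta - ei + ej) + lnlike(theta - ei - ej))
            d2 /= 4.0 * eps[i] * eps[j]
            F[i, j] = F[j, i] = -d2  # the Hessian is symmetric
    return F

# Toy Gaussian log-likelihood whose exact Fisher matrix is A
A = np.array([[4.0, 1.0], [1.0, 2.0]])
toy_lnlike = lambda t: -0.5 * t @ A @ t
F_est = fisher_central_diff(toy_lnlike, [0.3, -0.2])
```

For a quadratic log-likelihood the central-difference estimate is exact up to round-off, so `F_est` recovers `A` closely. For a general likelihood the result depends on the choice of `epsilon`.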

classmethod from_directory(dirname, results_file=None, model_file=None, init_model=True, **kwargs)

Load a FittingDriver from a results directory

This reads the params.dat file, optionally loading a pickled model and a results object from file

Parameters:

dirname : str

the name of the directory holding the results

results_file : str, optional

the name of the file holding the results. Default is None

model_file : str, optional

the name of the file holding the model to load. Default is None

init_model : bool, optional

whether to initialize the RSD model upon loading. If a model file exists in the specified directory, the model is loaded and no new model is initialized

grad_minus_lnlike(theta=None, epsilon=0.0001, pool=None, use_priors=True, numerical=False, numerical_from_lnlike=False)

Return the vector of the gradient of the negative log likelihood, with respect to the free parameters

This uses a central-difference finite-difference approximation to compute the numerical derivatives

Parameters:

theta : array_like, optional

if provided, the values of the free parameters to compute the gradient at; if not provided, the current values of free parameters from theory.fit_params will be used

epsilon : float or array_like, optional

the step size to use in the finite-difference derivative calculation; default is 1e-4. An array can be passed to use a different step size for each parameter

pool : MPIPool, optional

an MPI pool object to distribute the calculations of derivatives to multiple processes in parallel

use_priors : bool, optional

whether to include the log priors in the objective function when minimizing the negative log probability

numerical : bool, optional

if True, evaluate gradients of P(k,mu) numerically using finite difference

numerical_from_lnlike : bool, optional

if True, evaluate the gradient by taking the numerical derivative of minus_lnlike()
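
The central-difference scheme mentioned above can be sketched as follows. This generic example differentiates an arbitrary scalar objective and is only meant to show the role of `epsilon`; the function `grad_central_diff` and the toy objective are hypothetical and not part of pyRSD.

```python
import numpy as np

def grad_central_diff(func, theta, epsilon=1e-4):
    """Central-difference gradient of a scalar function of a 1D parameter vector."""
    theta = np.asarray(theta, dtype=float)
    # A scalar epsilon is broadcast to one step size per parameter
    eps = np.broadcast_to(np.asarray(epsilon, dtype=float), theta.shape)
    grad = np.empty_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps[i]
        grad[i] = (func(theta + step) - func(theta - step)) / (2.0 * eps[i])
    return grad

# Toy negative log-likelihood: independent Gaussians with means mu and widths sigma
mu = np.array([1.0, 2.0])
sigma = np.array([0.5, 2.0])
toy_minus_lnlike = lambda t: 0.5 * np.sum(((t - mu) / sigma) ** 2)

theta0 = np.array([1.5, 0.0])
grad_est = grad_central_diff(toy_minus_lnlike, theta0, epsilon=[1e-4, 1e-3])
grad_exact = (theta0 - mu) / sigma**2
```

Each parameter needs two function evaluations, and those evaluations are independent of one another, which is why they lend themselves to being distributed over a pool of workers.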

lnlike(theta=None)

The log of the likelihood, equal to -0.5 * chi2()

Parameters:

theta : array_like, optional

an array of the free parameters at which to evaluate the statistic; if None, the current values of the free parameters in theory.fit_params are used

lnprob(theta=None)

Set the theory free parameters, update the model, and return the log of the posterior probability function

This returns -0.5 * chi2() + lnprior()

Parameters:

theta : array_like, optional

an array of the free parameters at which to evaluate the statistic; if None, the current values of the free parameters in theory.fit_params are used
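
The relation between the log-posterior, the chi-squared, and the log-prior can be sketched in a few lines. The example assumes flat priors with hard bounds that return negative infinity outside the allowed range; this is a common convention for samplers, not a statement about how pyRSD defines its priors. All names here are hypothetical.

```python
import numpy as np

def lnprior_flat(theta, lower, upper):
    """Flat prior: 0 inside the bounds, -inf outside (assumed convention)."""
    theta = np.asarray(theta, dtype=float)
    inside = np.all((theta >= lower) & (theta <= upper))
    return 0.0 if inside else -np.inf

def lnprob_sketch(theta, chi2_func, lower, upper):
    """Log-posterior as -0.5 * chi2 + lnprior."""
    lp = lnprior_flat(theta, lower, upper)
    if not np.isfinite(lp):
        # Skip the chi2 evaluation when the prior excludes this point
        return -np.inf
    return -0.5 * chi2_func(theta) + lp

# Toy chi2: sum of squares of the parameters
toy_chi2 = lambda t: float(np.sum(np.asarray(t, dtype=float) ** 2))
lower, upper = np.array([-5.0, -5.0]), np.array([5.0, 5.0])

lp_inside = lnprob_sketch([1.0, 1.0], toy_chi2, lower, upper)
lp_outside = lnprob_sketch([6.0, 0.0], toy_chi2, lower, upper)
```

Returning early when the prior is not finite avoids evaluating a potentially expensive model at parameter values that would be rejected anyway.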

marginalized_errors(params=None, theta=None, fixed=[], **kws)

Return the marginalized errors on the specified parameters, as computed from the Fisher matrix

This is given by: \(\sigma_i = \sqrt{(F^{-1})_{ii}}\)

Optionally, we can fix certain parameters, specified in fixed

Parameters:

params : list, optional

return errors for these parameters; default is all free parameters

theta : array_like, optional

the array of free parameters to evaluate the best-fit model at

fixed : list, optional

list of free parameters to fix when evaluating marginalized errors

**kws :

additional keywords passed to fisher()

Returns:

errors : array_like

the marginalized errors as computed from the inverse of the Fisher matrix
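
A minimal sketch of the calculation, assuming that fixing a parameter amounts to dropping its row and column from the Fisher matrix before inversion (the standard treatment, but an assumption about pyRSD internals). The function `marginalized_errors_sketch` and the parameter names `a` and `b` are hypothetical.

```python
import numpy as np

def marginalized_errors_sketch(fisher, names, params=None, fixed=()):
    """Return sqrt(diag(F^{-1})) for the requested parameters."""
    fisher = np.asarray(fisher, dtype=float)
    # Fixed parameters are removed before inverting, so they are not marginalized over
    keep = [i for i, name in enumerate(names) if name not in fixed]
    kept_names = [names[i] for i in keep]
    cov = np.linalg.inv(fisher[np.ix_(keep, keep)])
    sigma = dict(zip(kept_names, np.sqrt(np.diag(cov))))
    if params is None:
        params = kept_names
    return np.array([sigma[p] for p in params])

# Toy 2x2 Fisher matrix for parameters "a" and "b"
F = np.array([[4.0, 1.0], [1.0, 2.0]])
errs_all = marginalized_errors_sketch(F, ["a", "b"])
errs_a_fixed_b = marginalized_errors_sketch(F, ["a", "b"], fixed=["b"])
```

In this toy case the error on `a` shrinks from sqrt(2/7) to 0.5 once `b` is fixed, because the correlation between the two parameters no longer inflates it.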

minus_lnlike(theta=None, use_priors=False)

Return the negative log-likelihood, optionally including priors

Parameters:

theta : array_like, optional

an array of the free parameters at which to evaluate the statistic; if None, the current values of the free parameters in theory.fit_params are used

use_priors : bool, optional

whether to include the log priors in the objective function when minimizing the negative log probability

model: The model object which returns the P(k,mu) or multipoles theory

plot(usetex=False, ax=None, colors=None, use_labels=True, **kws): Plot the best-fit theory and data points

plot_residuals(): Plot the residuals of the best-fit theory and data

reduced_chi2(): The reduced chi squared value, using the current values of the free parameters
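
Assuming the usual definition, the reduced chi-squared is the chi-squared divided by the number of degrees of freedom (data points minus free parameters). The short sketch below uses hypothetical names and made-up numbers to make the arithmetic explicit.

```python
def reduced_chi2_sketch(chi2_value, n_data, n_free):
    """Return chi2 / dof, with dof = n_data - n_free."""
    dof = n_data - n_free
    if dof <= 0:
        raise ValueError("need more data points than free parameters")
    return chi2_value / dof

# Made-up example: 50 data points, 5 free parameters, chi2 = 45
red_chi2 = reduced_chi2_sketch(45.0, n_data=50, n_free=5)
```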

results: The results object storing the fitting results

run(solver_type, pool=None, chains_comm=None)

Run the whole fitting analysis, from start to finish

Parameters:

solver_type : {'mcmc', 'nlopt'}

either run an MCMC fit with emcee or a nonlinear optimization using L-BFGS

pool : MPIPool, optional

an MPI pool object to distribute tasks to

chains_comm : MPI communicator, optional

a communicator for communicating between multiple MCMC chains

set_fit_results(method='median'): Set the free parameters from the results objects and update the model