Francesco Montanari

SAMP is an adapted version of Monte Python, a Markov chain Monte Carlo (MCMC) sampler for Bayesian analysis of cosmological data. Its main purpose is to provide reproducible results for the work presented in arXiv:1709.06022 [astro-ph.CO]. The related necessary likelihoods for polynomial and spline fits to supernovae and galaxy differential ages data are included. After installation, a montepython command is available to run the implemented MCMC chains and analyze them (see montepython -h for usage help).

SAMP also allows to use Python libraries as an alternative to installing the CLASS code (which still remains the recommended interface for difficult computations such as power spectra) to evaluate cosmological quantities. For instance, Astropy can be used to compute cosmological distances. Let's take the compressed likelihood for strong lensing time-delays data given in equation (26) of Suyu et al. (2013). This likelihood is simple enough and already available in the original Monte Python (and from SAMP main directory under montepython/likelihoods/timedelay/), where it is interfaced with CLASS to compute distances, so the two implementations can be compared.

On this page:

Writing a New Likelihood
Running and Analysing the Chains

Writing a New Likelihood

First, let's write the data points. Open a data/timedelay.txt file and add the following text. (File locations are relative to the SAMP main directory.)

# Data on quasar time delays.
#
# Columns are [zd, zs, lambda_d, mu_d, sigma_d]
# with zd = lens redhsift
#      zs = source redhsift
#
# Other three params refer to the compressed lognormal PDF of
# the time-delay distance.
#
# B1608+656 (Suyu et al 2010)
0.6304	 1.394	  4000.0    7.053     0.2282
# RXJ1131-1231 (Suyu et al 2013)
0.295	 0.658	  1425.6    6.4993     0.19377

Let's create a new folder containing the likelihood (let's call it pytimedelay to distinguish it from the existing CLASS-based implementation timedelay).

$ mkdir montepython/likelihoods/pytimedelay/

The new likelihood folder must contains two files, pytimedelay.data and __init__.py. Let's start with pytimedelay.data, which in our case lists the path to the default data directory and the data file name. It also lists the parameters varied in the chain, namely the Hubble constant H0 (its value is in km/s/Mpc) and the matter density parameter today Omega_m. The prefix is the same as the likelihood directory.

# Timedelay data.
pytimedelay.data_directory = data.path['data']
pytimedelay.fname = 'timedelay.txt'
pytimedelay.use_nuisance = ['H0', 'Omega_m']

Next, the file __init__.py contains the computation of the likelihood. While adding such an involved computation not related to package initialization to an __init__.py file is unusual for Pythonic code, this is the standard way to add a likelihood in Monte Python. More specifically, the file must contain a class with the same name as the likelihood folder and that inherits from the Likelihood Monte Python class. An initialization method __init__() and a loglkl() method are required. The arguments passed to those methods are Monte Python objects and paths that allow to retrieve information about the MCMC state and configuration. Here is the complete code.

"""Compressed time-delay likelihood, eq. 26 of Suyu et al. (2013)
[arXiv:1208.6010]. This module requires Astropy.
"""

import os

from astropy.cosmology import FlatLambdaCDM
from astropy import units as u
import numpy as np

from montepython.likelihood_class import Likelihood


class pytimedelay(Likelihood):
    def __init__(self, path, data, command_line):
        # Always required by Monte Python.
        Likelihood.__init__(self, path, data, command_line)

        # Read data points.
        datafile = os.path.join(self.data_directory, self.fname)
        td_data = np.loadtxt(datafile).transpose()
        self.zd, self.zs, self.lambda_d, self.mu_d, self.sigma_d = td_data

    def loglkl(self, cosmo, data):
        # Current point in MCMC parameter space.
        H0 = (data.mcmc_parameters['H0']['current']
              * data.mcmc_parameters['H0']['scale'])
        Omega_m = (data.mcmc_parameters['Omega_m']['current']
                   * data.mcmc_parameters['Omega_m']['scale'])

        # Call Astropy. Distances are in Mpc.
        cosmo = FlatLambdaCDM(H0=H0, Om0=Omega_m, Tcmb0=2.725)
        Dd = cosmo.angular_diameter_distance(self.zd)
        Ds = cosmo.angular_diameter_distance(self.zs)
        Dds = (((1. + self.zs) * Ds - (1 + self.zd) * Dd)
               / ( 1. + self.zs))

        # Time delay distance dt = Dt / 1 Mpc.
        dt = (1 + self.zd) * Dd * Ds / Dds / u.Mpc

        # Compute likelihood.
        if np.all(np.greater(dt, self.lambda_d)):
            lkls = - ((np.log(dt - self.lambda_d) - self.mu_d) ** 2 /
                      2. / self.sigma_d ** 2)\
                  - (np.log(np.sqrt(2. * np.pi) * (dt - self.lambda_d)
                            * self.sigma_d))
            lkl = np.sum(lkls)
        else:
            lkl = data.boundary_loglike

        return lkl

The file starts by importing the necessary modules. Note in particular the Astropy class used to compute cosmological quantities for a flat ΛCDM cosmological model. The __init__() method first initializes Monte Python Likelihood class (this is always required). Then we read the data points from the data file specified above.

The loglkl() method recovers the values of the fit parameters for the current point in the MCMC parameter space (rescaled by a value that the user may have specified in the parameter file, see below). Then it computes the angular diameter distances (to the deflector, to the source, and between the deflector and the source) needed for the time delay distance, rescaled to 1 Mpc. All these relations are given in equations (1) and (2) of Suyu et al. (2013), and equation (26) is used to return the logarithm of the likelihood. If a particular point in parameter space leads to an imaginary logarithm, we reject that point assigning a value at the likelihood boundaries. (Note that the data object used to retrieve the latter information and the value of the varying parameters is an object internal to Monte Python, it does not contain the observed data points.)

Running and Analysing the Chains

Once the likelihood is created, install the updated code:

$ python setup.py build
$ python setup.py install --user

This creates a command montepython available from the shell. Let's switch to a new directory to run and analyze the chains:

$ mkdir ~/pytimedelay_chains && cd ~/pytimedelay_chains

To run the chains, two files specifying the MCMC configuration are needed. Since CLASS is not required for the new likelihood, the first one is simply an empty file named default.conf:

$ touch default.conf

Then the newly create likelihood name, the cosmological parameters varied in the chains, and the number of accepted steps before writing the chain are specified into a file called pytimedelay.param:

#------Experiments to test (separated with commas)-----
data.experiments=['pytimedelay']

#------ Parameter list -------
# data.parameters[class name] = [mean, min, max, 1-sigma, scale, role]
data.parameters['H0'] = [70., 0., None, 3., 1, 'nuisance']
data.parameters['Omega_m'] = [0.3, 0., 1., 0.1, 1, 'nuisance']

#------ Mcmc parameters ----
data.write_step=5

When declaring the parameters we also have specified initial guesses for the mean, limiting values (i.e., flat priors), standard deviation, re-scale factor and a role. When using CLASS, the latter can take the values 'cosmo', 'derived' (in both cases the parameter is read by CLASS) or 'nuisance' (if the parameter is only varied in the likelihood code and not in CLASS). However, when not relying on CLASS all parameters must be declared as nuisance ones (even though strictly speaking they describe the cosmological model) as a technical implementation detail. (This may change in the future. So far the dependence on CLASS is bypassed in a trivial way through a simple dummy class replacing the classy Python wrapper to CLASS.)

Finally, run the chains and analyze the result:

$ montepython run -p pytimedelay.param -o ./output -N 10000
$ montepython info ./output

This creates several files containing the chains, information about mariginalized minimum credible intervals and plots. The 68% interval should be about (71.5 +/- 2.6) km/s/Mpc for the Hubble constant, while the matter density parameter won't be constrained by the data (that's expected).

The combination of Numpy and Astropy allows to write the likelihood expression in a compact vectorial form without introducing, e.g., for loops. Apart from the more readable code, the main advantage is that, thanks to Numpy optimizations, the execution time is still competitive despite relying on Python modules only (and it will scale nicely increasing the number of data points). In this simple example, requiring only the computation of angular diameter distances, the Astropy implementation is actually much faster than calling CLASS for every point in parameter space (a chain of 10000 points required 23s on an Intel Core i7-2620M CPU @ 2.70GHz, compared to 15m of the original CLASS code-based implementation; acceptance rate, which may slightly influence execution time, was about 0.3 for both cases). Of course, if involved computations such as power spectra or non-standard cosmologies are required, CLASS is the most efficient interface.

SAMP's Adapted Monty Python

Writing a New Likelihood

Running and Analysing the Chains