Understanding the Python-C Interface

The Python interface to the CCL C code is built using SWIG, which automatically scans the CCL C headers and builds a matching interface in Python. The autogenerated SWIG interface can be accessed through the pyccl.ccllib module. The actual CCL API in Python is a more user-friendly wrapper that has been written on top of the SWIG-generated code.

Originally, much of CCL was written in C with a Python wrapper via SWIG. The code has moved on significantly from this model, however, so most calculations implemented at the Python level do not have a C counterpart. We use C only to implement calculations that cannot be easily implemented in an efficient manner in Python. As the code evolves further, we expect it will rely less and less on the C layer.

Key Parts of SWIG Wrapper

The key parts/conventions of the SWIG wrapper are as follows:

  • Interface .i files: These are kept in the pyccl/ directory, and tell SWIG which functions to extract from the C headers. There are also commands in these files to generate basic function argument documentation, and to remove the ccl_ prefix from function names. The interface files also contain code that tells SWIG how to convert C array arguments to numpy arrays. For certain functions, this code may also contain a simple loop to vectorize the function.

  • Main interface file pyccl/ccl.i: This file imports all of the other interface files. Most of the CCL source files (e.g. ccl_core.c) have their own interface file too. For other files, mostly containing support/utility functions, SWIG only needs the C header (.h) file to be specified in the main ccl.i file.

  • Internal CCL C-level global state: All global state in the CCL library (e.g., whether or not the power spectra splines are initialized and calling the function to initialize them) is controlled by the Python code. This convention ensures that when the C code is running, all data needed already exists. It also allows us to more easily use OpenMP in the C layer, avoiding the problem of having to manage the global state from more than one thread. In practice, this means that before you call the CCL C library from Python, you need to check to make sure that any internal C data you need has been initialized. The pyccl.Cosmology object has methods to help with this (e.g., compute_distances()).

Error Handling

Errors in CCL happen both in the C and Python code. For the Python code, one should simply raise the appropriate error, often CCLError. Error handling in the C code is a bit more subtle because the C code cannot (currently) raise Python exceptions directly.

Instead, CCL takes the convention that each C function passes back or returns a status value. Non-zero values indicate an error. Further, certain values have special meanings that are defined in the include/ccl_error.h header file. The Python file pyccl/_types.py contains translations of the C header macros into Python strings. These translations are used to translate the numerical error codes returned by the C code into something more human readable.

Further, in the CCL C layer, functions can write information about errors into a string held by the C version of the pyccl.Cosmology structure. The Python layer can then read the value of this string and raise it with a Python error when a non-zero status is returned.

Using these two mechanisms, Python functions in the CCL Python API can check for errors in the C layer and report back useful information to the user.

Example Python and C Code

Putting together all of the information above, here is example code based on the old CCL pyccl.bcm module. Note that these code snippets are outdated, since the BCM module has now been fully migrated into python. Still, this provides a useful example of how the C and Python interact in CCL.

C Header File include/ccl_bcm.h

/** @file */
#ifndef __CCL_BCM_H_INCLUDED__
#define __CCL_BCM_H_INCLUDED__

CCL_BEGIN_DECLS

/**
 * Correction for the impact of baryonic physics on the matter power spectrum.
 * Returns f(k,a) [dimensionless] for given cosmology, using the method specified for the baryonic transfer function.
 * f(k,a) is the fractional change in the nonlinear matter power spectrum from the Baryon Correction Model (BCM) of Schenider & Teyssier (2015). The parameters of the model are passed as part of the cosmology class.
 * @param cosmo Cosmology parameters and configurations, including baryonic parameters.
 * @param k Fourier mode, in [1/Mpc] units
 * @param a scale factor, normalized to 1 for today
 * @param status Status flag. 0 if there are no errors, nonzero otherwise.
 * For specific cases see documentation for ccl_error.c
 * @return f(k,a).
 */
double ccl_bcm_model_fka(ccl_cosmology * cosmo, double k, double a, int *status);

CCL_END_DECLS

#endif

Notice that the first argument is a C-level cosmology structure, the order of the arguments for wavenumber and scale factor, and that the last argument is a pointer to the status variable.

This C header file is then included in the include/ccl.h header file via #include "ccl_bcm.h".

C Source File src/ccl_bcm.c

#include <math.h>
#include "ccl.h"

/* BCM correction */
// See Schneider & Teyssier (2015) for details of the model.
double ccl_bcm_model_fka(ccl_cosmology * cosmo, double k, double a, int *status) {
  double fkz;
  double b0;
  double bfunc, bfunc4;
  double kg;
  double gf,scomp;
  double kh;
  double z;

  if (a < 0) {
    *status = CCL_ERROR_PARAMETERS;
    return NaN;
  }

  // compute the BCM model here
  fkz = ...
  return fkz;
}

Here we see that if we encounter an error, the status variable is set and some fiducial value is returned. (Note that the check above does not actually exist in the main CCL source file.)

SWIG Interface File pyccl/ccl_bcm.i

%module ccl_bcm

%{
/* put additional #include here */
%}

// Enable vectorised arguments for arrays
%apply (double* IN_ARRAY1, int DIM1) {(double* k, int nk)};
%apply (int DIM1, double* ARGOUT_ARRAY1) {(int nout, double* output)};

%include "../include/ccl_bcm.h"

/* The python code here will be executed before all of the functions that
   follow this directive. */
%feature("pythonprepend") %{
    if numpy.shape(k) != (nout,):
        raise CCLError("Input shape for `k` must match `(nout,)`!")
%}

%inline %{
void bcm_model_fka_vec(ccl_cosmology * cosmo, double a, double* k, int nk,
                             int nout, double* output, int* status) {
    for(int i=0; i < nk; i++){
      output[i] = ccl_bcm_model_fka(cosmo, k[i], a, status);
    }
}

%}

/* The directive gets carried between files, so we reset it at the end. */
%feature("pythonprepend") %{ %}

This SWIG interface file contains several examples of important features in writing SWIG interfaces.

  • We have declared some inline C code to enable fast, vectorized computations of the BCM model.

  • We have used the numpy.i type signatures, described in their documentation to enable vectorized inputs and outputs. The pyccl/ccl.i interface file defines a global numpy type signature for the status variable.

  • We have used the SWIG pythonprepend feature to add a check on the sizes of the input and output arrays to ensure that we do not access memory out of bounds.

  • We made sure the module name at the top matches the C source file.

  • We made sure to include the C header file from the right relative path.

Finally, this SWIG interface file is included in the pyccl/ccl.i interface file.

Python Module pyccl/bcm.py

from . import ccllib as lib
from .pyutils import _vectorize_fn2


def bcm_model_fka(cosmo, k, a):
    """The BCM model correction factor for baryons.

    .. note:: BCM stands for the "baryonic correction model" of Schneider &
              Teyssier (2015; https://arxiv.org/abs/1510.06034). See the
              `DESC Note <https://github.com/LSSTDESC/CCL/blob/master/doc\
/0000-ccl_note/main.pdf>`_
              for details.

    .. note:: The correction factor is applied multiplicatively so that
              `P_corrected(k, a) = P(k, a) * factor(k, a)`.

    Args:
        cosmo (:obj:`Cosmology`): Cosmological parameters.
        k (float or array_like): Wavenumber; Mpc^-1.
        a (float): Scale factor.

    Returns:
        float or array_like: Correction factor to apply to the power spectrum.
    """
    return _vectorize_fn2(lib.bcm_model_fka,
                          lib.bcm_model_fka_vec, cosmo, k, a)

This file defines the actual API for the BCM model in CCL. It is the function signature and location of this function, along with what values it is supposed to return, that defines the API. Changes to any of the underlying C code or even the helper Python functions do not constitute an API breakage.

There are a few other features to note.

  • We have written a complete, Sphinx-compatible docstring on the module.

  • We are using the CCL helper function _vectorize_fn2 to call the SWIG-generated interfaces pyccl.ccllib.bcm_model_fka and pyccl.ccllib.bcm_model_fka_vec.

  • If any data needed to be initialized before calling the SWIG interfaces, we would need to initialize it by calling a function before vectorize_fn2. See the pyccl.background module for an example.

Finally, let’s take a look at the implementation of the function vectorize_fn2

def _vectorize_fn2(fn, fn_vec, cosmo, x, z, returns_status=True):

    """Generic wrapper to allow vectorized (1D array) access to CCL functions with
    one vector argument and one scalar argument, with a cosmology dependence.

    Args:
        fn (callable): Function with a single argument.
        fn_vec (callable): Function that has a vectorized implementation in
                           a .i file.
        cosmo (ccl_cosmology or Cosmology): The input cosmology which gets
                                            converted to a ccl_cosmology.
        x (float or array_like): Argument to fn.
        z (float): Scalar argument to fn.
        returns_stats (bool): Indicates whether fn returns a status.

    """
    # Access ccl_cosmology object
    cosmo_in = cosmo
    cosmo = cosmo.cosmo
    status = 0

    # If a scalar was passed, convert to an array
    if isinstance(x, int):
        x = float(x)
    if isinstance(x, float):
        # Use single-value function
        if returns_status:
            f, status = fn(cosmo, x, z, status)
        else:
            f = fn(cosmo, x, z)
    elif isinstance(x, np.ndarray):
        # Use vectorised function
        if returns_status:
            f, status = fn_vec(cosmo, z, x, x.size, status)
        else:
            f = fn_vec(cosmo, z, x, x.size)
    else:
        # Use vectorised function
        if returns_status:
            f, status = fn_vec(cosmo, z, x, len(x), status)
        else:
            f = fn_vec(cosmo, z, x, len(x))

    # Check result and return
    check(status, cosmo_in)
    return f

This function does a few nice things for us

  • It handles multiple input argument types, casting them to the right type to be passed to the SWIG-generated interface functions.

  • At the end, it calls another function check. It is this function that checks the status variable and if it is non-zero, raises an error using the error string passed back from the C library.

You will find quite a few versions of this function in pyccl.pyutils for use in calling the SWIG-generated API for different numbers of arguments.