Software
My GitHub profile provides an overview of my recent public open source software engineering activities. My primary working language is Python, with C/C++ handling some of the computational heavy lifting.
Code Library
Most of my current public open source programming work is done as part of the esi-neuroscience organization on GitHub. One of the larger projects I am involved in is SyNCoPy, a framework for large-scale electrophysiology data-analysis in Python. I was one of its original authors and I am still a regular contributor. I am the maintainer of ACME, a stand-alone concurrent processing wrapper which also serves as SyNCoPy’s parallelization engine.
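To give a rough idea of what ACME does, the sketch below shows how an embarrassingly parallel parameter sweep might be expressed with it. This is only an illustration modeled on ACME's ParallelMap interface; argument handling and defaults may differ between versions, so the ACME documentation remains the authoritative reference.

    from acme import ParallelMap

    def scale_and_shift(x, y, z=3):
        # Stand-in "analysis" routine; in practice this would be an expensive computation
        return (x + y) * z

    if __name__ == "__main__":
        # Evaluate scale_and_shift for four values of x with y fixed to 4; ACME
        # distributes the calls across parallel workers and gathers the results
        with ParallelMap(scale_and_shift, [2, 4, 6, 8], 4) as pmap:
            pmap.compute()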
During my postdoc I developed research code for applications in computational neuroscience, stochastic dynamical systems, and general graph theory, which is collected in the Analytic Tools repository. The full documentation can be found here.
A collection of code I wrote during my doctoral studies is available in my Math Imaging repo. These routines perform various image processing tasks, such as denoising and segmentation.
For reference, older (and by now probably outdated) instructions for custom-building NumPy and SciPy on Python 2.7 are provided in the section below.
Computational Working Environment
I have in-depth knowledge of and several years of practical experience with the following languages and tools:
Python including the packages NumPy, SciPy, Matplotlib, Pandas, Cython, and Plotly
MATLAB including the Optimization Toolbox, the Statistics and Machine Learning Toolbox, and the Image Processing Toolbox
C including the libraries BLAS, LAPACK and SuiteSparse
HPC system administration of a variety of Linux distributions (specifically Red Hat Enterprise Linux) and integration into existing Active Directory setups (LDAP, Kerberos, DNS, NFS, CIFS)
setup and administration of shared parallel cluster file-systems (specifically IBM Spectrum Scale GPFS) for HPC workloads
custom tooling and administration of a tape archive system (IBM Spectrum Archive LTFS)
cluster workload management and administration in SLURM
continuous integration pipelines (such as GitLab Runners) on multiple platforms (Windows, macOS, Linux) and hardware architectures (x86_64, IBM POWER ppc64le), as well as testing suites (e.g., pytest) for automated quality control of large code repositories
kernel-level virtualization technologies (such as Docker) for sandboxing applications within containers to ensure software portability
system administration and maintenance of macOS clients (specifically setup, customization and administration of munki, a managed software center for macOS)
version control systems (mainly git, previously CVS and SVN) for collaborative software development (visit my GitHub profile)
data storage models such as HDF5 for processing and archiving large, complex datasets (see the short example below)
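To illustrate the last point, here is a minimal h5py sketch of such an HDF5 workflow (file name, dataset layout and attributes are of course just placeholders):

    import h5py
    import numpy as np

    # Write a large array in chunked, compressed form together with some metadata
    with h5py.File("recording.h5", "w") as h5f:
        dset = h5f.create_dataset("lfp", data=np.random.randn(1000, 64),
                                  chunks=True, compression="gzip")
        dset.attrs["samplerate"] = 1000.0

    # Later (or on another machine) read back only the slice that is actually needed
    with h5py.File("recording.h5", "r") as h5f:
        first_channel = h5f["lfp"][:, 0]
        print(h5f["lfp"].attrs["samplerate"], first_channel.shape)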
In addition, I routinely use LaTeX as well as all common Microsoft Office components.
Building NumPy and SciPy
The following instructions explain how to set up and build NumPy and SciPy on a Unix-like system. This is of course not the only tutorial explaining this procedure; however, I found the official notes a little sparse in some (in my opinion) crucial places. This is especially the case for UMFPACK, SciPy's default solver for sparse linear systems. UMFPACK is part of Tim Davis' SuiteSparse and widely used in many software packages (e.g. MATLAB's "backslash" operator relies heavily on it). However, SciPy does not require UMFPACK to be installed on your system (it falls back to SuperLU instead, which is built auto-magically together with SciPy). If you do not care/want/need to set up UMFPACK for use in SciPy, jump directly to Step 2. Otherwise, let's start with Step 1 (note that it is assumed that you already have BLAS and LAPACK installed on your system).
Step 1: Building shared UMFPACK and AMD libraries from SuiteSparse

In contrast to many other packages, SciPy needs to be linked against a shared library libumfpack.so. However, SuiteSparse only builds static libraries by default, and tweaking it to produce the shared libraries libumfpack.so and libamd.so (both needed by SciPy) turned out to be painstakingly cumbersome. Thus it may be easiest to simply replace the original Makefiles with my tuned versions and then run the build process as explained in the following.

1. Download SuiteSparse from here. Extract the archive and move the generated directory to a convenient location; let's call this folder $SuiteSparseDIR in the following.

2. UMFPACK can use METIS for fast matrix reordering. If you do not care/want/need to set up METIS, just skip this step. Otherwise download METIS-4.0.1 from here, unpack the archive and move the metis directory to $SuiteSparseDIR. Rename metis-Makefile.in to Makefile.in and place it in $SuiteSparseDIR/metis/. Fire up a terminal, cd to $SuiteSparseDIR/metis/ and type make. If all goes well, proceed to the next step.

3. Download SuiteSparse_config.mk and edit the lines where the variables INSTALL_LIB and INSTALL_INCLUDE are defined to your liking. Then put the file in $SuiteSparseDIR/SuiteSparse_config/.

4. Rename GNUmakefile_umfpackLib to GNUmakefile and move it to $SuiteSparseDIR/UMFPACK/Lib/. Similarly, rename Makefile_umf to Makefile and put it in $SuiteSparseDIR/UMFPACK/.

5. Now the same for AMD: rename GNUmakefile_amdLib to GNUmakefile and move it to $SuiteSparseDIR/AMD/Lib/. Similarly, rename Makefile_amd to Makefile and put it in $SuiteSparseDIR/AMD/.

6. Now cd to $SuiteSparseDIR and type make library followed by make install (add sudo if INSTALL_LIB and INSTALL_INCLUDE are not in your home folder). Note that these steps assume that you did not change the directory structure of SuiteSparse - if you moved any of AMD, CAMD, CHOLMOD etc., additional modifications are necessary (more details here [1]).
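Once make install has finished, it can be worth verifying that the freshly built shared libraries are actually visible to the dynamic loader before moving on. A minimal Python sketch for such a check is shown below (purely illustrative; the exact library names and search paths depend on your system, and you may need to add INSTALL_LIB to LD_LIBRARY_PATH first):

    import ctypes
    import ctypes.util

    # Ask the loader where it would pick up UMFPACK and AMD (returns None if not found)
    print(ctypes.util.find_library("umfpack"))
    print(ctypes.util.find_library("amd"))

    # Try to actually load the shared objects; an OSError here usually means that
    # the install location is not on the loader's search path
    ctypes.CDLL("libumfpack.so")
    ctypes.CDLL("libamd.so")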
Step 2: Building and installing NumPy and SciPy

1. Download NumPy and SciPy and extract both archives. Edit the file numpy-x.y.z/site.cfg as follows. Make sure that the install location of your SuiteSparse, i.e. INSTALL_LIB and INSTALL_INCLUDE, appears in the [DEFAULT] section (if you do not care/want/need to use UMFPACK, ignore this). Thus, if SuiteSparse resides for instance in /usr/local/lib and /usr/local/include, it should read something like

    [DEFAULT]
    library_dirs = /usr/lib:/usr/lib64:/usr/local/lib
    include_dirs = /usr/include:/usr/local/include

Further, make sure to specify your BLAS/LAPACK/ATLAS installations:

    [blas_opt]
    libraries = ptf77blas, ptcblas, atlas

    [lapack_opt]
    libraries = lapack, ptf77blas, ptcblas, atlas

Finally, make sure that your site.cfg has a section like

    [amd]
    amd_libs = amd

    [umfpack]
    umfpack_libs = umfpack
2. Since NumPy makes heavy use of BLAS, it is crucial that the same FORTRAN compiler that built BLAS is used to build NumPy (see this note in the official documentation). If your shared BLAS is /usr/lib64/libblas.so, type ldd /usr/lib64/libblas.so in a terminal. If libgfortran.so appears somewhere in the output, your BLAS was most probably built with gfortran. Thus, to build NumPy, cd to numpy-x.y.z and type

    python setup.py build --fcompiler=gnu95

Conversely, if you find libg2c.so in ldd's output, then g77 has been used to build your BLAS. Hence type

    python setup.py build --fcompiler=gnu

3. Finally, the command

    python setup.py install --prefix=/path/to/somewhere

installs NumPy in /path/to/somewhere (sudo will be necessary again if /path/to/somewhere is not in your home directory).

4. If you have installed NumPy in a non-standard location, you have to tell Python where to find it before you can use it. Assume you have installed NumPy in /path/to/somewhere using Python 2.7 on a 64-bit system. Then type in a bash shell

    export PYTHONPATH='/path/to/somewhere/lib64/python-2.7/site-packages':$PYTHONPATH

to update your Python path (csh users might want to use setenv instead). Then fire up Python and type

    import numpy
    numpy.__file__

This should result in the output

    '/path/to/somewhere/lib64/python-2.7/site-packages/numpy/__init__.pyc'

i.e., the just installed NumPy version has been imported. Now you may want to check the output of numpy.show_config() to make sure that all libraries have been detected correctly. Finally, numpy.test() (which needs nose) tests the installation.
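As a side note, instead of exporting PYTHONPATH you can also make the custom install location visible from within Python itself, which is occasionally handy in scripts. A minimal sketch (the path is just the placeholder used above):

    import sys

    # Prepend the custom NumPy/SciPy install location for this interpreter only
    sys.path.insert(0, "/path/to/somewhere/lib64/python-2.7/site-packages")

    import numpy
    print(numpy.__file__)  # should now point into /path/to/somewhere/...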
5. After the successful installation of NumPy, building SciPy is quite straightforward. Copy site.cfg from numpy-x.y.z to scipy-x.y.z. Make sure that your $PYTHONPATH points to the just installed NumPy (explained above). Depending on your findings from before, build SciPy using

    python setup.py build --fcompiler=XXX

and install it with

    python setup.py install --prefix=/path/to/somewhere

Test the installation with

    import scipy
    scipy.__file__

which should similarly give

    '/path/to/somewhere/lib64/python-2.7/site-packages/scipy/__init__.pyc'

Further, check scipy.show_config() to make sure that SciPy uses the right libraries. Due to some ImportError silencing in SciPy (compare e.g. this post) it is not too easy to see whether UMFPACK is actually used by SciPy after all. Even if scipy.show_config() proudly announces that it found libumfpack.so, using linalg.spsolve may produce surprising results. Thus you might want to run testumfpack.py to check UMFPACK's functionality (see also the sketch at the end of this section). If you encounter trouble, try to cd to /path/to/somewhere/lib64/python-2.7/site-packages/scipy/sparse/linalg/dsolve/umfpack/ and import _umfpack; this should work without raising an ImportError. Finally, you can test SciPy's functionality by typing scipy.test() (which also needs nose).

6. If you are tired of updating your $PYTHONPATH each time you start a new shell in order to use your homebrewed NumPy/SciPy, you may want to include the line

    export PYTHONPATH='/path/to/somewhere/lib64/python-2.7/site-packages':$PYTHONPATH

in your .bashrc file (for bash users).
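For completeness, a hand-rolled sanity check in the spirit of testumfpack.py might look like the sketch below. It simply solves a small sparse system and inspects the residual; whether UMFPACK or the SuperLU fallback is used under the hood depends on your installation, and the exact way to force UMFPACK has changed across SciPy versions, so treat this only as a rough smoke test:

    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import spsolve

    # Assemble a small sparse tridiagonal test system A x = b
    n = 1000
    A = (2.0 * sparse.eye(n) - sparse.eye(n, k=1) - sparse.eye(n, k=-1)).tocsc()
    b = np.ones(n)

    # Solve the system; depending on the installation this call is dispatched to
    # UMFPACK or to the bundled SuperLU fallback
    x = spsolve(A, b)

    # A tiny residual indicates that the sparse solver chain works as intended
    print(np.linalg.norm(A.dot(x) - b))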
If you have any comments/questions/criticism please do not hesitate to contact me.