Overview¶
Background¶
Python is a popular all-purpose scripting language, while R (an open source implementation of the S/Splus language) is a scripting language mostly popular for data analysis, statistics, and graphics. If you are reading this, there is a good chance that you are familiar with at least one of them.
Having an interface between both languages to benefit from the libraries of one language while working in the other appeared desirable; an early option to achieve it was the RSPython project, itself part of the Omegahat project.
A bit later, the RPy project appeared and focused on providing simple and robust access to R from within Python, with the initial Unix-only releases quickly followed by versions compatible with Microsoft Windows and MacOS. This project is referred to as RPy-1.x in the rest of this document.
The present documentation describes RPy2, an evolution of RPy-1.x. Naturally RPy2 is inspired by RPy, but also by Alexander Belopolsky’s contributions that were waiting to be included into RPy.
This effort can be seen as a redesign and rewrite of the RPy package, and this unfortunately means there is not enough left in common to ensure compatibility.
Installation¶
Docker image¶
There is a Docker image available to try rpy2 out without even reading about requirements (e.g., R compiled with the shared library flag). The Docker image can also be an easy start for Windows users.
Its name is rpy2/rpy2, with currently two possible release tags (making the full image names either rpy2/rpy2:3.1.x or rpy2/rpy2:latest).
The image was primarily designed to run a jupyter notebook or an ipython terminal, as shown in further details below, but it can also constitute a base image for custom needs.
Note
If behind a proxy, one will need to pass the relevant environment variables when running a container from the image. Without this, the container will not be able to communicate with the internet and perform operations such as downloading and installing additional R packages from CRAN or Python packages with pip.
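As a minimal sketch of the note above, the following Python helper builds a `docker run` command line that forwards proxy settings from the host into the container. The list of variable names and the helper name `docker_run_command` are assumptions made for illustration; adjust them to whatever your proxy setup actually uses.

```python
import os
import shlex

# Proxy-related variable names commonly honoured by command-line tools.
# This list is an assumption; extend it if your environment uses others.
PROXY_VARS = ("http_proxy", "https_proxy", "no_proxy",
              "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")


def docker_run_command(image="rpy2/rpy2:3.1.x", command=("ipython",),
                       environ=None):
    """Build a `docker run` argument list that forwards proxy settings."""
    environ = os.environ if environ is None else environ
    args = ["docker", "run", "-it", "--rm"]
    for name in PROXY_VARS:
        value = environ.get(name)
        if value:
            # `-e NAME=value` sets the variable inside the container.
            args += ["-e", "{}={}".format(name, value)]
    args.append(image)
    args.extend(command)
    return args


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(shlex.quote(arg) for arg in docker_run_command()))
```

The printed command can be pasted into a shell, or the list can be handed to `subprocess.run` once it looks right.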
ipython terminal¶
docker run \
-it --rm \
rpy2/rpy2:3.1.x ipython
jupyter notebook¶
To run the jupyter notebook on port 8888:
docker run \
--rm -p 8888:8888 \
rpy2/rpy2:3.1.x
Once started, point a web browser to http://localhost:8888.
Note
If using docker-machine (which should be the case when on a Mac or a Windows PC), this will not be localhost. The IP address will be given by:
docker-machine ip [name-of-your-docker-machine-vm]
If unsure about the name of your docker-machine VM, check the output of the command docker-machine ls.
Requirements¶
Currently the development is done on UNIX-like operating systems with the following software versions. Those are the recommended versions to run rpy2 with.
Software | Versions
---|---
Python | >=3.6
R | >=3.4
Running rpy2 will require compiled libraries for R, Python, and readline; building rpy2 will require the corresponding development headers (check the documentation for more information about building rpy2).
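The version requirements above can be checked with a short script. This is only a sketch: the helper names are hypothetical, and it assumes that `R --version` prints a line of the form `R version X.Y.Z`, which is not the case for development builds of R (the parser then returns `None`).

```python
import re
import shutil
import subprocess
import sys

# Minimum versions taken from the requirements table.
MIN_PYTHON = (3, 6)
MIN_R = (3, 4)


def parse_r_version(text):
    """Extract (major, minor) from text such as 'R version 3.6.1 (...)'."""
    match = re.search(r"R version (\d+)\.(\d+)", text)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def installed_r_version():
    """Return (major, minor) for the R found in the PATH, or None."""
    r_path = shutil.which("R")
    if r_path is None:
        return None
    result = subprocess.run([r_path, "--version"],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return parse_r_version(result.stdout)


def check_requirements():
    """Report whether Python and R meet the recommended minimum versions."""
    r_version = installed_r_version()
    return {
        "python": sys.version_info[:2] >= MIN_PYTHON,
        "R": r_version is not None and r_version >= MIN_R,
    }
```

Calling `check_requirements()` returns a dictionary with one boolean per piece of software; a `False` for R can mean either that R is too old, that it is not in the PATH, or that its version string could not be parsed.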
Alternative Python implementations¶
CPython is the target implementation, and because of the presence of C code in rpy2 it is currently not possible to run the package on Jython. For that same reason, running it with Pypy is expected to require some effort.
Upgrading from an older release of rpy2¶
To upgrade, first remove any older installed rpy2 packages, and only then install the new version of rpy2.
To do so, or to check whether you have an earlier version of rpy2 installed, do the following in a Python console:
import rpy2
rpy2.__path__
An error during execution means that you do not have any older version of rpy2 installed and you should proceed to the next section.
If this returns a list containing a path, you should go to that path and remove all files and directories starting with rpy2. To make sure that the cleaning is complete, open a new Python session and check that the above code results in an error.
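The same check can be scripted without triggering an import error. The sketch below uses the standard library's `importlib.util.find_spec`; the function name `find_installed_rpy2` is hypothetical and only serves the illustration.

```python
import importlib.util


def find_installed_rpy2():
    """Return the directories of an importable rpy2 package, or None."""
    spec = importlib.util.find_spec("rpy2")
    if spec is None:
        # Nothing importable under that name: no older version to remove.
        return None
    # For a package, these are the directories also reported by rpy2.__path__.
    return list(spec.submodule_search_locations or [])


if __name__ == "__main__":
    locations = find_installed_rpy2()
    if locations is None:
        print("rpy2 is not installed")
    else:
        print("rpy2 found in:", locations)
```

Looking up the module specification does not execute the package, so the check also works when an older installation is broken.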
Download¶
The following options are, or could be, available for download:
Source packages. Released versions are available on PyPI (Sourceforge is no longer used). Snapshots of the development version can be downloaded from bitbucket.
Note
The repository on bitbucket has several branches. Make sure to select the one you are interested in.
Pre-compiled binary packages for
Microsoft’s Windows - unofficial and unsupported binaries are provided by Christoph Gohlke (http://www.lfd.uci.edu/~gohlke/pythonlibs/); there is otherwise currently close to no support for this platform
Apple’s MacOS X (although Fink and Macports are available, there do not seem to be binaries currently available)
Linux distributions
rpy2 has been reported compiling successfully on all 3 platforms, provided that development items such as Python headers and a C compiler are installed.
Note
Choose files from the rpy2 package, not rpy.
Note
The pip or easy_install commands can be used, although they currently only provide installation from source (see easy_install and pip).
Linux precompiled binaries¶
Linux distributions have packaging systems, and rpy2 is present in a number of them, either as a pre-compiled package or a source package compiled on-the-fly.
Note
Those versions will often be older than the latest rpy2 release.
Known distributions are: Debian and related (such as Ubuntu - often the most recent thanks to Dirk Eddelbuettel), Suse, RedHat, Mandrake, Gentoo.
OS X (MacOS) precompiled binaries¶
A binary wheel for OS X has been available on PyPI since rpy2-2.9.3 (see issue #403).
On OS X, rpy2 is in Macports, Homebrew, and Fink.
Microsoft’s Windows precompiled binaries¶
If available, the executable can be run; this will install the package in the default Python installation.
For a few releases in the 2.0.x series, Microsoft Windows binaries were contributed by Laurent Oget from Predictix.
There are currently no binaries or support for Microsoft Windows (because of lack of resources more than anything else), but the collection of Unofficial Windows Binaries for Python Extension Packages provided by Christoph Gohlke includes rpy2: http://www.lfd.uci.edu/~gohlke/pythonlibs/
Install from source¶
easy_install and pip¶
The source package is on the Python Package Index (PyPI), and the pip or easy_install scripts can be used whenever available. The shell command will then just be:
# recommended:
pip install rpy2
# or (but unsupported)
easy_install rpy2
Upgrading an existing installation is done with:
# recommended:
pip install rpy2 --upgrade
# or (but unsupported)
easy_install rpy2 --upgrade
Both utilities have a list of options and their respective documentation should be checked for details.
Note
Starting with rpy2 3.2.0, rpy2 can be built and used with cffi’s ABI or API modes (releases 3.0.x and 3.1.x were using the ABI mode exclusively). At the time of writing the default is still the ABI mode, but the choice can be controlled through the environment variable RPY2_CFFI_MODE. If set, possible values are ABI (default if the environment variable is not set), API, or BOTH. With the latter, both API and ABI modes are built, and the choice of which one to use can be made at run time.
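As a sketch of how the variable might be set for a build, the helper below prepares an environment with RPY2_CFFI_MODE and validates the value against the three modes listed in the note. The helper names are hypothetical, and the pip invocation is only built, not executed.

```python
import os
import sys

# The three values named in the note above.
VALID_CFFI_MODES = ("ABI", "API", "BOTH")


def build_environment(mode, environ=None):
    """Return a copy of the environment with RPY2_CFFI_MODE set to `mode`."""
    if mode not in VALID_CFFI_MODES:
        raise ValueError("mode must be one of {}".format(VALID_CFFI_MODES))
    environment = dict(os.environ if environ is None else environ)
    environment["RPY2_CFFI_MODE"] = mode
    return environment


def pip_install_command():
    """Command line installing rpy2 with the current Python interpreter."""
    return [sys.executable, "-m", "pip", "install", "rpy2"]


# Usage (not run here, since it would install a package):
#   import subprocess
#   subprocess.run(pip_install_command(),
#                  env=build_environment("BOTH"), check=True)
```

Passing a copy of the environment to the subprocess keeps the setting local to the build instead of changing the calling process.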
source archive¶
To install from a downloaded source archive <rpy_package>, do in a shell:
tar -xzf <rpy_package>.tar.gz
cd <rpy_package>
python setup.py build install
This will build the package, guessing the R HOME from the R executable found in the PATH.
Besides the regular options for the distutils way of building and installing Python packages, it is also possible to give the location of the R HOME explicitly:
python setup.py build --r-home /opt/packages/R/lib install
Other options to build the package are:
--r-home-lib # for exotic location of the R shared libraries
--r-home-modules # for R shared modules
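To see which R HOME would be picked up from the PATH, one can ask R directly with the `R RHOME` command. The following is a minimal sketch of that idea, not the logic used by rpy2's own setup script; the function name `guess_r_home` is hypothetical.

```python
import shutil
import subprocess


def guess_r_home(r_executable="R"):
    """Return the R HOME reported by the R found in the PATH, or None."""
    r_path = shutil.which(r_executable)
    if r_path is None:
        # No such executable in the PATH.
        return None
    result = subprocess.run([r_path, "RHOME"],
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # Keep the last line in case R prints anything before the path.
    return lines[-1] if lines else None


if __name__ == "__main__":
    print(guess_r_home())
```

If the result is not the installation you intend to build against, the --r-home option shown above lets you override it.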
Compiling on Linux¶
Given that you have the libraries and development headers listed above, this should be butter smooth.
The most frequent errors seem to be because of missing headers.
Compiling on OS X¶
XCode tools will be required in order to compile rpy2. Please refer to the documentation on the Apple site for more details about what they are and how to install them.
On OS X “Snow Leopard” (10.6.8), it was reported that setting architecture flags was sometimes needed:
env ARCHFLAGS="-arch i386 -arch x86_64" pip install rpy2
or
env ARCHFLAGS="-arch i386 -arch x86_64" python setup.py build install
Some people have reported trouble with OS X “Lion”. Please check the bug tracker if you are in that situation.
Using rpy2 with other versions of R or Python¶
Warning
When building rpy2, a check ensures that the build is against a recommended version of R. Building against a different version is possible, although not supported at all, through the flag --ignore-check-rversion:
python setup.py build_ext --ignore-check-rversion install
Recent development versions of R no longer return an R version, and the check ends with the error “Error: R >= <some version> required (and R told ‘development.’).”. The flag --ignore-check-rversion is then required in order to build.
Note
When compiling R from source, do not forget to specify --enable-R-shlib at the ./configure step.
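One rough way to tell whether an R installation was built with the shared library is to look for the library file under the R HOME. The sketch below assumes typical file locations (lib/libR.so on Linux, lib/libR.dylib on OS X, bin/x64/R.dll on Windows); these paths are assumptions and may differ on your system.

```python
import os

# Typical locations of the R shared library relative to the R HOME.
# These are assumptions about common layouts, not a complete list.
SHARED_LIBRARY_CANDIDATES = (
    os.path.join("lib", "libR.so"),
    os.path.join("lib", "libR.dylib"),
    os.path.join("bin", "x64", "R.dll"),
)


def find_r_shared_library(r_home):
    """Return the path of the R shared library under r_home, or None."""
    for relative_path in SHARED_LIBRARY_CANDIDATES:
        candidate = os.path.join(r_home, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A `None` result suggests, without proving, that R was configured without --enable-R-shlib.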
Test an installation¶
An installation can be tested for functionality, and whenever necessary the different layers constituting the package can be tested independently.
python -m 'rpy2.tests'
This should report that all tests were successful.
Whenever more details are needed, one can consider running tests for specific parts of the package. For example:
python -m rpy2.rinterface.tests.__init__
# or
python -m rpy2.robjects.tests.__init__
Note
Running the tests in an interactive session appears to trigger spurious exceptions when testing callback functions raising exceptions. If unsure, simply use the former way to test (in a shell).
Warning
For reasons that remain to be elucidated, running the test suites used to leave the Python interpreter in a fragile state, crashing soon after the tests had been run.
It is not clear whether this is still the case, but it is recommended to terminate the Python process after the tests and start working with a fresh new session.
To test the rpy2.robjects high-level interface:
python -m 'rpy2.robjects.tests.__init__'
or, for full control of the options:
import rpy2.robjects.tests
import unittest
# the verbosity level can be increased if needed
tr = unittest.TextTestRunner(verbosity = 1)
suite = rpy2.robjects.tests.suite()
tr.run(suite)
If interested in the lower-level interface, the tests can be run with:
python -m 'rpy2.rinterface.tests.__init__'
or, for full control of the options:
import rpy2.rinterface.tests
import unittest
# the verbosity level can be increased if needed
tr = unittest.TextTestRunner(verbosity = 1)
suite = rpy2.rinterface.tests.suite()
tr.run(suite)
Contents¶
The package is made of several sub-packages or modules:
rpy2.rinterface
¶
Low-level interface to R, when speed and flexibility matter most. Close to R’s C-level API.
rpy2.robjects
¶
High-level interface, when ease-of-use matters most. Should be the right pick for casual and general use. Based on the previous one.
rpy2.interactive
¶
High-level interface, with an eye for interactive work. Largely based on rpy2.robjects.
rpy2.rpy_classic
¶
High-level interface similar to the one in RPy-1.x. This is provided for compatibility reasons, as well as to facilitate the migration to RPy2.
rpy2.rlike
¶
Data structures and functions to mimic some of R’s features and specificities in pure Python (no embedded R process).
Design notes¶
When designing rpy2, attention was given to:
making the module simple to use from both a Python and an R user’s perspective,
minimizing the need for knowledge about R, and the need for tricks and workarounds,
allowing extensive customization while remaining at the Python level (without having to go down to C-level).
rpy2.robjects implements an extension to the interface in rpy2.rinterface by extending the classes for R objects defined there with child classes.
The choice of inheritance was made to facilitate the implementation of mostly interchangeable classes between rpy2.rinterface and rpy2.robjects. For example, an rpy2.rinterface.SexpClosure can be given any rpy2.robjects.RObject as a parameter, while any rpy2.robjects.Function can be given any rpy2.rinterface.Sexp. Because of R’s functional basis, a container-like extension is also present.
The module rpy2.rpy_classic uses delegation, letting us demonstrate how to extend rpy2.rinterface with an alternative to inheritance.
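The difference between the two extension strategies can be illustrated with toy classes in pure Python. Everything below is hypothetical and unrelated to rpy2's actual classes; it only shows why inheritance makes objects interchangeable by type, while delegation forwards calls to a wrapped object.

```python
class LowLevelVector:
    """Stand-in for a low-level object (hypothetical, not an rpy2 class)."""

    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(self.values)


class InheritedVector(LowLevelVector):
    """Inheritance: the high-level class is also a low-level object."""

    def mean(self):
        return self.total() / len(self.values)


class DelegatingVector:
    """Delegation: the high-level class holds a low-level object."""

    def __init__(self, low_level):
        self._low_level = low_level

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name == "_low_level":
            raise AttributeError(name)
        # Forward everything else to the wrapped low-level object.
        return getattr(self._low_level, name)

    def mean(self):
        return self._low_level.total() / len(self._low_level.values)


def low_level_total(vector):
    """A function written with only the low-level interface in mind."""
    return vector.total()


inherited = InheritedVector([1, 2, 3])
delegating = DelegatingVector(LowLevelVector([1, 2, 3]))
# Both work wherever the low-level interface is expected, but only the
# inherited object passes an isinstance() check against LowLevelVector.
print(low_level_total(inherited), isinstance(inherited, LowLevelVector))
print(low_level_total(delegating), isinstance(delegating, LowLevelVector))
```

With inheritance, code that checks types accepts the high-level object unchanged; with delegation, the wrapper controls exactly which parts of the low-level interface it exposes.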
Acknowledgements¶
Acknowledgements for contributions, support, and early testing go to (alphabetical order):
Alexander Belopolsky, Brad Chapman, Peter Cock, Dirk Eddelbuettel, Thomas Kluyver, Walter Moreira, Laurent Oget, John Owens, Nicolas Rapin, Grzegorz Slodkowicz, Nathaniel Smith, Gregory Warnes, as well as the JRI author(s), the R authors, R-help list responders, Numpy list responders, and other contributors.