Numpy

A popular solution for scientific computing with Python is numpy (its predecessors were Numeric and numarray).

rpy2 has features to ease bidirectional communication with numpy.

High-level interface

From rpy2 to numpy:

R vectors or arrays can be converted to numpy arrays using numpy.array() or numpy.asarray():

import numpy
from rpy2 import robjects

# R's built-in vector of lowercase letters, copied into a numpy array
ltr = robjects.r.letters
ltr_np = numpy.array(ltr)

This behavior is inherited from the low-level interface; vector-like objects inheriting from rpy2.rinterface.SexpVector present an interface recognized by numpy.

from rpy2.robjects.packages import importr, data
import numpy

datasets = importr('datasets')
ostatus = data(datasets).fetch('occupationalStatus')['occupationalStatus']
ostatus_np = numpy.array(ostatus)
ostatus_npnc = numpy.asarray(ostatus)

The R object ostatus is an 8x8 matrix:

>>> print(ostatus)
      destination
origin   1   2   3   4   5   6   7   8
     1  50  19  26   8   7  11   6   2
     2  16  40  34  18  11  20   8   3
     3  12  35  65  66  35  88  23  21
     4  11  20  58 110  40 183  64  32
     5   2   8  12  23  25  46  28  12
     6  12  28 102 162  90 554 230 177
     7   0   6  19  40  21 158 143  71
     8   0   3  14  32  15 126  91 106

numpy.array() copied its content into a new numpy array, so modifying ostatus_np leaves ostatus unchanged:

>>> ostatus_np
array([[ 50,  19,  26,   8,   7,  11,   6,   2],
       [ 16,  40,  34,  18,  11,  20,   8,   3],
       [ 12,  35,  65,  66,  35,  88,  23,  21],
       [ 11,  20,  58, 110,  40, 183,  64,  32],
       [  2,   8,  12,  23,  25,  46,  28,  12],
       [ 12,  28, 102, 162,  90, 554, 230, 177],
       [  0,   6,  19,  40,  21, 158, 143,  71],
       [  0,   3,  14,  32,  15, 126,  91, 106]])
>>> ostatus_np[0, 0]
50
>>> ostatus_np[0, 0] = 123
>>> ostatus_np[0, 0]
123
>>> ostatus.rx(1, 1)[0]
50

On the other hand, ostatus_npnc is a view on ostatus; no copy was made:

>>> ostatus_npnc[0, 0] = 456
>>> ostatus.rx(1, 1)[0]
456

Since this modified the actual R dataset for the rest of the session, we restore the original value:

>>> ostatus_npnc[0, 0] = 50

As we see, numpy.asarray() provides a way to build a view on the underlying R array without making a copy. This will be of particular appeal to developers wishing to mix rpy2 and numpy code, passing either the rpy2 objects or the numpy views to functions, and to interactive users who are much more familiar with the numpy syntax.
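The difference between a copy and a view can be checked without R. The sketch below is only an illustration: BorrowedVector is a hypothetical stand-in for an R vector, not part of rpy2, and it exposes its memory through the Python-level __array_interface__ attribute rather than through the mechanism rpy2 itself uses.

```python
import numpy


class BorrowedVector:
    """Hypothetical stand-in for an R vector: it owns its data and lends it to numpy."""

    def __init__(self, values):
        # Private storage playing the role of the memory managed by R.
        self._store = numpy.array(values, dtype=numpy.int32)

    @property
    def __array_interface__(self):
        # Describe the owned memory (address, shape, type) to numpy.
        return self._store.__array_interface__

    def __getitem__(self, index):
        return int(self._store[index])


vec = BorrowedVector([1, 2, 3, 4])
copied = numpy.array(vec)    # independent copy of the data
view = numpy.asarray(vec)    # view on the memory owned by vec

copied[2] = 99               # does not touch vec
value_after_copy_write = vec[2]
view[2] = 42                 # writes through to vec
value_after_view_write = vec[2]

# numpy.shares_memory() tells whether two arrays overlap in memory.
view_is_shared = numpy.shares_memory(view, vec._store)
copy_is_shared = numpy.shares_memory(copied, vec._store)
```

numpy.shares_memory() is a convenient way to confirm, in code that mixes both libraries, whether a given array is a view before writing to it.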

Note

The current interface relies on the __array_struct__ attribute defined by numpy's array interface.

Python buffers, as defined in PEP 3118, are the way forward, and rpy2 already offers them, although only as a (poorly documented) experimental feature.
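For readers unfamiliar with PEP 3118, the following sketch shows the buffer protocol at work between numpy and a standard-library object. It assumes nothing about rpy2's experimental buffer support, whose interface is not described here; array.array is used purely as a stand-in for an object that exports a buffer.

```python
import array

import numpy

# A standard-library array of C ints; it implements the PEP 3118 buffer protocol.
values = array.array("i", [1, 2, 3, 4])

# numpy.frombuffer() wraps the exported buffer without copying it.
view = numpy.frombuffer(values, dtype=numpy.intc)

# Writing through the numpy view changes the original object.
view[2] = 42
third_value = values[2]
```

As with numpy.asarray() on an R vector, no data are copied: the numpy array and the original object share the same memory.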

From numpy to rpy2:

The automatic conversion of numpy objects into rpy2 objects can be activated (and deactivated) with:

from rpy2.robjects import numpy2ri
numpy2ri.activate()
numpy2ri.deactivate()

Warning

In earlier versions of rpy2, importing the module was all that was needed to enable the conversion. Side effects at import time can lead to problems, so an extra step is now required to make the conversion active: call the function rpy2.robjects.numpy2ri.activate().

Note

Why make this an optional import, when it could have been included in the function py2ri() (as done in the original patch submitted for that function)?

Although both are valid and reasonable options, the design decision was taken in order to decouple rpy2 from numpy as much as possible, and to avoid assuming that having numpy installed automatically means that a programmer wants to use it.

Note

The module numpy2ri is an example of how custom conversion to and from rpy2.robjects can be performed.

Low-level interface

The rpy2.rinterface.SexpVector objects are made to behave like arrays, as defined in the Python package numpy.

The functions numpy.array() and numpy.asarray() can be used to construct numpy arrays:

>>> import numpy
>>> from rpy2 import rinterface
>>> rx = rinterface.SexpVector([1,2,3,4], rinterface.INTSXP)
>>> nx = numpy.array(rx)
>>> nx_nc = numpy.asarray(rx)

Note

When using numpy.asarray(), the data are not copied: the numpy array is a view on the R vector, so writing to it also changes the R vector.

>>> rx[2]
3
>>> nx_nc[2] = 42
>>> rx[2]
42