Numpy

A popular solution for scientific computing with Python is numpy.

rpy2 has features to ease bidirectional communication with numpy.

High-level interface

From rpy2 to numpy:

R vectors or arrays can be converted to numpy arrays using numpy.array() or numpy.asarray():

import numpy

ltr = robjects.r.letters
ltr_np = numpy.array(ltr)

This behavior is inherited from the low-level interface; vector-like objects inheriting from rpy2.rinterface.SexpVector present an interface recognized by numpy.

from rpy2.robjects.packages import importr, data
import numpy

datasets = importr('datasets')
ostatus = data(datasets).fetch('occupationalStatus')['occupationalStatus']
ostatus_np = numpy.array(ostatus)
ostatus_npnc = numpy.asarray(ostatus)

The matrix ostatus is an 8x8 matrix:

>>> print(ostatus)
      destination
origin   1   2   3   4   5   6   7   8
     1  50  19  26   8   7  11   6   2
     2  16  40  34  18  11  20   8   3
     3  12  35  65  66  35  88  23  21
     4  11  20  58 110  40 183  64  32
     5   2   8  12  23  25  46  28  12
     6  12  28 102 162  90 554 230 177
     7   0   6  19  40  21 158 143  71
     8   0   3  14  32  15 126  91 106

Its content has been copied to a numpy array:

>>> ostatus_np
array([[ 50,  19,  26,   8,   7,  11,   6,   2],
       [ 16,  40,  34,  18,  11,  20,   8,   3],
       [ 12,  35,  65,  66,  35,  88,  23,  21],
       [ 11,  20,  58, 110,  40, 183,  64,  32],
       [  2,   8,  12,  23,  25,  46,  28,  12],
       [ 12,  28, 102, 162,  90, 554, 230, 177],
       [  0,   6,  19,  40,  21, 158, 143,  71],
       [  0,   3,  14,  32,  15, 126,  91, 106]])
>>> ostatus_np[0, 0]
50
>>> ostatus_np[0, 0] = 123
>>> ostatus_np[0, 0]
123
>>> ostatus.rx(1, 1)[0]
50

On the other hand, ostatus_npnc is a view on ostatus; no copy was made:

>>> ostatus_npnc[0, 0] = 456
>>> ostatus.rx(1, 1)[0]
456

Since we did modify an actual R dataset for the session, we should restore it:

>>> ostatus_npnc[0, 0] = 50

As we see, numpy.asarray(): provides a way to build a view on the underlying R array, without making a copy. This will be of particular appeal to developpers whishing to mix rpy2 and numpy code, with the rpy2 objects or the numpy view passed to functions, or for interactive users much more familiar with the numpy syntax.

Note

The current interface is relying on the __array_struct__ defined in numpy.

Python buffers, as defined in PEP 3118, is the way to the future, and rpy2 is already offering them… although as a (poorly documented) experimental feature.

From numpy to rpy2:

Some of the conversions operations require the copy of data in R structures into Python structures. Whenever this happens, the time it takes and the memory required will depend on object sizes. Because of this reason the use of a local converter is recommended: it makes limiting the use of conversion rules to code blocks of interest easier to achieve.

from rpy2.robjects import numpy2ri
from rpy2.robjects import default_converter
from rpy2.robjects.conversion localconverter

# Create a converter that starts with rpy2's default converter
# to which the numpy conversion rules are added.
np_cv_rules = default_converter + numpy2ri.converter

with localconverter(np_cv_rules) as cv:
    # Anything here and until the `with` block is exited
    # will use our numpy converter whenever objects are
    # passed to R or are returned by R while calling
    # rpy2.robjects functions.
    pass

An example of usage is:

from rpy2.robjects.packages import importr
stats = importr('base')
with localconverter(np_cv_rules) as cv:
    v_np = stats.rlogis(100, location=0, scale=1)
    # `v_np` is a numpy array

# Outside of the scope of the local converter the
# result will not be automatically converted to a
# numpy object.
v_nonp = stats.rlogis(100, location=0, scale=1)

Note

Why make numpy an optional feature for rpy2? This was a design decision taken in order to: - ensure that rpy2 can function without numpy. An early motivation for this was compatibility with Python 3 and dropping support for Python 2. rpy2 did that much earlier than numpy did. - make potentially resource-consuming conversions optional

Note

The module numpy2ri is an example of how custom conversion to and from rpy2.robjects can be performed.

Low-level interface

The rpy2.rinterface.SexpVector objects are made to behave like arrays, as defined in the Python package numpy.

The functions numpy.array() and numpy.asarray() can be used to construct numpy arrays:

>>> import numpy
>>> rx = rinterface.SexpVector([1,2,3,4], rinterface.INTSXP)
>>> nx = numpy.array(rx)
>>> nx_nc = numpy.asarray(rx)

Note

when using numpy.asarray(), the data are not copied.

>>> rx[2]
3
>>> nx_nc[2] = 42
>>> rx[2]
42
>>>