Mapping rpy2 objects to arbitrary Python objects¶
Note
Switching between a conversion mode and a no-conversion mode, an operation often needed when working with RPy-1.x, is no longer necessary with rpy2.
The approach followed in rpy2 has two levels (rinterface and robjects), and conversion functions help move between them.
Protocols¶
At the lower level (rpy2.rinterface), the rpy2 objects exposing
R objects implement Python protocols to make them feel as natural to a Python
programmer as possible. With these protocols, they can be passed as arguments to many
non-rpy2 functions without the need for conversion.
R vectors are mapped to Python objects implementing the methods
__getitem__() / __setitem__() in the sequence
protocol so elements can be accessed easily. They also implement the Python buffer protocol,
allowing them to be used in numpy functions without the need for data copying or conversion.
R functions are mapped to Python
objects implementing __call__() so they can be called just as if
they were Python functions.
R environments are mapped to Python objects implementing __getitem__() / __setitem__()
in the mapping protocol so elements can be accessed much as in a Python dict.
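To illustrate why implementing these protocols matters, here is a minimal sketch in pure Python. The classes `ToyVector`, `ToyFunction` and `ToyEnvironment` are hypothetical stand-ins written for this example, not rpy2 classes. They only show that an object implementing the sequence, call or mapping protocol can be handed to ordinary Python code without conversion.

```python
class ToyVector:
    """Hypothetical stand-in for an R vector (sequence protocol)."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        return self._values[i]

    def __setitem__(self, i, value):
        self._values[i] = value


class ToyFunction:
    """Hypothetical stand-in for an R function (call protocol)."""

    def __init__(self, func):
        self._func = func

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)


class ToyEnvironment:
    """Hypothetical stand-in for an R environment (mapping protocol)."""

    def __init__(self):
        self._symbols = {}

    def __getitem__(self, name):
        return self._symbols[name]

    def __setitem__(self, name, value):
        self._symbols[name] = value


vec = ToyVector([1, 2, 3])
vec[0] = 10                 # item assignment goes through __setitem__
total = sum(vec)            # the built-in sum() iterates using __getitem__

toy_sum = ToyFunction(sum)  # called like any Python function
env = ToyEnvironment()
env["x"] = vec              # dict-like access by symbol name
result = toy_sum(env["x"])
```

Nothing in `sum()` knows about `ToyVector`; it works because the object follows the protocol that `sum()` expects.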
Note
The rinterface level is largely implemented in C, bridging Python and R C-APIs. There is no easy way to customize it.
Conversion¶
In its high-level interface, rpy2 uses a conversion system whose task is
to convert objects between the following three representations:
- lower-level interface to R (rpy2.rinterface level),
- higher-level interface to R (rpy2.robjects level),
- other (non-rpy2) representations.
For example, if one wanted to have every Python tuple turned into an R character vector (a 1D array of strings) as exposed by rpy2’s low-level interface, the function would look like:
from rpy2.rinterface import StrSexpVector
def tuple_str(tpl):
    res = StrSexpVector(tpl)
    return res
Converter objects¶
The class rpy2.robjects.conversion.Converter groups such conversion functions
into one object.
Our conversion function defined above can then be registered as follows:
from rpy2.robjects.conversion import Converter
my_converter = Converter('my converter')
my_converter.py2ri.register(tuple, tuple_str)
Converter objects are additive, which can be an easy way to create simple combinations of conversion rules. For example, creating a converter that adds the rule above to the default conversion rules is written:
from rpy2.robjects import default_converter
default_converter + my_converter
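The additive behaviour can be pictured with the following minimal sketch. `ToyConverter` is a hypothetical class written for illustration, not rpy2's `Converter`, and it makes two assumptions of its own: rules are looked up by exact type only, and when two converters are added, a rule from the right-hand operand wins over a rule for the same type on the left.

```python
class ToyConverter:
    """Hypothetical rule container; NOT rpy2's Converter class."""

    def __init__(self, name, rules=None):
        self.name = name
        # Maps a Python type to the function converting objects of that type.
        self.rules = dict(rules or {})

    def register(self, cls, func):
        self.rules[cls] = func

    def __add__(self, other):
        # Build a new converter; neither operand is modified.
        merged = dict(self.rules)
        merged.update(other.rules)  # assumption: right-hand rules take precedence
        return ToyConverter(f"{self.name} + {other.name}", merged)

    def convert(self, obj):
        try:
            func = self.rules[type(obj)]
        except KeyError:
            raise NotImplementedError(
                f"no rule for {type(obj).__name__}"
            ) from None
        return func(obj)


toy_default = ToyConverter("default")
toy_default.register(int, float)

toy_mine = ToyConverter("my converter")
toy_mine.register(tuple, lambda tpl: [str(x) for x in tpl])

combined = toy_default + toy_mine
as_strings = combined.convert((1, 2, "c"))  # rule contributed by toy_mine
as_float = combined.convert(3)              # rule contributed by toy_default
```

Because addition returns a new object, `toy_default` still has no rule for tuples after `combined` is created.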
Local conversion rules¶
The conversion rules can be customized globally (see section Customizing the conversion) or through the use of local converters as context managers. The latter is recommended when experimenting, or when a specific behavior of the conversion system is only wanted temporarily.
We can use this, for example, to change rpy2’s current refusal to handle sequences of unspecified type.
The following code raises an error because rpy2 does not know how to handle Python sequences.
x = (1,2,'c')
from rpy2.robjects.packages import importr
base = importr('base')
# error here:
res = base.paste(x, collapse="-")
This can be changed by using our converter as an addition to the default conversion scheme:
from rpy2.robjects import default_converter
from rpy2.robjects.conversion import Converter, localconverter
with localconverter(default_converter + my_converter) as cv:
    res = base.paste(x, collapse="-")
ri2ro()¶
At this level the conversion is between lower-level (rpy2.rinterface)
objects and higher-level (rpy2.robjects) objects.
This method is a generic function, as implemented by functools.singledispatch()
(with Python 2, singledispatch.singledispatch()).
ri2py()¶
At this level the conversion is between lower-level (rpy2.rinterface)
objects and any objects (presumably non-rpy2, if the conversion can be made).
This method is a generic function, as implemented by functools.singledispatch()
(with Python 2, singledispatch.singledispatch()).
For example, the optional conversion scheme for numpy objects
will return numpy arrays whenever possible.
Note
robjects-level objects are also implicitly rinterface-level objects because of the inheritance relationship in their class definition, but the reverse is not true. The robjects level is a higher level of abstraction, aiming at simplifying one’s use of R from Python (although at the possible cost of performance).
py2ri()¶
At this level the conversion is between (presumably) non-rpy2 objects
and rpy2 lower-level (rpy2.rinterface) objects.
This method is a generic function, as implemented by functools.singledispatch()
(with Python 2, singledispatch.singledispatch()).
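Since the conversion generics described above rely on `functools.singledispatch()`, a short standard-library example may help readers unfamiliar with it. The function `to_toy_vector` and the string labels it returns are hypothetical and only stand in for real conversion targets; the sketch shows how an implementation is selected from the type of the first argument.

```python
from functools import singledispatch


@singledispatch
def to_toy_vector(obj):
    # Fallback used when no registered type matches.
    raise NotImplementedError(f"no conversion rule for {type(obj).__name__}")


@to_toy_vector.register(int)
def _(obj):
    return ("IntVector", [obj])


@to_toy_vector.register(str)
def _(obj):
    return ("StrVector", [obj])


from_int = to_toy_vector(1)
from_str = to_toy_vector("a")
# bool is a subclass of int, so the int rule is picked through the class hierarchy.
from_bool = to_toy_vector(True)
```

Dispatch follows the method resolution order of the argument's class, which is why registering a rule for a base class also covers its subclasses unless a more specific rule is registered.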
Customizing the conversion¶
As an example, let’s assume that one wants to return atomic values whenever an R numerical vector is of length one. This is only a matter of registering a new function with ri2ro that handles this, as shown below:
import rpy2.robjects as robjects
from rpy2.rinterface import SexpVector
@robjects.conversion.ri2ro.register(SexpVector)
def my_ri2ro(obj):
    if len(obj) == 1:
        obj = obj[0]
    return obj
Then we can test it with:
>>> pi = robjects.r.pi
>>> type(pi)
<type 'float'>
At the time of writing, singledispatch() does not provide a way to unregister a function.
Removing the additional conversion rule without restarting Python is left as an
exercise for the reader.
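One possible approach to that exercise, sketched on a stand-in generic rather than on rpy2's own, is to remember which implementation handled the type before the override and to register it again afterwards. The names `ToyVector` and `toy_ri2ro` are hypothetical. This does not truly unregister anything: the type stays in the registry, pinned to the implementation that was looked up earlier.

```python
from functools import singledispatch


class ToyVector(list):
    """Hypothetical stand-in for an R vector class."""


@singledispatch
def toy_ri2ro(obj):
    # Default rule: return the object unchanged.
    return obj


# Look up what currently handles ToyVector before overriding it.
previous = toy_ri2ro.dispatch(ToyVector)


@toy_ri2ro.register(ToyVector)
def _unwrap_length_one(obj):
    return obj[0] if len(obj) == 1 else obj


customized = toy_ri2ro(ToyVector([3.14]))  # unwrapped to a plain float

# The registry attribute is a read-only view, so the entry cannot be deleted.
try:
    del toy_ri2ro.registry[ToyVector]
    deletable = True
except TypeError:
    deletable = False

# Re-registering the earlier implementation restores the earlier behavior.
toy_ri2ro.register(ToyVector, previous)
restored = toy_ri2ro(ToyVector([3.14]))    # a ToyVector again
```

A caveat of this sketch: if `previous` was inherited from a base-class rule, later changes to that base-class rule will no longer reach `ToyVector`.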
Warning
The example bends the rpy2 rules a little, as it uses ri2ro while not
returning an robjects instance when the R vector is of length one. We are getting away with it
because atomic Python types such as int, float, bool, complex, and str
are well handled by rpy2 at the rinterface/C level.