Basic usage

Note

If you like better seeing an example first and how this can be useful out of the box, check how achieve faster rpy2 conversions.

import pyarrow
import rpy2_arrow.arrow as pyra

# Create an array on the Python side using pyarrow
py_array = pyarrow.array(range(10))

# Pass the C/C++ pointer to an R arrow object
r_array = pyra.pyarrow_to_r_array(py_array)

The rpy2 objects obtained is a mapped to and rpy2.robjects.Environment.

The attributes of that R object in the OOP system used can are like keys for a mapping. We can list them with:

tuple(r_array.keys())
('.__enclos_env__',
 '.:xp:.',
 'ApproxEquals',
 'as_vector',
 'cast',
 'clone',
 'data',
 'Equals',
 'Filter',
 'initialize',
 'invalidate',
 'IsNull',
 'IsValid',
 'length',
 'null_count',
 'offset',
 'pointer',
'print',
'RangeEquals',
'set_pointer',
'Slice',
'Take',
'ToString',
'type',
'type_id',
'Validate',
'View')

Note

The R package arrow creates objects using an other R package called R6. R has several options for OOP, and R6 objects inherit from the base R type “environment”. This is why the base converter rpy2_arrow.arrow.converter will show Arrow objects on the R side as rpy2.robjects.Environment or rpy2.rinterface.SexpEnvironment.

The package rpy2_r6 offers utilities to facilitate the mapping of R6 classes to Python classes. An option to use it with rpy2_arrow could be offer later.

For exampple, we can call its method ToString:

print(
    ''.join(
        r_array['ToString']()
     )
 )
<int64>
[
  0,
  1,
  2,
  3,
  4,
  5,
  6,
  7,
  8,
  9
]

The R object can also be used with R functions able to use use arrow objects. For example:

import rpy2.robjects.packages as packages
base = packages.importr('base')
print(base.sum(r_array))
Scalar
45

The object can also be passed to blocks of R code in strings.

r_code = """
## this assumes a R Arrow array my_array

res = sum(my_array > 5)
print(res)
"""

import rpy2.rinterface
import rpy2.robjects
with rpy2.rinterface.local_context() as r_context:
    r_context['my_array'] = r_array
    res = rpy2.robjects.r(r_code)
Scalar
4