Welcome to rpy2-arrow’s documentation!
Installation
Releases are available on pypi, and can be installed with pip:
pip install rpy2-arrow
To install the development version with pip:
pip install -e git+https://github.com/rpy2/rpy2-arrow.git@main#egg=rpy2_arrow
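Whichever install route is used, a quick check from Python can confirm that the package is visible in the current environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of rpy2-arrow.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; not part of rpy2-arrow.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "rpy2-arrow" is the distribution name used with pip above.
version = installed_version("rpy2-arrow")
if version is None:
    print("rpy2-arrow is not installed in this environment")
else:
    print(f"rpy2-arrow {version} is installed")
```

This only reports what pip installed on the Python side; it says nothing about the R packages that are also needed at run time.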
The package allows sharing Apache Arrow
data structures (Array, ChunkedArray, Field, RecordBatch, RecordBatchReader,
Table, Schema) between Python and R
within the same process. The underlying C/C++ pointer is shared,
which can mean a large gain in performance compared to regular
arrays or data frames shared between Python and R through the
conversion rules included in rpy2. In a test with a pandas.DataFrame
of half a million rows, making that data available to R was measured
to be 200 times faster with Arrow (see Conversion).
Note
The R package arrow >= 12.0.0 is required to avoid a segfault when exiting Python.
The latest released version of the R package can be installed in R with:
install.packages("arrow")
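The version requirement in the note can be checked before loading anything. The sketch below assumes the installed version string has already been obtained, for example by running `packageVersion("arrow")` in R, and compares it numerically against 12.0.0. The function names `parse_r_version` and `arrow_is_recent_enough` are hypothetical and not part of rpy2-arrow.

```python
import re

# Minimum version of the R package arrow stated in the note above.
MINIMUM_ARROW = "12.0.0"


def parse_r_version(text):
    """Split an R package version such as '12.0.1.1' into a tuple of integers.

    R accepts both '.' and '-' as separators between version components.
    """
    return tuple(int(part) for part in re.split(r"[.-]", text.strip()))


def arrow_is_recent_enough(found, minimum=MINIMUM_ARROW):
    """Return True if `found` is at least `minimum`, comparing numerically."""
    a, b = parse_r_version(found), parse_r_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that '12.0' compares equal to '12.0.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print(arrow_is_recent_enough("12.0.1.1"))  # True
print(arrow_is_recent_enough("9.0.0"))     # False
```

Comparing integer tuples rather than raw strings matters here: as strings, "9.0.0" would sort after "12.0.0".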
Polars
If you use polars, the conversion rules for it are described in
the section polars.