R/marrrow.R
marrow_dir.Rd
map + arrow: apply a function to each element of a vector or list and collate the results into an Arrow dataset. The whole dataset is never held in memory at once, so this is suitable for large data objects. The function must return a data.frame or tibble. The returned value is the path to the directory containing the Arrow dataset.
marrow_dir(.x, .f, ..., .path, .partitioning = c(), .format = "parquet")
Argument | Description |
---|---|
.x | vector or list of values for .f to iterate over |
.f | function; must return a data.frame/tibble |
... | other arguments to .f |
.path | path to the directory where the collated Arrow dataset will be stored. Created if it does not exist. |
.partitioning | character vector of columns to use for partitioning. Columns must exist in output of .f. |
.format | "parquet" (the default) or "arrow". |
path to new dataset directory; character string of length one.
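A minimal usage sketch, based only on the signature and argument descriptions above. The helper `make_chunk()`, its `n` argument, the `group` and `value` columns, and the output location under `tempdir()` are all hypothetical and exist only for illustration. It also assumes that the package providing `marrow_dir()` is installed under the name `marrow` and that the resulting directory can be read back with `arrow::open_dataset()`.

```r
library(marrow)  # assumption: the package that provides marrow_dir()
library(arrow)

# Hypothetical per-item function: returns a small data.frame for one value of .x.
# It must return a data.frame or tibble, as required for .f.
make_chunk <- function(i, n = 3) {
  data.frame(group = i, value = seq_len(n) * i)
}

# Hypothetical output directory; marrow_dir() creates it if it does not exist.
out_dir <- file.path(tempdir(), "marrow_example")

path <- marrow_dir(
  .x = 1:4,                 # values to iterate over
  .f = make_chunk,          # called once per element of .x
  n = 3,                    # passed through ... to .f
  .path = out_dir,
  .partitioning = "group",  # must be a column in the output of .f
  .format = "parquet"
)

# The return value is the dataset directory (character string of length one).
path

# Assumption: the directory can be opened lazily as an Arrow dataset.
ds <- arrow::open_dataset(path, format = "parquet")
ds
```

Because each call to `.f` produces only one piece of the final dataset, `.f` should be written so that a single result fits comfortably in memory, even when the collated dataset does not.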