Retrieve a specific subset of the aggregated data as a single data frame by specifying which columns to take from each dataset (file_info, scans, peaks, etc.) using dplyr::select() syntax. If columns from more than one dataset are selected (e.g. some columns from scans AND some from peaks), the datasets are combined with a dplyr::inner_join() using the columns listed in by (only those actually present in the datasets). Joins that would duplicate data entries (i.e. many-to-many joins) are not allowed and throw an error to avoid unexpected replication of individual data points. If you really want such a join, you have to do it manually.
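The joining behavior described above can be sketched with plain dplyr. This is only an illustration under stated assumptions, not the package's actual implementation: the toy tibbles, their column names (uidx, scan, tic, mz, intensity, compound) and the by vector are all hypothetical stand-ins for the aggregated datasets.

```r
library(dplyr)

# Toy stand-ins for two aggregated datasets (all column names are assumptions)
scans <- tibble(uidx = c(1L, 1L), scan = c(1L, 2L), tic = c(1e6, 2e6))
peaks <- tibble(
  uidx = c(1L, 1L, 1L),
  scan = c(1L, 1L, 2L),
  mz = c(61.9, 62.9, 61.9),
  intensity = c(10, 20, 30)
)

# Candidate join columns; "compound" is listed but absent from both datasets
by <- c("uidx", "scan", "compound")

# Select the wanted columns from each dataset, keeping the join keys
scans_sel <- scans |> select("uidx", "scan", "tic")
peaks_sel <- peaks |> select("uidx", "scan", "mz", "intensity")

# Join only on the by columns that are actually present in both selections
join_cols <- Reduce(intersect, list(by, names(scans_sel), names(peaks_sel)))

# relationship = "one-to-many" makes dplyr throw an error instead of
# silently replicating rows if the join turns out to be many-to-many
combined <- inner_join(
  scans_sel, peaks_sel,
  by = join_cols,
  relationship = "one-to-many"
)
```

Here each scan matches one or more peaks, so the join succeeds with one row per peak. The relationship argument requires dplyr 1.1.0 or later.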
Arguments
- aggregated_data
datasets aggregated from orbi_aggregate_raw()
- file_info
columns to get from the aggregated file_info; all dplyr::select() syntax is supported
- scans
columns to get from the aggregated scans; all dplyr::select() syntax is supported
- peaks
columns to get from the aggregated peaks; all dplyr::select() syntax is supported
- spectra
columns to get from the aggregated spectra; all dplyr::select() syntax is supported
- problems
columns to get from the aggregated problems; all dplyr::select() syntax is supported
- summary
columns to get from the summary calculated via orbi_summarize_results(); all dplyr::select() syntax is supported. Warning: it is not advisable to combine columns from summary with anything other than file_info, as this leads to duplicated data because summary integrates across multiple scans.
- by
which columns to look for when joining datasets together. Make sure to include the relevant by columns in the selections of the individual datasets so they are joined correctly. The default is usually sufficient.
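
The note on by can be illustrated with a small dplyr sketch showing why the key columns need to be part of each selection. The tibbles, their column names and the by vector below are hypothetical, and the logic mirrors the description above rather than the package's internals.

```r
library(dplyr)

# Hypothetical join keys and toy datasets (names are assumptions)
by <- c("uidx", "scan")
scans <- tibble(uidx = 1L, scan = 1:2, tic = c(1e6, 2e6))
peaks <- tibble(uidx = 1L, scan = c(1L, 1L, 2L), intensity = c(10, 20, 30))

# Helper: which of the by columns are present in both selections
shared_keys <- function(x, y) Reduce(intersect, list(by, names(x), names(y)))

# A selection that drops the keys leaves nothing to join on
peaks_no_keys <- select(peaks, "intensity")
keys_missing <- shared_keys(scans, peaks_no_keys)

# A selection that keeps the keys can be joined correctly
peaks_with_keys <- select(peaks, all_of(by), "intensity")
keys_present <- shared_keys(scans, peaks_with_keys)
```

In this sketch keys_missing is empty while keys_present holds both key columns, which is why the selections for each dataset should include the relevant by columns.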