groupBy

fun DataFrame.groupBy(columnSelect: ColumnSelector): DataFrame

Creates a grouped data-frame from a column selector function. See select() for details about column selection.

Most data operations are done on groups defined by variables. groupBy() takes the receiver data-frame and converts it into a grouped data-frame where operations are performed "by group". ungroup() removes the grouping.

Most krangl verbs, such as addColumn() and summarize(), are executed per group if a grouping is present.

// group by a single attribute
flightsData.groupBy("carrier")


// or by multiple attributes
flightsData.groupBy("carrier", "tailnum")

// or by selecting grouping attributes with an indicator function (same as in `select()`)
flightsData.groupBy { startsWith("dep_") }

// finally we can also group with arbitrary table expressions
flightsData.groupByExpr { it["dep_time"] eq 22 }
flightsData.groupByExpr({ it["dep_time"] eq 22 }, { it["carrier"] })
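To illustrate what "by group" means in practice, here is a minimal sketch that groups a small table and summarizes it per group. The table contents and the helper names `sampleFlights` and `meanDelayByCarrier` are made up for illustration. The krangl calls (`dataFrameOf`, `summarize`, `mean`, `ungroup`, `print`) are used the way krangl's own examples use them, but they should be checked against the version in use.

```kotlin
import krangl.*

// Hypothetical sample data: three flights from two carriers.
fun sampleFlights(): DataFrame = dataFrameOf("carrier", "dep_delay")(
    "AA", 10,
    "AA", 20,
    "UA", 5
)

// Group by carrier, then summarize. Because a grouping is present,
// the summary expression is evaluated once per group, so the result
// has one row per carrier.
fun meanDelayByCarrier(flights: DataFrame): DataFrame =
    flights
        .groupBy("carrier")
        .summarize("mean_delay" to { it["dep_delay"].mean() })

fun main() {
    val flights = sampleFlights()

    meanDelayByCarrier(flights).print()

    // ungroup() drops the grouping again; the rows themselves are unchanged.
    val ungrouped = flights.groupBy("carrier").ungroup()
    println(ungrouped.nrow)
}
```

Without the `groupBy("carrier")` call, the same `summarize` expression would be evaluated over the whole table and produce a single row.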