Indexing LazyFrame

Indexing in LazyFrame requires using expression syntax.

Column Indexing

In the select context, you can use the col() expression to select certain columns. cols(["date", "logged_at"]) selects specified names. col("*") or all() selects all columns. exclude(["logged_at", "index"]) excludes specified columns. * can be used as a wildcard.

#![allow(unused)]
fn main() {
// Single column selection
let out = types_df.clone().lazy().select([col("place")]).collect()?;
// Wildcard selection, selects all columns starting with 'a'
let out = types_df.clone().lazy().select([col("a*")]).collect()?;
// Regular expression
let out = types_df.clone().lazy().select([col("^.*(as|sa).*$")]).collect()?;
// Multiple column names selection
let out = types_df.clone().lazy().select([cols(["date", "logged_at"])]).collect()?;
// Data type selection, selects all columns that match the data types.
let out = types_df.clone().lazy()
    .select([dtype_cols([DataType::Int64, DataType::UInt32, DataType::Boolean]).n_unique()])
.collect()?;
// Alias to rename fields
let df_alias = df.clone().lazy()
.select([ (col("nrs") + lit(5)).alias("nrs + 5"),
(col("nrs") - lit(5)).alias("nrs - 5")])
.collect()?;
}

Row Indexing

Using LazyFrame Methods

#![allow(unused)]
fn main() {
let res: LazyFrame = types_df.clone().lazy().filter(col("id").gt_eq(lit(4)));
// The filter parameter is an expression that returns a boolean array, rows with false values will be discarded.
println!("{}", res.collect()?);
let res: LazyFrame = types_df.clone().lazy().slice(1, 2); // Returns rows specified by the slice, a negative offset means counting from the end. .slice(-5,3) means selecting 3 elements starting from the 5th last element.
}

Using Expressions for Row Indexing in Context

#![allow(unused)]
fn main() {
let res: LazyFrame = employee_df.clone().lazy().select(
        col("*").filter(
            col("value").gt_eq(lit(4))
            )
        );
let res: LazyFrame = employee_df.clone().lazy().select(col("*").slice(2,3));
}

In these examples, you can see how to use expressions to filter and slice rows and columns in a LazyFrame, providing flexible ways to manipulate data.