Data type conversion
Series underlying type conversion
Series itself is dynamically typed. He points to the underlying ChunkedArray<T> by a reference. Where T is the underlying data type.
#![allow(unused)] fn main() { Series::cast(&self, dtype: &DataType) -> Result<Series, PolarsError> }
Type conversion by Expressions
Numeric types are converted to each other
Converting numeric types of different capacities to each other can experience overflow issues. By default, an error is thrown.
#![allow(unused)] fn main() { col("integers").cast(DataType::Boolean).alias("i2bool")//Numeric to bool, 0 is flase, non-zero is true. col("floats").cast(DataType::Boolean).alias("f2bool")//Numeric to bool, 0 is flase, non-zero is true. bool and numeric are allowed to convert to each other. However, transformation from string to bool is not allowed. }
Strings are converted to Numbers
#![allow(unused)] fn main() { col("floats_as_string").cast(DataType::Float64) //Converts a string to a numeric value through a transformation operation, throwing a runtime error if a non-number occurs. }
Numbers are converted to strings
#![allow(unused)] fn main() { col("integers").cast(DataType::String) col("float").cast(DataType::String) }
The string is parsed to a datetime
#![allow(unused)] fn main() { let mut opt = StrptimeOptions::default(); opt.format=Some("%Y-%m-%d %H:%M:%S".to_owned()); //Default value is None, polars can recognize standard date and time such as: //"2024-09-20 00:44:00+08:00" by default, time zone can be omitted. If your data // is in standard datetime format, then you don't need to set this parameter. col("datetimestr").str().to_datetime(None,None,opt,lit("raise")) }
The to_datatime parameters are Units, Time Zone, Resolution Parameters, and Ambiguity. Many parts of the world use daylight saving time, which turns on and moves the local time forward by 1 hour and back at the end of daylight saving time. Dialing 1 hour into the future will cause a time period to not exist, and dialing back will cause a time period to appear twice. When a certain time value appears in a time period that should not exist, that is, when the ambiguity processing parameter comes into play, it is set to lit("raise"), and an error is reported when the ambiguous time occurs. An important lesson is that the storage of dates and times must contain time zones, such as "2024-09-20 0:44:00+08:00", and the time zone information can be used to represent the time correctly. In cases where the time zone is omitted, the time zone defaults to UTC. This correctness is not only in the local region, but also in the world, where the data can be correctly calculated and compared.
Datetime to string
This is datetime formating.
#![allow(unused)] fn main() { col("datetime").dt().to_string("Data: %Y-%m-%d, Time: %H:%M:%S") }