Use case
Using Polars instead of pandas to develop Spark UDFs
Actually you can. For example, you can use Polars to write an Arrow UDF, because Polars supports zero-copy creation of its DataFrame from a pyarrow RecordBatch and back. At the moment there is only mapInArrow (https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.mapInArrow.html), but applyInArrow has already been added to the master branch of PySpark and will be available in Spark 4.0. In my experience Polars UDFs are noticeably faster than pandas UDFs: when I tried it, I got a speedup of about 1.5x to 2x.
Detailed tutorial on the best way to use Polars
https://kevinheavey.github.io/modern-polars/tidy.html#pivot-and-melt