Spark Version Compatibility#

This page documents known issues and limitations specific to each supported Apache Spark version.

For general compatibility information that applies across all Spark versions, see the other pages in this compatibility guide.

Spark 3.4#

Spark 3.4.3 is supported with Java 11/17 and Scala 2.12/2.13.

Known Limitations#

  • Reading TimestampLTZ as TimestampNTZ: Spark 3.4 raises an error for this operation (SPARK-36182), but Comet’s native_datafusion scan silently returns the raw UTC value instead. See Parquet Compatibility for details.

  • Unsupported Parquet type conversions: Spark 3.4 raises schema incompatibility errors for certain type mismatches (e.g., reading INT32 as BIGINT, decimal precision changes), but Comet’s native_datafusion scan may not detect these and could return unexpected values. See Parquet Compatibility for details.
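The second limitation can be illustrated with a minimal sketch. This assumes a local SparkSession and a hypothetical scratch path; the error/silent-read contrast described is the one documented in the bullet above, not output verified here:

```scala
// Hedged sketch: write an INT32 Parquet column, then read it back as BIGINT.
// Vanilla Spark 3.4 raises a schema-incompatibility error for this mismatch;
// Comet's native_datafusion scan may not detect it and could return values silently.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").getOrCreate()
import spark.implicits._

// Written with the Parquet physical type INT32.
Seq(1, 2, 3).toDF("x").write.mode("overwrite").parquet("/tmp/int32_data")

// Request a wider type than the file's physical type.
spark.read.schema("x LONG").parquet("/tmp/int32_data").show()
```

If a workload depends on Spark's error-raising behavior for such mismatches, validate the read schema against the file schema before enabling the native_datafusion scan.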

Spark 3.5#

Spark 3.5.8 is supported with Java 11/17 and Scala 2.12/2.13.

Known Limitations#

  • Reading TimestampLTZ as TimestampNTZ: Spark 3.5 raises an error for this operation (SPARK-36182), but Comet’s native_datafusion scan silently returns the raw UTC value instead. See Parquet Compatibility for details.

  • Unsupported Parquet type conversions: Spark 3.5 raises schema incompatibility errors for certain type mismatches (e.g., reading INT32 as BIGINT, decimal precision changes), but Comet’s native_datafusion scan may not detect these and could return unexpected values. See Parquet Compatibility for details.

Spark 4.0#

Spark 4.0.2 is supported with Java 17 and Scala 2.13.

Known Limitations#

  • Collation support (#1947, #4051): Spark 4.0 introduced collation support. Non-default collated strings are not yet supported by Comet and will fall back to Spark.
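For illustration, a Spark SQL fragment using a non-default collation that would trigger the fallback (table and column names are hypothetical; `UTF8_LCASE` is one of the collations introduced in Spark 4.0):

```sql
-- A column with a non-default collation: expressions over it
-- fall back from Comet to Spark.
CREATE TABLE names (name STRING COLLATE UTF8_LCASE);
SELECT * FROM names WHERE name = 'alice';  -- case-insensitive under UTF8_LCASE
```

Columns with the default collation are unaffected and continue to run natively in Comet.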

Spark 4.1#

Spark 4.1.1 is supported with Java 17/21 and Scala 2.13.

Known Limitations#

  • NullType columns in Parquet files (#4199): Spark encodes a NullType column as a Parquet BOOLEAN physical type annotated with LogicalType::Unknown. The Rust parquet crate that Comet depends on accepts Unknown only when it is paired with INT32 and rejects any other physical type with the error Parquet error: Cannot annotate Unknown from BOOLEAN for field '<name>'. As a result, any attempt to read a Parquet file that contains a NullType column fails at decode time, before Comet’s scan runs. Workarounds: project the column away, cast it to a concrete type before persisting, or disable Comet for the query that reads the file.
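The workarounds above can be sketched as follows. This assumes a local SparkSession and hypothetical paths/column names; `spark.comet.enabled` is the Comet on/off switch as documented in the Comet configuration guide (verify the key for your version):

```scala
// Hedged sketch of the NullType workarounds (hypothetical names and paths).
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

val spark = SparkSession.builder().master("local[1]").getOrCreate()

// A literal NULL without a cast is typed as NullType, which Spark writes
// as BOOLEAN + LogicalType::Unknown -- the shape Comet's reader rejects.
val bad = spark.range(3).withColumn("v", lit(null))

// Workaround 1: cast to a concrete type before persisting.
bad.withColumn("v", bad("v").cast("string"))
  .write.mode("overwrite").parquet("/tmp/no_nulltype")

// Workaround 2: for an existing file that already contains a NullType
// column, disable Comet for the query that reads it.
spark.conf.set("spark.comet.enabled", "false")
```

Workaround 1 is preferable when you control the writer, since it keeps Comet enabled for subsequent reads.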

Spark 4.2 (Experimental)#

Spark 4.2.0-preview4 is supported on an experimental basis with Java 17 and Scala 2.13.

Warning

Spark 4.2 support is experimental and targets a preview release of Spark. It is intended for early evaluation only and should not be used in production.