DataFusion Comet 0.12.0 Changelog
This release consists of 105 commits from 13 contributors. See credits at the end of this changelog for more information.
Fixed bugs:
fix: Fix `None.get` in `stringDecode` when `bin` child cannot be converted #2606 (cfmcgrady)
fix: Update FuzzDataGenerator to produce dictionary-encoded string arrays & fix bugs that this exposes #2635 (andygrove)
fix: Fallback to Spark for lpad/rpad for unsupported arguments & fix negative length handling #2630 (andygrove)
fix: Mark SortOrder with floating-point as incompatible #2650 (andygrove)
fix: Fall back to Spark for `trunc`/`date_trunc` functions when format string is unsupported, or is not a literal value #2634 (andygrove)
fix: [native_datafusion] only pass single partition of PartitionedFiles into DataSourceExec #2675 (mbutrovich)
fix: Fix subcommands options in fuzz-testing #2684 (manuzhang)
fix: Do not replace SMJ with HJ for `LeftSemi` #2687 (comphead)
fix: Apply spotless on Iceberg 1.8.1 diff [iceberg] #2700 (hsiang-c)
fix: Fix generate-user-guide-reference-docs failure when mvn command is not executed at root #2691 (manuzhang)
fix: Fix missing SortOrder fallback reason in range partitioning #2716 (andygrove)
fix: CometLiteral class cast exception with arrays #2718 (andygrove)
fix: NormalizeNaNAndZero::children() returns child’s child #2732 (mbutrovich)
fix: checkSparkMaybeThrows should compare Spark and Comet results in success case #2728 (andygrove)
fix: Mark `WindowsExec` as incompatible #2748 (andygrove)
fix: Add strict floating point mode and fallback to Spark for min/max/sort on floating point inputs when enabled #2747 (andygrove)
fix: Implement producedAttributes for CometWindowExec #2789 (rahulbabarwal89)
fix: Pass all Comet configs to native plan #2801 (andygrove)
Implemented enhancements:
feat: Add option to write benchmark results to file #2640 (andygrove)
feat: Implement metrics for iceberg compat #2615 (EmilyMatt)
feat: Define function signatures in CometFuzz #2614 (andygrove)
feat: cherry-pick UUID conversion logic from #2528 #2648 (mbutrovich)
feat: support `concat` for strings #2604 (comphead)
feat: Add support for `abs` #2689 (andygrove)
feat: Support variadic function in CometFuzz #2682 (manuzhang)
feat: CometExecRule refactor: Unify CometNativeExec creation with Serde in CometOperatorSerde trait #2768 (andygrove)
feat: support cot #2755 (psvri)
feat: Add bash script to build and run fuzz testing #2686 (manuzhang)
feat: Add getSupportLevel to CometAggregateExpressionSerde trait #2777 (andygrove)
feat: Add CI check to ensure generated docs are in sync with code #2779 (andygrove)
feat: Add prettier enforcement #2783 (andygrove)
feat: hyperbolic trig functions #2784 (psvri)
feat: [iceberg] Native scan by serializing FileScanTasks to iceberg-rust #2528 (mbutrovich)
Documentation updates:
docs: Add changelog for 0.11.0 release #2585 (mbutrovich)
docs: Improve documentation layout #2587 (andygrove)
docs: Publish 0.11.0 user guide #2589 (andygrove)
docs: Put Comet logo in top nav bar, respect light/dark mode #2591 (andygrove)
docs: Improve main landing page #2593 (andygrove)
docs: Improve site navigation #2597 (andygrove)
docs: Update benchmark results #2596 (andygrove)
docs: Upgrade pydata-sphinx-theme to 0.16.1 #2602 (andygrove)
docs: Fix redirect #2603 (andygrove)
docs: Fix broken image link #2613 (andygrove)
docs: Add FFI docs to contributor guide #2668 (andygrove)
docs: Various documentation updates #2674 (andygrove)
docs: Add supported SortOrder expressions and fix a typo #2694 (andygrove)
docs: Minor docs update for running Spark SQL tests #2712 (andygrove)
docs: Update contributor guide for adding a new expression #2704 (andygrove)
docs: Documentation updates for `LocalTableScan` and `WindowExec` #2742 (andygrove)
docs: Typo fix #2752 (wForget)
docs: Categorize some configs as `testing` and add notes about known time zone issues #2740 (andygrove)
docs: Run prettier on all markdown files #2782 (andygrove)
docs: Ignore prettier formatting for generated tables #2790 (andygrove)
docs: Add new section to contributor guide, explaining how to add a new operator #2758 (andygrove)
Other:
chore: Start 0.12.0 development #2584 (mbutrovich)
chore: Bump Spark from 3.5.6 to 3.5.7 #2574 (cfmcgrady)
chore(deps): bump parquet from 56.0.0 to 56.2.0 in /native #2608 (dependabot[bot])
chore(deps): bump tikv-jemallocator from 0.6.0 to 0.6.1 in /native #2609 (dependabot[bot])
chore(deps): bump tikv-jemalloc-ctl from 0.6.0 to 0.6.1 in /native #2610 (dependabot[bot])
tests: FuzzDataGenerator instead of Parquet-specific generator #2616 (mbutrovich)
chore: Simplify on-heap memory configuration #2599 (andygrove)
Feat: Add sha1 function impl #2471 (kazantsev-maksim)
chore: Refactor Parquet/DataFrame fuzz data generators #2629 (andygrove)
chore: Remove needless from_raw calls #2638 (EmilyMatt)
chore: support DataFusion 50.3.0 #2605 (comphead)
chore(deps): bump actions/upload-artifact from 4 to 5 #2654 (dependabot[bot])
chore(deps): bump cc from 1.2.42 to 1.2.43 in /native #2653 (dependabot[bot])
chore(deps): bump actions/download-artifact from 5 to 6 #2652 (dependabot[bot])
chore: extract comparison into separate tool #2632 (comphead)
chore: Various improvements to `checkSparkAnswer*` methods in `CometTestBase` #2656 (andygrove)
chore: Remove code for unpacking dictionaries prior to FilterExec #2659 (andygrove)
chore: display schema for datasets being compared #2665 (comphead)
chore: Remove `CopyExec` #2663 (andygrove)
chore: Add extended explain plans to stability suite #2669 (andygrove)
chore(deps): bump aws-config from 1.8.8 to 1.8.10 in /native #2677 (dependabot[bot])
chore(deps): bump cc from 1.2.43 to 1.2.44 in /native #2678 (dependabot[bot])
chore: `tpcbench` output `explain` just once and formatted #2679 (comphead)
chore: Add tolerance for `ComparisonTool` #2699 (comphead)
chore: Expand test coverage for `CometWindowsExec` #2711 (comphead)
chore: generate Float/Double NaN #2695 (hsiang-c)
minor: Combine two CI workflows for Spark SQL tests #2727 (andygrove)
chore: Improve framework for specifying that configs can be set with env vars #2722 (andygrove)
chore: Rename `COMET_EXPLAIN_VERBOSE_ENABLED` to `COMET_EXTENDED_EXPLAIN_FORMAT` and change default #2644 (andygrove)
chore: Fallback to Spark for windows functions #2726 (comphead)
chore: Refactor operator serde - part 1 #2738 (andygrove)
Feat: Add CometLocalTableScanExec operator #2735 (kazantsev-maksim)
chore(deps): bump cc from 1.2.44 to 1.2.45 in /native #2750 (dependabot[bot])
chore(deps): bump aws-credential-types from 1.2.8 to 1.2.9 in /native #2751 (dependabot[bot])
chore: Operator serde refactor part 2 #2741 (andygrove)
chore: Fallback to Spark for `array_reverse` for `array<binary>` #2759 (comphead)
chore: [iceberg] test iceberg 1.10.0 #2709 (manuzhang)
chore: Add `docs/comet-*` to rat exclude list #2762 (manuzhang)
Chore: Refactor static invoke exprs #2671 (kazantsev-maksim)
minor: Small refactor for consistent serde for hash aggregate #2764 (andygrove)
minor: Move `operator2Proto` to `CometExecRule` #2767 (andygrove)
chore: various refactoring changes for iceberg [iceberg] #2680 (parthchandra)
chore: Refactor CometExecRule handling of sink operators #2771 (andygrove)
minor: Refactor to move window-specific code from `QueryPlanSerde` to `CometWindowExec` #2780 (andygrove)
chore: Remove many references to `COMET_EXPR_ALLOW_INCOMPATIBLE` #2775 (andygrove)
chore: Remove COMET_EXPR_ALLOW_INCOMPATIBLE config #2786 (andygrove)
chore: check `missingInput` for Comet plan nodes #2795 (comphead)
chore: Finish refactoring expression serde out of `QueryPlanSerde` #2791 (andygrove)
chore: Update docs to fix CI after #2784 #2799 (mbutrovich)
chore: Update q79 golden plan for Spark 4.0 after #2795 #2800 (mbutrovich)
Fix: Fix null handling in CometVector implementations #2643 (cfmcgrady)
Credits
Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.
54 Andy Grove
11 Oleks V
10 dependabot[bot]
9 Matt Butrovich
6 Manu Zhang
3 Fu Chen
3 Kazantsev Maksim
2 Emily Matheys
2 Vrishabh
2 hsiang-c
1 Parth Chandra
1 Zhen Wang
1 rahulbabarwal89
Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.