Building Comet From Source
It is sometimes preferable to build from source for a specific platform.
Using a Published Source Release
This documentation is for the current development version of Comet. Published source releases are only available for released versions. To build this development version of Comet, see the section below on building from the GitHub repository.
Official source releases can be downloaded from https://dist.apache.org/repos/dist/release/datafusion/
# Pick the latest released version (a -SNAPSHOT version has no published tarball)
export COMET_VERSION=0.17.0-SNAPSHOT
# Download the tarball
curl -O "https://dist.apache.org/repos/dist/release/datafusion/datafusion-comet-${COMET_VERSION}/apache-datafusion-comet-${COMET_VERSION}.tar.gz"
# Unpack
tar -xzf "apache-datafusion-comet-${COMET_VERSION}.tar.gz"
cd "apache-datafusion-comet-${COMET_VERSION}"
Build:
make release-nogit PROFILES="-Pspark-4.1"
Building from the GitHub repository
Clone the repository:
git clone https://github.com/apache/datafusion-comet.git
Build Comet for a specific Spark version:
cd datafusion-comet
make release PROFILES="-Pspark-4.1"
Note that the project builds for Scala 2.13 by default but can be built for Scala 2.12 using an additional profile:
make release PROFILES="-Pspark-3.5 -Pscala-2.12"
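The profile strings used in these commands follow a simple pattern, which can be sketched as a small shell function. This is only an illustration: `comet_profiles` is a hypothetical helper, not part of the Comet build, and it assumes that profile names always take the form `-Pspark-<version>` and `-Pscala-<version>` as in the commands shown here. It does not check whether a given Spark and Scala combination is supported.

```shell
# Hypothetical helper (not part of Comet): compose the PROFILES value from a
# Spark version and an optional Scala version. Assumes the -Pspark-X and
# -Pscala-Y naming pattern seen in the documented commands.
comet_profiles() {
  spark_version="$1"
  scala_version="${2:-}"
  profiles="-Pspark-${spark_version}"
  # Only add a Scala profile when one is requested; otherwise the build
  # uses its default Scala version.
  if [ -n "$scala_version" ]; then
    profiles="${profiles} -Pscala-${scala_version}"
  fi
  printf '%s\n' "$profiles"
}

# Print the resulting command instead of running it
echo "make release PROFILES=\"$(comet_profiles 3.5 2.12)\""
```

Printing the command first makes it easy to confirm the profile string before starting a long build.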
To build Comet from the source distribution in an isolated environment without access to github.com, git-commit-id-maven-plugin must be disabled; otherwise the build fails with errors about missing git access. In that case, use the release-nogit target:
make release-nogit PROFILES="-Pspark-4.1"
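The choice between the two targets can be sketched as a check on whether the source tree is a git checkout. This is a rough illustration under stated assumptions: `comet_make_target` is a hypothetical helper, and treating "inside a git work tree" as the condition for using `release` is an assumption, not something the Comet build itself does.

```shell
# Hypothetical helper (not part of Comet): choose the make target depending on
# whether the given directory is inside a git work tree. Falls back to
# release-nogit when git is missing or the directory is not a checkout.
comet_make_target() {
  dir="${1:-.}"
  if command -v git >/dev/null 2>&1 &&
     git -C "$dir" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    echo "release"
  else
    echo "release-nogit"
  fi
}

# Print the resulting command for the current directory instead of running it
echo "make $(comet_make_target .) PROFILES=\"-Pspark-4.1\""
```

An unpacked source tarball is not a git work tree, so this sketch would select `release-nogit` there.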