<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->

# Apache DataFusion Comet: Release Process

This document explains the release process for Apache DataFusion Comet.

## Creating the Release Candidate

This part of the process can be performed by any committer.

Here are the steps, using the 0.1.0 release as an example.

### Create Release Branch

This document assumes that GitHub remotes are set up as follows:

```shell
$ git remote -v
apache	git@github.com:apache/datafusion-comet.git (fetch)
apache	git@github.com:apache/datafusion-comet.git (push)
origin	git@github.com:yourgithubid/datafusion-comet.git (fetch)
origin	git@github.com:yourgithubid/datafusion-comet.git (push)
```

Create a release branch from the latest commit in main and push to the `apache` repo:

```shell
git fetch apache
git checkout main
git reset --hard apache/main
git checkout -b branch-0.1
git push apache branch-0.1
```

Update the `pom.xml` files in the release branch to change the Maven version from `0.1.0-SNAPSHOT` to `0.1.0`.

There is no need to update the Rust crate versions because they will already be `0.1.0`.
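
The version and branch names used above follow a simple pattern, which can be scripted. The following is a minimal sketch, not part of the project's tooling: `release_version` and `release_branch` are hypothetical helpers, and the commented-out Maven command assumes that the `versions-maven-plugin` works with this project's `pom.xml` files, which has not been verified here.

```shell
# Hypothetical helper: derive the release version from a snapshot version
# by stripping a trailing "-SNAPSHOT" suffix.
release_version() {
  printf '%s\n' "${1%-SNAPSHOT}"
}

# Hypothetical helper: derive the release branch name (branch-MAJOR.MINOR)
# from a MAJOR.MINOR.PATCH version.
release_branch() {
  printf 'branch-%s\n' "${1%.*}"
}

release_version 0.1.0-SNAPSHOT   # prints 0.1.0
release_branch 0.1.0             # prints branch-0.1

# Assumption: the versions-maven-plugin can rewrite the pom.xml files.
# mvn versions:set -DnewVersion="$(release_version 0.1.0-SNAPSHOT)" -DgenerateBackupPoms=false
```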

### Update Version in main

Create a PR against the main branch to prepare for developing the next release:

- Update the Rust crate version to `0.2.0`.
- Update the Maven version to `0.2.0-SNAPSHOT` (both in the `pom.xml` files and also in the diff files
  under `dev/diffs`).
- Update the CI scripts under the `.github` directory.
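
The next development version can be derived from the release version. The sketch below is illustrative only: `next_minor_version` is a hypothetical helper that assumes a plain `MAJOR.MINOR.PATCH` version with no leading zeros, and the commented-out `grep` is just one way to look for leftover references to the old snapshot version.

```shell
# Hypothetical helper: bump the minor component of MAJOR.MINOR.PATCH and
# reset the patch component to 0.
next_minor_version() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  printf '%s.%s.0\n' "$major" "$((minor + 1))"
}

next_version=$(next_minor_version 0.1.0)
echo "Rust crate version: ${next_version}"
echo "Maven version:      ${next_version}-SNAPSHOT"

# One way to find remaining references to the old snapshot version:
# grep -rn "0.1.0-SNAPSHOT" dev/diffs
```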

### Generate the Change Log

Generate a change log to cover changes between the previous release and the release branch HEAD by running
the provided `generate-changelog.py` script.

It is recommended that you set up a virtual Python environment and then install the dependencies:

```shell
python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
```

To generate the changelog, set the `GITHUB_TOKEN` environment variable to a valid token and then run the script
providing two commit ids or tags followed by the version number of the release being created. The following
example generates a change log of all changes between the previous version and the current release branch HEAD revision.

```shell
export GITHUB_TOKEN=<your-token-here>
python3 generate-changelog.py 0.0.0 HEAD 0.1.0 > ../changelog/0.1.0.md
```
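
Because the script needs `GITHUB_TOKEN`, it can be convenient to fail early when the variable is missing. The following guard is a sketch only; `require_github_token` is a hypothetical helper and is not provided by the repository.

```shell
# Hypothetical guard: fail with a clear message when GITHUB_TOKEN is unset
# or empty, instead of letting the changelog script fail later.
require_github_token() {
  if [ -z "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN must be set before running generate-changelog.py" >&2
    return 1
  fi
}

# Usage (commented out because it calls the GitHub API):
# require_github_token && python3 generate-changelog.py 0.0.0 HEAD 0.1.0 > ../changelog/0.1.0.md
```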

Create a PR against the _main_ branch to add this change log. Once the PR is approved and merged, cherry-pick the
commit into the release branch.

### Build the jars

#### Setup to do the build

The build process requires Docker. Download the latest Docker Desktop from https://www.docker.com/products/docker-desktop/.
If you have multiple Docker contexts, switch to the Docker Desktop context. For example:

```shell
$ docker context ls
NAME              DESCRIPTION                               DOCKER ENDPOINT                               ERROR
default           Current DOCKER_HOST based configuration   unix:///var/run/docker.sock
desktop-linux     Docker Desktop                            unix:///Users/parth/.docker/run/docker.sock
my_custom_context *                                         tcp://192.168.64.2:2376

$ docker context use desktop-linux
```
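
To check which context is active before switching, the listing can be filtered for the row marked with `*`. This is a small sketch that assumes the `docker context ls` output format shown above; `active_docker_context` is a hypothetical helper.

```shell
# Hypothetical helper: read `docker context ls` output on stdin and print the
# name of the active context, which is the row marked with "*".
active_docker_context() {
  awk 'NR > 1 && $2 == "*" { print $1 }'
}

# Usage (commented out because it requires Docker):
# [ "$(docker context ls | active_docker_context)" = "desktop-linux" ] || docker context use desktop-linux
```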

#### Run the build script

The `build-release-comet.sh` script creates a Docker image for each architecture and uses the image
to build the platform-specific binaries. These builder images are recreated every time the script is run.
The script optionally allows overriding the repository and branch to build the binaries from. Note that
the local git repo is not used to build the binaries, but it is used to build the final uber jar.

```shell
Usage: build-release-comet.sh [options]

This script builds comet native binaries inside a docker image. The image is named
"comet-rm" and will be generated by this script

Options are:

-r [repo]   : git repo (default: https://github.com/apache/datafusion-comet.git)
-b [branch] : git branch (default: release)
-t [tag]    : tag for the spark-rm docker image to use for building (default: "latest").
```

Example:

```shell
cd dev/release && ./build-release-comet.sh && cd ../..
```
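
The usage text lists `release` as the default branch, so you may want to pass the repository and release branch explicitly. The sketch below uses only the `-r` and `-b` options documented above; the `run` wrapper and the `DRY_RUN` variable are hypothetical additions that print the command by default so it can be reviewed before anything is executed.

```shell
# Hypothetical wrapper: print the command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build from the apache repository and the 0.1 release branch.
# Intended to be run from dev/release; set DRY_RUN=0 to actually execute.
run ./build-release-comet.sh -r https://github.com/apache/datafusion-comet.git -b branch-0.1
```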

#### Build output

The build output is installed to a temporary local Maven repository. The build script prints the location of this
repository at the end. This location is required later, when deploying the artifacts to a staging
repository.

### Tag the Release Candidate

Tag the release branch with `0.1.0-rc1` and push the tag to the `apache` repo:

```shell
git fetch apache
git checkout branch-0.1
git reset --hard apache/branch-0.1
git tag 0.1.0-rc1
git push apache 0.1.0-rc1
```

Note that pushing a release candidate tag triggers a GitHub workflow that builds a Docker image and publishes
it to the GitHub Container Registry at https://github.com/apache/datafusion-comet/pkgs/container/datafusion-comet.
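
The release candidate tag name combines the version and the release candidate number. As a rough sketch, the hypothetical `rc_tag` helper below builds the name, and the commented-out command shows one way to check that the tag does not already exist on the `apache` remote before pushing.

```shell
# Hypothetical helper: build the release candidate tag name from a version
# and a release candidate number, e.g. 0.1.0 and 1 give 0.1.0-rc1.
rc_tag() {
  printf '%s-rc%s\n' "$1" "$2"
}

tag=$(rc_tag 0.1.0 1)
echo "Release candidate tag: ${tag}"

# Optional check (commented out because it needs network access):
# git ls-remote --exit-code --tags apache "refs/tags/${tag}" && echo "tag already exists" >&2
```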

## Publishing the Release Candidate

Most of this part of the process can only be performed by a PMC member.

### Publish the maven artifacts

#### Setup maven

##### One time project setup

Set up the project in the ASF Nexus Repository by following the instructions at https://infra.apache.org/publishing-maven-artifacts.html

##### Release Manager Setup

Set up your development environment by following the instructions at https://infra.apache.org/publishing-maven-artifacts.html

##### Build and publish a release candidate to Nexus

The script `publish-to-maven.sh` will publish the artifacts created by the `build-release-comet.sh` script.
The artifacts will be signed using the gpg key of the release manager and uploaded to the maven staging repository.

Installed GPG keys can be listed with `gpg --list-keys`. The GPG key is identified by a 40-character hex string.

Note: This script requires `xmllint` to be installed. On macOS, `xmllint` is available by default. On other
platforms, it can be installed as follows:

- On Ubuntu: `apt-get install -y libxml2-utils`
- On RedHat: `yum install -y xmlstarlet`
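
A quick pre-flight check can confirm that the required tools are on the `PATH` before running the publish script. This is a minimal sketch: `missing_tools` is a hypothetical helper, and including `gpg` in the check is an assumption based on the signing step described above.

```shell
# Hypothetical pre-flight check: print the name of each required tool that
# cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumption: the publish step needs xmllint and gpg.
missing=$(missing_tools xmllint gpg)
if [ -n "$missing" ]; then
  echo "Missing required tools: $missing" >&2
fi
```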

```shell

/comet:$./dev/release/publish-to-maven.sh -h
usage: publish-to-maven.sh options

Publish signed artifacts to Maven.

Options
-u ASF_USERNAME - Username of ASF committer account
-r LOCAL_REPO - path to temporary local maven repo (created and written to by 'build-release-comet.sh')

The following will be prompted for -
ASF_PASSWORD - Password of ASF committer account
GPG_KEY - GPG key used to sign release artifacts
GPG_PASSPHRASE - Passphrase for GPG key
```

Example:

```shell
/comet:$./dev/release/publish-to-maven.sh -u release_manager_asf_id  -r /tmp/comet-staging-repo-VsYOX
ASF Password :
GPG Key (Optional):
GPG Passphrase :
Creating Nexus staging repository
...
```

In the Nexus repository UI (https://repository.apache.org/) locate and verify the artifacts in
staging (https://central.sonatype.org/publish/release/#locate-and-examine-your-staging-repository).

If the artifacts appear to be correct, then close and release the repository so it is made visible (this should
actually happen automatically when running the script).

### Create the Release Candidate Tarball

Run the `create-tarball.sh` script on the release candidate tag (`0.1.0-rc1`) to create the source tarball and upload
it to the dev subversion repository:

```shell
./dev/release/create-tarball.sh 0.1.0 1
```

This will generate an email template for starting the vote.

### Start an Email Voting Thread

Send the email that is generated in the previous step to `dev@datafusion.apache.org`.

## Publishing Binary Releases

Once the vote passes, we can publish the source and binary releases.

### Publishing Source Tarball

Run the release-tarball script to move the tarball to the release subversion repository.

```shell
./dev/release/release-tarball.sh 0.1.0 1
```

### Create a release in the GitHub repository

Go to https://github.com/apache/datafusion-comet/releases and create a release for the release tag, and paste the 
changelog in the description.

### Publishing Maven Artifacts

Promote the Maven artifacts from staging to production by visiting https://repository.apache.org/#stagingRepositories
and selecting the staging repository and then clicking the "release" button.

### Publishing Crates

Publish the `datafusion-comet-spark-expr` crate to crates.io so that other Rust projects can leverage the
Spark-compatible operators and expressions outside of Spark.

### Push a release tag to the repo

Push a release tag (`0.1.0`) to the `apache` repository.

```shell
git fetch apache
git checkout 0.1.0-rc1
git tag 0.1.0
git push apache 0.1.0
```

Note that pushing a release tag triggers a GitHub workflow that builds a Docker image and publishes
it to the GitHub Container Registry at https://github.com/apache/datafusion-comet/pkgs/container/datafusion-comet.

Reply to the vote thread to close the vote and announce the release.

## Update released version number in documentation

- The documentation provides direct links to the jar files in Maven, which need to reference the new version.
- The Kubernetes page needs updating once the Docker image has been published to the GitHub Container Registry.

## Post Release Admin

Register the release with the [Apache Reporter Service](https://reporter.apache.org/addrelease.html?datafusion) using
a version such as `COMET-0.1.0`.

### Delete old RCs and Releases

See the ASF documentation on [when to archive](https://www.apache.org/legal/release-policy.html#when-to-archive)
for more information.

#### Deleting old release candidates from `dev` svn

Release candidates should be deleted once the release is published.

Get a list of DataFusion Comet release candidates:

```shell
svn ls https://dist.apache.org/repos/dist/dev/datafusion | grep comet
```

Delete a release candidate:

```shell
svn delete -m "delete old DataFusion Comet RC" https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-comet-0.1.0-rc1/
```
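
When a release has several candidates, the listing can be filtered to the candidates for one version before deleting them. The sketch below assumes the directory naming seen in the example above (`apache-datafusion-comet-<version>-rc<n>/`); `comet_rcs_for_version` is a hypothetical helper, and the commented-out loop only prints the `svn delete` commands so they can be reviewed first.

```shell
# Hypothetical filter: read `svn ls` output on stdin and print the Comet
# release candidate directories for the version given as $1, without the
# trailing slash.
comet_rcs_for_version() {
  while IFS= read -r entry; do
    case "$entry" in
      "apache-datafusion-comet-$1-rc"*) printf '%s\n' "${entry%/}" ;;
    esac
  done
}

# Usage (commented out because it needs access to dist.apache.org):
# svn ls https://dist.apache.org/repos/dist/dev/datafusion | comet_rcs_for_version 0.1.0 |
#   while IFS= read -r rc; do
#     echo svn delete -m "delete old DataFusion Comet RC" "https://dist.apache.org/repos/dist/dev/datafusion/${rc}/"
#   done
```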

#### Deleting old releases from `release` svn

Only the latest release should be available. Delete old releases after publishing the new release.

Get a list of DataFusion Comet releases:

```shell
svn ls https://dist.apache.org/repos/dist/release/datafusion | grep comet
```

Delete a release:

```shell
svn delete -m "delete old DataFusion Comet release" https://dist.apache.org/repos/dist/release/datafusion/datafusion-comet-0.0.0
```

## Post Release Activities

Writing a blog post about the release is a great way to generate more interest in the project. We typically create a
Google document where the community can collaborate on a blog post. Once the content is agreed upon, a PR can be
created against the [datafusion-site](https://github.com/apache/datafusion-site) repository to add the blog post. Any
contributor can drive this process.
