We welcome and encourage contributions of all kinds, such as:
- Tickets with issue reports or feature requests
- Documentation improvements
- Code (PR or PR Review)
In addition to submitting new PRs, we have a healthy tradition of community members helping review each other's PRs. Doing so is a great way to help the community as well as get more familiar with Rust and the relevant codebases.
Install the Rust toolchain:
https://www.rust-lang.org/tools/install
Also, make sure your Rust toolchain is up-to-date, because we always use the latest stable version of Rust to test this project.
rustup update stable
This is a standard cargo project with workspaces. To build it, you need to have Rust and cargo installed:
cargo build
You can also use Rust's official docker image:
docker run --rm -v $(pwd):/arrow-rs -it rust /bin/bash -c "cd /arrow-rs && rustup component add rustfmt && cargo build"
The command above assumes that you are in the root directory of the project, not in the same directory as this README.md.
You can also compile a specific workspace member:
cd arrow && cargo build
Before running tests and examples, it is necessary to set up the local development environment.
The tests rely on test data that is contained in git submodules.
To pull down this data run the following:
git submodule update --init
This populates data in two git submodules:
- ../parquet-testing/data (sourced from https://github.com/apache/parquet-testing.git)
- ../testing (sourced from https://github.com/apache/arrow-testing)
By default, cargo test will look for these directories at their standard location. The following environment variables can be used to override the location:
# Optionally specify a different location for test data
export PARQUET_TEST_DATA=$(cd ../parquet-testing/data; pwd)
export ARROW_TEST_DATA=$(cd ../testing/data; pwd)
From here on, this is a pure Rust project and cargo can be used to run tests, benchmarks, docs and examples as usual.
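For the optional test data override described above, the following is a minimal sketch of a slightly more defensive way to set the variables. With `cd dir; pwd`, a failed `cd` still prints the current directory, so a missing submodule checkout would silently export the wrong path. The helper names `abs_dir` and `export_test_data` are hypothetical and not part of this repository; the sketch assumes it is run from the same directory as the export commands above.

```shell
# Hypothetical helper (not part of the repository): print the absolute path
# of a directory, and fail if the directory does not exist.
abs_dir() {
    (cd "$1" 2>/dev/null && pwd)
}

# Export NAME=<absolute path of DIR> only when DIR exists. Otherwise warn
# and leave the variable untouched.
export_test_data() {
    name=$1
    dir=$2
    if path=$(abs_dir "$dir"); then
        export "$name=$path"
    else
        echo "$dir not found; try: git submodule update --init" >&2
        return 1
    fi
}

# Same variables and relative paths as above; a missing directory only warns.
export_test_data PARQUET_TEST_DATA ../parquet-testing/data || true
export_test_data ARROW_TEST_DATA ../testing/data || true
```

If a warning is printed, the corresponding variable stays unset and the submodules most likely still need to be initialized.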
Run tests using the Rust standard cargo test command:
# run all tests.
cargo test
# run only tests for the arrow crate
cargo test -p arrow
Our CI uses rustfmt to check code formatting. Before submitting a PR, be sure to run the following and check for formatting issues:
cargo +stable fmt --all -- --check
We recommend using clippy for checking lints during development. While we do not yet enforce clippy checks, we recommend not introducing new clippy errors or warnings.
Run the following to check for clippy lints.
cargo clippy
If you use Visual Studio Code with the rust-analyzer plugin, you can enable clippy to run each time you save a file. See https://users.rust-lang.org/t/how-to-use-clippy-in-vs-code-with-rust-analyzer/41881.
One of the concerns with clippy is that it often produces a lot of false positives, or that some recommendations may hurt readability. We do not have a policy of which lints are ignored, but if you disagree with a clippy lint, you may disable the lint and briefly justify it.
Search for allow(clippy:: in the codebase to identify lints that are ignored/allowed. We currently prefer ignoring lints on the lowest unit possible.
- If you are introducing a line that returns a lint warning or error, you may disable the lint on that line.
- If you have several lints on a function or module, you may disable the lint on the function or module.
- If a lint is pervasive across multiple modules, you may disable it at the crate level.
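As a rough illustration of the search suggested above, the sketch below lists where clippy lints are allowed and counts how often each lint is allowed. The function names `list_allowed_lints` and `count_allowed_lints` are hypothetical, and the sketch assumes a grep that supports `-r`, `-o` and `--include` (GNU and BSD grep both do).

```shell
# Hypothetical helper: list every line in *.rs files under DIR (default: .)
# that allows a clippy lint, with file name and line number.
list_allowed_lints() {
    grep -rn --include='*.rs' 'allow(clippy::' "${1:-.}"
}

# Hypothetical helper: count how many times each clippy lint name appears
# on those lines, most frequent first.
count_allowed_lints() {
    grep -rh --include='*.rs' 'allow(clippy::' "${1:-.}" \
        | grep -o 'clippy::[a-z_0-9]*' \
        | sort | uniq -c | sort -rn
}

# Usage from the root directory of the project:
#   list_allowed_lints .
#   count_allowed_lints arrow
```

A lint that shows up many times within one module may be a candidate for a module-level allow, following the guidance above.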
A git pre-commit hook can automate various checks and formatting steps before each commit.
The following steps assume you are in the root directory of the project.
First check if the file already exists:
ls -l .git/hooks/pre-commit
If the file already exists, check the link target or the file content before replacing it, to avoid overwriting it by mistake. If it does not exist, symlink pre-commit.sh to .git/hooks/pre-commit:
ln -s ../../pre-commit.sh .git/hooks/pre-commit
To commit without running the checks, run git commit with --no-verify:
git commit --no-verify -m "... commit message ..."
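The check-then-link steps for installing the hook can be combined into one small function. This is only a sketch: `install_pre_commit_hook` is a hypothetical name, and it assumes the current directory is the root of the project with pre-commit.sh present, as in the steps above.

```shell
# Hypothetical helper: symlink pre-commit.sh as the git pre-commit hook,
# but never overwrite a hook that is already installed.
install_pre_commit_hook() {
    hook=".git/hooks/pre-commit"
    # -L also catches a dangling symlink, which -e alone would miss.
    if [ -e "$hook" ] || [ -L "$hook" ]; then
        echo "$hook already exists; inspect it before replacing:" >&2
        ls -l "$hook" >&2
        return 1
    fi
    # The link target is relative to .git/hooks, hence ../../
    ln -s ../../pre-commit.sh "$hook"
}

# Usage from the root directory of the project:
#   install_pre_commit_hook
```

Running the function a second time leaves the existing hook untouched and returns a non-zero status.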