Debugging and benchmarking

Benchmarking with `hyperfine`

We use hyperfine (https://github.com/sharkdp/hyperfine) to benchmark Pants, especially to compare runs before and after a change and measure its impact.

When benchmarking, decide whether you care about cold-cache performance, warm-cache performance, or both. For cold-cache runs, pass --no-pantsd --no-local-cache to Pants. For warm-cache runs, use hyperfine's --warmup=1 option.

For example:

❯ hyperfine --warmup=1 --runs=5 'pants list ::'
❯ hyperfine --runs=5 'pants --no-pantsd --no-local-cache lint ::'
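The two modes above can be wrapped in a small helper. The following is a minimal sketch, not part of Pants: the function name pants_bench_cmd is hypothetical, and it only prints the hyperfine command line so that the command can be inspected before it is run.

```shell
# Hypothetical helper (not part of Pants): print a hyperfine command line
# for benchmarking a Pants invocation with a warm or a cold cache.
# Usage: pants_bench_cmd warm|cold <pants args...>
# Assumption: the Pants arguments contain no single quotes.
pants_bench_cmd() {
  mode="$1"
  shift
  case "$mode" in
    warm)
      # Warm cache: let hyperfine do one untimed warmup run first.
      printf "hyperfine --warmup=1 --runs=5 'pants %s'\n" "$*"
      ;;
    cold)
      # Cold cache: disable pantsd and the local cache on every run.
      printf "hyperfine --runs=5 'pants --no-pantsd --no-local-cache %s'\n" "$*"
      ;;
    *)
      echo "usage: pants_bench_cmd warm|cold <pants args...>" >&2
      return 2
      ;;
  esac
}
```

To run the printed command rather than just display it, pass it to eval, for example: eval "$(pants_bench_cmd cold lint ::)".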

Profiling with py-spy

py-spy is a sampling profiler that can also be used to compare the impact of a change before and after: https://github.com/benfred/py-spy.

To profile with py-spy:

Activate Pants' development venv
source ~/.cache/pants/pants_dev_deps/<your platform dir>/bin/activate
Add Pants' code to Python's path
export PYTHONPATH=src/pants:$PYTHONPATH
Run Pants with py-spy (be sure to disable pantsd)
py-spy record --subprocesses -- python -m pants.bin.pants_loader --no-pantsd <pants args>

The default output is a flamegraph. py-spy can also output speedscope (https://github.com/jlfwong/speedscope) JSON with the --format speedscope flag. The resulting file can be uploaded to https://www.speedscope.app/ which provides a per-process, interactive, detailed UI.

To also profile the Rust code, pass the --native flag to py-spy. The resulting output will then contain frames from Pants' Rust code.
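The py-spy options described in this section can be combined in one invocation. The following is a minimal sketch under stated assumptions: the function name pants_pyspy_cmd and its --speedscope shorthand are hypothetical, the venv is assumed to be activated and PYTHONPATH already exported as in the steps above, and the function only prints the command line.

```shell
# Hypothetical helper (not part of Pants or py-spy): print a py-spy command
# line for profiling Pants.
# Usage: pants_pyspy_cmd [--speedscope] [--native] -- <pants args...>
#   --speedscope  shorthand here for py-spy's "--format speedscope"
#   --native      passed through to py-spy to include Rust frames
pants_pyspy_cmd() {
  extra=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --speedscope) extra="$extra --format speedscope"; shift ;;
      --native)     extra="$extra --native"; shift ;;
      --)           shift; break ;;
      *)            break ;;
    esac
  done
  # pantsd is always disabled, as required when profiling with py-spy.
  printf 'py-spy record --subprocesses%s -- python -m pants.bin.pants_loader --no-pantsd %s\n' "$extra" "$*"
}
```

For example, pants_pyspy_cmd --speedscope --native -- list :: prints a command that records speedscope JSON including native frames.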

Debugging `rule` code with a debugger

Running Pants with the PANTS_DEBUG environment variable set uses debugpy (https://github.com/microsoft/debugpy) to start a Debug-Adapter server (https://microsoft.github.io/debug-adapter-protocol/) that waits for a client connection before running Pants.

You can connect any Debug-Adapter-compliant editor (such as VS Code) as a client, then set breakpoints, inspect variables, run code in a REPL, and break on exceptions in your rule code.

NOTE: PANTS_DEBUG doesn't work with the Pants daemon (pantsd), so --no-pantsd must be specified.
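Because forgetting --no-pantsd is an easy mistake, a wrapper can check for the flag before starting a debug session. This is a minimal sketch: the function name check_debug_args is hypothetical, and the value 1 for PANTS_DEBUG in the usage comment is an assumption (the text above only says the variable must be set).

```shell
# Hypothetical guard (not part of Pants): succeed only if the given Pants
# arguments include --no-pantsd, which PANTS_DEBUG requires.
check_debug_args() {
  for arg in "$@"; do
    if [ "$arg" = "--no-pantsd" ]; then
      return 0
    fi
  done
  echo "PANTS_DEBUG requires --no-pantsd" >&2
  return 1
}

# Possible usage inside a wrapper script (PANTS_DEBUG=1 is an assumed value):
#   check_debug_args "$@" && PANTS_DEBUG=1 pants "$@"
```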

Identifying the impact of Python's GIL (on macOS)

Obtaining Full Thread Backtraces

Pants runs as a Python program that calls into a native Rust library. When debugging locking and deadlock issues, it is useful to capture dumps of the thread stacks to figure out where a deadlock may be occurring.

One-time setup:

Ensure that gdb is installed.
Ubuntu: sudo apt install gdb
Ensure that the kernel is configured to allow debuggers to attach to processes that are not in the same parent/child process hierarchy.
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
To make the change permanent, add a file to /etc/sysctl.d named 99-ptrace.conf with contents kernel.yama.ptrace_scope = 0. Note: This is a security exposure if you are not normally debugging processes across the process hierarchy.
Ensure that the debug info for your system Python binary is installed.
Ubuntu: sudo apt install python3-dbg
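Before attaching gdb, it can help to confirm that the ptrace setting from the setup steps is in effect. This is a minimal sketch: the function name ptrace_attach_allowed is hypothetical, and the optional path argument exists only so the check can be exercised against a file other than the real kernel setting.

```shell
# Hypothetical check (not part of Pants): succeed if ptrace_scope is 0,
# meaning debuggers may attach to processes outside their own
# parent/child hierarchy.
# Usage: ptrace_attach_allowed [path-to-ptrace_scope-file]
ptrace_attach_allowed() {
  scope_file="${1:-/proc/sys/kernel/yama/ptrace_scope}"
  if [ ! -r "$scope_file" ]; then
    echo "cannot read $scope_file" >&2
    return 2
  fi
  # Any value other than 0 restricts attaching in some way.
  [ "$(cat "$scope_file")" = "0" ]
}
```

If the function fails with status 1, rerun the echo 0 | sudo tee command from the setup steps before invoking gdb.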

Dumping thread stacks:

Find the process ID of the running Pants process (which may be the pantsd process if pantsd is enabled).
Run: ps -ef | grep pants
Invoke gdb with the python binary and the process ID:
Run: gdb /path/to/python/binary PROCESS_ID
Enable logging to write the thread dump to gdb.txt: set logging on
Dump all thread backtraces: thread apply all bt
If you use pyenv to manage your Python install, a gdb script will exist in the same directory as the Python binary. Source it into gdb:
source ~/.pyenv/versions/3.8.5/bin/python3.8-gdb.py (if using version 3.8.5)
Dump all Python stacks: thread apply all py-bt
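The native thread dump can also be captured without an interactive session by using gdb's -batch and -ex options. This is a minimal sketch: the function name pants_gdb_dump_cmd is hypothetical, the Python binary path is assumed to contain no spaces, and the function only prints the command line.

```shell
# Hypothetical helper (not part of Pants): print a non-interactive gdb
# command that dumps all native thread backtraces of a process.
# Usage: pants_gdb_dump_cmd /path/to/python/binary PROCESS_ID
pants_gdb_dump_cmd() {
  python_bin="$1"
  pid="$2"
  # Reject a missing or non-numeric process ID.
  case "$pid" in
    ''|*[!0-9]*)
      echo "usage: pants_gdb_dump_cmd /path/to/python/binary PROCESS_ID" >&2
      return 2
      ;;
  esac
  printf "gdb -batch -ex 'thread apply all bt' %s %s\n" "$python_bin" "$pid"
}
```

In batch mode gdb writes the backtraces to standard output and exits, so the output can be redirected to a file instead of relying on set logging on.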