Debugging and benchmarking


Benchmarking with hyperfine

We use hyperfine to benchmark, especially comparing before and after to see the impact of a change: https://github.com/sharkdp/hyperfine.

When benchmarking, decide whether you care about cold-cache performance, warm-cache performance, or both. For cold-cache runs, pass --no-pantsd --no-local-cache to Pants. For warm-cache runs, use hyperfine's --warmup=1 option.

For example:

❯ hyperfine --warmup=1 --runs=5 'pants list ::'
❯ hyperfine --runs=5 'pants --no-pantsd --no-local-cache lint ::'
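For a before/after comparison, it can help to save each run's results to a file. The following is a minimal sketch, not part of Pants itself: `bench` is a hypothetical helper name, and it relies on hyperfine's --export-json option to write the timings to a file named after the label you pass in.

```shell
# bench LABEL: hypothetical helper that benchmarks a warm-cache `pants list ::`
# and saves hyperfine's results to LABEL.json for later comparison.
bench() {
  hyperfine --warmup=1 --runs=5 --export-json "$1.json" 'pants list ::'
}

# Example usage (run manually):
#   bench before    # on the commit without your change
#   bench after     # on the commit with your change
```

Running the helper once before the change and once after leaves two JSON files whose mean and standard deviation can be compared side by side.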

Profiling with py-spy

py-spy is a sampling profiler that can also be used to compare the impact of a change before and after: https://github.com/benfred/py-spy.

To profile with py-spy:

  1. Activate Pants' development venv:
     source ~/.cache/pants/pants_dev_deps/<your platform dir>/bin/activate
  2. Add Pants' code to Python's path:
     export PYTHONPATH=src/pants:$PYTHONPATH
  3. Run Pants with py-spy (be sure to disable pantsd):
     py-spy record --subprocesses -- python -m pants.bin.pants_loader --no-pantsd <pants args>

The default output is a flamegraph. py-spy can also output speedscope (https://github.com/jlfwong/speedscope) JSON with the --format speedscope flag. The resulting file can be uploaded to https://www.speedscope.app/ which provides a per-process, interactive, detailed UI.

To also profile the Rust code, pass the --native flag to py-spy. The resulting output will then contain frames from Pants' Rust code.
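The flags described above can be combined in one invocation. This is a sketch under a few assumptions: `profile_pants` is a hypothetical helper name, the venv is already activated with PYTHONPATH set as in the steps above, and your platform's py-spy build supports --native (drop that flag if it does not).

```shell
# profile_pants OUTPUT ARGS...: hypothetical helper that records a speedscope
# profile of a Pants run, including subprocesses and native (Rust) frames.
profile_pants() {
  local output="$1"
  shift
  # --output names the file to write; everything after `--` is the profiled command.
  py-spy record --format speedscope --output "$output" --subprocesses --native -- \
    python -m pants.bin.pants_loader --no-pantsd "$@"
}

# Example usage (run manually):
#   profile_pants profile.speedscope.json list ::
```

The resulting file can then be loaded into https://www.speedscope.app/.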

Debugging rule code with a debugger

Running pants with the PANTS_DEBUG environment variable set will use debugpy (https://github.com/microsoft/debugpy) to start a Debug-Adapter server (https://microsoft.github.io/debug-adapter-protocol/) which will wait for a client connection before running Pants.

You can connect any Debug-Adapter-compliant editor (such as VS Code) as a client, then set breakpoints, inspect variables, run code in a REPL, and break on exceptions in your rule code.

NOTE: PANTS_DEBUG doesn't work with the pants daemon, so --no-pantsd must be specified.
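As a concrete illustration, the two requirements above (the environment variable and --no-pantsd) can be wrapped in a small helper. This is only a sketch: `debug_pants` is a hypothetical name, and the value `1` is an assumption, since the text above only says the variable must be set.

```shell
# debug_pants ARGS...: hypothetical helper that starts Pants with PANTS_DEBUG set
# and pantsd disabled, so it waits for a debugger client before running.
debug_pants() {
  # The value 1 is an assumption; the variable only needs to be set.
  PANTS_DEBUG=1 pants --no-pantsd "$@"
}

# Example usage (run manually, then attach your editor):
#   debug_pants list ::
```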

Identifying the impact of Python's GIL (on macOS)

Obtaining Full Thread Backtraces

Pants runs as a Python program that calls into a native Rust library. When debugging locking and deadlock issues, it is useful to capture dumps of the thread stacks to figure out where a deadlock may be occurring.

One-time setup:

  1. Ensure that gdb is installed.
     Ubuntu: sudo apt install gdb
  2. Ensure that the kernel is configured to allow debuggers to attach to processes that are not in the same parent/child process hierarchy.
     echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
     To make the change permanent, add a file to /etc/sysctl.d named 99-ptrace.conf with contents kernel.yama.ptrace_scope = 0. Note: This is a security exposure if you are not normally debugging processes across the process hierarchy.
  3. Ensure that the debug info for your system Python binary is installed.
     Ubuntu: sudo apt install python3-dbg
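Before attaching a debugger, it may be worth confirming that the ptrace setting took effect. The following sketch reads the same /proc file that the setup step writes to; `check_ptrace_scope` is a hypothetical helper name, and the optional file argument exists only to make the helper easy to try against a test file.

```shell
# check_ptrace_scope [FILE]: hypothetical helper that reports whether the kernel's
# Yama ptrace_scope setting allows gdb to attach to unrelated processes.
check_ptrace_scope() {
  local file="${1:-/proc/sys/kernel/yama/ptrace_scope}"
  if [ ! -r "$file" ]; then
    echo "cannot read $file (Yama may not be enabled on this kernel)"
    return 2
  fi
  local value
  value="$(cat "$file")"
  if [ "$value" = "0" ]; then
    echo "ptrace_scope is 0: debuggers can attach across the process hierarchy"
  else
    echo "ptrace_scope is $value: attaching by process ID may be refused"
    return 1
  fi
}

# Example usage (run manually):
#   check_ptrace_scope
```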

Dumping thread stacks:

  1. Find the running Pants process (which may be pantsd if pantsd is enabled) and note its process ID.
     Run: ps -ef | grep pants
  2. Invoke gdb with the Python binary and the process ID.
     Run: gdb /path/to/python/binary PROCESS_ID
  3. Enable logging to write the thread dump to gdb.txt: set logging on
  4. Dump all thread backtraces: thread apply all bt
  5. If you use pyenv to manage your Python install, a gdb script will exist in the same directory as the Python binary. Source it into gdb:
     source ~/.pyenv/versions/3.8.5/bin/python3.8-gdb.py (if using version 3.8.5)
  6. Dump all Python stacks: thread apply all py-bt
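The native backtrace dump can also be captured non-interactively, which is convenient when a deadlock is hard to reproduce. This is a sketch rather than an official workflow: `dump_threads` is a hypothetical helper name, it uses gdb's -batch and -ex options, and it redirects gdb's output to a file instead of using `set logging on`. It covers only the native `bt` output, not the Python-level `py-bt` stacks.

```shell
# dump_threads PYTHON_BINARY PROCESS_ID [OUTPUT]: hypothetical helper that attaches
# gdb to a running process, dumps all native thread backtraces, and then exits.
dump_threads() {
  # -batch makes gdb exit after running the -ex command; output defaults to gdb.txt.
  gdb -batch -ex 'thread apply all bt' "$1" "$2" > "${3:-gdb.txt}" 2>&1
}

# Example usage (run manually, with the process ID found via ps):
#   dump_threads /path/to/python/binary PROCESS_ID
```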