Debugging and benchmarking
Benchmarking with hyperfine
We use hyperfine to benchmark, especially to compare before and after and see the impact of a change: https://github.com/sharkdp/hyperfine.
When benchmarking, you must decide whether you care about cold-cache performance, warm-cache performance, or both. If cold, use --no-pantsd --no-local-cache. If warm, use hyperfine's option --warmup=1.
For example:
❯ hyperfine --warmup=1 --runs=5 'pants list ::'
❯ hyperfine --runs=5 'pants --no-pantsd --no-local-cache lint ::'
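As a rough sketch, the cold/warm choice can be wrapped in a small shell function so that both modes are benchmarked the same way each time. The function name pants_bench and its mode argument are hypothetical conveniences, not part of Pants or hyperfine; the flags are the ones shown above.

```shell
# Hypothetical helper (not part of Pants or hyperfine).
# Usage: pants_bench <cold|warm> <pants goal and args...>
pants_bench() {
  local mode="$1"
  shift
  case "$mode" in
    cold)
      # Cold cache: bypass pantsd and the local cache on every run.
      hyperfine --runs=5 "pants --no-pantsd --no-local-cache $*"
      ;;
    warm)
      # Warm cache: one untimed warmup run populates the caches first.
      hyperfine --warmup=1 --runs=5 "pants $*"
      ;;
    *)
      echo "usage: pants_bench <cold|warm> <pants args...>" >&2
      return 2
      ;;
  esac
}
```

With this definition, pants_bench warm list :: and pants_bench cold lint :: correspond to the two example commands above.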
Profiling with py-spy
py-spy is a sampling profiler which can also be used to compare the impact of a change before and after: https://github.com/benfred/py-spy.
To profile with py-spy:
- Activate Pants' development venv
source ~/.cache/pants/pants_dev_deps/<your platform dir>/bin/activate
- Add Pants' code to Python's path
export PYTHONPATH=src/python:$PYTHONPATH
- Run Pants with py-spy (be sure to disable pantsd)
py-spy record --subprocesses -- python -m pants.bin.pants_loader --no-pantsd <pants args>
The default output is a flamegraph. py-spy can also output speedscope (https://github.com/jlfwong/speedscope) JSON with the --format speedscope flag. The resulting file can be uploaded to https://www.speedscope.app/, which provides a per-process, interactive, detailed UI.
Additionally, to profile the Rust code, the --native flag can be passed to py-spy as well. The resulting output will contain frames from Pants' Rust code.
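The py-spy invocation and its optional flags can be collected into one helper, sketched below. It assumes the development venv is already activated and PYTHONPATH is already set as in the steps above. The function name pants_profile, its two leading arguments, and the PY_SPY override variable are hypothetical and exist only for illustration.

```shell
# Hypothetical helper (not part of Pants or py-spy).
# Usage: pants_profile <flamegraph|speedscope> <yes|no: include native frames> <pants args...>
pants_profile() {
  local format="$1" native="$2"
  shift 2
  local extra=""
  if [ "$format" = "speedscope" ]; then
    # Emit speedscope JSON instead of the default flamegraph.
    extra="--format speedscope"
  fi
  if [ "$native" = "yes" ]; then
    # Include frames from native (Rust) code.
    extra="$extra --native"
  fi
  # $extra is intentionally unquoted so that it splits into separate flags.
  # PY_SPY is a hypothetical override for the py-spy executable.
  "${PY_SPY:-py-spy}" record --subprocesses $extra -- \
    python -m pants.bin.pants_loader --no-pantsd "$@"
}
```

For instance, pants_profile speedscope yes list :: would record a speedscope profile that includes native frames for the list goal.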
Debugging rule code with a debugger
Running pants with the PANTS_DEBUG environment variable set will use debugpy (https://github.com/microsoft/debugpy) to start a Debug-Adapter server (https://microsoft.github.io/debug-adapter-protocol/) which will wait for a client connection before running Pants.
You can connect any Debug-Adapter-compliant editor (such as VSCode) as a client, and use breakpoints, inspect variables, run code in a REPL, and break on exceptions in your rule code.
NOTE: PANTS_DEBUG doesn't work with the Pants daemon, so --no-pantsd must be specified.
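Because forgetting --no-pantsd is an easy mistake, a small wrapper can check for it before starting the debug session. This is only a sketch: the function name pants_debug is hypothetical, and the value 1 for PANTS_DEBUG is an assumption (the text above only says the variable must be set).

```shell
# Hypothetical wrapper (not part of Pants).
# Usage: pants_debug --no-pantsd <pants goal and args...>
pants_debug() {
  case " $* " in
    *" --no-pantsd "*)
      # Flag present, safe to continue.
      ;;
    *)
      echo "PANTS_DEBUG doesn't work with pantsd; pass --no-pantsd" >&2
      return 1
      ;;
  esac
  # Assumption: any non-empty value enables the debug server.
  PANTS_DEBUG=1 pants "$@"
}
```

Once the wrapper has started Pants, it waits until a Debug-Adapter client (for example, your editor) connects.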
Identifying the impact of Python's GIL (on macOS)
Obtaining Full Thread Backtraces
Pants runs as a Python program that calls into a native Rust library. When debugging locking and deadlock issues, it is useful to capture dumps of the thread stacks in order to figure out where a deadlock may be occurring.
One-time setup:
- Ensure that gdb is installed.
- Ubuntu:
sudo apt install gdb
- Ensure that the kernel is configured to allow debuggers to attach to processes that are not in the same parent/child process hierarchy.
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
- To make the change permanent, add a file to /etc/sysctl.d named 99-ptrace.conf with contents kernel.yama.ptrace_scope = 0. Note: This is a security exposure if you are not normally debugging processes across the process hierarchy.
- Ensure that the debug info for your system Python binary is installed.
- Ubuntu:
sudo apt install python3-dbg
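Before attaching gdb, it can help to confirm that the ptrace setting above is actually in effect. The check below is a minimal sketch; the function name check_ptrace_scope is hypothetical, and its optional file argument exists only so the check can be exercised against a test file.

```shell
# Hypothetical helper: reports whether gdb may attach to unrelated processes.
# Usage: check_ptrace_scope [path to ptrace_scope file]
check_ptrace_scope() {
  local scope_file="${1:-/proc/sys/kernel/yama/ptrace_scope}"
  if [ ! -r "$scope_file" ]; then
    echo "unknown: $scope_file is not readable"
    return 2
  fi
  local scope
  scope="$(cat "$scope_file")"
  if [ "$scope" = "0" ]; then
    echo "ok: ptrace_scope=0"
  else
    # Any non-zero value restricts attaching to processes outside the hierarchy.
    echo "restricted: ptrace_scope=$scope; run: echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope"
    return 1
  fi
}
```

Running check_ptrace_scope with no arguments reads the live kernel setting.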
Dumping thread stacks:
- Find the pants binary (which may include pantsd if pantsd is enabled).
- Run:
ps -ef | grep pants
- Invoke gdb with the python binary and the process ID:
- Run:
gdb /path/to/python/binary PROCESS_ID
- Enable logging to write the thread dump to gdb.txt:
set logging on
- Dump all thread backtraces:
thread apply all bt
- If you use pyenv to manage your Python install, a gdb script will exist in the same directory as the Python binary. Source it into gdb:
source ~/.pyenv/versions/3.8.5/bin/python3.8-gdb.py (if using version 3.8.5)
- Dump all Python stacks:
thread apply all py-bt
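The native backtrace dump can also be captured without an interactive session by using gdb's -batch and -ex options. The sketch below covers only the native thread apply all bt step, not the Python stacks; the function name dump_native_stacks and its optional output-file argument are hypothetical.

```shell
# Hypothetical helper: dump native backtraces of all threads non-interactively.
# Usage: dump_native_stacks /path/to/python/binary PROCESS_ID [output file]
dump_native_stacks() {
  local python_bin="$1" pid="$2" out="${3:-gdb.txt}"
  # -batch makes gdb exit after running the -ex commands.
  gdb -batch -ex 'thread apply all bt' "$python_bin" "$pid" > "$out" 2>&1
}
```

The output file defaults to gdb.txt, matching the log file name used in the interactive steps.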