.. highlight:: shell-session

.. _profiling-module:

***************************************
:mod:`profiling` --- Python profilers
***************************************

.. module:: profiling
   :synopsis: Python profiling tools for performance analysis.

.. versionadded:: 3.15

**Source code:** :source:`Lib/profiling/`

--------------

.. index::
   single: statistical profiling
   single: profiling, statistical
   single: deterministic profiling
   single: profiling, deterministic


Introduction to profiling
=========================

A :dfn:`profile` is a set of statistics that describes how often and for how
long various parts of a program execute. These statistics help identify
performance bottlenecks and guide optimization efforts. Python provides two
fundamentally different approaches to collecting this information: statistical
sampling and deterministic tracing.

The :mod:`profiling` package organizes Python's built-in profiling tools under
a single namespace. It contains two submodules, each implementing a different
profiling methodology:

:mod:`profiling.sampling`
   A statistical profiler that periodically samples the call stack. Run scripts
   directly or attach to running processes by PID. Provides multiple output
   formats (flamegraphs, heatmaps, Firefox Profiler), GIL analysis, GC tracking,
   and multiple profiling modes (wall-clock, CPU, GIL) with virtually no overhead.

:mod:`profiling.tracing`
   A deterministic profiler that traces every function call, return, and
   exception event. Provides exact call counts and precise timing information,
   capturing every invocation including very fast functions.
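
To make the sampling idea concrete, the following toy sketch samples the frame
that a thread is currently executing and counts how often each function is
seen. It only illustrates the statistical principle and is not how
:mod:`profiling.sampling` is implemented: the sketch runs inside the profiled
process and relies on :func:`sys._current_frames`, whereas the real profiler
can attach to a separate process. The names ``sample_thread`` and
``busy_loop`` are hypothetical.

.. code-block:: python

   import collections
   import sys
   import threading
   import time


   def sample_thread(thread_id, counts, stop, interval=0.001):
       """Count which function *thread_id* is running, once per *interval*."""
       while not stop.is_set():
           frame = sys._current_frames().get(thread_id)
           if frame is not None:
               counts[frame.f_code.co_name] += 1
           time.sleep(interval)


   def busy_loop(duration=0.2):
       """Spin for roughly *duration* seconds so there is work to sample."""
       deadline = time.perf_counter() + duration
       iterations = 0
       while time.perf_counter() < deadline:
           iterations += 1
       return iterations


   counts = collections.Counter()
   stop = threading.Event()
   sampler = threading.Thread(
       target=sample_thread, args=(threading.get_ident(), counts, stop)
   )
   sampler.start()
   busy_loop()
   stop.set()
   sampler.join()

   # Functions that ran for longer show up in more samples.
   for name, samples in counts.most_common():
       print(f"{name}: {samples} samples")

The sample counts are estimates: a function that finishes between two samples
may never be recorded, which is the trade-off that deterministic tracing avoids.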

.. note::

   The profiler modules are designed to provide an execution profile for a
   given program, not for benchmarking purposes. For benchmarking, use the
   :mod:`timeit` module, which provides reasonably accurate timing
   measurements. This distinction is particularly important when comparing
   Python code against C code: deterministic profilers introduce overhead for
   Python code but not for C-level functions, which can skew comparisons.
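
As a brief illustration of the benchmarking alternative, the sketch below times
a single statement with :func:`timeit.repeat`. The statement and the list size
are arbitrary choices made for this example.

.. code-block:: python

   import timeit

   # Five measurements, each running the statement 1,000 times.
   timings = timeit.repeat(
       "sorted(data)",
       setup="data = list(range(1000, 0, -1))",
       repeat=5,
       number=1000,
   )

   # The minimum is usually the least noisy measurement.
   best = min(timings) / 1000
   print(f"best: {best * 1e6:.1f} microseconds per call")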


.. _choosing-a-profiler:

Choosing a profiler
===================

For most performance analysis, use the statistical profiler
(:mod:`profiling.sampling`). It has minimal overhead, works for both development
and production, and provides rich visualization options including flamegraphs,
heatmaps, GIL analysis, and more.

Use the deterministic profiler (:mod:`profiling.tracing`) when you need **exact
call counts** and cannot afford to miss any function calls. Since it instruments
every function call and return, it will capture even very fast functions that
complete between sampling intervals. The tradeoff is higher overhead.

The following table summarizes the key differences:

+--------------------+------------------------------+------------------------------+
| Feature            | Statistical sampling         | Deterministic                |
|                    | (:mod:`profiling.sampling`)  | (:mod:`profiling.tracing`)   |
+====================+==============================+==============================+
| **Overhead**       | Virtually none               | Moderate                     |
+--------------------+------------------------------+------------------------------+
| **Accuracy**       | Statistical estimate         | Exact call counts            |
+--------------------+------------------------------+------------------------------+
| **Output formats** | pstats, flamegraph, heatmap, | pstats                       |
|                    | gecko, collapsed             |                              |
+--------------------+------------------------------+------------------------------+
| **Profiling modes**| Wall-clock, CPU, GIL         | Wall-clock                   |
+--------------------+------------------------------+------------------------------+
| **Special frames** | GC, native (C extensions)    | N/A                          |
+--------------------+------------------------------+------------------------------+
| **Attach to PID**  | Yes                          | No                           |
+--------------------+------------------------------+------------------------------+


When to use statistical sampling
--------------------------------

The statistical profiler (:mod:`profiling.sampling`) is recommended for most
performance analysis tasks. Use it the same way you would use
:mod:`profiling.tracing`::

   python -m profiling.sampling run script.py

One of the main strengths of the sampling profiler is its variety of output
formats. Beyond traditional pstats tables, it can generate interactive
flamegraphs that visualize call hierarchies, line-level source heatmaps that
show exactly where time is spent in your code, and Firefox Profiler output for
timeline-based analysis.

The profiler also provides insight into Python interpreter behavior that
deterministic profiling cannot capture. Use ``--mode gil`` to identify GIL
contention in multi-threaded code, ``--mode cpu`` to measure actual CPU time
excluding I/O waits, or inspect ``<GC>`` frames to understand garbage collection
overhead. The ``--native`` option reveals time spent in C extensions, helping
distinguish Python overhead from library performance.

For multi-threaded applications, the ``-a`` option samples all threads
simultaneously, showing how work is distributed. And for production debugging,
the ``attach`` command connects to any running Python process by PID without
requiring a restart or code changes.


When to use deterministic tracing
---------------------------------

The deterministic profiler (:mod:`profiling.tracing`) instruments every function
call and return. This approach has higher overhead than sampling, but guarantees
complete coverage of program execution.

The primary reason to choose deterministic tracing is when you need exact call
counts. Statistical profiling estimates frequency based on sampling, which may
undercount short-lived functions that complete between samples. If you need to
verify that an optimization actually reduced the number of function calls, or
if you want to trace the complete call graph to understand caller-callee
relationships, deterministic tracing is the right choice.
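
As a sketch of the "verify that an optimization reduced the number of calls"
case, the example below compares the exact call counts of a naive and a
memoized recursive function. It uses the ``cProfile`` alias so that it also
runs on versions without :mod:`profiling.tracing`, and it assumes that both
names expose the same ``Profile`` class. The functions ``fib_plain``,
``fib_memo``, and ``total_calls`` are hypothetical.

.. code-block:: python

   import cProfile
   import functools
   import pstats


   def fib_plain(n):
       return n if n < 2 else fib_plain(n - 1) + fib_plain(n - 2)


   @functools.lru_cache(maxsize=None)
   def fib_memo(n):
       return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


   def total_calls(func, arg):
       """Profile ``func(arg)`` and return the exact number of calls to *func*."""
       profiler = cProfile.Profile()
       profiler.runcall(func, arg)
       profile = pstats.Stats(profiler).get_stats_profile()
       entry = profile.func_profiles[func.__name__]
       # ncalls is a string; recursive functions report "total/primitive".
       return int(entry.ncalls.split("/")[0])


   plain_calls = total_calls(fib_plain, 10)
   memo_calls = total_calls(fib_memo, 10)
   print(f"naive: {plain_calls} calls, memoized: {memo_calls} calls")

Because every call is recorded, the two counts can be compared directly; a
sampling profiler would only give an estimate for functions this short.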

Deterministic tracing also excels at capturing functions that execute in
microseconds. Such functions may not appear frequently enough in statistical
samples, but deterministic tracing records every invocation regardless of
duration.


Quick start
===========

This section provides the minimal steps needed to start profiling. For complete
documentation, see the dedicated pages for each profiler.


Statistical profiling
---------------------

To profile a script, use the :mod:`profiling.sampling` module with the ``run``
command::

   python -m profiling.sampling run script.py
   python -m profiling.sampling run -m mypackage.module

This runs the script under the profiler and prints a summary of where time was
spent. For an interactive flamegraph::

   python -m profiling.sampling run --flamegraph script.py

To profile an already-running process, use the ``attach`` command with the
process ID::

   python -m profiling.sampling attach 1234

For custom settings, specify the sampling interval (in microseconds) and
duration (in seconds)::

   python -m profiling.sampling run -i 50 -d 30 script.py


Deterministic profiling
-----------------------

To profile a script from the command line::

   python -m profiling.tracing script.py

To profile a piece of code programmatically:

.. code-block:: python

   import profiling.tracing
   profiling.tracing.run('my_function()')

This executes the given code under the profiler and prints a summary showing
exact function call counts and timing.
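
When only part of a program should be profiled, a ``Profile`` object can be
enabled and disabled around the region of interest, and the result formatted
with :mod:`pstats`. The sketch below uses the ``cProfile`` alias so that it
runs on earlier Python versions too; it assumes :mod:`profiling.tracing`
provides the same ``Profile`` class. The function ``build_squares`` is a
hypothetical workload.

.. code-block:: python

   import cProfile
   import io
   import pstats


   def build_squares(n):
       return [i * i for i in range(n)]


   profiler = cProfile.Profile()
   profiler.enable()        # start recording calls
   build_squares(100_000)
   profiler.disable()       # stop recording

   # Write the five most expensive entries, by cumulative time, to a buffer.
   report = io.StringIO()
   stats = pstats.Stats(profiler, stream=report)
   stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(5)
   print(report.getvalue())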


.. _profile-output:

Understanding profile output
============================

Both profilers collect function-level statistics, though they present them in
different formats. The sampling profiler offers multiple visualizations
(flamegraphs, heatmaps, Firefox Profiler, pstats tables), while the
deterministic profiler produces pstats-compatible output. Regardless of format,
the underlying concepts are the same.

Key profiling concepts:

**Direct time** (also called *self time* or *tottime*)
   Time spent executing code in the function itself, excluding time spent in
   functions it called. High direct time indicates the function contains
   expensive operations.

**Cumulative time** (also called *total time* or *cumtime*)
   Time spent in the function and all functions it called. This measures the
   total cost of calling a function, including its entire call subtree.

**Call count** (also called *ncalls* or *samples*)
   How many times the function was called (deterministic) or sampled
   (statistical). In deterministic profiling, this is exact. In statistical
   profiling, it represents the number of times the function appeared in a
   stack sample.

**Primitive calls**
   Calls that are not induced by recursion. When a function recurses, the total
   call count includes recursive invocations, but the primitive call count
   includes only the initial entry. Displayed as ``total/primitive`` (for
   example, ``3/1`` means three total calls, one primitive).

**Caller/Callee relationships**
   Which functions called a given function (callers) and which functions it
   called (callees). Flamegraphs visualize this as nested rectangles; pstats
   can display it via the :meth:`~pstats.Stats.print_callers` and
   :meth:`~pstats.Stats.print_callees` methods.
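
The concepts above can be inspected programmatically. The following sketch
profiles a small recursive function with the ``cProfile`` alias (assumed to
behave like :mod:`profiling.tracing`), reads its call count and timings through
:meth:`pstats.Stats.get_stats_profile`, and prints the callers of a helper.
The functions ``leaf`` and ``countdown`` are hypothetical.

.. code-block:: python

   import cProfile
   import io
   import pstats


   def leaf(n):
       return sum(range(n))


   def countdown(n):
       """Recurse so that total calls exceed primitive calls."""
       leaf(1000)
       if n > 0:
           countdown(n - 1)


   profiler = cProfile.Profile()
   profiler.runcall(countdown, 3)

   buffer = io.StringIO()
   stats = pstats.Stats(profiler, stream=buffer)

   # ncalls is "total/primitive" for recursive functions.
   entry = stats.get_stats_profile().func_profiles["countdown"]
   print(entry.ncalls, entry.tottime, entry.cumtime)

   # Show which functions called leaf().
   stats.print_callers("leaf")
   print(buffer.getvalue())

Here ``countdown`` is entered once from outside and recurses three more times,
so its cumulative time covers the whole call subtree while its direct time
excludes the work done in ``leaf``.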


Legacy compatibility
====================

For backward compatibility, the ``cProfile`` module remains available as an
alias to :mod:`profiling.tracing`. Existing code using ``import cProfile`` will
continue to work without modification in all future Python versions.

.. deprecated:: 3.15

   The pure Python :mod:`profile` module is deprecated and will be removed in
   Python 3.17. Use :mod:`profiling.tracing` (or its alias ``cProfile``)
   instead. See :mod:`profile` for migration guidance.
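
Code that must run on both older and newer Python versions can try the new
import first and fall back to the alias. This is only a sketch: the local name
``tracer`` is hypothetical, and it assumes :mod:`profiling.tracing` exposes the
same ``Profile`` class as ``cProfile``.

.. code-block:: python

   try:
       import profiling.tracing as tracer  # Python 3.15 and later
   except ImportError:
       import cProfile as tracer  # earlier versions

   profiler = tracer.Profile()
   # runcall() profiles one call and returns that call's result.
   result = profiler.runcall(sorted, [3, 1, 2])
   print(result)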


.. seealso::

   :mod:`profiling.sampling`
      Statistical sampling profiler with flamegraphs, heatmaps, and GIL analysis.
      Recommended for most users.

   :mod:`profiling.tracing`
      Deterministic tracing profiler for exact call counts.

   :mod:`pstats`
      Statistics analysis and formatting for profile data.

   :mod:`timeit`
      Module for measuring execution time of small code snippets.


.. rubric:: Submodules

.. toctree::
   :maxdepth: 1

   profiling.tracing.rst
   profiling.sampling.rst