Add documentation for using Nsight Systems instead of the default Nsight Compute profiling tool

Cosmin Ștefan Ciocan
2024-02-02 23:08:02 +00:00
parent bac447ef67
commit 0908891a47
2 changed files with 38 additions and 7 deletions
+13 -4
@@ -36,15 +36,24 @@ Options
 .. _profile:

 -p, --profile
-    Boolean. If set, runs the NVIDIA Nsight Compute profiler whose
-    output is appended to standard output.
+    Boolean. If set, runs the NVIDIA Nsight Compute (or NVIDIA Nsight Systems
+    if changed via the \-\-profiler option) profiler whose output is appended to
+    standard output.
+
+.. _profiler:
+
+-l, --profiler
+    String. Can either be "ncu" (the default) to use the NVIDIA Nsight Compute
+    profiling tool, or "nsys" to use the NVIDIA Nsight Systems profiling tool.

 .. _profiler_args:

 -a, --profiler-args
     String. Optional profiler arguments that can be space separated
-    by wrapping them in double quotes. See all options here:
-    `Nsight Compute CLI <https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options>`_
+    by wrapping them in double quotes. Will be passed to the profiler selected
+    by the \-\-profiler option. See profiler options here:
+    `Nsight Compute <https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options>`_
+    or `Nsight Systems <https://docs.nvidia.com/nsight-systems/UserGuide/index.html#command-line-options>`_.

 .. _compiler_args:

+25 -3
@@ -225,10 +225,11 @@ Profiling
 ---------

 Another important feature of nvcc4jupyter is its integration with the NVIDIA
-Nsight Compute profiler, which you need to make sure is installed and its
-executable can be found in a directory in your PATH environment variable.
+Nsight Compute / NVIDIA Nsight Systems profilers, which you need to make sure
+are installed and the executables can be found in a directory in your PATH
+environment variable.

-In order to use it and provide the profiler with custom arguments, simply run:
+To profile using Nsight Compute with custom arguments:

 .. code-block:: c++

@@ -256,6 +257,27 @@ Running the cell above will compile and execute the vector addition code in the
     Compute (SM) Throughput             %         1.19
     ----------------------- ------------- ------------

+To profile using Nsight Systems with custom arguments:
+
+.. code-block:: c++
+
+    %cuda_group_run --group "vector_add" --profiler nsys --profile --profiler-args "profile --stats=true"
+
+Running the cell above will compile and execute the vector addition code in the
+"vector_add" group and profile it with Nsight Systems. The output will contain
+multiple tables, one of which will look similar to this:
+
+.. code-block::
+
+    [5/8] Executing 'cuda_api_sum' stats report
+
+    Time (%)  Total Time (ns)  Num Calls       Avg (ns)       Med (ns)     Min (ns)     Max (ns)  StdDev (ns)  Name
+    --------  ---------------  ---------  -------------  -------------  -----------  -----------  -----------  ----------------------
+        77.3      200,844,276          1  200,844,276.0  200,844,276.0  200,844,276  200,844,276          0.0  cudaMalloc
+        22.6       58,594,762          2   29,297,381.0   29,297,381.0   29,153,999   29,440,763    202,772.8  cudaMemcpy
+         0.1          305,450          1      305,450.0      305,450.0      305,450      305,450          0.0  cudaLaunchKernel
+         0.0            1,970          1        1,970.0        1,970.0        1,970        1,970          0.0  cuModuleGetLoadingMode
+
 Compiler arguments
 ------------------

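
The updated Profiling text requires the profiler executables to be discoverable on PATH, and the options text says that a quoted `--profiler-args` string carries space-separated arguments. A minimal sketch of both points, runnable in a notebook cell before profiling, might look like the following. The helper names `missing_profilers` and `build_profiler_command` and the `./vector_add` path are hypothetical and are not part of nvcc4jupyter; only the `ncu` and `nsys` executable names come from the documented `--profiler` values.

```python
import shlex
import shutil

# Executable names taken from the documented --profiler values.
PROFILERS = {"ncu": "NVIDIA Nsight Compute", "nsys": "NVIDIA Nsight Systems"}


def missing_profilers(names=PROFILERS):
    """Return the profiler executables that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def build_profiler_command(profiler, profiler_args, executable):
    """Hypothetical helper: assemble a profiler command as an argument list.

    This is NOT nvcc4jupyter's implementation. It only illustrates how a
    quoted --profiler-args string splits into separate arguments.
    """
    if profiler not in PROFILERS:
        raise ValueError(f"unknown profiler: {profiler!r}")
    return [profiler, *shlex.split(profiler_args), executable]


if __name__ == "__main__":
    # Report any profiler that would fail to launch because it is not on PATH.
    for name in missing_profilers():
        print(f"{PROFILERS[name]} ({name}) was not found on PATH")

    # "./vector_add" is a placeholder for the compiled binary (an assumption).
    print(build_profiler_command("nsys", "profile --stats=true", "./vector_add"))
```

If `missing_profilers()` returns a non-empty list, installing the corresponding tool or extending PATH before running a cell with `--profile` is the likely fix.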