mirror of
https://github.com/andreinechaev/nvcc4jupyter.git
synced 2026-06-15 19:50:50 +05:30
Add documentation for using Nsight Systems instead of the default Nsight Compute profiling tool
+13
-4
@@ -36,15 +36,24 @@ Options
 .. _profile:
 
 -p, --profile
-Boolean. If set, runs the NVIDIA Nsight Compute profiler whose
-output is appended to standard output.
+Boolean. If set, runs the NVIDIA Nsight Compute (or NVIDIA Nsight Systems
+if changed via the \-\-profiler option) profiler whose output is appended to
+standard output.
+
+.. _profiler:
+
+-l, --profiler
+String. Can either be "ncu" (the default) to use NVIDIA Nsight Compute
+profiling tool, or "nsys" to use NVIDIA Nsight Systems profiling tool.
 
 .. _profiler_args:
 
 -a, --profiler-args
 String. Optional profiler arguments that can be space separated
-by wrapping them in double quotes. See all options here:
-`Nsight Compute CLI <https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options>`_
+by wrapping them in double quotes. Will be passed to the profiler selected
+by the \-\-profiler option. See profiler options here:
+`Nsight Compute <https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options>`_
+or `Nsight Systems <https://docs.nvidia.com/nsight-systems/UserGuide/index.html#command-line-options>`_.
 
 .. _compiler_args:
 
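As a reading aid for the option descriptions in this hunk, here is a minimal Python sketch of how ``--profiler`` and ``--profiler-args`` could combine into a single command line. The helper name ``build_profile_command``, the executable path, and the lookup table are hypothetical and are not taken from nvcc4jupyter's source. Only the ``ncu`` and ``nsys`` profiler names and the example argument string come from the documentation in this commit.

```python
import shlex

# Hypothetical mapping from the --profiler value to the executable to run.
# The two keys are the values documented in this commit.
PROFILER_EXECUTABLES = {"ncu": "ncu", "nsys": "nsys"}


def build_profile_command(executable, profiler="ncu", profiler_args=""):
    """Return an argv list: profiler, its arguments, then the program.

    This is an illustrative assumption about how the options compose,
    not the plugin's actual implementation.
    """
    if profiler not in PROFILER_EXECUTABLES:
        raise ValueError(f"unknown profiler: {profiler!r}")
    # shlex.split honours quoting, so one quoted string can carry
    # several space-separated profiler arguments.
    return [PROFILER_EXECUTABLES[profiler], *shlex.split(profiler_args), executable]


if __name__ == "__main__":
    # "./vector_add.out" is a placeholder path for the compiled program.
    print(build_profile_command("./vector_add.out", "nsys", "profile --stats=true"))
```

Under this assumption, leaving ``--profiler`` unset keeps the Nsight Compute behaviour, and the same argument string is simply forwarded to whichever tool is selected.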
+25
-3
@@ -225,10 +225,11 @@ Profiling
 ---------
 
 Another important feature of nvcc4jupyter is its integration with the NVIDIA
-Nsight Compute profiler, which you need to make sure is installed and its
-executable can be found in a directory in your PATH environment variable.
+Nsight Compute / NVIDIA Nsight Systems profilers, which you need to make sure
+are installed and the executables can be found in a directory in your PATH
+environment variable.
 
-In order to use it and provide the profiler with custom arguments, simply run:
+To profile using Nsight Compute with custom arguments:
 
 .. code-block:: c++
 
@@ -256,6 +257,27 @@ Running the cell above will compile and execute the vector addition code in the
 Compute (SM) Throughput % 1.19
 ----------------------- ------------- ------------
 
+To profile using Nsight Systems with custom arguments:
+
+.. code-block:: c++
+
+%cuda_group_run --group "vector_add" --profiler nsys --profile --profiler-args "profile --stats=true"
+
+Running the cell above will compile and execute the vector addition code in the
+"vector_add" group and profile it with Nsight Systems. The output will contain
+multiple tables, one of which will look similar to this:
+
+.. code-block::
+
+[5/8] Executing 'cuda_api_sum' stats report
+
+Time (%) Total Time (ns) Num Calls Avg (ns) Med (ns) Min (ns) Max (ns) StdDev (ns) Name
+-------- --------------- --------- ------------- ------------- ----------- ----------- ----------- ----------------------
+77.3 200,844,276 1 200,844,276.0 200,844,276.0 200,844,276 200,844,276 0.0 cudaMalloc
+22.6 58,594,762 2 29,297,381.0 29,297,381.0 29,153,999 29,440,763 202,772.8 cudaMemcpy
+0.1 305,450 1 305,450.0 305,450.0 305,450 305,450 0.0 cudaLaunchKernel
+0.0 1,970 1 1,970.0 1,970.0 1,970 1,970 0.0 cuModuleGetLoadingMode
+
 Compiler arguments
 ------------------
 
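The ``cuda_api_sum`` table added in the Profiling hunk is whitespace-separated, so it can be read programmatically. The following is a rough Python sketch that assumes each data row has exactly nine fields, as in the sample above. The function name ``parse_cuda_api_sum`` and the dictionary keys are made up for illustration, and real Nsight Systems output may differ between versions.

```python
# Sample rows copied from the documentation example in this commit.
SAMPLE = """\
[5/8] Executing 'cuda_api_sum' stats report

Time (%) Total Time (ns) Num Calls Avg (ns) Med (ns) Min (ns) Max (ns) StdDev (ns) Name
-------- --------------- --------- ------------- ------------- ----------- ----------- ----------- ----------------------
77.3 200,844,276 1 200,844,276.0 200,844,276.0 200,844,276 200,844,276 0.0 cudaMalloc
22.6 58,594,762 2 29,297,381.0 29,297,381.0 29,153,999 29,440,763 202,772.8 cudaMemcpy
0.1 305,450 1 305,450.0 305,450.0 305,450 305,450 0.0 cudaLaunchKernel
0.0 1,970 1 1,970.0 1,970.0 1,970 1,970 0.0 cuModuleGetLoadingMode
"""


def parse_cuda_api_sum(text):
    """Extract (name, share, total time, call count) from each data row.

    Assumes nine whitespace-separated fields per data row; header,
    separator, and title lines are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 9:
            continue  # header and title lines have a different field count
        try:
            share = float(fields[0])
        except ValueError:
            continue  # the dashed separator row is not numeric
        rows.append(
            {
                "name": fields[8],
                "time_percent": share,
                "total_ns": int(fields[1].replace(",", "")),
                "num_calls": int(fields[2]),
            }
        )
    return rows


if __name__ == "__main__":
    rows = parse_cuda_api_sum(SAMPLE)
    dominant = max(rows, key=lambda r: r["time_percent"])
    print(dominant["name"], dominant["time_percent"])
```

For the sample data, the dominant entry is the single ``cudaMalloc`` call, which suggests that one-time allocation and setup cost, not the kernel launch, accounts for most of the measured API time in this small example.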