34 Commits

Author SHA1 Message Date
Cosmin Ștefan Ciocan 3f8b89cb58 Merge branch 'master' into feature/profiler-tool-choice 2024-02-16 23:57:06 +01:00
Cosmin Ștefan Ciocan 4d805bb98e Mention Nsight Systems in README.md 2024-02-16 22:50:58 +00:00
Cosmin Ștefan Ciocan c3b8524be6 Fix cuda typo 2024-02-12 19:49:08 +01:00
Cosmin Ștefan Ciocan 781ff5b76b Feature: Passing arguments to NVCC compiler (#26)
* Add option to give nvcc extra arguments

* Add test for nvcc options that changes c++ dialect from c++17 to c++14

* Add make and the english language pack to devcontainer to be able to build the documentation

* Update documentation config to automatically import the current version of the package

* Document new --compiler-args argument

* Improve tests coverage by testing for bad arguments and the error output during a failed compilation

* Add IPython to docs requirements to allow the __version__ import for readthedocs env

* Change devcontainer base image to have the latest CUDA toolkit

* Mock the nsight compute tool with a bash script

* Add test to compile with opencv

* Add new page to documentation that contains a new notebook that explains compiling with external libraries

* Add autodocstring vscode extension to devcontainer

* Add function that modifies the default profiler/compiler arguments to allow reusing them in multiple magic command calls

* Update pylint exceptions

* Update contributing instructions

* Change version from 1.0.3 to 1.1.0 due to adding features in a backward-compatible manner

* Install latest CUDA toolkit on the test runner to pass the OpenCV compilation test

* Install opencv in test runner and update code coverage install

* Add CUDA bin to PATH in test and coverage runners

* Add cuda bin to path variable in .bashrc

* Update way to set environment variable PATH in github action

* Change devcontainer base image back to ubuntu:22.04 to match the environment from the test runner
2024-02-12 17:29:26 +01:00
Cosmin Ștefan Ciocan 0908891a47 Add documentation for using Nsight Systems instead of the default Nsight Compute profiling tool 2024-02-02 23:08:02 +00:00
Cosmin Ștefan Ciocan bac447ef67 Install dev dependencies in editable mode 2024-02-02 23:05:50 +00:00
Cosmin Ștefan Ciocan ba775f7ce1 Search for profiling tools executable paths when they are required 2024-02-02 14:40:29 +00:00
Cosmin Ștefan Ciocan 26fab4d31e Replace experimental-string-processing black formatter config with enable-unstable-feature as it was removed in version 24.1.0 2024-02-02 13:31:18 +00:00
Cosmin Ștefan Ciocan 5a880c93bd Add isort config to help it find local modules so they are not considered 3rd party libraries 2024-02-02 13:26:40 +00:00
Cosmin Ștefan Ciocan 2c108442f6 Add tests for choosing the profiler 2024-02-01 14:55:17 +00:00
Cosmin Ștefan Ciocan 8d39ce01c3 Add option to choose between NSYS and NCU profilers 2024-02-01 14:46:45 +00:00
Cosmin Ștefan Ciocan ee9aa3dba3 Change devcontainer base image back to ubuntu:22.04 to match the environment from the test runner 2024-01-27 14:45:16 +00:00
Cosmin Ștefan Ciocan 27b045b782 Update way to set environment variable PATH in github action 2024-01-27 14:35:06 +00:00
Cosmin Ștefan Ciocan 2614c92b20 Add cuda bin to path variable in .bashrc 2024-01-27 13:49:00 +00:00
Cosmin Ștefan Ciocan 863cdcfa17 Add CUDA bin to PATH in test and coverage runners 2024-01-27 13:42:22 +00:00
Cosmin Ștefan Ciocan 28637d5c64 Install opencv in test runner and update code coverage install 2024-01-27 13:35:16 +00:00
Cosmin Ștefan Ciocan aaaa2605e1 Install latest CUDA toolkit on the test runner to pass the OpenCV compilation test 2024-01-27 13:26:33 +00:00
Cosmin Ștefan Ciocan 9663c74598 Change version from 1.0.3 to 1.1.0 due to adding features in a backward-compatible manner 2024-01-27 02:04:34 +00:00
Cosmin Ștefan Ciocan a3f4f31962 Update contributing instructions 2024-01-27 01:57:09 +00:00
Cosmin Ștefan Ciocan 33801a3491 Update pylint exceptions 2024-01-27 01:42:17 +00:00
Cosmin Ștefan Ciocan b3c015ae74 Add function that modifies the default profiler/compiler arguments to allow reusing them in multiple magic command calls 2024-01-27 01:40:47 +00:00
Cosmin Ștefan Ciocan e9f131a678 Add autodocstring vscode extension to devcontainer 2024-01-27 00:41:39 +00:00
Cosmin Ștefan Ciocan bc91620971 Add new page to documentation that contains a new notebook that explains compiling with external libraries 2024-01-26 16:22:29 +00:00
Cosmin Ștefan Ciocan c1fbc06604 Add test to compile with opencv 2024-01-26 11:30:32 +00:00
Cosmin Ștefan Ciocan b49062e9e2 Mock the nsight compute tool with a bash script 2024-01-26 11:25:58 +00:00
Cosmin Ștefan Ciocan 36fc282eed Change devcontainer base image to have the latest CUDA toolkit 2024-01-26 11:11:23 +00:00
Cosmin Ștefan Ciocan 639624be79 Add IPython to docs requirements to allow the __version__ import for readthedocs env 2024-01-24 00:17:35 +00:00
Cosmin Ștefan Ciocan 6236fe2b1e Improve tests coverage by testing for bad arguments and the error output during a failed compilation 2024-01-23 23:44:36 +00:00
Cosmin Ștefan Ciocan 65eca38a67 Document new --compiler-args argument 2024-01-23 23:01:38 +00:00
Cosmin Ștefan Ciocan 405c16efb3 Update documentation config to automatically import the current version of the package 2024-01-23 22:58:56 +00:00
Cosmin Ștefan Ciocan 595e450eb9 Add make and the english language pack to devcontainer to be able to build the documentation 2024-01-23 22:57:59 +00:00
Cosmin Ștefan Ciocan 50bc8ff4a6 Add test for nvcc options that changes c++ dialect from c++17 to c++14 2024-01-23 22:55:53 +00:00
Cosmin Ștefan Ciocan 881c67f5f1 Add option to give nvcc extra arguments 2024-01-23 22:53:47 +00:00
Cosmin Ștefan Ciocan 5cd225851b Merge pull request #24 from andreinechaev/docs/readme-badge-rename
Change "cosminc98" to "andreinechaev" in badge URLs
2024-01-23 16:14:31 +01:00
24 changed files with 667 additions and 90 deletions
+20 -6
@@ -1,15 +1,29 @@
FROM ubuntu FROM ubuntu:22.04
ARG VENV_PATH=/opt/dev-venv ARG VENV_PATH=/opt/dev-venv
ENV VENV_ACTIVATE=${VENV_PATH}/bin/activate ENV VENV_ACTIVATE=${VENV_PATH}/bin/activate
ENV DEBIAN_FRONTEND="noninteractive"
# install the latest CUDA toolkit (https://developer.nvidia.com/cuda-downloads)
RUN apt update RUN apt update
RUN apt install -y python3.10-venv nvidia-cuda-toolkit gcc vim git RUN apt install -y wget
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
RUN dpkg -i cuda-keyring_1.1-1_all.deb
RUN apt update
RUN apt -y install cuda-toolkit-12-3
RUN echo "PATH=\"\$PATH:/usr/local/cuda/bin\"" >> ~/.bashrc
# the mkdir command bypasses a profiler error, which allows us to run it with # install OpenCV to test compilation with external libraries
# host code only to at least check that the profiler parameters are correctly RUN apt install -y libopencv-dev pkg-config
# provided; without this line, some tests will fail
RUN mkdir -p /usr/lib/x86_64-linux-gnu/nsight-compute/sections # make & language-pack-en are for documentation
RUN apt install -y \
gcc \
git \
language-pack-en \
make \
python3.10-venv \
vim
# we create the virtualenv here so that the devcontainer.json setting # we create the virtualenv here so that the devcontainer.json setting
# python.defaultInterpreterPath can be used to find it; if we do it in the # python.defaultInterpreterPath can be used to find it; if we do it in the
+4 -2
@@ -16,10 +16,12 @@
"ms-python.isort", "ms-python.isort",
"ms-python.flake8", "ms-python.flake8",
"ms-python.black-formatter", "ms-python.black-formatter",
"ryanluker.vscode-coverage-gutters" "ryanluker.vscode-coverage-gutters",
"njpwerner.autodocstring"
], ],
"settings": { "settings": {
"python.defaultInterpreterPath": "/opt/dev-venv/bin/python" "python.defaultInterpreterPath": "/opt/dev-venv/bin/python",
"autoDocstring.docstringFormat": "google-notypes"
} }
} }
} }
+1 -1
@@ -1,7 +1,7 @@
#!/bin/bash #!/bin/bash
# install developer dependencies # install developer dependencies
pip install .[dev] pip install -e .[dev]
# make sure the developer uses pre-commit hooks # make sure the developer uses pre-commit hooks
pre-commit install pre-commit install
+22 -9
@@ -27,14 +27,19 @@ jobs:
with: with:
python-version: ${{ matrix.python-version }} python-version: ${{ matrix.python-version }}
# the mkdir command bypasses a profiler error, which allows us to run it - name: Install CUDA toolkit
# with host code only to at least check that the profiler parameters are
# correctly provided
- name: Install CUDA tools
run: | run: |
sudo apt update sudo apt update
sudo apt install nvidia-cuda-toolkit sudo apt install -y wget
sudo mkdir -p /usr/lib/x86_64-linux-gnu/nsight-compute/sections wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt -y install cuda-toolkit-12-3
echo "PATH=$PATH:/usr/local/cuda/bin" >> $GITHUB_ENV
- name: Install OpenCV
run: |
sudo apt install -y libopencv-dev pkg-config
- name: Install Python dependencies - name: Install Python dependencies
run: | run: |
@@ -65,11 +70,19 @@ jobs:
with: with:
python-version: "3.10" python-version: "3.10"
- name: Install CUDA tools - name: Install CUDA toolkit
run: | run: |
sudo apt update sudo apt update
sudo apt install nvidia-cuda-toolkit sudo apt install -y wget
sudo mkdir -p /usr/lib/x86_64-linux-gnu/nsight-compute/sections wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt -y install cuda-toolkit-12-3
echo "PATH=$PATH:/usr/local/cuda/bin" >> $GITHUB_ENV
- name: Install OpenCV
run: |
sudo apt install -y libopencv-dev pkg-config
- name: Install Python dependencies - name: Install Python dependencies
run: | run: |
+8 -6
@@ -45,7 +45,8 @@ to own a GPU yourself.
Here are just a few of the things that nvcc4jupyter does well: Here are just a few of the things that nvcc4jupyter does well:
- [Easily run CUDA C++ code](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#hello-world) - [Easily run CUDA C++ code](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#hello-world)
- [Profile your code with NVIDIA Nsight Compute](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#profiling) - [Profile your code with NVIDIA Nsight Compute or Nsight Systems](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#profiling)
- [Compile your code with external libraries (e.g. OpenCV)](https://nvcc4jupyter.readthedocs.io/en/latest/notebooks.html#compiling-with-external-libraries)
- [Share code between different programs in the same notebook / split your code into multiple files for improved readability](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#groups) - [Share code between different programs in the same notebook / split your code into multiple files for improved readability](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#groups)
## Install ## Install
@@ -88,13 +89,14 @@ The official documentation is hosted on [readthedocs](https://nvcc4jupyter.readt
## Contributing ## Contributing
Install the package with the development dependencies: The recommended setup for development is using the devcontainer in GitHub
```bash Codespaces or locally in VSCode.
pip install .[dev]
```
As a developer, make sure you install the pre-commit hook before committing any changes: If not using the devcontainer, you need to install the package with the
development dependencies and install the pre-commit hook before committing any
changes:
```bash ```bash
pip install -e .[dev]
pre-commit install pre-commit install
``` ```
+1
@@ -1,2 +1,3 @@
sphinx==7.1.2 sphinx==7.1.2
sphinx-rtd-theme==1.3.0rc1 sphinx-rtd-theme==1.3.0rc1
IPython>=8.19.0
+9 -2
@@ -6,11 +6,18 @@
# -- Project information ----------------------------------------------------- # -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
import os
import sys
sys.path.append(os.path.join("..", ".."))
from nvcc4jupyter.__init__ import __version__ # noqa: E402
project = "nvcc4jupyter" project = "nvcc4jupyter"
copyright = "2024, Andrei Nechaev & Cosmin Stefan Ciocan" copyright = "2024, Andrei Nechaev & Cosmin Stefan Ciocan"
author = "Andrei Nechaev & Cosmin Stefan Ciocan" author = "Andrei Nechaev & Cosmin Stefan Ciocan"
release = "1.0.1" release = __version__
version = "1.0.1" version = __version__
# -- General configuration --------------------------------------------------- # -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
+1
@@ -10,4 +10,5 @@ which provides CUDA capable GPUs with the CUDA toolkit already installed.
:caption: Contents: :caption: Contents:
usage usage
notebooks
magics magics
+45 -8
@@ -21,23 +21,59 @@ Usage
- ``%%cuda``: Compile and run this cell. - ``%%cuda``: Compile and run this cell.
- ``%%cuda -p``: Also runs the Nsight Compute profiler. - ``%%cuda -p``: Also runs the Nsight Compute profiler.
- ``%%cuda -p -a "<SPACE SEPARATED PROFILER ARGS>"``: Also runs the Nsight Compute profiler. - ``%%cuda -p -a "<SPACE SEPARATED PROFILER ARGS>"``: Also runs the Nsight Compute profiler.
- ``%%cuda -c "<SPACE SEPARATED COMPILER ARGS>"``: Passes additional arguments to "nvcc".
- ``%%cuda -t``: Outputs the "timeit" built-in magic results. - ``%%cuda -t``: Outputs the "timeit" built-in magic results.
Options Options
------- -------
.. _timeit:
-t, --timeit -t, --timeit
Boolean. If set, returns the output of the "timeit" built-in Boolean. If set, returns the output of the "timeit" built-in
ipython magic instead of stdout. ipython magic instead of stdout.
.. _profile:
-p, --profile -p, --profile
Boolean. If set, runs the NVIDIA Nsight Compute profiler whose Boolean. If set, runs the NVIDIA Nsight Compute (or NVIDIA Nsight Systems
output is appended to standard output. if changed via the \-\-profiler option) profiler whose output is appended to
standard output.
.. _profiler:
-l, --profiler
String. Can either be "ncu" (the default) to use the NVIDIA Nsight Compute
profiling tool, or "nsys" to use the NVIDIA Nsight Systems profiling tool.
.. _profiler_args:
-a, --profiler-args -a, --profiler-args
String. Optional profiler arguments that can be space separated String. Optional profiler arguments that can be space separated
by wrapping them in double quotes. See all options here: by wrapping them in double quotes. Will be passed to the profiler selected
`Nsight Compute CLI <https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options>`_ by the \-\-profiler option. See profiler options here:
`Nsight Compute <https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options>`_
or `Nsight Systems <https://docs.nvidia.com/nsight-systems/UserGuide/index.html#command-line-options>`_.
.. _compiler_args:
-c, --compiler-args
String. Optional compiler arguments that can be space separated
by wrapping them in double quotes. They will be passed to "nvcc".
See all options here:
`NVCC Options <https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#nvcc-command-options>`_
.. note:: .. note::
If both "\-\-profile" and "\-\-timeit" are used then no profiling is If both "\-\-profile" and "\-\-timeit" are used then no profiling is
@@ -47,10 +83,11 @@ Examples
-------- --------
:: ::
# compile, run, and profile the code in the cell with the Nsight # compile, run, and profile the code in the cell with the Nsight Compute
# Compute profiler while collecting only metrics from the # profiler while collecting only metrics from the "MemoryWorkloadAnalysis"
# "MemoryWorkloadAnalysis" section. # section; also provides the "--optimize 3" option to "nvcc" during
%%cuda --profile --profiler-args "--section MemoryWorkloadAnalysis" # compilation to optimize host code
%%cuda -p -a "--section MemoryWorkloadAnalysis" -c "--optimize 3"
------ ------
+34
@@ -0,0 +1,34 @@
*********
Notebooks
*********
This page provides a list of useful Jupyter notebooks written with the
**nvcc4jupyter** library.
.. note::
These notebooks are written for Google's Colab, but you may run them in
other environments by installing all expected dependencies. If running in
Colab, make sure to set the runtime type to a GPU instance (at the time of
writing this, T4 is the GPU offered for free by Colab).
------
.. _compiling_with_external_libraries:
Compiling with external libraries
=================================
[`NOTEBOOK <https://colab.research.google.com/drive/1iuY46DCwv4hy3SqDhJgFeO8kgpHnzjTh?usp=sharing>`_]
If you need to compile CUDA C++ code that uses external libraries in the host
code (e.g. OpenCV for reading and writing images to disk) then this section is
for you.
To achieve this, use the :ref:`compiler-args <compiler_args>` option of the
:ref:`cuda <cuda_magic>` magic command to pass the correct compiler options
of the OpenCV library to **nvcc** for it to link the OpenCV code with the
code in your Jupyter cell. Those compiler options can be provided by the
`pkg-config <https://www.freedesktop.org/wiki/Software/pkg-config/>`_ tool.
In the notebook we show how to use OpenCV to load an image, blur it with a CUDA
kernel, and then save it back to disk using OpenCV again.
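As a rough illustration of the pkg-config step described above, the flags could be collected from Python along these lines. This is only a sketch: the helper name `pkg_config_args` is hypothetical, the package name "opencv4" is an assumption about how OpenCV is registered on the system, and the linked notebook may obtain the flags differently.

```python
import shutil
import subprocess


def pkg_config_args(package: str) -> str:
    """Return the compile and link flags that pkg-config reports for a package."""
    if shutil.which("pkg-config") is None:
        raise RuntimeError("pkg-config was not found in PATH")
    result = subprocess.run(
        ["pkg-config", "--cflags", "--libs", package],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the package is unknown
    )
    return result.stdout.strip()


try:
    # "opencv4" is an assumed package name; `pkg-config --list-all` shows
    # the names that are actually available on a given machine
    print(pkg_config_args("opencv4"))
except (RuntimeError, subprocess.CalledProcessError) as err:
    print(f"could not query pkg-config: {err}")
```

The resulting string is space separated, so it should fit the shape expected by the "\-\-compiler-args" option, or by `set_defaults(compiler_args=...)` if the same flags are reused across cells.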
+69 -3
@@ -225,10 +225,11 @@ Profiling
--------- ---------
Another important feature of nvcc4jupyter is its integration with the NVIDIA Another important feature of nvcc4jupyter is its integration with the NVIDIA
Nsight Compute profiler, which you need to make sure is installed and its Nsight Compute / NVIDIA Nsight Systems profilers, which you need to make sure
executable can be found in a directory in your PATH environment variable. are installed and the executables can be found in a directory in your PATH
environment variable.
In order to use it and provide the profiler with custom arguments, simply run: To profile using Nsight Compute with custom arguments:
.. code-block:: c++ .. code-block:: c++
@@ -255,3 +256,68 @@ Running the cell above will compile and execute the vector addition code in the
SM Active Cycles cycle 383.65 SM Active Cycles cycle 383.65
Compute (SM) Throughput % 1.19 Compute (SM) Throughput % 1.19
----------------------- ------------- ------------ ----------------------- ------------- ------------
To profile using Nsight Systems with custom arguments:
.. code-block:: c++
%cuda_group_run --group "vector_add" --profiler nsys --profile --profiler-args "profile --stats=true"
Running the cell above will compile and execute the vector addition code in the
"vector_add" group and profile it with Nsight Systems. The output will contain
multiple tables, one of which will look similar to this:
.. code-block::
[5/8] Executing 'cuda_api_sum' stats report
Time (%) Total Time (ns) Num Calls Avg (ns) Med (ns) Min (ns) Max (ns) StdDev (ns) Name
-------- --------------- --------- ------------- ------------- ----------- ----------- ----------- ----------------------
77.3 200,844,276 1 200,844,276.0 200,844,276.0 200,844,276 200,844,276 0.0 cudaMalloc
22.6 58,594,762 2 29,297,381.0 29,297,381.0 29,153,999 29,440,763 202,772.8 cudaMemcpy
0.1 305,450 1 305,450.0 305,450.0 305,450 305,450 0.0 cudaLaunchKernel
0.0 1,970 1 1,970.0 1,970.0 1,970 1,970 0.0 cuModuleGetLoadingMode
Compiler arguments
------------------
In the same way profiler arguments can be passed to the profiling tool,
compiling arguments can be passed to **nvcc**:
.. code-block:: c++
%cuda_group_run --group "vector_add" --compiler-args "--optimize 3"
Running the cell above will compile and execute the vector addition code in the
"vector_add" group. During compilation, **nvcc** receives the "\-\-optimize"
option which specifies the optimization level for host code.
Set default arguments
---------------------
In the case where you execute multiple magic commands with the same compiler or
profiler arguments you can avoid writing them every time by setting the default
arguments:
.. code-block:: python
from nvcc4jupyter import set_defaults
set_defaults(compiler_args="--optimize 3", profiler_args="--section SpeedOfLight")
The same effect can be achieved by running "set_defaults" once for each config,
because a default value is left unchanged when no value for it is
given to the "set_defaults" function.
.. code-block:: python
from nvcc4jupyter import set_defaults
set_defaults(compiler_args="--optimize 3")
set_defaults(profiler_args="--section SpeedOfLight")
Now we can run the following cell without specifying the compiler and profiler
arguments once again.
.. code-block:: c++
%cuda_group_run --group "vector_add" --profile
+2 -1
@@ -2,6 +2,7 @@
nvcc4jupyter: CUDA C++ plugin for Jupyter Notebook nvcc4jupyter: CUDA C++ plugin for Jupyter Notebook
""" """
from .parsers import Profiler, set_defaults # noqa: F401
from .plugin import NVCCPlugin, load_ipython_extension # noqa: F401 from .plugin import NVCCPlugin, load_ipython_extension # noqa: F401
__version__ = "1.0.3" __version__ = "1.1.0"
+79 -1
@@ -3,6 +3,63 @@ Parsers for the CUDA magic commands.
""" """
import argparse import argparse
from enum import Enum
from typing import Callable, Optional, Type, TypeVar
class Profiler(Enum):
"""Choice between Nsight Compute and Nsight Systems profilers."""
NCU = "ncu"
NSYS = "nsys"
_default_profiler: Profiler = Profiler.NCU
_default_profiler_args: str = ""
_default_compiler_args: str = ""
T = TypeVar("T")
def set_defaults(
profiler: Optional[Profiler] = None,
compiler_args: Optional[str] = None,
profiler_args: Optional[str] = None,
) -> None:
"""
Set the default values for various arguments of the magic commands. These
values will be used if the user does not explicitly provide those arguments
to override this behaviour on a cell by cell basis.
Args:
profiler: If not None, this value becomes the new default profiler.
Defaults to None.
compiler_args: If not None, this value becomes the new default compiler
config. Defaults to None.
profiler_args: If not None, this value becomes the new default profiler
config. Defaults to None.
"""
# pylint: disable=global-statement
global _default_profiler
if profiler is not None:
_default_profiler = profiler
global _default_compiler_args
if compiler_args is not None:
_default_compiler_args = compiler_args
global _default_profiler_args
if profiler_args is not None:
_default_profiler_args = profiler_args
def str_to_lambda(arg: str) -> Callable[[], str]:
"""Convert argparse string to lambda"""
return lambda: arg
def class_to_lambda(arg: str, cls: Type[T]) -> Callable[[], T]:
"""Convert string value to class and then to lambda"""
return lambda: cls(arg)
def get_parser_cuda() -> argparse.ArgumentParser: def get_parser_cuda() -> argparse.ArgumentParser:
@@ -18,7 +75,28 @@ def get_parser_cuda() -> argparse.ArgumentParser:
) )
parser.add_argument("-t", "--timeit", action="store_true") parser.add_argument("-t", "--timeit", action="store_true")
parser.add_argument("-p", "--profile", action="store_true") parser.add_argument("-p", "--profile", action="store_true")
parser.add_argument("-a", "--profiler-args", type=str, default="")
# the type of the following arguments is a lambda function to allow
# changing the default value at runtime
parser.add_argument(
"-l",
"--profiler",
type=lambda arg: class_to_lambda(arg, cls=Profiler),
default=lambda: _default_profiler,
)
parser.add_argument(
"-a",
"--profiler-args",
type=str_to_lambda,
default=lambda: _default_profiler_args,
)
parser.add_argument(
"-c",
"--compiler-args",
type=str_to_lambda,
default=lambda: _default_compiler_args,
)
return parser return parser
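The reason for wrapping both the parsed value and the default in a callable may be easier to see in isolation. The sketch below is a simplified stand-in, not the package code: `set_default_compiler_args` is a hypothetical single-argument version of `set_defaults`, and only the compiler arguments option is modelled.

```python
import argparse
from typing import Callable

# Module-level default that may change after the parser has been built.
_default_compiler_args: str = ""


def set_default_compiler_args(value: str) -> None:
    """Hypothetical, simplified stand-in for set_defaults()."""
    global _default_compiler_args
    _default_compiler_args = value


def str_to_lambda(arg: str) -> Callable[[], str]:
    """Wrap an explicit value so it is read the same way as the default."""
    return lambda: arg


parser = argparse.ArgumentParser()
parser.add_argument(
    "-c",
    "--compiler-args",
    type=str_to_lambda,
    # the lambda looks up the module variable only when it is called
    default=lambda: _default_compiler_args,
)

# The default is changed *after* the parser was created.
set_default_compiler_args("--optimize 3")

from_default = parser.parse_args([]).compiler_args()
from_cli = parser.parse_args(["-c", "--std c++14"]).compiler_args()
print(from_default)
print(from_cli)
```

Had the default been the plain string `_default_compiler_args`, argparse would have captured its value when `add_argument` ran, and later changes would not reach a parser that already exists. Calling the attribute at use time avoids rebuilding the parser.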
+61
@@ -0,0 +1,61 @@
"""
Helper functions relating to file paths.
"""
import os
from glob import glob
from typing import List, Optional
CUDA_SEARCH_PATHS: List[str] = [
"/opt/nvidia/nsight-compute",
"/usr/local/cuda",
"/opt",
"/usr",
]
def is_executable(fpath: str) -> bool:
"""Check if file exists and is executable"""
return os.path.isfile(fpath) and os.access(fpath, os.X_OK)
def which(name: str) -> Optional[str]:
"""Find an executable by name by searching the PATH directories"""
for path_dir in os.environ.get("PATH", "").split(os.pathsep):
exec_path = os.path.join(path_dir, name)
if is_executable(exec_path):
return exec_path
return None
def find_executable(
name: str, search_paths: Optional[List[str]] = None
) -> Optional[str]:
"""
Find an executable, either by searching in the directories of the PATH
environment variable or, if that did not work, by searching recursively
in the directories of a list given as a parameter.
Args:
name: The name of the executable to be found.
search_paths: If None, only executables that are available from PATH
will be found. Otherwise, will recursively search these
directories. Defaults to None.
Returns:
The path to the executable if it is found, and None otherwise.
"""
if search_paths is None:
search_paths = []
which_path = which(name)
if which_path is not None:
return which_path
for search_path in search_paths:
search_path = os.path.abspath(search_path)
search_path = os.path.join(search_path, f"**/{name}")
for exec_path in glob(search_path, recursive=True):
return exec_path
return None
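The recursive fallback search can be tried out without a CUDA install by pointing it at a temporary directory. The sketch below is a self-contained variant rather than the module above: `find_in_dirs` and the file name "fake_profiler" are made up for illustration, and unlike the glob loop in the diff it also checks the executable bit of each match.

```python
import os
import stat
import tempfile
from glob import glob
from typing import List, Optional


def find_in_dirs(name: str, search_paths: List[str]) -> Optional[str]:
    """Recursively search directories for an executable file called `name`."""
    for search_path in search_paths:
        pattern = os.path.join(os.path.abspath(search_path), "**", name)
        for candidate in sorted(glob(pattern, recursive=True)):
            # a glob match alone could be a directory or a non-executable file
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    # build <root>/nsight/bin/fake_profiler and mark it as executable
    tool_dir = os.path.join(root, "nsight", "bin")
    os.makedirs(tool_dir)
    tool_path = os.path.join(tool_dir, "fake_profiler")
    with open(tool_path, "w", encoding="utf8") as f:
        f.write("#!/bin/sh\necho profiled\n")
    os.chmod(tool_path, os.stat(tool_path).st_mode | stat.S_IXUSR)

    found = find_in_dirs("fake_profiler", [root])
    missing = find_in_dirs("no_such_tool", [root])
    print(found == tool_path, missing)
```

Searching broad roots such as "/usr" recursively can be slow, which may be why the plugin only performs the lookup when a profiler is actually requested and then caches the result.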
+71 -24
@@ -9,13 +9,20 @@ import shutil
import subprocess import subprocess
import tempfile import tempfile
import uuid import uuid
from typing import List, Optional from typing import Dict, List, Optional
# pylint: disable=import-error # pylint: disable=import-error
from IPython.core.interactiveshell import InteractiveShell from IPython.core.interactiveshell import InteractiveShell
from IPython.core.magic import Magics, cell_magic, line_magic, magics_class from IPython.core.magic import Magics, cell_magic, line_magic, magics_class
from . import parsers from .parsers import (
Profiler,
get_parser_cuda,
get_parser_cuda_group_delete,
get_parser_cuda_group_run,
get_parser_cuda_group_save,
)
from .path_utils import CUDA_SEARCH_PATHS, find_executable
DEFAULT_EXEC_FNAME = "cuda_exec.out" DEFAULT_EXEC_FNAME = "cuda_exec.out"
SHARED_GROUP_NAME = "shared" SHARED_GROUP_NAME = "shared"
@@ -37,14 +44,19 @@ class NVCCPlugin(Magics):
super().__init__(shell) super().__init__(shell)
self.shell: InteractiveShell # type hint not provided by parent class self.shell: InteractiveShell # type hint not provided by parent class
self.parser_cuda = parsers.get_parser_cuda() self.parser_cuda = get_parser_cuda()
self.parser_cuda_group_save = parsers.get_parser_cuda_group_save() self.parser_cuda_group_save = get_parser_cuda_group_save()
self.parser_cuda_group_delete = parsers.get_parser_cuda_group_delete() self.parser_cuda_group_delete = get_parser_cuda_group_delete()
self.parser_cuda_group_run = parsers.get_parser_cuda_group_run() self.parser_cuda_group_run = get_parser_cuda_group_run()
self.workdir = tempfile.mkdtemp() self.workdir = tempfile.mkdtemp()
print(f'Source files will be saved in "{self.workdir}".') print(f'Source files will be saved in "{self.workdir}".')
self.profiler_paths: Dict[Profiler, Optional[str]] = {
Profiler.NCU: None,
Profiler.NSYS: None,
}
def _save_source( def _save_source(
self, source_name: str, source_code: str, group_name: str self, source_name: str, source_code: str, group_name: str
) -> None: ) -> None:
@@ -87,7 +99,10 @@ class NVCCPlugin(Magics):
shutil.rmtree(group_dirpath) shutil.rmtree(group_dirpath)
def _compile( def _compile(
self, group_name: str, executable_fname: str = DEFAULT_EXEC_FNAME self,
group_name: str,
executable_fname: str = DEFAULT_EXEC_FNAME,
compiler_args: str = "",
) -> str: ) -> str:
""" """
Compiles all source files in a given group together with all source Compiles all source files in a given group together with all source
@@ -97,6 +112,7 @@ class NVCCPlugin(Magics):
group_name: The name of the source file group to be compiled. group_name: The name of the source file group to be compiled.
executable_fname: The output executable file name. Defaults to executable_fname: The output executable file name. Defaults to
"cuda_exec.out". "cuda_exec.out".
compiler_args: The optional "nvcc" compiler arguments.
Raises: Raises:
RuntimeError: If the group does not exist or if it does not have any RuntimeError: If the group does not exist or if it does not have any
@@ -121,27 +137,52 @@ class NVCCPlugin(Magics):
executable_fpath = os.path.join(group_dirpath, executable_fname) executable_fpath = os.path.join(group_dirpath, executable_fname)
args = [ args = ["nvcc"]
"nvcc", args.extend(compiler_args.split())
"-I" + shared_dirpath + "," + group_dirpath, args.append("-I" + shared_dirpath + "," + group_dirpath)
]
args.extend(source_files) args.extend(source_files)
args.extend( args.extend(["-o", executable_fpath, "-Wno-deprecated-gpu-targets"])
[
"-o",
executable_fpath,
"-Wno-deprecated-gpu-targets",
]
)
subprocess.check_output(args, stderr=subprocess.STDOUT) subprocess.check_output(args, stderr=subprocess.STDOUT)
return executable_fpath return executable_fpath
def _run( def _get_profiler_path(self, profiler: Profiler) -> str:
"""
Get the path of the executable of a given profiling tool. Searches
the directories of the PATH environment variable and some extra
directories where CUDA is usually installed.
Args:
profiler: The profiler whose executable should be found.
Raises:
RuntimeError: If the profiler executable could not be found.
Returns:
The file path of the executable.
"""
profiler_path = self.profiler_paths[profiler]
if profiler_path is not None:
return profiler_path
profiler_path = find_executable(profiler.value, CUDA_SEARCH_PATHS)
if profiler_path is None:
raise RuntimeError(
f'Could not find the "{profiler.value}" profiling tool.'
" Consider searching for where it is installed and adding its"
" directory to the PATH environment variable."
)
self.profiler_paths[profiler] = profiler_path
return profiler_path
def _run( # pylint: disable=too-many-arguments
self, self,
exec_fpath: str, exec_fpath: str,
timeit: bool = False, timeit: bool = False,
profile: bool = False, profile: bool = False,
profiler: Profiler = Profiler.NCU,
profiler_args: str = "", profiler_args: str = "",
) -> str: ) -> str:
""" """
@@ -152,8 +193,9 @@ class NVCCPlugin(Magics):
timeit: If True, returns the result of the "timeit" magic instead timeit: If True, returns the result of the "timeit" magic instead
of the standard output of the CUDA process. Defaults to False. of the standard output of the CUDA process. Defaults to False.
profile: If True, the executable is profiled with NVIDIA Nsight profile: If True, the executable is profiled with NVIDIA Nsight
Compute profiling tool and its output is added to stdout. Compute or NVIDIA Nsight Systems and the profiling output is
Defaults to False. added to stdout. Defaults to False.
profiler: The profiling tool to use.
profiler_args: The profiler arguments used to customize the profiler_args: The profiler arguments used to customize the
information gathered by it and its overall behaviour. Defaults information gathered by it and its overall behaviour. Defaults
to an empty string. to an empty string.
@@ -175,7 +217,8 @@ class NVCCPlugin(Magics):
         else:
             run_args = []
         if profile:
-            run_args.extend(["ncu"] + profiler_args.split())
+            profiler_path = self._get_profiler_path(profiler)
+            run_args.extend([profiler_path] + profiler_args.split())
         run_args.append(exec_fpath)
         output = subprocess.check_output(
             run_args, stderr=subprocess.STDOUT
@@ -188,12 +231,16 @@ class NVCCPlugin(Magics):
         self, group_name: str, args: argparse.Namespace
     ) -> str:
         try:
-            exec_fpath = self._compile(group_name)
+            exec_fpath = self._compile(
+                group_name=group_name,
+                compiler_args=args.compiler_args(),
+            )
             output = self._run(
                 exec_fpath=exec_fpath,
                 timeit=args.timeit,
                 profile=args.profile,
-                profiler_args=args.profiler_args,
+                profiler=args.profiler(),
+                profiler_args=args.profiler_args(),
             )
         except subprocess.CalledProcessError as e:
             output = e.output.decode("utf8")
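The plugin change above looks up the selected profiler once, caches its path, and then places it in front of the user's profiler arguments. A minimal Python sketch of that flow follows. The `Profiler` values, the `ProfilerLocator` class, and `build_run_args` are assumptions made for illustration, and `shutil.which` stands in for the package's `find_executable`; this is not the plugin's actual code.

```python
import shutil
from enum import Enum
from typing import Dict, List, Optional


class Profiler(Enum):
    # Assumed values: the executable names of the two Nsight CLI tools.
    NCU = "ncu"
    NSYS = "nsys"


class ProfilerLocator:
    """Hypothetical helper: finds each profiler once and caches its path."""

    def __init__(self) -> None:
        self.paths: Dict[Profiler, Optional[str]] = {p: None for p in Profiler}

    def get(self, profiler: Profiler) -> str:
        cached = self.paths[profiler]
        if cached is not None:
            return cached
        # shutil.which is a stand-in for the package's own lookup function.
        found = shutil.which(profiler.value)
        if found is None:
            raise RuntimeError(f'Could not find the "{profiler.value}" tool.')
        self.paths[profiler] = found
        return found


def build_run_args(
    exec_fpath: str,
    profiler_path: Optional[str] = None,
    profiler_args: str = "",
) -> List[str]:
    """Builds the argument list: [profiler, *profiler args, executable]."""
    run_args: List[str] = []
    if profiler_path is not None:
        # Plain whitespace splitting, as in the diff; quoted arguments that
        # contain spaces would be broken apart by this.
        run_args.extend([profiler_path] + profiler_args.split())
    run_args.append(exec_fpath)
    return run_args
```

Calling `build_run_args("./a.out", "/opt/ncu", "--metrics foo")` yields a list that can be passed directly to `subprocess.check_output`.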
+5 -3
@@ -69,6 +69,7 @@ exclude_lines = [
 [tool.isort]
 profile = "black"
+src_paths = ["nvcc4jupyter"] # tells isort where to find local modules to not consider them 3rd party libraries

 [tool.bandit]
 exclude_dirs = ["build","dist","tests","scripts"]
@@ -82,7 +83,8 @@ skips = ["B101", "B311", "B404", "B603"]
 [tool.black]
 line-length = 79
 fast = true
-experimental-string-processing = true
+preview = true
+enable-unstable-feature = ["string_processing"]

 [tool.coverage.run]
 branch = true
@@ -286,6 +288,6 @@ deprecated-modules="optparse,tkinter.tix"
 [tool.pylint.'EXCEPTIONS']
 overgeneral-exceptions= [
-    "BaseException",
-    "Exception"
+    "builtins.BaseException",
+    "builtins.Exception"
 ]
+47
@@ -0,0 +1,47 @@
+#include <cstdlib>
+#include <iostream>
+#include <set>
+#include <string>
+#include <iterator>
+#include <tuple>
+
+struct S {
+    int n;
+    std::string s;
+    float d;
+    bool operator<(const S& rhs) const
+    {
+        // compares n to rhs.n,
+        // then s to rhs.s,
+        // then d to rhs.d
+        return std::tie(n, s, d) < std::tie(rhs.n, rhs.s, rhs.d);
+    }
+};
+
+int main()
+{
+    std::set<S> mySet;
+
+    // pre C++17:
+    {
+        S value{42, "Test", 3.14};
+        std::set<S>::iterator iter;
+        bool inserted;
+
+        // unpacks the return val of insert into iter and inserted
+        std::tie(iter, inserted) = mySet.insert(value);
+
+        if (inserted)
+            std::cout << "Value was inserted\n";
+    }
+
+    // with C++17:
+    {
+        S value{100, "abc", 100.0};
+        const auto [iter, inserted] = mySet.insert(value);
+
+        if (inserted)
+            std::cout << "Value(" << iter->n << ", " << iter->s << ", ...) was inserted" << "\n";
+    }
+}
+8
@@ -0,0 +1,8 @@
+#include <opencv2/core.hpp>
+#include <iostream>
+
+int main(int argc, char** argv)
+{
+    std::cout << cv::getBuildInformation() << std::endl;
+    return 0;
+}
+29 -1
@@ -1,9 +1,11 @@
+import argparse
 import glob
 import os

 import pytest
 from IPython.core.interactiveshell import InteractiveShell

+from nvcc4jupyter.parsers import Profiler
 from nvcc4jupyter.plugin import NVCCPlugin
@@ -27,10 +29,25 @@ def fixtures_path(tests_path):
     return os.path.join(tests_path, "fixtures")


+@pytest.fixture(scope="session")
+def scripts_path(fixtures_path: str):
+    return os.path.join(fixtures_path, "scripts")
+
+
+@pytest.fixture(scope="session")
+def compiler_cpp_17_fpath(fixtures_path: str):
+    return os.path.join(fixtures_path, "compiler", "cpp_17.cu")
+
+
+@pytest.fixture(scope="session")
+def compiler_opencv_fpath(fixtures_path: str):
+    return os.path.join(fixtures_path, "compiler", "opencv.cu")
+
+
 @pytest.fixture(scope="session")
 def sample_magic_cu_line():
     # fmt: off
-    return '--profile --profiler-args "--metrics l1tex__t_sectors_pipe_lsu_mem_global_op_ld.sum"'  # noqa: E501
+    return '--profile --profiler-args "--metrics l1tex__t_sectors_pipe_lsu_mem_global_op_ld.sum" --compiler-args "--optimize 3"'  # noqa: E501
     # fmt: on
@@ -55,3 +72,14 @@ def multiple_source_fpaths(fixtures_path: str):
     pattern_h = os.path.join(fixtures_path, "multiple_files", "*.h")
     pattern_cu = os.path.join(fixtures_path, "multiple_files", "*.cu")
     return list(glob.glob(pattern_h)) + list(glob.glob(pattern_cu))
+
+
+@pytest.fixture(scope="session")
+def default_args():
+    return argparse.Namespace(
+        timeit=False,
+        profile=True,
+        profiler=lambda: Profiler.NCU,
+        profiler_args=lambda: "",
+        compiler_args=lambda: "",
+    )
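The `default_args` fixture stores callables rather than plain values, matching the `args.profiler_args()` and `args.compiler_args()` calls in the plugin diff. One way such lazily evaluated defaults could be wired into `argparse` is sketched below, so that changing a default later affects arguments the user did not pass. The `_DEFAULTS` store, `make_parser`, and this version of `set_defaults` are hypothetical stand-ins, not the implementation in `nvcc4jupyter.parsers`.

```python
import argparse
from typing import Callable

# Hypothetical module-level store holding the current default values.
_DEFAULTS = {"profiler_args": "", "compiler_args": ""}


def set_defaults(**kwargs: str) -> None:
    """Overwrites the stored defaults; unknown names are rejected."""
    for key, value in kwargs.items():
        if key not in _DEFAULTS:
            raise KeyError(key)
        _DEFAULTS[key] = value


def _constant(value: str) -> Callable[[], str]:
    """Wraps a command-line value so it is read the same way as a default."""
    return lambda: value


def make_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    for name in _DEFAULTS:
        parser.add_argument(
            "--" + name.replace("_", "-"),
            type=_constant,
            # The default is a callable that reads the store when invoked,
            # so later set_defaults() calls are picked up. argparse only
            # applies `type` to string defaults, so this passes through.
            default=lambda name=name: _DEFAULTS[name],
        )
    return parser
```

With this shape, `make_parser().parse_args([]).profiler_args()` returns whatever default is stored at call time, while an explicit `--profiler-args` value always wins.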
Vendored Executable
+7
@@ -0,0 +1,7 @@
+#!/bin/bash
+
+echo "[NCU]"
+
+# this is a mock of nsight compute cli tool that just executes the program
+# given as the last argument
+"${@: -1}"
Vendored Executable
+7
@@ -0,0 +1,7 @@
+#!/bin/bash
+
+echo "[NSYS]"
+
+# this is a mock of nsight systems cli tool that just executes the program
+# given as the last argument
+"${@: -1}"
+3
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+echo "This is just used to test the path_utils.find_executable function"
+16
@@ -0,0 +1,16 @@
+import os
+
+from nvcc4jupyter.path_utils import find_executable
+
+
+def test_which():
+    assert find_executable("ls") == "/usr/bin/ls"
+
+
+def test_find_executable(fixtures_path: str):
+    exec_path = find_executable("searchforme", [fixtures_path])
+    assert exec_path is not None
+    exec_dir, exec_fname = os.path.split(exec_path)
+    assert exec_fname == "searchforme"
+    assert os.path.basename(exec_dir) == "scripts"
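The two tests above suggest that `find_executable` first behaves like `which` and then searches the given directories, including subdirectories, since `searchforme` is found under `scripts` when only the fixtures directory is passed. A rough sketch of a function with that behaviour is shown below. The name `find_executable_sketch` and the recursive `os.walk` strategy are assumptions inferred from the tests, not the code in `nvcc4jupyter.path_utils`.

```python
import os
import shutil
from typing import Iterable, Optional


def find_executable_sketch(
    name: str, extra_dirs: Iterable[str] = ()
) -> Optional[str]:
    """Returns the path of an executable called `name`, or None."""
    # First try the directories listed in the PATH environment variable.
    found = shutil.which(name)
    if found is not None:
        return found
    # Then walk each extra directory tree looking for an executable file.
    for root_dir in extra_dirs:
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            candidate = os.path.join(dirpath, name)
            if name in filenames and os.access(candidate, os.X_OK):
                return candidate
    return None
```

Checking `os.access(candidate, os.X_OK)` keeps the sketch from returning a data file that merely shares the tool's name.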
+118 -23
@@ -1,27 +1,26 @@
-import argparse
 import math
 import os
 import re
 import shutil
+import subprocess
+from argparse import ArgumentParser, Namespace
+from copy import deepcopy
 from typing import List

 import pytest

+from nvcc4jupyter.parsers import Profiler, get_parser_cuda, set_defaults
 from nvcc4jupyter.plugin import NVCCPlugin


-def check_profiler_output(output: str):
-    # the profiler output will be a line of "Hello World!" along with some
-    # warning lines which start with "==WARNING=="
+def check_profiler_output(output: str, profiler: str = "[NCU]"):
+    # the output from the profiler will first be a line containing only
+    # "[NCU]" or "[NSYS]" depending on what profiler was used and another
+    # line containing the string "Hello World!"
     lines = output.strip().split("\n")
-    warn_count = 0
-    for line in lines:
-        if not line.startswith("==WARNING=="):
-            assert line == "Hello World!"
-        else:
-            warn_count += 1
-    assert warn_count >= 1
-    assert warn_count == len(lines) - 1
+    assert len(lines) >= 2
+    assert lines[0] == profiler
+    assert lines[1] == "Hello World!"
def copy_source_to_group( def copy_source_to_group(
@@ -36,11 +35,19 @@ def copy_source_to_group(
     return destination_fpath


+@pytest.fixture(autouse=True, scope="session")
+def before_all(scripts_path: str):
+    os.environ["PATH"] = scripts_path + os.pathsep + os.environ["PATH"]
+
+
 @pytest.fixture(autouse=True, scope="function")
 def before_each(plugin: NVCCPlugin):
-    shutil.rmtree(plugin.workdir, ignore_errors=True)  # before test
+    # BEFORE TESTS
+    set_defaults(profiler=Profiler.NCU, compiler_args="", profiler_args="")
+    shutil.rmtree(plugin.workdir, ignore_errors=True)
     yield
-    pass  # after test
+    # AFTER TESTS
+    pass


 def test_save_source(plugin: NVCCPlugin, sample_cuda_code: str) -> None:
@@ -88,6 +95,49 @@ def test_compile(
         plugin._compile(gname)


+def test_compile_args(
+    plugin: NVCCPlugin,
+    compiler_cpp_17_fpath: str,
+    default_args: Namespace,
+):
+    gname = "test_compile_args"
+    copy_source_to_group(compiler_cpp_17_fpath, gname, plugin.workdir)
+
+    exec_fpath = plugin._compile(gname, compiler_args="--std c++17")
+    assert os.path.exists(exec_fpath)
+
+    # should fail due to the source file having c++ 17 features
+    with pytest.raises(subprocess.CalledProcessError):
+        exec_fpath = plugin._compile(gname, compiler_args="--std c++14")
+
+    args = deepcopy(default_args)
+    args.compiler_args = lambda: "--std c++14"
+    output = plugin._compile_and_run(group_name=gname, args=args)
+    assert "errors detected in the compilation of" in output
+
+
+def test_compile_opencv(
+    plugin: NVCCPlugin,
+    compiler_opencv_fpath: str,
+    default_args: Namespace,
+):
+    gname = "test_compile_opencv"
+    copy_source_to_group(compiler_opencv_fpath, gname, plugin.workdir)
+
+    # check that "pkg-config" exists
+    assert subprocess.check_call(["which", "pkg-config"]) == 0
+
+    pkg_config_args = ["pkg-config", "--cflags", "--libs", "opencv4"]
+    opencv_compile_options = (
+        subprocess.check_output(args=pkg_config_args).decode().strip()
+    )
+
+    args = deepcopy(default_args)
+    args.compiler_args = lambda: opencv_compile_options
+    output = plugin._compile_and_run(group_name=gname, args=args)
+    assert "General configuration for OpenCV" in output
+
+
 def test_run(
     plugin: NVCCPlugin,
     sample_cuda_fpath: str,
@@ -133,7 +183,9 @@ def test_run_profile(plugin: NVCCPlugin, sample_cuda_fpath: str):
 def test_compile_and_run_multiple_files(
-    plugin: NVCCPlugin, multiple_source_fpaths: List[str]
+    plugin: NVCCPlugin,
+    multiple_source_fpaths: List[str],
+    default_args: Namespace,
 ):
     """
     Compiles and executes 3 cuda source files from
@@ -142,14 +194,14 @@ def test_compile_and_run_multiple_files(
     gname = "test_compile_and_run_multiple_files"
     for fpath in multiple_source_fpaths:
         copy_source_to_group(fpath, gname, plugin.workdir)
-    output = plugin._compile_and_run(
-        gname, argparse.Namespace(timeit=False, profile=True, profiler_args="")
-    )
+    output = plugin._compile_and_run(group_name=gname, args=default_args)
     check_profiler_output(output)


 def test_compile_and_run_multiple_files_shared(
-    plugin: NVCCPlugin, multiple_source_fpaths: List[str]
+    plugin: NVCCPlugin,
+    multiple_source_fpaths: List[str],
+    default_args: Namespace,
 ):
     """
     Compiles and executes 3 cuda source files from
@@ -164,14 +216,12 @@ def test_compile_and_run_multiple_files_shared(
             copy_source_to_group(fpath, gname, plugin.workdir)
         else:
             copy_source_to_group(fpath, "shared", plugin.workdir)
-    output = plugin._compile_and_run(
-        gname, argparse.Namespace(timeit=False, profile=True, profiler_args="")
-    )
+    output = plugin._compile_and_run(group_name=gname, args=default_args)
     check_profiler_output(output)


 def test_read_args(plugin: NVCCPlugin):
-    parser = argparse.ArgumentParser()
+    parser = ArgumentParser()
     parser.add_argument("-a", type=str, required=True)
     parser.add_argument("-b", type=float, required=True)
     args = plugin._read_args(
@@ -181,6 +231,29 @@ def test_read_args(plugin: NVCCPlugin):
     assert math.isclose(args.b, 0.75)


+def test_set_defaults():
+    parser = get_parser_cuda()
+    args = parser.parse_args([])
+    assert args.profiler_args() == ""
+    assert args.compiler_args() == ""
+    set_defaults(profiler_args="123")
+    args = parser.parse_args([])
+    assert args.profiler_args() == "123"
+    assert args.compiler_args() == ""
+    set_defaults(compiler_args="456")
+    args = parser.parse_args([])
+    assert args.profiler_args() == "123"
+    assert args.compiler_args() == "456"
+    set_defaults(profiler_args="")
+    args = parser.parse_args([])
+    assert args.profiler_args() == ""
+    assert args.compiler_args() == "456"
+    set_defaults(profiler_args="123")
+    args = parser.parse_args(["--profiler-args", "789"])
+    assert args.profiler_args() == "789"
+    assert args.compiler_args() == "456"
+
+
 def test_magic_cuda(
     capsys,
     plugin: NVCCPlugin,
@@ -191,6 +264,28 @@ def test_magic_cuda(
     check_profiler_output(capsys.readouterr().out)


+def test_magic_cuda_set_default_profiler(
+    capsys,
+    plugin: NVCCPlugin,
+    sample_cuda_code: str,
+    sample_magic_cu_line: str,
+):
+    # set the default profiler to Nsight Systems
+    set_defaults(profiler=Profiler.NSYS)
+    plugin.cuda(sample_magic_cu_line, sample_cuda_code)
+    check_profiler_output(capsys.readouterr().out, profiler="[NSYS]")
+
+
+def test_magic_cuda_bad_args(
+    capsys,
+    plugin: NVCCPlugin,
+    sample_cuda_code: str,
+):
+    plugin.cuda("--this-is-an-unrecognized-argument", sample_cuda_code)
+    output = capsys.readouterr().out
+    assert output.startswith("usage: ")
+
+
 def test_magic_cuda_group_save(plugin: NVCCPlugin, sample_cuda_code: str):
     gname = "test_save_source"
     sname = "sample.cu"