TensorBoard callback without profile_batch setting causes errors CUPTI_ERROR_INSUFFICIENT_PRIVILEGES and CUPTI_ERROR_INVALID_PARAMETER #35860
Comments
Facing the same problem.
Same issue, different code sample. OS Platform and Distribution: Ubuntu 18.04
Same (INSUFFICIENT PRIVILEGES):
OS Platform and Distribution: Ubuntu 18.04
CUDA/cuDNN version: 10.1
Host: ttmagpie_d99d3f105d0a
@gawain-git-code ,
@airMeng @kevin-hartman @eduardofv Are you guys satisfied with the answer from @oanush? I personally am not, because it was just telling me I can only use the profile_batch setting safely in Google Colab instead of on my own machine setup. It does not answer the root cause of why cupti_tracer is signalling the errors. But thank you @oanush for spending the time to help out.
I am still receiving the error but have moved to another environment. It may have to do with driver updates on Ubuntu. Will try to check again and get back to you.
The error states that
Maybe it's an error with the configuration itself.
Facing the same issue when capturing profile data with TensorBoard through gRPC. I tried the solution suggested by NVIDIA (enable non-privileged access to the profiling counters), to no avail. My training runs as root inside a container (which was the other solution suggested by NVIDIA).
I have the same issue when I use the official TensorFlow docker image (tensorflow/tensorflow:2.1.0-gpu-py3):
E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1307] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1329] function cupti_interface_->EnableCallback( 0 , subscriber_, CUPTI_CB_DOMAIN_DRIVER_API, cbid)failed with error CUPTI_ERROR_INVALID_PARAMETER
So I went back to the tensorflow/tensorflow:2.0.0-gpu-py3 image.
@dartlune Did going back to the tensorflow/tensorflow:2.0.0-gpu-py3 image help? I cannot save the model in either version of the docker image! The weird thing is that when running the model in a Jupyter notebook, it saves the model each iteration, but not with python3! Any suggestions?
@tamaramiteva Actually I had some errors with tensorflow/tensorflow:2.0.0-gpu-py3. LD_LIBRARY_PATH=:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
Any update on this problem?
This is due to an NVIDIA CUPTI library API change: https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-cupti
Also note that the TensorBoard profiler plugin was broken by the Chrome 80 update, see tensorflow/tensorboard#3209. The suggested workaround works: run Chrome with the --enable-blink-features=ShadowDOMV0,CustomElementsV0,HTMLImports flags, like:
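(The exact command was not captured above; presumably it looks something like the following, where the google-chrome binary name is an assumption that depends on your platform and install:)

```bash
# Launch Chrome with the legacy Shadow DOM / HTML Imports features re-enabled
google-chrome --enable-blink-features=ShadowDOMV0,CustomElementsV0,HTMLImports
```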
Adding
I have tested the above-mentioned solutions. It seems there is no quick way out of the dilemma of the error "CUPTI_ERROR_INSUFFICIENT_PRIVILEGES".
1. The ad-hoc solution: Even though NVIDIA gives the temporary "CAP_SYS_ADMIN" workaround, it is an ad-hoc solution. It sometimes works and sometimes does not.
$ python abc.py --cap-add=CAP_SYS_ADMIN
2. LD_LIBRARY_PATH is not reliable: The following settings sometimes work and sometimes do not.
LD_LIBRARY_PATH=:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
or
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.1/extras/CUPTI/lib64
3. No "/etc/modprobe.d/nvidia-kernel-common.conf": My modprobe.d directory does not include the path "/etc/modprobe.d/nvidia-kernel-common.conf", so I could not add "NVreg_RestrictProfilingToAdminUsers=0" to it.
NVIDIA gives a quite valuable explanation of the error, so it is quite strange.
Thanks, it worked for me.
None of the solutions offered here or anywhere else has worked for me. Perhaps it would work if one upgrades from Ubuntu 16.04 to Ubuntu 18.04, but since I'm on a shared server, it may take some time to do the upgrade. I have not tried docker yet. OS Platform and Distribution: Ubuntu 16.04
I am having the same error in an Anaconda environment. None of the solutions posted above work for me. Does anyone have any idea what can be done?
TensorFlow uses the NVIDIA-provided libcupti for GPU tracing support. However, since CUDA 10 that functionality requires the CAP_SYS_ADMIN privilege, or you have to change /etc/modprobe.d/nvidia-kernel-common.conf (which also requires sudo, but only once). I believe NVIDIA enforces this restriction because some research papers showed you can steal user secrets by probing performance counters.
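For reference, a sketch of that modprobe.d change on Ubuntu/Debian-style systems, following NVIDIA's ERR_NVGPUCTRPERM guidance linked earlier in this thread (the exact file name and initramfs command vary by distribution, so treat these as assumptions):

```bash
# Allow non-admin users to read GPU performance counters (needs sudo once)
echo 'options nvidia "NVreg_RestrictProfilingToAdminUsers=0"' | \
    sudo tee /etc/modprobe.d/nvidia-kernel-common.conf
# Rebuild the initramfs so the module option is picked up, then reboot
sudo update-initramfs -u
sudo reboot
```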
Hey @trisolaran, thanks for the brief intro. The thing is I do not have such a file as /etc/modprobe.d/nvidia-kernel-common.conf. I am using a conda environment.
@SarfarazHabib Hi, I am using a conda environment too and I solved this problem by adding
@kunihik0 Thanks a lot for the help. The error is now gone, but my training gets stuck after a random number of epochs. I am using TensorFlow 2.3 for now on Ubuntu 18.04. Can anyone point me in any direction with respect to this new problem?
This solution works great for Windows 10 systems. But what about Windows Server 2019? It seems that now Microsoft requires you to get NVIDIA Control Panel from the Microsoft Store, but that is not available on Windows Server 2019. Is there an alternative way to allow these permissions on Windows Server 2019?
This works for me after switching from conda to virtualenv, and I also need to use
OS Platform and Distribution: Ubuntu 16.04
Now I can profile with --profile_steps=1000,1005, for example (5 steps), but if I increase it to 10, a non-deterministic segfault appears. Not sure whether this has happened to anyone else?
Yes, I get that segfault too – I think it's because the overhead of profiling, on top of regular GPU computations, causes GPU memory overflow.
In order to run docker:
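(The original command was not captured here; a plausible sketch, assuming the NVIDIA Container Toolkit is installed and using the TF image mentioned above; train.py is a placeholder script name:)

```bash
# Run the TF GPU image with the SYS_ADMIN capability so CUPTI can access
# GPU performance counters inside the container
docker run --rm -it \
  --gpus all \
  --cap-add=SYS_ADMIN \
  -v "$PWD":/workspace -w /workspace \
  tensorflow/tensorflow:2.1.0-gpu-py3 \
  python train.py
```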
@vlasenkoalexey Do you mean that the NVIDIA CUPTI library version with the API change results in the error? Will an older version, whose API did not change, run normally?
The CUPTI library is part of CUDA; before CUDA 10.x, profiling didn't require admin privileges. See the NVIDIA doc for details: https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-cupti
@vlasenkoalexey But CUDA 10.x also has the privileges problem on my local machines, as do many people in this issue. My local configuration: Ubuntu 18.04, Python 3.7, CUDA 10.1 / CUDA 10.2 (two machines)
Was able to reproduce the issue in TF v2.5, please find the gist here. Thanks!
I also met this problem. My OS is CentOS 7; adding the conf file under /etc/modprobe.d/ and then rebuilding the initial RAM disk by
Hi @gawain-git-code, could you look at this thread for an answer?
This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
How about CPU training?
I'm running a TensorFlow application in a Docker container on a Windows machine with WSL2. I get the following errors:
I changed the /etc/modprobe.d/nvidia-kernel-common.conf file as suggested and run the docker container as the root user.
System information
Describe the current behavior
When using tf.keras.callbacks.TensorBoard() without the profile_batch setting, it emits CUPTI_ERROR_INSUFFICIENT_PRIVILEGES and CUPTI_ERROR_INVALID_PARAMETER errors from tensorflow/core/profiler/internal/gpu/cupti_tracer.cc.
Describe the expected behavior
With profile_batch = 0, these two errors are gone.
But they come back when profile_batch = 1 or other non-zero values.
Code to reproduce the issue
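(The original snippet was not captured here. Below is a minimal sketch that matches the log further down: 800 training samples, 200 validation samples, 5 epochs, default profile_batch. The model, data, and log directory are assumptions, not the reporter's exact code.)

```python
import numpy as np
import tensorflow as tf

# Assumed synthetic data: 1000 samples split 800 train / 200 validation
x = np.random.rand(1000, 10).astype("float32")
y = np.random.rand(1000, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Leaving profile_batch at its default triggers the CUPTI errors on affected
# setups; passing profile_batch=0 disables profiling and silences them.
tb_cb = tf.keras.callbacks.TensorBoard(log_dir="./logs")

model.fit(x, y, epochs=5, validation_split=0.2, callbacks=[tb_cb])
```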
Other info / logs
Train on 800 samples, validate on 200 samples
2020-01-14 21:30:27.591905: I tensorflow/core/profiler/lib/profiler_session.cc:225] Profiler session started.
2020-01-14 21:30:27.594743: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1259] Profiler found 1 GPUs
2020-01-14 21:30:27.599172: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cupti64_101.dll
2020-01-14 21:30:27.704083: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1307] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2020-01-14 21:30:27.716790: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1346] function cupti_interface_->ActivityRegisterCallbacks( AllocCuptiActivityBuffer, FreeCuptiActivityBuffer)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
Epoch 1/5
2020-01-14 21:30:28.370429: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-01-14 21:30:28.651767: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-01-14 21:30:29.662864: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1329] function cupti_interface_->EnableCallback( 0 , subscriber_, CUPTI_CB_DOMAIN_DRIVER_API, cbid)failed with error CUPTI_ERROR_INVALID_PARAMETER
2020-01-14 21:30:29.670282: I tensorflow/core/profiler/internal/gpu/device_tracer.cc:88] GpuTracer has collected 0 callback api events and 0 activity events.
800/800 [==============================] - 5s 6ms/sample - loss: 0.0011 - val_loss: 0.0011
Epoch 2/5
800/800 [==============================] - 3s 4ms/sample - loss: 8.5921e-04 - val_loss: 0.0010
Epoch 3/5
800/800 [==============================] - 3s 3ms/sample - loss: 8.5613e-04 - val_loss: 0.0010
Epoch 4/5
800/800 [==============================] - 3s 4ms/sample - loss: 8.5458e-04 - val_loss: 9.9713e-04
Epoch 5/5
800/800 [==============================] - 3s 4ms/sample - loss: 8.5345e-04 - val_loss: 9.8825e-04