pbs_gpureset(3B)                    Local                     pbs_gpureset(3B)

NAME

       pbs_gpureset - reset GPU error counts

SYNOPSIS

       #include <pbs_error.h>
       #include <pbs_ifl.h>

       int pbs_gpureset(int connect, char *mom_node, int gpu_id, int ecc_perm, int ecc_vol)

DESCRIPTION

       Issue a batch request for the pbs_mom to reset the ECC error counts on
       one of its NVIDIA GPUs.  The GPU's error count is reset by sending a
       GPU Control batch request to the batch server.

       The argument, mom_node, specifies the host within the cluster on which
       the GPU is located.  It is the name of a host that is a member of the
       cluster of hosts managed by the server.

       The argument, gpu_id, specifies the ID of the GPU on the MOM node.

       The argument, ecc_perm, specifies whether to reset the GPU's permanent
       ECC error count.  A value of 1 resets the count; a value of 0 leaves
       it unchanged.

       The argument, ecc_vol, specifies whether to reset the GPU's volatile
       ECC error count.  A value of 1 resets the count; a value of 0 leaves
       it unchanged.

       This call requires PBS Operator or Manager privilege.  It also requires
       that Torque be configured with --enable-nvidia-gpu.

SEE ALSO

       qgpureset(1B)

DIAGNOSTICS

       When the batch request generated by the pbs_gpureset() function has
       been completed successfully by a batch server, the routine returns 0
       (zero).  Otherwise, a nonzero error number is returned.  The error
       number is also set in pbs_errno.
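
EXAMPLES

       The argument rules in DESCRIPTION and the return convention above can
       be sketched as a small wrapper.  This is an illustration under stated
       assumptions, not part of the PBS API: reset_gpu_ecc(), gpureset_fn and
       stub_gpureset() are hypothetical names, the -1 return value is a local
       sentinel chosen for this sketch, and rejecting a negative gpu_id is an
       assumption about valid GPU IDs.  The reset routine is passed in as a
       function pointer so that the sketch compiles without the Torque
       headers.

```c
#include <stddef.h>

/* Pointer type matching the pbs_gpureset() prototype shown in SYNOPSIS. */
typedef int (*gpureset_fn)(int connect, char *mom_node, int gpu_id,
                           int ecc_perm, int ecc_vol);

/*
 * Hypothetical helper (not part of the PBS API).
 * Checks the arguments locally, then forwards them to `reset`.
 * Returns -1 (a local sentinel, assumed here) if an argument is rejected
 * before any request is issued; otherwise returns whatever `reset`
 * returns, which for pbs_gpureset() is 0 on success and nonzero on error.
 */
int reset_gpu_ecc(gpureset_fn reset, int connect, char *mom_node,
                  int gpu_id, int ecc_perm, int ecc_vol)
{
    if (reset == NULL || mom_node == NULL || mom_node[0] == '\0')
        return -1;

    /* Assumption: GPU IDs on a MOM node are never negative. */
    if (gpu_id < 0)
        return -1;

    /* Each flag is documented as 1 (reset) or 0 (do not reset). */
    if ((ecc_perm != 0 && ecc_perm != 1) || (ecc_vol != 0 && ecc_vol != 1))
        return -1;

    return reset(connect, mom_node, gpu_id, ecc_perm, ecc_vol);
}

/* Stand-in for pbs_gpureset() so the sketch can be tried without a server. */
static int stub_calls = 0;

int stub_gpureset(int connect, char *mom_node, int gpu_id,
                  int ecc_perm, int ecc_vol)
{
    (void)connect; (void)mom_node; (void)gpu_id;
    (void)ecc_perm; (void)ecc_vol;
    stub_calls++;
    return 0; /* mimic a successfully completed batch request */
}
```

       In a real program the first argument would be pbs_gpureset itself,
       and connect would be a handle obtained from pbs_connect(3B) and later
       released with pbs_disconnect(3B).  A nonzero result other than the
       local sentinel can then be compared with pbs_errno.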

                                                              pbs_gpureset(3B)