qgpureset(1B)                         PBS                        qgpureset(1B)



NAME
       qgpureset - reset GPU error counts

SYNOPSIS
       qgpureset -H host -g gpuid -p -v

DESCRIPTION
       The qgpureset command asks a MOM to reset the ECC error counts on one
       of its NVIDIA GPUs. The reset is performed by sending a GPU Control
       batch request to the batch server.

       Resetting GPU error counts requires PBS Operator or Manager privilege.
       It also requires that Torque be configured with --enable-nvidia-gpu.

OPTIONS
       -H host   Specifies the host within the cluster on which the GPU is
                 located. The argument is the name of a host that is a
                 member of the cluster of hosts managed by the server.

       -g gpuid  Specifies the ID of the GPU.

       -p        Resets the GPU's permanent ECC error count.

       -v        Resets the GPU's volatile ECC error count.

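       As an illustration of how the options above combine into a single
       command line, the following minimal Python sketch assembles the
       argument list. The helper name build_qgpureset_argv, its parameters,
       and the values "node01" and "0" are hypothetical; only the flags -H,
       -g, -p and -v come from this page. The sketch only prints the command
       and does not run it.

```python
import shlex


def build_qgpureset_argv(host, gpuid, permanent=False, volatile=False):
    """Build an argument list for qgpureset from the documented options.

    The helper name and parameters are hypothetical; only the flags
    -H, -g, -p and -v are taken from the manual page.
    """
    argv = ["qgpureset", "-H", str(host), "-g", str(gpuid)]
    if permanent:
        argv.append("-p")  # reset the permanent ECC error count
    if volatile:
        argv.append("-v")  # reset the volatile ECC error count
    return argv


# "node01" and "0" are placeholder values, not taken from the manual page.
argv = build_qgpureset_argv("node01", "0", volatile=True)
print(shlex.join(argv))  # prints: qgpureset -H node01 -g 0 -v
```

       Keeping the arguments as a list rather than a single string avoids
       shell quoting problems if the list is later passed to a process
       launcher.
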
OPERANDS
       None

STANDARD ERROR
       The qgpureset command writes a diagnostic message to standard error
       for each error occurrence.

EXIT STATUS
       Upon successful processing of all the operands presented to the
       qgpureset command, the exit status is zero.

       If the qgpureset command fails to process any operand, it exits with
       a value greater than zero.

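       A calling script can rely on the exit status convention described
       above: zero means success, anything greater means failure, and the
       diagnostics arrive on standard error. The Python sketch below shows
       one way to check both. The helper name run_and_check is hypothetical,
       and a child Python process stands in for qgpureset so that the sketch
       runs on a machine without Torque; its diagnostic text is invented for
       the demonstration.

```python
import subprocess
import sys


def run_and_check(argv):
    """Run a command and return (succeeded, diagnostics).

    succeeded is True only when the exit status is zero; diagnostics is
    whatever the command wrote to standard error. Hypothetical helper.
    """
    result = subprocess.run(argv, stderr=subprocess.PIPE, text=True)
    return result.returncode == 0, result.stderr


# Stand-in for a failing qgpureset call (an assumption for demonstration):
# a child Python process that writes one diagnostic line and exits with 1.
stand_in = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('stand-in diagnostic\\n'); sys.exit(1)",
]

ok, diagnostics = run_and_check(stand_in)
if not ok:
    print("reset failed:", diagnostics.strip())
```

       On a real cluster the stand-in list would be replaced by an actual
       qgpureset argument list, for example one beginning with "qgpureset",
       "-H" and a host name.
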
SEE ALSO
       pbs_mom(8B) and pbs_server(8B)



                                     Local                       qgpureset(1B)