sh5util(1)                      Slurm Commands                      sh5util(1)



NAME

       sh5util - Tool for merging HDF5 files from the acct_gather_profile
       plugin that gathers detailed data for jobs running under Slurm


SYNOPSIS

       sh5util


DESCRIPTION

       sh5util merges the HDF5 files produced on each node for each step of
       a job into one HDF5 file for the job. The resulting file can be
       viewed and manipulated by common HDF5 tools such as HDF5View,
       h5dump, h5edit, or h5ls.

       sh5util also has two extract modes. The first writes a limited set
       of data for specific nodes, steps, and data series in comma-separated
       value (CSV) form to a file that can be imported into other analysis
       tools such as spreadsheets.

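       As an illustrative sketch of this CSV-extract mode (the sh5util
       invocation is shown for context only; the file name and column names
       below are assumptions, not sh5util's exact output):

```shell
# Illustrative only: an -E extract might be produced with something like
#   sh5util -E -j 42 -N node01 -s Network -o net_42.csv
# The sample data below stands in for that output; the columns are assumed.
printf 'Time,Packets_In,Packets_Out\n0,100,90\n1,150,120\n' > net_42.csv
# Total the Packets_In column with awk, as a spreadsheet formula would.
awk -F, 'NR > 1 { s += $2 } END { print s }' net_42.csv
```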
       The second (Item-Extract) extracts one data item from one time
       series for all the samples on all the nodes from a job's HDF5
       profile.

       - Finds the sample with the maximum value of the item.

       - Writes a CSV file with the min, ave, max, and item totals for
         each node for each sample.

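       A sketch of how that per-node CSV might be post-processed (again,
       the sh5util command is for context only; the file name and column
       layout are assumptions):

```shell
# Illustrative only: an item-extract run might look like
#   sh5util -I -j 42 -s Energy -d Power -o power_42.csv
# The sample rows below stand in for that output (columns are assumed).
cat > power_42.csv <<'EOF'
Node,Sample,Min,Ave,Max,Total
node01,0,10.0,12.5,15.0,25.0
node02,0,11.0,13.0,16.5,26.0
EOF
# Find the overall maximum of the Max column across all nodes.
awk -F, 'NR > 1 && $5 > m { m = $5 } END { print m }' power_42.csv
```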

OPTIONS

       -L, --list
              Print the items of a series contained in a job file.

              List mode options:

              -i, --input=path
                        Merged file to extract from (default
                        ./job_$jobid.h5).

              -s, --series=[Energy | Filesystem | Network | Task]


       -E, --extract
              Extract a data series from a merged job file.

              Extract mode options:

              -i, --input=path
                        Merged file to extract from (default
                        ./job_$jobid.h5).

              -N, --node=nodename
                        Node name to extract (default is all).

              -l, --level=[Node:Totals | Node:TimeSeries]
                        Level to which the series is attached (default
                        Node:Totals).

              -s, --series=[Energy | Filesystem | Network | Task | Task_#]
                        Task is all tasks; Task_# (# is a task id) is a
                        single task (default is everything).


       -I, --item-extract
              Extract one data item from all samples of one data series
              from all nodes in a merged job file.

              Item-Extract mode options:

              -s, --series=[Energy | Filesystem | Network | Task]

              -d, --data
                        Name of the data item in the series (see the list
                        of data items per series below).


       -j, --jobs=<job(.step)>
              Merge this job/step (or a comma-separated list of job
              steps). This option is required. Not specifying a step will
              result in all steps found being processed.

       -h, --help
              Print this description of use.

       -o, --output=path
              Path to a file into which to write.
              Default for merge is ./job_$jobid.h5.
              Default for extract is ./extract_$jobid.csv.

       -p, --profiledir=dir
              Directory where the node-step files exist. The default is
              set in acct_gather.conf.

       -S, --savefiles
              Instead of removing the node-step files after merging them
              into the job file, keep them.

       --user=user
              User who profiled the job. (Handy for the root user;
              defaults to the user running this command.)

       --usage
              Display a brief usage message.

Data Items per Series

       Energy
              Power
              CPU_Frequency

       Filesystem
              Reads
              Megabytes_Read
              Writes
              Megabytes_Write

       Network
              Packets_In
              Megabytes_In
              Packets_Out
              Megabytes_Out

       Task
              CPU_Frequency
              CPU_Time
              CPU_Utilization
              RSS
              VM_Size
              Pages
              Read_Megabytes
              Write_Megabytes


PERFORMANCE

       Executing sh5util sends a remote procedure call to slurmctld. If
       enough calls from sh5util or other Slurm client commands that send
       remote procedure calls to the slurmctld daemon come in at once, it
       can result in a degradation of performance of the slurmctld daemon,
       possibly resulting in a denial of service.

       Do not run sh5util or other Slurm client commands that send remote
       procedure calls to slurmctld from loops in shell scripts or other
       programs. Ensure that programs limit calls to sh5util to the minimum
       necessary for the information you are trying to gather.

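       Since -j accepts a comma-separated list, several steps can be
       merged with a single call instead of a shell loop. A sketch (the
       job and step ids are made up):

```shell
# Build one comma-separated job.step list (here, steps 0..2 of job 42)
# so a single sh5util invocation replaces a per-step loop.
steps=$(printf '42.%s,' 0 1 2)
steps=${steps%,}            # drop the trailing comma
echo "$steps"               # 42.0,42.1,42.2
# sh5util -j "$steps"       # illustrative; requires a Slurm installation
```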

Examples

       Merge node-step files (as part of an sbatch script):

              sbatch -n1 -d$SLURM_JOB_ID --wrap="sh5util --savefiles -j $SLURM_JOB_ID"

       Extract all task data from a node:

              sh5util -j 42 -N snowflake01 --level=Node:TimeSeries --series=Task

       Extract all energy data:

              sh5util -j 42 -I --series=Energy --data=Power


COPYING

       Copyright (C) 2013 Bull.
       Copyright (C) 2013 SchedMD LLC.

       Slurm is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by
       the Free Software Foundation; either version 2 of the License, or
       (at your option) any later version.

       Slurm is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
       General Public License for more details.


SEE ALSO

December 2020                   Slurm Commands                      sh5util(1)