sh5util(1) Slurm Commands sh5util(1)

sh5util - Tool for merging HDF5 files from the acct_gather_profile
plugin that gathers detailed data for jobs running under Slurm

sh5util

sh5util merges HDF5 files produced on each node for each step of a job
into one HDF5 file for the job. The resulting file can be viewed and
manipulated by common HDF5 tools such as HDFView, h5dump, h5edit, or
h5ls.

sh5util also has two extract modes. The first writes a limited set of
data for specific nodes, steps, and data series in "comma-separated
value" (CSV) form to a file that can be imported into other analysis
tools such as spreadsheets.

The second (Item-Extract) extracts one data item from one time series
for all the samples on all the nodes from a job's HDF5 profile. It:

- Finds the sample with the maximum value of the item.

- Writes a CSV file with min, ave, max, and item totals for each node
  for each sample.
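
The per-sample statistics named in the list above can be illustrated with a short awk sketch. This shows only the arithmetic (min, ave, max, and total across nodes for each sample); it is not sh5util's implementation, and the input layout of "sample,node,value" rows as well as the helper name sample_stats are assumptions made for illustration.

```shell
# sample_stats: hypothetical helper. Reads "sample,node,value" rows on
# stdin (an assumed layout) and prints "sample,min,ave,max,total" per sample.
sample_stats() {
    awk -F, '
    {
        s = $1; v = $3 + 0
        # First time this sample is seen: initialise min/max, remember order.
        if (!(s in n)) { min[s] = v; max[s] = v; order[++k] = s }
        if (v < min[s]) min[s] = v
        if (v > max[s]) max[s] = v
        sum[s] += v; n[s]++
    }
    END {
        for (i = 1; i <= k; i++) {
            s = order[i]
            printf "%s,%g,%g,%g,%g\n", s, min[s], sum[s] / n[s], max[s], sum[s]
        }
    }'
}

# Two samples, two hypothetical nodes each.
printf '%s\n' '0,node01,10' '0,node02,30' '1,node01,5' '1,node02,15' | sample_stats
# prints:
#   0,10,20,30,40
#   1,5,10,15,20
```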

-L, --list
    Print the items of a series contained in a job file.

    List mode options

    -i, --input=path
        Merged file to extract from (default ./job_$jobid.h5)

    -s, --series=[Energy | Filesystem | Network | Task]

-E, --extract
    Extract data series from a merged job file.

    Extract mode options

    -i, --input=path
        Merged file to extract from (default ./job_$jobid.h5)

    -N, --node=nodename
        Node name to extract (default is all)

    -l, --level=[Node:Totals | Node:TimeSeries]
        Level to which the series is attached (default Node:Totals)

    -s, --series=[Energy | Filesystem | Network | Task | Task_#]
        Task is all tasks; Task_# selects a single task, where # is a
        task id (default is everything)

-I, --item-extract
    Extract one data item from all samples of one data series from
    all nodes in a merged job file.

    Item-Extract mode options

    -s, --series=[Energy | Filesystem | Network | Task]

    -d, --data
        Name of data item in series (see the data items listed per
        series below).

-j, --jobs=<job(.step)>
    Format is <job(.step)>. Merge this job/step (or a comma-separated
    list of job steps). This option is required. If no step is
    specified, all steps found are processed.
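
The <job(.step)> format can be illustrated by splitting a specification into its two parts. This is a minimal sketch using POSIX parameter expansion; the helper name parse_jobstep and the word "all" used for a missing step are illustrative choices, not sh5util behavior.

```shell
# parse_jobstep: hypothetical helper that splits "job" or "job.step".
parse_jobstep() {
    spec=$1
    job=${spec%%.*}              # text before the first dot
    case $spec in
        *.*) step=${spec#*.} ;;  # text after the first dot
        *)   step=all ;;         # no step given: all steps are processed
    esac
    printf 'job=%s step=%s\n' "$job" "$step"
}

parse_jobstep 42.3   # prints: job=42 step=3
parse_jobstep 42     # prints: job=42 step=all
```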

-h, --help
    Print this description of use.

-o, --output=path
    Path to a file into which to write.
    Default for merge is ./job_$jobid.h5
    Default for extract is ./extract_$jobid.csv

-p, --profiledir=dir
    Directory where node-step files exist. The default is set in
    acct_gather.conf.

-S, --savefiles
    Keep the node-step files after merging them into the job file
    instead of removing them.

--user=user
    User who profiled the job. (Handy for the root user; defaults to
    the user running this command.)

--usage
    Display brief usage message.

Data items per series:

Energy
    Power
    CPU_Frequency

Filesystem
    Reads
    Megabytes_Read
    Writes
    Megabytes_Write

Network
    Packets_In
    Megabytes_In
    Packets_Out
    Megabytes_Out

Task
    CPU_Frequency
    CPU_Time
    CPU_Utilization
    RSS
    VM_Size
    Pages
    Read_Megabytes
    Write_Megabytes

Executing sh5util sends a remote procedure call to slurmctld. If enough
calls from sh5util or other Slurm client commands that send remote
procedure calls to the slurmctld daemon come in at once, the performance
of the slurmctld daemon can degrade, possibly resulting in a denial of
service.

Do not run sh5util or other Slurm client commands that send remote
procedure calls to slurmctld from loops in shell scripts or other
programs. Ensure that programs limit calls to sh5util to the minimum
necessary for the information you are trying to gather.
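
One way to avoid calling sh5util in a loop is to pass several jobs in a single comma-separated -j list, which the -j option description allows. The sketch below only builds and prints such a command line; the helper name join_jobs and the job IDs are hypothetical, and nothing is executed.

```shell
# join_jobs: hypothetical helper that joins its arguments with commas.
# The function body runs in a subshell so the IFS change stays local.
join_jobs() (
    IFS=,
    printf '%s\n' "$*"
)

# Print (do not run) one invocation that covers several jobs or job steps.
echo "sh5util -j $(join_jobs 42 43.0 44)"
# prints: sh5util -j 42,43.0,44
```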

Merge node-step files (as part of an sbatch script)

    sbatch -n1 -d$SLURM_JOB_ID --wrap="sh5util --savefiles -j $SLURM_JOB_ID"

Extract all task data from a node

    sh5util -j 42 -N snowflake01 --level=Node:TimeSeries --series=Task

Extract all energy data

    sh5util -j 42 --series=Energy --data=power
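
The modes can also be strung together into a merge, list, and extract sequence. The following is a dry-run sketch: the run helper is hypothetical and only prints each command, so it works on a machine without Slurm. The flag combinations are assembled from the option lists in this page and have not been checked against a particular sh5util version; the job ID and node name are placeholders.

```shell
# run: hypothetical dry-run wrapper that prints the command line.
# To execute for real, change the function body to: "$@"
run() { printf '%s\n' "$*"; }

jobid=42   # placeholder job ID

run sh5util -j "$jobid"                      # merge node-step files
run sh5util -L -j "$jobid" -s Energy         # list items of the Energy series
run sh5util -E -j "$jobid" -N snowflake01 -l Node:TimeSeries -s Task
run sh5util -I -j "$jobid" -s Energy -d Power
```

Printing first makes it easy to review the exact command lines before any of them contacts slurmctld.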

Copyright (C) 2013 Bull.
Copyright (C) 2013 SchedMD LLC.

Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

Slurm is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

December 2020 Slurm Commands sh5util(1)