sh5util(1)                      Slurm Commands                      sh5util(1)

NAME
       sh5util - Tool for merging HDF5 files from the acct_gather_profile
       plugin that gathers detailed data for jobs running under Slurm

SYNOPSIS
       sh5util

DESCRIPTION
       sh5util merges HDF5 files produced on each node for each step of a job
       into one HDF5 file for the job. The resulting file can be viewed and
       manipulated by common HDF5 tools such as HDFView, h5dump, h5edit, or
       h5ls.

       sh5util also has two extract modes. The first writes a limited set of
       data for specific nodes, steps, and data series in "comma separated
       value" (CSV) form to a file that can be imported into other analysis
       tools such as spreadsheets.

       The second (Item-Extract) extracts one data item from one time series
       for all the samples on all the nodes from a job's HDF5 profile. It:

       - Finds the sample with the maximum value of the item.

       - Writes a CSV file with min, ave, max, and item totals for each node
         for each sample.
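
       The min, ave, max, and total statistics mentioned above can be
       illustrated with a small sketch. The helper name summarize_sample and
       its input layout (one "nodename value" pair per line for a single
       sample) are made up for illustration; this is not the layout of the
       CSV file that sh5util writes.

```shell
# Hypothetical input: one "nodename value" pair per line for one sample.
# Prints four numbers on one line: min ave max total
summarize_sample() {
    awk '
        NR == 1 { min = $2; max = $2 }
        {
            if ($2 < min) min = $2
            if ($2 > max) max = $2
            total += $2
        }
        END { if (NR > 0) printf "%g %g %g %g\n", min, total / NR, max, total }
    '
}

# Three nodes reporting one illustrative item value each.
printf 'node01 120\nnode02 80\nnode03 100\n' | summarize_sample
```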

OPTIONS
       -E, --extract
              Extract data series from a merged job file.

              Extract mode options

              -i, --input=path
                     Merged file to extract from (default ./job_$jobid.h5)

              -N, --node=nodename
                     Node name to extract (default is all)

              -l, --level=[Node:Totals | Node:TimeSeries]
                     Level to which the series is attached (default
                     Node:Totals)

              -s, --series=[Energy | Filesystem | Network | Task | Task_#]
                     Task is all tasks; Task_# is a single task, where # is
                     the task id (default is everything)

       -h, --help
              Print this description of use.

       -I, --item-extract
              Extract one data item from all samples of one data series from
              all nodes in a merged job file.

              Item-Extract mode options

              -s, --series=[Energy | Filesystem | Network | Task]

              -d, --data
                     Name of data item in series (see the list of data items
                     per series below).

       -j, --jobs=<job[.step]>
              Format is <job[.step]>. Merge this job/step (or a comma-separated
              list of job steps). This option is required. If no step is
              specified, all steps found are processed.
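
       Because -j accepts a comma-separated list of job steps, several steps
       can be passed in one invocation. The sketch below builds such a list;
       the helper name join_steps is hypothetical (not part of sh5util) and
       the step ids are placeholders. The command is only printed, not run.

```shell
# Join all arguments with commas, e.g. "42.0 42.1" becomes "42.0,42.1".
join_steps() {
    # The subshell keeps the IFS change from leaking into the caller.
    (IFS=,; printf '%s\n' "$*")
}

# Print (rather than run) a single merge command covering three steps.
echo "sh5util -j $(join_steps 42.0 42.1 42.2)"
```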

       -L, --list
              Print the items of a series contained in a job file.

              List mode options

              -i, --input=path
                     Merged file to extract from (default ./job_$jobid.h5)

              -s, --series=[Energy | Filesystem | Network | Task]

       -o, --output=<path>
              Path to a file into which to write. Default for merge is
              ./job_$jobid.h5. Default for extract is ./extract_$jobid.csv.

       -p, --profiledir=<dir>
              Directory where the node-step files exist. The default is set
              in acct_gather.conf.

       -S, --savefiles
              Keep the node-step files after merging them into the job file
              instead of removing them.

       --usage
              Display brief usage message.

       --user=<user>
              User who profiled the job. Defaults to the user running this
              command; useful when running as root.

Data Items per Series
       Energy
              Power
              CPU_Frequency

       Filesystem
              Reads
              Megabytes_Read
              Writes
              Megabytes_Write

       Network
              Packets_In
              Megabytes_In
              Packets_Out
              Megabytes_Out

       Task
              CPU_Frequency
              CPU_Time
              CPU_Utilization
              RSS
              VM_Size
              Pages
              Read_Megabytes
              Write_Megabytes

PERFORMANCE
       Executing sh5util sends a remote procedure call to slurmctld. If enough
       calls from sh5util or other Slurm client commands that send remote
       procedure calls to the slurmctld daemon come in at once, it can result
       in a degradation of performance of the slurmctld daemon, possibly
       resulting in a denial of service.

       Do not run sh5util or other Slurm client commands that send remote
       procedure calls to slurmctld from loops in shell scripts or other
       programs. Ensure that programs limit calls to sh5util to the minimum
       necessary for the information you are trying to gather.
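
       One possible way to keep calls to a minimum is to skip the merge when
       a merged file is already present. The sketch below assumes the merge
       wrote to the documented default path ./job_$jobid.h5; the helper name
       needs_merge and its optional directory argument are hypothetical. The
       merge command is only printed, not run.

```shell
# Succeed (exit status 0) when no merged file exists yet for the job id.
# Assumes the documented default merge output name job_$jobid.h5, looked up
# in the directory given as the optional second argument (default: ".").
needs_merge() {
    jobid=$1
    dir=${2:-.}
    [ ! -f "$dir/job_${jobid}.h5" ]
}

# Print the merge command only when no merged file is found.
if needs_merge 42; then
    echo "sh5util -j 42"
fi
```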

EXAMPLES
       Merge node-step files (as part of an sbatch script):

       $ sbatch -n1 -d$SLURM_JOB_ID --wrap="sh5util --savefiles -j $SLURM_JOB_ID"

       Extract all task data from a node:

       $ sh5util -j 42 -N snowflake01 --level=Node:TimeSeries --series=Tasks

       Extract all energy data:

       $ sh5util -j 42 --series=Energy --data=power
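
       A list-then-extract workflow might combine the -L and -I modes
       described in the options above. The sketch below only prints the
       commands for review; the helper name plan_item_extract is
       hypothetical, the job id is a placeholder, and the exact flag
       combinations are an assumption that should be checked against the
       installed version.

```shell
# Print the sh5util commands for listing a series and then extracting one
# item from it. Nothing is executed; the commands are only echoed.
plan_item_extract() {
    jobid=$1
    series=$2
    item=$3
    echo "sh5util -j $jobid -L -s $series"
    echo "sh5util -j $jobid -I -s $series -d $item"
}

plan_item_extract 42 Energy Power
```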

COPYING
       Copyright (C) 2013 Bull.
       Copyright (C) 2013-2022 SchedMD LLC.
       Slurm is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published by the Free
       Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm is distributed in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
       for more details.

February 2021                   Slurm Commands                      sh5util(1)