ethfabricanalysis(8)         EFSFFCLIRG (Man Page)        ethfabricanalysis(8)

NAME

       ethfabricanalysis

       Performs analysis of the fabric.

Syntax

       ethfabricanalysis [-b|-e] [-s] [-d dir] [-c file] [-E file] [-p planes]
       [-T topology_inputs]

Options

       --help

                 Produces full help text.

       -b

                 Specifies the baseline mode. Default is compare/check mode.

       -e

                 Evaluates health only. Default is compare/check mode.

       -s

                 Saves history of failures (errors/differences).

       -d dir

                 Specifies the top-level directory for saving the baseline and
                 the history of failed checks. Default is
                 /var/usr/lib/eth-tools/analysis.

       -c file

                 Specifies the error thresholds configuration file. Default is
                 /etc/eth-tools/ethmon.conf.

       -E file

                 Specifies the Ethernet Mgt configuration file. Default is
                 /etc/eth-tools/mgt_config.xml.

       -p planes

                 Specifies the fabric planes, separated by spaces. Default is
                 the first enabled plane defined in the configuration file.
                 The value 'ALL' uses all enabled planes.

       -T topology_inputs

                 Specifies the topology input filenames, separated by spaces.
                 See Details and ethreport for more information.

Example

       ethfabricanalysis

       The fabric analysis tool checks the following:

       •      Fabric links (both internal to switch and external cables)

       •      Fabric components (nodes, links, systems, and their
              configuration)

       •      Fabric error counters and link speed mismatches

       NOTE: The comparison includes components on the fabric. Therefore,
       operations such as shutting down a server cause the server to no longer
       appear on the fabric, and ethfabricanalysis flags this as a fabric
       change or failure.
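
       The modes above suggest a two-step workflow: record a baseline while
       the fabric is known to be good, then rerun in the default
       compare/check mode. A minimal sketch, using only the -b and -s options
       documented on this page; the function names baseline_fabric and
       check_fabric are hypothetical wrappers, not part of the tool:

```shell
# Hypothetical wrappers around ethfabricanalysis; any extra arguments are
# passed through unchanged.

# Record the baseline once, while the fabric is in a known-good state.
baseline_fabric() {
    ethfabricanalysis -b "$@"
}

# Compare the current fabric against the baseline and keep a history of
# any failures (-s).
check_fabric() {
    ethfabricanalysis -s "$@"
}
```

       Other documented options can be passed through, for example
       check_fabric -p ALL to cover all enabled planes.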

Environment Variables

       The following environment variables are also used by this command:

       FF_ANALYSIS_DIR

                 Top-level directory for baselines and failed health checks.
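
       A minimal sketch of locating the results directory from a script. It
       assumes that FF_ANALYSIS_DIR, when set, names the same top-level
       directory that -d would, and otherwise falls back to the default
       documented for -d; this page does not state how the two interact. The
       function name analysis_dir is hypothetical:

```shell
# Print the directory where baselines and failed checks are expected.
# Assumption: FF_ANALYSIS_DIR overrides the documented -d default.
analysis_dir() {
    echo "${FF_ANALYSIS_DIR:-/var/usr/lib/eth-tools/analysis}"
}
```

       For example, ls "$(analysis_dir)/latest" would list the files from the
       most recent run.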

Details

       You can specify the topology_input file to be used with one of the
       following methods:

       •      On the command line using the -T option.

       •      Using the TopologyFile specified in the Ethernet Mgt
              configuration file.

       If the specified file does not exist, no topology_input file is used.

       For more information on topology_input, refer to ethreport.

       By default, the error analysis includes counters and slow links (that
       is, links running below enabled speeds). You can change this using the
       FF_FABRIC_HEALTH configuration parameter in ethfastfabric.conf. This
       parameter specifies the ethreport options and reports to be used for
       the health analysis.

       When a topology_input file is used, it can also be useful to extend
       FF_FABRIC_HEALTH to include fabric topology verification options such
       as -o verifylinks.

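       As an illustration, the fragment below extends the health analysis
       with link verification. It assumes that ethfastfabric.conf accepts
       shell-style variable assignments, which this page does not confirm.
       The option string combines only ethreport options named on this page
       (-s -o errors -o slowlinks, plus -o verifylinks):

```shell
# Assumed shell-style syntax for ethfastfabric.conf.
# Health options named on this page, plus topology link verification.
FF_FABRIC_HEALTH="-s -o errors -o slowlinks -o verifylinks"
export FF_FABRIC_HEALTH
```
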
       The thresholds for counter analysis default to
       /etc/eth-tools/ethmon.conf. However, you can specify an alternate
       configuration file for thresholds using the -c option. The
       ethmon.si.conf file can also be used to check for any non-zero values
       for signal integrity (SI) counters.

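       A hedged sketch of a health-only run against the SI thresholds,
       combining the -e and -c options. The location of ethmon.si.conf is not
       given on this page, so the caller passes it in; the function name
       si_health_check is hypothetical:

```shell
# si_health_check CONF
# CONF is the path to ethmon.si.conf on this system (supplied by the caller).
si_health_check() {
    conf=$1
    if [ ! -r "$conf" ]; then
        echo "threshold file not readable: $conf" >&2
        return 1
    fi
    # -e: evaluate health only; -c: use CONF for the error thresholds.
    ethfabricanalysis -e -c "$conf"
}
```
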
       All files generated by ethfabricanalysis start with fabric in their
       file name.

       The ethfabricanalysis tool generates files such as the following within
       FF_ANALYSIS_DIR:

       Health Check

       •      latest/fabric.<plane_name>.errors
              stdout of ethreport for errors encountered during fabric error
              analysis.

       •      latest/fabric.<plane_name>.errors.stderr
              stderr of ethreport during fabric error analysis.

       Baseline

       During a baseline run, the following files are also created in
       FF_ANALYSIS_DIR/latest.

       •      baseline/fabric.<plane_name>.snapshot.xml
              ethreport snapshot of complete fabric components and
              configuration.

       •      baseline/fabric.<plane_name>.comps
              ethreport summary of fabric components and basic configuration.

       •      baseline/fabric.<plane_name>.links
              ethreport summary of internal and external links.

       Full Analysis

       •      latest/fabric.<plane_name>.snapshot.xml
              ethreport snapshot of complete fabric components and
              configuration.

       •      latest/fabric.<plane_name>.snapshot.stderr
              stderr of ethreport during snapshot.

       •      latest/fabric.<plane_name>.errors
              stdout of ethreport for errors encountered during fabric error
              analysis.

       •      latest/fabric.<plane_name>.errors.stderr
              stderr of ethreport during fabric error analysis.

       •      latest/fabric.<plane_name>.comps
              stdout of ethreport for fabric components and configuration.

       •      latest/fabric.<plane_name>.comps.stderr
              stderr of ethreport for fabric components.

       •      latest/fabric.<plane_name>.comps.diff
              diff of baseline and latest fabric components.

       •      latest/fabric.<plane_name>.links
              stdout of ethreport summary of internal and external links.

       •      latest/fabric.<plane_name>.links.stderr
              stderr of ethreport summary of internal and external links.

       •      latest/fabric.<plane_name>.links.diff
              diff of baseline and latest fabric internal and external links.

       •      latest/fabric.<plane_name>.links.changes.stderr
              stderr of ethreport comparison of links.

       •      latest/fabric.<plane_name>.links.changes
              ethreport comparison of links against baseline. This is
              typically easier to read than the links.diff file and contains
              the same information.

       •      latest/fabric.<plane_name>.comps.changes.stderr
              stderr of ethreport comparison of components.

       •      latest/fabric.<plane_name>.comps.changes
              ethreport comparison of components against baseline. This is
              typically easier to read than the comps.diff file and contains
              the same information.

       The .diff and .changes files are only created if differences are
       detected.

       If the -s option is used and failures are detected, files related to
       the checks that failed are also copied to a time-stamped directory
       under FF_ANALYSIS_DIR.
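
       Because the .diff and .changes files exist only when differences were
       detected, a script can use their presence as a quick signal. A minimal
       sketch; the function name list_fabric_changes and its arguments are
       hypothetical, and the file names follow the patterns listed above:

```shell
# list_fabric_changes DIR PLANE
# Print every .diff and .changes file found under DIR/latest for PLANE.
# DIR is the top-level analysis directory; PLANE is the plane name.
list_fabric_changes() {
    dir=$1
    plane=$2
    for f in "$dir"/latest/fabric."$plane".*.diff \
             "$dir"/latest/fabric."$plane".*.changes; do
        # An unmatched glob stays literal, so confirm the file exists.
        if [ -f "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}
```

       An empty result means no differences were recorded for that plane in
       the latest run.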

Fabric Items Checked Against the Baseline

       Based on ethreport -o links:

       •      Unconnected/down/missing cables

       •      Added/moved cables

       •      Changes in link width and speed

       •      Changes to IfAddr in fabric (replacement of NIC or Switch
              hardware)

       •      Adding/Removing Nodes [NIC, Virtual NICs, Virtual Switches,
              Physical Switches, Physical Switch internal switching cards
              (leaf/spine)]

       •      Changes to server or switch names

       Based on ethreport -o comps:

       •      Overlap with items from links report

       •      Changes in port MTU

       •      Changes in port speed/width enabled or supported

       •      Changes in NIC or switch device IDs/revisions/VendorID (for
              example, ASIC hardware changes)

       •      Changes in port Capability mask (which features/agents run on
              port/server)

       •      Changes to IOUs/IOCs/IOC Services provided

Fabric Items Also Checked During Health Check

       Based on ethreport -s -o errors -o slowlinks:

       •      Error counters on all Intel(R) Ethernet Fabric ports (NIC,
              switch external, and switch internal) checked against
              configurable thresholds.

              •      Typically identifies potential fabric errors, such as
                     symbol errors.

              •      May also identify transient congestion, depending on the
                     counters that are monitored.

       •      Link active speed/width as compared to enabled speed.

              •      Identifies links whose active speed/width is less than
                     the minimum of the enabled speed/width on each side of
                     the link.

              •      This typically reflects bad cables, bad ports, or poor
                     connections.

       •      Side effect is the verification of fabric health.

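       The slow-link rule above can be sketched as a small comparison. This
       only illustrates the rule as stated, not how ethreport implements it;
       the function name is_slow_link and the use of plain integers in one
       unit (for example Gb/s) are assumptions:

```shell
# is_slow_link ACTIVE ENABLED_A ENABLED_B
# Succeeds (status 0) when ACTIVE is below the lower of the two enabled
# values. All three are plain integers in the same unit.
is_slow_link() {
    active=$1
    min=$2
    if [ "$3" -lt "$min" ]; then
        min=$3
    fi
    [ "$active" -lt "$min" ]
}
```

       For example, is_slow_link 25 100 50 succeeds because 25 is below the
       lower enabled value of 50, while is_slow_link 50 100 50 does not.
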
Copyright(C) 2020-2021         Intel Corporation          ethfabricanalysis(8)