ethfindgood(8)               EFSFFCLIRG (Man Page)              ethfindgood(8)

NAME

       ethfindgood

       Checks for hosts that can be pinged, accessed via SSH, and are active
       on the Intel(R) Ethernet Fabric. Produces a list of good hosts meeting
       all criteria. Typically used to identify good hosts to undergo further
       testing and benchmarking during initial cluster staging and startup.

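       As an illustration of what "meeting all criteria" means, the sketch
       below computes the hosts that appear in all three of the alive,
       running, and active lists. It is not the tool's own implementation.
       The helper name intersect3 and the host names n1 to n3 are
       assumptions made for this example, and host names are assumed to
       contain no whitespace.

```shell
# intersect3 is a hypothetical helper, not part of ethfindgood.
# Print the hosts that appear in all three files (one host per line).
intersect3() {
    # De-duplicate each list, then keep names that occur exactly 3 times.
    { sort -u "$1"; sort -u "$2"; sort -u "$3"; } |
        sort | uniq -c | awk '$1 == 3 { print $2 }'
}

# Demo with made-up host lists in a temporary directory.
tmp=$(mktemp -d)
printf 'n1\nn2\n' > "$tmp/alive"
printf 'n1\nn2\n' > "$tmp/running"
printf 'n1\nn3\n' > "$tmp/active"
intersect3 "$tmp/alive" "$tmp/running" "$tmp/active"   # prints n1
rm -rf "$tmp"
```

       Only n1 is present in every list, so it is the only host printed.
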
       The resulting good file lists each good host exactly once and can be
       used as input to create mpi_hosts files for running mpi_apps and the
       NIC-SW cable test. The files alive, running, active, good, and bad are
       created in the selected directory, listing the hosts that pass each
       criterion. If a plane name is provided, each filename takes the form
       xxx_<plane>, for example good_plane1.

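       The naming rule above can be sketched as a small shell function. The
       function name result_file is hypothetical and only illustrates how the
       directory, the file kind, and the optional plane name combine into a
       path. /etc/eth-tools is the default directory documented for -d.

```shell
# result_file is a hypothetical helper, not part of ethfindgood.
# $1 = directory, $2 = kind (alive, running, active, good, bad),
# $3 = optional plane name
result_file() {
    if [ -n "${3:-}" ]; then
        printf '%s/%s_%s\n' "$1" "$2" "$3"
    else
        printf '%s/%s\n' "$1" "$2"
    fi
}

result_file /etc/eth-tools good plane1   # prints /etc/eth-tools/good_plane1
result_file /etc/eth-tools bad           # prints /etc/eth-tools/bad
```
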
       This command automatically generates the file
       FF_RESULT_DIR/punchlist.csv, which provides a concise summary of the
       bad hosts found. The file can be imported into Excel directly as a
       *.csv file. Alternatively, it can be cut and pasted into Excel, and
       the Data/Text to Columns toolbar can be used to separate the
       information into multiple columns at the semicolons.

       A sample generated output is:

       # ethfindgood
       3 hosts will be checked
       2 hosts are pingable (alive)
       2 hosts are ssh'able (running)
       2 total hosts have RDMA active on one or more fabrics (active)
       1 hosts are alive, running, active (good)
       2 hosts are bad (bad)
       Bad hosts have been added to /root/punchlist.csv

       # cat /root/punchlist.csv
       2015/10/09 14:36:48;phs1fnivd13u07n4;Doesn't ping
       2015/10/09 14:36:48;phs1fnivd13u07n4;Can't ssh
       2015/10/09 14:36:48;phs1fnivd13u07n3;No active RDMA port

       For a given run, a line is generated for each failing host. Hosts are
       reported exactly once for a given run. Therefore, a host that does not
       ping is NOT listed as Can't ssh nor No active RDMA port. There may be
       cases where ports could be active for hosts that do not ping. However,
       the lack of ping often implies there are other fundamental issues,
       such as PXE boot or inability to access DNS or DHCP to get the proper
       host name and IP address. Therefore, reporting additional failures for
       hosts that do not ping is typically of limited value.
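
       Because the punchlist is semicolon-separated, it can also be
       summarized on the command line. The following is a minimal sketch,
       assuming the three-field layout shown in the sample above
       (timestamp;host;reason). The function names summarize_punchlist and
       bad_hosts are hypothetical and not part of the tool set.

```shell
# Hypothetical helpers for a punchlist with timestamp;host;reason lines.

# Count entries per failure reason, sorted by reason text.
summarize_punchlist() {
    awk -F';' 'NF >= 3 { count[$3]++ }
               END { for (r in count) printf "%d %s\n", count[r], r }' "$1" |
        sort -k2
}

# List each host named in the punchlist once.
bad_hosts() {
    awk -F';' 'NF >= 3 { print $2 }' "$1" | sort -u
}

# Demo using the sample lines from this page in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2015/10/09 14:36:48;phs1fnivd13u07n4;Doesn't ping
2015/10/09 14:36:48;phs1fnivd13u07n4;Can't ssh
2015/10/09 14:36:48;phs1fnivd13u07n3;No active RDMA port
EOF
summarize_punchlist "$sample"
bad_hosts "$sample"
rm -f "$sample"
```

       In practice the argument would be the punchlist path reported by the
       command, such as /root/punchlist.csv in the sample run.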

Syntax

       ethfindgood [-R|-A] [-d dir] [-p plane] [-f hostfile] [-h 'hosts']
       [-T timelimit]

Options

       --help
                 Produces full help text.

       -R
                 Skips the running test (SSH). Recommended if password-less
                 SSH is not set up.

       -A
                 Skips the active test. Recommended if Intel(R) Ethernet
                 Fabric Suite software or fabric is not up.

       -p plane
                 Specifies the name of the plane to use.

       -d dir
                 Specifies the directory in which to create alive, active,
                 running, good, and bad files. Default is /etc/eth-tools
                 directory.

       -f hostfile
                 Specifies the file listing the hosts in the cluster. Default
                 is /etc/eth-tools/hosts.

       -h hosts
                 Specifies the list of hosts to ping.

       -T timelimit
                 Specifies the time limit in seconds for a host to respond to
                 SSH. Default is 20 seconds.
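
       Several options can be given in one invocation. The sketch below
       skips the active test, writes the result files to an assumed
       directory, reads hosts from the allhosts file used in the examples
       on this page, and raises the SSH time limit to 30 seconds. The
       wrapper run_or_show is a hypothetical helper, and /tmp/staging is an
       assumed path. The wrapper only prints the command on machines where
       ethfindgood is not installed.

```shell
# run_or_show is a hypothetical helper, not part of the tool set:
# run the command if it is installed, otherwise print what would be run.
run_or_show() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        printf 'would run: %s\n' "$*"
    fi
}

# -A skips the active test, -d selects the output directory (assumed path),
# -f names the host file, -T sets the SSH time limit in seconds.
run_or_show ethfindgood -A -d /tmp/staging -f allhosts -T 30
```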

Environment Variables

       The following environment variables are also used by this command:

       HOSTS
                 List of hosts, used if the -h option is not supplied.

       HOSTS_FILE
                 File containing the list of hosts, used in the absence of -f
                 and -h.

       FF_MAX_PARALLEL
                 Maximum concurrent operations.

Examples

       ethfindgood

       ethfindgood -f allhosts

       ethfindgood -h 'arwen elrond'

       HOSTS='arwen elrond' ethfindgood

       HOSTS_FILE=allhosts ethfindgood

Copyright(C) 2020-2022         Intel Corporation                ethfindgood(8)