SORTER(1) General Commands Manual SORTER(1)

NAME
sorter - Sort files in an image into categories based on file type
7
SYNOPSIS
sorter [-b size] [-e] [-E] [-h] [-l] [-md5] [-s] [-sha1] [-U] [-v] [-V]
[-a hash_alert] [-c config] [-C config] [-d dir] [-m mnt] [-n nsrl_db]
[-x hash_exclude] [-i imgtype] [-o imgoffset] [-f fstype] image
[image] [meta_addr]
13
DESCRIPTION
sorter is a Perl script that analyzes a file system and organizes the
allocated and unallocated files by file type. It runs the 'file'
command on each file and sorts the files into categories according to
the rules in its configuration files. It also checks for extension
mismatches to identify 'hidden' files. Hash databases can be supplied
for files that are known to be good, which are ignored, and for files
that are known to be bad, which trigger an alert.
22
By default, the program uses the configuration files in the directory
where The Sleuth Kit was installed. These can be overridden with
run-time options. There is a standard configuration file for all file
system types and a specific one for each operating system.
27
28
ARGUMENTS
The required arguments are as follows. sorter analyzes one or more
images and either saves the results in the '-d' directory or lists
them on STDOUT (if '-l' is given).
33
34
-d dir Specify the directory where all output files are written. This
includes the index files and, if the '-s' flag is given, the
subdirectories. This MUST be given unless the '-l' list flag is given.
38
39 -l List information to STDOUT (no files are ever written). This is
40 useful for Incident Response, with the use of 'netcat'. This
41 cannot be used if '-d' is used.
42
43 images The file names of the image(s) to analyze.
44
45
46 The options are as follows:
47
48 -f fstype
49 Specify the file system type of the image(s). This is the same
50 type that The Sleuth Kit uses.
51
52
53 -i imgtype
54 Specify the image type in which the file system is located.
55 This is the same type that The Sleuth Kit uses.
56
57
58 -o imgoffset
59 Specify the sector offset from the beginning of the image to the
60 start of the file system.
61
62
-b size
Specify the minimum size of file to process. All files smaller
than this size will be ignored.
66
67
68 -c config
69 Specify the location of an additional configuration file. This
70 file will be loaded in addition to the standard ones in the
71 install directory. These settings will have priority over the
72 standard files.
73
-C config
Specify the location of the ONLY configuration file. The standard
config files will not be loaded if this option is given. For
example, in the 'share/sort' directory there is a file called
'images.sort'. This file contains only rules about graphic images.
If it is specified with -C, then only data about graphic images
will be saved.
81
-m mnt Specify the mount point of the image being analyzed. This is
for cosmetic reasons only. When entries are written to the output
files, the files will be listed with the full path instead of just
the relative path. If this is given, then only one image can be
given.
87
-a hash_alert
Specify the location of a hash database with entries of known
'bad' files. If any file is found with an MD5 hash value in this
database, it will be placed in a special alert file. This database
must have been indexed for MD5 using 'hfind' in The Sleuth Kit
before it is used by sorter.
94
-n nsrl_db
Specify the location of the NIST National Software Reference
Library (NSRL) database (www.nsrl.nist.gov). Any file found in
the NSRL will be ignored and not placed into a category. The
database must be indexed for MD5 with 'hfind' in The Sleuth Kit
before it is used by sorter. The database file is currently
called 'NSRLFile.txt'.
102
-x hash_exclude
Specify the location of a hash database with entries of known
'good' files. If any file is found with an MD5 hash value in
this database, it will be ignored and not processed or saved to
the category files. This database must have been indexed for
MD5 using 'hfind' in The Sleuth Kit before it is used by sorter.
109
-e Perform extension mismatch checks only (no category index files
are generated)

-E Perform category indexing only (no extension mismatch checks)
114
-U Do not save data about unknown file types. By default, an
'unknown' file is created for files where the 'file' output is
not known. This allows one to refine the configuration. If
this is not desired, use this flag.
119
120 -h Create category files in HTML
121
122 -md5 Calculate the MD5 value for each file and save it in the cate‐
123 gory file. This will be done automatically when any of the
124 databases are given.
125
126 -sha1 Calculate the SHA-1 value for each file and save it in the cate‐
127 gory file.
128
129 -s Save the actual file content to sub-directories in the directory
130 specified by '-d'. For example, all JPG and GIF files would
131 actually be saved in the 'images' directory. If '-h' is also
132 given, thumbnails of graphic images are also created.
133
134 -v Display verbose information
135
136 -V Display version.
137
138 [meta_addr]
139 The meta data address of the directory to start with. By
140 default, the root directory is used. If this is given, then
141 only one image can be given.
142
143
BACKEND DETAILS
sorter is a Perl script that interacts with other tools in The Sleuth
Kit. It starts by reading the configuration files from the installation
directory. There is a general configuration file and a specific one
for each operating system. The specific one is determined from the
'-f' flag. Each configuration file contains rules for processing the
output of the 'file' command. One type of rule uses a regular
expression to identify which category (e.g. 'images') a given 'file'
output (e.g. 'image data') belongs to. The other type lists the file
extensions (e.g. .txt) that are valid for a given 'file' output (e.g.
ASCII(.*?)text). See the description of the rules below.
155
The program then runs the 'fls' tool in The Sleuth Kit to identify the
files in the file system image. The content of each identified file is
read using the 'icat' tool. If a hash database is given, the hash of the
file is calculated and looked up. If it is found in an 'alert' database,
then it is added to a special 'alert.txt' file. If it is found in the
NSRL or 'exclude' database, then it is ignored as a known good file.
Excluded files are recorded in an 'exclude' file for future reference,
but they are not saved in the category files.
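
The hash lookup described above can be sketched in Python. This is only an illustration, not sorter's own Perl code: the function names are hypothetical, and plain Python sets of MD5 strings stand in for the 'hfind'-indexed databases. The sketch reports the two outcomes as independent flags because the text does not say which one wins if a hash appears in both kinds of database.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of a file's content as lowercase hex."""
    return hashlib.md5(data).hexdigest()


def triage(data: bytes, alert: set, exclude: set, nsrl: set) -> dict:
    """Look up a file's MD5 in the three (hypothetical, in-memory) databases.

    'alert'   -> the file would be recorded in the alert file
    'ignored' -> the file is known good (NSRL or exclude database) and
                 would be left out of the category files
    """
    digest = md5_hex(data)
    return {
        "md5": digest,
        "alert": digest in alert,
        "ignored": digest in nsrl or digest in exclude,
    }


if __name__ == "__main__":
    # Assumed sample data: the MD5 of the bytes b"abc" is treated as known bad.
    known_bad = {"900150983cd24fb0d6963f7d28e17f72"}
    print(triage(b"abc", known_bad, set(), set()))
```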
164
165 The 'file' command is then run to identify the file type (based on
166 header information). The configuration file rules are used to identify
167 which category it belongs to. An entry is added to the corresponding
168 category file (in the '-d dir' directory). If the '-s' flag is given,
169 then a copy of the file is saved in a subdirectory of the same name as
170 the category. If the HTML format is used, then hyper-links will allow
171 one to easily view saved files and view what is in each category.
172
Files that do not have a category are recorded in the 'unknown'
category or the 'data' category. 'data' is for files with a structure
that 'file' does not know, and 'unknown' is for files with a structure
that 'file' knows about but that no category rule matches. These are
saved for future reference, but the unknown category can be omitted by
using the '-U' flag.
178
179 A copy of the files can be saved by using the '-s' flag. If so, then
180 the files are saved in a subdirectory that is named with the category
181 name. Each file is named using the file system image name followed by
182 the meta data address and the original file extension. The category
183 index file can be used to translate the actual name to the saved name.
184 The HTML format makes viewing easier as there are links to each file
185 from the category index file.
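
As a rough illustration of the naming scheme above, the following Python sketch builds a saved-file name from the image name, the meta data address, and the original extension. The function name is hypothetical, and the separator character and the use of only the image's base name are assumptions: the text gives the order of the parts, not the exact format.

```python
import os


def saved_name(image_path, meta_addr, original_name, sep="-"):
    """Build a name from the image name, meta data address, and extension.

    'sep' is an assumed separator; the exact format sorter uses is not
    specified in this manual page.
    """
    image = os.path.basename(image_path)
    # splitext returns the extension with its leading '.', or '' if none
    _, ext = os.path.splitext(original_name)
    return f"{image}{sep}{meta_addr}{ext}"


if __name__ == "__main__":
    print(saved_name("images/hda1.dd", 1234, "Report.DOC"))
```

Because the saved name no longer contains the original file name, the category index file remains the way to map one to the other.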
186
The program also consults the extension rules. If the file name has an
extension (anything after a '.'), it is compared to the rules. If the
extension is not listed in the rules as a valid extension for the file
type, the file is added to the 'mismatch' file. A file without an
extension is not reported, even if the file type has valid extensions.
This check is done even if the file is found in one of the known good
hash databases; in that case, it is added to a special file.
Files of type 'data' have no extension checks done by default (as they
have an unknown structure).
197
198
199
200 The program repeats the above procedures using the output of the 'ils'
201 command as well. This allows 'sorter' to examine the contents of unal‐
202 located files that still have pointers to the data units (not all file
203 systems will produce data from this step).
204
205
CONFIGURATION FILES
Configuration files are used to define what file types belong in which
categories and what extensions belong to what file types. Configuration
files are distributed with the 'sorter' tool and are located in
the installation directory in the 'share/sorter' directory.
211
212 The 'default.sort' file is used by any file system type. It contains
213 entries for common file types. A specific operating system file also
214 exists, which is useful for extensions that are specific to a given OS.
215 By default, the default file and the OS specific one will be used.
216 Using the '-c' flag, an additional file can be used. If the '-C' flag
217 is used, then only the supplied configuration file is used.
218
219 There are two rule types in the configuration files. Each rule starts
220 with a header that specifies which rule type it is (category or ext).
221 Both rule types have two additional columns that can be separated by
222 any white space.
223
224
The category rule has the category name as the second column and a Perl
regular expression in the third column. The category name cannot
contain spaces and may consist only of letters and numbers. The regular
expression is matched, case-insensitively, against the output of
'file'. More than one rule can exist for a category, but a given 'file'
output can belong to only one category. For example:
232
This saves all file output with 'image data' anywhere in it to the
'images' category:
category images image data

This saves all file output that has 'ASCII' followed by anything and
then 'text' to the 'text' category:
category text ASCII(.*?)text

This saves all file output that is just 'data' to the 'data' category
(the ^ and $ anchor the match to the start and end of the output in
Perl). The 'data' value is common in the output of file for unknown
binary data.
category data ^data$
245
246
247 There is a special category of 'ignore' that is used to skip over files
248 of this type. This is mainly a time and space saver.
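
To make the category rules concrete, here is a minimal Python sketch that parses rules modeled on the examples above and applies them to sample 'file' output. It is an illustration only: sorter itself is written in Perl, Python's 're' module stands in for Perl regular expressions, the function names are hypothetical, and letting the first matching rule win is an assumption about how a single category is chosen.

```python
import re

# Rules modeled on the examples in this section.
RULES = """
category images image data
category text   ASCII(.*?)text
category data   ^data$
"""


def parse_category_rules(text):
    """Parse 'category <name> <regex>' lines into (name, pattern) pairs."""
    rules = []
    for line in text.splitlines():
        # Columns are separated by any white space; the regex is the rest.
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "category":
            pattern = re.compile(parts[2].strip(), re.IGNORECASE)
            rules.append((parts[1], pattern))
    return rules


def categorize(file_output, rules):
    """Return the first category whose pattern matches, else None."""
    for name, pattern in rules:
        if pattern.search(file_output):
            return name
    return None


if __name__ == "__main__":
    rules = parse_category_rules(RULES)
    for sample in ("JPEG image data, JFIF standard 1.01",
                   "ASCII English text",
                   "data"):
        print(sample, "->", categorize(sample, rules))
```

A 'file' output that matches no rule returns None here, which corresponds to the 'unknown' category described earlier.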
249
250
The extension rule is similar, except that the second column holds a
comma-separated list of valid extensions for the file output. Multiple
rules can exist for the same file type. The comparison is
case-insensitive. If no extension is valid for the file type, a rule
does not need to be made, because that is already assumed.
256
For example, ASCII text is used with several file extensions, so the
following rules could exist:
259
260 ext txt,log ASCII(.*?)text
261 ext c,cpp,h,js ASCII(.*?)text
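
The extension check can be sketched the same way. The Python below is an illustrative stand-in for sorter's Perl logic, with hypothetical function names: it merges the extensions of every 'ext' rule whose pattern matches the 'file' output and reports a mismatch only when the file has an extension that is not in that set. The skipping of extension checks for 'data' files is not modeled.

```python
import re

# The two example rules from this section.
EXT_RULES = """
ext txt,log     ASCII(.*?)text
ext c,cpp,h,js  ASCII(.*?)text
"""


def parse_ext_rules(text):
    """Parse 'ext <e1,e2,...> <regex>' lines into (extensions, pattern) pairs."""
    rules = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "ext":
            exts = {e.lower() for e in parts[1].split(",")}
            rules.append((exts, re.compile(parts[2].strip(), re.IGNORECASE)))
    return rules


def is_mismatch(filename, file_output, rules):
    """True if the file has an extension that no matching rule allows."""
    base = filename.rsplit("/", 1)[-1]     # ignore dots in directory names
    _, dot, ext = base.rpartition(".")
    if not dot or not ext:
        return False                       # no extension: never reported
    valid = set()
    for exts, pattern in rules:
        if pattern.search(file_output):
            valid |= exts                  # rules for the same type are merged
    return ext.lower() not in valid


if __name__ == "__main__":
    rules = parse_ext_rules(EXT_RULES)
    print(is_mismatch("notes.TXT", "ASCII English text", rules))
    print(is_mismatch("photo.jpg", "ASCII text", rules))
```

With no matching rule the set of valid extensions stays empty, which follows the statement above that a file type without a rule is assumed to have no valid extension.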
262
263
264 Please email me any rules that you find useful for standard investiga‐
265 tions and I will incorporate them into future releases (carrier at
266 sleuthkit dot org).
267
268
EXAMPLES
To run sorter with no hash databases, the following can be used:
271
272 # sorter -f ntfs -d data/sorter images/hda1.dd
273 # sorter -d data/sorter images/hda1.dd
274
275 # sorter -i raw -f ntfs -o 63 -d data/sorter images/hda.dd
276
277 To include the NSRL, an exclude, and an alert hash database:
278
279 # sorter -f ntfs -d data/sorter -a /usr/hash/rootkit.db -x
280 /usr/hash/win2k.db -n /usr/hash/nsrl/NSRLFile.txt images/hda1.dd
281
282 To just identify images using the supplied 'images.sort' file:
283
284 # sorter -f ntfs -C /usr/local/sleuthkit/share/sort/images.sort
285 -d data/sorter -h -s images/hda1.dd
286
287
REFERENCES
The NIST National Software Reference Library (NSRL) can be found at
www.nsrl.nist.gov.
291
292
LICENSE
Distributed under the Common Public License, found in the cpl1.0.txt
file in The Sleuth Kit licenses directory.
296
297
AUTHOR
Brian Carrier <carrier at sleuthkit dot org>
300
301 Send documentation updates to <doc-updates at sleuthkit dot org>
302
303
304
SORTER(1)