EXTRACT(1)                 General Commands Manual                EXTRACT(1)

NAME
       extract - determine meta-information about a file

SYNOPSIS
       extract [ -bgihLmnvV ] [ -l library ] [ -p type ] [ -x type ] file ...

DESCRIPTION
       This manual page documents version 1.0.0 of the extract command.

       extract tests each file specified in the argument list in an attempt
       to infer meta-information from it.  Each file is subjected to the
       meta-data extraction libraries from libextractor.

       libextractor classifies meta-information (also referred to as
       keywords) into types.  A list of all types can be obtained with the
       -L option.

OPTIONS
       -b     Display the output in BibTeX format.

       -g     Use grep-friendly output (all keywords on a single line for
              each file).  Use the verbose option to print the filename
              first, followed by the keywords.  Use the verbose option twice
              to also display the keyword types.  On its own, this option
              will not print keyword types or non-textual metadata.

       -h     Print a brief summary of the options.

       -i     Run plugins in-process (for debugging).  By default, each
              plugin is run in its own process.

       -l libraries
              Use the specified libraries to extract keywords.  The general
              format of libraries is [[-]LIBRARYNAME[:[-]LIBRARYNAME]*],
              where LIBRARYNAME is a libextractor-compatible library,
              typically of the form jpeg.  A minus before a library name
              indicates that the library should be removed from the
              existing list.  To run only a few selected plugins, use -l in
              combination with -n.

       -L     Print a list of all known keyword types.

       -m     Load the file into memory and perform extraction from memory
              (for debugging).

       -n     Do not use the default set of extractors (typically all
              standard extractors, currently mp3, ogg, jpg, gif, png, tiff,
              real, html, pdf and mime-types); use only the extractors
              specified with the -l option.

       -p type
              Print only the keywords matching the specified type.  By
              default, all keywords that are found and not removed as
              duplicates are printed.

       -v     Print the version number and exit.

       -V     Be verbose.  This option can be specified multiple times to
              increase verbosity further.

       -x type
              Exclude keywords of the specified type from the output.  By
              default, all keywords that are found and not removed as
              duplicates are printed.
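
       The format accepted by -l, [[-]LIBRARYNAME[:[-]LIBRARYNAME]*], can be
       checked before extract is invoked.  The following is a minimal
       sketch, not part of extract itself: is_valid_libspec is a
       hypothetical helper, and the set of characters permitted in
       LIBRARYNAME (letters, digits, underscore and dot) is an assumption,
       since this manual page does not define it.

```shell
#!/bin/sh
# Sketch: check a string against the -l grammar
#   [[-]LIBRARYNAME[:[-]LIBRARYNAME]*]
# ASSUMPTION: LIBRARYNAME consists only of letters, digits, '_' and '.';
# the manual page does not say which characters are allowed.
# is_valid_libspec is a hypothetical helper, not part of extract.
is_valid_libspec() {
    # The outer brackets in the grammar make the whole list optional,
    # so an empty specification is accepted.
    [ -z "$1" ] && return 0
    # One name with an optional leading minus, followed by any number of
    # colon-separated names, each with its own optional leading minus.
    printf '%s\n' "$1" |
        grep -Eq '^-?[A-Za-z0-9_.]+(:-?[A-Za-z0-9_.]+)*$'
}

# Report which of a few sample specifications match the grammar.
for spec in 'jpeg' 'jpeg:-png' 'jpeg::png' ':jpeg'; do
    if is_valid_libspec "$spec"; then
        echo "ok:  $spec"
    else
        echo "bad: $spec"
    fi
done
```

       A specification that passes the check could then be handed to
       extract, for example as extract -n -l "$spec" file, to run only the
       listed plugins.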

SEE ALSO
       libextractor(3) - description of the libextractor library

EXAMPLES
       $ extract test/test.jpg
       comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
       mimetype - image/jpeg

       $ extract -V -x comment test/test.jpg
       Keywords for file test/test.jpg:
       mimetype - image/jpeg

       $ extract -p comment test/test.jpg
       comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1

       $ extract -nV -l png.so -p comment test/test.jpg test/test.png
       Keywords for file test/test.jpg:
       Keywords for file test/test.png:
       comment - Testing keyword extraction
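
       Verbose output for several files can be post-processed into one row
       per keyword.  The following is a minimal sketch, assuming the output
       has exactly the shape shown in the examples above: a "Keywords for
       file NAME:" header line per file, followed by "type - value" lines
       with no leading whitespace.  tabulate_keywords is a hypothetical
       helper, not part of extract.

```shell
#!/bin/sh
# Sketch: turn "extract -V" output into tab-separated rows of
# file, keyword type and keyword value.
# ASSUMPTION: the output has the shape shown in the EXAMPLES section
# (a "Keywords for file NAME:" header, then "type - value" lines).
# tabulate_keywords is a hypothetical helper, not part of extract.
tabulate_keywords() {
    awk '
        /^Keywords for file .*:$/ {
            # Drop the 18-character prefix and the trailing colon.
            file = substr($0, 19, length($0) - 19)
            next
        }
        {
            # Split at the first " - "; skip lines that do not contain it.
            sep = index($0, " - ")
            if (sep == 0) next
            printf "%s\t%s\t%s\n", file, substr($0, 1, sep - 1), substr($0, sep + 3)
        }'
}

# Demonstration on the sample output of the last example; in practice:
#   extract -V file ... | tabulate_keywords
tabulate_keywords <<'EOF'
Keywords for file test/test.jpg:
Keywords for file test/test.png:
comment - Testing keyword extraction
EOF
```

       Files for which no keywords were printed, such as test/test.jpg in
       the sample, produce no rows.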

LEGAL NOTICE
       libextractor and the extract tool are released under the GPL.
       libextractor is a GNU package.

BUGS
       A couple of file-formats (on the order of 10^3) are not recognized...

AUTHORS
       extract was originally written by Christian Grothoff
       <christian@grothoff.org> and Vidyut Samanta <vids@cs.ucla.edu>.  Use
       <libextractor@gnu.org> to contact the current maintainer(s).

AVAILABILITY
       You can obtain the original author's latest version from
       http://www.gnu.org/software/libextractor/

libextractor 1.0.0               Aug 7, 2012                      EXTRACT(1)