HYPERTOC(1)          User Contributed Perl Documentation          HYPERTOC(1)

NAME
hypertoc - generate a table of contents for HTML documents

VERSION
version 3.20

SYNOPSIS
hypertoc --help | --manpage | --man_help | --man

hypertoc [ --bak string ] [ --debug ] [ --entrysep string ] [ --footer
file ] [ --header file ] [ --ignore_only_one ] [ --ignore_sole_first ]
[ --inline ] [ --make_anchors ] [ --make_toc ] [ --notoc_match string ]
[ --ol | --nool ] [ --ol_num_levels ] [ --outfile file ] [ --overwrite
] [ --quiet ] [ --textonly ] [ --title string ] { --toc_after
tag=suffix } { --toc_before tag=prefix } { --toc_end tag=endtag } {
--toc_entry tag=level } [ --toc_label string ] [ --toc_only |
--notoc_only ] [ --toc_tag string ] [ --toc_tag_replace ] [ --use_id ]
[ --useorg ] file ...

DESCRIPTION
hypertoc allows you to specify "significant elements" that will be
hyperlinked to in a "Table of Contents" (ToC) for a given set of HTML
documents.

The generated ToC is a multi-level list containing links to the
significant elements. hypertoc places each link in the ToC at the level
the user has assigned to that element.

Example:

If H1s are specified as level 1, then they appear in the first-level
list of the ToC. If H2s are specified as level 2, then they appear in
a second-level list in the ToC.

There are two aspects to the ToC generation: (1) putting suitable
anchors into the HTML documents (--make_anchors), and (2) generating
the ToC from HTML documents which have anchors in them for the ToC to
link to (--make_toc). One can choose to do one or both of these.

hypertoc can also incorporate the ToC into the HTML document itself via
the --inline option.

In order to link to significant elements, hypertoc inserts anchors into
them. One can use hypertoc as a filter, outputting the result to
another file, or one can overwrite the original file, with the original
backed up with a suffix (default: "org") appended to the filename.

Options can be defined in a config file as well as on the command line.

OPTIONS
Options can start with "--" or "-"; boolean options can be negated by
preceding them with "no"; options with hash or array values can be
added to by giving the option again for each value.

See Getopt::Long for more information.

--argfile filename
    The name of a file to read more options from. This can be used
    more than once. For example:

        --argfile your.args --argfile my.args

    See "Options Files" for more information.

--bak string
    If the input files are being overwritten (--overwrite is on), copy
    each original file to "filename.string". If the value is empty, no
    backup file is written. (default: org)

--debug
    Enable verbose debugging output. Used for debugging this module;
    in other words, don't bother. (default: off)

--entrysep string
    Separator string for non-<li> item entries. (default: ", ")

--footer file
    File containing footer text for the table of contents.

--header file
    File containing header text for the table of contents.

--help
    Print a short help message and exit.

--ignore_only_one
    If there would be only one item in the ToC, don't make a ToC.

--ignore_sole_first
    If the first item in the ToC is of the highest level, AND it is the
    only one of that level, ignore it. This is useful in web pages
    where there is only one H1 header but one doesn't know beforehand
    whether there will be only one.

--inline
    Put the ToC in the document at a given point. See "Inlining the
    ToC" for more information.

--make_anchors | --gen_anchors
    Create anchors for the table of contents to link to.

--make_toc | --gen_toc
    Make a table of contents which links to anchored significant
    elements.

--man_help | --manpage | --man
    Print all documentation and exit.

--notoc_match string
    Excludes individual tags from the table of contents even though
    they match the "significant elements". If this pattern matches
    the contents inside the tag itself (not the body of the element),
    that tag is skipped both when generating anchors and when
    generating the ToC. (default: class="notoc")

--ol | --nool
    Use an ordered list for table-of-contents entries (to a given
    depth). If --ol is false (i.e. --nool is set) then don't use an
    ordered list for ToC entries. (default: false)

    (See --ol_num_levels to determine how deep the ordered-list listing
    goes.)

--ol_num_levels
    The number of levels deep the OL listing will go if --ol is true.
    If set to zero, an ordered list is used for all levels.
    (default: 1)

--outfile file
    File to write the output to. This is where the modified HTML
    output and the table of contents go. If you give '-' as the
    filename, then output will go to STDOUT. (default: STDOUT)

--overwrite
    Overwrite the input file with the output. If this is in effect,
    --outfile is ignored. Used with --make_anchors for creating the
    anchors "in place", and with --make_toc if the --inline option is
    in effect. (default: off)

--quiet
    Suppress informative messages. (default: off)

--textonly
    Use only text content in significant elements.

--title string
    Title for the ToC page (if not using --header or --inline or
    --toc_only). (default: "Table of Contents")

--toc_after tag=suffix
        --toc_after "H2=</em>"

    For defining the layout of significant elements in the ToC. The
    tag is the HTML tag which marks the start of the element. The
    suffix is what is appended to the ToC entry generated for that tag.
    This is a cumulative hash argument. (default: undefined)

--toc_before tag=prefix
        --toc_before "H2=<em>"

    For defining the layout of significant elements in the ToC. The
    tag is the HTML tag which marks the start of the element. The
    prefix is what is prepended to the ToC entry generated for that
    tag. This is a cumulative hash argument. (default: undefined)

--toc_end tag=endtag
        --toc_end "H1=/H1"

    For defining significant elements. The tag is the HTML tag which
    marks the start of the element. The endtag is the HTML tag which
    marks the end of the element. When matching in the input file,
    case is ignored (but make sure that all your tag options referring
    to the same tag are exactly the same!). This is a cumulative hash
    argument. (default: H1=/H1 H2=/H2)

--toc_entry tag=level
        --toc_entry "TITLE=1" --toc_entry "H1=2"

    For defining significant elements. The tag is the HTML tag which
    marks the start of the element. The level is what level the tag is
    considered to be. The value of level must be numeric and non-zero.
    If the value is negative, consecutive entries represented by the
    significant element will be separated by the value set by the
    --entrysep option. This is a cumulative hash argument.
    (default: H1=1 H2=2)

--toc_label | --toclabel string
    HTML text that labels the ToC. Always used. (default: "<h1>Table
    of Contents</h1>")

--toc_only | --notoc_only
    Output only the table of contents, that is, the table of contents
    plus the toc_label. If there is a --header or a --footer, these
    will also be output.

    If --toc_only is false (i.e. --notoc_only is set) then if there is
    no --header, and --inline is not true, a suitable HTML page header
    will be output, and if there is no --footer and --inline is not
    true, an HTML page footer will be output.
    (default: --notoc_only)

--toc_tag string
    If a ToC is to be included inline, this is the pattern which is
    used to match the tag where the ToC should be put. This can be a
    start-tag, an end-tag or a comment, but the < should be left out;
    that is, if you want the ToC to be placed after the BODY tag, then
    give "BODY". If you want a special comment tag to mark where the
    ToC should go, then include the comment marks, for example:
    "!--toc--" (default: BODY)

--toc_tag_replace
    In conjunction with --toc_tag, this is a flag to say whether the
    given tag should be replaced, or if the ToC should be put after the
    tag. This can be useful if your toc_tag is a comment and you don't
    need it after you have the ToC in place. (default: false)

--use_id
    Use id="name" for anchors rather than <a name="name"> anchors.
    However, if an anchor already exists for a significant element,
    this won't make an ID for that particular element.

--useorg
    Use pre-existing backup files as the input source; that is, files
    of the form "filename.bak", where bak is the --bak suffix (see
    --bak).

FILE FORMATS
Options Files
Options can be given in files as well as on the command line by using
the --argfile filename option on the command line. Also, the files
~/.hypertocrc and ./.hypertocrc are checked for options.

The format is as follows: Lines starting with # are comments. Lines
enclosed in PoD markers are also comments. Blank lines are ignored.
The options themselves should be given the way they would be on the
command line, that is, the option name (including the --) followed by
its value (if any).

For example:

    # set the ToC to be three-level
    --toc_entry H1=1
    --toc_entry H2=2
    --toc_entry H3=3

    --toc_end H1=/H1
    --toc_end H2=/H2
    --toc_end H3=/H3

Option files can be nested: if an option file contains an --argfile
filename argument, the file it refers to is read as well.

See Getopt::ArgvFile for more information.
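
As a rough illustration of nesting, the following shell sketch writes two options files, one of which pulls in the other with --argfile. The names levels.opt, site.opt and page.html are hypothetical, and it is an assumption (not stated above) that a nested --argfile path is looked up relative to the current directory. The hypertoc call mirrors the command form used under EXAMPLES and is skipped when hypertoc is not installed.

```shell
#!/bin/sh
# Sketch only: levels.opt, site.opt and page.html are hypothetical names.
set -u
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Options file defining a three-level ToC (same content as the example above).
cat > levels.opt <<'EOF'
# set the ToC to be three-level
--toc_entry H1=1
--toc_entry H2=2
--toc_entry H3=3

--toc_end H1=/H1
--toc_end H2=/H2
--toc_end H3=/H3
EOF

# A second options file that nests the first one via --argfile.
cat > site.opt <<'EOF'
# shared settings for this site
--argfile levels.opt
--quiet
EOF

# Count option lines; comments and blank lines do not start with "--".
levels_count=$(grep -c '^--' levels.opt)
site_count=$(grep -c '^--' site.opt)
echo "levels.opt: $levels_count option lines, site.opt: $site_count option lines"

# A tiny input page so the command below has something to process.
printf '<html><body><h1>One</h1><h2>Two</h2><h3>Three</h3></body></html>\n' > page.html

if command -v hypertoc >/dev/null 2>&1; then
    hypertoc --make_anchors --bak "" --overwrite --argfile site.opt \
        --make_toc --outfile toc.html page.html \
        || echo "hypertoc reported an error" >&2
else
    echo "hypertoc not found; options files written to $workdir"
fi
```

Keeping the level definitions in their own file lets several site-specific options files share them.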

DETAILED DESCRIPTION
Significant Elements
Here are some examples of defining the significant elements for your
Table of Contents.

Example of Default

The following reflects the default setting if nothing is explicitly
specified:

    --toc_entry "H1=1" --toc_end "H1=/H1" --toc_entry "H2=2" --toc_end "H2=/H2"

Or, if it was defined in one of the possible "Options Files":

    # default settings
    --toc_entry H1=1
    --toc_end H1=/H1
    --toc_entry H2=2
    --toc_end H2=/H2

Example of before/after

The following options make use of the before/after options:

    # An options file that adds some formatting
    # make level 1 ToC entries <strong>
    --toc_entry H1=1
    --toc_end H1=/H1
    --toc_before H1=<strong>
    --toc_after H1=</strong>

    # make level 2 ToC entries <em>
    --toc_entry H2=2
    --toc_end H2=/H2
    --toc_before H2=<em>
    --toc_after H2=</em>

    # Make level 3 entries as is
    --toc_entry H3=3
    --toc_end H3=/H3

Example of custom end

The following options try to index definition terms:

    # An options file that can work for Glossary type documents
    --toc_entry H1=1
    --toc_end H1=/H1
    --toc_entry H2=2
    --toc_end H2=/H2

    # Assumes document has a DD for each DT, otherwise ToC
    # will get entries with a lot of text.
    --toc_entry DT=3
    --toc_end DT=DD
    --toc_before DT=<em>
    --toc_after DT=</em>

Formatting the ToC
The --toc_entry etc. options give you control over how the ToC entries
look, but there are other options that affect the final appearance of
the ToC file created.

With the --header option, the contents of the given file will be
prepended before the generated ToC. This allows you to have
introductory text, or any other text, before the ToC.

Note:
    If you use the --header option, make sure the file specified
    contains the opening HTML tag, the HEAD element (containing the
    TITLE element), and the opening BODY tag. However, these
    tags/elements should not be in the header file if the --inline
    option is used. See "Inlining the ToC" for information on what the
    header file should contain for inlining the ToC.

With the --toc_label option, the contents of the given string will be
prepended before the generated ToC (but after any text taken from a
--header file).

With the --footer option, the contents of the file will be appended
after the generated ToC.

Note:
    If you use the --footer option, make sure the file includes the
    closing BODY and HTML tags (unless, of course, you are using the
    --inline option).

If the --header option is not specified, the appropriate starting HTML
markup will be added, unless the --toc_only option is specified. If
the --footer option is not specified, the appropriate closing HTML
markup will be added, unless the --toc_only option is specified.

If you do not want to deal with header and footer files, you can
instead set the title of the ToC file with the --title option, and the
heading, or label, placed before the list of ToC entries with the
--toc_label option. Both options have default values; see "OPTIONS"
for more information on each option.

If you do not want HTML page tags to be supplied, and just want the ToC
itself, then specify the --toc_only option. If there are no --header
or --footer files, then this will simply output the contents of
--toc_label and the ToC itself.
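
The header and footer requirements for a standalone ToC page can be sketched as below. The file names toc_header.html, toc_footer.html, index.html and table.html are hypothetical, and the label text is only an example. The sketch writes a header that opens HTML, HEAD and BODY and a footer that closes BODY and HTML, checks both with grep, and then runs hypertoc only if it is installed.

```shell
#!/bin/sh
# Sketch only: all file names here are hypothetical.
set -u
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Header for a standalone ToC page: opening HTML tag, HEAD (with TITLE)
# and opening BODY tag, followed by optional introductory text.
cat > toc_header.html <<'EOF'
<html>
<head><title>Site Contents</title></head>
<body>
<p>Introductory text shown before the ToC.</p>
EOF

# Footer for a standalone ToC page: must close BODY and HTML.
cat > toc_footer.html <<'EOF'
<p>End of contents.</p>
</body>
</html>
EOF

# Check that the required tags are present before handing the files over.
header_ok=no
footer_ok=no
if grep -qi '<html' toc_header.html && grep -qi '<title>' toc_header.html \
    && grep -qi '<body' toc_header.html; then
    header_ok=yes
fi
if grep -qi '</body>' toc_footer.html && grep -qi '</html>' toc_footer.html; then
    footer_ok=yes
fi
echo "header ok: $header_ok, footer ok: $footer_ok"

# A small page to index.
printf '<html><head><title>Home</title></head><body><h1>Home</h1><h2>News</h2></body></html>\n' > index.html

if command -v hypertoc >/dev/null 2>&1; then
    # First add anchors in place, then build the ToC page.
    hypertoc --make_anchors --overwrite index.html \
        && hypertoc --make_toc --header toc_header.html --footer toc_footer.html \
            --toc_label '<h2>Contents</h2>' --outfile table.html index.html \
        || echo "hypertoc reported an error" >&2
else
    echo "hypertoc not found; skipping ToC generation"
fi
```

For an inline ToC the same header file would be wrong, since the page being processed already supplies the HTML, HEAD and BODY tags.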

Inlining the ToC
The ability to incorporate the ToC directly into an HTML document is
supported via the --inline option.

Inlining will be done on the first file in the list of files processed,
and will only be done if that file contains an opening tag matching the
--toc_tag value.

If --overwrite is true, then the first file in the list will be
overwritten, with the generated ToC inserted at the appropriate spot.
Otherwise a modified version of the first file is output to either
STDOUT or to the output file defined by the --outfile option.

The options --toc_tag and --toc_tag_replace are used to determine where
and how the ToC is inserted into the output.

Example 1

    # this is the default
    --toc_tag BODY --notoc_tag_replace

This will put the generated ToC after the BODY tag of the first file.
If the --header option is specified, then the contents of the specified
file are inserted after the BODY tag. If the --toc_label option is not
empty, then the text specified by the --toc_label option is inserted.
Then the ToC is inserted, and finally, if the --footer option is
specified, the footer is inserted. Then the rest of the input file
follows as it was before.

Example 2

    --toc_tag '!--toc--' --toc_tag_replace

This will put the generated ToC at the first comment of the form
<!--toc-->. That comment is replaced by the following, in this order:

    --header
    --toc_label
    ToC
    --footer

The rest of the input file then follows.

Note:
    The header file should not contain the beginning HTML tag and HEAD
    element since the HTML file being processed should already contain
    these tags/elements.
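
A minimal sketch of the comment-placeholder approach follows. The names page.html and inline_header.html, and the markup inside them, are made up for illustration. The header file is a bare fragment, in line with the note above, and the hypertoc call (guarded so that the script also runs without hypertoc) combines only options described in this document.

```shell
#!/bin/sh
# Sketch only: page.html and inline_header.html are hypothetical.
set -u
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Input page with a placeholder comment marking where the ToC should go.
cat > page.html <<'EOF'
<html>
<head><title>Guide</title></head>
<body>
<h1>Guide</h1>
<!--toc-->
<h2>Install</h2>
<h2>Configure</h2>
</body>
</html>
EOF

# Header fragment for inlining: no HTML, HEAD or BODY tags, because
# page.html already contains them.
cat > inline_header.html <<'EOF'
<p>On this page:</p>
EOF

# Record how many placeholder comments the page has before the run.
markers_before=$(grep -c -e '<!--toc-->' page.html)
echo "placeholder comments before run: $markers_before"

if command -v hypertoc >/dev/null 2>&1; then
    hypertoc --make_anchors --make_toc --inline --overwrite \
        --header inline_header.html --toc_label '' \
        --toc_tag '!--toc--' --toc_tag_replace page.html \
        || echo "hypertoc reported an error" >&2
else
    echo "hypertoc not found; page.html left unchanged"
fi
```

Because --toc_tag_replace removes the comment, a second run should find no matching tag in the file.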

EXAMPLES
Create an inline ToC for one file

    hypertoc --inline --make_anchors --overwrite --make_toc index.html

This will create anchors in "index.html", create a ToC with a heading
of "Table of Contents" and place it after the BODY tag of "index.html".
The file index.html.org will contain the original index.html file,
without ToC or anchors.

Create a ToC file from multiple files

First, create the anchors.

    hypertoc --make_anchors --overwrite index.html fred.html george.html

Then create the ToC.

    hypertoc --make_toc --outfile table.html index.html fred.html george.html

Create an inline ToC after the first heading of the first file

    hypertoc --make_anchors --inline --overwrite --make_toc --toc_tag /H1 \
        --notoc_tag_replace --toc_label "" index.html fred.html george.html

This will create anchors in the "index.html", "fred.html" and
"george.html" files, create a ToC with no heading, place it after the
first H1 header in "index.html", and back up the original files to
"index.html.org", "fred.html.org" and "george.html.org".

Create an inline ToC with custom elements

    hypertoc --quiet --make_anchors --bak "" --overwrite \
        --make_toc --inline --toc_label "" --toc_tag '!--toc--' \
        --toc_tag_replace \
        --toc_entry H2=1 --toc_entry H3=2 \
        --toc_end H2=/H2 --toc_end H3=/H3 myfile.html

This will create an inline ToC, overwriting the original file and
replacing a <!--toc--> comment, which takes H2 headers as level 1
and H3 headers as level 2. This can be useful where the .html file is
generated by some other process, and you can then create the ToC as the
last step.

Create a ToC with custom elements

    hypertoc --quiet --make_anchors --bak "" --overwrite \
        --toc_entry TITLE=1 --toc_end TITLE=/TITLE \
        --toc_entry H2=2 --toc_entry H3=3 \
        --toc_end H2=/H2 --toc_end H3=/H3 \
        --make_toc --outfile index.html \
        mary.html fred.html george.html

This creates anchors at the H2 and H3 elements, and creates a ToC file
called index.html, indexing on the TITLE, and the H2 and H3 elements.

Create a ToC with custom elements and options file

Given an options file called 'custom.opt' as follows:

    # Title, H2 and H3
    --toc_entry TITLE=1
    --toc_end TITLE=/TITLE
    --toc_entry H2=2
    --toc_end H2=/H2
    --toc_entry H3=3
    --toc_end H3=/H3

then the previous example can have a shorter command line as follows:

    hypertoc --quiet --make_anchors --bak "" --overwrite \
        --argfile custom.opt --make_toc --outfile index.html \
        mary.html fred.html george.html

NOTES
• hypertoc is smart enough to detect anchors inside significant
  elements. If the anchor defines the NAME attribute, hypertoc uses
  the value. Otherwise, it adds its own NAME attribute to the anchor.
  If --use_id is true, then it likewise checks for and uses IDs.

• The TITLE element is treated specially if specified as a
  significant element. It is illegal to insert anchors (A) into
  TITLE elements. Therefore, hypertoc will actually link to the
  filename itself instead of the TITLE element of the document.

• hypertoc will ignore a significant element if it does not contain
  any non-whitespace characters. A warning message is generated if
  such a condition exists.

• If you have a sequence of significant elements that change in a
  slightly disordered fashion, such as H1 -> H3 -> H2 or even H2 ->
  H1, hypertoc still creates a list which is good HTML. However, if
  you are using an ordered list to that depth, you will get strange
  numbering, as an extra list element will have been inserted to nest
  the elements at the correct level.

  For example (H2 -> H1 with --ol_num_levels=1):

      1.
        * My H2 Header
      2. My H1 Header

  For example (H1 -> H3 -> H2 with --ol_num_levels=0 and H3 also
  being significant):

      1. My H1 Header
        1.
          1. My H3 Header
        2. My H2 Header
      2. My Second H1 Header

  In cases such as this it may be better not to use the --ol option.

• If one is not using --overwrite when generating anchors, then the
  command needs to be done in two passes, in order to give the
  correct filenames (the ones with the actual anchors in them) to the
  ToC generation part. Otherwise the ToC will have anchors pointing
  to files that don't have them.

• When using --inline, care needs to be taken if overwriting: if one
  sets the ToC to be included after a given tag (such as the default
  BODY) and then runs the command repeatedly, one could get multiple
  ToCs in the same file, one after the other.
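
One possible way to avoid stacking ToCs on repeated runs is to insert the ToC at a placeholder comment that is replaced (--toc_tag '!--toc--' with --toc_tag_replace) and to check for that placeholder before running. The sketch below assumes this convention; the function names needs_toc and add_toc_once and the sample files are hypothetical and not part of hypertoc.

```shell
#!/bin/sh
# Sketch only: needs_toc and add_toc_once are hypothetical helper names.
set -u

# needs_toc FILE: succeed only if FILE still contains the <!--toc--> placeholder.
needs_toc() {
    grep -q -e '<!--toc-->' "$1"
}

# add_toc_once FILE: run hypertoc only while the placeholder is still present.
add_toc_once() {
    if ! needs_toc "$1"; then
        echo "skip: $1 (no placeholder, ToC probably already inserted)"
        return 0
    fi
    if command -v hypertoc >/dev/null 2>&1; then
        hypertoc --make_anchors --make_toc --inline --overwrite \
            --toc_tag '!--toc--' --toc_tag_replace "$1"
    else
        echo "would run hypertoc on: $1"
    fi
}

# Two sample pages: one still carrying the placeholder, one without it.
workdir=$(mktemp -d)
printf '<html><body>\n<!--toc-->\n<h1>Fresh</h1>\n</body></html>\n' > "$workdir/fresh.html"
printf '<html><body>\n<h1>Done</h1>\n</body></html>\n' > "$workdir/done.html"

add_toc_once "$workdir/fresh.html"
add_toc_once "$workdir/done.html"
```

The check relies on the placeholder being removed on the first run, so it does not help with the default BODY tag, which stays in the file.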

CAVEATS
• Version 3.10 (and above) generates more verbose (SEO-friendly)
  anchors than prior versions. Thus anchors generated with earlier
  versions will not match version 3.10 anchors.

• Version 3.00 (and above) of hypertoc behaves somewhat differently
  than Version 2.x of hypertoc. It is now designed to do everything
  in one pass, and has dropped certain options: the --infile option
  is no longer used (all filenames are put at the end of the
  command); the --toc_file option no longer exists (use the --outfile
  option instead); the --tocmap option is no longer supported.

  It now generates lower-case tags rather than upper-case ones.

• hypertoc is not very efficient (memory and speed), and can be slow
  for large documents.

• Now that generation of anchors and of the ToC are done in one pass,
  even more memory is used than was the case before. This is more
  notable when processing multiple files, since all files are read
  into memory before processing them.

• Invalid markup will be generated if a significant element is
  contained inside of an anchor. For example:

      <a name="foo"><h1>The FOO command</h1></a>

  will be converted to (if h1 is a significant element),

      <a name="foo"><h1><a name="The">The</a> FOO command</h1></a>

  which is illegal since anchors cannot be nested.

  It is better style to put anchor statements within the element to
  be anchored. For example, the following is preferred:

      <h1><a name="foo">The FOO command</a></h1>

  hypertoc will detect the "foo" NAME and use it.

  Even better is to use IDs:

      <h1 id="foo">The FOO command</h1>

• NAME attributes without quotes are not recognized.
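
Two of the caveats above can be checked for before running hypertoc. The sketch below uses grep on a made-up sample file; the regular expressions are rough, line-based heuristics (an assumption of this example, not something hypertoc provides) and will miss markup that spans several lines. Single-quoted values are treated as quoted here, which is also an assumption.

```shell
#!/bin/sh
# Sketch only: sample.html and both patterns are illustrative heuristics.
set -u
workdir=$(mktemp -d)
cat > "$workdir/sample.html" <<'EOF'
<h1><a name="intro">Introduction</a></h1>
<h2><a name=setup>Setup</a></h2>
<a name="wrapped"><h2>Wrapped heading</h2></a>
EOF

# Lines with an anchor whose NAME value is not quoted.
unquoted=$(grep -c -i -E "<a[^>]*[[:space:]]name=[^\"'[:space:]>]" "$workdir/sample.html")

# Lines where an opening anchor tag is immediately followed by a heading,
# i.e. a significant element contained inside an anchor.
wrapped=$(grep -c -i -E '<a[[:space:]][^>]*>[[:space:]]*<h[1-6]' "$workdir/sample.html")

echo "unquoted NAME attributes: $unquoted line(s)"
echo "headings wrapped in anchors: $wrapped line(s)"
```

Replacing -c with -n in either command lists the offending lines with their line numbers instead of counting them.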

BUGS
Tell me about them.

PREREQUISITES
    Getopt::Long
    Getopt::ArgvFile
    File::Basename
    Pod::Usage
    HTML::LinkList
    HTML::Entities
    HTML::GenToc

SCRIPT CATEGORIES
Web

ENVIRONMENT
HOME
    hypertoc looks in the HOME directory for config files.

FILES
"~/.hypertocrc"
    User configuration file.

".hypertocrc"
    Configuration file in the current working directory; overrides
    options in "~/.hypertocrc" and is overridden by command-line
    options.
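
The precedence described above can be tried out without touching your real configuration by pointing HOME at a scratch directory. In this sketch the directories, the page.html file and the --bak values ("orig", "old") are all hypothetical; the expectation printed at the end simply restates the rule above and is not verified output.

```shell
#!/bin/sh
# Sketch only: scratch directories stand in for the real HOME and project.
set -u
fakehome=$(mktemp -d)
project=$(mktemp -d)

# User-wide configuration file.
cat > "$fakehome/.hypertocrc" <<'EOF'
# user defaults
--bak orig
EOF

# Configuration file in the working directory; should override the one above.
cat > "$project/.hypertocrc" <<'EOF'
# project settings
--bak old
EOF

# Read the values back to show what each file asks for.
home_bak=$(sed -n 's/^--bak //p' "$fakehome/.hypertocrc")
proj_bak=$(sed -n 's/^--bak //p' "$project/.hypertocrc")
echo "~/.hypertocrc asks for suffix: $home_bak"
echo "./.hypertocrc asks for suffix: $proj_bak (expected to win unless --bak is given on the command line)"

printf '<html><body><h1>Title</h1></body></html>\n' > "$project/page.html"

if command -v hypertoc >/dev/null 2>&1; then
    # Run in a subshell so the changed directory and HOME do not leak out.
    ( cd "$project" && HOME="$fakehome" hypertoc --make_anchors --overwrite page.html ) \
        || echo "hypertoc reported an error" >&2
    ls "$project"
else
    echo "hypertoc not found; config files written under $fakehome and $project"
fi
```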

SEE ALSO
perl(1) htmltoc(1) HTML::GenToc Getopt::ArgvFile Getopt::Long

AUTHOR
Kathryn Andersen http://www.katspace.org/tools/hypertoc/

Based on htmltoc by Earl Hood ehood AT medusa.acs.uci.edu

Contributions from Dan Dascalescu, <http://dandascalescu.com>

COPYRIGHT
Copyright (C) 1994-1997 Earl Hood, ehood AT medusa.acs.uci.edu
Copyright (C) 2002-2008 Kathryn Andersen

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
675 Mass Ave, Cambridge, MA 02139, USA.

perl v5.38.0                      2023-07-20                      HYPERTOC(1)