LOWDOWN(3) BSD Library Functions Manual LOWDOWN(3)
2
NAME
4 lowdown — simple markdown translator library
5
LIBRARY
7 library “liblowdown”
8
SYNOPSIS
10 #include <sys/queue.h>
11 #include <stdio.h>
12 #include <lowdown.h>
13
14 struct lowdown_metadata
15 struct lowdown_node
16 struct lowdown_opts
17
DESCRIPTION
19 This library parses lowdown(5) into various output formats.
20
The library provides a high-level interface consisting of
lowdown_buf(3), lowdown_buf_diff(3), lowdown_file(3), and
lowdown_file_diff(3).
24
25 The high-level functions interface with low-level functions that perform
26 parsing and formatting. These consist of lowdown_doc_new(3),
27 lowdown_doc_parse(3), and lowdown_doc_free(3) for parsing lowdown(5) doc‐
28 uments into an abstract syntax tree.
29
30 The front-end functions for freeing, allocation, and rendering are as
31 follows.
32
33 • HTML5:
34 lowdown_html_free(3)
35 lowdown_html_new(3)
36 lowdown_html_rndr(3)
37
38 • gemini:
39 lowdown_gemini_free(3)
40 lowdown_gemini_new(3)
41 lowdown_gemini_rndr(3)
42
43 • LaTeX:
44 lowdown_latex_free(3)
45 lowdown_latex_new(3)
46 lowdown_latex_rndr(3)
47
48 • OpenDocument:
49 lowdown_odt_free(3)
50 lowdown_odt_new(3)
51 lowdown_odt_rndr(3)
52
53 • roff:
54 lowdown_nroff_free(3)
55 lowdown_nroff_new(3)
56 lowdown_nroff_rndr(3)
57
58 • UTF-8 ANSI terminal:
59 lowdown_term_free(3)
60 lowdown_term_new(3)
61 lowdown_term_rndr(3)
62
63 • debugging:
64 lowdown_tree_rndr(3)
65
66 To compile and link, use pkg-config(1):
67
68 % cc `pkg-config --cflags lowdown` -c -o sample.o sample.c
69 % cc -o sample sample.o `pkg-config --libs lowdown`
70
71 Pledge Promises
72 The lowdown library is built to operate in security-sensitive environ‐
73 ments, such as those using pledge(2) on OpenBSD. The only promise re‐
74 quired is stdio for lowdown_file_diff(3) and lowdown_file(3): both re‐
75 quire access to the stream for reading input.
76
77 Types
78 All lowdown functions use one or more of the following structures.
79
The struct lowdown_opts structure manages features. It has the following
81 fields:
82
83 unsigned int feat
84 Features used during the parse. This bit-field may have
85 the following bits OR'd:
86
87 LOWDOWN_ATTRS
88 Parse PHP extra link, header, and image attributes.
89 LOWDOWN_AUTOLINK
90 Parse http, https, ftp, mailto, and relative links
91 or link fragments.
92 LOWDOWN_COMMONMARK
93 Tighten input parsing to the CommonMark specifica‐
94 tion. This also uses the first ordered list value
95 instead of starting all lists at one. This feature
96 is experimental and incomplete.
97 LOWDOWN_DEFLIST
98 Parse PHP extra definition lists. This is cur‐
99 rently constrained to single-key lists.
100 LOWDOWN_FENCED
101 Parse GFM fenced (language-specific) code blocks.
102 LOWDOWN_FOOTNOTES
103 Parse MMD style footnotes. This only supports the
104 referenced footnote style, not the "inline" style.
105 LOWDOWN_HILITE
Parse highlight sequences. These are disabled by default
because they may be erroneously interpreted as section
headers.
109 LOWDOWN_IMG_EXT
110 Deprecated. Use LOWDOWN_ATTRS instead.
111 LOWDOWN_MANTITLE
112 Recognise manpage titles in Pandoc metadata title
113 lines. Only applicable if LOWDOWN_METADATA is also
provided. Manpage titles must begin with a non-empty
title followed by an open parenthesis, a digit or "n",
optional letters, then a closing parenthesis. This may
optionally be followed by a source and, if a vertical
bar is detected, the content after it as the volume.
These are passed to the renderers as the title, section,
and optionally source and volume metadata key-value
pairs. The original title is not recoverable.
123 LOWDOWN_MATH
124 Parse mathematics equations.
125 LOWDOWN_METADATA
126 Parse in-document metadata.
127 LOWDOWN_NOCODEIND
128 Do not parse indented content as code blocks.
129 LOWDOWN_NOINTEM
130 Do not parse emphasis within words.
131 LOWDOWN_STRIKE
132 Parse strikethrough sequences.
133 LOWDOWN_SUPER
134 Parse super-scripts. This accepts foo^bar, which
135 puts the parts following the caret until whitespace
in superscripts; or foo^(bar), which puts only the
parts in parentheses.
138 LOWDOWN_TABLES
139 Parse GFM tables.
140 LOWDOWN_TASKLIST
141 Parse GFM task list items.
142
143 The default value is zero (none).
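
The title syntax described for LOWDOWN_MANTITLE can be sketched in
plain C. This is only an illustration of the rule as worded above, not
the library's parser: struct mantitle, copy_trim(), and mantitle_parse()
are hypothetical names, the buffer sizes are arbitrary, and trimming of
surrounding blanks is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed parts; not a lowdown type. */
struct mantitle {
	char	 title[64];
	char	 section[16];
	char	 source[64];
	char	 volume[64];
};

/* Copy src[0..len) into dst, trimming blanks and truncating to fit. */
static void
copy_trim(char *dst, size_t dstsz, const char *src, size_t len)
{
	while (len > 0 && isspace((unsigned char)src[0])) {
		src++;
		len--;
	}
	while (len > 0 && isspace((unsigned char)src[len - 1]))
		len--;
	if (len >= dstsz)
		len = dstsz - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}

/*
 * Split "TITLE(SECTION) source | volume" into its parts.
 * Returns 1 if the line has the manpage title shape, 0 otherwise.
 * The contents of out are only meaningful when 1 is returned.
 */
int
mantitle_parse(const char *line, struct mantitle *out)
{
	const char	*lpar, *rpar, *bar, *p;

	assert(line != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	if ((lpar = strchr(line, '(')) == NULL)
		return 0;
	if ((rpar = strchr(lpar, ')')) == NULL)
		return 0;

	/* The title before the parenthesis must not be empty. */
	copy_trim(out->title, sizeof(out->title), line,
	    (size_t)(lpar - line));
	if (out->title[0] == '\0')
		return 0;

	/* Section: a digit or "n", then optional letters. */
	p = lpar + 1;
	if (p == rpar || (!isdigit((unsigned char)*p) && *p != 'n'))
		return 0;
	for (p++; p < rpar; p++)
		if (!isalpha((unsigned char)*p))
			return 0;
	copy_trim(out->section, sizeof(out->section), lpar + 1,
	    (size_t)(rpar - lpar - 1));

	/* Optional source; text after a vertical bar is the volume. */
	p = rpar + 1;
	if ((bar = strchr(p, '|')) == NULL) {
		copy_trim(out->source, sizeof(out->source), p, strlen(p));
	} else {
		copy_trim(out->source, sizeof(out->source), p,
		    (size_t)(bar - p));
		copy_trim(out->volume, sizeof(out->volume), bar + 1,
		    strlen(bar + 1));
	}
	return 1;
}
```

Under these assumptions, passing the line "LOWDOWN(3) BSD | Library
Functions Manual" fills in the title "LOWDOWN", the section "3", the
source "BSD", and the volume "Library Functions Manual", while a line
such as "foo(x)" is rejected.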
144
145 unsigned int oflags
146 Features used by the output generators. This bit-field may
147 have the following enabled. Note that bits are by defini‐
148 tion specific to an output type.
149
150 For LOWDOWN_HTML:
151
152 LOWDOWN_HTML_ESCAPE
153 If LOWDOWN_HTML_SKIP_HTML has not been set, escapes
154 in-document HTML so that it is rendered as opaque
155 text.
156 LOWDOWN_HTML_HARD_WRAP
157 Retain line-breaks within paragraphs.
158 LOWDOWN_HTML_HEAD_IDS
159 Have an identifier written with each header element
160 consisting of an HTML-escaped version of the header
161 contents.
162 LOWDOWN_HTML_OWASP
163 When escaping text, be extra paranoid in following
164 the OWASP suggestions for which characters to es‐
165 cape.
166 LOWDOWN_HTML_NUM_ENT
167 Convert, when possible, HTML entities to their nu‐
168 meric form. If not set, the entities are used as
169 given in the input.
170 LOWDOWN_HTML_SKIP_HTML
171 Do not render in-document HTML at all.
172 LOWDOWN_HTML_TITLEBLOCK
173 If used with LOWDOWN_STANDALONE, output a Pandoc-
174 style title block. This is a <header
175 id="title-block-header"> element right after the
176 opening <body> containing elements for specified
177 title, author(s), and date. These are <h1> and <p>
178 elements, respectively, with classes set to what's
179 being output (title, etc.). At least one of these
180 must be specified for the title block to be output.
181
182 For LOWDOWN_GEMINI, there are several flags for controlling
183 link placement. By default, links (images, autolinks, and
184 links) are queued when specified in-line then emitted in a
185 block sequence after the nearest block element.
186
187 LOWDOWN_GEMINI_LINK_END
188 Emit the queue of links at the end of the document
189 instead of after the nearest block element.
190 LOWDOWN_GEMINI_LINK_IN
Render all links within the flow of text. This
will cause breakage when links are nested, such as
images within links, links in blockquotes, etc. It
should not be used except in carefully crafted
documents.
196 LOWDOWN_GEMINI_LINK_NOREF
197 Do not format link labels. Takes precedence over
198 LOWDOWN_GEMINI_LINK_ROMAN.
199 LOWDOWN_GEMINI_LINK_ROMAN
200 When formatting link labels, use lower-case Roman
201 numerals instead of the default lowercase hexavi‐
202 gesimal (i.e., “a”, “b”, ..., “aa”, “ab”, ...).
203 LOWDOWN_GEMINI_METADATA
204 Print metadata as the canonicalised key followed by
205 a colon then the value, each on one line (newlines
206 replaced by spaces). The metadata block is termi‐
207 nated by a double newline. If there is no meta‐
208 data, this does nothing.
209
210 There may only be one of LOWDOWN_GEMINI_LINK_END or
211 LOWDOWN_GEMINI_LINK_IN. If both are specified, the latter
212 is unset.
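
The default "hexavigesimal" link labels mentioned for
LOWDOWN_GEMINI_LINK_ROMAN ("a", "b", ..., "aa", "ab", ...) follow a
bijective base-26 scheme. The sketch below shows one way to produce
such labels; link_label() is a hypothetical helper and the zero-based
index is an assumption, so it should not be read as the renderer's own
code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write the label for a zero-based link index into buf:
 * 0 is "a", 25 is "z", 26 is "aa", 27 is "ab", and so on.
 * Returns the label length, or 0 if buf is too small.
 */
size_t
link_label(size_t idx, char *buf, size_t bufsz)
{
	char	 tmp[32];	/* ample for any 64-bit index */
	size_t	 n = 0, i;

	assert(buf != NULL);

	/* Collect letters from least to most significant. */
	for (;;) {
		tmp[n++] = (char)('a' + idx % 26);
		idx /= 26;
		if (idx == 0)
			break;
		idx--;		/* bijective: there is no zero digit */
	}

	if (n + 1 > bufsz)
		return 0;

	/* Reverse into the caller's buffer. */
	for (i = 0; i < n; i++)
		buf[i] = tmp[n - 1 - i];
	buf[n] = '\0';
	return n;
}
```

The decrement after each division is what distinguishes this from
ordinary base 26: without it, "aa" could never follow "z".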
213
214 For LOWDOWN_FODT:
215
216 LOWDOWN_ODT_SKIP_HTML
217 Do not render in-document HTML at all. Text within
218 HTML elements remains.
219
220 For LOWDOWN_LATEX:
221
222 LOWDOWN_LATEX_NUMBERED
223 Use the default numbering scheme for sections, sub‐
224 sections, etc. If not specified, these are inhib‐
225 ited.
226 LOWDOWN_LATEX_SKIP_HTML
227 Do not render in-document HTML at all. Text within
228 HTML elements remains.
229
230 And for LOWDOWN_MAN and LOWDOWN_NROFF:
231
232 LOWDOWN_NROFF_GROFF
233 Use GNU extensions (i.e., for groff(1)) when ren‐
234 dering output. The groff arguments must include
235 -mpdfmark for formatting links with LOWDOWN_MAN or
236 -mspdf instead of -ms for LOWDOWN_NROFF. Applies
237 to the LOWDOWN_MAN and LOWDOWN_NROFF output types.
238 LOWDOWN_NROFF_NUMBERED
Use numbered sections if LOWDOWN_NROFF_GROFF is
240 not specified. Only applies to the LOWDOWN_NROFF
241 output type.
242 LOWDOWN_NROFF_SKIP_HTML
243 Do not render in-document HTML at all. Text within
244 HTML elements remains.
245 LOWDOWN_NROFF_SHORTLINK
246 Render link URLs in short form. Applies to images,
247 autolinks, and regular links. Only in LOWDOWN_MAN
248 or when LOWDOWN_NROFF_GROFF is not specified.
249 LOWDOWN_NROFF_NOLINK
250 Don't show links at all if they have embedded text.
251 Applies to images and regular links. Only in
252 LOWDOWN_MAN or when LOWDOWN_NROFF_GROFF is not
253 specified.
254
255 For LOWDOWN_TERM:
256
257 LOWDOWN_TERM_NOANSI
258 Don't apply ANSI style codes at all. This implies
259 LOWDOWN_TERM_NOCOLOUR.
260 LOWDOWN_TERM_NOCOLOUR
261 Don't apply ANSI colour codes. This will still
262 show underline, bold, etc. This should not be used
263 in difference mode, as the output will make no
264 sense.
265 LOWDOWN_TERM_NOLINK
266 Don't show links at all. Applies to images and
267 regular links: autolinks are still shown. This may
268 be combined with LOWDOWN_TERM_SHORTLINK to also
269 shorten autolinks.
270 LOWDOWN_TERM_SHORTLINK
271 Render link URLs in short form. Applies to images,
272 autolinks, and regular links. This may be combined
273 with LOWDOWN_TERM_NOLINK to only show shortened au‐
274 tolinks.
275
276 For any mode, you may specify:
277
278 LOWDOWN_SMARTY
Use smart typography formatting.
280 LOWDOWN_STANDALONE
281 Emit a full document instead of a document frag‐
282 ment. This envelope is largely populated from
283 metadata if LOWDOWN_METADATA was provided as an op‐
284 tion or as given in meta or metaovr.
285
286 size_t maxdepth
287 The maximum parse depth before the parser exits. Most doc‐
288 uments will have a parse depth in the single digits.
289
290 size_t cols
291 For LOWDOWN_TERM, the "soft limit" for width of terminal
292 output not including margins. If zero, 80 shall be used.
293
294 size_t hmargin
295 For LOWDOWN_TERM, the left margin (space characters).
296
297 size_t vmargin
298 For LOWDOWN_TERM, the top/bottom margin (newlines).
299
300 enum lowdown_type type
301 May be set to LOWDOWN_HTML for HTML5 output, LOWDOWN_LATEX
302 for LaTeX, LOWDOWN_MAN for -man macros, LOWDOWN_FODT for
303 “flat” OpenDocument, LOWDOWN_TERM for ANSI-compatible UTF-8
304 terminal output, LOWDOWN_GEMINI for the Gemini format, or
305 LOWDOWN_NROFF for -ms macros. The LOWDOWN_TREE type causes
306 a debug tree to be written.
307
308 struct lowdown_opts_odt odt
309 If type is LOWDOWN_FODT, this contains const char *sty,
310 which is either NULL or the OpenDocument styles used when
311 creating standalone documents. If NULL, the default styles
312 are used.
313
314 char **meta
315 An array of metadata key-value pairs or NULL. Each pair
316 must appear as if provided on one line (or multiple lines)
317 of the input, including the terminating newline character.
318 If not consisting of a valid pair (e.g., no newline, no
319 colon), then it is ignored. When processed, these values
320 are overridden by those in the document (if
321 LOWDOWN_METADATA is specified) or by those in metaovr.
322
323 size_t metasz
Number of pairs in meta.
325
326 char **metaovr
327 See meta. The difference is that metaovr is applied after
328 meta and in-document metadata, so it overrides prior val‐
329 ues.
330
331 size_t metaovrsz
332 Number of pairs in metaovr.
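
The precedence among meta, in-document metadata, and metaovr can be
illustrated with a small lookup over "key: value\n" strings. This is a
sketch of the ordering described above and not the library's
implementation: pair_find() and meta_resolve() are hypothetical names,
keys are compared verbatim rather than canonicalised, and letting the
last duplicate within one array win is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Find key in an array of "key: value\n" strings.
 * Returns a pointer to the value (leading blanks skipped, trailing
 * newline still attached) or NULL.  Entries lacking a colon or a
 * newline are ignored, as the text requires.
 */
const char *
pair_find(char *const *pairs, size_t pairsz, const char *key)
{
	const char	*colon, *val, *found = NULL;
	size_t		 i, keysz;

	assert(key != NULL);
	keysz = strlen(key);

	for (i = 0; i < pairsz; i++) {
		colon = strchr(pairs[i], ':');
		if (colon == NULL || strchr(colon, '\n') == NULL)
			continue;
		if ((size_t)(colon - pairs[i]) != keysz ||
		    strncmp(pairs[i], key, keysz) != 0)
			continue;
		for (val = colon + 1; *val == ' ' || *val == '\t'; val++)
			continue;
		found = val;	/* assumption: last duplicate wins */
	}
	return found;
}

/*
 * Resolve key with metaovr taking precedence over in-document
 * metadata (doc), which in turn takes precedence over meta.
 */
const char *
meta_resolve(char *const *meta, size_t metasz,
    char *const *doc, size_t docsz,
    char *const *metaovr, size_t metaovrsz, const char *key)
{
	const char	*v;

	if ((v = pair_find(metaovr, metaovrsz, key)) != NULL)
		return v;
	if ((v = pair_find(doc, docsz, key)) != NULL)
		return v;
	return pair_find(meta, metasz, key);
}
```

With meta holding "title: default\n", the document holding
"title: from document\n", and an empty metaovr, resolving "title" in
this sketch yields the document's value.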
333
334 Another common structure is struct lowdown_metadata, which is used to
335 hold parsed (and output-formatted) metadata keys and values if
336 LOWDOWN_METADATA was provided as an input bit. This structure consists
337 of the following fields:
338
339 char *key
340 The metadata key in its lowercase, canonical form.
341
342 char *value
343 The metadata value as rendered in the current output for‐
344 mat. This may be an empty string.
345
346 The abstract syntax tree is encoded in struct lowdown_node, which con‐
347 sists of the following.
348
349 enum lowdown_rndrt type
350 The node type. (Described below.)
351
352 size_t id
353 An identifier unique within the document. This can be used
354 as a table index since the number is assigned from a mono‐
355 tonically increasing point during the parse.
356
357 struct lowdown_node *parent
358 The parent of the node, or NULL at the root.
359
360 enum lowdown_chng chng
361 Change tracking: whether this node was inserted
362 (LOWDOWN_CHNG_INSERT), deleted (LOWDOWN_CHNG_DELETE), or
363 neither (LOWDOWN_CHNG_NONE).
364
365 struct lowdown_nodeq children
366 A possibly-empty list of child nodes.
367
368 <anon union>
369 An anonymous union of type-specific structures. See below
370 for a description of each one.
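
Because each node carries a unique id, a parent pointer, and a queue of
children, a tree walk can store per-node data in a flat table indexed
by id. The sketch below uses a reduced stand-in structure so that it
compiles without the library: struct node, node_init(), node_depths(),
and the entries link field are hypothetical, and only the id, parent,
and children fields follow the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Reduced stand-in for struct lowdown_node (see lead-in). */
struct node;
TAILQ_HEAD(nodeq, node);

struct node {
	size_t			 id;
	struct node		*parent;
	struct nodeq		 children;
	TAILQ_ENTRY(node)	 entries;	/* sketch's own link field */
};

/* Initialise a node and append it to its parent, if any. */
void
node_init(struct node *n, size_t id, struct node *parent)
{
	assert(n != NULL);
	n->id = id;
	n->parent = parent;
	TAILQ_INIT(&n->children);
	if (parent != NULL)
		TAILQ_INSERT_TAIL(&parent->children, n, entries);
}

/*
 * Depth-first walk recording each node's depth in table[id].
 * Returns the number of nodes visited, or 0 if some id does not
 * fit in the table.
 */
size_t
node_depths(const struct node *n, size_t depth,
    size_t *table, size_t tablesz)
{
	const struct node	*c;
	size_t			 count = 1, sub;

	assert(n != NULL && table != NULL);
	if (n->id >= tablesz)
		return 0;
	table[n->id] = depth;

	TAILQ_FOREACH(c, &n->children, entries) {
		sub = node_depths(c, depth + 1, table, tablesz);
		if (sub == 0)
			return 0;
		count += sub;
	}
	return count;
}
```

A caller would size the table from the largest identifier seen during
the parse; the bounds check here merely guards the sketch against an
undersized table.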
371
372 The nodes may be one of the following types, with default rendering in
373 HTML5 to illustrate functionality.
374
375 LOWDOWN_BLOCKCODE
376 A block-level (and possibly language-specific) snippet of
377 code. Described by the <pre><code> elements.
378
379 LOWDOWN_BLOCKHTML
380 A block-level snippet of HTML. This is simply opaque HTML
381 content. (Only if configured during parse.)
382
383 LOWDOWN_BLOCKQUOTE
384 A block-level quotation. Described by the <blockquote> el‐
385 ement.
386
387 LOWDOWN_CODESPAN
388 A snippet of code. Described by the <code> element.
389
390 LOWDOWN_DOC_HEADER
391 A header with data gathered from document metadata (if con‐
392 figured). Described by the <head> element. (Only if con‐
393 figured during parse.)
394
395 LOWDOWN_DOUBLE_EMPHASIS
396 Bold (or otherwise notable) content. Described by the
397 <strong> element.
398
399 LOWDOWN_EMPHASIS
400 Italic (or otherwise notable) content. Described by the
401 <em> element.
402
403 LOWDOWN_ENTITY
404 An HTML entity, which may either be named or numeric.
405
406 LOWDOWN_FOOTNOTE
407 A footnote. (Only if configured during parse.)
408
409 LOWDOWN_HEADER
410 A block-level header. Described (in the HTML case) by one
411 of <h1> through <h6>.
412
413 LOWDOWN_HIGHLIGHT
Marked text. Described by the <mark> element. (Only if
415 configured during parse.)
416
417 LOWDOWN_HRULE
418 A horizontal line. Described by <hr>.
419
420 LOWDOWN_IMAGE
421 An image. Described by the <img> element.
422
423 LOWDOWN_LINEBREAK
424 A hard line-break within a block context. Described by the
425 <br> element.
426
427 LOWDOWN_LINK
428 A link to external media. Described by the <a> element.
429
430 LOWDOWN_LINK_AUTO
431 Like LOWDOWN_LINK, except inferred from text content. De‐
432 scribed by the <a> element. (Only if configured during
433 parse.)
434
435 LOWDOWN_LIST
436 A block-level list enclosure. Described by <ul> or <ol>.
437
438 LOWDOWN_LISTITEM
439 A block-level list item, always appearing within a
440 LOWDOWN_LIST. Described by <li>.
441
442 LOWDOWN_MATH_BLOCK
443 A block (or inline) of mathematical text in LaTeX format.
444 Described within \[xx\] or \(xx\). This is usually (in
445 HTML) externally handled by a JavaScript renderer. (Only
446 if configured during parse.)
447
448 LOWDOWN_META
449 Meta-data keys and values. (Only if configured during
450 parse.) These are described by elements in the <head> ele‐
451 ment.
452
453 LOWDOWN_NORMAL_TEXT
454 Normal text content.
455
456 LOWDOWN_PARAGRAPH
457 A block-level paragraph. Described by the <p> element.
458
459 LOWDOWN_RAW_HTML
460 An inline of raw HTML. (Only if configured during parse.)
461
462 LOWDOWN_ROOT
463 The root of the document. This is always the topmost node,
464 and the only node where the parent field is NULL.
465
466 LOWDOWN_STRIKETHROUGH
467 Content struck through. Described by the <del> element.
468 (Only if configured during parse.)
469
470 LOWDOWN_SUPERSCRIPT
471 A superscript. Described by the <sup> element. (Only if
472 configured during parse.)
473
474 LOWDOWN_TABLE_BLOCK
475 A table block. Described by <table>. (Only if configured
476 during parse.)
477
478 LOWDOWN_TABLE_BODY
479 A table body section. Described by <tbody>. Parent is al‐
480 ways LOWDOWN_TABLE_BLOCK. (Only if configured during
481 parse.)
482
483 LOWDOWN_TABLE_CELL
484 A table cell. Described by <td> or <th> if in the header.
485 Parent is always LOWDOWN_TABLE_ROW. (Only if configured
486 during parse.)
487
488 LOWDOWN_TABLE_HEADER
489 A table header section. Described by <thead>. Parent is
490 always LOWDOWN_TABLE_BLOCK. (Only if configured during
491 parse.)
492
493 LOWDOWN_TABLE_ROW
494 A table row. Described by <tr>. Parent is always
495 LOWDOWN_TABLE_HEADER or LOWDOWN_TABLE_BODY. (Only if con‐
496 figured during parse.)
497
498 LOWDOWN_TRIPLE_EMPHASIS
499 Combination of LOWDOWN_EMPHASIS and
500 LOWDOWN_DOUBLE_EMPHASIS.
501
502 The following anonymous union structures correspond to certain nodes.
503 Note that all buffers may be zero-length.
504
505 rndr_autolink
506 For LOWDOWN_LINK_AUTO, the link address as link and the
507 link type type, which may be one of HALINK_EMAIL for e-mail
508 links and HALINK_NORMAL otherwise. Any buffer may be
509 empty-sized.
510
511 rndr_blockcode
512 For LOWDOWN_BLOCKCODE, the opaque text of the block and the
513 optional lang of the code language.
514
515 rndr_blockhtml
516 For LOWDOWN_BLOCKHTML, the opaque HTML text.
517
518 rndr_codespan
519 The opaque text of the contents.
520
521 rndr_definition
522 For LOWDOWN_DEFINITION, containing flags that may be
523 HLIST_FL_BLOCK if the definition list should be interpreted
524 as containing block elements.
525
526 rndr_entity
527 For LOWDOWN_ENTITY, the entity text.
528
529 rndr_header
530 For LOWDOWN_HEADER, the level of the header starting at
531 zero (this value is relative to the metadata base header
532 level, defaulting to one), optional space-separated class
533 list attr_cls, and optional single identifier attr_id.
534
535 rndr_image
536 For LOWDOWN_IMAGE, the image address link, the image title
537 title, dimensions NxN (width by height) in dims, and alter‐
538 nate text alt. CSS in-line style for width and height may
539 be given in attr_width and/or attr_height, and a space-sep‐
540 arated list of classes may be in attr_cls and a single
541 identifier may be in attr_id.
542
543 rndr_link
544 Like rndr_autolink, but without a type and further defining
545 an optional link title title, optional space-separated
546 class list attr_cls, and optional single identifier
547 attr_id.
548
549 rndr_list
550 For LOWDOWN_LIST, consists of a bitfield flags that may be
551 set to HLIST_FL_ORDERED for an ordered list and
552 HLIST_FL_UNORDERED for an unordered one. If HLIST_FL_BLOCK
553 is set, the list should be output as if items were separate
554 blocks. The start value for HLIST_FL_ORDERED is the start‐
555 ing list item position, which is one by default and never
556 zero. The items is the number of list items.
557
558 rndr_listitem
559 For LOWDOWN_LISTITEM, consists of a bitfield flags that may
560 be set to HLIST_FL_ORDERED for an ordered list,
561 HLIST_FL_UNORDERED for an unordered list, HLIST_FL_DEF for
562 definition list data, HLIST_FL_CHECKED or
563 HLIST_FL_UNCHECKED for an unordered “task” list element,
564 and/or HLIST_FL_BLOCK for list item output as if containing
565 block elements. The HLIST_FL_BLOCK should not be used: use
566 the parent list (or definition list) flags for this. The
567 num is the index in a HLIST_FL_ORDERED list. It is mono‐
568 tonically increasing with each item in the list, starting
569 at the start variable given in struct rndr_list.
570
571 rndr_math
For LOWDOWN_MATH_BLOCK, the mode of display in blockmode: if 1,
573 in-line math; if 2, multi-line. The opaque equation, which
574 is assumed to be in LaTeX format, is in the opaque text.
575
576 rndr_meta
577 Each LOWDOWN_META key-value pair is represented. The keys
578 are lower-case without spaces or non-ASCII characters. If
579 provided, enclosed nodes may consist only of
580 LOWDOWN_NORMAL_TEXT and LOWDOWN_ENTITY.
581
582 rndr_normal_text
583 The basic text content for LOWDOWN_NORMAL_TEXT. If flags
584 is set to HTEXT_ESCAPED, the text may be escaped for out‐
585 put, but may not be altered by any smart typography or sim‐
586 ilar (it should be passed as-is).
587
588 rndr_paragraph
For LOWDOWN_PARAGRAPH, specifies how many lines the paragraph
590 has in the input file and beoln, set to non-zero if the
591 paragraph ends with an empty line instead of a breaking
592 block element.
593
594 rndr_raw_html
595 For LOWDOWN_RAW_HTML, the opaque HTML text.
596
597 rndr_table
598 For LOWDOWN_TABLE_BLOCK, the number of columns in each row
599 or header row. The number of columns in rndr_table,
600 rndr_table_header, and rndr_table_cell are the same.
601
602 rndr_table_cell
603 For LOWDOWN_TABLE_CELL, the current col column number out
604 of columns. See rndr_table_header for a description of the
605 bits in flags. The number of columns in rndr_table,
606 rndr_table_header, and rndr_table_cell are the same.
607
608 rndr_table_header
609 For LOWDOWN_TABLE_HEADER, the number of columns in each row
and the per-column flags, which may be tested for equality
611 against HTBL_FL_ALIGN_LEFT, HTBL_FL_ALIGN_RIGHT, or
612 HTBL_FL_ALIGN_CENTER after being masked with
613 HTBL_FL_ALIGNMASK; or HTBL_FL_HEADER. If no alignment is
614 specified after the mask, the default should be left-
615 aligned. The number of columns in rndr_table,
616 rndr_table_header, and rndr_table_cell are the same.
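
The masking rule for per-column table flags can be sketched as follows.
To stay self-contained, the example defines its own SK_ constants as
stand-ins for the HTBL_FL_ names; their numeric values are assumptions
made for illustration, and real programs should use the definitions
from <lowdown.h>. The helpers cell_align() and cell_is_header() are
likewise hypothetical.

```c
#include <stddef.h>

/* Stand-ins for HTBL_FL_*; the values are assumed, not authoritative. */
enum {
	SK_ALIGN_LEFT	= 1,
	SK_ALIGN_RIGHT	= 2,
	SK_ALIGN_CENTER	= 3,
	SK_ALIGNMASK	= 3,
	SK_HEADER	= 4
};

/*
 * Mask the flags, then compare for equality.  If no alignment
 * remains after the mask, default to left-aligned.
 */
const char *
cell_align(unsigned int flags)
{
	switch (flags & SK_ALIGNMASK) {
	case SK_ALIGN_RIGHT:
		return "right";
	case SK_ALIGN_CENTER:
		return "center";
	case SK_ALIGN_LEFT:
	default:
		return "left";
	}
}

/* The header bit lies outside the mask and is tested on its own. */
int
cell_is_header(unsigned int flags)
{
	return (flags & SK_HEADER) != 0;
}
```

Comparing for equality after masking matters because, with values like
these, the centre alignment shares bits with both left and right, so a
plain bit test would misreport it.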
617
SEE ALSO
619 lowdown(1), lowdown_buf(3), lowdown_buf_diff(3), lowdown_diff(3),
620 lowdown_doc_free(3), lowdown_doc_new(3), lowdown_doc_parse(3),
621 lowdown_file(3), lowdown_file_diff(3), lowdown_gemini_free(3),
622 lowdown_gemini_new(3), lowdown_gemini_rndr(3), lowdown_html_free(3),
623 lowdown_html_new(3), lowdown_html_rndr(3), lowdown_latex_free(3),
624 lowdown_latex_new(3), lowdown_latex_rndr(3), lowdown_metaq_free(3),
625 lowdown_nroff_free(3), lowdown_nroff_new(3), lowdown_nroff_rndr(3),
626 lowdown_odt_free(3), lowdown_odt_new(3), lowdown_odt_rndr(3),
627 lowdown_term_free(3), lowdown_term_new(3), lowdown_term_rndr(3),
628 lowdown_tree_rndr(3), lowdown(5)
629
AUTHORS
631 lowdown was forked from hoedown: https://github.com/hoedown/hoedown by
632 Kristaps Dzonsons, kristaps@bsd.lv. It has been considerably modified
633 since.
634
BSD December 17, 2023 BSD