HXUNENT(1)                      HTML-XML-utils                      HXUNENT(1)

NAME

       hxunent - replace HTML predefined character entities by UTF-8

SYNOPSIS

       hxunent [ -b ] [ -f ] [ file ]

DESCRIPTION

       The hxunent command reads the file (or standard input) and copies it to
       standard output with &-entities replaced by their equivalent characters
       (encoded as UTF-8). E.g., &quot; is replaced by " and &lt; is replaced
       by <.
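
       As a rough illustration of this substitution (not hxunent's own code),
       the sketch below uses Python's html.unescape, which performs a similar
       replacement. It is only an analogy: html.unescape follows the HTML5
       entity table and also decodes numeric character references, which may
       differ from what hxunent does. The name replace_entities is
       hypothetical.

```python
import html


def replace_entities(text: str) -> str:
    """Replace HTML character entities with the characters they name.

    Assumption: html.unescape stands in for hxunent's entity table here.
    """
    return html.unescape(text)


sample = "&quot;caf&eacute;&quot; &lt; 5"
result = replace_entities(sample)

# Encode explicitly to show the UTF-8 bytes that would be written out.
encoded = result.encode("utf-8")
print(result)
print(encoded)
```

       The non-ASCII character named by &eacute; occupies two bytes in the
       UTF-8 output, while &quot; and &lt; each become a single ASCII byte.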

OPTIONS

       The following options are supported:

       -b        The five built-in entities of XML (&lt; &gt; &quot; &apos;
                 &amp;) are not replaced but copied unchanged. This is
                 necessary if the output has to be valid XML or SGML.

       -f        This option changes how unknown entities or lone ampersands
                 are handled. Normally they are copied unchanged, but this
                 option tries to "fix" them by replacing the ampersands with
                 &amp;. Often such stray ampersands result from copying and
                 pasting URLs into a document, in which case this option
                 indeed fixes them and makes the document valid.
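
       The effect of the two options can be sketched in Python as follows.
       This is a minimal model under stated assumptions, not the program's
       actual implementation: the function unent and its parameters
       keep_builtins and fix are hypothetical names, Python's HTML5 entity
       table stands in for hxunent's own table, and numeric character
       references are simply left untouched because the text above does not
       say how hxunent treats them.

```python
import re
from html.entities import html5

# The five entities that XML predefines (see the -b option).
XML_BUILTINS = {"lt", "gt", "quot", "apos", "amp"}

# An ampersand that does not start a numeric reference, optionally
# followed by an entity name and a terminating semicolon.
_AMP = re.compile(r"&(?!#)(?:([A-Za-z][A-Za-z0-9]*);)?")


def unent(text: str, keep_builtins: bool = False, fix: bool = False) -> str:
    """Model of entity replacement with -b (keep_builtins) and -f (fix)."""

    def repl(match: re.Match) -> str:
        name = match.group(1)
        if name is not None and name + ";" in html5:
            if keep_builtins and name in XML_BUILTINS:
                return match.group(0)        # like -b: copy unchanged
            return html5[name + ";"]         # known entity: substitute
        # Unknown entity or lone ampersand.
        if fix:
            # Like -f (assumed reading): escape only the ampersand.
            return "&amp;" + match.group(0)[1:]
        return match.group(0)                # default: copy unchanged

    return _AMP.sub(repl, text)


print(unent("&lt;p&gt; caf&eacute;"))
print(unent("&lt;p&gt; caf&eacute;", keep_builtins=True))
print(unent("?a=1&b=2", fix=True))
```

       In this model, a URL query string pasted into a document has its bare
       ampersand escaped when fix is set, while keep_builtins leaves &lt; and
       &gt; alone so that the markup characters they protect stay escaped.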

DIAGNOSTICS

       The program's exit value is 0 if all went well, otherwise:

       1         The input couldn't be read (file not found, file not
                 readable...)

       2         Wrong command line arguments.
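
       A caller can act on these exit values. The sketch below is one
       possible wrapper, assuming hxunent is installed and on the PATH; the
       names describe_exit and run_hxunent are hypothetical and the message
       strings merely paraphrase the list above.

```python
import subprocess

# Meanings paraphrased from the DIAGNOSTICS list.
EXIT_MEANINGS = {
    0: "success",
    1: "input could not be read",
    2: "wrong command line arguments",
}


def describe_exit(code: int) -> str:
    """Return a short description of an hxunent exit value."""
    return EXIT_MEANINGS.get(code, f"unexpected exit value {code}")


def run_hxunent(path: str, *options: str) -> bytes:
    """Run hxunent on `path` and return its standard output as bytes.

    Options (for example "-b" or "-f") are placed before the file name,
    matching the SYNOPSIS. Raises RuntimeError on a non-zero exit value.
    """
    proc = subprocess.run(["hxunent", *options, path], capture_output=True)
    if proc.returncode != 0:
        raise RuntimeError(describe_exit(proc.returncode))
    return proc.stdout
```

       The output is returned as raw bytes because the program writes UTF-8;
       decode it with .decode("utf-8") if a string is needed.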

SEE ALSO

       asc2xml(1), xml2asc(1), UTF-8 (RFC 2279)

BUGS

       The program assumes entities are as defined by HTML. It doesn't read a
       document's DTD to find the actual definitions in use in a document.
       With -f, it will even remove all entities that are not HTML entities.

5.x                               21 Nov 2008                       HXUNENT(1)