HTMLMIN(1)                          htmlmin                         HTMLMIN(1)

NAME

       htmlmin - htmlmin Documentation

       An HTML Minifier with Seatbelts

QUICKSTART

       For single invocations, there is the htmlmin.minify function. It takes
       the input HTML as a string for its first argument and returns the
       minified HTML. It accepts several options that let you tune the amount
       of minification being done; the defaults are the safest available
       options:

          >>> import htmlmin
          >>> input_html = '''
            <body   style="background-color: tomato;">
              <h1>  htmlmin   rocks</h1>
              <pre>
                and rolls
              </pre>
            </body>'''
          >>> htmlmin.minify(input_html)
          u' <body style="background-color: tomato;"> <h1> htmlmin rocks</h1> <pre>\n        and rolls\n      </pre> </body>'
          >>> print(htmlmin.minify(input_html))
           <body style="background-color: tomato;"> <h1> htmlmin rocks</h1> <pre>
                  and rolls
                </pre> </body>

       If there is a chunk of HTML that you do not want minified, put a pre
       attribute on an HTML tag that wraps it. htmlmin will leave the contents
       of the tag alone and will remove the pre attribute before output:

          >>> import htmlmin
          >>> input_html = '''<span>   minified   </span><span pre>   not minified   </span>'''
          >>> htmlmin.minify(input_html)
          u'<span> minified </span><span>   not minified   </span>'

       Attributes will be condensed to their smallest possible representation
       by default. You can prefix an individual attribute with pre- to leave
       it unchanged:

          >>> import htmlmin
          >>> input_html = '''<input value="&lt;minified&gt;" /><input pre-value="&lt;not minified&gt;" />'''
          >>> htmlmin.minify(input_html)
          u'<input value="<minified>"><input value=&lt;not minified&gt;>'

       The minify function works well for one-off minifications. However, if
       you are going to minify several pieces of HTML, the Minifier class is
       provided. It works similarly, but persists options between invocations
       and recycles the internal data structures used for minification.

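       The reuse pattern described above can be sketched as follows. This is
       a minimal illustration, not part of htmlmin: the helper name
       minify_pages is hypothetical, and it assumes only that the object it
       receives has a minify method, as htmlmin.Minifier does.

```python
def minify_pages(minifier, pages):
    """Minify each page with one shared minifier object.

    `minifier` is assumed to behave like htmlmin.Minifier: its options
    are fixed at construction time and minify() may be called repeatedly.
    """
    return [minifier.minify(page) for page in pages]


# Typical call, assuming htmlmin is installed:
#   import htmlmin
#   shared = htmlmin.Minifier(remove_comments=True)
#   compact = minify_pages(shared, ['<p>  one  </p>', '<p>  two  </p>'])
```

       Because the options live on the Minifier instance, every page in the
       list is minified with the same settings.
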
   Command Line
       htmlmin is invoked by running:

          htmlmin input.html output.html

       If no output file is specified, it will print to stdout. If no input
       is specified, it reads from stdin. Help with options can be retrieved
       at any time by running htmlmin -h:

          htmlmin -h
          usage: htmlmin [-h] [-c] [-s] [--remove-all-empty-space]
                         [--keep-optional-attribute-quotes] [-H] [-k] [-a PRE_ATTR]
                         [-p [TAG [TAG ...]]] [-e ENCODING]
                         [INPUT] [OUTPUT]

          Minify HTML

          positional arguments:
            INPUT                 File path to html file to minify. Defaults to stdin.
            OUTPUT                File path to output to. Defaults to stdout.

          optional arguments:
            -h, --help            show this help message and exit
            -c, --remove-comments
                                  When set, comments will be removed. They can be kept on an individual basis
                                  by starting them with a '!': <!--! comment -->. The '!' will be removed from
                                  the final output. If you want a '!' as the leading character of your comment,
                                  put two of them: <!--!! comment -->.

            -s, --remove-empty-space
                                  When set, this removes empty space between tags in certain cases.
                                  Specifically, it will remove empty space if and only if a newline
                                  character occurs within the space. Thus, code like
                                  '<span>x</span> <span>y</span>' will be left alone, but code such as
                                  '   ...
                                    </head>
                                    <body>
                                      ...'
                                  will become '...</head><body>...'. Note that this CAN break your
                                  html if you spread two inline tags over two lines. Use with caution.

            --remove-all-empty-space
                                  When set, this removes ALL empty space between tags. WARNING: this can and
                                  likely will cause unintended consequences. For instance, '<i>X</i> <i>Y</i>'
                                  will become '<i>X</i><i>Y</i>'. Putting whitespace along with other text will
                                  avoid this problem. Only use if you are confident in the result. Whitespace is
                                  not removed from inside of tags, thus '<span> </span>' will be left alone.

            --keep-optional-attribute-quotes
                                  When set, this keeps all attribute quotes, even if they are optional.

            -H, --in-head         If you are parsing only a fragment of HTML, and the fragment occurs in the
                                  head of the document, setting this will remove some extra whitespace.

            -k, --keep-pre-attr   HTMLMin supports the proprietary attribute 'pre' that can be added to elements
                                  to prevent minification. This attribute is removed by default. Set this flag to
                                  keep the 'pre' attributes in place.

            -a PRE_ATTR, --pre-attr PRE_ATTR
                                  The attribute htmlmin looks for to find blocks of HTML that it should not
                                  minify. This attribute will be removed from the HTML unless '-k' is
                                  specified. Defaults to 'pre'. You can also prefix individual tag attributes
                                  with '{pre_attr}-' to prevent the contents of the individual attribute from
                                  being changed.

            -p [TAG [TAG ...]], --pre-tags [TAG [TAG ...]]
                                  By default, the contents of 'pre' and 'textarea' tags are left unminified.
                                  You can specify different tags using the --pre-tags option. 'script' and 'style'
                                  tags are always left unminified.

            -e ENCODING, --encoding ENCODING
                                  Encoding to read and write with. Default 'utf-8'.

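       The newline rule that -s describes can be approximated with a regular
       expression. The sketch below is an illustration only, not htmlmin's
       parser-based implementation: the function name collapse_between_tags
       is hypothetical, and the sketch ignores pre tags, script and style
       content, and attribute values. Treating a carriage return like a
       newline follows the remove_empty_space description in the API
       reference.

```python
import re

# Whitespace-only runs that sit directly between a '>' and a '<'.
_BETWEEN_TAGS = re.compile(r">(\s+)<")


def collapse_between_tags(html):
    """Approximate the documented -s rule for whitespace between tags."""

    def _replace(match):
        gap = match.group(1)
        # A newline or carriage return in the gap means the gap is dropped;
        # a gap made only of spaces and tabs shrinks to a single space.
        if "\n" in gap or "\r" in gap:
            return "><"
        return "> <"

    return _BETWEEN_TAGS.sub(_replace, html)
```

       With this sketch, '<span>x</span> <span>y</span>' is returned
       unchanged, while a closing </head> and an opening <body> on separate
       lines are joined with no space between them.
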

TUTORIAL & EXAMPLES

       Coming soon…

API REFERENCE

   Main Functions
       htmlmin.minify(input, remove_comments=False, remove_empty_space=False,
       remove_all_empty_space=False, reduce_empty_attributes=True,
       reduce_boolean_attributes=False, remove_optional_attribute_quotes=True,
       convert_charrefs=True, keep_pre=False, pre_tags=(u'pre', u'textarea'),
       pre_attr='pre', cls=<class htmlmin.parser.HTMLMinParser>)
              Minifies HTML in one shot.

              Parameters

                     · input – A string containing the HTML to be minified.

                     · remove_comments – Remove comments found in HTML.
                       Individual comments can be kept by putting a ! as the
                       first character inside the comment. Thus:

                          <!-- FOO --> <!--! BAR -->

                       will become simply:

                          <!-- BAR -->

                       The added exclamation mark is removed.

                     · remove_empty_space – Remove empty space found in HTML
                       between an opening and a closing tag when it contains
                       a newline or carriage return. If the whitespace
                       consists only of spaces and/or tabs, it will be turned
                       into a single space. Be careful, this can have
                       unintended consequences.

                     · remove_all_empty_space – A more extreme version of
                       remove_empty_space, this removes all empty whitespace
                       found between tags. This is almost guaranteed to break
                       your HTML unless you are very careful.

                     · reduce_boolean_attributes – Where allowed by the HTML5
                       specification, attributes such as 'disabled' and
                       'readonly' will have their value removed, so
                       'disabled="true"' will simply become 'disabled'. This
                       is generally a good option to turn on except when
                       JavaScript relies on the values.

                     · remove_optional_attribute_quotes – When True, optional
                       quotes around attributes are removed. When False, all
                       attribute quotes are left intact. Defaults to True.

                     · convert_charrefs – Decode character references such as
                       &amp; and &#46; to their single-character values where
                       safe. This currently only applies to attributes. Data
                       content between tags will be left encoded.

                     · keep_pre – By default, htmlmin uses the special
                       attribute pre to allow you to demarcate areas of HTML
                       that should not be minified. It removes this attribute
                       as it finds it. Setting this value to True tells
                       htmlmin to leave the attribute in the output.

                     · pre_tags – A list of tag names that should never be
                       minified. You are free to change this list as you see
                       fit, but you will probably want to include pre and
                       textarea if you make any changes to the list. Note that
                       <script> and <style> tags are never minified.

                     · pre_attr – Specifies the attribute that, when found in
                       an HTML tag, indicates that the content of the tag
                       should not be minified. Defaults to pre. You can also
                       prefix individual tag attributes with {pre_attr}- to
                       prevent the contents of the individual attribute from
                       being changed.

              Returns
                     A string containing the minified HTML.

              If you are going to be minifying multiple HTML documents, each
              with the same settings, consider using Minifier.

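              The keep marker described under remove_comments can be
              illustrated with a short regular-expression sketch. It is not
              htmlmin's implementation: the function name strip_comments is
              hypothetical, and the sketch does not treat conditional
              comments or comments inside script tags specially.

```python
import re

# Non-greedy match so that each comment is handled on its own.
_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def strip_comments(html):
    """Drop HTML comments, keeping those whose body starts with '!'."""

    def _replace(match):
        body = match.group(1)
        if body.startswith("!"):
            # Kept comment: exactly one leading '!' is removed, so a
            # comment written with '!!' keeps a single '!'.
            return "<!--" + body[1:] + "-->"
        return ""

    return _COMMENT.sub(_replace, html)
```

              Applied to the example above, the FOO comment disappears and
              the BAR comment survives without its exclamation mark; the
              space that separated the two comments is left in place.
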
       class htmlmin.Minifier(remove_comments=False, remove_empty_space=False,
       remove_all_empty_space=False, reduce_empty_attributes=True,
       reduce_boolean_attributes=False, remove_optional_attribute_quotes=True,
       convert_charrefs=True, keep_pre=False, pre_tags=(u'pre', u'textarea'),
       pre_attr='pre', cls=<class htmlmin.parser.HTMLMinParser>)
              An object that supports HTML minification.

              Options are passed into this class at initialization time and
              are then persisted across each use of the instance. If you are
              going to be minifying multiple pieces of HTML, this will be more
              efficient than using htmlmin.minify.

              See htmlmin.minify for an explanation of options.

              minify(*input)
                     Runs HTML through the minifier in one pass.

                     Parameters
                            input – HTML to be fed into the minifier. Multiple
                            chunks of HTML can be provided, and they are fed
                            in sequentially as if they were concatenated.

                     Returns
                            A string containing the minified HTML.

                     This is the simplest way to use an existing Minifier
                     instance. This method takes in HTML and minifies it,
                     returning the result. Note that this method resets the
                     internal state of the parser before it does any work. If
                     there is pending HTML in the buffers, it will be lost.

              input(*input)
                     Feed more HTML into the input stream.

                     Parameters
                            input – HTML to be fed into the minifier. Multiple
                            chunks of HTML can be provided, and they are fed
                            in sequentially as if they were concatenated. You
                            can also call this method multiple times to
                            achieve the same effect.

              output Retrieve the minified output generated thus far.

              finalize()
                     Finishes current input HTML and returns minified result.

                     This method flushes any remaining input HTML and returns
                     the minified result. It resets the state of the internal
                     parser in the process so that new HTML can be minified.
                     Be sure to call this method before you reuse the Minifier
                     instance on a new HTML document.

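              The incremental workflow built from input and finalize might be
              wrapped as shown below. This is a sketch under stated
              assumptions: the helper name minify_chunks is hypothetical, and
              it relies only on the two methods documented above, so any
              object that provides them, such as an htmlmin.Minifier
              instance, can be passed in.

```python
def minify_chunks(minifier, chunks):
    """Feed HTML to a Minifier-like object piece by piece.

    `chunks` may be any iterable of strings, for example blocks read
    from a file. input() queues each piece; finalize() flushes what is
    pending, returns the minified result, and resets the parser.
    """
    for chunk in chunks:
        minifier.input(chunk)
    return minifier.finalize()


# Typical call, assuming htmlmin is installed:
#   import htmlmin
#   html = minify_chunks(htmlmin.Minifier(), ['<p>  first', '  second  </p>'])
```

              Ending with finalize rather than reading output leaves the
              instance ready for the next document.
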
   WSGI Middleware
       class htmlmin.middleware.HTMLMinMiddleware(app, by_default=True,
       keep_header=False, debug=False, **kwargs)
              WSGI Middleware that minifies HTML on the way out.

              Parameters

                     · by_default – Specifies if minification should be turned
                       on or off by default. Defaults to True.

                     · keep_header – The middleware recognizes one custom HTTP
                       header that can be used to turn minification on or off
                       on a per-request basis: X-HTML-Min-Enable. Setting the
                       header to true will turn minification on; anything else
                       will turn minification off. If by_default is set to
                       False, this header is how you would turn minification
                       back on. The middleware, by default, removes the header
                       from the output. Setting this to True leaves the header
                       intact.

                     · debug – A quick setting to turn all minification off.
                       The middleware is effectively bypassed.

              This simple middleware minifies any HTML content that passes
              through it. Any additional keyword arguments beyond the three
              settings the middleware has are passed on to the internal
              minifier. The documentation for the options can be found under
              htmlmin.minify.

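              The interaction between by_default, keep_header, and the
              X-HTML-Min-Enable header can be modelled as below. This is an
              illustrative sketch, not the middleware's code: the function
              name plan_response is hypothetical, the case-insensitive
              matching of the header name and value is an assumption, and
              the debug setting is not modelled.

```python
CONTROL_HEADER = "X-HTML-Min-Enable"


def plan_response(headers, by_default=True, keep_header=False):
    """Return (should_minify, headers_to_send) for one response.

    `headers` is a list of (name, value) tuples, the shape a WSGI
    application passes to start_response.
    """
    should_minify = by_default
    kept = []
    for name, value in headers:
        if name.lower() == CONTROL_HEADER.lower():
            # 'true' switches minification on; any other value switches
            # it off (case-insensitive comparison is an assumption).
            should_minify = value.strip().lower() == "true"
            if not keep_header:
                continue  # drop the control header from the output
        kept.append((name, value))
    return should_minify, kept


# Wrapping an application, per the signature documented above:
#   from htmlmin.middleware import HTMLMinMiddleware
#   app = HTMLMinMiddleware(app, by_default=False, remove_comments=True)
```

              With by_default=False, only responses that carry the header set
              to true are minified, which matches the opt-in use described
              for keep_header.
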
   Decorator
       htmlmin.decorator.htmlmin(*args, **kwargs)
              Minifies HTML that is returned by a function.

              A simple decorator that minifies the HTML output of any function
              that it decorates. It supports all the same options that
              htmlmin.minify has. With no options, it uses minify's default
              settings:

                 from htmlmin.decorator import htmlmin

                 @htmlmin
                 def foobar():
                     return '   minify me!   '

              or:

                 from htmlmin.decorator import htmlmin

                 @htmlmin(remove_comments=True)
                 def foobar():
                     return '   minify me!  <!-- and remove me! -->'

       htmlmin is an HTML minifier that just works. It comes with safe
       defaults and an easily configurable set of options. It can turn this:

          <html>
            <head>
              <title>  Hello, World!  </title>
            </head>
            <body>
              <p> How are <em>you</em> doing?  </p>
            </body>
          </html>

       into this:

          <html><head><title>Hello, World!</title><body><p> How are <em>you</em> doing? </p></body></html>

       When we say that htmlmin has ‘seatbelts’, what we mean is that it comes
       with features that you can use to safely minify beyond the defaults,
       but you have to put them in yourself. For instance, by default, htmlmin
       will never minify the content between <pre>, <textarea>, <script>,
       and <style> tags. You can also explicitly tell it not to minify
       additional tags, either globally by name or by adding the custom pre
       attribute to a tag in your HTML. htmlmin will automatically remove the
       pre attributes as it parses your HTML.

       It also includes a command-line tool for easy invocation and
       integration with existing workflows.

       To install via pip:

          pip install htmlmin

       Source code is available on GitHub at https://github.com/mankyd/htmlmin:

          git clone git://github.com/mankyd/htmlmin.git

       · Safely minify HTML with either a function call or from the command
         line.

       · Extend what elements can and cannot be minified.

       · Intelligently remove whitespace completely or reduce to single
         spaces.

       · Properly handle unclosed HTML5 tags.

       · Optionally remove comments while marking some comments to keep.

       · Simple function decorator to minify all function output.

       · Simple WSGI middleware to minify web app output.

       · Tested in both Python 2.7 and 3.2.

AUTHOR

       Dave Mankoff

COPYRIGHT

       2013, Dave Mankoff

0.1                              Feb 02, 2019                       HTMLMIN(1)