Regexp::Common::comment(3)  User Contributed Perl Documentation  Regexp::Common::comment(3)

NAME

       Regexp::Common::comment -- provide regexes for comments.

SYNOPSIS

           use Regexp::Common qw /comment/;

           while (<>) {
               /$RE{comment}{C}/       and  print "Contains a C comment\n";
               /$RE{comment}{C++}/     and  print "Contains a C++ comment\n";
               /$RE{comment}{PHP}/     and  print "Contains a PHP comment\n";
               /$RE{comment}{Java}/    and  print "Contains a Java comment\n";
               /$RE{comment}{Perl}/    and  print "Contains a Perl comment\n";
               /$RE{comment}{awk}/     and  print "Contains an awk comment\n";
               /$RE{comment}{HTML}/    and  print "Contains an HTML comment\n";
           }

           use Regexp::Common qw /comment RE_comment_HTML/;

           while (<>) {
               $_ =~ RE_comment_HTML() and  print "Contains an HTML comment\n";
           }

DESCRIPTION

       Please consult the manual of Regexp::Common for a general description
       of how this interface works.

       Do not use this module directly, but load it via Regexp::Common.

       This module gives you regular expressions for comments in various
       languages.

   THE LANGUAGES
       Below, the comments of each of the languages are described.  The
       patterns are available as $RE{comment}{LANG}, for each language LANG.
       Some languages have variants; the individual language entries describe
       how to get the patterns for the variants.  Unless mentioned otherwise,
       "{-keep}" sets $1, $2, $3 and $4 to the entire comment, the opening
       marker, the content of the comment, and the closing marker (for many
       languages, the latter is a newline) respectively.
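
       As a minimal sketch of the default "{-keep}" behaviour described
       above, the following splits a C comment into its four captured parts.
       The helper name c_comment_parts and the sample input are made up for
       illustration; only $RE{comment}{C} and "{-keep}" come from this
       manual.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Split the first C comment in $text into the four parts that {-keep}
# is documented to capture. Returns an empty list if there is no comment.
# (c_comment_parts is a hypothetical helper, not part of the module.)
sub c_comment_parts {
    my ($text) = @_;
    my $re = $RE{comment}{C}{-keep};
    return $text =~ /$re/ ? ($1, $2, $3, $4) : ();
}

my ($whole, $open, $body, $close) =
    c_comment_parts('x = 1; /* set x */ y = 2;');

if (defined $whole) {
    print "entire comment: [$whole]\n";
    print "opening marker: [$open]\n";
    print "content:        [$body]\n";
    print "closing marker: [$close]\n";
}
```

       For languages whose entry says that only $1 is set (C++, Java and
       others), the same call shape applies, but only the first returned
       value should be relied upon.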

       ABC Comments in ABC start with a backslash ("\"), and last till the end
           of the line.  See <http://homepages.cwi.nl/%7Esteven/abc/>.

       Ada Comments in Ada start with "--", and last till the end of the line.

       Advisor
           Advisor is a language used by the HP product glance. Comments for
           this language start with either "#" or "//", and last till the end
           of the line.

       Advsys
           Comments for the Advsys language start with ";" and last till the
           end of the line. See also <http://www.wurb.com/if/devsys/12>.

       Alan
           Alan comments start with "--", and last till the end of the line.
           See also
           <http://w1.132.telia.com/~u13207378/alan/manual/alanTOC.html>.

       Algol 60
           Comments in the Algol 60 language start with the keyword "comment",
           and end with a ";". See
           <http://www.masswerk.at/algol60/report.htm>.

       Algol 68
           In Algol 68, comments are either delimited by "#", or by one of the
           keywords "co" or "comment". The keywords should not be part of
           another word. See
           <http://westein.arb-phys.uni-dortmund.de/~wb/a68s.txt>.  With
           "{-keep}", only $1 will be set, returning the entire comment.

       ALPACA
           The ALPACA language has comments starting with "/*" and ending with
           "*/".

       awk The awk programming language uses comments that start with "#" and
           end at the end of the line.

       B   The B language has comments starting with "/*" and ending with
           "*/".

       BASIC
           There are various forms of BASIC around. Currently, we only support
           the variant supported by mvEnterprise, whose pattern is available
           as $RE{comment}{BASIC}{mvEnterprise}. Comments in this language
           start with a "!", a "*" or the keyword "REM", and last till the end
           of the line. See
           <http://www.rainingdata.com/products/beta/docs/mve/50/ReferenceManual/Basic.pdf>.

       Beatnik
           The esoteric language Beatnik only uses words consisting of
           letters.  Words are scored according to the rules of Scrabble.
           Words scoring less than 5 points, or 18 points or more, are
           considered comments (although the compiler might mock you if you
           score less than 5 points).  Regardless of whether "{-keep}" is
           used, $1 will be set to the entire comment. This pattern requires
           perl 5.8.0 or newer.

       beta-Juliet
           The beta-Juliet programming language has comments that start with
           "//" and that continue till the end of the line. See also
           <http://www.catseye.mb.ca/esoteric/b-juliet/index.html>.

       Befunge-98
           The esoteric language Befunge-98 uses comments that start and end
           with a ";". See
           <http://www.catseye.mb.ca/esoteric/befunge/98/spec98.html>.

       BML BML, or Better Markup Language, is an HTML templating language that
           uses comments starting with "<?c_", and ending with "c_?>".  See
           <http://www.livejournal.com/doc/server/bml.index.html>.

       Brainfuck
           The minimal language Brainfuck uses only eight characters, "<",
           ">", "[", "]", "+", "-", "." and ",".  Any other characters are
           considered comments. With "{-keep}", $1 is set to the entire
           comment.

       C   The C language has comments starting with "/*" and ending with
           "*/".

       C-- The C-- language has comments starting with "/*" and ending with
           "*/".  See
           <http://cs.uas.arizona.edu/classes/453/programs/C--Spec.html>.

       C++ The C++ language has two forms of comments: comments that start
           with "//" and last till the end of the line, and comments that
           start with "/*" and end with "*/". If "{-keep}" is used, only $1
           will be set, and set to the entire comment.

       C#  The C# language has two forms of comments: comments that start with
           "//" and last till the end of the line, and comments that start
           with "/*" and end with "*/". If "{-keep}" is used, only $1 will be
           set, and set to the entire comment.  See
           <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/csspec/html/vclrfcsharpspec_C.asp>.

       Caml
           Comments in Caml start with "(*", end with "*)", and can be nested.
           See <http://www.cs.caltech.edu/courses/cs134/cs134b/book.pdf> and
           <http://pauillac.inria.fr/caml/index-eng.html>.

       Cg  The Cg language has two forms of comments: comments that start with
           "//" and last till the end of the line, and comments that start
           with "/*" and end with "*/". If "{-keep}" is used, only $1 will be
           set, and set to the entire comment.  See
           <http://developer.nvidia.com/attach/3722>.

       CLU In "CLU", a comment starts with a percent sign ("%"), and ends with
           the next newline. See
           <ftp://ftp.lcs.mit.edu:/pub/pclu/CLU-syntax.ps> and
           <http://www.pmg.lcs.mit.edu/CLU.html>.

       COBOL
           Traditionally, comments in COBOL are indicated by an asterisk in
           the seventh column. This is what the pattern matches. Modern
           compilers may be more lenient, though. See
           <http://www.csis.ul.ie/cobol/Course/COBOLIntro.htm>, and
           <http://www.csis.ul.ie/cobol/default.htm>. Due to a bug in the
           regexp engine of perl 5.6.x, this regexp is only available in
           version 5.8.0 and up.

       CQL Comments in the chess query language (CQL) start with a semicolon
           (";") and last till the end of the line. See
           <http://www.rbnn.com/cql/>.

       Crystal Report
           The formula editor in Crystal Reports uses comments that start with
           "//", and end with the end of the line.

       Dylan
           There are two types of comments in Dylan. They either start with
           "//", or are nested comments, delimited with "/*" and "*/".  Under
           "{-keep}", only $1 will be set, returning the entire comment.  This
           pattern requires perl 5.6.0 or newer.

       ECMAScript
           The ECMAScript language has two forms of comments: comments that
           start with "//" and last till the end of the line, and comments
           that start with "/*" and end with "*/". If "{-keep}" is used, only
           $1 will be set, and set to the entire comment. JavaScript is
           Netscape's implementation of ECMAScript. See
           <http://www.ecma-international.org/publications/files/ecma-st/Ecma-262.pdf>,
           and
           <http://www.ecma-international.org/publications/standards/Ecma-262.htm>.

       Eiffel
           Eiffel comments start with "--", and last till the end of the line.

       False
           In False, comments start with "{" and end with "}".  See
           <http://wouter.fov120.com/false/false.txt>.

       FPL The FPL language has two forms of comments: comments that start
           with "//" and last till the end of the line, and comments that
           start with "/*" and end with "*/". If "{-keep}" is used, only $1
           will be set, and set to the entire comment.

       Forth
           Comments in Forth start with "\", and end with the end of the line.
           See also <http://docs.sun.com/sb/doc/806-1377-10>.

       Fortran
           There are two forms of Fortran. There's free form Fortran, which
           has comments that start with "!", and end at the end of the line.
           The pattern for this is given by $RE{comment}{Fortran}. Fixed form
           Fortran, which has been obsoleted, has comments that start with
           "C", "c" or "*" in the first column, or with "!" anywhere except
           the sixth column.  The pattern for this is given by
           $RE{comment}{Fortran}{fixed}.

           See also
           <http://www.cray.com/craydoc/manuals/007-3692-005/html-007-3692-005/>.

       Funge-98
           The esoteric language Funge-98 uses comments that start and end
           with a ";".

       fvwm2
           Configuration files for fvwm2 have comments starting with a "#" and
           lasting the rest of the line.

       Haifu
           Haifu, an esoteric language using haikus, has comments starting and
           ending with a ",".  See
           <http://www.dangermouse.net/esoteric/haifu.html>.

       Haskell
           There are two types of comments in Haskell. They either start with
           at least two dashes, or are nested comments, delimited with "{-"
           and "-}".  Under "{-keep}", only $1 will be set, returning the
           entire comment.  This pattern requires perl 5.6.0 or newer.

       HTML
           In HTML, comments only appear inside a comment declaration.  A
           comment declaration starts with a "<!", and ends with a ">". Inside
           this declaration, we have zero or more comments.  Comments start
           with "--" and end with "--", and are optionally followed by
           whitespace. The pattern $RE{comment}{HTML} recognizes those comment
           declarations (and hence more than a comment).  Note that this is
           not the same as something that starts with "<!--" and ends with
           "-->", because the following will be matched completely:

               <!--  First  Comment   --
                 --> Second Comment <!--
                 --  Third  Comment   -->

           Do not be fooled by what your favourite browser thinks is an HTML
           comment.

           If "{-keep}" is used, the following are returned:

           $1  captures the entire comment declaration.

           $2  captures the MDO (markup declaration open), "<!".

           $3  captures the content between the MDO and the MDC.

           $4  captures the (last) comment, without the surrounding dashes.

           $5  captures the MDC (markup declaration close), ">".

       Hugo
           There are two types of comments in Hugo. They either start with "!"
           (which cannot be followed by a "\"), or are nested comments,
           delimited with "!\" and "\!".  Under "{-keep}", only $1 will be
           set, returning the entire comment.  This pattern requires perl
           5.6.0 or newer.

       Icon
           Icon has comments that start with "#" and end at the next newline.
           See
           <http://www.toolsofcomputing.com/IconHandbook/IconHandbook.pdf>,
           <http://www.cs.arizona.edu/icon/index.htm>, and
           <http://burks.bton.ac.uk/burks/language/icon/index.htm>.

       ILLGOL
           The esoteric language ILLGOL uses comments starting with "NB" and
           lasting till the end of the line.  See
           <http://www.catseye.mb.ca/esoteric/illgol/index.html>.

       INTERCAL
           Comments in INTERCAL are single line comments. They start with one
           of the keywords "NOT" or "N'T", and can optionally be preceded by
           the keywords "DO" and "PLEASE". If both keywords are used, "PLEASE"
           precedes "DO". Keywords are separated by whitespace.

       J   The language J uses comments that start with "NB.", and that last
           till the end of the line. See
           <http://www.jsoftware.com/books/help/primer/contents.htm>, and
           <http://www.jsoftware.com/>.

       Java
           The Java language has two forms of comments: comments that start
           with "//" and last till the end of the line, and comments that
           start with "/*" and end with "*/". If "{-keep}" is used, only $1
           will be set, and set to the entire comment.

       JavaScript
           The JavaScript language has two forms of comments: comments that
           start with "//" and last till the end of the line, and comments
           that start with "/*" and end with "*/". If "{-keep}" is used, only
           $1 will be set, and set to the entire comment. JavaScript is
           Netscape's implementation of ECMAScript.  See
           <http://www.mozilla.org/js/language/E262-3.pdf>, and
           <http://www.mozilla.org/js/language/>.

       LaTeX
           The documentation language LaTeX uses comments starting with "%"
           and ending at the end of the line.

       Lisp
           Comments in Lisp start with a semicolon (";") and last till the
           end of the line.

       LPC The LPC language has comments starting with "/*" and ending with
           "*/".

       LOGO
           Comments for the language LOGO start with ";", and last till the
           end of the line.

       lua Comments for the lua language start with "--", and last till the
           end of the line. See also <http://www.lua.org/manual/manual.html>.

       M, MUMPS
           In "M" (aka "MUMPS"), comments start with a semicolon, and last
           till the end of a line. The language specification requires the
           semicolon to be preceded by one or more linestart characters.
           Those characters default to a space, but that's configurable. This
           requirement, that linestart characters precede the comment, is not
           tested for. See
           <ftp://ftp.intersys.com/pub/openm/ism/ism64docs.zip>,
           <http://mtechnology.intersys.com/mproducts/openm/index.html>, and
           <http://mcenter.com/mtrc/index.html>.

       m4  By default, the preprocessor language m4 uses single line comments
           that start with a "#" and continue to the end of the line,
           including the newline. The pattern $RE{comment}{m4} matches such
           comments.  In m4, it is possible to change the starting token,
           though.  See
           <http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf>,
           <http://www.cs.stir.ac.uk/~kjt/research/pdf/expl-m4.pdf>, and
           <http://www.gnu.org/software/m4/manual/>.

       Modula-2
           In "Modula-2", comments start with "(*", and end with "*)".
           Comments may be nested. See <http://www.modula2.org/>.

       Modula-3
           In "Modula-3", comments start with "(*", and end with "*)".
           Comments may be nested. See <http://www.m3.org/>.

       mutt
           Configuration files for mutt have comments starting with a "#" and
           lasting the rest of the line.

       Nickle
           The Nickle language has one line comments starting with "#" (like
           Perl), or multiline comments delimited by "/*" and "*/" (like C).
           Under "{-keep}", only $1 will be set. See also
           <http://www.nickle.org>.

       Oberon
           Comments in Oberon start with "(*" and end with "*)".  See
           <http://www.oberon.ethz.ch/oreport.html>.

       Pascal
           There are many implementations of Pascal. This module provides
           patterns for comments of several implementations.

           $RE{comment}{Pascal}
               This is the pattern that recognizes comments according to the
               Pascal ISO standard. This standard says that comments start
               with either "{", or "(*", and end with "}" or "*)". This means
               that "{*)" and "(*}" are considered to be comments. Many Pascal
               applications don't allow this.  See
               <http://www.pascal-central.com/docs/iso10206.txt>.

           $RE{comment}{Pascal}{Alice}
               The Alice Pascal compiler accepts comments that start with "{"
               and end with "}". Comments are not allowed to contain newlines.
               See <http://www.templetons.com/brad/alice/language/>.

           $RE{comment}{Pascal}{Delphi}, $RE{comment}{Pascal}{Free} and
           $RE{comment}{Pascal}{GPC}
               The Delphi Pascal, Free Pascal and the GNU Pascal Compiler
               implementations of Pascal all have comments that either start
               with "//" and last till the end of the line, are delimited with
               "{" and "}" or are delimited with "(*" and "*)". Patterns for
               those comments are given by $RE{comment}{Pascal}{Delphi},
               $RE{comment}{Pascal}{Free} and $RE{comment}{Pascal}{GPC}
               respectively. These patterns only set $1 when "{-keep}" is
               used, which will then include the entire comment.

               See <http://info.borland.com/techpubs/delphi5/oplg/>,
               <http://www.freepascal.org/docs-html/ref/ref.html> and
               <http://www.gnu-pascal.de/gpc/>.

           $RE{comment}{Pascal}{Workshop}
               The Workshop Pascal compiler, from Sun Microsystems, allows
               comments that are delimited with either "{" and "}", delimited
               with "(*" and "*)", delimited with "/*" and "*/", or starting
               and ending with a double quote ("). When "{-keep}" is used,
               only $1 is set, and returns the entire comment.

               See <http://docs.sun.com/db/doc/802-5762>.

       PEARL
           Comments in PEARL start with a "!" and last till the end of the
           line, or start with "/*" and end with "*/". With "{-keep}", $1 will
           be set to the entire comment.

       PHP Comments in PHP start with either "#" or "//" and last till the end
           of the line, or are delimited by "/*" and "*/". With "{-keep}", $1
           will be set to the entire comment.

       PL/B
           In PL/B, comments start with either "." or ";", and end with the
           next newline. See <http://www.mmcctech.com/pl-b/plb-0010.htm>.

       PL/I
           The PL/I language has comments starting with "/*" and ending with
           "*/".

       PL/SQL
           In PL/SQL, comments either start with "--" and run till the end of
           the line, or start with "/*" and end with "*/".

       Perl
           Perl uses comments that start with a "#", and continue till the end
           of the line.

       Portia
           The Portia programming language has comments that start with "//",
           and last till the end of the line.

       Python
           Python uses comments that start with a "#", and continue till the
           end of the line.

       Q-BAL
           Comments in the Q-BAL language start with "`" (a backtick), and
           continue till the end of the line.

       QML In "QML", comments start with "#" and last till the end of the
           line.  See <http://www.questionmark.com/uk/qml/overview.doc>.

       R   The statistical language R uses comments that start with a "#" and
           end with the following newline. See <http://www.r-project.org/>.

       REBOL
           Comments for the REBOL language start with ";" and last till the
           end of the line.

       Ruby
           Comments in Ruby start with "#" and last till the end of the line.

       Scheme
           Scheme comments start with ";", and last till the end of the line.
           See <http://schemers.org/>.

       shell
           Comments in various shells start with a "#" and end at the end of
           the line.

       Shelta
           The esoteric language Shelta uses comments that start and end with
           a ";". See <http://www.catseye.mb.ca/esoteric/shelta/index.html>.

       SLIDE
           The SLIDE language has two forms of comments. First there is the
           line comment, which starts with a "#" and includes the rest of the
           line (just like Perl). Second, there is the multiline, nested
           comment, which is delimited by "(*" and "*)". Under "{-keep}",
           only $1 is set, and is set to the entire comment. This pattern
           needs at least Perl version 5.6.0. See
           <http://www.cs.berkeley.edu/~ug/slide/docs/slide/spec/spec_frame_intro.shtml>.

       slrn
           Configuration files for slrn have comments starting with a "%" and
           lasting the rest of the line.

       Smalltalk
           Smalltalk uses comments that start and end with a double quote
           (").

       SMITH
           Comments in the SMITH language start with ";", and last till the
           end of the line.

       Squeak
           In the Smalltalk variant Squeak, comments start and end with a
           double quote ("). Double quotes can appear inside comments by
           doubling them.

       SQL Standard SQL uses comments starting with two or more dashes, and
           ending at the end of the line.

           MySQL does not follow the standard. Instead, it allows comments
           that start with a "#" or "-- " (that's two dashes and a space)
           ending with the following newline, and comments starting with "/*",
           and ending with the next ";" or "*/" that isn't inside single or
           double quotes. A pattern for this is returned by
           $RE{comment}{SQL}{MySQL}. With "{-keep}", only $1 will be set, and
           it returns the entire comment.

       Tcl In Tcl, comments start with "#" and continue till the end of the
           line.

       TeX The documentation language TeX uses comments starting with "%" and
           ending at the end of the line.

       troff
           The document formatting language troff uses comments starting with
           "\"", and continuing till the end of the line.

       Ubercode
           The Windows programming language Ubercode uses comments that start
           with "//" and continue to the end of the line. See
           <http://www.ubercode.com>.

       vi  In configuration files for the editor vi, one can use comments
           starting with a double quote ("), and ending at the end of the
           line.

       *W  In the language *W, comments start with "||", and end with "!!".

       zonefile
           Comments in DNS zonefiles start with ";", and continue till the end
           of the line.

       ZZT-OOP
           The in-game language ZZT-OOP uses comments that start with a "'"
           character, and end at the following newline. See
           <http://dave2.rocketjump.org/rad/zzthelp/lang.html>.
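
       The variants mentioned in the entries above are reached by adding one
       more subscript after the language key. The sketch below assumes the
       patterns behave as documented in this manual and reports which of a
       few patterns match a piece of text. The function name
       matching_dialects, the labels and the sample strings are hypothetical
       and only serve the illustration.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Return the labels of the patterns that match somewhere in $text.
# The labels are illustrative; the patterns are the documented ones.
sub matching_dialects {
    my ($text) = @_;
    my @candidates = (
        [ 'standard SQL'  => $RE{comment}{SQL}            ],
        [ 'MySQL'         => $RE{comment}{SQL}{MySQL}     ],
        [ 'ISO Pascal'    => $RE{comment}{Pascal}         ],
        [ 'Delphi Pascal' => $RE{comment}{Pascal}{Delphi} ],
    );
    my @hits;
    for my $candidate (@candidates) {
        my ($label, $pattern) = @$candidate;
        push @hits, $label if $text =~ /$pattern/;
    }
    return @hits;
}

# Line comments end at a newline, so the samples include one.
for my $sample ("# note\n", "--note\n", "{ mixed *)") {
    (my $shown = $sample) =~ s/\n/\\n/g;
    print "$shown => ", join(', ', matching_dialects($sample)), "\n";
}
```

       The last sample uses the mixed delimiters that the ISO Pascal entry
       describes, which the stricter Delphi variant is documented not to
       accept.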

REFERENCES

       [Go 90]
           Charles F. Goldfarb: The SGML Handbook. Oxford: Oxford University
           Press. 1990. ISBN 0-19-853737-9. Ch. 10.3, pp 390-391.

SEE ALSO

       Regexp::Common for a general description of how to use this interface.

AUTHOR

       Damian Conway (damian@conway.org)

MAINTENANCE

       This package is maintained by Abigail (regexp-common@abigail.be).

BUGS AND IRRITATIONS

       Bound to be plenty.

       For a start, there are many common regexes missing.  Send them in to
       regexp-common@abigail.be.

LICENSE and COPYRIGHT

       This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail.

       This module is free software, and may be used under any of the following
       licenses:

        1) The Perl Artistic License.     See the file COPYRIGHT.AL.
        2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2.
        3) The BSD Licence.               See the file COPYRIGHT.BSD.
        4) The MIT Licence.               See the file COPYRIGHT.MIT.

perl v5.12.0                      2010-01-02        Regexp::Common::comment(3)