Regexp::Common::comment(3)  User Contributed Perl Documentation  Regexp::Common::comment(3)

NAME

       Regexp::Common::comment -- provide regexes for comments.

SYNOPSIS

           use Regexp::Common qw /comment/;

           while (<>) {
               /$RE{comment}{C}/       and  print "Contains a C comment\n";
               /$RE{comment}{C++}/     and  print "Contains a C++ comment\n";
               /$RE{comment}{PHP}/     and  print "Contains a PHP comment\n";
               /$RE{comment}{Java}/    and  print "Contains a Java comment\n";
               /$RE{comment}{Perl}/    and  print "Contains a Perl comment\n";
               /$RE{comment}{awk}/     and  print "Contains an awk comment\n";
               /$RE{comment}{HTML}/    and  print "Contains an HTML comment\n";
           }

           use Regexp::Common qw /comment RE_comment_HTML/;

           while (<>) {
               $_ =~ RE_comment_HTML() and  print "Contains an HTML comment\n";
           }

DESCRIPTION

       Please consult the manual of Regexp::Common for a general description
       of how this interface works.

       Do not use this module directly, but load it via Regexp::Common.

       This module gives you regular expressions for comments in various
       languages.

   THE LANGUAGES
       Below, the comments of each of the languages are described.  The
       patterns are available as $RE{comment}{LANG}, for each language LANG.
       Some languages have variants; the individual language entries describe
       how to get the patterns for the variants.  Unless mentioned otherwise,
       "{-keep}" sets $1, $2, $3 and $4 to the entire comment, the opening
       marker, the content of the comment, and the closing marker (for many
       languages, the latter is a newline) respectively.

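       As a rough illustration of the default "{-keep}" captures, the sketch
       below matches one C comment and one Perl comment.  The sample strings
       and variable names are made up for this example.  The Perl case shows
       the closing marker being the newline.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical sample inputs, invented for this sketch.
my $c_source    = "int x = 1; /* counter */ int y;\n";
my $perl_source = "my \$x = 1; # counter\n";

# With {-keep}: $1 entire comment, $2 opening marker,
# $3 content, $4 closing marker.
my $c_re    = $RE{comment}{C}{-keep};
my $perl_re = $RE{comment}{Perl}{-keep};

# A match in list context returns the captures ($1, $2, $3, $4).
my @c_parts    = $c_source    =~ /$c_re/;
my @perl_parts = $perl_source =~ /$perl_re/;

print "C whole:   [$c_parts[0]]\n";
print "C open:    [$c_parts[1]]\n";
print "C content: [$c_parts[2]]\n";
print "C close:   [$c_parts[3]]\n";

printf "Perl open: [%s], content: [%s], close is a newline: %s\n",
    $perl_parts[1], $perl_parts[2],
    ($perl_parts[3] eq "\n" ? "yes" : "no");
```

       Because the closing marker of a line comment is the newline itself,
       a last line that lacks a trailing newline is not expected to match
       such a pattern.
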
       ABC Comments in ABC start with a backslash ("\"), and last till the end
           of the line.  See <http://homepages.cwi.nl/%7Esteven/abc/>.

       Ada Comments in Ada start with "--", and last till the end of the line.

       Advisor
           Advisor is a language used by the HP product glance. Comments for
           this language start with either "#" or "//", and last till the end
           of the line.

       Advsys
           Comments for the Advsys language start with ";" and last till the
           end of the line. See also <http://www.wurb.com/if/devsys/12>.

       Alan
           Alan comments start with "--", and last till the end of the line.
           See also
           <http://w1.132.telia.com/~u13207378/alan/manual/alanTOC.html>.

       Algol 60
           Comments in the Algol 60 language start with the keyword "comment",
           and end with a ";". See
           <http://www.masswerk.at/algol60/report.htm>.

       Algol 68
           In Algol 68, comments are either delimited by "#", or by one of the
           keywords "co" or "comment". The keywords should not be part of
           another word. See
           <http://westein.arb-phys.uni-dortmund.de/~wb/a68s.txt>.  With
           "{-keep}", only $1 will be set, returning the entire comment.

       ALPACA
           The ALPACA language has comments starting with "/*" and ending with
           "*/".

       awk The awk programming language uses comments that start with "#" and
           end at the end of the line.

       B   The B language has comments starting with "/*" and ending with
           "*/".

       BASIC
           There are various forms of BASIC around. Currently, we only support
           the variant supported by mvEnterprise, whose pattern is available
           as $RE{comment}{BASIC}{mvEnterprise}. Comments in this language
           start with a "!", a "*" or the keyword "REM", and last till the end
           of the line. See
           <http://www.rainingdata.com/products/beta/docs/mve/50/ReferenceManual/Basic.pdf>.

       Beatnik
           The esoteric language Beatnik only uses words consisting of
           letters.  Words are scored according to the rules of Scrabble.
           Words scoring less than 5 points, or 18 points or more are
           considered comments (although the compiler might mock you if you
           score less than 5 points).  Regardless of whether "{-keep}" is
           used, $1 will be set, and set to the entire comment. This pattern
           requires perl 5.8.0 or newer.

       beta-Juliet
           The beta-Juliet programming language has comments that start with
           "//" and that continue till the end of the line. See also
           <http://www.catseye.mb.ca/esoteric/b-juliet/index.html>.

       Befunge-98
           The esoteric language Befunge-98 uses comments that start and end
           with a ";". See
           <http://www.catseye.mb.ca/esoteric/befunge/98/spec98.html>.

       BML BML, or Better Markup Language is an HTML templating language that
           uses comments starting with "<?c_", and ending with "c_?>".  See
           <http://www.livejournal.com/doc/server/bml.index.html>.

       Brainfuck
           The minimal language Brainfuck uses only eight characters, "<",
           ">", "[", "]", "+", "-", "." and ",".  Any other characters are
           considered comments. With "{-keep}", $1 is set to the entire
           comment.

       C   The C language has comments starting with "/*" and ending with
           "*/".

       C-- The C-- language has comments starting with "/*" and ending with
           "*/".  See
           <http://cs.uas.arizona.edu/classes/453/programs/C--Spec.html>.

       C++ The C++ language has two forms of comments. Comments that start
           with "//" and last till the end of the line, and comments that
           start with "/*", and end with "*/". If "{-keep}" is used, only $1
           will be set, and set to the entire comment.

       C#  The C# language has two forms of comments. Comments that start with
           "//" and last till the end of the line, and comments that start
           with "/*", and end with "*/". If "{-keep}" is used, only $1 will be
           set, and set to the entire comment.  See
           <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/csspec/html/vclrfcsharpspec_C.asp>.

       Caml
           Comments in Caml start with "(*", end with "*)", and can be nested.
           See <http://www.cs.caltech.edu/courses/cs134/cs134b/book.pdf> and
           <http://pauillac.inria.fr/caml/index-eng.html>.

       Cg  The Cg language has two forms of comments. Comments that start with
           "//" and last till the end of the line, and comments that start
           with "/*", and end with "*/". If "{-keep}" is used, only $1 will be
           set, and set to the entire comment.  See
           <http://developer.nvidia.com/attach/3722>.

       CLU In "CLU", a comment starts with a percent sign ("%"), and ends with
           the next newline. See
           <ftp://ftp.lcs.mit.edu:/pub/pclu/CLU-syntax.ps> and
           <http://www.pmg.lcs.mit.edu/CLU.html>.

       COBOL
           Traditionally, comments in COBOL are indicated by an asterisk in
           the seventh column. This is what the pattern matches. Modern
           compilers may be more lenient, though. See
           <http://www.csis.ul.ie/cobol/Course/COBOLIntro.htm>, and
           <http://www.csis.ul.ie/cobol/default.htm>.

       CQL Comments in the chess query language (CQL) start with a semicolon
           (";") and last till the end of the line. See
           <http://www.rbnn.com/cql/>.

       Crystal Report
           The formula editor in Crystal Reports uses comments that start with
           "//", and end with the end of the line.

       Dylan
           There are two types of comments in Dylan. They either start with
           "//", or are nested comments, delimited with "/*" and "*/".  Under
           "{-keep}", only $1 will be set, returning the entire comment.  This
           pattern requires perl 5.6.0 or newer.

       ECMAScript
           The ECMAScript language has two forms of comments. Comments that
           start with "//" and last till the end of the line, and comments
           that start with "/*", and end with "*/". If "{-keep}" is used, only
           $1 will be set, and set to the entire comment. JavaScript is
           Netscape's implementation of ECMAScript. See
           <http://www.ecma-international.org/publications/files/ecma-st/Ecma-262.pdf>,
           and
           <http://www.ecma-international.org/publications/standards/Ecma-262.htm>.

       Eiffel
           Eiffel comments start with "--", and last till the end of the line.

       False
           In False, comments start with "{" and end with "}".  See
           <http://wouter.fov120.com/false/false.txt>

       FPL The FPL language has two forms of comments. Comments that start
           with "//" and last till the end of the line, and comments that
           start with "/*", and end with "*/". If "{-keep}" is used, only $1
           will be set, and set to the entire comment.

       Forth
           Comments in Forth start with "\", and end with the end of the line.
           See also <http://docs.sun.com/sb/doc/806-1377-10>.

       Fortran
           There are two forms of Fortran. There's free form Fortran, which
           has comments that start with "!", and end at the end of the line.
           The pattern for this is given by $RE{comment}{Fortran}. Fixed form
           Fortran, which has been obsoleted, has comments that start with
           "C", "c" or "*" in the first column, or with "!" anywhere but the
           sixth column.  The pattern for this is given by
           $RE{comment}{Fortran}{fixed}.

           See also
           <http://www.cray.com/craydoc/manuals/007-3692-005/html-007-3692-005/>.

       Funge-98
           The esoteric language Funge-98 uses comments that start and end
           with a ";".

       fvwm2
           Configuration files for fvwm2 have comments starting with a "#" and
           lasting the rest of the line.

       Haifu
           Haifu, an esoteric language using haikus, has comments starting and
           ending with a ",".  See
           <http://www.dangermouse.net/esoteric/haifu.html>.

       Haskell
           There are two types of comments in Haskell. They either start with
           at least two dashes, or are nested comments, delimited with "{-"
           and "-}".  Under "{-keep}", only $1 will be set, returning the
           entire comment.  This pattern requires perl 5.6.0 or newer.

       HTML
           In HTML, comments only appear inside a comment declaration.  A
           comment declaration starts with a "<!", and ends with a ">". Inside
           this declaration, we have zero or more comments.  Comments start
           with "--" and end with "--", and are optionally followed by
           whitespace. The pattern $RE{comment}{HTML} recognizes those comment
           declarations (and hence more than a comment).  Note that this is
           not the same as something that starts with "<!--" and ends with
           "-->", because the following will be matched completely:

               <!--  First  Comment   --
                 --> Second Comment <!--
                 --  Third  Comment   -->

           Do not be fooled by what your favourite browser thinks is an HTML
           comment.

           If "{-keep}" is used, the following are returned:

           $1  captures the entire comment declaration.

           $2  captures the MDO (markup declaration open), "<!".

           $3  captures the content between the MDO and the MDC.

           $4  captures the (last) comment, without the surrounding dashes.

           $5  captures the MDC (markup declaration close), ">".

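           The five HTML captures can be inspected with a small sketch like
           the one below.  The input strings and variable names are
           assumptions made for this example; the second string mirrors the
           three-comment declaration shown above.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical input: one comment declaration holding a single comment.
my $html = "<p>text</p><!-- a note --><p>more</p>";

my $keep_re = $RE{comment}{HTML}{-keep};
my @part    = $html =~ /$keep_re/;     # ($1, $2, $3, $4, $5)

print "declaration:  [$part[0]]\n";    # $1
print "MDO:          [$part[1]]\n";    # $2
print "MDO to MDC:   [$part[2]]\n";    # $3
print "last comment: [$part[3]]\n";    # $4
print "MDC:          [$part[4]]\n";    # $5

# One declaration containing three comments, as in the text above.
my $declaration = "<!--  First  Comment   --\n"
                . "  --> Second Comment <!--\n"
                . "  --  Third  Comment   -->";

my $plain_re      = $RE{comment}{HTML};
my $matched_whole = $declaration =~ /\A$plain_re\z/ ? 1 : 0;
print "three-comment declaration matched completely: $matched_whole\n";
```
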
       Hugo
           There are two types of comments in Hugo. They either start with "!"
           (which cannot be followed by a "\"), or are nested comments,
           delimited with "!\" and "\!".  Under "{-keep}", only $1 will be
           set, returning the entire comment.  This pattern requires perl
           5.6.0 or newer.

       Icon
           Icon has comments that start with "#" and end at the next newline.
           See
           <http://www.toolsofcomputing.com/IconHandbook/IconHandbook.pdf>,
           <http://www.cs.arizona.edu/icon/index.htm>, and
           <http://burks.bton.ac.uk/burks/language/icon/index.htm>.

       ILLGOL
           The esoteric language ILLGOL uses comments starting with "NB" and
           lasting till the end of the line.  See
           <http://www.catseye.mb.ca/esoteric/illgol/index.html>.

       INTERCAL
           Comments in INTERCAL are single line comments. They start with one
           of the keywords "NOT" or "N'T", and can optionally be preceded by
           the keywords "DO" and "PLEASE". If both keywords are used, "PLEASE"
           precedes "DO". Keywords are separated by whitespace.

       J   The language J uses comments that start with "NB.", and that last
           till the end of the line. See
           <http://www.jsoftware.com/books/help/primer/contents.htm>, and
           <http://www.jsoftware.com/>.

       Java
           The Java language has two forms of comments. Comments that start
           with "//" and last till the end of the line, and comments that
           start with "/*", and end with "*/". If "{-keep}" is used, only $1
           will be set, and set to the entire comment.

       JavaDoc
           The Javadoc documentation syntax is demarcated with a subset of
           ordinary Java comments to separate it from code.  Comments start
           with "/**" and end with "*/".  If "{-keep}" is used, only $1 will
           be set, and set to the entire comment. See
           <http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html#format>.

       JavaScript
           The JavaScript language has two forms of comments. Comments that
           start with "//" and last till the end of the line, and comments
           that start with "/*", and end with "*/". If "{-keep}" is used, only
           $1 will be set, and set to the entire comment. JavaScript is
           Netscape's implementation of ECMAScript.  See
           <http://www.mozilla.org/js/language/E262-3.pdf>, and
           <http://www.mozilla.org/js/language/>.

       LaTeX
           The documentation language LaTeX uses comments starting with "%"
           and ending at the end of the line.

       Lisp
           Comments in Lisp start with a semicolon (";") and last till the
           end of the line.

       LPC The LPC language has comments starting with "/*" and ending with
           "*/".

       LOGO
           Comments for the language LOGO start with ";", and last till the
           end of the line.

       lua Comments for the lua language start with "--", and last till the
           end of the line. See also <http://www.lua.org/manual/manual.html>.

       M, MUMPS
           In "M" (aka "MUMPS"), comments start with a semicolon, and last
           till the end of a line. The language specification requires the
           semicolon to be preceded by one or more linestart characters.
           Those characters default to a space, but that's configurable. This
           requirement of preceding the comment with linestart characters is
           not tested for. See
           <ftp://ftp.intersys.com/pub/openm/ism/ism64docs.zip>,
           <http://mtechnology.intersys.com/mproducts/openm/index.html>, and
           <http://mcenter.com/mtrc/index.html>.

       m4  By default, the preprocessor language m4 uses single line comments,
           that start with a "#" and continue to the end of the line,
           including the newline. The pattern $RE{comment}{m4} matches such
           comments.  In m4, it is possible to change the starting token
           though.  See
           <http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf>,
           <http://www.cs.stir.ac.uk/~kjt/research/pdf/expl-m4.pdf>, and
           <http://www.gnu.org/software/m4/manual/>.

       Modula-2
           In "Modula-2", comments start with "(*", and end with "*)".
           Comments may be nested. See <http://www.modula2.org/>.

       Modula-3
           In "Modula-3", comments start with "(*", and end with "*)".
           Comments may be nested. See <http://www.m3.org/>.

       mutt
           Configuration files for mutt have comments starting with a "#" and
           lasting the rest of the line.

       Nickle
           The Nickle language has one line comments starting with "#" (like
           Perl), or multiline comments delimited by "/*" and "*/" (like C).
           Under "{-keep}", only $1 will be set. See also
           <http://www.nickle.org>.

       Oberon
           Comments in Oberon start with "(*" and end with "*)".  See
           <http://www.oberon.ethz.ch/oreport.html>.

       Pascal
           There are many implementations of Pascal. This module provides
           patterns for comments of several implementations.

           $RE{comment}{Pascal}
               This is the pattern that recognizes comments according to the
               Pascal ISO standard. This standard says that comments start
               with either "{", or "(*", and end with "}" or "*)". This means
               that "{*)" and "(*}" are considered to be comments. Many Pascal
               applications don't allow this.  See
               <http://www.pascal-central.com/docs/iso10206.txt>

           $RE{comment}{Pascal}{Alice}
               The Alice Pascal compiler accepts comments that start with "{"
               and end with "}". Comments are not allowed to contain newlines.
               See <http://www.templetons.com/brad/alice/language/>.

           $RE{comment}{Pascal}{Delphi}, $RE{comment}{Pascal}{Free} and
           $RE{comment}{Pascal}{GPC}
               The Delphi Pascal, Free Pascal and the GNU Pascal Compiler
               implementations of Pascal all have comments that either start
               with "//" and last till the end of the line, are delimited with
               "{" and "}" or are delimited with "(*" and "*)". Patterns for
               those comments are given by $RE{comment}{Pascal}{Delphi},
               $RE{comment}{Pascal}{Free} and $RE{comment}{Pascal}{GPC}
               respectively. These patterns only set $1 when "{-keep}" is
               used, which will then include the entire comment.

               See <http://info.borland.com/techpubs/delphi5/oplg/>,
               <http://www.freepascal.org/docs-html/ref/ref.html> and
               <http://www.gnu-pascal.de/gpc/>.

           $RE{comment}{Pascal}{Workshop}
               The Workshop Pascal compiler, from Sun Microsystems, allows
               comments that are delimited with either "{" and "}", delimited
               with "(*" and "*)", delimited with "/*" and "*/", or starting
               and ending with a double quote ("""). When "{-keep}" is used,
               only $1 is set, and returns the entire comment.

               See <http://docs.sun.com/db/doc/802-5762>.

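           How the Pascal variants differ can be checked with a short sketch
           such as the following.  The sample comments and variable names
           are invented for this example; each pattern is anchored so that
           it must match the whole sample.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical sample comments.
my $mixed     = "{ opened with a brace, closed with star-paren *)";
my $multiline = "{ first line\n  second line }";
my $slashes   = "// line comment\n";

my $iso    = $RE{comment}{Pascal};
my $alice  = $RE{comment}{Pascal}{Alice};
my $delphi = $RE{comment}{Pascal}{Delphi};

# 1 if the pattern matches the complete sample, 0 otherwise.
my %matches = (
    iso_mixed       => ($mixed     =~ /\A$iso\z/    ? 1 : 0),
    iso_multiline   => ($multiline =~ /\A$iso\z/    ? 1 : 0),
    alice_mixed     => ($mixed     =~ /\A$alice\z/  ? 1 : 0),
    alice_multiline => ($multiline =~ /\A$alice\z/  ? 1 : 0),
    delphi_slashes  => ($slashes   =~ /\A$delphi\z/ ? 1 : 0),
);

printf "%-16s %d\n", $_, $matches{$_} for sort keys %matches;
```

           Going by the descriptions above, the ISO pattern should accept
           both brace samples, while the Alice pattern should reject the
           mixed delimiters and the embedded newline.
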
       PEARL
           Comments in PEARL start with a "!" and last till the end of the
           line, or start with "/*" and end with "*/". With "{-keep}", $1 will
           be set to the entire comment.

       PHP Comments in PHP start with either "#" or "//" and last till the end
           of the line, or are delimited by "/*" and "*/". With "{-keep}", $1
           will be set to the entire comment.

       PL/B
           In PL/B, comments start with either "." or ";", and end with the
           next newline. See <http://www.mmcctech.com/pl-b/plb-0010.htm>.

       PL/I
           The PL/I language has comments starting with "/*" and ending with
           "*/".

       PL/SQL
           In PL/SQL, comments either start with "--" and run till the end of
           the line, or start with "/*" and end with "*/".

       Perl
           Perl uses comments that start with a "#", and continue till the end
           of the line.

       Portia
           The Portia programming language has comments that start with "//",
           and last till the end of the line.

       Python
           Python uses comments that start with a "#", and continue till the
           end of the line.

       Q-BAL
           Comments in the Q-BAL language start with "`" (a backtick), and
           continue till the end of the line.

       QML In "QML", comments start with "#" and last till the end of the
           line.  See <http://www.questionmark.com/uk/qml/overview.doc>.

       R   The statistical language R uses comments that start with a "#" and
           end with the following newline. See <http://www.r-project.org/>.

       REBOL
           Comments for the REBOL language start with ";" and last till the
           end of the line.

       Ruby
           Comments in Ruby start with "#" and last till the end of the line.

       Scheme
           Scheme comments start with ";", and last till the end of the line.
           See <http://schemers.org/>.

       shell
           Comments in various shells start with a "#" and end at the end of
           the line.

       Shelta
           The esoteric language Shelta uses comments that start and end with
           a ";". See <http://www.catseye.mb.ca/esoteric/shelta/index.html>.

       SLIDE
           The SLIDE language has two forms of comments. First there is the
           line comment, which starts with a "#" and includes the rest of the
           line (just like Perl). Second, there is the multiline, nested
           comment, which is delimited by "(*" and "*)". Under "{-keep}",
           only $1 is set, and is set to the entire comment. See
           <http://www.cs.berkeley.edu/~ug/slide/docs/slide/spec/spec_frame_intro.shtml>.

       slrn
           Configuration files for slrn have comments starting with a "%" and
           lasting the rest of the line.

       Smalltalk
           Smalltalk uses comments that start and end with a double quote,
           """.

       SMITH
           Comments in the SMITH language start with ";", and last till the
           end of the line.

       Squeak
           In the Smalltalk variant Squeak, comments start and end with """.
           Double quotes can appear inside comments by doubling them.

       SQL Standard SQL uses comments starting with two or more dashes, and
           ending at the end of the line.

           MySQL does not follow the standard. Instead, it allows comments
           that start with a "#" or "-- " (that's two dashes and a space)
           ending with the following newline, and comments starting with "/*",
           and ending with the next ";" or "*/" that isn't inside single or
           double quotes. A pattern for this is returned by
           $RE{comment}{SQL}{MySQL}. With "{-keep}", only $1 will be set, and
           it returns the entire comment.

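           The difference between the standard SQL pattern and the MySQL
           variant can be seen in a sketch like this one.  The sample lines
           are invented for the example, and each pattern is anchored so
           that it has to match the whole sample.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

my $standard = $RE{comment}{SQL};
my $mysql    = $RE{comment}{SQL}{MySQL};

# Hypothetical one-line samples, each ending in a newline.
my @samples = (
    "-- with space\n",
    "--without space\n",
    "# hash style\n",
);

my %result;
for my $sample (@samples) {
    $result{$sample} = {
        standard => ($sample =~ /\A$standard\z/ ? 1 : 0),
        mysql    => ($sample =~ /\A$mysql\z/    ? 1 : 0),
    };

    # Show the trailing newline as a visible "\n" in the report.
    (my $shown = $sample) =~ s/\n/\\n/;
    printf "%-20s standard=%d mysql=%d\n",
        $shown, $result{$sample}{standard}, $result{$sample}{mysql};
}
```

           Following the description above, only the standard pattern is
           expected to accept two dashes without a following space, and only
           the MySQL pattern is expected to accept the "#" form.
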
       Tcl In Tcl, comments start with "#" and continue till the end of the
           line.

       TeX The documentation language TeX uses comments starting with "%" and
           ending at the end of the line.

       troff
           The document formatting language troff uses comments starting with
           "\"", and continuing till the end of the line.

       Ubercode
           The Windows programming language Ubercode uses comments that start
           with "//" and continue to the end of the line. See
           <http://www.ubercode.com>.

       vi  In configuration files for the editor vi, one can use comments
           starting with """, and ending at the end of the line.

       *W  In the language *W, comments start with "||", and end with "!!".

       zonefile
           Comments in DNS zonefiles start with ";", and continue till the end
           of the line.

       ZZT-OOP
           The in-game language ZZT-OOP uses comments that start with a "'"
           character, and end at the following newline. See
           <http://dave2.rocketjump.org/rad/zzthelp/lang.html>.

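       Since every pattern lives under $RE{comment}{LANG}, the language name
       can be supplied at run time.  The helper below is a minimal sketch of
       that idea; "comments_in" and the sample inputs are hypothetical names
       chosen for this example and are not part of the module.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical helper: return every comment found in $text for the
# language named by $language (a key under $RE{comment}).
sub comments_in {
    my ($language, $text) = @_;
    my $re = $RE{comment}{$language};
    # Without {-keep} the pattern itself captures nothing, so the
    # surrounding parentheses yield one element per comment.
    return $text =~ /($re)/g;
}

my @ada = comments_in(Ada   => "x := 1; -- set x\ny := 2; -- set y\n");
my @tex = comments_in(LaTeX => "Some text % a remark\nMore text.\n");
my @c   = comments_in(C     => "a /* one */ b /* two */ c");

printf "Ada:   %d comment(s)\n", scalar @ada;
printf "LaTeX: %d comment(s)\n", scalar @tex;
printf "C:     %d comment(s)\n", scalar @c;
```

       For line-comment languages, each returned element includes the
       terminating newline, because the newline is the closing marker.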

REFERENCES

       [Go 90]
           Charles F. Goldfarb: The SGML Handbook. Oxford: Oxford University
           Press. 1990. ISBN 0-19-853737-9. Ch. 10.3, pp 390-391.

SEE ALSO

       Regexp::Common for a general description of how to use this interface.

AUTHOR

       Damian Conway (damian@conway.org)

MAINTENANCE

       This package is maintained by Abigail (regexp-common@abigail.be).

BUGS AND IRRITATIONS

       Bound to be plenty.

       For a start, there are many common regexes missing.  Send them in to
       regexp-common@abigail.be.

LICENSE and COPYRIGHT

       This software is Copyright (c) 2001 - 2017, Damian Conway and Abigail.

       This module is free software, and may be used under any of the following
       licenses:

        1) The Perl Artistic License.     See the file COPYRIGHT.AL.
        2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2.
        3) The BSD License.               See the file COPYRIGHT.BSD.
        4) The MIT License.               See the file COPYRIGHT.MIT.

perl v5.32.0                      2020-07-28        Regexp::Common::comment(3)