Regexp::Common::comment(3)   User Contributed Perl Documentation   Regexp::Common::comment(3)

NAME
Regexp::Common::comment -- provide regexes for comments.

SYNOPSIS
    use Regexp::Common qw /comment/;

    while (<>) {
        /$RE{comment}{C}/ and print "Contains a C comment\n";
        /$RE{comment}{'C++'}/ and print "Contains a C++ comment\n";
        /$RE{comment}{PHP}/ and print "Contains a PHP comment\n";
        /$RE{comment}{Java}/ and print "Contains a Java comment\n";
        /$RE{comment}{Perl}/ and print "Contains a Perl comment\n";
        /$RE{comment}{awk}/ and print "Contains an awk comment\n";
        /$RE{comment}{HTML}/ and print "Contains an HTML comment\n";
    }

    use Regexp::Common qw /comment RE_comment_HTML/;

    while (<>) {
        $_ =~ RE_comment_HTML() and print "Contains an HTML comment\n";
    }

DESCRIPTION
Please consult the manual of Regexp::Common for a general description
of how this interface works.

Do not use this module directly, but load it via Regexp::Common.

This module gives you regular expressions for comments in various
languages.

THE LANGUAGES
Below, the comments of each of the languages are described. The
patterns are available as $RE{comment}{LANG}, for each language LANG.
Some languages have variants; the entry for each such language
describes how to get the patterns for the variants. Unless mentioned
otherwise, "{-keep}" sets $1, $2, $3 and $4 to the entire comment, the
opening marker, the content of the comment, and the closing marker (for
many languages, the latter is a newline) respectively.
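
The capture layout described above can be sketched with a hand-written
regex. The pattern below is an illustrative stand-in for a "#"-style line
comment, not the regex the module ships, and the names $line_comment and
split_comment are hypothetical.

```perl
use strict;
use warnings;

# Hand-written stand-in for a "#"-to-end-of-line comment pattern.
# This is NOT the module's regex; it only mirrors the capture layout:
# group 1 whole comment, group 2 opener, group 3 content, group 4 closer.
my $line_comment = qr/((#)([^\n]*)(\n))/;

# Returns a hash ref with the four parts, or undef if no comment is found.
sub split_comment {
    my ($text) = @_;
    return undef unless $text =~ $line_comment;
    return { all => $1, open => $2, content => $3, close => $4 };
}

my $parts = split_comment("my \$x = 1; # set x\n");
printf "open=[%s] content=[%s]\n", $parts->{open}, $parts->{content};
# prints: open=[#] content=[ set x]
```

With the module loaded, the same four groups would be requested by
matching against $RE{comment}{Perl}{-keep} instead of the stand-in.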

ABC Comments in ABC start with a backslash ("\"), and last till the end
of the line. See <http://homepages.cwi.nl/%7Esteven/abc/>.

Ada Comments in Ada start with "--", and last till the end of the line.

Advisor
Advisor is a language used by the HP product glance. Comments for
this language start with either "#" or "//", and last till the end
of the line.

Advsys
Comments for the Advsys language start with ";" and last till the
end of the line. See also <http://www.wurb.com/if/devsys/12>.

Alan
Alan comments start with "--", and last till the end of the line.
See also
<http://w1.132.telia.com/~u13207378/alan/manual/alanTOC.html>.

Algol 60
Comments in the Algol 60 language start with the keyword "comment",
and end with a ";". See
<http://www.masswerk.at/algol60/report.htm>.

Algol 68
In Algol 68, comments are either delimited by "#", or by one of the
keywords "co" or "comment". The keywords should not be part of
another word. See
<http://westein.arb-phys.uni-dortmund.de/~wb/a68s.txt>. With
"{-keep}", only $1 will be set, returning the entire comment.

ALPACA
The ALPACA language has comments starting with "/*" and ending with
"*/".

awk The awk programming language uses comments that start with "#" and
end at the end of the line.

B The B language has comments starting with "/*" and ending with
"*/".

BASIC
There are various forms of BASIC around. Currently, we only support
the variant supported by mvEnterprise, whose pattern is available
as $RE{comment}{BASIC}{mvEnterprise}. Comments in this language
start with a "!", a "*" or the keyword "REM", and last till the end
of the line. See
<http://www.rainingdata.com/products/beta/docs/mve/50/ReferenceManual/Basic.pdf>.

Beatnik
The esoteric language Beatnik only uses words consisting of
letters. Words are scored according to the rules of Scrabble.
Words scoring less than 5 points, or 18 points or more are
considered comments (although the compiler might mock you if you
score less than 5 points). Regardless of whether "{-keep}" is used,
$1 will be set, and set to the entire comment. This pattern requires
perl 5.8.0 or newer.
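
The scoring rule can be sketched as follows. This is a minimal
illustration that assumes the English Scrabble tile values; the names
%tile, scrabble_score and is_beatnik_comment are hypothetical, and the
module itself matches such words with a regex rather than with a
function like this.

```perl
use strict;
use warnings;

# English Scrabble tile values (an assumption about which rule set is meant).
my %tile;
@tile{qw(A E I L N O R S T U)} = (1) x 10;
@tile{qw(D G)}                 = (2) x 2;
@tile{qw(B C M P)}             = (3) x 4;
@tile{qw(F H V W Y)}           = (4) x 5;
$tile{K}                       = 5;
@tile{qw(J X)}                 = (8) x 2;
@tile{qw(Q Z)}                 = (10) x 2;

# Sum the tile values of the letters in a word; non-letters score 0.
sub scrabble_score {
    my ($word) = @_;
    my $total = 0;
    for my $letter (split //, uc $word) {
        $total += exists $tile{$letter} ? $tile{$letter} : 0;
    }
    return $total;
}

# A word is a comment when it scores less than 5, or 18 or more.
sub is_beatnik_comment {
    my ($word) = @_;
    my $score = scrabble_score($word);
    return ($score < 5 || $score >= 18) ? 1 : 0;
}

for my $word (qw(to hello quizzical)) {
    printf "%-10s score %2d %s\n", $word, scrabble_score($word),
        is_beatnik_comment($word) ? "comment" : "not a comment";
}
```

Here "to" (2 points) and "quizzical" (38 points) count as comments,
while "hello" (8 points) does not.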

beta-Juliet
The beta-Juliet programming language has comments that start with
"//" and that continue till the end of the line. See also
<http://www.catseye.mb.ca/esoteric/b-juliet/index.html>.

Befunge-98
The esoteric language Befunge-98 uses comments that start and end
with a ";". See
<http://www.catseye.mb.ca/esoteric/befunge/98/spec98.html>.

BML BML, or Better Markup Language, is an HTML templating language that
uses comments starting with "<?c_", and ending with "c_?>". See
<http://www.livejournal.com/doc/server/bml.index.html>.

Brainfuck
The minimal language Brainfuck uses only eight characters, "<",
">", "[", "]", "+", "-", "." and ",". Any other characters are
considered comments. With "{-keep}", $1 is set to the entire
comment.

C The C language has comments starting with "/*" and ending with
"*/".

C-- The C-- language has comments starting with "/*" and ending with
"*/". See
<http://cs.uas.arizona.edu/classes/453/programs/C--Spec.html>.

C++ The C++ language has two forms of comments: comments that start
with "//" and last till the end of the line, and comments that
start with "/*", and end with "*/". If "{-keep}" is used, only $1
will be set, and set to the entire comment.
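
The two forms can be illustrated with a small hand-written pattern. It
is a sketch under simplifying assumptions (string literals are ignored),
not the regex the module provides, and the names $cxx_comment and
cxx_comments are hypothetical.

```perl
use strict;
use warnings;

# Illustrative pattern only (not the module's regex): either a "//" comment
# running through the newline, or a "/* ... */" block. String literals are
# ignored here, so "//" inside a quoted string would be misread.
my $cxx_comment = qr{(//[^\n]*\n|/\*.*?\*/)}s;

# Returns every comment found in the source text, in order.
sub cxx_comments {
    my ($source) = @_;
    my @found = $source =~ /$cxx_comment/g;
    return @found;
}

my $source   = "int a; // first\n/* second\n   spans lines */ int b;\n";
my @comments = cxx_comments($source);
print scalar(@comments), " comments found\n";
# prints: 2 comments found
```

Because the alternation has a single capture group, each element of the
returned list is one entire comment, which mirrors the "only $1 is set"
behaviour described for these two-form languages.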

C# The C# language has two forms of comments: comments that start with
"//" and last till the end of the line, and comments that start
with "/*", and end with "*/". If "{-keep}" is used, only $1 will be
set, and set to the entire comment. See
<http://msdn.microsoft.com/library/default.asp?url=/library/en-us/csspec/html/vclrfcsharpspec_C.asp>.

Caml
Comments in Caml start with "(*", end with "*)", and can be nested.
See <http://www.cs.caltech.edu/courses/cs134/cs134b/book.pdf> and
<http://pauillac.inria.fr/caml/index-eng.html>.
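
Nesting is what a plain "(*" to first "*)" match gets wrong. One way to
handle it is a recursive regex, sketched below. This is not the
module's pattern; it relies on the (?1) recursion construct, which needs
perl 5.10 or newer, and the names $nested_comment and
first_nested_comment are hypothetical.

```perl
use strict;
use warnings;

# Sketch, not the module's pattern: a recursive regex for nested comments.
# The (?1) construct re-enters capture group 1 (perl 5.10 or newer).
my $nested_comment = qr/
    (                       # group 1: one complete comment
        \(\*                # opening marker
        (?:
            (?> [^(*]+ )    # run of ordinary characters
          | (?1)            # a nested comment
          | \( (?!\*)       # open paren that does not start a comment
          | \* (?!\))       # star that does not end a comment
        )*
        \*\)                # closing marker
    )
/x;

# Returns the first balanced comment in the text, or undef.
sub first_nested_comment {
    my ($text) = @_;
    return $text =~ $nested_comment ? $1 : undef;
}

my $text = "let x = 1 (* outer (* inner *) still outer *) in x";
print first_nested_comment($text), "\n";
# prints: (* outer (* inner *) still outer *)
```

A non-recursive pattern would stop at the first "*)" and return only
"(* outer (* inner *)".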

Cg The Cg language has two forms of comments: comments that start with
"//" and last till the end of the line, and comments that start
with "/*", and end with "*/". If "{-keep}" is used, only $1 will be
set, and set to the entire comment. See
<http://developer.nvidia.com/attach/3722>.

CLU In "CLU", a comment starts with a percent sign ("%"), and ends with
the next newline. See
<ftp://ftp.lcs.mit.edu:/pub/pclu/CLU-syntax.ps> and
<http://www.pmg.lcs.mit.edu/CLU.html>.

COBOL
Traditionally, comments in COBOL are indicated by an asterisk in
the seventh column. This is what the pattern matches. Modern
compilers may be more lenient, though. See
<http://www.csis.ul.ie/cobol/Course/COBOLIntro.htm>, and
<http://www.csis.ul.ie/cobol/default.htm>.

CQL Comments in the chess query language (CQL) start with a semicolon
(";") and last till the end of the line. See
<http://www.rbnn.com/cql/>.

Crystal Report
The formula editor in Crystal Reports uses comments that start with
"//", and end with the end of the line.

Dylan
There are two types of comments in Dylan. They either start with
"//", or are nested comments, delimited with "/*" and "*/". Under
"{-keep}", only $1 will be set, returning the entire comment. This
pattern requires perl 5.6.0 or newer.

ECMAScript
The ECMAScript language has two forms of comments: comments that
start with "//" and last till the end of the line, and comments
that start with "/*", and end with "*/". If "{-keep}" is used, only
$1 will be set, and set to the entire comment. JavaScript is
Netscape's implementation of ECMAScript. See
<http://www.ecma-international.org/publications/files/ecma-st/Ecma-262.pdf>,
and
<http://www.ecma-international.org/publications/standards/Ecma-262.htm>.

Eiffel
Eiffel comments start with "--", and last till the end of the line.

False
In False, comments start with "{" and end with "}". See
<http://wouter.fov120.com/false/false.txt>.

FPL The FPL language has two forms of comments: comments that start
with "//" and last till the end of the line, and comments that
start with "/*", and end with "*/". If "{-keep}" is used, only $1
will be set, and set to the entire comment.

Forth
Comments in Forth start with "\", and end with the end of the line.
See also <http://docs.sun.com/sb/doc/806-1377-10>.

Fortran
There are two forms of Fortran. There's free form Fortran, which
has comments that start with "!", and end at the end of the line.
The pattern for this is given by $RE{comment}{Fortran}. Fixed form
Fortran, which has been obsoleted, has comments that start with "C",
"c" or "*" in the first column, or with "!" anywhere except in the
sixth column. The pattern for this is given by
$RE{comment}{Fortran}{fixed}.

See also
<http://www.cray.com/craydoc/manuals/007-3692-005/html-007-3692-005/>.
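
The fixed form rule depends on column positions, which a short function
can make concrete. This is a simplified sketch, not the module's
pattern: it does not recognise "!" inside character literals, and the
name fixed_form_comment_offset is hypothetical.

```perl
use strict;
use warnings;

# Returns the 0-based offset at which a fixed-form comment starts on this
# line, or undef if the line carries no comment. Simplification: a "!"
# inside a character literal is not treated specially.
sub fixed_form_comment_offset {
    my ($line) = @_;
    return 0 if $line =~ /^[Cc*]/;        # comment marker in column 1
    while ($line =~ /!/g) {
        my $column = pos($line);          # 1-based column of this "!"
        return $column - 1 if $column != 6;
    }
    return undef;
}

for my $line ("C whole-line comment",
              "      X = 1 ! trailing",
              "     !continuation") {
    my $at = fixed_form_comment_offset($line);
    printf "%s\n", defined($at) ? "comment at offset $at" : "no comment";
}
```

The third line has its "!" in the sixth column, so it is not reported
as a comment.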

Funge-98
The esoteric language Funge-98 uses comments that start and end
with a ";".

fvwm2
Configuration files for fvwm2 have comments starting with a "#" and
lasting the rest of the line.

Haifu
Haifu, an esoteric language using haikus, has comments starting and
ending with a ",". See
<http://www.dangermouse.net/esoteric/haifu.html>.

Haskell
There are two types of comments in Haskell. They either start with
at least two dashes, or are nested comments, delimited with "{-"
and "-}". Under "{-keep}", only $1 will be set, returning the
entire comment. This pattern requires perl 5.6.0 or newer.

HTML
In HTML, comments only appear inside a comment declaration. A
comment declaration starts with a "<!", and ends with a ">". Inside
this declaration, we have zero or more comments. Comments start
with "--" and end with "--", and are optionally followed by
whitespace. The pattern $RE{comment}{HTML} recognizes those comment
declarations (and hence more than a comment). Note that this is
not the same as something that starts with "<!--" and ends with
"-->", because the following will be matched completely:

    <!-- First Comment --
    --> Second Comment <!--
    -- Third Comment -->

Do not be fooled by what your favourite browser thinks is an HTML
comment.

If "{-keep}" is used, the following are returned:

$1 captures the entire comment declaration.

$2 captures the MDO (markup declaration open), "<!".

$3 captures the content between the MDO and the MDC.

$4 captures the (last) comment, without the surrounding dashes.

$5 captures the MDC (markup declaration close), ">".
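
The grammar of a comment declaration can be written down as a small
regex. The sketch below follows the description above but is not the
module's own pattern, which may differ in detail; the names $declaration
and first_declaration are hypothetical.

```perl
use strict;
use warnings;

# Sketch only; the module's own pattern may differ in detail.
# "<!" then zero or more "-- text --" comments (each optionally followed
# by whitespace), then ">". The text of a comment may not contain "--".
my $declaration = qr/(<!(?:--(?:[^-]|-[^-])*--\s*)*>)/;

# Returns the first comment declaration in the text, or undef.
sub first_declaration {
    my ($text) = @_;
    return $text =~ $declaration ? $1 : undef;
}

my $sample = "<!-- First Comment --\n--> Second Comment <!--\n-- Third Comment -->";
print "whole sample is one declaration\n"
    if first_declaration($sample) eq $sample;
```

The sample is the three-comment declaration shown above: the "-->"
after the first line closes nothing, because the first ">" only appears
inside the second comment.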

Hugo
There are two types of comments in Hugo. They either start with "!"
(which cannot be followed by a "\"), or are nested comments,
delimited with "!\" and "\!". Under "{-keep}", only $1 will be
set, returning the entire comment. This pattern requires perl
5.6.0 or newer.

Icon
Icon has comments that start with "#" and end at the next new line.
See
<http://www.toolsofcomputing.com/IconHandbook/IconHandbook.pdf>,
<http://www.cs.arizona.edu/icon/index.htm>, and
<http://burks.bton.ac.uk/burks/language/icon/index.htm>.

ILLGOL
The esoteric language ILLGOL uses comments starting with NB and
lasting till the end of the line. See
<http://www.catseye.mb.ca/esoteric/illgol/index.html>.

INTERCAL
Comments in INTERCAL are single line comments. They start with one
of the keywords "NOT" or "N'T", and can optionally be preceded by
the keywords "DO" and "PLEASE". If both keywords are used, "PLEASE"
precedes "DO". Keywords are separated by whitespace.

J The language J uses comments that start with "NB.", and that last
till the end of the line. See
<http://www.jsoftware.com/books/help/primer/contents.htm>, and
<http://www.jsoftware.com/>.

Java
The Java language has two forms of comments: comments that start
with "//" and last till the end of the line, and comments that
start with "/*", and end with "*/". If "{-keep}" is used, only $1
will be set, and set to the entire comment.

JavaDoc
The Javadoc documentation syntax is demarcated with a subset of
ordinary Java comments to separate it from code. Comments start
with "/**" and end with "*/". If "{-keep}" is used, only $1 will be
set, and set to the entire comment. See
<http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html#format>.

JavaScript
The JavaScript language has two forms of comments: comments that
start with "//" and last till the end of the line, and comments
that start with "/*", and end with "*/". If "{-keep}" is used, only
$1 will be set, and set to the entire comment. JavaScript is
Netscape's implementation of ECMAScript. See
<http://www.mozilla.org/js/language/E262-3.pdf>, and
<http://www.mozilla.org/js/language/>.

LaTeX
The documentation language LaTeX uses comments starting with "%"
and ending at the end of the line.

Lisp
Comments in Lisp start with a semi-colon (";") and last till the
end of the line.

LPC The LPC language has comments starting with "/*" and ending with
"*/".

LOGO
Comments for the language LOGO start with ";", and last till the
end of the line.

lua Comments for the lua language start with "--", and last till the
end of the line. See also <http://www.lua.org/manual/manual.html>.

M, MUMPS
In "M" (aka "MUMPS"), comments start with a semi-colon, and last
till the end of a line. The language specification requires the
semi-colon to be preceded by one or more linestart characters.
Those characters default to a space, but that's configurable. This
requirement of preceding the comment with linestart characters is
not tested for. See
<ftp://ftp.intersys.com/pub/openm/ism/ism64docs.zip>,
<http://mtechnology.intersys.com/mproducts/openm/index.html>, and
<http://mcenter.com/mtrc/index.html>.

m4 By default, the preprocessor language m4 uses single line comments,
that start with a "#" and continue to the end of the line,
including the newline. The pattern $RE{comment}{m4} matches
such comments. In m4, it is possible to change the starting token
though. See
<http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf>,
<http://www.cs.stir.ac.uk/~kjt/research/pdf/expl-m4.pdf>, and
<http://www.gnu.org/software/m4/manual/>.

Modula-2
In "Modula-2", comments start with "(*", and end with "*)".
Comments may be nested. See <http://www.modula2.org/>.

Modula-3
In "Modula-3", comments start with "(*", and end with "*)".
Comments may be nested. See <http://www.m3.org/>.

mutt
Configuration files for mutt have comments starting with a "#" and
lasting the rest of the line.

Nickle
The Nickle language has one line comments starting with "#" (like
Perl), or multiline comments delimited by "/*" and "*/" (like C).
Under "{-keep}", only $1 will be set. See also
<http://www.nickle.org>.

Oberon
Comments in Oberon start with "(*" and end with "*)". See
<http://www.oberon.ethz.ch/oreport.html>.

Pascal
There are many implementations of Pascal. This module provides
patterns for comments of several implementations.

$RE{comment}{Pascal}
This is the pattern that recognizes comments according to the
Pascal ISO standard. This standard says that comments start
with either "{", or "(*", and end with "}" or "*)". This means
that "{*)" and "(*}" are considered to be comments. Many Pascal
applications don't allow this. See
<http://www.pascal-central.com/docs/iso10206.txt>.
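
The mixed-delimiter rule is easy to overlook, so here is a small sketch
of it. The pattern is hand-written for illustration and is not the
regex the module ships; the names $iso_pascal_comment and
first_iso_pascal_comment are hypothetical.

```perl
use strict;
use warnings;

# Illustration only: either opener may be paired with either closer, and
# the comment ends at the first closer of either kind.
my $iso_pascal_comment = qr/((?:\{|\(\*).*?(?:\}|\*\)))/s;

# Returns the first comment in the text, or undef.
sub first_iso_pascal_comment {
    my ($text) = @_;
    return $text =~ $iso_pascal_comment ? $1 : undef;
}

for my $text ("{ braces }", "(* mixed }", "{ also mixed *)", "begin end") {
    my $found = first_iso_pascal_comment($text);
    printf "%s\n", defined($found) ? "found: $found" : "no comment in: $text";
}
```

Note that "{ a *) b }" yields "{ a *)" under this rule, since the "*)"
closes the comment opened by "{".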

$RE{comment}{Pascal}{Alice}
The Alice Pascal compiler accepts comments that start with "{"
and end with "}". Comments are not allowed to contain newlines.
See <http://www.templetons.com/brad/alice/language/>.

$RE{comment}{Pascal}{Delphi}, $RE{comment}{Pascal}{Free} and
$RE{comment}{Pascal}{GPC}
The Delphi Pascal, Free Pascal and the GNU Pascal Compiler
implementations of Pascal all have comments that either start
with "//" and last till the end of the line, are delimited with
"{" and "}" or are delimited with "(*" and "*)". Patterns for
those comments are given by $RE{comment}{Pascal}{Delphi},
$RE{comment}{Pascal}{Free} and $RE{comment}{Pascal}{GPC}
respectively. These patterns only set $1 when "{-keep}" is
used, which will then include the entire comment.

See <http://info.borland.com/techpubs/delphi5/oplg/>,
<http://www.freepascal.org/docs-html/ref/ref.html> and
<http://www.gnu-pascal.de/gpc/>.

$RE{comment}{Pascal}{Workshop}
The Workshop Pascal compiler, from Sun Microsystems, allows
comments that are delimited with either "{" and "}", delimited
with "(*" and "*)", delimited with "/*" and "*/", or starting
and ending with a double quote ("""). When "{-keep}" is used,
only $1 is set, and returns the entire comment.

See <http://docs.sun.com/db/doc/802-5762>.

PEARL
Comments in PEARL start with a "!" and last till the end of the
line, or start with "/*" and end with "*/". With "{-keep}", $1 will
be set to the entire comment.

PHP Comments in PHP start with either "#" or "//" and last till the end
of the line, or are delimited by "/*" and "*/". With "{-keep}", $1
will be set to the entire comment.

PL/B
In PL/B, comments start with either "." or ";", and end with the
next newline. See <http://www.mmcctech.com/pl-b/plb-0010.htm>.

PL/I
The PL/I language has comments starting with "/*" and ending with
"*/".

PL/SQL
In PL/SQL, comments either start with "--" and run till the end of
the line, or start with "/*" and end with "*/".

Perl
Perl uses comments that start with a "#", and continue till the end
of the line.

Portia
The Portia programming language has comments that start with "//",
and last till the end of the line.

Python
Python uses comments that start with a "#", and continue till the
end of the line.

Q-BAL
Comments in the Q-BAL language start with "`" (a backtick), and
continue till the end of the line.

QML In "QML", comments start with "#" and last till the end of the
line. See <http://www.questionmark.com/uk/qml/overview.doc>.

R The statistical language R uses comments that start with a "#" and
end with the following new line. See <http://www.r-project.org/>.

REBOL
Comments for the REBOL language start with ";" and last till the
end of the line.

Ruby
Comments in Ruby start with "#" and last till the end of the line.

Scheme
Scheme comments start with ";", and last till the end of the line.
See <http://schemers.org/>.

shell
Comments in various shells start with a "#" and end at the end of
the line.

Shelta
The esoteric language Shelta uses comments that start and end with
a ";". See <http://www.catseye.mb.ca/esoteric/shelta/index.html>.

SLIDE
The SLIDE language has two forms of comments. First there is the
line comment, which starts with a "#" and includes the rest of the
line (just like Perl). Second, there is the multiline, nested
comment, which is delimited by "(*" and "*)". Under "{-keep}",
only $1 is set, and is set to the entire comment. See
<http://www.cs.berkeley.edu/~ug/slide/docs/slide/spec/spec_frame_intro.shtml>.

slrn
Configuration files for slrn have comments starting with a "%" and
lasting the rest of the line.

Smalltalk
Smalltalk uses comments that start and end with a double quote,
""".

SMITH
Comments in the SMITH language start with ";", and last till the
end of the line.

Squeak
In the Smalltalk variant Squeak, comments start and end with """.
Double quotes can appear inside comments by doubling them.
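
The difference between the two quoting rules shows up as soon as a
comment contains a doubled quote. Both patterns below are hand-written
sketches of the descriptions above, not the module's regexes, and the
names $smalltalk_comment, $squeak_comment and first_match are
hypothetical.

```perl
use strict;
use warnings;

# Sketches only, not the module's regexes.
# Plain Smalltalk reading: a comment ends at the next double quote.
my $smalltalk_comment = qr/("[^"]*")/;
# Squeak reading: a doubled quote inside a comment is one literal quote.
my $squeak_comment    = qr/("(?:[^"]|"")*")/;

# Returns the first match of the pattern in the text, or undef.
sub first_match {
    my ($pattern, $text) = @_;
    return $text =~ $pattern ? $1 : undef;
}

my $source = 'x := 3. "say ""hi"" now" y := 4.';
print "Smalltalk: ", first_match($smalltalk_comment, $source), "\n";
print "Squeak:    ", first_match($squeak_comment,    $source), "\n";
# prints: Smalltalk: "say "
# prints: Squeak:    "say ""hi"" now"
```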

SQL Standard SQL uses comments starting with two or more dashes, and
ending at the end of the line.

MySQL does not follow the standard. Instead, it allows comments
that start with a "#" or "-- " (that's two dashes and a space)
ending with the following newline, and comments starting with "/*",
and ending with the next ";" or "*/" that isn't inside single or
double quotes. A pattern for this is returned by
$RE{comment}{SQL}{MySQL}. With "{-keep}", only $1 will be set, and
it returns the entire comment.
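
The practical difference between the two line-comment rules is the
space after the dashes. The sketch below covers only the line-comment
forms (the MySQL "/*" form is left out), uses hand-written patterns
rather than the module's, and the names standard_sql_comment and
mysql_line_comment are hypothetical.

```perl
use strict;
use warnings;

# Hand-written illustrations, not the module's patterns.
my $standard_sql = qr/(--+[^\n]*\n)/;          # two or more dashes
my $mysql_line   = qr/((?:#|-- )[^\n]*\n)/;    # "#", or two dashes and a space

sub standard_sql_comment {
    my ($text) = @_;
    return $text =~ $standard_sql ? $1 : undef;
}

sub mysql_line_comment {
    my ($text) = @_;
    return $text =~ $mysql_line ? $1 : undef;
}

for my $query ("SELECT 1 --no space\n", "SELECT 1 -- spaced\n", "SELECT 1 # hash\n") {
    printf "standard: %-3s mysql: %-3s for %s",
        defined(standard_sql_comment($query)) ? "yes" : "no",
        defined(mysql_line_comment($query))   ? "yes" : "no",
        $query;
}
```

Only the standard rule treats "--no space" as a comment, and only the
MySQL rule accepts "#".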

Tcl In Tcl, comments start with "#" and continue till the end of the
line.

TeX The documentation language TeX uses comments starting with "%" and
ending at the end of the line.

troff
The document formatting language troff uses comments starting with
"\"", and continuing till the end of the line.

Ubercode
The Windows programming language Ubercode uses comments that start
with "//" and continue to the end of the line. See
<http://www.ubercode.com>.

vi In configuration files for the editor vi, one can use comments
starting with """, and ending at the end of the line.

*W In the language *W, comments start with "||", and end with "!!".

zonefile
Comments in DNS zonefiles start with ";", and continue till the end
of the line.

ZZT-OOP
The in-game language ZZT-OOP uses comments that start with a "'"
character, and end at the following newline. See
<http://dave2.rocketjump.org/rad/zzthelp/lang.html>.

REFERENCES
[Go 90]
Charles F. Goldfarb: The SGML Handbook. Oxford: Oxford University
Press. 1990. ISBN 0-19-853737-9. Ch. 10.3, pp 390-391.

SEE ALSO
Regexp::Common for a general description of how to use this interface.

AUTHOR
Damian Conway (damian@conway.org)

MAINTENANCE
This package is maintained by Abigail (regexp-common@abigail.be).

BUGS AND IRRITATIONS
Bound to be plenty.

For a start, there are many common regexes missing. Send them in to
regexp-common@abigail.be.

LICENSE and COPYRIGHT
This software is Copyright (c) 2001 - 2017, Damian Conway and Abigail.

This module is free software, and may be used under any of the following
licenses:

1) The Perl Artistic License. See the file COPYRIGHT.AL.
2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2.
3) The BSD License. See the file COPYRIGHT.BSD.
4) The MIT License. See the file COPYRIGHT.MIT.

perl v5.34.0   2021-07-22   Regexp::Common::comment(3)