AnyEvent::Intro(3)    User Contributed Perl Documentation    AnyEvent::Intro(3)

AnyEvent::Intro - an introductory tutorial to AnyEvent
7
9 This is a tutorial that will introduce you to the features of AnyEvent.
10
The first part introduces the core AnyEvent module (after swamping you
with a bit of evangelism), which might already provide all you ever
need: if you are only interested in AnyEvent's event handling
capabilities, read no further.

The second part focuses on network programming using sockets, for which
AnyEvent offers a lot of support, including many workarounds for
portability quirks.
19
21 If you don't care for the whys and want to see code, skip this section!
22
23 AnyEvent is first of all just a framework to do event-based
24 programming. Typically such frameworks are an all-or-nothing thing: If
25 you use one such framework, you can't (easily, or even at all) use
26 another in the same program.
27
28 AnyEvent is different - it is a thin abstraction layer on top of other
29 event loops, just like DBI is an abstraction of many different database
30 APIs. Its main purpose is to move the choice of the underlying
31 framework (the event loop) from the module author to the program author
32 using the module.
33
That means you can write code that uses events to control what it does,
without forcing other code in the same program to use the same
underlying framework as you do - i.e. you can create a Perl module that
is event-based using AnyEvent, and users of that module can still
choose between using Gtk2, Tk, Event (or run inside Irssi or
rxvt-unicode) or any other supported event loop. AnyEvent even comes
with its own pure-perl event loop implementation, so your code works
regardless of other modules that might or might not be installed. The
latter is important, as AnyEvent does not have any hard dependencies on
other modules, which makes it easy to install, for example, when you
lack a C compiler. Whatever the environment, AnyEvent will just cope
with it.
46
47 A typical limitation of existing Perl modules such as Net::IRC is that
48 they come with their own event loop: In Net::IRC, a program which uses
49 it needs to start the event loop of Net::IRC. That means that one
50 cannot integrate this module into a Gtk2 GUI for instance, as that
51 module, too, enforces the use of its own event loop (namely Glib).
52
Another example is LWP: it provides no event interface at all. It's a
purely blocking HTTP (and FTP etc.) client library, which usually means
that you either have to start another process or have to fork for an
HTTP request, or use threads (e.g. Coro::LWP), if you want to do
something else while waiting for the request to finish.
58
The motivation behind these designs is often that a module doesn't want
to depend on some complicated XS-module (Net::IRC), or that it doesn't
want to force the user to use some specific event loop at all (LWP),
out of fear of severely limiting the usefulness of the module: If your
module requires Glib, it will not run in a Tk program.
64
AnyEvent solves this dilemma by not forcing module authors to either:

- write their own event loop (because it guarantees the availability of
  an event loop everywhere - even on Windows with no extra modules
  installed).
- choose one specific event loop (because AnyEvent works with most
  event loops available for Perl).
72
73 If the module author uses AnyEvent for all his (or her) event needs (IO
74 events, timers, signals, ...) then all other modules can just use his
75 module and don't have to choose an event loop or adapt to his event
76 loop. The choice of the event loop is ultimately made by the program
77 author who uses all the modules and writes the main program. And even
78 there he doesn't have to choose, he can just let AnyEvent choose the
79 most efficient event loop available on the system.
80
81 Read more about this in the main documentation of the AnyEvent module.
82
84 So what exactly is programming using events? It quite simply means that
85 instead of your code actively waiting for something, such as the user
86 entering something on STDIN:
87
   $| = 1; print "enter your name> ";

   my $name = <STDIN>;
91
92 You instead tell your event framework to notify you in the event of
93 some data being available on STDIN, by using a callback mechanism:
94
   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name;

   my $wait_for_input = AnyEvent->io (
      fh   => \*STDIN, # which file handle to check
      poll => "r",     # which event to wait for ("r"ead data)
      cb   => sub {    # what callback to execute
         $name = <STDIN>; # read it
      }
   );

   # do something else here
110
This looks more complicated, and it is, but the advantage of using
events is that your program can do something else instead of waiting
for input (side note: combining AnyEvent with a thread package such as
Coro can recoup much of the simplicity, effectively getting the best of
two worlds).
116
117 Waiting as done in the first example is also called "blocking" the
118 process because you "block"/keep your process from executing anything
119 else while you do so.
120
121 The second example avoids blocking by only registering interest in a
122 read event, which is fast and doesn't block your process. The callback
123 will be called only when data is available and can be read without
124 blocking.
125
126 The "interest" is represented by an object returned by "AnyEvent->io"
127 called a "watcher" object - thus named because it "watches" your file
128 handle (or other event sources) for the event you are interested in.
129
130 In the example above, we create an I/O watcher by calling the
131 "AnyEvent->io" method. A lack of further interest in some event is
132 expressed by simply forgetting about its watcher, for example by
133 "undef"-ing the only variable it is stored in. AnyEvent will
134 automatically clean up the watcher if it is no longer used, much like
135 Perl closes your file handles if you no longer use them anywhere.
136
137 A short note on callbacks
138
139 A common issue that hits people is the problem of passing parameters to
140 callbacks. Programmers used to languages such as C or C++ are often
141 used to a style where one passes the address of a function (a function
142 reference) and some data value, e.g.:
143
   sub callback {
      my ($arg) = @_;

      $arg->method;
   }

   my $arg = ...;

   call_me_back_later \&callback, $arg;
153
154 This is clumsy, as the place where behaviour is specified (when the
155 callback is registered) is often far away from the place where
156 behaviour is implemented. It also doesn't use Perl syntax to invoke the
157 code. There is also an abstraction penalty to pay as one has to name
158 the callback, which often is unnecessary and leads to nonsensical or
159 duplicated names.
160
161 In Perl, one can specify behaviour much more directly by using
162 closures. Closures are code blocks that take a reference to the
163 enclosing scope(s) when they are created. This means lexical variables
164 in scope when a closure is created can be used inside the closure:
165
   my $arg = ...;

   call_me_back_later sub { $arg->method };
169
170 Under most circumstances, closures are faster, use fewer resources and
171 result in much clearer code than the traditional approach. Faster,
172 because parameter passing and storing them in local variables in Perl
173 is relatively slow. Fewer resources, because closures take references
174 to existing variables without having to create new ones, and clearer
175 code because it is immediately obvious that the second example calls
176 the "method" method when the callback is invoked.
177
178 Apart from these, the strongest argument for using closures with
179 AnyEvent is that AnyEvent does not allow passing parameters to the
180 callback, so closures are the only way to achieve that in most cases
181 :->
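
As a minimal sketch of the closure style, the following uses a
hypothetical "call_me_back_later" function (a stand-in for an event
framework, not part of AnyEvent) that merely stores callbacks and later
invokes them without any parameters. Each closure still has access to
the $name it captured when it was created:

```perl
use strict;
use warnings;

# Hypothetical stand-in for an event framework: it only stores callbacks
# and runs them later, without passing them any arguments.
my @pending;

sub call_me_back_later {
   my ($cb) = @_;
   push @pending, $cb;
}

for my $name (qw(alice bob)) {
   # each closure captures its own $name from the enclosing scope
   call_me_back_later (sub { "hello, $name" });
}

# "later": invoke every stored callback with no parameters at all
my @greetings = map { $_->() } @pending;

print "$_\n" for @greetings;
```

No data value had to be threaded through "call_me_back_later" - the
closure carries it along.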
182
183 A little hint to catch mistakes
184
AnyEvent does not check the parameters you pass in, at least not by
default. To enable checking, simply start your program with
"AE_STRICT=1" in the environment, or put "use AnyEvent::Strict" near
the top of your program:

   AE_STRICT=1 perl myprogram
191
192 You can find more info on this and additional debugging aids later in
193 this introduction.
194
195 Condition Variables
196 Back to the I/O watcher example: The code is not yet a fully working
197 program, and will not work as-is. The reason is that your callback will
198 not be invoked out of the blue; you have to run the event loop first.
199 Also, event-based programs need to block sometimes too, such as when
200 there is nothing to do, and everything is waiting for new events to
201 arrive.
202
203 In AnyEvent, this is done using condition variables. Condition
204 variables are named "condition variables" because they represent a
205 condition that is initially false and needs to be fulfilled.
206
207 You can also call them "merge points", "sync points", "rendezvous
208 ports" or even callbacks and many other things (and they are often
209 called these names in other frameworks). The important point is that
210 you can create them freely and later wait for them to become true.
211
212 Condition variables have two sides - one side is the "producer" of the
213 condition (whatever code detects and flags the condition), the other
214 side is the "consumer" (the code that waits for that condition).
215
216 In our example in the previous section, the producer is the event
217 callback and there is no consumer yet - let's change that right now:
218
   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name;

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh   => \*STDIN,
      poll => "r",
      cb   => sub {
         $name = <STDIN>;
         $name_ready->send;
      }
   );

   # do something else here

   # now wait until the name is available:
   $name_ready->recv;

   undef $wait_for_input; # watcher no longer needed

   print "your name is $name\n";
244
245 This program creates an AnyEvent condvar by calling the
246 "AnyEvent->condvar" method. It then creates a watcher as usual, but
247 inside the callback it "send"s the $name_ready condition variable,
248 which causes whoever is waiting on it to continue.
249
250 The "whoever" in this case is the code that follows, which calls
251 "$name_ready->recv": The producer calls "send", the consumer calls
252 "recv".
253
254 If there is no $name available yet, then the call to
255 "$name_ready->recv" will halt your program until the condition becomes
256 true.
257
258 As the names "send" and "recv" imply, you can actually send and receive
259 data using this, for example, the above code could also be written like
260 this, without an extra variable to store the name in:
261
   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh => \*STDIN, poll => "r",
      cb => sub { $name_ready->send (scalar <STDIN>) }
   );

   # do something else here

   # now wait and fetch the name
   my $name = $name_ready->recv;

   undef $wait_for_input; # watcher no longer needed

   print "your name is $name\n";
281
282 You can pass any number of arguments to "send", and every subsequent
283 call to "recv" will return them.
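
A small sketch of sending several values at once (the values
themselves are made up for illustration). As far as we can tell, "recv"
returns all values in list context and only the first one in scalar
context:

```perl
use strict;
use warnings;
use AnyEvent;

my $ready = AnyEvent->condvar;

# producer: hand over several values at once
$ready->send ("larry", 42, "perl");

# consumer: the condition is already true, so recv returns immediately
my @all   = $ready->recv;   # list context: all values
my $first = $ready->recv;   # scalar context: the first value only

print "all: @all, first: $first\n";
```

Note that calling "send" before anybody waits is perfectly fine - the
condition variable simply remembers the values.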
284
285 The "main loop"
286 Most event-based frameworks have something called a "main loop" or
287 "event loop run function" or something similar.
288
Just like "recv" in AnyEvent, these functions need to be called
eventually so that your event loop has a chance of actually looking for
the events you are interested in.
292
293 For example, in a Gtk2 program, the above example could also be written
294 like this:
295
   use Gtk2 -init;
   use AnyEvent;

   ############################################
   # create a window and some label

   my $window = new Gtk2::Window "toplevel";
   $window->add (my $label = new Gtk2::Label "soon replaced by name");

   $window->show_all;

   ############################################
   # do our AnyEvent stuff

   $| = 1; print "enter your name> ";

   my $wait_for_input = AnyEvent->io (
      fh => \*STDIN, poll => "r",
      cb => sub {
         # set the label
         $label->set_text (scalar <STDIN>);
         print "enter another name> ";
      }
   );

   ############################################
   # Now enter Gtk2's event loop

   main Gtk2;
325
326 No condition variable anywhere in sight - instead, we just read a line
327 from STDIN and replace the text in the label. In fact, since nobody
328 "undef"s $wait_for_input you can enter multiple lines.
329
330 Instead of waiting for a condition variable, the program enters the
331 Gtk2 main loop by calling "Gtk2->main", which will block the program
332 and wait for events to arrive.
333
334 This also shows that AnyEvent is quite flexible - you didn't have to do
335 anything to make the AnyEvent watcher use Gtk2 (actually Glib) - it
336 just worked.
337
338 Admittedly, the example is a bit silly - who would want to read names
339 from standard input in a Gtk+ application? But imagine that instead of
340 doing that, you make an HTTP request in the background and display its
341 results. In fact, with event-based programming you can make many HTTP
342 requests in parallel in your program and still provide feedback to the
343 user and stay interactive.
344
345 And in the next part you will see how to do just that - by implementing
346 an HTTP request, on our own, with the utility modules AnyEvent comes
347 with.
348
349 Before that, however, let's briefly look at how you would write your
350 program using only AnyEvent, without ever calling some other event
351 loop's run function.
352
353 In the example using condition variables, we used those to start
354 waiting for events, and in fact, condition variables are the solution:
355
   my $quit_program = AnyEvent->condvar;

   # create AnyEvent watchers (or not) here

   $quit_program->recv;
361
362 If any of your watcher callbacks decide to quit (this is often called
363 an "unloop" in other frameworks), they can just call
364 "$quit_program->send". Of course, they could also decide not to and
365 call "exit" instead, or they could decide never to quit (e.g. in a
366 long-running daemon program).
367
368 If you don't need some clean quit functionality and just want to run
369 the event loop, you can do this:
370
   AnyEvent->condvar->recv;
372
373 And this is, in fact, the closest to the idea of a main loop run
374 function that AnyEvent offers.
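
To make the "quit" idea concrete, here is a small self-contained
sketch. It assumes a platform where pipes can be watched for
readability (so probably not Windows), and the "quit" command and the
reason string are invented for this example. An I/O watcher reads a
command from a pipe and tells the main program to stop:

```perl
use strict;
use warnings;
use AnyEvent;

# a pipe stands in for whatever delivers commands to the program
pipe (my $reader, my $writer) or die "pipe failed: $!";
syswrite $writer, "quit\n";

my $quit_program = AnyEvent->condvar;

my $command_watcher = AnyEvent->io (
   fh   => $reader,
   poll => "r",
   cb   => sub {
      my $len = sysread $reader, my $line, 128;

      # the watcher callback decides that it is time to leave
      $quit_program->send ("asked to quit")
         if $len && $line =~ /^quit/;
   },
);

# this is our "main loop": handle events until somebody calls send
my $reason = $quit_program->recv;

undef $command_watcher;

print "leaving: $reason\n";
```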
375
376 Timers and other event sources
So far, we have used only I/O watchers. These are useful mainly to find
out whether a socket has data to read, or space to write more data. On
sane operating systems this also works for console windows/terminals
(typically on standard input), serial lines, all sorts of other
devices, basically almost everything that has a file descriptor but
isn't a file itself. (As usual, "sane" excludes Windows - on that
platform you would need different functions for all of these,
complicating code immensely - think "socket only" on Windows).
385
386 However, I/O is not everything - the second most important event source
387 is the clock. For example when doing an HTTP request you might want to
388 time out when the server doesn't answer within some predefined amount
389 of time.
390
391 In AnyEvent, timer event watchers are created by calling the
392 "AnyEvent->timer" method:
393
   use AnyEvent;

   my $cv = AnyEvent->condvar;

   my $wait_one_and_a_half_seconds = AnyEvent->timer (
      after => 1.5, # after how many seconds to invoke the cb?
      cb    => sub { # the callback to invoke
         $cv->send;
      },
   );

   # can do something else here

   # now wait till our time has come
   $cv->recv;
409
Unlike I/O watchers, timers are only interested in the number of
seconds they have to wait. When (at least) that amount of time has
passed, AnyEvent will invoke your callback.
413
414 Unlike I/O watchers, which will call your callback as many times as
415 there is data available, timers are normally one-shot: after they have
416 "fired" once and invoked your callback, they are dead and no longer do
417 anything.
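
The timeout scenario mentioned earlier can be sketched by combining an
I/O watcher and a one-shot timer that both feed the same condition
variable - whichever fires first decides the outcome. This is only an
illustration: the pipe (which nobody ever writes to) stands in for a
slow server, and the 0.2 second limit is arbitrary:

```perl
use strict;
use warnings;
use AnyEvent;

# nothing is ever written to this pipe, so it never becomes readable
pipe (my $reader, my $writer) or die "pipe failed: $!";

my $outcome = AnyEvent->condvar;

# would report data, if there ever were any
my $io = AnyEvent->io (
   fh   => $reader,
   poll => "r",
   cb   => sub { $outcome->send ("data") },
);

# one-shot timer: gives up after 0.2 seconds
my $timeout = AnyEvent->timer (
   after => 0.2,
   cb    => sub { $outcome->send ("timeout") },
);

my $result = $outcome->recv;

# neither watcher is needed anymore
undef $io;
undef $timeout;

print "$result\n";
```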
418
419 To get a repeating timer, such as a timer firing roughly once per
420 second, you can specify an "interval" parameter:
421
   my $once_per_second = AnyEvent->timer (
      after    => 0, # first invoke ASAP
      interval => 1, # then invoke every second
      cb       => sub { # the callback to invoke
         $cv->send;
      },
   );
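
A repeating timer keeps firing until its watcher is destroyed. The
following sketch (with an arbitrarily short interval and tick count, so
it finishes quickly) counts three ticks and then stops the timer from
inside its own callback, using the two-statement "my $ticker; $ticker
= ..." idiom that is explained in the finger client section:

```perl
use strict;
use warnings;
use AnyEvent;

my $done  = AnyEvent->condvar;
my $ticks = 0;

my $ticker; $ticker = AnyEvent->timer (
   after    => 0,    # first tick as soon as possible
   interval => 0.05, # then every 50ms
   cb       => sub {
      return if ++$ticks < 3;

      undef $ticker;        # stop the repeating timer
      $done->send ($ticks); # and report how often it fired
   },
);

my $count = $done->recv;

print "timer fired $count times\n";
```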
429
430 More esoteric sources
431
432 AnyEvent also has some other, more esoteric event sources you can tap
433 into: signal, child and idle watchers.
434
435 Signal watchers can be used to wait for "signal events", which means
436 your process was sent a signal (such as "SIGTERM" or "SIGUSR1").
437
438 Child-process watchers wait for a child process to exit. They are
439 useful when you fork a separate process and need to know when it exits,
440 but you do not want to wait for that by blocking.
441
442 Idle watchers invoke their callback when the event loop has handled all
443 outstanding events, polled for new events and didn't find any, i.e.,
444 when your process is otherwise idle. They are useful if you want to do
445 some non-trivial data processing that can be done when your program
446 doesn't have anything better to do.
447
448 All these watcher types are described in detail in the main AnyEvent
449 manual page.
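
As a hedged sketch of a child-process watcher (assuming a platform
with a working "fork"; the exit code 5 is arbitrary): the callback
receives the process ID and the exit status as returned by "waitpid".
The main AnyEvent documentation recommends initialising the event loop
before forking, which is why "AnyEvent::detect" is called first:

```perl
use strict;
use warnings;
use AnyEvent;
use POSIX ();

AnyEvent::detect ();   # choose the event loop before forking

my $done = AnyEvent->condvar;

my $pid = fork;
die "fork failed: $!" unless defined $pid;

# child: leave immediately, without running any cleanup code
POSIX::_exit (5) unless $pid;

# parent: get notified when the child exits, without blocking in waitpid
my $watcher = AnyEvent->child (
   pid => $pid,
   cb  => sub {
      my ($pid, $status) = @_;
      $done->send ($status >> 8);   # extract the exit code
   },
);

my $exit_code = $done->recv;

print "child exited with code $exit_code\n";
```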
450
451 Sometimes you also need to know what the current time is:
452 "AnyEvent->now" returns the time the event toolkit uses to schedule
453 relative timers, and is usually what you want. It is often cached
454 (which means it can be a bit outdated). In that case, you can use the
455 more costly "AnyEvent->time" method which will ask your operating
456 system for the current time, which is slower, but also more up to date.
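
A short sketch of the two clocks. It additionally uses
"AnyEvent->now_update", which, to our knowledge, is available in
reasonably recent AnyEvent versions and asks the event loop to refresh
its cached time; treat that call as an assumption about your installed
version:

```perl
use strict;
use warnings;
use AnyEvent;

my $cached = AnyEvent->now;    # the event loop's idea of "now", possibly cached
my $wall   = AnyEvent->time;   # asks the operating system, always current

AnyEvent->now_update;          # ask the event loop to refresh its cached time
my $fresh = AnyEvent->now;

printf "cached %.3f, wall clock %.3f, after update %.3f\n",
   $cached, $wall, $fresh;
```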
457
459 So far you have seen how to register event watchers and handle events.
460
461 This is a great foundation to write network clients and servers, and
462 might be all that your module (or program) ever requires, but writing
463 your own I/O buffering again and again becomes tedious, not to mention
464 that it attracts errors.
465
466 While the core AnyEvent module is still small and self-contained, the
467 distribution comes with some very useful utility modules such as
468 AnyEvent::Handle, AnyEvent::DNS and AnyEvent::Socket. These can make
469 your life as a non-blocking network programmer a lot easier.
470
471 Here is a quick overview of these three modules:
472
473 AnyEvent::DNS
474 This module allows fully asynchronous DNS resolution. It is used mainly
475 by AnyEvent::Socket to resolve hostnames and service ports for you, but
476 is a great way to do other DNS resolution tasks, such as reverse
477 lookups of IP addresses for log files.
478
479 AnyEvent::Handle
480 This module handles non-blocking IO on (socket-, pipe- etc.) file
481 handles in an event based manner. It provides a wrapper object around
482 your file handle that provides queueing and buffering of incoming and
483 outgoing data for you.
484
485 It also implements the most common data formats, such as text lines, or
486 fixed and variable-width data blocks.
487
488 AnyEvent::Socket
This module provides you with functions that handle socket creation and
IP address magic. The two main functions are "tcp_connect" and
"tcp_server". The former will connect a (streaming) socket to an
internet host for you and the latter will make a server socket for you,
to accept connections.

This module also comes with transparent IPv6 support, which means: if
you write your programs with this module, you will be IPv6 ready
without doing anything special.

It also works around a lot of portability quirks (especially on the
Windows platform), which makes it even easier to write your programs in
a portable way (did you know that Windows uses different error codes
for all socket functions and that Perl does not know about these? That
"Unknown error 10022" (which is "WSAEINVAL") can mean that our
"connect" call was successful? That unsuccessful TCP connects might
never be reported back to your program? That "WSAEINPROGRESS" means
your "connect" call was ignored instead of being in progress?
AnyEvent::Socket works around all of these Windows/Perl bugs for you).
508
509 Implementing a parallel finger client with non-blocking connects and
510 AnyEvent::Socket
511 The finger protocol is one of the simplest protocols in use on the
512 internet. Or in use in the past, as almost nobody uses it anymore.
513
514 It works by connecting to the finger port on another host, writing a
515 single line with a user name and then reading the finger response, as
516 specified by that user. OK, RFC 1288 specifies a vastly more complex
517 protocol, but it basically boils down to this:
518
   # telnet freebsd.org finger
   Trying 8.8.178.135...
   Connected to freebsd.org (8.8.178.135).
   Escape character is '^]'.
   larry
   Login: lile                             Name: Larry Lile
   Directory: /home/lile                   Shell: /usr/local/bin/bash
   No Mail.
   Mail forwarded to: lile@stdio.com
   No Plan.
529
530 So let's write a little AnyEvent function that makes a finger request:
531
   use AnyEvent;
   use AnyEvent::Socket;

   sub finger($$) {
      my ($user, $host) = @_;

      # use a condvar to return results
      my $cv = AnyEvent->condvar;

      # first, connect to the host
      tcp_connect $host, "finger", sub {
         # the callback receives the socket handle - or nothing
         my ($fh) = @_
            or return $cv->send;

         # now write the username
         syswrite $fh, "$user\015\012";

         my $response;

         # register a read watcher
         my $read_watcher; $read_watcher = AnyEvent->io (
            fh   => $fh,
            poll => "r",
            cb   => sub {
               my $len = sysread $fh, $response, 1024, length $response;

               if ($len <= 0) {
                  # we are done, or an error occurred, let's ignore the latter
                  undef $read_watcher; # no longer interested
                  $cv->send ($response); # send results
               }
            },
         );
      };

      # pass $cv to the caller
      $cv
   }
571
572 That's a mouthful! Let's dissect this function a bit, first the overall
573 function and execution flow:
574
   sub finger($$) {
      my ($user, $host) = @_;

      # use a condvar to return results
      my $cv = AnyEvent->condvar;

      # first, connect to the host
      tcp_connect $host, "finger", sub {
         ...
      };

      $cv
   }
588
589 This isn't too complicated, just a function with two parameters that
590 creates a condition variable $cv, initiates a TCP connect to $host, and
591 returns $cv. The caller can use the returned $cv to receive the finger
592 response, but one could equally well pass a third argument, a callback,
593 to the function.
594
595 Since we are programming event'ish, we do not wait for the connect to
596 finish - it could block the program for a minute or longer!
597
598 Instead, we pass "tcp_connect" a callback to invoke when the connect is
599 done. The callback is called with the socket handle as its first
600 argument if the connect succeeds, and no arguments otherwise. The
601 important point is that it will always be called as soon as the outcome
602 of the TCP connect is known.
603
604 This style of programming is also called "continuation style": the
605 "continuation" is simply the way the program continues - normally at
606 the next line after some statement (the exception is loops or things
607 like "return"). When we are interested in events, however, we instead
608 specify the "continuation" of our program by passing a closure, which
609 makes that closure the "continuation" of the program.
610
611 The "tcp_connect" call is like saying "return now, and when the
612 connection is established or the attempt failed, continue there".
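
Since finger servers are hard to find these days, the following sketch
tries out "tcp_connect" against a throwaway server on the local machine
instead. It assumes that the loopback interface is usable and relies on
"tcp_server" (the other main function of AnyEvent::Socket) picking a
free port when none is given; the greeting text and the outcome strings
are invented for this example:

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Socket;

my $port;

# a throwaway local server: greet every client, then hang up
my $server = tcp_server "127.0.0.1", undef, sub {
   my ($fh) = @_;
   syswrite $fh, "hello\015\012";
   # $fh goes out of scope here, which closes the connection
}, sub {
   # the "prepare" callback tells us which port was chosen
   my ($fh, $thishost, $thisport) = @_;
   $port = $thisport;
   0 # use the default listen queue length
};

my $done = AnyEvent->condvar;

# returns immediately - the closure is the continuation
tcp_connect "127.0.0.1", $port, sub {
   my ($fh) = @_
      or return $done->send ("connect failed: $!");

   $done->send ("connected");
};

# other work could happen here while the connect is in progress

my $outcome = $done->recv;

print "$outcome\n";
```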
613
614 Now let's look at the callback/closure in more detail:
615
616 # the callback receives the socket handle - or nothing
   # the callback receives the socket handle - or nothing
   my ($fh) = @_
      or return $cv->send;
619
620 The first thing the callback does is to save the socket handle in $fh.
621 When there was an error (no arguments), then our instinct as expert
622 Perl programmers would tell us to "die":
623
   my ($fh) = @_
      or die "$host: $!";
626
627 While this would give good feedback to the user (if he happens to watch
628 standard error), our program would probably stop working here, as we
629 never report the results to anybody, certainly not the caller of our
630 "finger" function, and most event loops continue even after a "die"!
631
632 This is why we instead "return", but also call "$cv->send" without any
633 arguments to signal to the condvar consumer that something bad has
634 happened. The return value of "$cv->send" is irrelevant, as is the
635 return value of our callback. The "return" statement is used for the
636 side effect of, well, returning immediately from the callback.
637 Checking for errors and handling them this way is very common, which is
638 why this compact idiom is so handy.
639
640 As the next step in the finger protocol, we send the username to the
641 finger daemon on the other side of our connection (the kernel.org
642 finger service doesn't actually wait for a username, but the net is
643 running out of finger servers fast):
644
   syswrite $fh, "$user\015\012";
646
647 Note that this isn't 100% clean socket programming - the socket could,
648 for whatever reasons, not accept our data. When writing a small amount
649 of data like in this example it doesn't matter, as a socket buffer is
650 almost always big enough for a mere "username", but for real-world
651 cases you might need to implement some kind of write buffering - or use
652 AnyEvent::Handle, which handles these matters for you, as shown in the
653 next section.
654
655 What we do have to do is implement our own read buffer - the response
656 data could arrive late or in multiple chunks, and we cannot just wait
657 for it (event-based programming, you know?).
658
659 To do that, we register a read watcher on the socket which waits for
660 data:
661
   my $read_watcher; $read_watcher = AnyEvent->io (
      fh   => $fh,
      poll => "r",
665
666 There is a trick here, however: the read watcher isn't stored in a
667 global variable, but in a local one - if the callback returns, it would
668 normally destroy the variable and its contents, which would in turn
669 unregister our watcher.
670
671 To avoid that, we refer to the watcher variable in the watcher
672 callback. This means that, when the "tcp_connect" callback returns,
673 perl thinks (quite correctly) that the read watcher is still in use -
674 namely inside the inner callback - and thus keeps it alive even if
675 nothing else in the program refers to it anymore (it is much like Baron
676 Münchhausen keeping himself from dying by pulling himself out of a
677 swamp).
678
679 The trick, however, is that instead of:
680
   my $read_watcher = AnyEvent->io (...
682
683 The program does:
684
   my $read_watcher; $read_watcher = AnyEvent->io (...
686
The reason for this is a quirk in the way Perl works: variable names
declared with "my" are only visible in the next statement. If the whole
"AnyEvent->io" call, including the callback, were done in a single
statement, the callback could not refer to the $read_watcher variable
to "undef"ine it, so it is done in two statements.
692
693 Whether you'd want to format it like this is of course a matter of
694 style. This way emphasizes that the declaration and assignment really
695 are one logical statement.
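
The quirk can be observed without any event loop at all. This sketch
compiles both variants with string "eval" (the variable names are
chosen only for this example): under "use strict", the one-statement
form does not even compile, because inside the closure $w does not yet
refer to the new variable:

```perl
use strict;
use warnings;

# one statement: "my $w" is not yet visible inside the closure,
# so strict refuses to compile the code
my $one_statement = eval q{
   my $w = sub { defined $w ? "visible" : "hidden" };
   $w->()
};
my $error = $@;

# two statements: the closure refers to the variable declared before it
my $two_statements = eval q{
   my $w; $w = sub { defined $w ? "visible" : "hidden" };
   $w->()
};

print "one statement:  ", (defined $one_statement ? $one_statement : "compile error"), "\n";
print "two statements: $two_statements\n";
```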
696
The callback itself calls "sysread" as many times as necessary, until
"sysread" returns either an error or end-of-file:
699
   cb => sub {
      my $len = sysread $fh, $response, 1024, length $response;

      if ($len <= 0) {
704
705 Note that "sysread" has the ability to append data it reads to a scalar
706 if we specify an offset, a feature which we make use of in this
707 example.
708
709 When "sysread" indicates we are done, the callback "undef"ines the
710 watcher and then "send"s the response data to the condition variable.
711 All this has the following effects:
712
713 Undefining the watcher destroys it, as our callback was the only one
714 still having a reference to it. When the watcher gets destroyed, it
715 destroys the callback, which in turn means the $fh handle is no longer
716 used, so that gets destroyed as well. The result is that all resources
717 will be nicely cleaned up by perl for us.
718
719 Using the finger client
720
721 Now, we could probably write the same finger client in a simpler way if
722 we used "IO::Socket::INET", ignored the problem of multiple hosts and
723 ignored IPv6 and a few other things that "tcp_connect" handles for us.
724
725 But the main advantage is that we can not only run this finger function
726 in the background, we even can run multiple sessions in parallel, like
727 this:
728
   my $f1 = finger "kuriyama", "freebsd.org";
   my $f2 = finger "icculus?listarchives=1", "icculus.org";
   my $f3 = finger "mikachu", "icculus.org";

   print "kuriyama's gpg key\n"   , $f1->recv, "\n";
   print "icculus' plan archive\n", $f2->recv, "\n";
   print "mikachu's plan zomgn\n" , $f3->recv, "\n";
736
It doesn't look like it, but in fact all three requests run in
parallel. The code waits for the first finger request to finish first,
but that doesn't keep it from executing them in parallel: when the
first "recv" call sees that the data isn't ready yet, it serves events
for all three requests automatically, until the first request has
finished.
742
743 The second "recv" call might either find the data is already there, or
744 it will continue handling events until that is the case, and so on.
745
By taking advantage of network latencies, which allow us to serve
other requests and events while we wait for an event on one socket, the
overall time to do these three requests will be greatly reduced:
typically all three are done in the same time as the slowest of the
three requests.
751
752 By the way, you do not actually have to wait in the "recv" method on an
753 AnyEvent condition variable - after all, waiting is evil - you can also
754 register a callback:
755
   $f1->cb (sub {
      my $response = shift->recv;
      # ...
   });
760
761 The callback will be invoked only when "send" is called. In fact,
762 instead of returning a condition variable you could also pass a third
763 parameter to your finger function, the callback to invoke with the
764 response:
765
   sub finger($$$) {
      my ($user, $host, $cb) = @_;
768
769 How you implement it is a matter of taste - if you expect your function
770 to be used mainly in an event-based program you would normally prefer
771 to pass a callback directly. If you write a module and expect your
772 users to use it "synchronously" often (for example, a simple http-get
773 script would not really care much for events), then you would use a
774 condition variable and tell them "simply "->recv" the data".
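
Both styles can be tried without a network connection. In this sketch,
"delayed_answer" is a hypothetical producer (not part of AnyEvent) that
returns a condition variable and fulfils it a little later via a timer;
the consumer registers a callback with "cb" instead of blocking in
"recv":

```perl
use strict;
use warnings;
use AnyEvent;

# hypothetical producer: returns a condvar that receives $value a bit later
sub delayed_answer {
   my ($value) = @_;

   my $cv = AnyEvent->condvar;

   my $timer; $timer = AnyEvent->timer (
      after => 0.05,
      cb    => sub {
         undef $timer;
         $cv->send ($value);
      },
   );

   $cv
}

my $finished = AnyEvent->condvar;
my $seen;

# callback style: no waiting at this point
delayed_answer ("42")->cb (sub {
   $seen = shift->recv; # does not block, the value has already been sent
   $finished->send;
});

# the main program still has to run the event loop somewhere
$finished->recv;

print "callback received $seen\n";
```

A caller who prefers the "synchronous" style would simply write
"my $value = delayed_answer ("42")->recv" instead.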
775
776 Problems with the implementation and how to fix them
777
778 To make this example more real-world-ready, we would not only implement
779 some write buffering (for the paranoid, or maybe denial-of-service
780 aware security expert), but we would also have to handle timeouts and
781 maybe protocol errors.
782
783 Doing this quickly gets unwieldy, which is why we introduce
784 AnyEvent::Handle in the next section, which takes care of all these
785 details for you and lets you concentrate on the actual protocol.
786
787 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle
788 The AnyEvent::Handle module has been hyped quite a bit in this document
789 so far, so let's see what it really offers.
790
791 As finger is such a simple protocol, let's try something slightly more
792 complicated: HTTP/1.0.
793
794 An HTTP GET request works by sending a single request line that
795 indicates what you want the server to do and the URI you want it to act
796 on, followed by as many "header" lines ("Header: data", same as e-mail
797 headers) as required for the request, followed by an empty line.
798
799 The response is formatted very similarly, first a line with the
800 response status, then again as many header lines as required, then an
801 empty line, followed by any data that the server might send.
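
The message layout described above can be sketched in plain Perl, without any networking. The "Host" header and the canned response are illustrative assumptions, not output from a real server:

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# request: request line, optional header lines, then an empty line
my $request = join $CRLF,
    "GET /test HTTP/1.0",
    "Host: www.example.com",   # illustrative header only
    "", "";

# a canned response standing in for what a server might send back
my $raw = join $CRLF,
    "HTTP/1.0 404 Not Found",
    "Content-Type: text/html; charset=UTF-8",
    "Content-Length: 13",
    "",
    "<html></html>";

# split at the first empty line: status and headers before it, body after
my ($head, $body) = split /\015\012\015\012/, $raw, 2;
my ($status_line, @header_lines) = split /\015\012/, $head;

# turn "Name: value" lines into a hash keyed by lower-cased name
my %header;
for (@header_lines) {
    my ($name, $value) = /^([^:]+):\s*(.*)$/ or next;
    $header{lc $name} = $value;
}

print "$status_line\n";
print "$_ => $header{$_}\n" for sort keys %header;
print "body: $body\n";
```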
802
803 Again, let's try it out with "telnet" (I condensed the output a bit -
804 if you want to see the full response, do it yourself).
805
806 # telnet www.google.com 80
807 Trying 209.85.135.99...
808 Connected to www.google.com (209.85.135.99).
809 Escape character is '^]'.
810 GET /test HTTP/1.0
811
812 HTTP/1.0 404 Not Found
813 Date: Mon, 02 Jun 2008 07:05:54 GMT
814 Content-Type: text/html; charset=UTF-8
815
816 <html><head>
817 [...]
818 Connection closed by foreign host.
819
820 The "GET ..." and the empty line were entered manually, the rest of the
821 telnet output is Google's response, in this case a "404 Not Found" one.
822
823 So, here is how you would do it with "AnyEvent::Handle":
824
    sub http_get {
       my ($host, $uri, $cb) = @_;

       # store results here
       my ($response, $header, $body);

       my $handle; $handle = new AnyEvent::Handle
          connect  => [$host => 'http'],
          on_error => sub {
             $cb->("HTTP/1.0 500 $!");
             $handle->destroy; # explicitly destroy handle
          },
          on_eof   => sub {
             $cb->($response, $header, $body);
             $handle->destroy; # explicitly destroy handle
          };

       $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");

       # now fetch response status line
       $handle->push_read (line => sub {
          my ($handle, $line) = @_;
          $response = $line;
       });

       # then the headers
       $handle->push_read (line => "\015\012\015\012", sub {
          my ($handle, $line) = @_;
          $header = $line;
       });

       # and finally handle any remaining data as body
       $handle->on_read (sub {
          $body .= $_[0]->rbuf;
          $_[0]->rbuf = "";
       });
    }
862
863 And now let's go through it step by step. First, as usual, the overall
864 "http_get" function structure:
865
    sub http_get {
       my ($host, $uri, $cb) = @_;

       # store results here
       my ($response, $header, $body);

       my $handle; $handle = new AnyEvent::Handle
          ... create handle object

       ... push data to write

       ... push what to expect to read queue
    }
879
880 Unlike in the finger example, this time the caller has to pass a
881 callback to "http_get". Also, instead of passing a URL as one would
882 expect, the caller has to provide the hostname and URI - normally you
883 would use the "URI" module to parse a URL and separate it into those
884 parts, but that is left to the inspired reader :)
885
886 Since everything else is left to the caller, all "http_get" does is
887 initiate the connection by creating the AnyEvent::Handle object (which
888 calls "tcp_connect" for us) and leave everything else to its callback.
889
890 The handle object is created, unsurprisingly, by calling the "new"
891 method of AnyEvent::Handle:
892
    my $handle; $handle = new AnyEvent::Handle
       connect  => [$host => 'http'],
       on_error => sub {
          $cb->("HTTP/1.0 500 $!");
          $handle->destroy; # explicitly destroy handle
       },
       on_eof   => sub {
          $cb->($response, $header, $body);
          $handle->destroy; # explicitly destroy handle
       };
903
904 The "connect" argument tells AnyEvent::Handle to call "tcp_connect" for
905 the specified host and service/port.
906
907 The "on_error" callback will be called on any unexpected error, such as
908 a refused connection, or unexpected end-of-file while reading headers.
909
910 Instead of having an extra mechanism to signal errors, connection
911 errors are signalled by crafting a special "response status line", like
912 this:
913
914 HTTP/1.0 500 Connection refused
915
916 This means the caller cannot distinguish (easily) between locally-
917 generated errors and server errors, but it simplifies error handling
918 for the caller a lot.
919
920 The error callback also destroys the handle explicitly, because we are
921 not interested in continuing after any errors. In AnyEvent::Handle
922 callbacks you have to call "destroy" explicitly to destroy a handle.
923 Outside of those callbacks you can just forget the object reference and
924 it will be automatically cleaned up.
925
926 Last but not least, we set an "on_eof" callback that is called when the
927 other side indicates it has stopped writing data, which we will use to
928 gracefully shut down the handle and report the results. This callback
929 is only called when the read queue is empty - if the read queue expects
930 some data and the handle gets an EOF from the other side this will be
931 an error - after all, you did expect more to come.
932
933 If you wanted to write a server using AnyEvent::Handle, you would use
934 "tcp_server" and then create the AnyEvent::Handle with the "fh"
935 argument.
936
937 The write queue
938
939 The next line sends the actual request:
940
941 $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
942
943 No headers will be sent (this is fine for simple requests), so the
944 whole request is just a single line followed by an empty line to signal
945 the end of the headers to the server.
946
947 The more interesting question is why the method is called "push_write"
948 and not just write. The reason is that you can always add some write
949 data without blocking, and to do this, AnyEvent::Handle needs some
950 write queue internally - and "push_write" pushes some data onto the end
951 of that queue, just like Perl's "push" pushes data onto the end of an
952 array.
953
954 The deeper reason is that at some point in the future, there might be
955 "unshift_write" as well, and in any case, we will shortly meet
956 "push_read" and "unshift_read", and it's usually easiest to remember if
957 all those functions have some symmetry in their name. So "push" is used
958 as the opposite of "unshift" in AnyEvent::Handle, not as the opposite
959 of "pull" - just like in Perl.
960
961 Note that we call "push_write" right after creating the
962 AnyEvent::Handle object, before it has had time to actually connect to
963 the server. This is fine: pushing the read and write requests will
964 queue them in the handle object until the connection has been
965 established. Alternatively, we could do this "on demand" in the
966 "on_connect" callback.
967
968 If "push_write" is called with more than one argument, then you can do
969 formatted I/O. For example, this would JSON-encode your data before
970 pushing it to the write queue:
971
972 $handle->push_write (json => [1, 2, 3]);
973
974 This pretty much summarises the write queue; there is little else to
975 it.
976
977 Reading the response is far more interesting, because it involves the
978 more powerful and complex read queue.
979
980 The read queue
981
982 The response consists of three parts: a single line with the response
983 status, a single paragraph of headers ended by an empty line, and the
984 response body, which is the remaining data on the connection.
985
986 For the first two, we push two read requests onto the read queue:
987
    # now fetch response status line
    $handle->push_read (line => sub {
       my ($handle, $line) = @_;
       $response = $line;
    });

    # then the headers
    $handle->push_read (line => "\015\012\015\012", sub {
       my ($handle, $line) = @_;
       $header = $line;
    });
999
1000 While one can just push a single callback to parse all the data on the
1001 queue, formatted I/O really comes to our aid here, since there is a
1002 ready-made "read line" read type. The first read expects a single line,
1003 ended by "\015\012" (the standard end-of-line marker in internet
1004 protocols).
1005
1006 The second "line" is actually a single paragraph - instead of reading
1007 it line by line we tell "push_read" that the end-of-line marker is
1008 really "\015\012\015\012", which is an empty line. The result is that
1009 the whole header paragraph will be treated as a single line and read.
1010 The word "line" is interpreted very freely, much like Perl itself does
1011 it.
1012
1013 Note that the read requests are pushed immediately after creating the
1014 handle object - since AnyEvent::Handle provides a queue we can push as
1015 many requests as we want, and AnyEvent::Handle will handle them in
1016 order.
1017
1018 There is, however, no read type for "the remaining data". For that, we
1019 install our own "on_read" callback:
1020
    # and finally handle any remaining data as body
    $handle->on_read (sub {
       $body .= $_[0]->rbuf;
       $_[0]->rbuf = "";
    });
1026
1027 This callback is invoked every time data arrives and the read queue is
1028 empty - which in this example will only be the case when both response
1029 and header have been read. The "on_read" callback could actually have
1030 been specified when constructing the object, but doing it this way
1031 preserves logical ordering.
1032
1033 The read callback adds the current read buffer to its $body variable
1034 and, most importantly, empties the buffer by assigning the empty string
1035 to it.
1036
1037 Given these instructions, AnyEvent::Handle will handle incoming data -
1038 if all goes well, the callback will be invoked with the response data;
1039 if not, it will get an error.
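
To watch the read queue at work without a network, the following sketch replays the same three reads over a local socket pair. The pair from AnyEvent::Util and the canned response written by a second handle are assumptions made for this illustration; the real "http_get" uses "connect" instead of "fh":

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util qw(portable_socketpair);

# two connected sockets stand in for the client and the web server
my ($client_fh, $server_fh) = portable_socketpair
    or die "unable to create socket pair: $!";

my $done = AnyEvent->condvar;
my ($response, $header, $body) = ("", "", "");

my $client = AnyEvent::Handle->new (
    fh       => $client_fh,
    on_error => sub {
        my ($hdl, $fatal, $message) = @_;
        $hdl->destroy;
        $done->croak ($message);
    },
    on_eof   => sub {
        $_[0]->destroy;   # read queue is empty and the peer is finished
        $done->send;
    },
);

# queue the reads in the order the data is expected
$client->push_read (line => sub { $response = $_[1] });
$client->push_read (line => "\015\012\015\012", sub { $header = $_[1] });
$client->on_read (sub { $body .= $_[0]->rbuf; $_[0]->rbuf = "" });

# fake server: write a canned response, then close the write side
my $server = AnyEvent::Handle->new (
    fh       => $server_fh,
    on_error => sub { $_[0]->destroy },
);
$server->push_write (
    "HTTP/1.0 200 OK\015\012"
  . "Content-Type: text/plain\015\012\015\012"
  . "hello\n"
);
$server->push_shutdown;

$done->recv;
print "$response\n$header\n$body";
```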
1040
1041 In general, you can implement pipelining (a semi-advanced feature of
1042 many protocols) very easily with AnyEvent::Handle: If you have a
1043 protocol with a request/response structure, your request
1044 methods/functions will all look like this (simplified):
1045
    sub request {

       # send the request to the server
       $handle->push_write (...);

       # push some response handlers
       $handle->push_read (...);
    }
1054
1055 This means you can queue as many requests as you want, and while
1056 AnyEvent::Handle goes through its read queue to handle the response
1057 data, the other side can work on the next request - queueing the
1058 request just appends some data to the write queue and installs a
1059 handler to be called later.
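
A runnable sketch of this pattern follows. It assumes a made-up line protocol in which a toy server answers every line with its upper-cased form, again over a local socket pair rather than a real connection:

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util qw(portable_socketpair);

my ($client_fh, $server_fh) = portable_socketpair
    or die "unable to create socket pair: $!";

# toy server (an assumption for this sketch): upper-case each line
my $server = AnyEvent::Handle->new (
    fh       => $server_fh,
    on_error => sub { $_[0]->destroy },
    on_read  => sub {
        $_[0]->push_read (line => sub {
            my ($hdl, $line) = @_;
            $hdl->push_write (uc ($line) . "\015\012");
        });
    },
);

my $done = AnyEvent->condvar;

my $client = AnyEvent::Handle->new (
    fh       => $client_fh,
    on_error => sub {
        my ($hdl, $fatal, $message) = @_;
        $hdl->destroy;
        $done->croak ($message);
    },
);

# one request = one write plus one queued read for the matching response
sub request {
    my ($text, $cb) = @_;
    $client->push_write ("$text\015\012");
    $client->push_read (line => sub { $cb->($_[1]) });
}

my @replies;
my $pending = 3;

# all three requests are queued before any response has arrived
for my $word (qw(one two three)) {
    request ($word, sub {
        push @replies, shift;
        $done->send unless --$pending;
    });
}

$done->recv;
print "@replies\n";
```

Because the read handlers sit in the queue in the same order as the requests were written, each response reaches the callback of the request it belongs to.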
1060
1061 You might ask yourself how to handle decisions you can only make after
1062 you have received some data (such as handling a short error response or
1063 a long and differently-formatted response). The answer to this problem
1064 is "unshift_read", which we will introduce together with an example in
1065 the coming sections.
1066
1067 Using "http_get"
1068
1069 Finally, here is how you would use "http_get":
1070
    http_get "www.google.com", "/", sub {
       my ($response, $header, $body) = @_;

       print
          $response, "\n",
          $body;
    };
1078
1079 And of course, you can run as many of these requests in parallel as you
1080 want (and your memory supports).
1081
1082 HTTPS
1083
1084 Now, as promised, let's implement the same thing for HTTPS, or more
1085 correctly, let's change our "http_get" function into a function that
1086 speaks HTTPS instead.
1087
1088 HTTPS is a standard TLS connection (Transport Layer Security is the
1089 official name for what most people refer to as "SSL") that contains
1090 standard HTTP protocol exchanges. The only other difference to HTTP is
1091 that by default it uses port 443 instead of port 80.
1092
1093 To implement these two differences we need two tiny changes. First, in
1094 the "connect" parameter, we replace "http" by "https" to connect to the
1095 https port:
1096
1097 connect => [$host => 'https'],
1098
1099 The other change deals with TLS, which is something AnyEvent::Handle
1100 does for us if the Net::SSLeay module is available. To enable TLS with
1101 AnyEvent::Handle, we pass an additional "tls" parameter to the call to
1102 "AnyEvent::Handle::new":
1103
1104 tls => "connect",
1105
1106 Specifying "tls" enables TLS, and the argument specifies whether
1107 AnyEvent::Handle is the server side ("accept") or the client side
1108 ("connect") for the TLS connection, as unlike TCP, there is a clear
1109 server/client relationship in TLS.
1110
1111 That's all.
1112
1113 Of course, all this should be handled transparently by "http_get" after
1114 parsing the URL. If you need this, see the part about exercising your
1115 inspiration earlier in this document. You could also use the
1116 AnyEvent::HTTP module from CPAN, which implements all this and works
1117 around a lot of quirks for you too.
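
For readers who want a starting point, here is a rough sketch of the URL-splitting step. The helper name "connect_args_for" is hypothetical, and a deliberately simple regex stands in for the "URI" module; it rejects URLs with explicit ports, user information or fragments rather than trying to handle them:

```perl
use strict;
use warnings;

# Hypothetical helper: derive the request path plus the AnyEvent::Handle
# constructor arguments that differ between HTTP and HTTPS.
sub connect_args_for {
    my ($url) = @_;

    my ($scheme, $host, $path) =
        $url =~ m{^(https?)://([^/:?#]+)((?:[/?][^#]*)?)$}i
        or die "unsupported URL: $url";

    $scheme = lc $scheme;
    $path   = "/$path" unless $path =~ m{^/};   # covers "" and "?query"

    # the scheme doubles as the service name for the connect argument
    my @args = (connect => [$host => $scheme]);
    push @args, tls => "connect" if $scheme eq "https";

    return ($path, @args);
}

my ($path, %args) = connect_args_for ("https://www.example.com/index.html");

print "path: $path\n";
print "connect: @{ $args{connect} }\n";
print "tls: ", ($args{tls} // "off"), "\n";
```

The returned list could be passed straight on to the AnyEvent::Handle constructor, with the path going into the request line.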
1118
1119 The read queue - revisited
1120
1121 HTTP always uses the same structure in its responses, but many
1122 protocols require parsing responses differently depending on the
1123 response itself.
1124
1125 For example, in SMTP, you normally get a single response line:
1126
1127 220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1128
1129 But SMTP also supports multi-line responses:
1130
1131 220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1132 220-hey guys
1133 220 my response is longer than yours
1134
1135 To handle this, we need "unshift_read". As the name (we hope) implies,
1136 "unshift_read" will not append your read request to the end of the read
1137 queue, but will prepend it to the queue instead.
1138
1139 This is useful in the situation above: Just push your response-line
1140 read request when sending the SMTP command, and when handling it, you
1141 look at the line to see if more is to come, and "unshift_read" another
1142 reader callback if required, like this:
1143
    my $response; # response lines end up in here

    my $read_response; $read_response = sub {
       my ($handle, $line) = @_;

       $response .= "$line\n";

       # check for continuation lines ("-" as 4th character)
       if ($line =~ /^...-/) {
          # if yes, then unshift another line read
          $handle->unshift_read (line => $read_response);

       } else {
          # otherwise we are done

          # free callback
          undef $read_response;

          print "we are done reading: $response\n";
       }
    };

    $handle->push_read (line => $read_response);
1167
1168 This recipe can be used for all similar parsing problems, for example
1169 in NNTP, the response code to some commands indicates that more data
1170 will be sent:
1171
    $handle->push_write ("article 42\015\012");

    # read response line
    $handle->push_read (line => sub {
       my ($handle, $status) = @_;

       # article data following?
       if ($status =~ /^2/) {
          # yes, read article body

          $handle->unshift_read (line => "\012.\015\012", sub {
             my ($handle, $body) = @_;

             $finish->($status, $body);
          });

       } else {
          # some error occurred, no article data

          $finish->($status);
       }
    });
1194
1195 Your own read queue handler
1196
1197 Sometimes your protocol doesn't play nice, and uses lines or chunks of
1198 data not formatted in a way handled out of the box by AnyEvent::Handle.
1199 In this case you have to implement your own read parser.
1200
1201 To make up a contorted example, imagine you are looking for an even
1202 number of characters followed by a colon (":"). Also imagine that
1203 AnyEvent::Handle has no "regex" read type which could be used, so you'd
1204 have to do it manually.
1205
1206 To implement a read handler for this, you would "push_read" (or
1207 "unshift_read") a single code reference.
1208
1209 This code reference will then be called each time there is (new) data
1210 available in the read buffer, and is expected to either successfully
1211 eat/consume some of that data (and return true) or to return false to
1212 indicate that it wants to be called again.
1213
1214 If the code reference returns true, then it will be removed from the
1215 read queue (because it has parsed/consumed whatever it was supposed to
1216 consume), otherwise it stays in the front of it.
1217
1218 The example above could be coded like this:
1219
    $handle->push_read (sub {
       my ($handle) = @_;

       # check for even number of characters + ":"
       # and remove the data if a match is found.
       # if not, return false (actually nothing)

       $handle->{rbuf} =~ s/^( (?:..)* ) ://x
          or return;

       # we got some data in $1, pass it to whoever wants it
       $finish->($1);

       # and return true to indicate we are done
       1
    });
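
The "return false until done" contract can be tried out without a handle at all. In this sketch a plain string plays the role of the read buffer and a loop feeds it in chunks, which is an assumption made purely for illustration:

```perl
use strict;
use warnings;

my $rbuf = "";   # plays the role of the handle's read buffer
my @found;       # what the parser has extracted so far

# same contract as a read-queue callback:
# true = consumed my data, false = call me again when more data is there
my $parser = sub {
    $rbuf =~ s/^( (?:..)* ) ://x
        or return;

    push @found, $1;
    1
};

# feed the data in three chunks, as a slow connection might deliver it
my @log;
for my $chunk ("ab", "cd", ":rest") {
    $rbuf .= $chunk;
    push @log, $parser->() ? "done" : "again";
}

print "calls: @log\n";
print "found: @found\n";
print "left in buffer: $rbuf\n";
```

The first two calls find no colon and ask to be called again; the third consumes "abcd:" and leaves the remainder in the buffer for whatever is next in the queue.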
1236
1238 Now that you have seen how to use AnyEvent, here's what to use when you
1239 don't use it correctly, or simply hit a bug somewhere and want to debug
1240 it:
1241
1242 Enable strict argument checking during development
1243 AnyEvent does not, by default, do any argument checking. This can
1244 lead to strange and unexpected results especially if you are just
1245 trying to find your way with AnyEvent.
1246
1247 AnyEvent supports a special "strict" mode - off by default - which
1248 does very strict argument checking, at the expense of slowing down
1249 your program. During development, however, this mode is very useful
1250 because it quickly catches the most common errors.
1251
1252 You can enable this strict mode either by having an environment
1253 variable "AE_STRICT" with a true value in your environment:
1254
1255 AE_STRICT=1 perl myprog
1256
1257 Or you can write "use AnyEvent::Strict" in your program, which has
1258 the same effect (do not do this in production, however).
1259
1260 Increase verbosity, configure logging
1261 AnyEvent, by default, only logs critical messages. If something
1262 doesn't work, maybe there was a warning about it that you didn't
1263 see because it was suppressed.
1264
1265 So during development it is recommended to push up the logging
1266 level to at least warn level (5):
1267
1268 AE_VERBOSE=5 perl myprog
1269
1270 Other levels that might be helpful are debug (8) or even trace (9).
1271
1272 AnyEvent's logging is quite versatile - the AnyEvent::Log manpage
1273 has all the details.
1274
1275 Watcher wrapping, tracing, the shell
1276 For even more debugging, you can enable watcher wrapping:
1277
1278 AE_DEBUG_WRAP=2 perl myprog
1279
1280 This will have the effect of wrapping every watcher into a special
1281 object that stores a backtrace of when it was created, stores a
1282 backtrace when an exception occurs during watcher execution, and
1283 stores a lot of other information. If that slows down your program
1284 too much, then "AE_DEBUG_WRAP=1" avoids the costly backtraces.
1285
1286 Here is an example of what information is stored:
1287
1288 59148536 DC::DB:472(Server::run)>io>DC::DB::Server::fh_read
1289 type: io watcher
1290 args: poll r fh GLOB(0x35283f0)
1291 created: 2011-09-01 23:13:46.597336 +0200 (1314911626.59734)
1292 file: ./blib/lib/Deliantra/Client/private/DC/DB.pm
1293 line: 472
1294 subname: DC::DB::Server::run
1295 context:
1296 tracing: enabled
1297 cb: CODE(0x2d1fb98) (DC::DB::Server::fh_read)
1298 invoked: 0 times
1299 created
1300 (eval 25) line 6 AnyEvent::Debug::Wrap::__ANON__('AnyEvent','fh',GLOB(0x35283f0),'poll','r','cb',CODE(0x2d1fb98)=DC::DB::Server::fh_read)
1301 DC::DB line 472 AE::io(GLOB(0x35283f0),'0',CODE(0x2d1fb98)=DC::DB::Server::fh_read)
1302 bin/deliantra line 2776 DC::DB::Server::run()
1303 bin/deliantra line 2941 main::main()
1304
1305 There are many ways to get at this data - see the AnyEvent::Debug
1306 and AnyEvent::Log manpages for more details.
1307
1308 The most interesting and interactive way is to create a debug
1309 shell, for example by setting "AE_DEBUG_SHELL":
1310
1311 AE_DEBUG_WRAP=2 AE_DEBUG_SHELL=$HOME/myshell ./myprog
1312
1313 # while myprog is running:
1314 socat readline $HOME/myshell
1315
1316 Note that anybody who can access $HOME/myshell can make your
1317 program do anything he or she wants, so if you are not the only
1318 user on your machine, better put it into a secure location ($HOME
1319 might not be secure enough).
1320
1321 If you don't have "socat" (a shame!) and care even less about
1322 security, you can also use TCP and "telnet":
1323
1324 AE_DEBUG_WRAP=2 AE_DEBUG_SHELL=127.0.0.1:1234 ./myprog
1325
1326 telnet 127.0.0.1 1234
1327
1328 The debug shell can enable and disable tracing of watcher
1329 invocations, can display the trace output, give you a list of
1330 watchers and lets you investigate watchers in detail.
1331
1332 This concludes our little tutorial.
1333
1335 This introduction should have explained the key concepts of AnyEvent -
1336 event watchers and condition variables, AnyEvent::Socket - basic
1337 networking utilities, and AnyEvent::Handle - a nice wrapper around
1338 sockets.
1339
1340 You could either start coding stuff right away, look at those manual
1341 pages for the gory details, or roam CPAN for other AnyEvent modules
1342 (such as AnyEvent::IRC or AnyEvent::HTTP) to see more code examples (or
1343 simply to use them).
1344
1345 If you need a protocol that doesn't have an implementation using
1346 AnyEvent, remember that you can mix AnyEvent with one other event
1347 framework, such as POE, so you can always use AnyEvent for your own
1348 tasks plus modules of one other event framework to fill any gaps.
1349
1350 And last but not least, you could also look at Coro, especially
1351 Coro::AnyEvent, to see how you can turn event-based programming from
1352 callback style back to the usual imperative style (also called
1353 "inversion of control" - AnyEvent calls you, but Coro lets you call
1354 AnyEvent).
1355
1357 Robin Redeker "<elmex at ta-sa.org>", Marc Lehmann
1358 <schmorp@schmorp.de>.
1359
1360
1361
1362perl v5.34.0 2021-07-22 AnyEvent::Intro(3)