PERLOTHRTUT(1) Perl Programmers Reference Guide PERLOTHRTUT(1)

perlothrtut - old tutorial on threads in Perl

WARNING: This tutorial describes the old-style thread model that was
introduced in release 5.005. This model is deprecated, and has been
removed for version 5.10. The interfaces described here were considered
experimental, and are likely to be buggy.

For information about the new interpreter threads ("ithreads") model,
see the perlthrtut tutorial, and the threads and threads::shared
modules.

You are strongly encouraged to migrate any existing threads code to the
new model as soon as possible.

A thread is a flow of control through a program with a single execution
point.

Sounds an awful lot like a process, doesn't it? Well, it should.
Threads are one of the pieces of a process. Every process has at least
one thread and, up until now, every process running Perl had only one
thread. With 5.005, though, you can create extra threads. We're going
to show you how, when, and why.

There are three basic ways that you can structure a threaded program.
Which model you choose depends on what you need your program to do.
For many non-trivial threaded programs you'll need to choose different
models for different pieces of your program.

Boss/Worker
The boss/worker model usually has one "boss" thread and one or more
"worker" threads. The boss thread gathers or generates tasks that need
to be done, then parcels those tasks out to the appropriate worker
thread.

This model is common in GUI and server programs, where a main thread
waits for some event and then passes that event to the appropriate
worker threads for processing. Once the event has been passed on, the
boss thread goes back to waiting for another event.

The boss thread does relatively little work. While tasks aren't
necessarily performed faster than with any other method, it tends to
have the best user-response times.
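
As a rough illustration of the boss/worker shape, here is a minimal sketch. It is written against the newer threads and Thread::Queue modules rather than the old Thread module described in this tutorial, and it assumes a perl built with ithreads. The helper name run_boss_worker and the "square each number" task are made up for the example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical helper: the boss enqueues numbers, the workers square them.
sub run_boss_worker {
    my ($worker_count, @jobs) = @_;
    my $tasks   = Thread::Queue->new();
    my $results = Thread::Queue->new();

    my @workers = map {
        threads->create(sub {
            # Each worker blocks in dequeue() until the boss hands it a task.
            while (defined(my $n = $tasks->dequeue())) {
                $results->enqueue($n * $n);
            }
        });
    } 1 .. $worker_count;

    $tasks->enqueue(@jobs) if @jobs;          # boss parcels out the work
    $tasks->enqueue(undef) for @workers;      # one stop marker per worker
    $_->join() for @workers;

    # All workers are done, so drain the results without blocking.
    my @done;
    while (defined(my $r = $results->dequeue_nb())) {
        push @done, $r;
    }
    my @sorted = sort { $a <=> $b } @done;
    return @sorted;
}

print join(",", run_boss_worker(3, 1 .. 5)), "\n";   # prints 1,4,9,16,25
```

The results are sorted before printing because the order in which workers finish is not predictable.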

Work Crew
In the work crew model, several threads are created that do essentially
the same thing to different pieces of data. It closely mirrors
classical parallel processing and vector processors, where a large
array of processors do the exact same thing to many pieces of data.

This model is particularly useful if the system running the program
will distribute multiple threads across different processors. It can
also be useful in ray tracing or rendering engines, where the
individual threads can pass on interim results to give the user visual
feedback.
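
A work crew might be sketched like this. The example uses the newer threads module (not the old Thread module) and assumes a perl built with ithreads. crew_sum is a hypothetical helper in which every crew member runs the same summing code on its own slice of the data.

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: split @data across $crew_size threads and add up
# the partial sums.
sub crew_sum {
    my ($crew_size, @data) = @_;
    my @crew;
    for my $i (0 .. $crew_size - 1) {
        # Member $i takes every $crew_size-th element, starting at index $i.
        my @slice = @data[ grep { $_ % $crew_size == $i } 0 .. $#data ];
        my $thr = threads->create(sub {
            my $sum = 0;
            $sum += $_ for @slice;
            return $sum;
        });
        push @crew, $thr;
    }
    my $total = 0;
    for my $thr (@crew) {
        my $partial = $thr->join();   # join() hands back the partial sum
        $total += $partial;
    }
    return $total;
}

print crew_sum(4, 1 .. 100), "\n";   # prints 5050
```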

Pipeline
The pipeline model divides up a task into a series of steps, and passes
the results of one step on to the thread processing the next. Each
thread does one thing to each piece of data and passes the results to
the next thread in line.

This model makes the most sense if you have multiple processors so two
or more threads will be executing in parallel, though it can often make
sense in other contexts as well. It tends to keep the individual tasks
small and simple, as well as allowing some parts of the pipeline to
block (on I/O or system calls, for example) while other parts keep
going. If you're running different parts of the pipeline on different
processors you may also take advantage of the caches on each processor.

This model is also handy for a form of recursive programming where,
rather than having a subroutine call itself, it instead creates another
thread. Prime and Fibonacci generators both map well to this form of
the pipeline model. (A version of a prime number generator is presented
later on.)
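
The following sketch shows one way a two-stage pipeline could look. It relies on the newer threads and Thread::Queue modules and assumes a perl built with ithreads. The helpers start_stage and run_pipeline, and the "double, then add one" stages, are invented for illustration.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical helper: start one stage that applies $work to every item
# read from $in and writes the result to $out.
sub start_stage {
    my ($in, $out, $work) = @_;
    my $thr = threads->create(sub {
        while (defined(my $item = $in->dequeue())) {
            $out->enqueue($work->($item));
        }
        $out->enqueue(undef);    # pass the end-of-data marker down the line
    });
    return $thr;
}

sub run_pipeline {
    my @input = @_;
    my $q1 = Thread::Queue->new();
    my $q2 = Thread::Queue->new();
    my $q3 = Thread::Queue->new();

    my $doubler = start_stage($q1, $q2, sub { $_[0] * 2 });
    my $adder   = start_stage($q2, $q3, sub { $_[0] + 1 });

    $q1->enqueue(@input, undef);           # feed the first stage, then stop
    my @output;
    while (defined(my $item = $q3->dequeue())) {
        push @output, $item;
    }
    $_->join() for $doubler, $adder;
    return @output;
}

print join(" ", run_pipeline(1, 2, 3)), "\n";   # prints 3 5 7
```

Because each stage is a single thread reading from a single queue, items leave the pipeline in the order they entered it.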

There are several different ways to implement threads on a system. How
threads are implemented depends both on the vendor and, in some cases,
the version of the operating system. Often the first implementation
will be relatively simple, but later versions of the OS will be more
sophisticated.

While the information in this section is useful, it's not necessary, so
you can skip it if you don't feel up to it.

There are three basic categories of threads: user-mode threads, kernel
threads, and multiprocessor kernel threads.

User-mode threads are threads that live entirely within a program and
its libraries. In this model, the OS knows nothing about threads. As
far as it's concerned, your process is just a process.

This is the easiest way to implement threads, and the way most OSes
start. The big disadvantage is that, since the OS knows nothing about
threads, if one thread blocks they all do. Typical blocking activities
include most system calls, most I/O, and things like sleep().

Kernel threads are the next step in thread evolution. The OS knows
about kernel threads, and makes allowances for them. The main
difference between a kernel thread and a user-mode thread is blocking.
With kernel threads, things that block a single thread don't block
other threads. This is not the case with user-mode threads, where the
kernel blocks at the process level and not the thread level.

This is a big step forward, and can give a threaded program quite a
performance boost over non-threaded programs. Threads that block
performing I/O, for example, won't block threads that are doing other
things. Each process still has only one thread running at once,
though, regardless of how many CPUs a system might have.

Since kernel threading can interrupt a thread at any time, it will
uncover some of the implicit locking assumptions you may make in your
program. For example, something as simple as "$a = $a + 2" can behave
unpredictably with kernel threads if $a is visible to other threads, as
another thread may have changed $a between the time it was fetched on
the right hand side and the time the new value is stored.

Multiprocessor kernel threads are the final step in thread support.
With multiprocessor kernel threads on a machine with multiple CPUs, the
OS may schedule two or more threads to run simultaneously on different
CPUs.

This can give a serious performance boost to your threaded program,
since more than one thread will be executing at the same time. As a
tradeoff, though, any of those nagging synchronization issues that
might not have shown with basic kernel threads will appear with a
vengeance.

In addition to the different levels of OS involvement in threads,
different OSes (and different thread implementations for a particular
OS) allocate CPU cycles to threads in different ways.

Cooperative multitasking systems have running threads give up control
if one of two things happens. If a thread calls a yield function, it
gives up control. It also gives up control if the thread does
something that would cause it to block, such as perform I/O. In a
cooperative multitasking implementation, one thread can starve all the
others for CPU time if it so chooses.

Preemptive multitasking systems interrupt threads at regular intervals
while the system decides which thread should run next. In a preemptive
multitasking system, one thread usually won't monopolize the CPU.

On some systems, there can be cooperative and preemptive threads
running simultaneously. (Threads running with realtime priorities often
behave cooperatively, for example, while threads running at normal
priorities behave preemptively.)

If you have experience with other thread implementations, you might
find that things aren't quite what you expect. It's very important to
remember when dealing with Perl threads that Perl Threads Are Not X
Threads, for all values of X. They aren't POSIX threads, or
DecThreads, or Java's Green threads, or Win32 threads. There are
similarities, and the broad concepts are the same, but if you start
looking for implementation details you're going to be either
disappointed or confused. Possibly both.

This is not to say that Perl threads are completely different from
everything that's ever come before--they're not. Perl's threading
model owes a lot to other thread models, especially POSIX. Just as
Perl is not C, though, Perl threads are not POSIX threads. So if you
find yourself looking for mutexes, or thread priorities, it's time to
step back a bit and think about what you want to do and how Perl can do
it.

The addition of threads has changed Perl's internals substantially.
There are implications for people who write modules--especially modules
with XS code or external libraries. While most modules won't encounter
any problems, modules that aren't explicitly tagged as thread-safe
should be tested before being used in production code.

Not all modules that you might use are thread-safe, and you should
always assume a module is unsafe unless the documentation says
otherwise. This includes modules that are distributed as part of the
core. Threads are a beta feature, and even some of the standard
modules aren't thread-safe.

If you're using a module that's not thread-safe for some reason, you
can protect yourself by using semaphores and lots of programming
discipline to control access to the module. Semaphores are covered
later in the article.

The core Thread module provides the basic functions you need to write
threaded programs. In the following sections we'll cover the basics,
showing you what you need to do to create a threaded program. After
that, we'll go over some of the features of the Thread module that make
threaded programming easier.

Basic Thread Support
Thread support is a Perl compile-time option: it's something that's
turned on or off when Perl is built at your site, rather than when your
programs are compiled. If your Perl wasn't compiled with thread support
enabled, then any attempt to use threads will fail.

Remember that the threading support in 5.005 is in beta release, and
should be treated as such. You should expect that it may not function
entirely properly, and the thread interface may well change some before
it is a fully supported, production release. The beta version
shouldn't be used for mission-critical projects. Having said that,
threaded Perl is pretty nifty, and worth a look.

Your programs can use the Config module to check whether threads are
enabled. If your program can't run without them, you can say something
like:

    use Config;    # makes %Config available
    $Config{usethreads} or die "Recompile Perl with threads to run this program.";

A possibly-threaded program using a possibly-threaded module might have
code like this:

    use Config;
    use MyMod;

    if ($Config{usethreads}) {
        # We have threads
        require MyMod_threaded;
        import MyMod_threaded;
    } else {
        # No threads: fall back to the unthreaded implementation
        require MyMod_unthreaded;
        import MyMod_unthreaded;
    }

Since code that runs both with and without threads is usually pretty
messy, it's best to isolate the thread-specific code in its own module.
In our example above, that's what MyMod_threaded is, and it's only
imported if we're running on a threaded Perl.
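
For readers on a perl that has the newer interpreter threads, the same kind of check can be made against the useithreads entry of %Config. This is a small sketch, not part of the old Thread interface; the helper name have_ithreads is made up for the example.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: 1 when this perl was built with ithreads, else 0.
sub have_ithreads {
    return $Config{useithreads} ? 1 : 0;
}

if (have_ithreads()) {
    require threads;    # loaded at run time, only when it can work
    print "ithreads available\n";
} else {
    print "this perl was built without ithreads\n";
}
```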

Creating Threads
The Thread package provides the tools you need to create new threads.
Like any other module, you need to tell Perl you want to use it; "use
Thread" imports all the pieces you need to create basic threads.

The simplest, most straightforward way to create a thread is with new():

    use Thread;

    $thr = Thread->new( \&sub1 );

    sub sub1 {
        print "In the thread\n";
    }

The new() method takes a reference to a subroutine and creates a new
thread, which starts executing in the referenced subroutine. Control
then passes both to the subroutine and the caller.

If you need to, your program can pass parameters to the subroutine as
part of the thread startup. Just include the list of parameters as
part of the "Thread::new" call, like this:

    use Thread;
    $Param3 = "foo";
    @ParamList = ("one", "two");
    $thr = Thread->new( \&sub1, "Param 1", "Param 2", $Param3 );
    $thr = Thread->new( \&sub1, @ParamList );
    # qw() does not interpolate variables, so list plain words here
    $thr = Thread->new( \&sub1, qw(Param1 Param2 Param3) );

    sub sub1 {
        my @InboundParameters = @_;
        print "In the thread\n";
        print "got parameters >", join("<>", @InboundParameters), "<\n";
    }

The subroutine runs like a normal Perl subroutine. The call to
Thread->new returns a thread object right away; whatever the subroutine
returns is collected later with join(), which is covered below.

The last example illustrates another feature of threads. You can spawn
off several threads using the same subroutine. Each thread executes
the same subroutine, but in a separate thread with a separate
environment and potentially separate arguments.
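
For comparison, thread creation with parameters looks very similar under the newer threads module. The sketch below assumes a perl built with ithreads; describe is a hypothetical worker subroutine.

```perl
use strict;
use warnings;
use threads;

# Hypothetical worker: formats the parameters it was started with.
sub describe {
    my @params = @_;
    return "got >" . join("<>", @params) . "<";
}

# Arguments after the code reference are handed to the subroutine.
my $thr1 = threads->create(\&describe, "Param 1", "Param 2");
my $thr2 = threads->create(\&describe, qw(a b c));

# create() returns a thread object; the result comes back through join().
my $first  = $thr1->join();
my $second = $thr2->join();
print "$first\n";     # prints got >Param 1<>Param 2<
print "$second\n";    # prints got >a<>b<>c<
```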

The other way to spawn a new thread is with async(), which is a way to
spin off a chunk of code like eval(), but into its own thread:

    use Thread qw(async);

    $LineCount = 0;

    $thr = async {
        while(<>) {$LineCount++}
        print "Got $LineCount lines\n";
    };

    print "Waiting for the linecount to end\n";
    $thr->join;
    print "All done\n";

You'll notice we did a "use Thread qw(async)" in that example. async is
not exported by default, so if you want it, you'll either need to
import it before you use it or fully qualify it as Thread::async.
You'll also note that there's a semicolon after the closing brace.
That's because async() treats the following block as an anonymous
subroutine, so the semicolon is necessary.

Like eval(), the code executes in the same context as it would if it
weren't spun off. Since both the code inside and after the async start
executing, you need to be careful with any shared resources. Locking
and other synchronization techniques are covered later.

Giving up control
There are times when you may find it useful to have a thread explicitly
give up the CPU to another thread. Your threading package might not
support preemptive multitasking for threads, for example, or you may be
doing something compute-intensive and want to make sure that the
user-interface thread gets called frequently. Regardless, there are
times that you might want a thread to give up the processor.

Perl's threading package provides the yield() function that does this.
yield() is pretty straightforward, and works like this:

    use Thread qw(yield async);
    async {
        my $foo = 50;
        while ($foo--) { print "first async\n" }
        yield;
        $foo = 50;
        while ($foo--) { print "first async\n" }
    };
    async {
        my $foo = 50;
        while ($foo--) { print "second async\n" }
        yield;
        $foo = 50;
        while ($foo--) { print "second async\n" }
    };

Waiting For A Thread To Exit
Since threads are also subroutines, they can return values. To wait
for a thread to exit and extract any scalars it might return, you can
use the join() method.

    use Thread;
    $thr = Thread->new( \&sub1 );

    @ReturnData = $thr->join;
    print "Thread returned @ReturnData\n";

    sub sub1 { return "Fifty-six", "foo", 2; }

In the example above, the join() method returns as soon as the thread
ends. In addition to waiting for a thread to finish and gathering up
any values that the thread might have returned, join() also performs
any OS cleanup necessary for the thread. That cleanup might be
important, especially for long-running programs that spawn lots of
threads. If you don't want the return values and don't want to wait
for the thread to finish, you should call the detach() method instead.
detach() is covered later in the article.

Errors In Threads
So what happens when an error occurs in a thread? Any errors that could
be caught with eval() are postponed until the thread is joined. If
your program never joins, the errors appear when your program exits.

Errors deferred until a join() can be caught with eval():

    use Thread qw(async);
    $thr = async {$b = 3/0}; # Divide by zero error
    $foo = eval {$thr->join};
    if ($@) {
        print "died with error $@\n";
    } else {
        print "Hey, why aren't you dead?\n";
    }

eval() passes any results from the joined thread back unmodified, so if
you want the return values of the thread, this is your only chance to
get them.
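
Under the newer threads module a different technique is often simpler: catch the error inside the thread and return it together with the result. The sketch below shows that substitute approach, not the deferred-error behaviour described above. It assumes a perl built with ithreads, and divide_in_thread is a hypothetical helper.

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: do a division in a thread and return
# (result, error); error is an empty string on success.
sub divide_in_thread {
    my ($x, $y) = @_;
    # List context at creation, so the thread may return a list.
    my ($thr) = threads->create(sub {
        my $result = eval { $x / $y };   # trap the error inside the thread
        return ($result, $@);
    });
    return $thr->join();
}

my ($value, $error) = divide_in_thread(3, 0);
if ($error) {
    print "died with error $error";
} else {
    print "result is $value\n";
}
```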

Ignoring A Thread
join() does three things: it waits for a thread to exit, cleans up
after it, and returns any data the thread may have produced. But what
if you're not interested in the thread's return values, and you don't
really care when the thread finishes? All you want is for the thread to
be cleaned up when it's done.

In this case, you use the detach() method. Once a thread is detached,
it'll run until it's finished, then Perl will clean up after it
automatically.

    use Thread;
    $thr = Thread->new( \&sub1 ); # Spawn the thread

    $thr->detach; # Now we officially don't care any more

    sub sub1 {
        $a = 0;
        while (1) {
            $a++;
            print "\$a is $a\n";
            sleep 1;
        }
    }

Once a thread is detached, it may not be joined, and any output that it
might have produced (if it was done and waiting for a join) is lost.

Now that we've covered the basics of threads, it's time for our next
topic: data. Threading introduces a couple of complications to data
access that non-threaded programs never need to worry about.

Shared And Unshared Data
The single most important thing to remember when using threads is that
all threads potentially have access to all the data anywhere in your
program. While this is true with a nonthreaded Perl program as well,
it's especially important to remember with a threaded program, since
more than one thread can be accessing this data at once.

Perl's scoping rules don't change because you're using threads. If a
subroutine (or block, in the case of async()) could see a variable if
you weren't running with threads, it can see it if you are. This is
especially important for the subroutines that create threads, and makes
"my" variables even more important. Remember--if your variables aren't
lexically scoped (declared with "my") you're probably sharing them
between threads.

Thread Pitfall: Races
While threads bring a new set of useful tools, they also bring a number
of pitfalls. One pitfall is the race condition:

    use Thread;
    $a = 1;
    $thr1 = Thread->new(\&sub1);
    $thr2 = Thread->new(\&sub2);

    sleep 10;
    print "$a\n";

    sub sub1 { $foo = $a; $a = $foo + 1; }
    sub sub2 { $bar = $a; $a = $bar + 1; }

What do you think $a will be? The answer, unfortunately, is "it
depends." Both sub1() and sub2() access the global variable $a, once to
read and once to write. Depending on factors ranging from your thread
implementation's scheduling algorithm to the phase of the moon, $a can
be 2 or 3.

Race conditions are caused by unsynchronized access to shared data.
Without explicit synchronization, there's no way to be sure that
nothing has happened to the shared data between the time you access it
and the time you update it. Even this simple code fragment has the
possibility of error:

    use Thread qw(async);
    $a = 2;
    async{ $b = $a; $a = $b + 1; };
    async{ $c = $a; $a = $c + 1; };

Two threads both access $a. Each thread can potentially be interrupted
at any point, or be executed in any order. At the end, $a could be 3
or 4, and both $b and $c could be 2 or 3.

Whenever your program accesses data or resources that can be accessed
by other threads, you must take steps to coordinate access or risk data
corruption and race conditions.

Controlling access: lock()
The lock() function takes a variable (or subroutine, but we'll get to
that later) and puts a lock on it. No other thread may lock the
variable until the locking thread exits the innermost block containing
the lock. Using lock() is straightforward:

    use Thread qw(async);
    $a = 4;
    $thr1 = async {
        $foo = 12;
        {
            lock ($a); # Block until we get access to $a
            $b = $a;
            $a = $b * $foo;
        }
        print "\$foo was $foo\n";
    };
    $thr2 = async {
        $bar = 7;
        {
            lock ($a); # Block until we can get access to $a
            $c = $a;
            $a = $c * $bar;
        }
        print "\$bar was $bar\n";
    };
    $thr1->join;
    $thr2->join;
    print "\$a is $a\n";

lock() blocks the thread until the variable being locked is available.
When lock() returns, your thread can be sure that no other thread can
lock that variable until the innermost block containing the lock exits.

It's important to note that locks don't prevent access to the variable
in question, only lock attempts. This is in keeping with Perl's
longstanding tradition of courteous programming, and the advisory file
locking that flock() gives you. Locked subroutines behave differently,
however. We'll cover that later in the article.

You may lock arrays and hashes as well as scalars. Locking an array,
though, will not block subsequent locks on array elements, just lock
attempts on the array itself.

Finally, locks are recursive, which means it's okay for a thread to
lock a variable more than once. The lock will last until the outermost
lock() on the variable goes out of scope.
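
The read-modify-write race from the previous section goes away once every thread takes the lock first. Here is a minimal sketch using the newer threads and threads::shared modules, where variables must be marked ":shared" explicitly. It assumes a perl built with ithreads; bump and run_bumpers are hypothetical helpers.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter :shared = 0;

# Hypothetical worker: increments the shared counter $times times.
sub bump {
    my ($times) = @_;
    for (1 .. $times) {
        lock($counter);          # held until the end of this loop body
        my $copy = $counter;     # read ...
        $counter = $copy + 1;    # ... and write back, with no one in between
    }
    return;
}

# Hypothetical driver: reset the counter, run the workers, report the total.
sub run_bumpers {
    my ($thread_count, $times) = @_;
    { lock($counter); $counter = 0; }
    my @thr = map { threads->create(\&bump, $times) } 1 .. $thread_count;
    $_->join() for @thr;
    return $counter;
}

print run_bumpers(4, 1000), "\n";   # prints 4000
```

As with the old lock(), the lock is advisory: it only helps if every thread that touches $counter locks it first.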

Thread Pitfall: Deadlocks
Locks are a handy tool to synchronize access to data. Using them
properly is the key to safe shared data. Unfortunately, locks aren't
without their dangers. Consider the following code:

    use Thread qw(async yield);
    $a = 4;
    $b = "foo";
    async {
        lock($a);
        yield;
        sleep 20;
        lock ($b);
    };
    async {
        lock($b);
        yield;
        sleep 20;
        lock ($a);
    };

This program will probably hang until you kill it. The only way it
won't hang is if one of the two async() routines acquires both locks
first. A guaranteed-to-hang version is more complicated, but the
principle is the same.

The first thread spawned by async() will grab a lock on $a then, a
second or two later, try to grab a lock on $b. Meanwhile, the second
thread grabs a lock on $b, then later tries to grab a lock on $a. The
second lock attempt for both threads will block, each waiting for the
other to release its lock.

This condition is called a deadlock, and it occurs whenever two or more
threads are trying to get locks on resources that the others own. Each
thread will block, waiting for the other to release a lock on a
resource. That never happens, though, since the thread with the
resource is itself waiting for a lock to be released.

There are a number of ways to handle this sort of problem. The best
way is to always have all threads acquire locks in the exact same
order. If, for example, you lock variables $a, $b, and $c, always lock
$a before $b, and $b before $c. It's also best to hold on to locks for
as short a period of time as possible to minimize the risks of deadlock.
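
The "same order everywhere" rule can be sketched as follows, again with the newer threads and threads::shared modules and assuming a perl built with ithreads. The variable names and the update_both helper are made up for the example. Because both threads go through one subroutine that always locks $x before $y, neither can end up holding one lock while waiting for the other in reverse.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $x :shared = 4;
my $y :shared = "foo";

# Hypothetical helper: every caller takes the locks in the same order.
sub update_both {
    my ($tag) = @_;
    lock($x);       # always first
    lock($y);       # always second
    $x++;
    $y .= $tag;
    return;         # both locks are released when the subroutine returns
}

my @thr = map { threads->create(\&update_both, $_) } qw(a b);
$_->join() for @thr;
print "$x ", length($y), "\n";   # prints 6 5
```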

Queues: Passing Data Around
A queue is a special thread-safe object that lets you put data in one
end and take it out the other without having to worry about
synchronization issues. They're pretty straightforward, and look like
this:

    use Thread qw(async);
    use Thread::Queue;

    my $DataQueue = Thread::Queue->new();
    $thr = async {
        # Test with defined() so that false values such as 0 or ""
        # don't end the loop early; only undef stops it.
        while (defined($DataElement = $DataQueue->dequeue)) {
            print "Popped $DataElement off the queue\n";
        }
    };

    $DataQueue->enqueue(12);
    $DataQueue->enqueue("A", "B", "C");
    sleep 10;
    $DataQueue->enqueue(undef);   # Tell the reader there's no more data
    $thr->join;

You create the queue with "Thread::Queue->new". Then you can add lists
of scalars onto the end with enqueue(), and pop scalars off the front
of it with dequeue(). A queue has no fixed size, and can grow as
needed to hold everything pushed on to it.

If a queue is empty, dequeue() blocks until another thread enqueues
something. This makes queues ideal for event loops and other
communications between threads.

In addition to providing thread-safe access to data via locks and
queues, threaded Perl also provides general-purpose semaphores for
coarser synchronization than locks provide and thread-safe access to
entire subroutines.

Semaphores: Synchronizing Data Access
Semaphores are a kind of generic locking mechanism. Unlike lock, which
gets a lock on a particular scalar, Perl doesn't associate any
particular thing with a semaphore so you can use them to control access
to anything you like. In addition, semaphores can allow more than one
thread to access a resource at once, though by default semaphores only
allow one thread access at a time.

Basic semaphores
Semaphores have two methods, down and up. down decrements the
resource count, while up increments it. down calls will block if
the semaphore's current count would decrement below zero. This
program gives a quick demonstration:

    use Thread qw(yield);
    use Thread::Semaphore;
    my $semaphore = Thread::Semaphore->new();
    $GlobalVariable = 0;

    $thr1 = Thread->new( \&sample_sub, 1 );
    $thr2 = Thread->new( \&sample_sub, 2 );
    $thr3 = Thread->new( \&sample_sub, 3 );

    sub sample_sub {
        my $SubNumber = shift @_;
        my $TryCount = 10;
        my $LocalCopy;
        sleep 1;
        while ($TryCount--) {
            $semaphore->down;
            $LocalCopy = $GlobalVariable;
            print "$TryCount tries left for sub $SubNumber (\$GlobalVariable is $GlobalVariable)\n";
            yield;
            sleep 2;
            $LocalCopy++;
            $GlobalVariable = $LocalCopy;
            $semaphore->up;
        }
    }

The three invocations of the subroutine all operate in sync. The
semaphore, though, makes sure that only one thread is accessing the
global variable at once.

Advanced Semaphores
By default, semaphores behave like locks, letting only one thread
down() them at a time. However, there are other uses for
semaphores.

Each semaphore has a counter attached to it. down() decrements the
counter and up() increments the counter. By default, semaphores
are created with the counter set to one, down() decrements by one,
and up() increments by one. If down() attempts to decrement the
counter below zero, it blocks until the counter is large enough.
Note that while a semaphore can be created with a starting count of
zero, any up() or down() always changes the counter by at least
one. $semaphore->down(0) is the same as $semaphore->down(1).

The question, of course, is why would you do something like this?
Why create a semaphore with a starting count that's not one, or why
decrement/increment it by more than one? The answer is resource
availability. Many resources that you want to manage access for
can be safely used by more than one thread at once.

For example, let's take a GUI driven program. It has a semaphore
that it uses to synchronize access to the display, so only one
thread is ever drawing at once. Handy, but of course you don't
want any thread to start drawing until things are properly set up.
In this case, you can create a semaphore with a counter set to
zero, and up it when things are ready for drawing.

Semaphores with counters greater than one are also useful for
establishing quotas. Say, for example, that you have a number of
threads that can do I/O at once. You don't want all the threads
reading or writing at once though, since that can potentially swamp
your I/O channels, or deplete your process' quota of filehandles.
You can use a semaphore initialized to the number of concurrent I/O
requests (or open files) that you want at any one time, and have
your threads quietly block and unblock themselves.

Larger increments or decrements are handy in those cases where a
thread needs to check out or return a number of resources at once.
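
A quota of this kind might look like the sketch below, which lets at most two of six threads into a section at a time and records the highest number seen inside. It uses the newer threads, threads::shared and Thread::Semaphore modules and assumes a perl built with ithreads. The limit of two and the limited_job helper are arbitrary choices for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $slots = Thread::Semaphore->new(2);   # quota: two threads at a time
my $active :shared = 0;                  # threads currently inside
my $peak   :shared = 0;                  # highest value $active reached

# Hypothetical job: stands in for an I/O request limited by the quota.
sub limited_job {
    $slots->down();                      # blocks while both slots are taken
    {
        lock($active);                   # also guards $peak
        $active++;
        $peak = $active if $active > $peak;
    }
    threads->yield();                    # give other threads a chance to run
    {
        lock($active);
        $active--;
    }
    $slots->up();                        # hand the slot back
    return;
}

my @thr = map { threads->create(\&limited_job) } 1 .. 6;
$_->join() for @thr;
print "peak concurrency: $peak\n";       # never more than 2
```

The exact peak depends on scheduling, but the semaphore keeps it from exceeding the starting count.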

Attributes: Restricting Access To Subroutines
In addition to synchronizing access to data or resources, you might
find it useful to synchronize access to subroutines. You may be
accessing a singular machine resource (perhaps a vector processor), or
find it easier to serialize calls to a particular subroutine than to
have a set of locks and semaphores.

One of the additions to Perl 5.005 is subroutine attributes. The
Thread package uses these to provide several flavors of serialization.
It's important to remember that these attributes are used in the
compilation phase of your program so you can't change a subroutine's
behavior while your program is actually running.

Subroutine Locks
The basic subroutine lock looks like this:

    sub test_sub :locked {
    }

This ensures that only one thread will be executing this subroutine at
any one time. Once a thread calls this subroutine, any other thread
that calls it will block until the thread in the subroutine exits it.
A more elaborate example looks like this:

    use Thread qw(yield);

    Thread->new(\&thread_sub, 1);
    Thread->new(\&thread_sub, 2);
    Thread->new(\&thread_sub, 3);
    Thread->new(\&thread_sub, 4);

    sub sync_sub :locked {
        my $CallingThread = shift @_;
        print "In sync_sub for thread $CallingThread\n";
        yield;
        sleep 3;
        print "Leaving sync_sub for thread $CallingThread\n";
    }

    sub thread_sub {
        my $ThreadID = shift @_;
        print "Thread $ThreadID calling sync_sub\n";
        sync_sub($ThreadID);
        print "$ThreadID is done with sync_sub\n";
    }

The "locked" attribute tells perl to lock sync_sub(), and if you run
this, you can see that only one thread is in it at any one time.
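
Where the "locked" attribute isn't available, as with the newer threads module, a similar one-at-a-time effect can be approximated by locking a shared variable at the top of the subroutine. This is a substitute technique rather than the attribute mechanism described here. The sketch assumes a perl built with ithreads; the $gate variable and the counters are invented for illustration.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $gate   :shared;        # exists only to be locked
my $inside :shared = 0;    # threads currently in sync_sub
my $peak   :shared = 0;    # highest value $inside reached

sub sync_sub {
    lock($gate);           # released when sync_sub returns
    $inside++;
    $peak = $inside if $inside > $peak;
    threads->yield();      # invite another thread to try to get in
    $inside--;
    return;
}

my @thr = map { threads->create(\&sync_sub) } 1 .. 4;
$_->join() for @thr;
print "peak threads inside sync_sub: $peak\n";   # prints 1
```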

Methods
Locking an entire subroutine can sometimes be overkill, especially when
dealing with Perl objects. When calling a method for an object, for
example, you want to serialize calls to a method, so that only one
thread will be in the subroutine for a particular object, but threads
calling that subroutine for a different object aren't blocked. The
method attribute indicates whether the subroutine is really a method.

    use Thread qw(yield);

    sub tester {
        my $thrnum = shift @_;
        my $bar = Foo->new();
        foreach (1..10) {
            print "$thrnum calling per_object\n";
            $bar->per_object($thrnum);
            print "$thrnum out of per_object\n";
            yield;
            print "$thrnum calling one_at_a_time\n";
            $bar->one_at_a_time($thrnum);
            print "$thrnum out of one_at_a_time\n";
            yield;
        }
    }

    foreach my $thrnum (1..10) {
        Thread->new(\&tester, $thrnum);
    }

    package Foo;
    use Thread qw(yield);   # yield must be imported into this package too

    sub new {
        my $class = shift @_;
        return bless [@_], $class;
    }

    # Locked per object: threads using different objects don't block
    sub per_object :locked :method {
        my ($self, $thrnum) = @_;
        print "In per_object for thread $thrnum\n";
        yield;
        sleep 2;
        print "Exiting per_object for thread $thrnum\n";
    }

    # Locked as a whole: only one thread at a time, whatever the object
    sub one_at_a_time :locked {
        my ($self, $thrnum) = @_;
        print "In one_at_a_time for thread $thrnum\n";
        yield;
        sleep 2;
        print "Exiting one_at_a_time for thread $thrnum\n";
    }

As you can see from the output (omitted for brevity; it's 800 lines)
all the threads can be in per_object() simultaneously, but only one
thread is ever in one_at_a_time() at once.
781
Locking A Subroutine
You can lock a subroutine as you would lock a variable. Subroutine
locks work the same as specifying a "locked" attribute for the
subroutine, and block all access to the subroutine for other threads
until the lock goes out of scope. When the subroutine isn't locked,
any number of threads can be in it at once, and getting a lock on a
subroutine doesn't affect threads already in the subroutine. Getting a
lock on a subroutine looks like this:

    lock(\&sub_to_lock);

Simple enough. Unlike the "locked" attribute, which is a compile-time
option, locking and unlocking a subroutine can be done at runtime at
your discretion. There is some runtime penalty to using lock(\&sub)
instead of the "locked" attribute, so make sure you're choosing the
proper method to do the locking.

You'd choose lock(\&sub) when writing modules and code to run on both
threaded and unthreaded Perl, especially for code that will run on
5.004 or earlier Perls. In that case, it's useful to have subroutines
that should be serialized lock themselves if they're running threaded,
like so:

    package Foo;
    use Config;
    use vars qw($Running_Threaded);

    # Set the flag at compile time. A plain run-time assignment such as
    # "$Running_Threaded = 0;" would overwrite it after BEGIN has run.
    BEGIN { $Running_Threaded = $Config{'usethreads'} ? 1 : 0 }

    sub sub1 { lock(\&sub1) if $Running_Threaded }

This way you can ensure single-threadedness regardless of which version
of Perl you're running.
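
As a rough, stand-alone check of the build-time setting that the
snippet above relies on, the sketch below reads %Config and reports
which threading model, if any, the running perl was built with. The
threading_model() helper is a name made up for this illustration;
use5005threads and useithreads are keys of the standard Config module,
and which of them are set depends on how your perl was built.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: classify the threading support of this perl
# by looking at the build-time configuration.
sub threading_model {
    return '5005threads' if $Config{use5005threads};   # old model
    return 'ithreads'    if $Config{useithreads};      # newer model
    return 'none';                                     # unthreaded build
}

print "Threading model: ", threading_model(), "\n";
```

On an unthreaded perl this reports "none", which is the case where a
guarded lock(\&sub) call is simply skipped.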
815
We've covered the workhorse parts of Perl's threading package, and with
these tools you should be well on your way to writing threaded code and
packages. There are a few useful little pieces that didn't really fit
in anywhere else.

What Thread Am I In?
The Thread->self method provides your program with a way to get an
object representing the thread it's currently in. You can use this
object in the same way as the ones returned from thread creation.

Thread IDs
tid() is a thread object method that returns the thread ID of the
thread the object represents. Thread IDs are integers, with the main
thread in a program being 0. Currently Perl assigns a unique tid to
every thread ever created in your program, giving the first thread
created a tid of 1 and increasing the tid by 1 for each new thread
that's created.

Are These Threads The Same?
The equal() method takes two thread objects and returns true if the
objects represent the same thread, and false if they don't.

What Threads Are Running?
Thread->list returns a list of thread objects, one for each thread
that's currently running. This is handy for a number of things,
including cleaning up at the end of your program:

    # Loop through all the threads
    foreach my $thr (Thread->list) {
        # Don't join the main thread or ourselves
        if ($thr->tid && !Thread::equal($thr, Thread->self)) {
            $thr->join;
        }
    }

The example above is just for illustration. It isn't strictly
necessary to join all the threads you create, since Perl detaches all
the threads before it exits.
855
Confused yet? It's time for an example program to show some of the
things we've covered. This program finds prime numbers using threads.

     1  #!/usr/bin/perl -w
     2  # prime-pthread, courtesy of Tom Christiansen
     3
     4  use strict;
     5
     6  use Thread;
     7  use Thread::Queue;
     8
     9  my $stream = Thread::Queue->new();
    10  my $kid    = Thread->new(\&check_num, $stream, 2);
    11
    12  for my $i ( 3 .. 1000 ) {
    13      $stream->enqueue($i);
    14  }
    15
    16  $stream->enqueue(undef);
    17  $kid->join();
    18
    19  sub check_num {
    20      my ($upstream, $cur_prime) = @_;
    21      my $kid;
    22      my $downstream = Thread::Queue->new();
    23      while (my $num = $upstream->dequeue) {
    24          next unless $num % $cur_prime;
    25          if ($kid) {
    26              $downstream->enqueue($num);
    27          } else {
    28              print "Found prime $num\n";
    29              $kid = Thread->new(\&check_num, $downstream, $num);
    30          }
    31      }
    32      $downstream->enqueue(undef) if $kid;
    33      $kid->join()                if $kid;
    34  }
894
This program uses the pipeline model to generate prime numbers. Each
thread in the pipeline has an input queue that feeds numbers to be
checked, a prime number that it's responsible for, and an output queue
into which it funnels numbers that have failed the check (that is,
numbers that aren't divisible by its prime). If the thread has a number
that's failed its check and there's no child thread, then the thread
must have found a new prime number. In that case, a new child thread
is created for that prime and stuck on the end of the pipeline.

This probably sounds a bit more confusing than it really is, so let's
go through this program piece by piece and see what it does. (For
those of you who might be trying to remember exactly what a prime
number is, it's a number that's only evenly divisible by itself and 1.)

The bulk of the work is done by the check_num() subroutine, which takes
a reference to its input queue and a prime number that it's responsible
for. After pulling in the input queue and the prime that the
subroutine's checking (line 20), we create a new queue (line 22) and
reserve a scalar for the thread that we're likely to create later (line
21).

The while loop from line 23 to line 31 grabs a scalar off the input
queue and checks it against the prime this thread is responsible for.
Line 24 checks to see if there's a remainder when we divide the number
to be checked by our prime. If there is one, the number must not be
evenly divisible by our prime, so we need to either pass it on to the
next thread if we've created one (line 26) or create a new thread if we
haven't.

The new thread creation is line 29. We pass on to it a reference to
the queue we've created, and the prime number we've found.

Finally, once the loop terminates (because we got a 0 or undef in the
queue, which serves as a note to die), we pass on the notice to our
child and wait for it to exit if we've created a child (lines 32 and
33).
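
The loop condition on line 23 tests for truth, not definedness, which
is why a 0 ends the loop just as undef does. The sketch below shows
the difference using an ordinary array in place of a Thread::Queue, so
it runs without thread support. drain_until_false() and
drain_until_undef() are illustrative names, not part of any module.

```perl
use strict;
use warnings;

# Drain a list the way line 23 drains the queue: stop on any false value.
sub drain_until_false {
    my @pending = @_;
    my @seen;
    while (my $item = shift @pending) {
        push @seen, $item;
    }
    return @seen;
}

# Variant that stops only on undef, so a queued 0 is treated as data.
sub drain_until_undef {
    my @pending = @_;
    my @seen;
    while (defined(my $item = shift @pending)) {
        push @seen, $item;
    }
    return @seen;
}

my @input = (3, 0, 5, undef, 7);
print scalar(drain_until_false(@input)), "\n";   # 1 item before the 0
print scalar(drain_until_undef(@input)), "\n";   # 3 items before the undef
```

The prime finder never queues a 0, so the simpler truth test is enough
there; a program that needs to pass 0 through a queue would want the
defined() form.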
931
Meanwhile, back in the main thread, we create a queue (line 9) and the
initial child thread (line 10), and pre-seed it with the first prime:
2. Then we queue all the numbers from 3 to 1000 for checking (lines
12-14), then queue a die notice (line 16) and wait for the first child
thread to terminate (line 17). Because a child won't die until its
child has died, we know that we're done once we return from the join.

That's how it works. It's pretty simple; as with many Perl programs,
the explanation is much longer than the program.
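
To look at the filtering logic on its own, here is a single-threaded
sketch of the same chain of filters. It is not the threaded program:
an array of primes stands in for the pipeline of threads, and the
sieve_pipeline() name is made up for this illustration.

```perl
use strict;
use warnings;

# Single-threaded model of the filter chain: @stage_primes plays the
# role of the threads, one prime per stage, in pipeline order.
sub sieve_pipeline {
    my ($limit) = @_;
    my @stage_primes = (2);      # the first stage is pre-seeded with 2
    for my $num (3 .. $limit) {
        my $survived = 1;
        for my $prime (@stage_primes) {
            # Same test as line 24: drop multiples of this stage's prime.
            unless ($num % $prime) { $survived = 0; last }
        }
        # A number that passes every stage is prime and becomes a new
        # stage at the end of the chain.
        push @stage_primes, $num if $survived;
    }
    return @stage_primes;
}

my @primes = sieve_pipeline(30);
print "@primes\n";    # 2 3 5 7 11 13 17 19 23 29
```

In the threaded version each stage runs concurrently and talks to the
next through a queue, but the numbers each stage passes along are the
same.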
941
A complete thread tutorial could fill a book (and has, many times), but
this should get you started. The final authority on how Perl's threads
behave is the documentation bundled with the Perl distribution, but
with what we've covered in this article, you should be well on your way
to becoming a threaded Perl expert.
948
Here's a short bibliography courtesy of Jürgen Christoffel:

Introductory Texts
Birrell, Andrew D. An Introduction to Programming with Threads. Digital
Equipment Corporation, 1989, DEC-SRC Research Report #35, online at
http://www.research.digital.com/SRC/staff/birrell/bib.html (highly
recommended).

Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A
Guide to Concurrency, Communication, and Multithreading. Prentice-Hall,
1996.

Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with
Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written
introduction to threads).

Nelson, Greg (editor). Systems Programming with Modula-3. Prentice
Hall, 1991, ISBN 0-13-590464-1.

Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell.
Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1
(covers POSIX threads).

OS-Related References
Boykin, Joseph, David Kirschen, Alan Langerman, and Susan LoVerso.
Programming under Mach. Addison-Wesley, 1994, ISBN 0-201-52739-1.

Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall,
1995, ISBN 0-13-219908-4 (great textbook).

Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts,
4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4.

Other References
Arnold, Ken, and James Gosling. The Java Programming Language, 2nd ed.
Addison-Wesley, 1998, ISBN 0-201-31006-6.

Le Sergent, T., and B. Berthomieu. "Incremental MultiThreaded Garbage
Collection on Virtually Shared Memory Architectures" in Memory
Management: Proc. of the International Workshop IWMM 92, St. Malo,
France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer,
1992, ISBN 3-540-55940-X (real-life thread applications).
992
Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy
Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel,
Joshua Pritikin, and Alan Burlison, for their help in reality-checking
and polishing this article. Big thanks to Tom Christiansen for his
rewrite of the prime number generator.
999
Dan Sugalski <sugalskd@ous.edu>

This article originally appeared in The Perl Journal #10, and is
copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and
The Perl Journal. This document may be distributed under the same
terms as Perl itself.

perl v5.10.1 2009-04-12 PERLOTHRTUT(1)