perlthrtut(1)

PERLTHRTUT(1)          Perl Programmers Reference Guide          PERLTHRTUT(1)

NAME

       perlthrtut - Tutorial on threads in Perl

DESCRIPTION

       This tutorial describes the use of Perl interpreter threads (sometimes
       referred to as ithreads), which were first introduced in Perl 5.6.0.  In
       this model, each thread runs in its own Perl interpreter, and any data
       sharing between threads must be explicit.  The user-level interface for
       ithreads uses the threads class.

       NOTE: There was another older Perl threading flavor called the 5.005
       model that used the Threads class.  This old model was known to have
       problems, is deprecated, and was removed for release 5.10.  You are
       strongly encouraged to migrate any existing 5.005 threads code to the
       new model as soon as possible.

       You can see which threading flavour you have (if any) by running
       "perl -V" and looking at the "Platform" section.  If you have
       "useithreads=define", you have ithreads; if you have
       "use5005threads=define", you have 5.005 threads.  If you have neither,
       you don't have any thread support built in.  If you have both, you are
       in trouble.

       The threads and threads::shared modules are included in the core Perl
       distribution.  Additionally, they are maintained as separate modules
       on CPAN, so you can check there for any updates.

What Is A Thread Anyway?

       A thread is a flow of control through a program with a single execution
       point.

       Sounds an awful lot like a process, doesn't it? Well, it should.
       Threads are one of the pieces of a process.  Every process has at least
       one thread and, up until now, every process running Perl had only one
       thread.  With 5.8, though, you can create extra threads.  We're going
       to show you how, when, and why.

Threaded Program Models

       There are three basic ways that you can structure a threaded program.
       Which model you choose depends on what you need your program to do.
       For many non-trivial threaded programs, you'll need to choose different
       models for different pieces of your program.

   Boss/Worker
       The boss/worker model usually has one boss thread and one or more
       worker threads.  The boss thread gathers or generates tasks that need
       to be done, then parcels those tasks out to the appropriate worker
       thread.

       This model is common in GUI and server programs, where a main thread
       waits for some event and then passes that event to the appropriate
       worker threads for processing.  Once the event has been passed on, the
       boss thread goes back to waiting for another event.

       The boss thread does relatively little work.  While tasks aren't
       necessarily performed faster than with any other method, this model
       tends to have the best user-response times.

   Work Crew
       In the work crew model, several threads are created that do essentially
       the same thing to different pieces of data.  It closely mirrors
       classical parallel processing and vector processors, where a large
       array of processors do the exact same thing to many pieces of data.

       This model is particularly useful if the system running the program
       will distribute multiple threads across different processors.  It can
       also be useful in ray tracing or rendering engines, where the
       individual threads can pass on interim results to give the user visual
       feedback.

   Pipeline
       The pipeline model divides up a task into a series of steps, and passes
       the results of one step on to the thread processing the next.  Each
       thread does one thing to each piece of data and passes the results to
       the next thread in line.

       This model makes the most sense if you have multiple processors so two
       or more threads will be executing in parallel, though it can often make
       sense in other contexts as well.  It tends to keep the individual tasks
       small and simple, as well as allowing some parts of the pipeline to
       block (on I/O or system calls, for example) while other parts keep
       going.  If you're running different parts of the pipeline on different
       processors you may also take advantage of the caches on each processor.

       This model is also handy for a form of recursive programming where,
       rather than having a subroutine call itself, it instead creates another
       thread.  Prime and Fibonacci generators both map well to this form of
       the pipeline model. (A version of a prime number generator is presented
       later on.)

What kind of threads are Perl threads?

       If you have experience with other thread implementations, you might
       find that things aren't quite what you expect.  It's very important to
       remember when dealing with Perl threads that Perl Threads Are Not X
       Threads for all values of X.  They aren't POSIX threads, or DecThreads,
       or Java's Green threads, or Win32 threads.  There are similarities, and
       the broad concepts are the same, but if you start looking for
       implementation details you're going to be either disappointed or
       confused.  Possibly both.

       This is not to say that Perl threads are completely different from
       everything that's ever come before -- they're not.  Perl's threading
       model owes a lot to other thread models, especially POSIX.  Just as
       Perl is not C, though, Perl threads are not POSIX threads.  So if you
       find yourself looking for mutexes, or thread priorities, it's time to
       step back a bit and think about what you want to do and how Perl can do
       it.

       However, it is important to remember that Perl threads cannot magically
       do things unless your operating system's threads allow it. So if your
       system blocks the entire process on "sleep()", Perl usually will, as
       well.

       Perl Threads Are Different.

Thread-Safe Modules

       The addition of threads has changed Perl's internals substantially.
       There are implications for people who write modules with XS code or
       external libraries. However, since Perl data is not shared among
       threads by default, Perl modules stand a high chance of being
       thread-safe or can be made thread-safe easily.  Modules that are not
       tagged as thread-safe should be tested or code reviewed before being
       used in production code.

       Not all modules that you might use are thread-safe, and you should
       always assume a module is unsafe unless the documentation says
       otherwise.  This includes modules that are distributed as part of the
       core.  Threads are a relatively new feature, and even some of the
       standard modules aren't thread-safe.

       Even if a module is thread-safe, it doesn't mean that the module is
       optimized to work well with threads. A module could possibly be
       rewritten to utilize the new features in threaded Perl to increase
       performance in a threaded environment.

       If you're using a module that's not thread-safe for some reason, you
       can protect yourself by using it from one, and only one, thread.
       If you need multiple threads to access such a module, you can use
       semaphores and lots of programming discipline to control access to it.
       Semaphores are covered in "Basic semaphores".

       See also "Thread-Safety of System Libraries".

Thread Basics

       The threads module provides the basic functions you need to write
       threaded programs.  In the following sections, we'll cover the basics,
       showing you what you need to do to create a threaded program.   After
       that, we'll go over some of the features of the threads module that
       make threaded programming easier.

   Basic Thread Support
       Thread support is a Perl compile-time option -- it's something that's
       turned on or off when Perl is built at your site, rather than when your
       programs are compiled. If your Perl wasn't compiled with thread support
       enabled, then any attempt to use threads will fail.

       Your programs can use the Config module to check whether threads are
       enabled. If your program can't run without them, you can say something
       like:

           use Config;
           $Config{useithreads} or die('Recompile Perl with threads to run this program.');

       A possibly-threaded program using a possibly-threaded module might have
       code like this:

           use Config;
           use MyMod;

           BEGIN {
               if ($Config{useithreads}) {
                   # We have threads
                   require MyMod_threaded;
                   MyMod_threaded->import();
               } else {
                   # No thread support: fall back to the unthreaded version
                   require MyMod_unthreaded;
                   MyMod_unthreaded->import();
               }
           }

       Since code that runs both with and without threads is usually pretty
       messy, it's best to isolate the thread-specific code in its own module.
       In our example above, that's what "MyMod_threaded" is, and it's only
       imported if we're running on a threaded Perl.

   A Note about the Examples
       In a real situation, care should be taken that all threads are finished
       executing before the program exits.  That care has not been taken in
       these examples in the interest of simplicity.  Running these examples
       as is will produce error messages, usually caused by the fact that
       there are still threads running when the program exits.  You should not
       be alarmed by this.

   Creating Threads
       The threads module provides the tools you need to create new threads.
       Like any other module, you need to tell Perl that you want to use it;
       "use threads;" imports all the pieces you need to create basic threads.

       The simplest, most straightforward way to create a thread is with
       "create()":

           use threads;

           my $thr = threads->create(\&sub1);

           sub sub1 {
               print("In the thread\n");
           }

       The "create()" method takes a reference to a subroutine and creates a
       new thread that starts executing in the referenced subroutine.  Control
       then passes to both the subroutine and the caller.

       If you need to, your program can pass parameters to the subroutine as
       part of the thread startup.  Just include the list of parameters as
       part of the "threads->create()" call, like this:

           use threads;

           my $Param3 = 'foo';
           my $thr1 = threads->create(\&sub1, 'Param 1', 'Param 2', $Param3);
           my @ParamList = (42, 'Hello', 3.14);
           my $thr2 = threads->create(\&sub1, @ParamList);
           my $thr3 = threads->create(\&sub1, qw(Param1 Param2 Param3));

           sub sub1 {
               my @InboundParameters = @_;
               print("In the thread\n");
               print('Got parameters >', join('<>', @InboundParameters), "<\n");
           }

       The last example illustrates another feature of threads.  You can spawn
       off several threads using the same subroutine.  Each thread executes
       the same subroutine, but in a separate thread with a separate
       environment and potentially separate arguments.

       "new()" is a synonym for "create()".

   Waiting For A Thread To Exit
       Since threads are also subroutines, they can return values.  To wait
       for a thread to exit and extract any values it might return, you can
       use the "join()" method:

           use threads;

           my ($thr) = threads->create(\&sub1);

           my @ReturnData = $thr->join();
           print('Thread returned ', join(', ', @ReturnData), "\n");

           sub sub1 { return ('Fifty-six', 'foo', 2); }

       In the example above, the "join()" method returns as soon as the thread
       ends.  In addition to waiting for a thread to finish and gathering up
       any values that the thread might have returned, "join()" also performs
       any OS cleanup necessary for the thread.  That cleanup might be
       important, especially for long-running programs that spawn lots of
       threads.  If you don't want the return values and don't want to wait
       for the thread to finish, you should call the "detach()" method
       instead, as described next.

       NOTE: In the example above, the thread returns a list, thus
       necessitating that the thread creation call be made in list context
       (i.e., "my ($thr)").  See "$thr->join()" in threads and "THREAD
       CONTEXT" in threads for more details on thread context and return
       values.
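
       The effect of thread context can be seen in a minimal sketch like the
       one below.  The subroutine name "context_name" and the variable names
       are illustrative only; the sketch assumes that "wantarray" inside the
       thread's entry subroutine reflects the context in which
       "threads->create()" was called, as described under "THREAD CONTEXT"
       in threads.

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: reports the context the thread was created in.
sub context_name {
    return wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
}

my ($thr_list) = threads->create(\&context_name);   # created in list context
my $thr_scalar = threads->create(\&context_name);   # created in scalar context

my @list_result   = $thr_list->join();
my $scalar_result = $thr_scalar->join();

print("list-context thread saw: $list_result[0]\n");
print("scalar-context thread saw: $scalar_result\n");
```

       A thread created in void context discards its return value, so
       "join()" on such a thread has nothing to hand back.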

   Ignoring A Thread
       "join()" does three things: it waits for a thread to exit, cleans up
       after it, and returns any data the thread may have produced.  But what
       if you're not interested in the thread's return values, and you don't
       really care when the thread finishes? All you want is for the thread to
       be cleaned up when it's done.

       In this case, you use the "detach()" method.  Once a thread is
       detached, it'll run until it's finished; then Perl will clean up after
       it automatically.

           use threads;

           my $thr = threads->create(\&sub1);   # Spawn the thread

           $thr->detach();   # Now we officially don't care any more

           sleep(15);        # Let thread run for a while

           sub sub1 {
               $a = 0;
               while (1) {
                   $a++;
                   print("\$a is $a\n");
                   sleep(1);
               }
           }

       Once a thread is detached, it may not be joined, and any return data
       that it might have produced (if it was done and waiting for a join) is
       lost.

       "detach()" can also be called as a class method to allow a thread to
       detach itself:

           use threads;

           my $thr = threads->create(\&sub1);

           sub sub1 {
               threads->detach();
               # Do more work
           }

   Process and Thread Termination
       With threads one must be careful to make sure they all have a chance to
       run to completion, assuming that is what you want.

       An action that terminates a process will terminate all running threads.
       die() and exit() have this property, and perl does an exit when the
       main thread exits, perhaps implicitly by falling off the end of your
       code, even if that's not what you want.

       As an example of this case, this code prints the message "Perl exited
       with active threads: 2 running and unjoined":

           use threads;
           my $thr1 = threads->new(\&thrsub, "test1");
           my $thr2 = threads->new(\&thrsub, "test2");
           sub thrsub {
              my ($message) = @_;
              sleep 1;
              print "thread $message\n";
           }

       But when the following lines are added at the end:

           $thr1->join();
           $thr2->join();

       it prints two lines of output, a perhaps more useful outcome.
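
       When a program starts more than a couple of threads, joining them one
       by one gets tedious.  The following is a small sketch, not taken from
       the original examples, that keeps the thread objects in an array and
       joins them all before the main thread exits.  The subroutine name
       "double" and the array names are made up for illustration.

```perl
use strict;
use warnings;
use threads;

# Hypothetical worker: returns twice its argument.
sub double {
    my ($n) = @_;
    return $n * 2;
}

# map supplies list context, so each thread may return a list.
my @threads = map { threads->create(\&double, $_) } 1 .. 3;

# Join every thread, in creation order, before falling off the end.
my @results = map { $_->join() } @threads;

print("results: @results\n");   # prints "results: 2 4 6"
```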

Threads And Data

       Now that we've covered the basics of threads, it's time for our next
       topic: Data.  Threading introduces a couple of complications to data
       access that non-threaded programs never need to worry about.

   Shared And Unshared Data
       The biggest difference between Perl ithreads and the old 5.005 style
       threading (or, for that matter, most other threading systems out
       there) is that by default, no data is shared. When a new Perl thread is
       created, all the data associated with the current thread is copied to
       the new thread, and is subsequently private to that new thread!  This
       is similar in feel to what happens when a UNIX process forks, except
       that in this case, the data is just copied to a different part of
       memory within the same process rather than a real fork taking place.

       To make use of threading, however, one usually wants the threads to
       share at least some data between themselves. This is done with the
       threads::shared module and the ":shared" attribute:

           use threads;
           use threads::shared;

           my $foo :shared = 1;
           my $bar = 1;
           threads->create(sub { $foo++; $bar++; })->join();

           print("$foo\n");  # Prints 2 since $foo is shared
           print("$bar\n");  # Prints 1 since $bar is not shared

       In the case of a shared array, all the array's elements are shared, and
       for a shared hash, all the keys and values are shared. This places
       restrictions on what may be assigned to shared array and hash elements:
       only simple values or references to shared variables are allowed - this
       is so that a private variable can't accidentally become shared. A bad
       assignment will cause the thread to die. For example:

           use threads;
           use threads::shared;

           my $var          = 1;
           my $svar :shared = 2;
           my %hash :shared;

           # ... create some threads ...

           $hash{a} = 1;       # All threads see exists($hash{a}) and $hash{a} == 1
           $hash{a} = $var;    # okay - copy-by-value: same effect as previous
           $hash{a} = $svar;   # okay - copy-by-value: same effect as previous
           $hash{a} = \$svar;  # okay - a reference to a shared variable
           $hash{a} = \$var;   # This will die
           delete($hash{a});   # okay - all threads will see !exists($hash{a})

       Note that a shared variable guarantees that if two or more threads try
       to modify it at the same time, the internal state of the variable will
       not become corrupted. However, there are no guarantees beyond this, as
       explained in the next section.
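
       As a rough illustration of the assignment rules above, the sketch
       below traps the fatal assignment with "eval" and then stores a nested
       structure using "shared_clone()" from threads::shared, which returns a
       shared copy of its argument.  The variable names are illustrative, and
       the exact wording of the error message may differ between versions.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %hash :shared;
my $private = 1;

# Storing a reference to an unshared variable dies; trap it with eval.
my $ok    = eval { $hash{a} = \$private; 1 };
my $error = $ok ? '' : $@;
print("bad assignment failed: $error") if !$ok;

# shared_clone() makes a shared copy of the anonymous array.
$hash{list} = shared_clone([1, 2, 3]);

# Another thread can modify the shared nested array.
my $thr = threads->create(sub { push(@{ $hash{list} }, 4); return; });
$thr->join();

my $count = scalar(@{ $hash{list} });
print("list now has $count elements\n");
```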

   Thread Pitfalls: Races
       While threads bring a new set of useful tools, they also bring a number
       of pitfalls.  One pitfall is the race condition:

           use threads;
           use threads::shared;

           my $a :shared = 1;
           my $thr1 = threads->create(\&sub1);
           my $thr2 = threads->create(\&sub2);

           $thr1->join();
           $thr2->join();
           print("$a\n");

           sub sub1 { my $foo = $a; $a = $foo + 1; }
           sub sub2 { my $bar = $a; $a = $bar + 1; }

       What do you think $a will be? The answer, unfortunately, is it depends.
       Both "sub1()" and "sub2()" access the shared variable $a, once to read
       and once to write.  Depending on factors ranging from your thread
       implementation's scheduling algorithm to the phase of the moon, $a can
       be 2 or 3.

       Race conditions are caused by unsynchronized access to shared data.
       Without explicit synchronization, there's no way to be sure that
       nothing has happened to the shared data between the time you access it
       and the time you update it.  Even this simple code fragment has the
       possibility of error:

           use threads;
           use threads::shared;   # needed for the :shared attribute
           my $a :shared = 2;
           my $b :shared;
           my $c :shared;
           my $thr1 = threads->create(sub { $b = $a; $a = $b + 1; });
           my $thr2 = threads->create(sub { $c = $a; $a = $c + 1; });
           $thr1->join();
           $thr2->join();

       Two threads both access $a.  Each thread can potentially be interrupted
       at any point, or be executed in any order.  At the end, $a could be 3
       or 4, and both $b and $c could be 2 or 3.

       Even "$a += 5" or "$a++" are not guaranteed to be atomic.

       Whenever your program accesses data or resources that can be accessed
       by other threads, you must take steps to coordinate access or risk data
       inconsistency and race conditions. Note that Perl will protect its
       internals from your race conditions, but it won't protect you from you.

Synchronization and control

       Perl provides a number of mechanisms to coordinate the interactions
       between threads and their data, to avoid race conditions and the
       like.  Some of these are designed to resemble the common techniques
       used in thread libraries such as "pthreads"; others are Perl-specific.
       Often, the standard techniques are clumsy and difficult to get right
       (such as condition waits). Where possible, it is usually easier to use
       Perlish techniques such as queues, which remove some of the hard work
       involved.

   Controlling access: lock()
       The "lock()" function takes a shared variable and puts a lock on it.
       No other thread may lock the variable until the variable is unlocked by
       the thread holding the lock. Unlocking happens automatically when the
       locking thread exits the block that contains the call to the "lock()"
       function.  Using "lock()" is straightforward: This example has several
       threads doing some calculations in parallel, and occasionally updating
       a running total:

           use threads;
           use threads::shared;

           my $total :shared = 0;

           sub calc {
               while (1) {
                   my $result;
                   # (... do some calculations and set $result ...)
                   {
                       lock($total);  # Block until we obtain the lock
                       $total += $result;
                   } # Lock implicitly released at end of scope
                   last if $result == 0;
               }
           }

           my $thr1 = threads->create(\&calc);
           my $thr2 = threads->create(\&calc);
           my $thr3 = threads->create(\&calc);
           $thr1->join();
           $thr2->join();
           $thr3->join();
           print("total=$total\n");

       "lock()" blocks the thread until the variable being locked is
       available.  When "lock()" returns, your thread can be sure that no
       other thread can lock that variable until the block containing the lock
       exits.

       It's important to note that locks don't prevent access to the variable
       in question, only lock attempts.  This is in keeping with Perl's
       longstanding tradition of courteous programming, and the advisory file
       locking that "flock()" gives you.
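
       Because locks are advisory, every thread that touches the data has to
       take the lock for it to do any good.  Here is a small, self-contained
       sketch (with made-up names "add_many" and $counter) in which four
       threads each increment a shared counter 1000 times.  Since each
       read-modify-write happens while the lock is held, no increments are
       lost.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter :shared = 0;

# Hypothetical worker: increments the shared counter $times times.
sub add_many {
    my ($times) = @_;
    for (1 .. $times) {
        lock($counter);   # Block until we hold the lock
        $counter++;       # Safe: no other locking thread can interleave
    }                     # Lock released at the end of each iteration
    return;
}

my @threads = map { threads->create(\&add_many, 1000) } 1 .. 4;
$_->join() for @threads;

print("counter=$counter\n");   # prints "counter=4000"
```

       Removing the "lock($counter)" line turns this back into the race
       described in "Thread Pitfalls: Races", and the final count may come
       out lower.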

       You may lock arrays and hashes as well as scalars.  Locking an array,
       though, will not block subsequent locks on array elements, just lock
       attempts on the array itself.

       Locks are recursive, which means it's okay for a thread to lock a
       variable more than once.  The lock will last until the outermost
       "lock()" on the variable goes out of scope. For example:

           my $x :shared;
           doit();

           sub doit {
               {
                   {
                       lock($x); # Wait for lock
                       lock($x); # NOOP - we already have the lock
                       {
                           lock($x); # NOOP
                           {
                               lock($x); # NOOP
                               lockit_some_more();
                           }
                       }
                   } # *** Implicit unlock here ***
               }
           }

           sub lockit_some_more {
               lock($x); # NOOP
           } # Nothing happens here

       Note that there is no "unlock()" function - the only way to unlock a
       variable is to let the lock go out of scope, that is, to leave the
       block in which "lock()" was called.

       A lock can either be used to guard the data contained within the
       variable being locked, or it can be used to guard something else, like
       a section of code. In this latter case, the variable in question does
       not hold any useful data, and exists only for the purpose of being
       locked. In this respect, the variable behaves like the mutexes and
       basic semaphores of traditional thread libraries.

   A Thread Pitfall: Deadlocks
       Locks are a handy tool to synchronize access to data, and using them
       properly is the key to safe shared data.  Unfortunately, locks aren't
       without their dangers, especially when multiple locks are involved.
       Consider the following code:

           use threads;
           use threads::shared;   # needed for the :shared attribute

           my $a :shared = 4;
           my $b :shared = 'foo';
           my $thr1 = threads->create(sub {
               lock($a);
               sleep(20);
               lock($b);
           });
           my $thr2 = threads->create(sub {
               lock($b);
               sleep(20);
               lock($a);
           });

       This program will probably hang until you kill it.  The only way it
       won't hang is if one of the two threads acquires both locks first.  A
       guaranteed-to-hang version is more complicated, but the principle is
       the same.

       The first thread will grab a lock on $a, then, after a pause during
       which the second thread has probably had time to do some work, try to
       grab a lock on $b.  Meanwhile, the second thread grabs a lock on $b,
       then later tries to grab a lock on $a.  The second lock attempt for
       both threads will block, each waiting for the other to release its
       lock.

       This condition is called a deadlock, and it occurs whenever two or more
       threads are trying to get locks on resources that the others own.  Each
       thread will block, waiting for the other to release a lock on a
       resource.  That never happens, though, since the thread with the
       resource is itself waiting for a lock to be released.

       There are a number of ways to handle this sort of problem.  The best
       way is to always have all threads acquire locks in the exact same
       order.  If, for example, you lock variables $a, $b, and $c, always lock
       $a before $b, and $b before $c.  It's also best to hold on to locks for
       as short a period of time as possible to minimize the risks of deadlock.
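
       The lock-ordering rule might look like the following sketch, in which
       both threads go through one subroutine that always locks the first
       variable before the second.  The names $x, $y and "update_both" are
       illustrative ($x and $y are used instead of $a and $b to stay clear of
       the variables that "sort" uses).

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $x :shared = 4;
my $y :shared = 'foo';

# Hypothetical helper: every thread takes the locks in the same order.
sub update_both {
    my ($suffix) = @_;
    lock($x);        # always first
    lock($y);        # always second
    $x++;
    $y .= $suffix;
    return;
}                    # both locks released here

my @threads = map { threads->create(\&update_both, $_) } qw(-a -b);
$_->join() for @threads;

print("x=$x y=$y\n");
```

       Neither thread can hold the second lock while waiting for the first,
       so the circular wait that causes a deadlock cannot form.  The order in
       which the two suffixes are appended still depends on scheduling.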

       The other synchronization primitives described below can suffer from
       similar problems.

   Queues: Passing Data Around
       A queue is a special thread-safe object that lets you put data in one
       end and take it out the other without having to worry about
       synchronization issues.  They're pretty straightforward, and look like
       this:

           use threads;
           use Thread::Queue;

           my $DataQueue = Thread::Queue->new();
           my $thr = threads->create(sub {
               # Test with defined() so that false values such as 0 or ''
               # don't end the loop early; only undef signals the end.
               while (defined(my $DataElement = $DataQueue->dequeue())) {
                   print("Popped $DataElement off the queue\n");
               }
           });

           $DataQueue->enqueue(12);
           $DataQueue->enqueue("A", "B", "C");
           sleep(10);
           $DataQueue->enqueue(undef);
           $thr->join();

       You create the queue with "Thread::Queue->new()".  Then you can add
       lists of scalars onto the end with "enqueue()", and pop scalars off the
       front of it with "dequeue()".  A queue has no fixed size, and can grow
       as needed to hold everything pushed on to it.

       If a queue is empty, "dequeue()" blocks until another thread enqueues
       something.  This makes queues ideal for event loops and other
       communications between threads.
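
       Queues also fit the boss/worker model described earlier.  The sketch
       below is one possible arrangement, not the only one: the main thread
       acts as the boss and feeds numbers into a task queue, three workers
       square them, and the answers come back on a second queue.  The names
       $tasks and $results are illustrative, and one "undef" per worker is
       used as the end-of-work marker.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $tasks   = Thread::Queue->new();
my $results = Thread::Queue->new();

# Workers: take a number, square it, hand the answer back.
my @workers = map {
    threads->create(sub {
        while (defined(my $n = $tasks->dequeue())) {
            $results->enqueue($n * $n);
        }
        return;
    });
} 1 .. 3;

# Boss: hand out the work, then one undef per worker to stop it.
$tasks->enqueue(1 .. 6);
$tasks->enqueue(undef) for @workers;
$_->join() for @workers;

# Results arrive in no particular order, so sort them for display.
my @squares = sort { $a <=> $b } map { $results->dequeue() } 1 .. 6;
print("squares: @squares\n");   # prints "squares: 1 4 9 16 25 36"
```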

   Semaphores: Synchronizing Data Access
       Semaphores are a kind of generic locking mechanism. In their most basic
       form, they behave very much like lockable scalars, except that they
       can't hold data, and that they must be explicitly unlocked. In their
       advanced form, they act like a kind of counter, and can allow multiple
       threads to have the lock at any one time.

   Basic semaphores
       Semaphores have two methods, "down()" and "up()": "down()" decrements
       the resource count, while "up()" increments it. Calls to "down()" will
       block if the semaphore's current count would decrement below zero.
       This program gives a quick demonstration:

           use threads;
           use threads::shared;   # needed for the :shared attribute
           use Thread::Semaphore;

           my $semaphore = Thread::Semaphore->new();
           my $GlobalVariable :shared = 0;

           my $thr1 = threads->create(\&sample_sub, 1);
           my $thr2 = threads->create(\&sample_sub, 2);
           my $thr3 = threads->create(\&sample_sub, 3);

           sub sample_sub {
               my $SubNumber = shift(@_);
               my $TryCount = 10;
               my $LocalCopy;
               sleep(1);
               while ($TryCount--) {
                   $semaphore->down();
                   $LocalCopy = $GlobalVariable;
                   print("$TryCount tries left for sub $SubNumber (\$GlobalVariable is $GlobalVariable)\n");
                   sleep(2);
                   $LocalCopy++;
                   $GlobalVariable = $LocalCopy;
                   $semaphore->up();
               }
           }

           $thr1->join();
           $thr2->join();
           $thr3->join();

       The three invocations of the subroutine all operate in sync.  The
       semaphore, though, makes sure that only one thread is accessing the
       global variable at once.
671
672   Advanced Semaphores
673       By default, semaphores behave like locks, letting only one thread
674       "down()" them at a time.  However, there are other uses for semaphores.
675
676       Each semaphore has a counter attached to it. By default, semaphores are
677       created with the counter set to one, "down()" decrements the counter by
678       one, and "up()" increments by one. However, we can override any or all
679       of these defaults simply by passing in different values:
680
681           use threads;
682           use Thread::Semaphore;
683
684           my $semaphore = Thread::Semaphore->new(5);
685                           # Creates a semaphore with the counter set to five
686
687           my $thr1 = threads->create(\&sub1);
688           my $thr2 = threads->create(\&sub1);
689
690           sub sub1 {
691               $semaphore->down(5); # Decrements the counter by five
692               # Do stuff here
693               $semaphore->up(5); # Increments the counter by five
694           }
695
696           $thr1->detach();
697           $thr2->detach();
698
699       If "down()" attempts to decrement the counter below zero, it blocks
700       until the counter is large enough.  Note that while a semaphore can be
701       created with a starting count of zero, any "up()" or "down()" always
702       changes the counter by at least one, and so "$semaphore->down(0)" is
703       the same as "$semaphore->down(1)".
704
705       The question, of course, is why would you do something like this? Why
706       create a semaphore with a starting count that's not one, or why
707       decrement or increment it by more than one? The answer is resource
708       availability.  Many resources that you want to manage access for can be
709       safely used by more than one thread at once.
710
711       For example, let's take a GUI driven program.  It has a semaphore that
712       it uses to synchronize access to the display, so only one thread is
713       ever drawing at once.  Handy, but of course you don't want any thread
714       to start drawing until things are properly set up.  In this case, you
715       can create a semaphore with a counter set to zero, and up it when
716       things are ready for drawing.
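
       A minimal sketch of such a start-up gate, assuming a hypothetical
       drawing thread and a one-second "sleep()" standing in for the real
       setup work.  The names $display_ready and $drawer are illustrative
       only and do not come from any GUI toolkit:

```perl
use strict;
use warnings;
use threads;
use Thread::Semaphore;

# Counter starts at zero, so the first down() blocks until someone calls up().
my $display_ready = Thread::Semaphore->new(0);

# Hypothetical drawing thread: waits at the gate, then "draws".
my $drawer = threads->create(sub {
    $display_ready->down();          # blocks until setup has finished
    my $result = 'drawing started';  # stand-in for real drawing code
    $display_ready->up();            # release so other drawers may proceed
    return $result;
});

sleep(1);                            # stand-in for setting up the display
$display_ready->up();                # open the gate

my $status = $drawer->join();
print("$status\n");
```

       Because the counter starts at zero, the drawer cannot get past
       "down()" until the main thread calls "up()".  The drawer's own "up()"
       puts the counter back to one, so from then on the semaphore can act
       as an ordinary lock on the display.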
717
718       Semaphores with counters greater than one are also useful for
719       establishing quotas.  Say, for example, that you have a number of
720       threads that can do I/O at once.  You don't want all the threads
721       reading or writing at once though, since that can potentially swamp
722       your I/O channels, or deplete your process's quota of filehandles.  You
723       can use a semaphore initialized to the number of concurrent I/O
724       requests (or open files) that you want at any one time, and have your
725       threads quietly block and unblock themselves.
726
727       Larger increments or decrements are handy in those cases where a thread
728       needs to check out or return a number of resources at once.
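
       The quota idea can be sketched as follows.  This is only an
       illustration: the limit of two slots, the "do_io()" subroutine, and
       the short "select()" pause standing in for real I/O are all
       assumptions made for the example.  The shared $peak variable is
       there purely to let the program report how many workers held a slot
       at the same time:

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $MAX_CONCURRENT = 2;   # hypothetical quota of simultaneous I/O slots
my $io_slots = Thread::Semaphore->new($MAX_CONCURRENT);

my $active :shared = 0;   # workers currently holding a slot
my $peak   :shared = 0;   # highest value $active ever reached

sub do_io {
    my ($id) = @_;
    $io_slots->down();                    # take one slot, or block
    {
        lock($active);                    # also guards $peak
        $active++;
        $peak = $active if $active > $peak;
    }
    select(undef, undef, undef, 0.1);     # stand-in for real I/O
    {
        lock($active);
        $active--;
    }
    $io_slots->up();                      # hand the slot back
    return $id;
}

my @workers = map { threads->create(\&do_io, $_) } 1 .. 6;
$_->join() for @workers;
print("Peak concurrency: $peak (limit $MAX_CONCURRENT)\n");
```

       Six workers are started, but the reported peak should never exceed
       the limit, because a worker that finds the counter at zero blocks in
       "down()" until another worker calls "up()".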
729
730   Waiting for a Condition
731       The functions "cond_wait()" and "cond_signal()" can be used in
732       conjunction with locks to notify co-operating threads that a resource
733       has become available.  They are very similar in use to the functions
734       found in "pthreads".  However, for most purposes, queues are simpler to
735       use and more intuitive.  See threads::shared for more details.
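
       For readers who do want to use them, here is a small sketch of the
       usual pattern.  The variables $ready and $payload are hypothetical
       names chosen for this example; the pattern itself (lock, test the
       condition in a loop, wait) follows what threads::shared recommends:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $ready   :shared = 0;   # the condition being waited on
my $payload :shared;       # hypothetical resource being handed over

my $consumer = threads->create(sub {
    lock($ready);
    # Re-check the condition after every wake-up before proceeding.
    cond_wait($ready) until $ready;
    return "got $payload";
});

{
    lock($ready);
    $payload = 'resource';
    $ready   = 1;
    cond_signal($ready);   # wake one thread waiting on $ready
}

my $message = $consumer->join();
print("$message\n");       # prints "got resource"
```

       "cond_wait()" releases the lock while it waits and reacquires it
       before returning, which is why both sides must hold the lock on
       $ready when they touch it.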
736
737   Giving Up Control
738       There are times when you may find it useful to have a thread explicitly
739       give up the CPU to another thread.  For example, you may be doing
740       something processor-intensive and want to make sure that the
741       user-interface thread gets scheduled frequently.
743
744       Perl's threading package provides the "yield()" function that does
745       this. "yield()" is pretty straightforward, and works like this:
746
747           use threads;
748
749           sub loop {
750               my $thread = shift;
751               my $foo = 50;
752               while($foo--) { print("In thread $thread\n"); }
753               threads->yield();
754               $foo = 50;
755               while($foo--) { print("In thread $thread\n"); }
756           }
757
758           my @thrs = map { threads->create(\&loop, $_) }
759                      qw(first second third);
760           $_->join() for @thrs;    # wait, so the program doesn't exit early
761
762       It is important to remember that "yield()" is only a hint to give up
763       the CPU; what actually happens depends on your hardware, OS, and
764       threading libraries.  On many operating systems, "yield()" is a no-op.
765       You should therefore not build the scheduling of your threads around
766       "yield()" calls: it might work on your platform but fail on another.
768

General Thread Utility Routines

770       We've covered the workhorse parts of Perl's threading package, and with
771       these tools you should be well on your way to writing threaded code and
772       packages.  There are a few useful little pieces that didn't really fit
773       in anyplace else.
774
775   What Thread Am I In?
776       The "threads->self()" class method provides your program with a way to
777       get an object representing the thread it's currently in.  You can use
778       this object in the same way as the ones returned from thread creation.
779
780   Thread IDs
781       "tid()" is a thread object method that returns the thread ID of the
782       thread the object represents.  Thread IDs are integers, with the main
783       thread in a program being 0.  Currently Perl assigns a unique TID to
784       every thread ever created in your program, assigning the first thread
785       to be created a TID of 1, and increasing the TID by 1 for each new
786       thread that's created.  When used as a class method, "threads->tid()"
787       can be used by a thread to get its own TID.
788
789   Are These Threads The Same?
790       The "equal()" method takes two thread objects and returns true if the
791       objects represent the same thread, and false if they don't.
792
793       Thread objects also have an overloaded "==" comparison so that you can
794       compare them as you would normal objects.
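
       The three routines above can be tried together in a short sketch.
       The variable names are illustrative, and the child threads do
       nothing except report their own TID:

```perl
use strict;
use warnings;
use threads;

my $main = threads->self();           # object for the current (main) thread
print("Main thread TID: ", $main->tid(), "\n");

# Each child reports its own TID via the class-method form.
my $thr1 = threads->create(sub { return threads->tid() });
my $thr2 = threads->create(sub { return threads->tid() });

my $same      = $thr1->equal($thr1) ? 1 : 0;   # an object equals itself
my $different = ($thr1 == $thr2)    ? 1 : 0;   # overloaded "=="

my $tid1 = $thr1->join();
my $tid2 = $thr2->join();
print("Children reported TIDs $tid1 and $tid2\n");
print("same=$same different=$different\n");
```

       The main thread reports a TID of 0, and the two children report
       distinct, nonzero TIDs.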
795
796   What Threads Are Running?
797       "threads->list()" returns a list of thread objects, one for each thread
798       that has been neither joined nor detached.  Handy for a number of
799       things, including cleaning up at the end of your program (from the main
800       Perl thread, of course):
801
802           # Loop through all the threads
803           foreach my $thr (threads->list()) {
804               $thr->join();
805           }
806
807       If some threads have not finished running when the main Perl thread
808       ends, Perl will warn you about it and die, since it is impossible for
809       Perl to clean up after itself while other threads are running.
810
811       NOTE:  The main Perl thread (thread 0) is in a detached state, and so
812       does not appear in the list returned by "threads->list()".
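
       A short sketch of how the list shrinks as threads are joined.  The
       worker subroutine (it simply doubles its argument) and the variable
       names are made up for the example.  In scalar context
       "threads->list()" gives a count rather than the objects themselves:

```perl
use strict;
use warnings;
use threads;

# Three trivial workers; each just doubles its argument.
my @workers = map { threads->create(sub { return $_[0] * 2 }, $_) } 1 .. 3;

my $before = threads->list();        # scalar context: number of threads

$workers[0]->join();                 # join one of them by hand

my @remaining = threads->list();     # the two not yet joined
my @results   = map { $_->join() } @remaining;

my $after = threads->list();         # nothing left to join

print("before=$before remaining=", scalar(@remaining), " after=$after\n");
```

       The main thread never shows up in these counts, which is why the
       last count is zero even though the program is still running.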
813

A Complete Example

815       Confused yet? It's time for an example program to show some of the
816       things we've covered.  This program finds prime numbers using threads.
817
818            1 #!/usr/bin/perl
819            2 # prime-pthread, courtesy of Tom Christiansen
820            3
821            4 use strict;
822            5 use warnings;
823            6
824            7 use threads;
825            8 use Thread::Queue;
826            9
827           10 sub check_num {
828           11     my ($upstream, $cur_prime) = @_;
829           12     my $kid;
830           13     my $downstream = Thread::Queue->new();
831           14     while (my $num = $upstream->dequeue()) {
832           15         next unless ($num % $cur_prime);
833           16         if ($kid) {
834           17             $downstream->enqueue($num);
835           18         } else {
836           19             print("Found prime: $num\n");
837           20             $kid = threads->create(\&check_num, $downstream, $num);
838           21             if (! $kid) {
839           22                 warn("Sorry.  Ran out of threads.\n");
840           23                 last;
841           24             }
842           25         }
843           26     }
844           27     if ($kid) {
845           28         $downstream->enqueue(undef);
846           29         $kid->join();
847           30     }
848           31 }
849           32
850           33 my $stream = Thread::Queue->new(3..1000, undef);
851           34 check_num($stream, 2);
852
853       This program uses the pipeline model to generate prime numbers.  Each
854       thread in the pipeline has an input queue that feeds numbers to be
855       checked, a prime number that it's responsible for, and an output queue
856       into which it funnels numbers that have failed the check.  If the
857       thread has a number that's failed its check and there's no child
858       thread, then the thread must have found a new prime number.  In that
859       case, a new child thread is created for that prime and stuck on the end
860       of the pipeline.
861
862       This probably sounds a bit more confusing than it really is, so let's
863       go through this program piece by piece and see what it does.  (For
864       those of you who might be trying to remember exactly what a prime
865       number is, it's a number that's only evenly divisible by itself and 1.)
866
867       The bulk of the work is done by the "check_num()" subroutine, which
868       takes a reference to its input queue and a prime number that it's
869       responsible for.  After pulling in the input queue and the prime that
870       the subroutine is checking (line 11), we create a new queue (line 13)
871       and reserve a scalar for the thread that we're likely to create later
872       (line 12).
873
874       The while loop from line 14 to line 26 grabs a scalar off the input
875       queue and checks against the prime this thread is responsible for.
876       Line 15 checks to see if there's a remainder when we divide the number
877       to be checked by our prime.  If there is one, the number must not be
878       evenly divisible by our prime, so we need to either pass it on to the
879       next thread if we've created one (line 17) or create a new thread if we
880       haven't.
881
882       The new thread creation is line 20.  We pass on to it a reference to
883       the queue we've created, and the prime number we've found.  In lines 21
884       through 24, we check to make sure that our new thread got created, and
885       if not, we stop checking any remaining numbers in the queue.
886
887       Finally, once the loop terminates (because we got a 0 or "undef" in the
888       queue, which serves as a note to terminate), we pass on the notice to
889       our child, and wait for it to exit if we've created a child (lines 27
890       through 30).
891
892       Meanwhile, back in the main thread, we first create a queue (line 33)
893       and queue up all the numbers from 3 to 1000 for checking, plus a
894       termination notice.  Then all we have to do to get the ball rolling is
895       pass the queue and the first prime to the "check_num()" subroutine
896       (line 34).
897
898       That's how it works.  It's pretty simple; as with many Perl programs,
899       the explanation is much longer than the program.
900

Different implementations of threads

902       Here is some background on thread implementations from the operating
903       system viewpoint.  There are three basic categories of threads:
904       user-mode threads, kernel threads, and multiprocessor kernel threads.
905
906       User-mode threads are threads that live entirely within a program and
907       its libraries.  In this model, the OS knows nothing about threads.  As
908       far as it's concerned, your process is just a process.
909
910       This is the easiest way to implement threads, and the way most OSes
911       start.  The big disadvantage is that, since the OS knows nothing about
912       threads, if one thread blocks they all do.  Typical blocking activities
913       include most system calls, most I/O, and things like "sleep()".
914
915       Kernel threads are the next step in thread evolution.  The OS knows
916       about kernel threads, and makes allowances for them.  The main
917       difference between a kernel thread and a user-mode thread is blocking.
918       With kernel threads, things that block a single thread don't block
919       other threads.  This is not the case with user-mode threads, where the
920       kernel blocks at the process level and not the thread level.
921
922       This is a big step forward, and can give a threaded program quite a
923       performance boost over non-threaded programs.  Threads that block
924       performing I/O, for example, won't block threads that are doing other
925       things.  Each process still has only one thread running at once,
926       though, regardless of how many CPUs a system might have.
927
928       Since kernel threading can interrupt a thread at any time, it will
929       uncover some of the implicit locking assumptions you may make in your
930       program.  For example, something as simple as "$a = $a + 2" can behave
931       unpredictably with kernel threads if $a is visible to other threads, as
932       another thread may have changed $a between the time it was fetched on
933       the right hand side and the time the new value is stored.
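
       One way to make such an update safe is to hold a lock across the
       fetch and the store.  The sketch below uses a shared variable named
       $counter in place of $a, and the thread and iteration counts are
       arbitrary choices for the example:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter :shared = 0;

sub add_two {
    for (1 .. 1000) {
        lock($counter);            # held until the end of this loop body
        $counter = $counter + 2;   # no other locker can run in between
    }
}

my @thrs = map { threads->create(\&add_two) } 1 .. 4;
$_->join() for @thrs;
print("Final value: $counter\n");  # prints "Final value: 8000"
```

       Without the "lock()" call, two threads could both fetch the same old
       value and one of the two additions would be lost.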
934
935       Multiprocessor kernel threads are the final step in thread support.
936       With multiprocessor kernel threads on a machine with multiple CPUs, the
937       OS may schedule two or more threads to run simultaneously on different
938       CPUs.
939
940       This can give a serious performance boost to your threaded program,
941       since more than one thread will be executing at the same time.  As a
942       tradeoff, though, any of those nagging synchronization issues that
943       might not have shown with basic kernel threads will appear with a
944       vengeance.
945
946       In addition to the different levels of OS involvement in threads,
947       different OSes (and different thread implementations for a particular
948       OS) allocate CPU cycles to threads in different ways.
949
950       Cooperative multitasking systems have running threads give up control
951       if one of two things happens.  If a thread calls a yield function, it
952       gives up control.  It also gives up control if the thread does
953       something that would cause it to block, such as perform I/O.  In a
954       cooperative multitasking implementation, one thread can starve all the
955       others for CPU time if it so chooses.
956
957       Preemptive multitasking systems interrupt threads at regular intervals
958       while the system decides which thread should run next.  In a preemptive
959       multitasking system, one thread usually won't monopolize the CPU.
960
961       On some systems, there can be cooperative and preemptive threads
962       running simultaneously. (Threads running with realtime priorities often
963       behave cooperatively, for example, while threads running at normal
964       priorities behave preemptively.)
965
966       Most modern operating systems support preemptive multitasking nowadays.
967

Performance considerations

969       The main thing to bear in mind when comparing Perl's ithreads to other
970       threading models is the fact that for each new thread created, a
971       complete copy of all the variables and data of the parent thread has to
972       be taken. Thus, thread creation can be quite expensive, both in terms
973       of memory usage and time spent in creation. The ideal way to reduce
974       these costs is to have a relatively small number of long-lived threads,
975       all created fairly early on -- before the base thread has accumulated
976       too much data. Of course, this may not always be possible, so
977       compromises have to be made. However, after a thread has been created,
978       its performance and extra memory usage should be little different from
979       those of ordinary code.
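
       A small worker pool is one common way to follow this advice.  The
       sketch below is an illustration rather than a recommended design:
       the pool size, the queue names, and the squaring "work" are all
       assumptions.  The point is that the threads are created once, early
       on, and then reused for every job:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $NUM_WORKERS = 3;                 # hypothetical pool size
my $jobs    = Thread::Queue->new();
my $results = Thread::Queue->new();

# Create the pool first, while the main thread still holds little data.
my @pool = map {
    threads->create(sub {
        # An undef item is the agreed signal to stop.
        while (defined(my $n = $jobs->dequeue())) {
            $results->enqueue($n * $n);
        }
    });
} 1 .. $NUM_WORKERS;

$jobs->enqueue(1 .. 10);
$jobs->enqueue((undef) x $NUM_WORKERS);   # one stop marker per worker
$_->join() for @pool;

my $total = 0;
while (defined(my $sq = $results->dequeue_nb())) {
    $total += $sq;
}
print("Sum of squares: $total\n");        # prints "Sum of squares: 385"
```

       Only three interpreter copies are ever made, no matter how many jobs
       pass through the queue.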
980
981       Also note that under the current implementation, shared variables use a
982       little more memory and are a little slower than ordinary variables.
983

Process-scope Changes

985       Note that while threads themselves are separate execution threads and
986       Perl data is thread-private unless explicitly shared, the threads can
987       affect process-scope state, affecting all the threads.
988
989       The most common example of this is changing the current working
990       directory using "chdir()".  One thread calls "chdir()", and the working
991       directory of all the threads changes.
992
993       An even more drastic example of a process-scope change is "chroot()": the
994       root directory of all the threads changes, and no thread can undo it
995       (as opposed to "chdir()").
996
997       Further examples of process-scope changes include "umask()" and
998       changing uids and gids.
999
1000       Thinking of mixing "fork()" and threads?  Please lie down and wait
1001       until the feeling passes.  Be aware that the semantics of "fork()" vary
1002       between platforms.  For example, some UNIX systems copy all the current
1003       threads into the child process, while others only copy the thread that
1004       called "fork()". You have been warned!
1005
1006       Similarly, mixing signals and threads may be problematic.
1007       Implementations are platform-dependent, and even the POSIX semantics
1008       may not be what you expect (and Perl doesn't even give you the full
1009       POSIX API).  For example, there is no way to guarantee that a signal
1010       sent to a multi-threaded Perl application will get intercepted by any
1011       particular thread.  (However, a recently added feature does provide the
1012       capability to send signals between threads.  See "THREAD SIGNALLING"
1013       in threads for more details.)
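
       As a rough sketch of that feature, modelled on the usage shown in
       the threads documentation: the worker installs a handler for 'KILL'
       and the main thread then calls "kill()" on the thread object.  The
       $handler_ready semaphore and the polling loop are assumptions added
       for this example so that the signal is not sent before the handler
       exists:

```perl
use strict;
use warnings;
use threads;
use Thread::Semaphore;

my $handler_ready = Thread::Semaphore->new(0);

my $worker = threads->create(sub {
    my $stop = 0;
    # This handler is called in this thread only.
    $SIG{'KILL'} = sub { $stop = 1 };
    $handler_ready->up();                    # tell main we can be signalled
    until ($stop) {
        select(undef, undef, undef, 0.05);   # stand-in for a unit of work
    }
    return 'stopped by signal';
});

$handler_ready->down();                      # wait for the handler
my $outcome = $worker->kill('KILL')->join(); # signal, then wait for exit
print("$outcome\n");
```

       No operating-system signal is delivered here; the threads module
       emulates the signal at the Perl level.  The handler runs only after
       the thread's current operation has completed, which is why the
       worker does its waiting in short steps.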
1014

Thread-Safety of System Libraries

1016       Whether various library calls are thread-safe is outside the control of
1017       Perl.  Calls often suffering from not being thread-safe include:
1018       "localtime()", "gmtime()",  functions fetching user, group and network
1019       information (such as "getgrent()", "gethostent()", "getnetent()" and so
1020       on), "readdir()", "rand()", and "srand()" -- in general, calls that
1021       depend on some global external state.
1022
1023       If the system on which Perl is compiled has thread-safe variants of
1024       such calls, they will be used.  Beyond that, Perl is at the mercy of the
1025       thread-safety or -unsafety of the calls.  Please consult your C library
1026       call documentation.
1027
1028       On some platforms the thread-safe library interfaces may fail if the
1029       result buffer is too small (for example the user group databases may be
1030       rather large, and the reentrant interfaces may have to carry around a
1031       full snapshot of those databases).  Perl will start with a small
1032       buffer, but keep retrying and growing the result buffer until the
1033       result fits.  If this limitless growing sounds bad for security or
1034       memory consumption reasons you can recompile Perl with
1035       "PERL_REENTRANT_MAXSIZE" defined to the maximum number of bytes you
1036       will allow.
1037

Conclusion

1039       A complete thread tutorial could fill a book (and has, many times), but
1040       with what we've covered in this introduction, you should be well on
1041       your way to becoming a threaded Perl expert.
1042

Bibliography

1060       Here's a short bibliography courtesy of Jürgen Christoffel:
1061
1062   Introductory Texts
1063       Birrell, Andrew D. An Introduction to Programming with Threads. Digital
1064       Equipment Corporation, 1989, DEC-SRC Research Report #35 online as
1065       http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html
1066       (highly recommended)
1067
1068       Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A
1069       Guide to Concurrency, Communication, and Multithreading. Prentice-Hall,
1070       1996.
1071
1072       Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with
1073       Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written
1074       introduction to threads).
1075
1076       Nelson, Greg (editor). Systems Programming with Modula-3.  Prentice
1077       Hall, 1991, ISBN 0-13-590464-1.
1078
1079       Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell.
1080       Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1
1081       (covers POSIX threads).
1082
1083   OS-Related References
1084       Boykin, Joseph, David Kirschen, Alan Langerman, and Susan LoVerso.
1085       Programming under Mach. Addison-Wesley, 1994, ISBN 0-201-52739-1.
1086
1087       Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall,
1088       1995, ISBN 0-13-219908-4 (great textbook).
1089
1090       Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts,
1091       4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4.
1092
1093   Other References
1094       Arnold, Ken and James Gosling. The Java Programming Language, 2nd ed.
1095       Addison-Wesley, 1998, ISBN 0-201-31006-6.
1096
1097       comp.programming.threads FAQ,
1098       <http://www.serpentine.com/~bos/threads-faq/>
1099
1100       Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage
1101       Collection on Virtually Shared Memory Architectures" in Memory
1102       Management: Proc. of the International Workshop IWMM 92, St. Malo,
1103       France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer,
1104       1992, ISBN 3-540-55940-X (real-life thread applications).
1105
1106       Artur Bergman, "Where Wizards Fear To Tread", June 11, 2002,
1107       <http://www.perl.com/pub/a/2002/06/11/threads.html>
1108

Acknowledgements

1110       Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy
1111       Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel,
1112       Joshua Pritikin, and Alan Burlison, for their help in reality-checking
1113       and polishing this article.  Big thanks to Tom Christiansen for his
1114       rewrite of the prime number generator.
1115

AUTHOR

1117       Dan Sugalski <dan@sidhe.org>
1118
1119       Slightly modified by Artur Bergman to fit the new thread model/module.
1120
1121       Reworked slightly by Jörg Walter <jwalt@cpan.org> to be more
1122       concise about thread-safety of Perl code.
1123
1124       Rearranged slightly by Elizabeth Mattijsen <liz@dijkmat.nl> to put
1125       less emphasis on yield().
1126

Copyrights

1128       The original version of this article appeared in The Perl
1129       Journal #10, and is copyright 1998 The Perl Journal. It appears
1130       courtesy of Jon Orwant and The Perl Journal.  This document may be
1131       distributed under the same terms as Perl itself.
1132
1133
1134
1135perl v5.10.1                      2009-04-14                     PERLTHRTUT(1)