PERLTHRTUT(1)          Perl Programmers Reference Guide          PERLTHRTUT(1)

NAME

       perlthrtut - Tutorial on threads in Perl

DESCRIPTION

       This tutorial describes the use of Perl interpreter threads (sometimes
       referred to as ithreads).  In this model, each thread runs in its own
       Perl interpreter, and any data sharing between threads must be
       explicit.  The user-level interface for ithreads uses the threads
       class.

       NOTE: There was another older Perl threading flavor called the 5.005
       model that used the Thread class.  This old model was known to have
       problems, is deprecated, and was removed for release 5.10.  You are
       strongly encouraged to migrate any existing 5.005 threads code to the
       new model as soon as possible.

       You can see which threading flavor, if any, you have by running
       "perl -V" and looking at the "Platform" section.  If you have
       "useithreads=define" you have ithreads; if you have
       "use5005threads=define" you have 5.005 threads.  If you have neither,
       you don't have any thread support built in.  If you have both, you are
       in trouble.

       The threads and threads::shared modules are included in the core Perl
       distribution.  Additionally, they are maintained as separate modules
       on CPAN, so you can check there for any updates.

What Is A Thread Anyway?

       A thread is a flow of control through a program with a single execution
       point.

       Sounds an awful lot like a process, doesn't it? Well, it should.
       Threads are one of the pieces of a process.  Every process has at least
       one thread and, up until now, every process running Perl had only one
       thread.  With 5.8, though, you can create extra threads.  We're going
       to show you how, when, and why.

Threaded Program Models

       There are three basic ways that you can structure a threaded program.
       Which model you choose depends on what you need your program to do.
       For many non-trivial threaded programs, you'll need to choose different
       models for different pieces of your program.

   Boss/Worker
       The boss/worker model usually has one boss thread and one or more
       worker threads.  The boss thread gathers or generates tasks that need
       to be done, then parcels those tasks out to the appropriate worker
       thread.

       This model is common in GUI and server programs, where a main thread
       waits for some event and then passes that event to the appropriate
       worker threads for processing.  Once the event has been passed on, the
       boss thread goes back to waiting for another event.

       The boss thread does relatively little work.  While tasks aren't
       necessarily performed faster than with any other method, it tends to
       have the best user-response times.
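
       As a rough sketch of the boss/worker shape, the following uses
       Thread::Queue (covered later in this tutorial) to hand tasks to
       workers.  The worker count, the numeric "tasks", and the names
       $tasks, $results, and worker() are assumptions made up for this
       illustration.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical setup: three workers and six numeric "tasks".
my $tasks   = Thread::Queue->new();
my $results = Thread::Queue->new();

sub worker {
    # Each worker takes tasks until it sees the undef sentinel.
    while (defined(my $n = $tasks->dequeue())) {
        $results->enqueue($n * $n);
    }
}

my @workers = map { threads->create(\&worker) } 1 .. 3;

# The boss hands out the work, then queues one sentinel per worker.
$tasks->enqueue($_) for 1 .. 6;
$tasks->enqueue(undef) for @workers;

$_->join() for @workers;

# All workers are done, so a non-blocking drain is sufficient here.
my $sum = 0;
while (defined(my $r = $results->dequeue_nb())) {
    $sum += $r;
}
print("Sum of squares: $sum\n");
```

       The boss never does the squaring itself; it only distributes work
       and collects the results once the workers have exited.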

   Work Crew
       In the work crew model, several threads are created that do essentially
       the same thing to different pieces of data.  It closely mirrors
       classical parallel processing and vector processors, where a large
       array of processors do the exact same thing to many pieces of data.

       This model is particularly useful if the system running the program
       will distribute multiple threads across different processors.  It can
       also be useful in ray tracing or rendering engines, where the
       individual threads can pass on interim results to give the user visual
       feedback.
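
       A minimal work crew might look like the sketch below, where every
       thread runs the same summing code on its own slice of an array.
       The crew size of four and the data set are arbitrary choices for
       this illustration.

```perl
use strict;
use warnings;
use threads;

my @data  = (1 .. 100);
my $crew  = 4;                  # hypothetical crew size
my $chunk = @data / $crew;      # 25 items per thread

# Every thread runs the same code, each on a different slice.
my @threads = map {
    my @slice = @data[ $_ * $chunk .. ($_ + 1) * $chunk - 1 ];
    threads->create(sub {
        my $s = 0;
        $s += $_ for @slice;
        return $s;
    });
} 0 .. $crew - 1;

my $total = 0;
for my $thr (@threads) {
    my ($partial) = $thr->join();
    $total += $partial;
}
print("Total: $total\n");
```

       Because each thread gets its own copy of its slice, no shared data
       or locking is needed in this particular arrangement.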

   Pipeline
       The pipeline model divides up a task into a series of steps, and passes
       the results of one step on to the thread processing the next.  Each
       thread does one thing to each piece of data and passes the results to
       the next thread in line.

       This model makes the most sense if you have multiple processors so two
       or more threads will be executing in parallel, though it can often make
       sense in other contexts as well.  It tends to keep the individual tasks
       small and simple, as well as allowing some parts of the pipeline to
       block (on I/O or system calls, for example) while other parts keep
       going.  If you're running different parts of the pipeline on different
       processors you may also take advantage of the caches on each processor.

       This model is also handy for a form of recursive programming where,
       rather than having a subroutine call itself, it instead creates another
       thread.  Prime and Fibonacci generators both map well to this form of
       the pipeline model. (A version of a prime number generator is presented
       later on.)
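
       The following two-stage pipeline is only a sketch of the idea; it
       is not the prime number generator mentioned above.  It assumes
       Thread::Queue (covered later) as the link between stages, and the
       queue names and the doubling step are invented for this example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $to_double  = Thread::Queue->new();
my $to_collect = Thread::Queue->new();

# Stage 1: double each number and pass it on.
my $stage1 = threads->create(sub {
    while (defined(my $n = $to_double->dequeue())) {
        $to_collect->enqueue($n * 2);
    }
    $to_collect->enqueue(undef);    # tell the next stage we're done
});

# Stage 2: collect the results and return them to the main thread.
# The list-context assignment lets join() return a list.
my ($stage2) = threads->create(sub {
    my @seen;
    while (defined(my $n = $to_collect->dequeue())) {
        push(@seen, $n);
    }
    return @seen;
});

$to_double->enqueue(1, 2, 3, undef);
$stage1->join();
my @out = $stage2->join();
print("Pipeline produced: @out\n");
```

       Each stage blocks in dequeue() while it has nothing to do, so a slow
       stage simply lets items pile up in the queue that feeds it.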

What kind of threads are Perl threads?

       If you have experience with other thread implementations, you might
       find that things aren't quite what you expect.  It's very important to
       remember when dealing with Perl threads that Perl Threads Are Not X
       Threads for all values of X.  They aren't POSIX threads, or DecThreads,
       or Java's Green threads, or Win32 threads.  There are similarities, and
       the broad concepts are the same, but if you start looking for
       implementation details you're going to be either disappointed or
       confused.  Possibly both.

       This is not to say that Perl threads are completely different from
       everything that's ever come before. They're not.  Perl's threading
       model owes a lot to other thread models, especially POSIX.  Just as
       Perl is not C, though, Perl threads are not POSIX threads.  So if you
       find yourself looking for mutexes, or thread priorities, it's time to
       step back a bit and think about what you want to do and how Perl can do
       it.

       However, it is important to remember that Perl threads cannot magically
       do things unless your operating system's threads allow it. So if your
       system blocks the entire process on sleep(), Perl usually will, as
       well.

       Perl Threads Are Different.

Thread-Safe Modules

       The addition of threads has changed Perl's internals substantially.
       There are implications for people who write modules with XS code or
       external libraries. However, since Perl data is not shared among
       threads by default, Perl modules stand a high chance of being thread-
       safe or can be made thread-safe easily.  Modules that are not tagged as
       thread-safe should be tested or code reviewed before being used in
       production code.

       Not all modules that you might use are thread-safe, and you should
       always assume a module is unsafe unless the documentation says
       otherwise.  This includes modules that are distributed as part of the
       core.  Threads are a relatively new feature, and even some of the
       standard modules aren't thread-safe.

       Even if a module is thread-safe, it doesn't mean that the module is
       optimized to work well with threads. A module could possibly be
       rewritten to utilize the new features in threaded Perl to increase
       performance in a threaded environment.

       If you're using a module that's not thread-safe for some reason, you
       can protect yourself by using it from one, and only one, thread.
       If you need multiple threads to access such a module, you can use
       semaphores and lots of programming discipline to control access to it.
       Semaphores are covered in "Basic semaphores".

       See also "Thread-Safety of System Libraries".

Thread Basics

       The threads module provides the basic functions you need to write
       threaded programs.  In the following sections, we'll cover the basics,
       showing you what you need to do to create a threaded program.  After
       that, we'll go over some of the features of the threads module that
       make threaded programming easier.

   Basic Thread Support
       Thread support is a Perl compile-time option. It's something that's
       turned on or off when Perl is built at your site, rather than when your
       programs are compiled. If your Perl wasn't compiled with thread support
       enabled, then any attempt to use threads will fail.

       Your programs can use the Config module to check whether threads are
       enabled. If your program can't run without them, you can say something
       like:

           use Config;
           $Config{useithreads} or
               die('Recompile Perl with threads to run this program.');

       A possibly-threaded program using a possibly-threaded module might have
       code like this:

           use Config;
           use MyMod;

           BEGIN {
               if ($Config{useithreads}) {
                   # We have threads
                   require MyMod_threaded;
                   MyMod_threaded->import();
               } else {
                   # No thread support in this Perl
                   require MyMod_unthreaded;
                   MyMod_unthreaded->import();
               }
           }

       Since code that runs both with and without threads is usually pretty
       messy, it's best to isolate the thread-specific code in its own module.
       In our example above, that's what "MyMod_threaded" is, and it's only
       imported if we're running on a threaded Perl.

   A Note about the Examples
       In a real situation, care should be taken that all threads are finished
       executing before the program exits.  That care has not been taken in
       these examples in the interest of simplicity.  Running these examples
       as is will produce error messages, usually caused by the fact that
       there are still threads running when the program exits.  You should not
       be alarmed by this.

   Creating Threads
       The threads module provides the tools you need to create new threads.
       Like any other module, you need to tell Perl that you want to use it;
       "use threads;" imports all the pieces you need to create basic threads.

       The simplest, most straightforward way to create a thread is with
       create():

           use threads;

           my $thr = threads->create(\&sub1);

           sub sub1 {
               print("In the thread\n");
           }

       The create() method takes a reference to a subroutine and creates a new
       thread that starts executing in the referenced subroutine.  Control
       then passes both to the subroutine and the caller.

       If you need to, your program can pass parameters to the subroutine as
       part of the thread startup.  Just include the list of parameters as
       part of the "threads->create()" call, like this:

           use threads;

           my $Param3 = 'foo';
           my $thr1 = threads->create(\&sub1, 'Param 1', 'Param 2', $Param3);
           my @ParamList = (42, 'Hello', 3.14);
           my $thr2 = threads->create(\&sub1, @ParamList);
           my $thr3 = threads->create(\&sub1, qw(Param1 Param2 Param3));

           sub sub1 {
               my @InboundParameters = @_;
               print("In the thread\n");
               print('Got parameters >', join('<>',@InboundParameters), "<\n");
           }

       The last example illustrates another feature of threads.  You can spawn
       off several threads using the same subroutine.  Each thread executes
       the same subroutine, but in a separate thread with a separate
       environment and potentially separate arguments.

       new() is a synonym for create().

   Waiting For A Thread To Exit
       Since threads are also subroutines, they can return values.  To wait
       for a thread to exit and extract any values it might return, you can
       use the join() method:

           use threads;

           my ($thr) = threads->create(\&sub1);

           my @ReturnData = $thr->join();
           print('Thread returned ', join(', ', @ReturnData), "\n");

           sub sub1 { return ('Fifty-six', 'foo', 2); }

       In the example above, the join() method returns as soon as the thread
       ends.  In addition to waiting for a thread to finish and gathering up
       any values that the thread might have returned, join() also performs
       any OS cleanup necessary for the thread.  That cleanup might be
       important, especially for long-running programs that spawn lots of
       threads.  If you don't want the return values and don't want to wait
       for the thread to finish, you should call the detach() method instead,
       as described next.

       NOTE: In the example above, the thread returns a list, thus
       necessitating that the thread creation call be made in list context
       (i.e., "my ($thr)").  See "$thr->join()" in threads and "THREAD
       CONTEXT" in threads for more details on thread context and return
       values.
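
       To make the effect of thread context concrete, the sketch below
       requests the context explicitly with the 'context' option described
       under "THREAD CONTEXT" in threads, instead of relying on how the
       create() call happens to be written.  The subroutine report() is a
       made-up helper that simply says which context it was called in.

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: reports the context the thread was started in.
sub report { return wantarray ? 'list' : 'scalar'; }

# Context is fixed when the thread is created, not when it is joined.
my $list_thr   = threads->create({ 'context' => 'list' },   \&report);
my $scalar_thr = threads->create({ 'context' => 'scalar' }, \&report);

my ($from_list) = $list_thr->join();
my $from_scalar = $scalar_thr->join();

print("First thread ran in $from_list context\n");
print("Second thread ran in $from_scalar context\n");
```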

   Ignoring A Thread
       join() does three things: it waits for a thread to exit, cleans up
       after it, and returns any data the thread may have produced.  But what
       if you're not interested in the thread's return values, and you don't
       really care when the thread finishes? All you want is for the thread to
       be cleaned up when it's done.

       In this case, you use the detach() method.  Once a thread is detached,
       it'll run until it's finished; then Perl will clean up after it
       automatically.

           use threads;

           my $thr = threads->create(\&sub1);   # Spawn the thread

           $thr->detach();   # Now we officially don't care any more

           sleep(15);        # Let thread run for a while

           sub sub1 {
               my $count = 0;
               while (1) {
                   $count++;
                   print("\$count is $count\n");
                   sleep(1);
               }
           }

       Once a thread is detached, it may not be joined, and any return data
       that it might have produced (if it was done and waiting for a join) is
       lost.

       detach() can also be called as a class method to allow a thread to
       detach itself:

           use threads;

           my $thr = threads->create(\&sub1);

           sub sub1 {
               threads->detach();
               # Do more work
           }

   Process and Thread Termination
       With threads one must be careful to make sure they all have a chance to
       run to completion, assuming that is what you want.

       An action that terminates a process will terminate all running threads.
       die() and exit() have this property, and perl does an exit when the
       main thread exits, perhaps implicitly by falling off the end of your
       code, even if that's not what you want.

       As an example of this case, this code prints the message "Perl exited
       with active threads: 2 running and unjoined":

           use threads;
           my $thr1 = threads->new(\&thrsub, "test1");
           my $thr2 = threads->new(\&thrsub, "test2");
           sub thrsub {
               my ($message) = @_;
               sleep 1;
               print "thread $message\n";
           }

       But when the following lines are added at the end:

           $thr1->join();
           $thr2->join();

       it prints two lines of output, a perhaps more useful outcome.
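
       When the number of threads isn't fixed, one way to make sure none
       are left running is to join everything that threads->list() reports.
       This is a sketch of that approach; the count of three threads is
       arbitrary.

```perl
use strict;
use warnings;
use threads;

sub thrsub {
    my ($message) = @_;
    print("thread $message\n");
}

# Created in void context, so no thread objects are kept here.
threads->create(\&thrsub, "test$_") for 1 .. 3;

# threads->list() returns the threads that are neither joined nor detached.
my $joined = 0;
for my $thr (threads->list()) {
    $thr->join();
    $joined++;
}
print("Joined $joined threads\n");
```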

Threads And Data

       Now that we've covered the basics of threads, it's time for our next
       topic: Data.  Threading introduces a couple of complications to data
       access that non-threaded programs never need to worry about.

   Shared And Unshared Data
       The biggest difference between Perl ithreads and the old 5.005 style
       threading, or for that matter, most other threading systems out
       there, is that by default, no data is shared. When a new Perl thread is
       created, all the data associated with the current thread is copied to
       the new thread, and is subsequently private to that new thread!  This
       is similar in feel to what happens when a Unix process forks, except
       that in this case, the data is just copied to a different part of
       memory within the same process rather than a real fork taking place.

       To make use of threading, however, one usually wants the threads to
       share at least some data between themselves. This is done with the
       threads::shared module and the ":shared" attribute:

           use threads;
           use threads::shared;

           my $foo :shared = 1;
           my $bar = 1;
           threads->create(sub { $foo++; $bar++; })->join();

           print("$foo\n");  # Prints 2 since $foo is shared
           print("$bar\n");  # Prints 1 since $bar is not shared

       In the case of a shared array, all the array's elements are shared, and
       for a shared hash, all the keys and values are shared. This places
       restrictions on what may be assigned to shared array and hash elements:
       only simple values or references to shared variables are allowed - this
       is so that a private variable can't accidentally become shared. A bad
       assignment will cause the thread to die. For example:

           use threads;
           use threads::shared;

           my $var          = 1;
           my $svar :shared = 2;
           my %hash :shared;

           # ... create some threads ...

           $hash{a} = 1;       # All threads see exists($hash{a})
                               # and $hash{a} == 1
           $hash{a} = $var;    # okay - copy-by-value: same effect as previous
           $hash{a} = $svar;   # okay - copy-by-value: same effect as previous
           $hash{a} = \$svar;  # okay - a reference to a shared variable
           $hash{a} = \$var;   # This will die
           delete($hash{a});   # okay - all threads will see !exists($hash{a})

       Note that a shared variable guarantees that if two or more threads try
       to modify it at the same time, the internal state of the variable will
       not become corrupted. However, there are no guarantees beyond this, as
       explained in the next section.
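
       Because only simple values and references to shared data may be
       stored in a shared hash, nested structures need a shared copy first.
       The sketch below uses shared_clone() from threads::shared for that;
       the %config hash and its "ports" key are invented for illustration.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %config :shared;

# A plain anonymous array cannot be assigned to a shared hash element;
# shared_clone() returns a shared copy that can.
$config{ports} = shared_clone([ 80, 443 ]);

threads->create(sub {
    lock(%config);
    push(@{ $config{ports} }, 8080);
})->join();

print("Ports: @{ $config{ports} }\n");
```

       The change made inside the thread is visible to the main thread
       because the nested array is itself shared.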

   Thread Pitfalls: Races
       While threads bring a new set of useful tools, they also bring a number
       of pitfalls.  One pitfall is the race condition:

           use threads;
           use threads::shared;

           my $x :shared = 1;
           my $thr1 = threads->create(\&sub1);
           my $thr2 = threads->create(\&sub2);

           $thr1->join();
           $thr2->join();
           print("$x\n");

           sub sub1 { my $foo = $x; $x = $foo + 1; }
           sub sub2 { my $bar = $x; $x = $bar + 1; }

       What do you think $x will be? The answer, unfortunately, is it depends.
       Both sub1() and sub2() access the global variable $x, once to read and
       once to write.  Depending on factors ranging from your thread
       implementation's scheduling algorithm to the phase of the moon, $x can
       be 2 or 3.

       Race conditions are caused by unsynchronized access to shared data.
       Without explicit synchronization, there's no way to be sure that
       nothing has happened to the shared data between the time you access it
       and the time you update it.  Even this simple code fragment has the
       possibility of error:

           use threads;
           use threads::shared;   # needed for the :shared attribute
           my $x :shared = 2;
           my $y :shared;
           my $z :shared;
           my $thr1 = threads->create(sub { $y = $x; $x = $y + 1; });
           my $thr2 = threads->create(sub { $z = $x; $x = $z + 1; });
           $thr1->join();
           $thr2->join();

       Two threads both access $x.  Each thread can potentially be interrupted
       at any point, or be executed in any order.  At the end, $x could be 3
       or 4, and both $y and $z could be 2 or 3.

       Even "$x += 5" or "$x++" are not guaranteed to be atomic.

       Whenever your program accesses data or resources that can be accessed
       by other threads, you must take steps to coordinate access or risk data
       inconsistency and race conditions. Note that Perl will protect its
       internals from your race conditions, but it won't protect you from you.
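
       One way to coordinate access is the lock() function provided by
       threads::shared.  The sketch below makes each read-modify-write of
       $x happen under a lock, so the final value no longer depends on
       scheduling.  The bump() subroutine and the iteration count are
       assumptions chosen for this example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $x :shared = 0;

sub bump {
    for (1 .. 1000) {
        lock($x);        # held until the end of this loop iteration
        $x = $x + 1;     # read and write now happen as one unit
    }
}

my @thr = map { threads->create(\&bump) } 1 .. 2;
$_->join() for @thr;
print("$x\n");
```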

Synchronization and control

       Perl provides a number of mechanisms to coordinate the interactions
       between threads and their data, to avoid race conditions and the
       like.  Some of these are designed to resemble the common techniques
       used in thread libraries such as "pthreads"; others are Perl-specific.
       Often, the standard techniques are clumsy and difficult to get right
       (such as condition waits). Where possible, it is usually easier to use
       Perlish techniques such as queues, which remove some of the hard work
       involved.

   Controlling access: lock()
       The lock() function takes a shared variable and puts a lock on it.  No
       other thread may lock the variable until the variable is unlocked by
       the thread holding the lock. Unlocking happens automatically when the
       locking thread exits the block that contains the call to the lock()
       function.  Using lock() is straightforward.  This example has several
       threads doing some calculations in parallel, and occasionally updating
       a running total:

           use threads;
           use threads::shared;

           my $total :shared = 0;

           sub calc {
               while (1) {
                   my $result;
                   # (... do some calculations and set $result ...)
                   {
                       lock($total);  # Block until we obtain the lock
                       $total += $result;
                   } # Lock implicitly released at end of scope
                   last if $result == 0;
               }
           }

           my $thr1 = threads->create(\&calc);
           my $thr2 = threads->create(\&calc);
           my $thr3 = threads->create(\&calc);
           $thr1->join();
           $thr2->join();
           $thr3->join();
           print("total=$total\n");

       lock() blocks the thread until the variable being locked is available.
       When lock() returns, your thread can be sure that no other thread can
       lock that variable until the block containing the lock exits.

       It's important to note that locks don't prevent access to the variable
       in question, only lock attempts.  This is in keeping with Perl's
       longstanding tradition of courteous programming, and the advisory file
       locking that flock() gives you.

       You may lock arrays and hashes as well as scalars.  Locking an array,
       though, will not block subsequent locks on array elements, just lock
       attempts on the array itself.

       Locks are recursive, which means it's okay for a thread to lock a
       variable more than once.  The lock will last until the outermost lock()
       on the variable goes out of scope. For example:

           use threads;
           use threads::shared;

           my $x :shared;
           doit();

           sub doit {
               {
                   {
                       lock($x); # Wait for lock
                       lock($x); # NOOP - we already have the lock
                       {
                           lock($x); # NOOP
                           {
                               lock($x); # NOOP
                               lockit_some_more();
                           }
                       }
                   } # *** Implicit unlock here ***
               }
           }

           sub lockit_some_more {
               lock($x); # NOOP
           } # Nothing happens here

       Note that there is no unlock() function - the only way to unlock a
       variable is to let the lock go out of scope, that is, to leave the
       block that contains the lock() call.

       A lock can either be used to guard the data contained within the
       variable being locked, or it can be used to guard something else, like
       a section of code. In this latter case, the variable in question does
       not hold any useful data, and exists only for the purpose of being
       locked. In this respect, the variable behaves like the mutexes and
       basic semaphores of traditional thread libraries.
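
       As an illustration of the second use, the sketch below locks a
       variable that holds no data so that two related updates always
       happen together.  The names $section_guard, @log, and report() are
       hypothetical.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $section_guard :shared;   # holds no data; exists only to be locked
my @log :shared;

sub report {
    my ($id) = @_;
    lock($section_guard);
    # Only one thread at a time can be past the lock() above, so each
    # "begin" entry is immediately followed by its matching "end".
    push(@log, "begin $id");
    push(@log, "end $id");
}

my @thr = map { threads->create(\&report, $_) } 1 .. 3;
$_->join() for @thr;
print("$_\n") for @log;
```

       This works only because every thread that touches @log in this way
       agrees to lock $section_guard first; the lock itself does nothing to
       stop a thread that ignores the convention.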

   A Thread Pitfall: Deadlocks
       Locks are a handy tool to synchronize access to data, and using them
       properly is the key to safe shared data.  Unfortunately, locks aren't
       without their dangers, especially when multiple locks are involved.
       Consider the following code:

           use threads;
           use threads::shared;   # needed for the :shared attribute

           my $x :shared = 4;
           my $y :shared = 'foo';
           my $thr1 = threads->create(sub {
               lock($x);
               sleep(20);
               lock($y);
           });
           my $thr2 = threads->create(sub {
               lock($y);
               sleep(20);
               lock($x);
           });

       This program will probably hang until you kill it.  The only way it
       won't hang is if one of the two threads acquires both locks first.  A
       guaranteed-to-hang version is more complicated, but the principle is
       the same.

       The first thread will grab a lock on $x, then, after a pause during
       which the second thread has probably had time to do some work, try to
       grab a lock on $y.  Meanwhile, the second thread grabs a lock on $y,
       then later tries to grab a lock on $x.  The second lock attempt for
       both threads will block, each waiting for the other to release its
       lock.

       This condition is called a deadlock, and it occurs whenever two or more
       threads are trying to get locks on resources that the others own.  Each
       thread will block, waiting for the other to release a lock on a
       resource.  That never happens, though, since the thread with the
       resource is itself waiting for a lock to be released.

       There are a number of ways to handle this sort of problem.  The best
       way is to always have all threads acquire locks in the exact same
       order.  If, for example, you lock variables $x, $y, and $z, always lock
       $x before $y, and $y before $z.  It's also best to hold on to locks for
       as short a period of time as possible to minimize the risks of
       deadlock.
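
       Applied to the example above, consistent ordering could look like
       the following sketch, in which both threads take $x first and $y
       second.  The update_both() subroutine is a made-up stand-in for
       whatever work needs both locks.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $x :shared = 4;
my $y :shared = 'foo';

# Both threads lock $x before $y, so neither can end up holding one
# lock while waiting for a thread that holds the other.
sub update_both {
    my ($tag) = @_;
    lock($x);
    lock($y);
    $x++;
    $y = "set by $tag";
}

my $thr1 = threads->create(\&update_both, 'thr1');
my $thr2 = threads->create(\&update_both, 'thr2');
$thr1->join();
$thr2->join();
print("x=$x\n");
```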

       The other synchronization primitives described below can suffer from
       similar problems.

   Queues: Passing Data Around
       A queue is a special thread-safe object that lets you put data in one
       end and take it out the other without having to worry about
       synchronization issues.  They're pretty straightforward, and look like
       this:

           use threads;
           use Thread::Queue;

           my $DataQueue = Thread::Queue->new();
           my $thr = threads->create(sub {
               # Test with defined() so that false values such as 0 or ''
               # don't end the loop early; only undef does.
               while (defined(my $DataElement = $DataQueue->dequeue())) {
                   print("Popped $DataElement off the queue\n");
               }
           });

           $DataQueue->enqueue(12);
           $DataQueue->enqueue("A", "B", "C");
           sleep(10);
           $DataQueue->enqueue(undef);
           $thr->join();

       You create the queue with "Thread::Queue->new()".  Then you can add
       lists of scalars onto the end with enqueue(), and pop scalars off the
       front of it with dequeue().  A queue has no fixed size, and can grow as
       needed to hold everything pushed on to it.

       If a queue is empty, dequeue() blocks until another thread enqueues
       something.  This makes queues ideal for event loops and other
       communications between threads.
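
       When blocking isn't wanted, Thread::Queue also offers pending() and
       dequeue_nb().  The short sketch below runs in a single thread for
       simplicity and deliberately includes a 0 among the queued values.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $q = Thread::Queue->new();
$q->enqueue(0, 'A', 'B');

my $waiting = $q->pending();     # number of items currently queued
print("Items waiting: $waiting\n");

# dequeue_nb() never blocks; it returns undef when the queue is empty.
my @got;
while (defined(my $item = $q->dequeue_nb())) {
    push(@got, $item);
}
print("Got: @got\n");
```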
625
626   Semaphores: Synchronizing Data Access
627       Semaphores are a kind of generic locking mechanism. In their most basic
628       form, they behave very much like lockable scalars, except that they
629       can't hold data, and that they must be explicitly unlocked. In their
630       advanced form, they act like a kind of counter, and can allow multiple
631       threads to have the lock at any one time.
632
633   Basic semaphores
634       Semaphores have two methods, down() and up(): down() decrements the
635       resource count, while up() increments it. Calls to down() will block if
636       the semaphore's current count would decrement below zero.  This program
637       gives a quick demonstration:
638
639           use threads;
640           use Thread::Semaphore;
641
642           my $semaphore = Thread::Semaphore->new();
643           my $GlobalVariable :shared = 0;
644
645           $thr1 = threads->create(\&sample_sub, 1);
646           $thr2 = threads->create(\&sample_sub, 2);
647           $thr3 = threads->create(\&sample_sub, 3);
648
649           sub sample_sub {
650               my $SubNumber = shift(@_);
651               my $TryCount = 10;
652               my $LocalCopy;
653               sleep(1);
654               while ($TryCount--) {
655                   $semaphore->down();
656                   $LocalCopy = $GlobalVariable;
657                   print("$TryCount tries left for sub $SubNumber "
658                        ."(\$GlobalVariable is $GlobalVariable)\n");
659                   sleep(2);
660                   $LocalCopy++;
661                   $GlobalVariable = $LocalCopy;
662                   $semaphore->up();
663               }
664           }
665
666           $thr1->join();
667           $thr2->join();
668           $thr3->join();
669
670       The three invocations of the subroutine all operate in sync.  The
671       semaphore, though, makes sure that only one thread is accessing the
672       global variable at once.
673
674   Advanced Semaphores
675       By default, semaphores behave like locks, letting only one thread
676       down() them at a time.  However, there are other uses for semaphores.
677
678       Each semaphore has a counter attached to it. By default, semaphores are
679       created with the counter set to one, down() decrements the counter by
680       one, and up() increments by one. However, we can override any or all of
681       these defaults simply by passing in different values:
682
683           use threads;
684           use Thread::Semaphore;
685
686           my $semaphore = Thread::Semaphore->new(5);
687                           # Creates a semaphore with the counter set to five
688
689           my $thr1 = threads->create(\&sub1);
690           my $thr2 = threads->create(\&sub1);
691
692           sub sub1 {
693               $semaphore->down(5); # Decrements the counter by five
694               # Do stuff here
695               $semaphore->up(5); # Increments the counter by five
696           }
697
698           $thr1->detach();
699           $thr2->detach();
700
701       If down() attempts to decrement the counter below zero, it blocks until
702       the counter is large enough.  Note that while a semaphore can be
703       created with a starting count of zero, any up() or down() always
704       changes the counter by at least one, and so "$semaphore->down(0)" is
705       the same as "$semaphore->down(1)".
706
707       The question, of course, is why would you do something like this? Why
708       create a semaphore with a starting count that's not one, or why
709       decrement or increment it by more than one? The answer is resource
710       availability.  Many resources that you want to manage access for can be
711       safely used by more than one thread at once.
712
713       For example, let's take a GUI-driven program.  It has a semaphore that
714       it uses to synchronize access to the display, so only one thread is
715       ever drawing at once.  Handy, but of course you don't want any thread
716       to start drawing until things are properly set up.  In this case, you
717       can create a semaphore with a counter set to zero, and up it when
718       things are ready for drawing.
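
       Here is a minimal sketch of that start-up gate.  The names
       $display_ready and @events are invented for illustration, and the
       "setup" step stands in for whatever real initialization your program
       needs before drawing is safe.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

# Hypothetical "display ready" gate: it starts at zero, so down() blocks.
my $display_ready = Thread::Semaphore->new(0);
my @events :shared;    # records the order in which things happened

my $drawer = threads->create(sub {
    $display_ready->down();      # blocks until the main thread calls up()
    { lock(@events); push(@events, 'draw'); }
    $display_ready->up();        # let the next drawing thread through
});

# Stand-in for real setup work; the drawer cannot get past down() yet.
{ lock(@events); push(@events, 'setup'); }
$display_ready->up();            # open the gate: counter goes from 0 to 1

$drawer->join();
print(join(' -> ', @events), "\n");    # prints "setup -> draw"
```

       Because the counter starts at zero, the drawing thread cannot record
       its event until the main thread has finished setup and called up().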
719
720       Semaphores with counters greater than one are also useful for
721       establishing quotas.  Say, for example, that you have a number of
722       threads that can do I/O at once.  You don't want all the threads
723       reading or writing at once though, since that can potentially swamp
724       your I/O channels, or deplete your process's quota of filehandles.  You
725       can use a semaphore initialized to the number of concurrent I/O
726       requests (or open files) that you want at any one time, and have your
727       threads quietly block and unblock themselves.
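
       A quota like that might be sketched as follows.  The limit of two,
       the names $io_slots, $active and $peak, and the short nap standing in
       for real I/O are all assumptions made for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $MAX_CONCURRENT = 2;                       # assumed quota
my $io_slots = Thread::Semaphore->new($MAX_CONCURRENT);

my $active :shared = 0;    # threads currently "doing I/O"
my $peak   :shared = 0;    # highest value $active ever reached

sub do_io {
    $io_slots->down();                        # blocks while all slots are taken
    {
        lock($active);                        # this lock also guards $peak
        $active++;
        $peak = $active if $active > $peak;
    }
    select(undef, undef, undef, 0.05);        # stand-in for a real read or write
    { lock($active); $active--; }
    $io_slots->up();                          # hand the slot back
}

my @workers = map { threads->create(\&do_io) } 1 .. 6;
$_->join() for @workers;

print("Peak concurrency: $peak (quota $MAX_CONCURRENT)\n");
```

       Six threads are started, but the reported peak never exceeds the
       quota, because the other threads sit blocked in down().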
728
729       Larger increments or decrements are handy in those cases where a thread
730       needs to check out or return a number of resources at once.
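
       As a rough illustration, suppose a pool of four buffers (a made-up
       resource) and threads that each need three of them at the same time.
       The name $buffers and the sizes are assumptions for this sketch.

```perl
use strict;
use warnings;
use threads;
use Thread::Semaphore;

my $buffers = Thread::Semaphore->new(4);      # hypothetical pool of 4 buffers

sub needs_three {
    $buffers->down(3);    # blocks until three buffers are free at once
    # ... work with the three buffers here ...
    $buffers->up(3);      # return all three in one call
    return 1;
}

my @thrs    = map { threads->create(\&needs_three) } 1 .. 3;
my @results = map { $_->join() } @thrs;

print("Threads finished: ", scalar(@results), "\n");
```

       With only four buffers in the pool, at most one of these threads can
       hold its three at any moment; the others wait in down(3).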
731
732   Waiting for a Condition
733       The functions cond_wait() and cond_signal() can be used in conjunction
734       with locks to notify co-operating threads that a resource has become
735       available. They are very similar in use to the functions found in
736       "pthreads". However, for most purposes, queues are simpler to use and
737       more intuitive. See threads::shared for more details.
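
       For the curious, a minimal sketch of the pattern looks like this.
       The variables $ready and $payload are invented for the example;
       cond_wait(), cond_signal() and lock() come from threads::shared.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $ready   :shared = 0;   # the condition being waited for
my $payload :shared;       # hypothetical resource being handed over

my $consumer = threads->create(sub {
    lock($ready);
    # Loop so that we only proceed once the condition really holds.
    cond_wait($ready) until $ready;
    return "got $payload";
});

{
    lock($ready);
    $payload = 'resource';
    $ready   = 1;
    cond_signal($ready);   # wake one thread waiting on $ready
}

my $result = $consumer->join();
print("$result\n");        # prints "got resource"
```

       The waiting thread tests $ready under the lock before it waits, so
       it does not matter which thread gets there first.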
738
739   Giving up control
740       There are times when you may find it useful to have a thread explicitly
741       give up the CPU to another thread.  For example, you may be doing
742       something processor-intensive and want to make sure that the
743       user-interface thread gets called frequently.  Whatever the reason,
744       there are times when you might want a thread to give up the processor.
745
746       Perl's threading package provides the yield() function that does this.
747       yield() is pretty straightforward, and works like this:
748
749           use threads;
750
751           sub loop {
752               my $thread = shift;
753               my $foo = 50;
754               while($foo--) { print("In thread $thread\n"); }
755               threads->yield();
756               $foo = 50;
757               while($foo--) { print("In thread $thread\n"); }
758           }
759
760           my $thr1 = threads->create(\&loop, 'first');
761           my $thr2 = threads->create(\&loop, 'second');
762           my $thr3 = threads->create(\&loop, 'third');
763
764       It is important to remember that yield() is only a hint to give up the
765       CPU; what actually happens depends on your hardware, OS, and threading
766       libraries.  On many operating systems, yield() is a no-op.  Therefore,
767       you should not build the scheduling of your threads around yield()
768       calls.  It might work on your platform, but it may not work on
769       another.
770

General Thread Utility Routines

772       We've covered the workhorse parts of Perl's threading package, and with
773       these tools you should be well on your way to writing threaded code and
774       packages.  There are a few useful little pieces that didn't really fit
775       in anyplace else.
776
777   What Thread Am I In?
778       The "threads->self()" class method provides your program with a way to
779       get an object representing the thread it's currently in.  You can use
780       this object in the same way as the ones returned from thread creation.
781
782   Thread IDs
783       tid() is a thread object method that returns the thread ID of the
784       thread the object represents.  Thread IDs are integers, with the main
785       thread in a program being 0.  Currently Perl assigns a unique TID to
786       every thread ever created in your program, assigning the first thread
787       to be created a TID of 1, and increasing the TID by 1 for each new
788       thread that's created.  When used as a class method, "threads->tid()"
789       can be used by a thread to get its own TID.
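
       A small sketch that puts self() and tid() together follows.  The
       subroutine name report_tid is made up for the example.

```perl
use strict;
use warnings;
use threads;

sub report_tid {
    my $self = threads->self();          # object for the current thread
    return $self->tid();                 # same value as threads->tid()
}

my $main_tid    = threads->tid();        # class-method form, called in main
my $thr         = threads->create(\&report_tid);
my $parent_view = $thr->tid();           # TID as seen through the object
my $child_view  = $thr->join();          # TID as seen from inside the thread

print("main thread TID: $main_tid\n");
print("child TID (from parent): $parent_view\n");
print("child TID (from child):  $child_view\n");
```

       The last two values agree, since both describe the same thread.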
790
791   Are These Threads The Same?
792       The equal() method takes two thread objects and returns true if the
793       objects represent the same thread, and false if they don't.
794
795       Thread objects also have an overloaded "==" comparison so that you can
796       compare them as you would normal objects.
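
       The sketch below compares thread objects both ways.  It fetches a
       second object for the first thread from "threads->list()", so that
       two different Perl objects refer to one thread; the variable names
       are invented for the example.

```perl
use strict;
use warnings;
use threads;

my $thr_a = threads->create(sub { return threads->tid() });
my $thr_b = threads->create(sub { return threads->tid() });

# Get another object for the first thread by searching the thread list.
my ($again) = grep { $_->tid() == $thr_a->tid() } threads->list();

my $same_method   = $thr_a->equal($again) ? 1 : 0;   # same thread
my $same_operator = ($thr_a == $again)    ? 1 : 0;   # overloaded ==
my $different     = ($thr_a == $thr_b)    ? 1 : 0;   # two distinct threads

$_->join() for ($thr_a, $thr_b);

print("equal(): $same_method, ==: $same_operator, a == b: $different\n");
# prints "equal(): 1, ==: 1, a == b: 0"
```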
797
798   What Threads Are Running?
799       "threads->list()" returns a list of thread objects, one for each thread
800       that's currently running and not detached.  Handy for a number of
801       things, including cleaning up at the end of your program (from the main
802       Perl thread, of course):
803
804           # Loop through all the threads
805           foreach my $thr (threads->list()) {
806               $thr->join();
807           }
808
809       If some threads have not finished running when the main Perl thread
810       ends, Perl will warn you about it and die, since it is impossible for
811       Perl to clean up after itself while other threads are running.
812
813       NOTE:  The main Perl thread (thread 0) is in a detached state, and so
814       does not appear in the list returned by "threads->list()".
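
       To see which threads are reported, here is a small sketch with one
       ordinary thread and one detached thread.  The variable names are
       made up, and the short pause at the end is only a crude way of
       letting the detached thread finish before the program exits.

```perl
use strict;
use warnings;
use threads;

my $joinable = threads->create(sub { return 'done' });
my $detached = threads->create(sub { return });
my ($joinable_tid, $detached_tid) = ($joinable->tid(), $detached->tid());
$detached->detach();

# Thread objects for every thread that is neither joined nor detached.
my @listed_tids = map { $_->tid() } threads->list();

$joinable->join();
select(undef, undef, undef, 0.1);   # crude: let the detached thread wind down

print("joinable TID $joinable_tid, detached TID $detached_tid\n");
print("threads->list() reported: @listed_tids\n");
```

       The joinable thread's TID is in the reported list; the detached
       thread's TID and the main thread's TID of 0 are not.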
815

A Complete Example

817       Confused yet? It's time for an example program to show some of the
818       things we've covered.  This program finds prime numbers using threads.
819
820          1 #!/usr/bin/perl
821          2 # prime-pthread, courtesy of Tom Christiansen
822          3
823          4 use v5.36;
824          5
825          6 use threads;
826          7 use Thread::Queue;
827          8
828          9 sub check_num ($upstream, $cur_prime) {
829         10     my $kid;
830         11     my $downstream = Thread::Queue->new();
831         12     while (my $num = $upstream->dequeue()) {
832         13         next unless ($num % $cur_prime);
833         14         if ($kid) {
834         15             $downstream->enqueue($num);
835         16         } else {
836         17             print("Found prime: $num\n");
837         18             $kid = threads->create(\&check_num, $downstream, $num);
838         19             if (! $kid) {
839         20                 warn("Sorry.  Ran out of threads.\n");
840         21                 last;
841         22             }
842         23         }
843         24     }
844         25     if ($kid) {
845         26         $downstream->enqueue(undef);
846         27         $kid->join();
847         28     }
848         29 }
849         30
850         31 my $stream = Thread::Queue->new(3..1000, undef);
851         32 check_num($stream, 2);
852
853       This program uses the pipeline model to generate prime numbers.  Each
854       thread in the pipeline has an input queue that feeds numbers to be
855       checked, a prime number that it's responsible for, and an output queue
856       into which it funnels numbers that have failed the check.  If the
857       thread has a number that's failed its check and there's no child
858       thread, then the thread must have found a new prime number.  In that
859       case, a new child thread is created for that prime and stuck on the end
860       of the pipeline.
861
862       This probably sounds a bit more confusing than it really is, so let's
863       go through this program piece by piece and see what it does.  (For
864       those of you who might be trying to remember exactly what a prime
865       number is, it's a number that's only evenly divisible by itself and 1.)
866
867       The bulk of the work is done by the check_num() subroutine, which takes
868       a reference to its input queue and a prime number that it's responsible
869       for.  We create a new queue (line 11) and reserve a scalar for the
870       thread that we're likely to create later (line 10).
871
872       The while loop from line 12 to line 24 grabs a scalar off the input
873       queue and checks against the prime this thread is responsible for.
874       Line 13 checks to see if there's a remainder when we divide the number
875       to be checked by our prime.  If there is one, the number must not be
876       evenly divisible by our prime, so we need to either pass it on to the
877       next thread if we've created one (line 15) or create a new thread if we
878       haven't.
879
880       The new thread creation is line 18.  We pass on to it a reference to
881       the queue we've created, and the prime number we've found.  In lines 19
882       through 22, we check to make sure that our new thread got created, and
883       if not, we stop checking any remaining numbers in the queue.
884
885       Finally, once the loop terminates (because we got a 0 or "undef" in the
886       queue, which serves as a note to terminate), we pass on the notice to
887       our child, and wait for it to exit if we've created a child (lines 25
888       through 28).
889
890       Meanwhile, back in the main thread, we first create a queue (line 31)
891       and queue up all the numbers from 3 to 1000 for checking, plus a
892       termination notice.  Then all we have to do to get the ball rolling is
893       pass the queue and the first prime to the check_num() subroutine (line
894       32).
895
896       That's how it works.  It's pretty simple; as with many Perl programs,
897       the explanation is much longer than the program.
898

Different implementations of threads

900       Here is some background on thread implementations from the operating
901       system viewpoint.  There are three basic categories of threads:
902       user-mode threads, kernel threads, and multiprocessor kernel threads.
903
904       User-mode threads are threads that live entirely within a program and
905       its libraries.  In this model, the OS knows nothing about threads.  As
906       far as it's concerned, your process is just a process.
907
908       This is the easiest way to implement threads, and the way most OSes
909       start.  The big disadvantage is that, since the OS knows nothing about
910       threads, if one thread blocks they all do.  Typical blocking activities
911       include most system calls, most I/O, and things like sleep().
912
913       Kernel threads are the next step in thread evolution.  The OS knows
914       about kernel threads, and makes allowances for them.  The main
915       difference between a kernel thread and a user-mode thread is blocking.
916       With kernel threads, things that block a single thread don't block
917       other threads.  This is not the case with user-mode threads, where the
918       kernel blocks at the process level and not the thread level.
919
920       This is a big step forward, and can give a threaded program quite a
921       performance boost over non-threaded programs.  Threads that block
922       performing I/O, for example, won't block threads that are doing other
923       things.  Each process still has only one thread running at once,
924       though, regardless of how many CPUs a system might have.
925
926       Since kernel threads can be interrupted at any time, they will
927       uncover some of the implicit locking assumptions you may make in your
928       program.  For example, something as simple as "$x = $x + 2" can behave
929       unpredictably with kernel threads if $x is visible to other threads, as
930       another thread may have changed $x between the time it was fetched on
931       the right hand side and the time the new value is stored.
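
       One way to guard such an update is to hold a lock around the fetch
       and the store.  This is a minimal sketch using lock() from
       threads::shared; the subroutine name and the loop counts are
       arbitrary choices for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $x :shared = 0;

sub add_two_many_times {
    for (1 .. 1000) {
        lock($x);          # held until the end of this loop iteration
        $x = $x + 2;       # fetch and store now happen as one unit
    }
}

my @thrs = map { threads->create(\&add_two_many_times) } 1 .. 4;
$_->join() for @thrs;

print("x = $x\n");         # prints "x = 8000" (4 threads * 1000 * 2)
```

       Without the lock() line, some updates could be lost and the final
       value could come out lower.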
932
933       Multiprocessor kernel threads are the final step in thread support.
934       With multiprocessor kernel threads on a machine with multiple CPUs, the
935       OS may schedule two or more threads to run simultaneously on different
936       CPUs.
937
938       This can give a serious performance boost to your threaded program,
939       since more than one thread will be executing at the same time.  As a
940       tradeoff, though, any of those nagging synchronization issues that
941       might not have shown with basic kernel threads will appear with a
942       vengeance.
943
944       In addition to the different levels of OS involvement in threads,
945       different OSes (and different thread implementations for a particular
946       OS) allocate CPU cycles to threads in different ways.
947
948       Cooperative multitasking systems have running threads give up control
949       if one of two things happens.  If a thread calls a yield function, it
950       gives up control.  It also gives up control if the thread does
951       something that would cause it to block, such as perform I/O.  In a
952       cooperative multitasking implementation, one thread can starve all the
953       others for CPU time if it so chooses.
954
955       Preemptive multitasking systems interrupt threads at regular intervals
956       while the system decides which thread should run next.  In a preemptive
957       multitasking system, one thread usually won't monopolize the CPU.
958
959       On some systems, there can be cooperative and preemptive threads
960       running simultaneously. (Threads running with realtime priorities often
961       behave cooperatively, for example, while threads running at normal
962       priorities behave preemptively.)
963
964       Most modern operating systems support preemptive multitasking.
965

Performance considerations

967       The main thing to bear in mind when comparing Perl's ithreads to other
968       threading models is the fact that for each new thread created, a
969       complete copy of all the variables and data of the parent thread has to
970       be taken. Thus, thread creation can be quite expensive, both in terms
971       of memory usage and time spent in creation. The ideal way to reduce
972       these costs is to have a relatively small number of long-lived threads,
973       all created fairly early on (before the base thread has accumulated too
974       much data). Of course, this may not always be possible, so compromises
975       have to be made. However, after a thread has been created, its
976       performance and extra memory usage should be little different from
977       ordinary code.
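
       One common shape for this is a small pool of workers fed from a
       queue.  The sketch below assumes a pool of three and uses squaring
       as a stand-in for real work; the names $jobs, $results and
       $POOL_SIZE are invented for the example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $POOL_SIZE = 3;                         # assumed; tune for your workload
my $jobs      = Thread::Queue->new();
my $results   = Thread::Queue->new();

# Create the workers first, before the main thread has much data to copy.
my @pool = map {
    threads->create(sub {
        while (defined(my $n = $jobs->dequeue())) {
            $results->enqueue($n * $n);    # stand-in for real work
        }
    });
} 1 .. $POOL_SIZE;

$jobs->enqueue(1 .. 20);
$jobs->enqueue((undef) x $POOL_SIZE);      # one termination notice per worker
$_->join() for @pool;

my $total = 0;
while (defined(my $r = $results->dequeue_nb())) { $total += $r; }
print("Sum of squares: $total\n");         # prints "Sum of squares: 2870"
```

       The three threads are created once and reused for all twenty jobs,
       so the cost of copying the parent's data is paid only three times.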
978
979       Also note that under the current implementation, shared variables use a
980       little more memory and are a little slower than ordinary variables.
981

Process-scope Changes

983       Note that while threads themselves are separate execution threads and
984       Perl data is thread-private unless explicitly shared, the threads can
985       affect process-scope state, affecting all the threads.
986
987       The most common example of this is changing the current working
988       directory using chdir().  One thread calls chdir(), and the working
989       directory of all the threads changes.
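
       A short sketch shows the effect.  It assumes a typical Unix build of
       Perl, where chdir() acts on the whole process, and it uses the root
       directory only as a convenient target.

```perl
use strict;
use warnings;
use threads;
use Cwd qw(getcwd);
use File::Spec;

my $before = getcwd();
my $target = File::Spec->rootdir();     # any other directory would do

# The thread changes directory and exits; nothing is shared explicitly.
threads->create(sub { chdir($target) or die("chdir failed: $!") })->join();

my $after = getcwd();                   # what the main thread sees now
chdir($before) or die("chdir failed: $!");   # put things back

print("before: $before\nafter:  $after\n");
```

       On such a system the "after" line shows the directory chosen by the
       other thread, even though the main thread never called chdir()
       before that point.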
990
991       An even more drastic example of a process-scope change is chroot(): the
992       root directory of all the threads changes, and no thread can undo it
993       (as opposed to chdir()).
994
995       Further examples of process-scope changes include umask() and changing
996       uids and gids.
997
998       Thinking of mixing fork() and threads?  Please lie down and wait until
999       the feeling passes.  Be aware that the semantics of fork() vary between
1000       platforms.  For example, some Unix systems copy all the current threads
1001       into the child process, while others only copy the thread that called
1002       fork(). You have been warned!
1003
1004       Similarly, mixing signals and threads may be problematic.
1005       Implementations are platform-dependent, and even the POSIX semantics
1006       may not be what you expect (and Perl doesn't even give you the full
1007       POSIX API).  For example, there is no way to guarantee that a signal
1008       sent to a multi-threaded Perl application will get intercepted by any
1009       particular thread.  (However, the threads module does provide the
1010       capability to send signals between threads.  See "THREAD SIGNALLING" in
1011       threads for more details.)
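
       As a rough sketch of that thread-to-thread signalling, a worker can
       install a handler and another thread can trigger it with the kill()
       method of the threads module.  The names worker and $handled are
       made up, and the fixed pause before signalling is a crude way to
       let the worker install its handler first.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $handled :shared = 0;

sub worker {
    # Runs in this thread when another thread calls ->kill('KILL') on it.
    $SIG{'KILL'} = sub { $handled = 1; threads->exit(); };
    while (1) {
        select(undef, undef, undef, 0.05);   # short naps between checks
    }
}

my $thr = threads->create(\&worker);
select(undef, undef, undef, 0.3);   # crude: let the worker install its handler
$thr->kill('KILL');                 # delivered by Perl, not by the OS
$thr->join();

print("handler ran: $handled\n");
```

       The worker naps in short steps because the handler only runs once
       the thread's current operation has completed.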
1012

Thread-Safety of System Libraries

1014       Whether various library calls are thread-safe is outside the control of
1015       Perl.  Calls often suffering from not being thread-safe include:
1016       localtime(), gmtime(), functions fetching user, group and network
1017       information (such as getgrent(), gethostent(), getnetent() and so on),
1018       readdir(), rand(), and srand().  In general, these are calls that
1019       depend on some global external state.
1020
1021       If the system on which Perl is compiled has thread-safe variants of
1022       such calls, they will be used.  Beyond that, Perl is at the mercy of
1023       the thread-safety or -unsafety of the calls.  Please consult your C
1024       library call documentation.
1025
1026       On some platforms the thread-safe library interfaces may fail if the
1027       result buffer is too small (for example the user group databases may be
1028       rather large, and the reentrant interfaces may have to carry around a
1029       full snapshot of those databases).  Perl will start with a small
1030       buffer, but keep retrying and growing the result buffer until the
1031       result fits.  If this limitless growing sounds bad for security or
1032       memory consumption reasons you can recompile Perl with
1033       "PERL_REENTRANT_MAXSIZE" defined to the maximum number of bytes you
1034       will allow.
1035

Conclusion

1037       A complete thread tutorial could fill a book (and has, many times), but
1038       with what we've covered in this introduction, you should be well on
1039       your way to becoming a threaded Perl expert.
1040

Bibliography

1056       Here's a short bibliography courtesy of Jürgen Christoffel:
1057
1058   Introductory Texts
1059       Birrell, Andrew D. An Introduction to Programming with Threads. Digital
1060       Equipment Corporation, 1989, DEC-SRC Research Report #35 online as
1061       <https://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-35.pdf> (highly
1062       recommended)
1063
1064       Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A
1065       Guide to Concurrency, Communication, and Multithreading. Prentice-Hall,
1066       1996.
1067
1068       Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with
1069       Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written
1070       introduction to threads).
1071
1072       Nelson, Greg (editor). Systems Programming with Modula-3.  Prentice
1073       Hall, 1991, ISBN 0-13-590464-1.
1074
1075       Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell.
1076       Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1
1077       (covers POSIX threads).
1078
1079   OS-Related References
1080       Boykin, Joseph, David Kirschen, Alan Langerman, and Susan LoVerso.
1081       Programming under Mach. Addison-Wesley, 1994, ISBN 0-201-52739-1.
1082
1083       Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall,
1084       1995, ISBN 0-13-219908-4 (great textbook).
1085
1086       Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts,
1087       4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4.
1088
1089   Other References
1090       Arnold, Ken and James Gosling. The Java Programming Language, 2nd ed.
1091       Addison-Wesley, 1998, ISBN 0-201-31006-6.
1092
1093       comp.programming.threads FAQ,
1094       <http://www.serpentine.com/~bos/threads-faq/>
1095
1096       Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage
1097       Collection on Virtually Shared Memory Architectures" in Memory
1098       Management: Proc. of the International Workshop IWMM 92, St. Malo,
1099       France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer,
1100       1992, ISBN 3-540-55940-X (real-life thread applications).
1101
1102       Artur Bergman, "Where Wizards Fear To Tread", June 11, 2002,
1103       <http://www.perl.com/pub/a/2002/06/11/threads.html>
1104

Acknowledgements

1106       Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy
1107       Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel, Joshua
1108       Pritikin, and Alan Burlison, for their help in reality-checking and
1109       polishing this article.  Big thanks to Tom Christiansen for his rewrite
1110       of the prime number generator.
1111

AUTHOR

1113       Dan Sugalski <dan@sidhe.org>
1114
1115       Slightly modified by Artur Bergman to fit the new thread model/module.
1116
1117       Reworked slightly by Jörg Walter <jwalt@cpan.org> to be more concise
1118       about thread-safety of Perl code.
1119
1120       Rearranged slightly by Elizabeth Mattijsen <liz@dijkmat.nl> to put less
1121       emphasis on yield().
1122

Copyrights

1124       The original version of this article appeared in The Perl
1125       Journal #10, and is copyright 1998 The Perl Journal. It appears
1126       courtesy of Jon Orwant and The Perl Journal.  This document may be
1127       distributed under the same terms as Perl itself.
1128
1129
1130
1131perl v5.38.2                      2023-11-30                     PERLTHRTUT(1)