Chatbot::Eliza(3)     User Contributed Perl Documentation    Chatbot::Eliza(3)

NAME

       Chatbot::Eliza - A clone of the classic Eliza program

SYNOPSIS

         use Chatbot::Eliza;

         $mybot = new Chatbot::Eliza;
         $mybot->command_interface;

         # see below for details

DESCRIPTION

       This module implements the classic Eliza algorithm.  The original Eliza
       program was written by Joseph Weizenbaum and described in the
       Communications of the ACM in 1966.  Eliza is a mock Rogerian
       psychotherapist.  It prompts for user input, and uses a simple
       transformation algorithm to change user input into a follow-up
       question.  The program is designed to give the appearance of
       understanding.

       This program is a faithful implementation of the program described by
       Weizenbaum.  It uses a simplified script language (devised by Charles
       Hayden).  The content of the script is the same as Weizenbaum's.

       This module encapsulates the Eliza algorithm in the form of an object.
       This should make the functionality easy to incorporate in larger
       programs.

INSTALLATION

       The current version of Chatbot::Eliza is available on CPAN:

         http://www.perl.com/CPAN/modules/by-module/Chatbot/

       To install this package, change to the directory created by
       untarring the package, and type the following:

               perl Makefile.PL
               make
               make test
               make install

       This will copy Eliza.pm to your perl library directory for use by all
       perl scripts.  You will probably need to be root to do this, unless
       you have installed a personal copy of perl.

USAGE

       This is all you need to do to launch a simple Eliza session:

               use Chatbot::Eliza;

               $mybot = new Chatbot::Eliza;
               $mybot->command_interface;

       You can also customize certain features of the session:

               $myotherbot = new Chatbot::Eliza;

               $myotherbot->name( "Hortense" );
               $myotherbot->debug( 1 );

               $myotherbot->command_interface;

       These lines set the name of the bot to be "Hortense" and turn on the
       debugging output.

       When creating an Eliza object, you can specify a name and an
       alternative scriptfile:

               $bot = new Chatbot::Eliza "Brian", "myscript.txt";

       You can also use an anonymous hash to set these parameters.  Any of the
       fields can be initialized using this syntax:

               $bot = new Chatbot::Eliza {
                       name       => "Brian",
                       scriptfile => "myscript.txt",
                       debug      => 1,
                       prompts_on => 1,
                       memory_on  => 0,
                       myrand     =>
                               sub { my $N = defined $_[0] ? $_[0] : 1;  rand($N); },
               };

       If you don't specify a script file, then the new object will be
       initialized with a default script.  The module contains this script
       within itself.

       You can use any of the internal functions in a calling program.  The
       code below takes an arbitrary string and retrieves the reply from the
       Eliza object:

               my $string = "I have too many problems.";
               my $reply  = $mybot->transform( $string );

       You can easily create two bots, each with a different script, and see
       how they interact:

               use Chatbot::Eliza;

               my ($harry, $sally, $he_says, $she_says);

               $sally = new Chatbot::Eliza "Sally", "histext.txt";
               $harry = new Chatbot::Eliza "Harry", "hertext.txt";

               $he_says  = "I am sad.";

               # Seed the random number generator.
               srand( time ^ ($$ + ($$ << 15)) );

               # Runs until interrupted.
               while (1) {
                       $she_says = $sally->transform( $he_says );
                       print $sally->name, ": $she_says \n";

                       $he_says  = $harry->transform( $she_says );
                       print $harry->name, ": $he_says \n";
               }

       Mechanically, this works well.  However, it critically depends on the
       actual script data.  Having two mock Rogerian therapists talk to each
       other usually does not produce any sensible conversation, of course.

       After each call to the transform() method, the debugging output for
       that transformation is stored in a variable called $debug_text.

               my $reply      = $mybot->transform( "My foot hurts" );
               my $debugging  = $mybot->debug_text;

       This feature is always available, even if the instance's $debug
       variable is set to 0.

       Calling programs can specify their own random-number generators.  Use
       this syntax:

               $chatbot = new Chatbot::Eliza;
               $chatbot->myrand(
                       sub {
                               # function goes here
                       }
               );

       The custom random function should have the same prototype as perl's
       built-in rand() function.  That is, it should take a single (numeric)
       expression as a parameter, and it should return a floating-point value
       between 0 and that number.

       What this code actually does is pass a reference to an anonymous
       subroutine ("code reference").  Make sure you've read the perlref
       manpage for details on how code references actually work.

       If you don't specify any custom rand function, then the Eliza object
       will just use the built-in rand() function.
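
       As an illustration of the calling convention described above, here is
       a minimal sketch of a repeatable generator.  The helper name
       make_seeded_rand and the choice of a Park-Miller step are assumptions
       made for this example; they are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Chatbot::Eliza): returns a code
# reference that is called like rand(), with one optional numeric
# argument N, and yields a value in the range [0, N).
sub make_seeded_rand {
    my ($seed) = @_;
    my $state = $seed % 2147483647;
    $state = 1 if $state <= 0;
    return sub {
        my $n = defined $_[0] ? $_[0] : 1;
        # Park-Miller step: state * 16807 mod (2**31 - 1).
        $state = ( $state * 16807 ) % 2147483647;
        return $n * ( $state / 2147483647 );
    };
}

my $myrand = make_seeded_rand(42);
printf "%.4f\n", $myrand->(10) for 1 .. 3;

# With the module installed, the generator would be attached like this:
#   $chatbot->myrand( make_seeded_rand(42) );
```

       A fixed seed makes a session repeatable, which can help when
       comparing debugging output between runs.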

MAIN DATA MEMBERS

       Each Eliza object uses the following data structures to hold the script
       data in memory:

   %decomplist
       Keys: the set of keywords;  Values: strings containing the
       decomposition rules.

   %reasmblist
       Keys: strings which are each the join of a keyword and a
       corresponding decomposition rule; Values: the set of possible
       reassembly statements for that keyword and decomposition rule.

   %reasmblist_for_memory
       This structure is identical to %reasmblist, except that these rules are
       only invoked when a user comment is being retrieved from memory. These
       contain comments such as "Earlier you mentioned that...," which are
       only appropriate for remembered comments.  Rules in the script must be
       specially marked in order to be included in this list rather than
       %reasmblist. The default script only has a few of these rules.

   @memory
       A list of user comments which an Eliza instance is remembering for
       future use.  Eliza does not remember everything, only some things.  In
       this implementation, Eliza will only remember comments which match a
       decomposition rule which actually has reassembly rules that are marked
       with the keyword "reasm_for_memory" rather than the normal "reasmb".
       The default script only has a few of these.

   %keyranks
       Keys: the set of keywords;  Values: the ranks for each keyword

   @quit
       "quit" words -- that is, words the user might use to try to exit the
       program.

   @initial
       Possible greetings for the beginning of the program.

   @final
       Possible farewells for the end of the program.

   %pre
       Keys: words which are replaced before any transformations; Values: the
       respective replacement words.

   %post
       Keys: words which are replaced after the transformations and after the
       reply is constructed;  Values: the respective replacement words.

   %synon
       Keys: words which are found in decomposition rules; Values: words which
       are treated just like their corresponding synonyms during matching of
       decomposition rules.

   Other data members
       There are several other internal data members.  Hopefully these are
       sufficiently obvious that you can learn about them just by reading the
       source code.

METHODS

   new()
           my $chatterbot = new Chatbot::Eliza;

       new() creates a new Eliza object.  This method also calls the internal
       _initialize() method, which in turn calls the parse_script_data()
       method, which initializes the script data.

           my $chatterbot = new Chatbot::Eliza 'Ahmad', 'myfile.txt';

       The Eliza object defaults to the name "Eliza", and it contains default
       script data within itself.  However, using the syntax above, you can
       specify an alternative name and an alternative script file.

       See the method parse_script_data() for a description of the format of
       the script file.

   command_interface()
           $chatterbot->command_interface;

       command_interface() opens an interactive session with the Eliza object,
       just like the original Eliza program.

       If you want to design your own session format, then you can write your
       own while loop and your own functions for prompting for and reading
       user input, and use the transform() method to generate Eliza's
       responses.  (Note: you do not need to invoke preprocess() and
       postprocess() directly, because these are invoked from within the
       transform() method.)
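
       A hand-written session loop of the kind described above might look
       like the following sketch.  The StandInBot package and the
       run_session() function are hypothetical names used only so that the
       example runs without the module installed; with the module, pass a
       Chatbot::Eliza object instead.  Only the transform() method is
       assumed.

```perl
use strict;
use warnings;

# Stand-in for illustration only, so the sketch runs on its own.
package StandInBot;
sub new       { return bless {}, shift }
sub transform { my ( $self, $text ) = @_; return "Why do you say: $text ?" }

package main;

# Hypothetical session loop: prompt, read a line, print the bot's reply.
# Returns the number of replies given.
sub run_session {
    my ( $bot, $in, $out ) = @_;
    my $turns = 0;
    while (1) {
        print {$out} "you> ";
        my $line = <$in>;
        last unless defined $line;                 # end of input
        chomp $line;
        last if $line =~ /^\s*(?:bye|quit)\b/i;    # our own exit test
        next unless $line =~ /\S/;                 # skip blank lines
        print {$out} "bot> ", $bot->transform($line), "\n";
        $turns++;
    }
    return $turns;
}

# Demo with canned input; pass \*STDIN for a real interactive session.
open my $in, '<', \"I am sad.\nMy foot hurts\nbye\n" or die "open: $!";
my $turns = run_session( StandInBot->new, $in, \*STDOUT );
print "\n";
```

       Taking the input and output handles as parameters keeps the loop
       usable both at a terminal and from a test.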

       But if you're lazy and you want to skip all that, then just use
       command_interface().  It's all done for you.

       During an interactive session invoked using command_interface(), you
       can enter the word "debug" to toggle debug mode on and off.  You can
       also enter the keyword "memory" to invoke the _debug_memory() method
       and print out the contents of the Eliza instance's memory.

   preprocess()
           $string = preprocess($string);

       preprocess() applies simple substitution rules to the input string.
       Mostly this is to catch varieties in spelling, misspellings,
       contractions and the like.

       preprocess() is called from within the transform() method.  It is
       applied to user-input text, BEFORE any processing, and before a
       reassembly statement has been selected.

       It uses the hash %pre, which is created during the parse of the
       script.

   postprocess()
           $string = postprocess($string);

       postprocess() applies simple substitution rules to the reassembly rule.
       This is where all the "I"s and "you"s are exchanged.  postprocess()
       is called from within the transform() function.

       It uses the hash %post, created during the parse of the script.
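
       The word substitution performed by preprocess() and postprocess() can
       be pictured with the sketch below.  The function name
       substitute_words and the table contents are assumptions made for
       illustration; the module's own handling of case and punctuation may
       differ.

```perl
use strict;
use warnings;

# Illustrative tables only; the real entries come from the script.
my %pre  = ( 'dont' => "don't", 'cant' => "can't", 'equivalent' => 'alike' );
my %post = ( 'i' => 'you', 'my' => 'your', 'am' => 'are', 'me' => 'you' );

# Replace each whitespace-separated word that has an entry in the table.
sub substitute_words {
    my ( $table, $text ) = @_;
    return join ' ',
        map { exists $table->{ lc $_ } ? $table->{ lc $_ } : $_ }
        split ' ', $text;
}

print substitute_words( \%pre,  "i cant sleep" ),     "\n";  # i can't sleep
print substitute_words( \%post, "my boss hates me" ), "\n";  # your boss hates you
```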

   _testquit()
            if ($self->_testquit($user_input) ) { ... }

       _testquit() detects words like "bye" and "quit" and returns true if it
       finds one of them as the first word in the sentence.

       These words are listed in the script, under the keyword "quit".
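
       A first-word test of this kind could be sketched as follows.  The
       function name first_word_is_quit and the word list are hypothetical;
       the module's own test may differ in detail.

```perl
use strict;
use warnings;

# Illustrative list; the real one comes from the script.
my @quit = qw(bye goodbye quit exit);

# Returns 1 when the first word of the input is one of the quit words.
sub first_word_is_quit {
    my ( $input, @words ) = @_;
    return 0 unless $input =~ /^\s*(\w+)/;
    my $first = lc $1;
    my $hits  = grep { $first eq lc $_ } @words;
    return $hits ? 1 : 0;
}

print first_word_is_quit( "Bye for now", @quit ), "\n";   # 1
print first_word_is_quit( "I said bye",  @quit ), "\n";   # 0
```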

   _debug_memory()
            $self->_debug_memory()

       _debug_memory() is a special function which returns the contents of
       Eliza's memory stack.

   transform()
           $reply = $chatterbot->transform( $string, $use_memory );

       transform() applies transformation rules to the user input string.  It
       invokes preprocess(), does transformations, then invokes postprocess().
       It returns the transformed output string, called $reasmb.

       The algorithm embedded in the transform() method has three main parts:

       1.  Search the input string for a keyword.

       2.  If we find a keyword, use the list of decomposition rules for that
           keyword, and pattern-match the input string against each rule.

       3.  If the input string matches any of the decomposition rules, then
           randomly select one of the reassembly rules for that decomposition
           rule, and use it to construct the reply.
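
       The three steps above can be sketched in a few lines of Perl.  This
       is an illustration under stated assumptions, not the module's code:
       the rule layout in %rules and the names decomp_to_regex, reassemble
       and respond are invented for the example, and pre- and
       post-processing are left out.

```perl
use strict;
use warnings;

# Turn a rule such as "* i remember *" into a regex in which every
# asterisk becomes a capture group and every word must match whole.
sub decomp_to_regex {
    my ($rule) = @_;
    my $pattern = join '\s*',
        map { $_ eq '*' ? '(.*?)' : '\b' . quotemeta($_) . '\b' }
        split ' ', $rule;
    return qr/^\s*$pattern\s*$/i;
}

# Fill "(N)" placeholders with the text matched by the Nth asterisk.
sub reassemble {
    my ( $template, @captures ) = @_;
    ( my $reply = $template ) =~ s{\((\d+)\)}{ $captures[ $1 - 1 ] // '' }ge;
    return $reply;
}

# Hypothetical rule table: keyword => rank and [decomp, [reasmb...]] pairs.
my %rules = (
    remember => {
        rank   => 5,
        decomp => [
            [ '* i remember *' => [
                'Do you often think of (2) ?',
                'Does thinking of (2) bring anything else to mind ?' ] ],
            [ '* do you remember *' => [
                'Did you think I would forget (2) ?',
                'What about (2) ?' ] ],
        ],
    },
);

sub respond {
    my ( $input, $rules, $rand ) = @_;
    $rand ||= sub { rand $_[0] };
    # Step 1: find keywords present in the input, highest rank first.
    my @found = sort { $rules->{$b}{rank} <=> $rules->{$a}{rank} }
                grep { $input =~ /\b\Q$_\E\b/i } keys %$rules;
    for my $key (@found) {
        # Step 2: try each decomposition rule for the keyword in order.
        for my $pair ( @{ $rules->{$key}{decomp} } ) {
            my ( $decomp, $reasmb ) = @$pair;
            my @captures = $input =~ decomp_to_regex($decomp) or next;
            # Step 3: pick one reassembly rule at random and fill it in.
            my $choice = $reasmb->[ int( $rand->( scalar @$reasmb ) ) ];
            return reassemble( $choice, @captures );
        }
    }
    return;    # no keyword found, or no rule matched
}

print respond( "I remember my old house", \%rules, sub { 0 } ), "\n";
```

       Passing the random function as a parameter mirrors the myrand
       mechanism and makes the choice of reassembly rule testable.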

       transform() takes two parameters.  The first is the string we want to
       transform.  The second is a flag which indicates where this string came
       from.  If the flag is set, then the string has been pulled from memory,
       and we should use reassembly rules appropriate for that.  If the flag
       is not set, then the string is the most recent user input, and we can
       use the ordinary reassembly rules.

       The memory flag is only set when the transform() function is called
       recursively.  The mechanism for setting this parameter is embedded in
       the transform() method itself.  If the flag is set inappropriately, it
       is ignored.

   How memory is used
       In the script, some reassembly rules are special.  They are marked with
       the keyword "reasm_for_memory", rather than just "reasmb".  Eliza
       "remembers" any comment when it matches a decomposition rule for which
       there are any reassembly rules for memory.  An Eliza object remembers
       up to $max_memory_size (default: 5) user input strings.

       If, during a subsequent run, the transform() method fails to find any
       appropriate decomposition rule for a user's comment, and if there are
       any comments inside the memory array, then Eliza may elect to ignore
       the most recent comment and instead pull out one of the strings from
       memory.  In this case, the transform method is called recursively with
       the memory flag.
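
       The bounded memory described above could be sketched as follows.
       The function names remember and recall are hypothetical, and two
       details are assumptions made for the example: that the oldest
       comment is dropped when the list is full, and that a recalled
       comment is removed from the list.

```perl
use strict;
use warnings;

my $max_memory_size = 5;    # the default stated above
my @memory;

# Store a comment; when the list is full, drop the oldest one
# (this eviction policy is an assumption).
sub remember {
    my ($comment) = @_;
    push @memory, $comment;
    shift @memory while @memory > $max_memory_size;
    return scalar @memory;
}

# Pull one remembered comment at random, or return nothing if the
# memory is empty.  Removing the comment is an assumption.
sub recall {
    my ($rand) = @_;
    $rand ||= sub { rand $_[0] };
    return unless @memory;
    return splice @memory, int( $rand->( scalar @memory ) ), 1;
}

remember("comment $_") for 1 .. 7;
print scalar(@memory), " comments kept\n";        # 5 comments kept
print "recalled: ", recall( sub { 0 } ), "\n";    # recalled: comment 3
```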

       Honestly, I am not sure exactly how this memory functionality was
       implemented in the original Eliza program.  Hopefully this
       implementation is not too far from Weizenbaum's.

       If you don't want to use the memory functionality at all, then you can
       disable it:

               $mybot->memory_on(0);

       You can also achieve the same effect by making sure that the script
       data does not contain any reassembly rules marked with the keyword
       "reasm_for_memory".  The default script data only has 4 such items.

   parse_script_data()
           $self->parse_script_data;
           $self->parse_script_data( $script_file );

       parse_script_data() is invoked from the _initialize() method, which is
       called from the new() function.  However, you can also call this method
       at any time against an already-instantiated Eliza instance.  In that
       case, the new script data is added to the old script data.  The old
       script data is not deleted.

       You can pass a parameter to this function, which is the name of the
       script file, and it will read in and parse that file.  If you do not
       pass any parameter to this method, then it will read the data embedded
       at the end of the module as its default script data.

       If you pass the name of a script file to parse_script_data(), and that
       file is not available for reading, then the module dies.

Format of the script file

       This module includes a default script file within itself, so it is not
       necessary to explicitly specify a script file when instantiating an
       Eliza object.

       Each line in the script file can specify a key, a decomposition rule,
       or a reassembly rule.

         key: remember 5
           decomp: * i remember *
             reasmb: Do you often think of (2) ?
             reasmb: Does thinking of (2) bring anything else to mind ?
           decomp: * do you remember *
             reasmb: Did you think I would forget (2) ?
             reasmb: What about (2) ?
             reasmb: goto what
         pre: equivalent alike
         synon: belief feel think believe wish

       The number after the key specifies the rank.  If a user's input
       contains the keyword, then the transform() function will try to match
       one of the decomposition rules for that keyword.  If one matches, then
       it will select one of the reassembly rules at random.  The number (2)
       here means "use whatever set of words matched the second asterisk in
       the decomposition rule."

       If you specify a list of synonyms for a word, then you should use a "@"
       when you use that word in a decomposition rule:

         decomp: * i @belief i *
           reasmb: Do you really think so ?
           reasmb: But you are not sure you (3).

       Otherwise, the script will never check to see if there are any synonyms
       for that keyword.
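
       One way to picture the "@" marker is as an alternation over the word
       and its synonyms.  The sketch below is an assumption about how such
       an expansion could work, not the module's code; the function name
       expand_synonyms is invented for the example.  The expanded group is
       written in parentheses so that it counts as a numbered part of the
       rule, consistent with the (3) in the example above.

```perl
use strict;
use warnings;

# From a script line such as "synon: belief feel think believe wish".
my %synon = ( belief => [qw(feel think believe wish)] );

# Expand "@word" in a decomposition rule into a group that accepts
# the word itself or any of its listed synonyms.
sub expand_synonyms {
    my ( $rule, $synon ) = @_;
    $rule =~ s{\@(\w+)}{
        my @alts = ( $1, @{ $synon->{$1} || [] } );
        '(' . join( '|', map { quotemeta($_) } @alts ) . ')';
    }ge;
    return $rule;
}

print expand_synonyms( '* i @belief i *', \%synon ), "\n";
# * i (belief|feel|think|believe|wish) i *
```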

       Reassembly rules should be marked with reasm_for_memory rather than
       reasmb when they are appropriate for use with a user comment that has
       been extracted from memory.

         key: my 2
           decomp: * my *
             reasm_for_memory: Let's discuss further why your (2).
             reasm_for_memory: Earlier you said your (2).
             reasm_for_memory: But your (2).
             reasm_for_memory: Does that have anything to do with the fact that your (2) ?

How the script file is parsed

       Each line in the script file contains an "entrytype" (such as key,
       decomp, or synon) and an "entry", separated by a colon.  In turn, each
       "entry" can itself be composed of a "key" and a "value", separated by
       a space.  The parse_script_data() function parses each line out, and
       splits the "entrytype" and "entry" portions of each line into two
       variables, $entrytype and $entry.

       Next, it uses the string $entrytype to determine what sort of stuff to
       expect in the $entry variable, if anything, and parses it accordingly.
       In some cases, there is no second level of key-value pair, so the
       function does not even bother to isolate or create $key and $value.

       $key is always a single word.  $value can be null, or one single word,
       or a string composed of several words, or an array of words.

       Based on all these entries and keys and values, the function creates
       two giant hashes: %decomplist, which holds the decomposition rules for
       each keyword, and %reasmblist, which holds the reassembly phrases for
       each decomposition rule.  It also creates %keyranks, which holds the
       ranks for each key.

       Six other data structures are created: %reasmblist_for_memory, %pre,
       %post, %synon, @initial, and @final.
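
       The parsing described above might be sketched as follows.  This is
       an illustration, not the module's parser: the function name
       parse_script, the "keyword|rule" join used for the %reasmblist
       keys, and the use of array references as values are all
       assumptions, and only a few entry types are handled.

```perl
use strict;
use warnings;

sub parse_script {
    my (@lines) = @_;
    my ( %decomplist, %reasmblist, %keyranks, %pre, %synon );
    my ( $this_key, $this_decomp );
    for my $line (@lines) {
        # Split "entrytype: entry"; skip lines that do not fit.
        my ( $entrytype, $entry ) = $line =~ /^\s*(\w+)\s*:\s*(.*?)\s*$/
            or next;
        if ( $entrytype eq 'key' ) {
            my ( $key, $rank ) = split ' ', $entry;
            $this_key = $key;
            $keyranks{$key} = defined $rank ? $rank : 0;
        }
        elsif ( $entrytype eq 'decomp' ) {
            $this_decomp = $entry;
            push @{ $decomplist{$this_key} }, $entry;
        }
        elsif ( $entrytype eq 'reasmb' ) {
            # Assumed key format: keyword and rule joined with "|".
            push @{ $reasmblist{"$this_key|$this_decomp"} }, $entry;
        }
        elsif ( $entrytype eq 'pre' ) {
            my ( $key, $value ) = split ' ', $entry, 2;
            $pre{$key} = $value;
        }
        elsif ( $entrytype eq 'synon' ) {
            my ( $key, @words ) = split ' ', $entry;
            $synon{$key} = \@words;
        }
    }
    return {
        decomplist => \%decomplist,
        reasmblist => \%reasmblist,
        keyranks   => \%keyranks,
        pre        => \%pre,
        synon      => \%synon,
    };
}

my $script = <<'END';
key: remember 5
  decomp: * i remember *
    reasmb: Do you often think of (2) ?
    reasmb: Does thinking of (2) bring anything else to mind ?
  decomp: * do you remember *
    reasmb: Did you think I would forget (2) ?
    reasmb: What about (2) ?
pre: equivalent alike
synon: belief feel think believe wish
END

my $data = parse_script( split /\n/, $script );
print "rank of remember: $data->{keyranks}{remember}\n";
```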

COPYRIGHT AND LICENSE

       This software is copyright (c) 2003 by John Nolan <jpnolan@sonic.net>.

       This is free software; you can redistribute it and/or modify it under
       the same terms as the Perl 5 programming language system itself.

AUTHOR

       John Nolan  jpnolan@sonic.net  January 2003.

       Implements the classic Eliza algorithm by Prof. Joseph Weizenbaum.
       Script format devised by Charles Hayden.

perl v5.38.0                      2023-07-20                 Chatbot::Eliza(3)