AI::Categorizer::Learner::KNN(3pm)  User Contributed Perl Documentation  AI::Categorizer::Learner::KNN(3pm)

NAME
       AI::Categorizer::Learner::KNN - K Nearest Neighbour Algorithm For
       AI::Categorizer

SYNOPSIS
         use AI::Categorizer::Learner::KNN;

         # Here $k is an AI::Categorizer::KnowledgeSet object

         my $nb = AI::Categorizer::Learner::KNN->new(...parameters...);
         $nb->train(knowledge_set => $k);
         $nb->save_state('filename');

         # ... time passes ...

         my $l = AI::Categorizer::Learner->restore_state('filename');
         my $c = AI::Categorizer::Collection::Files->new( path => ... );
         while (my $document = $c->next) {
           my $hypothesis = $l->categorize($document);
           print "Best assigned category: ", $hypothesis->best_category, "\n";
           print "All assigned categories: ", join(', ', $hypothesis->categories), "\n";
         }

DESCRIPTION
       This is an implementation of the k-Nearest-Neighbor decision-making
       algorithm, applied to the task of document categorization (as defined
       by the AI::Categorizer module). See AI::Categorizer for a complete
       description of the interface.

METHODS
       This class inherits from the "AI::Categorizer::Learner" class, so all
       of its methods are available unless explicitly mentioned here.

       new()
           Creates a new KNN Learner and returns it. In addition to the
           parameters accepted by the "AI::Categorizer::Learner" class, the KNN
           subclass accepts the following parameters:

           threshold
               Sets the score threshold for category membership. The default is
               currently 0.1. Lower the threshold to assign more categories
               per document; raise it to assign fewer. This can be an
               effective way to trade off precision against recall.

           k_value
               Sets the "k" value (as in k-Nearest-Neighbor) to the given integer.
               This indicates how many of each document's nearest neighbors should
               be considered when assigning categories. The default is 5.
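
           To illustrate how "k_value" and "threshold" interact, here is a
           minimal, self-contained sketch of one common k-NN formulation. It
           does not use this module and is not its implementation. The toy
           training data, the helper names ("cosine", "knn_scores", "assign"),
           the use of cosine similarity, the unweighted vote divided by k, and
           the ">=" threshold comparison are all assumptions made for the
           example.

```perl
use strict;
use warnings;

# Hypothetical toy training set: each document has a sparse term-weight
# vector and one or more category labels. Nothing here comes from the module.
my @training = (
    { cats => ['sports'],            vec => { ball  => 1, goal  => 1 } },
    { cats => ['sports'],            vec => { ball  => 1, team  => 1 } },
    { cats => ['sports', 'finance'], vec => { team  => 1, money => 1 } },
    { cats => ['finance'],           vec => { money => 1, bank  => 1 } },
    { cats => ['finance'],           vec => { bank  => 1, loan  => 1 } },
);

# Cosine similarity between two sparse vectors stored as hash references.
sub cosine {
    my ($u, $v) = @_;
    my ($dot, $uu, $vv) = (0, 0, 0);
    $dot += $u->{$_} * ($v->{$_} // 0) for keys %$u;
    $uu  += $_ ** 2 for values %$u;
    $vv  += $_ ** 2 for values %$v;
    return 0 unless $uu && $vv;
    return $dot / sqrt($uu * $vv);
}

# Score each category as the fraction of the k nearest neighbors carrying it
# (assumed scoring rule: one unweighted vote per neighbor, divided by k).
sub knn_scores {
    my ($query, $k) = @_;
    my @nearest = sort { $b->{sim} <=> $a->{sim} }
                  map  { +{ sim => cosine($query, $_->{vec}), cats => $_->{cats} } }
                  @training;
    splice @nearest, $k if @nearest > $k;    # keep only the k most similar

    my %votes;
    for my $neighbor (@nearest) {
        $votes{$_}++ for @{ $neighbor->{cats} };
    }

    my %scores;
    $scores{$_} = $votes{$_} / $k for keys %votes;
    return \%scores;
}

# Keep categories whose score reaches the threshold, best score first.
sub assign {
    my ($scores, $threshold) = @_;
    return sort { $scores->{$b} <=> $scores->{$a} or $a cmp $b }
           grep { $scores->{$_} >= $threshold } keys %$scores;
}

my %query  = (ball => 3, team => 2, money => 2);
my $scores = knn_scores(\%query, 3);
printf "%-8s %.3f\n", $_, $scores->{$_} for sort keys %$scores;
print "threshold 0.1: @{[ assign($scores, 0.1) ]}\n";
print "threshold 0.5: @{[ assign($scores, 0.5) ]}\n";
```

           With this toy data and k = 3, "sports" scores 1 and "finance" about
           0.33, so a threshold of 0.1 assigns both categories while a
           threshold of 0.5 assigns only "sports".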

       threshold()
           Returns the current threshold value. With an optional numeric
           argument, you may set the threshold.
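
           The combined getter/setter behaviour described above can be
           sketched with a small stand-in class. "Toy::Learner" is
           hypothetical and is not part of AI::Categorizer; in particular,
           whether the real method returns the new value after setting it is
           an assumption here.

```perl
use strict;
use warnings;

package Toy::Learner;    # hypothetical stand-in, not part of AI::Categorizer

sub new {
    my ($class, %args) = @_;
    # The 0.1 fallback mirrors the default documented for the KNN learner.
    return bless { threshold => $args{threshold} // 0.1 }, $class;
}

# Combined getter/setter: store the argument if one was passed,
# then return the current value (assumed return behaviour).
sub threshold {
    my $self = shift;
    $self->{threshold} = shift if @_;
    return $self->{threshold};
}

package main;

my $learner = Toy::Learner->new;
print "default threshold: ", $learner->threshold, "\n";
$learner->threshold(0.25);
print "updated threshold: ", $learner->threshold, "\n";
```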

       train(knowledge_set => $k)
           Trains the categorizer. This prepares it for later use in categorizing
           documents. The "knowledge_set" parameter must provide an object of the
           class "AI::Categorizer::KnowledgeSet" (or a subclass thereof),
           populated with lots of documents and categories. See
           AI::Categorizer::KnowledgeSet for the details of how to create such an
           object.

       categorize($document)
           Returns an "AI::Categorizer::Hypothesis" object representing the
           categorizer's "best guess" about which categories the given document
           should be assigned to. See AI::Categorizer::Hypothesis for more
           details on how to use this object.

       save_state($path)
           Saves the categorizer for later use. This method is inherited from
           "AI::Categorizer::Storable".

AUTHOR
       Originally written by David Bell ("<dave@student.usyd.edu.au>"),
       October 2002.

       Added to AI::Categorizer November 2002, modified, and maintained by Ken
       Williams ("<ken@mathforum.org>").

COPYRIGHT
       Copyright 2000-2003 Ken Williams. All rights reserved.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

SEE ALSO
       AI::Categorizer(3)

       "A re-examination of text categorization methods" by Yiming Yang
       <http://www.cs.cmu.edu/~yiming/publications.html>

perl v5.38.0                      2023-07-20      AI::Categorizer::Learner::KNN(3pm)