AI::Categorizer::Learner::KNN(3)  User Contributed Perl Documentation  AI::Categorizer::Learner::KNN(3)

NAME

       AI::Categorizer::Learner::KNN - K Nearest Neighbour Algorithm For
       AI::Categorizer

SYNOPSIS

         use AI::Categorizer::Learner::KNN;

         # Here $k is an AI::Categorizer::KnowledgeSet object

         my $knn = AI::Categorizer::Learner::KNN->new(...parameters...);
         $knn->train(knowledge_set => $k);
         $knn->save_state('filename');

         # ... time passes ...

         my $l = AI::Categorizer::Learner->restore_state('filename');
         my $c = AI::Categorizer::Collection::Files->new( path => ... );
         while (my $document = $c->next) {
           my $hypothesis = $l->categorize($document);
           print "Best assigned category: ", $hypothesis->best_category, "\n";
           print "All assigned categories: ", join(', ', $hypothesis->categories), "\n";
         }

DESCRIPTION

       This is an implementation of the k-Nearest-Neighbor decision-making
       algorithm, applied to the task of document categorization (as defined
       by the AI::Categorizer module).  See AI::Categorizer for a complete
       description of the interface.

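       The general idea can be illustrated with a minimal pure-Perl sketch.
       This is not the module's own code: it assumes cosine similarity over
       term-frequency hashes and a category score equal to the summed
       similarity of the k nearest neighbors divided by k.  The toy data and
       the names cosine() and knn_scores() are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical toy training set: each document is a term-frequency
# hash plus the list of categories it belongs to.
my @training = (
    { features => { perl => 3, regex => 1 },   categories => ['programming'] },
    { features => { perl => 1, camel => 2 },   categories => ['programming', 'animals'] },
    { features => { camel => 3, desert => 2 }, categories => ['animals'] },
);

# Cosine similarity between two sparse vectors stored as hashes.
sub cosine {
    my ($x, $y) = @_;
    my ($dot, $nx, $ny) = (0, 0, 0);
    $nx += $_ ** 2 for values %$x;
    $ny += $_ ** 2 for values %$y;
    for my $term (keys %$x) {
        $dot += $x->{$term} * $y->{$term} if exists $y->{$term};
    }
    return 0 unless $nx && $ny;
    return $dot / sqrt($nx * $ny);
}

# Rank all training documents by similarity to the query, keep the k
# nearest, and let each one vote for its categories (assumed scoring
# rule: summed similarity divided by k).
sub knn_scores {
    my ($query, $k, $docs) = @_;
    my @ranked = sort { $b->{sim} <=> $a->{sim} }
                 map  { +{ sim => cosine($query, $_->{features}), doc => $_ } }
                 @$docs;
    my $last = $k < @ranked ? $k - 1 : $#ranked;
    my %score;
    for my $neighbor (@ranked[0 .. $last]) {
        $score{$_} += $neighbor->{sim} / $k for @{ $neighbor->{doc}{categories} };
    }
    return \%score;
}

my $scores    = knn_scores({ perl => 2, camel => 1 }, 2, \@training);
my $threshold = 0.1;

# Keep only categories whose score reaches the threshold, best first.
my @assigned = sort { $scores->{$b} <=> $scores->{$a} }
               grep { $scores->{$_} >= $threshold } keys %$scores;

printf "%-12s %.3f\n", $_, $scores->{$_} for @assigned;
```

       With k set to 2, only the two most similar training documents vote,
       so the third document has no influence on the result.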

METHODS

       This class inherits from the "AI::Categorizer::Learner" class, so all
       of its methods are available unless explicitly mentioned here.

   new()
       Creates a new KNN Learner and returns it.  In addition to the
       parameters accepted by the "AI::Categorizer::Learner" class, the KNN
       subclass accepts the following parameters:

       threshold
           Sets the score threshold for category membership.  The default is
           currently 0.1.  Set the threshold lower to assign more categories
           per document; set it higher to assign fewer.  This can be an
           effective way to trade off between precision and recall.

       k_value
           Sets the "k" value (as in k-Nearest-Neighbor) to the given integer.
           This indicates how many of each document's nearest neighbors should
           be considered when assigning categories.  The default is 5.

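       The effect of the threshold on the number of assigned categories can
       be sketched in plain Perl.  The scores below are made up, assign() is
       a hypothetical helper, and the rule that a category is assigned when
       its score is at least the threshold is an assumption made for
       illustration.

```perl
use strict;
use warnings;

# Hypothetical per-category scores for a single document.
my %scores = (sports => 0.62, politics => 0.18, finance => 0.07);

# Return the categories whose score reaches the threshold, best first.
# The ">=" comparison is an assumption made for this sketch.
sub assign {
    my ($scores, $threshold) = @_;
    return sort { $scores->{$b} <=> $scores->{$a} }
           grep { $scores->{$_} >= $threshold } keys %$scores;
}

# A lower threshold admits more categories (favoring recall); a higher
# one admits fewer (favoring precision).
for my $threshold (0.05, 0.1, 0.5) {
    my @cats = assign(\%scores, $threshold);
    printf "threshold %.2f: %s\n", $threshold, join(', ', @cats);
}
```
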
   threshold()
       Returns the current threshold value.  With an optional numeric
       argument, you may set the threshold.

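       A short usage sketch of the constructor parameters and the accessor
       follows.  It assumes the module is installed and that the learner can
       be constructed with only these two parameters; the numeric values are
       arbitrary.

```perl
use strict;
use warnings;
use AI::Categorizer::Learner::KNN;

# "threshold" and "k_value" are the parameters documented under new();
# the values are chosen only for illustration.
my $knn = AI::Categorizer::Learner::KNN->new(
    threshold => 0.2,
    k_value   => 10,
);

print "Initial threshold: ", $knn->threshold, "\n";

# Lower the threshold so that more categories pass per document.
$knn->threshold(0.05);
print "Adjusted threshold: ", $knn->threshold, "\n";
```
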
   train(knowledge_set => $k)
       Trains the categorizer.  This prepares it for later use in categorizing
       documents.  The "knowledge_set" parameter must provide an object of the
       class "AI::Categorizer::KnowledgeSet" (or a subclass thereof),
       populated with lots of documents and categories.  See
       AI::Categorizer::KnowledgeSet for the details of how to create such an
       object.

   categorize($document)
       Returns an "AI::Categorizer::Hypothesis" object representing the
       categorizer's "best guess" about which categories the given document
       should be assigned to.  See AI::Categorizer::Hypothesis for more
       details on how to use this object.

   save_state($path)
       Saves the categorizer for later use.  This method is inherited from
       "AI::Categorizer::Storable".

AUTHOR

       Originally written by David Bell ("<dave@student.usyd.edu.au>"),
       October 2002.

       Added to AI::Categorizer November 2002, modified, and maintained by Ken
       Williams ("<ken@mathforum.org>").

COPYRIGHT

       Copyright 2000-2003 Ken Williams.  All rights reserved.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

SEE ALSO

       AI::Categorizer(3)

       "A re-examination of text categorization methods" by Yiming Yang
       <http://www.cs.cmu.edu/~yiming/publications.html>

perl v5.32.1                      2021-01-26  AI::Categorizer::Learner::KNN(3)