XML::FeedPP(3)        User Contributed Perl Documentation       XML::FeedPP(3)

NAME
    XML::FeedPP -- Parse/write/merge/edit RSS/RDF/Atom syndication feeds

SYNOPSIS
    Get an RSS file and parse it:

        use XML::FeedPP;

        my $source = 'http://use.perl.org/index.rss';
        my $feed = XML::FeedPP->new( $source );
        print "Title: ", $feed->title(), "\n";
        print "Date: ", $feed->pubDate(), "\n";
        foreach my $item ( $feed->get_item() ) {
            print "URL: ", $item->link(), "\n";
            print "Title: ", $item->title(), "\n";
        }

    Generate an RDF file and save it:

        my $feed = XML::FeedPP::RDF->new();
        $feed->title( "use Perl" );
        $feed->link( "http://use.perl.org/" );
        $feed->pubDate( "Thu, 23 Feb 2006 14:43:43 +0900" );
        my $item = $feed->add_item( "http://search.cpan.org/~kawasaki/XML-TreePP-0.02" );
        $item->title( "Pure Perl implementation for parsing/writing xml file" );
        $item->pubDate( "2006-02-23T14:43:43+09:00" );
        $feed->to_file( "index.rdf" );

    Convert some RSS/RDF files to Atom format:

        my $feed = XML::FeedPP::Atom->new();             # create empty atom file
        $feed->merge( "rss.xml" );                       # load local RSS file
        $feed->merge( "http://www.kawa.net/index.rdf" ); # load remote RDF file
        my $now = time();
        $feed->pubDate( $now );                          # touch date
        my $atom = $feed->to_string();                   # get Atom source code

DESCRIPTION
    "XML::FeedPP" is an all-purpose syndication utility that parses and
    publishes RSS 2.0, RSS 1.0 (RDF), Atom 0.3 and 1.0 feeds. It allows
    you to add new content, merge feeds, and convert among these formats.
    It is a pure Perl implementation and requires no module other than
    XML::TreePP.

METHODS FOR FEED
  $feed = XML::FeedPP->new( "index.rss" );
    This constructor method creates an "XML::FeedPP" feed instance. The
    only argument is a local filename. The source must be in one of the
    supported feed formats (RSS, RDF or Atom); otherwise execution is
    halted.

  $feed = XML::FeedPP->new( "http://use.perl.org/index.rss" );
    A URL on a remote web server may also be given as the first argument.
    LWP::UserAgent is required to download it.

  $feed = XML::FeedPP->new( '<rss version="2.0">...' );
    XML source code may also be given as the first argument.

  $feed = XML::FeedPP->new( $source, -type => $type );
    The "-type" argument lets you state the type of $source explicitly.
    It must be one of 'file', 'url' or 'string'.

  $feed = XML::FeedPP->new( $source, utf8_flag => 1 );
    This turns the utf8 flag on for every feed element. Perl 5.8.1 or
    later is required to use this.

    Note that any other options for the "XML::TreePP" constructor are also
    accepted in the same way. See XML::TreePP for details.
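
    As an illustration of the constructor forms above, the following
    sketch parses a feed held in a string and passes "-type" explicitly.
    The sample XML and the example.com URLs are invented for this example.

```perl
use strict;
use warnings;
use XML::FeedPP;

# A tiny RSS 2.0 document held in memory (made-up sample data).
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://www.example.com/</link>
    <item>
      <title>First post</title>
      <link>http://www.example.com/1.html</link>
    </item>
  </channel>
</rss>
XML

# -type => 'string' states that $xml is XML source, not a filename or URL.
my $feed = XML::FeedPP->new( $xml, -type => 'string' );

print "Class: ", ref($feed), "\n";
print "Title: ", $feed->title(), "\n";
```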

  $feed = XML::FeedPP::RSS->new( $source );
    This constructor method creates an instance for an RSS 2.0 feed. The
    first argument is optional, but must be a valid RSS source if
    specified. This method returns an empty instance when $source is
    undefined.

  $feed = XML::FeedPP::RDF->new( $source );
    This constructor method creates an instance for an RSS 1.0 (RDF) feed.
    The first argument is optional, but must be an RDF source if
    specified. This method returns an empty instance when $source is
    undefined.

  $feed = XML::FeedPP::Atom->new( $source );
    This constructor method creates an instance for an Atom 0.3/1.0 feed.
    The first argument is optional, but must be an Atom source if
    specified. This method returns an empty instance when $source is
    undefined.

    Atom 1.0 feeds have also been supported since "XML::FeedPP" version
    0.30. Atom 0.3 is still the default; however, a future version of this
    module may create Atom 1.0 by default.

  $feed = XML::FeedPP::Atom::Atom03->new();
    This explicitly creates an empty Atom 0.3 instance.

  $feed = XML::FeedPP::Atom::Atom10->new();
    This explicitly creates an empty Atom 1.0 instance.

  $feed = XML::FeedPP::RSS->new( link => $link, title => $title, ... );
    This creates an RSS instance which has "link", "title" elements etc.

  $feed->load( $source );
    This method loads an RSS/RDF/Atom file, much like the "new()" method
    does.

  $feed->merge( $source );
    This method merges an RSS/RDF/Atom file into the existing $feed
    instance. Top-level metadata from the imported feed is incorporated
    only if missing from the present feed.
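
    A minimal sketch of "merge()", assuming the RSS source is supplied as
    a string rather than a file: the title set beforehand should survive,
    and the imported items are added to the Atom feed. All titles and
    URLs here are placeholders.

```perl
use strict;
use warnings;
use XML::FeedPP;

# Sample RSS 2.0 source; titles and URLs are placeholders.
my $rss = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Imported feed</title>
    <link>http://www.example.com/</link>
    <item>
      <title>One</title>
      <link>http://www.example.com/1.html</link>
    </item>
    <item>
      <title>Two</title>
      <link>http://www.example.com/2.html</link>
    </item>
  </channel>
</rss>
XML

my $feed = XML::FeedPP::Atom->new();    # empty Atom feed
$feed->title( 'Combined feed' );        # metadata set before merging
$feed->merge( $rss );                   # import the RSS items

print "Title: ", $feed->title(), "\n";
print "Items: ", scalar( $feed->get_item() ), "\n";
```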

  $string = $feed->to_string( $encoding );
    This method generates XML source as a string and returns it. The
    output $encoding is optional, and the default encoding is 'UTF-8'. On
    Perl 5.8 and later, any encoding supported by the Encode module is
    available. On Perl 5.005 and 5.6.1, only the four encodings supported
    by the Jcode module are available: 'UTF-8', 'Shift_JIS', 'EUC-JP' and
    'ISO-2022-JP'. 'UTF-8' is recommended for overall compatibility.

  $string = $feed->to_string( indent => 4 );
    This makes the output more human readable by indenting appropriately.
    This does not strictly follow the XML specification but does look
    nice.

    Note that any other options for the "XML::TreePP" constructor are also
    accepted in the same way. See XML::TreePP for details.

  $feed->to_file( $filename, $encoding );
    This method generates an XML file. The output $encoding is optional,
    and the default is 'UTF-8'.
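
    The two output methods can be tried together as in the sketch below.
    The feed contents are placeholders, and the temporary directory from
    File::Temp is only an assumption made to keep the example
    self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
$feed->title( 'Example Feed' );
$feed->link( 'http://www.example.com/' );
$feed->add_item( 'http://www.example.com/1.html' )->title( 'First post' );

# Indented output in the default UTF-8 encoding.
my $xml = $feed->to_string( indent => 2 );
print $xml;

# Write the same feed into a scratch directory that is removed on exit.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/index.rss";
$feed->to_file( $path );
print "wrote ", ( -s $path ), " bytes to $path\n";
```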

  $item = $feed->add_item( $link );
    This method creates a new item/entry and returns its instance. The
    mandatory $link argument is the URL of the new item/entry.

  $item = $feed->add_item( $srcitem );
    This method duplicates an item/entry and adds it to $feed. $srcitem is
    an instance of an "XML::FeedPP::*::Item" class, as returned by the
    "get_item()" method.

  $item = $feed->add_item( link => $link, title => $title, ... );
    This method creates a new item/entry which has "link", "title"
    elements etc.
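
    The three calling forms of "add_item()" might be combined as follows.
    This is only a sketch: the feed names and URLs are invented, and the
    second feed exists just to show an item being copied across formats.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $src = XML::FeedPP::RSS->new(
    title => 'Source feed',
    link  => 'http://www.example.com/',
);

# Form 1: pass the link, then fill in the other elements.
my $one = $src->add_item( 'http://www.example.com/1.html' );
$one->title( 'First post' );

# Form 2: pass elements as key/value pairs.
my $two = $src->add_item(
    link  => 'http://www.example.com/2.html',
    title => 'Second post',
);

# Form 3: duplicate an existing item into another feed (here, RDF).
my $dst  = XML::FeedPP::RDF->new();
my $copy = $dst->add_item( $one );

print $copy->link(), " => ", $copy->title(), "\n";
```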

  $item = $feed->get_item( $index );
    This method returns item(s) in a $feed. A valid zero-based array
    $index returns the corresponding item in the feed. An invalid $index
    yields undef. If $index is undefined in array context, it returns an
    array of all items. If $index is undefined in scalar context, it
    returns the number of items.
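
    The context-dependent behaviour of "get_item()" is easiest to see side
    by side. The sketch below uses three placeholder items.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
$feed->add_item( "http://www.example.com/$_.html" ) for 1 .. 3;

my @all   = $feed->get_item();      # array context: every item
my $count = $feed->get_item();      # scalar context: number of items
my $first = $feed->get_item( 0 );   # zero-based index
my $none  = $feed->get_item( 99 );  # out of range

print "count: $count\n";
print "first: ", $first->link(), "\n";
print "out of range is ", ( defined $none ? "defined" : "undef" ), "\n";
```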

  @items = $feed->match_item( link => qr/.../, title => qr/.../, ... );
    This method finds item(s) which match all regular expressions given.
    This method returns an array of all matched items in array context.
    This method returns the first matched item in scalar context.
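
    A short sketch of "match_item()" in both contexts. The items and the
    patterns are made up for illustration.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
$feed->add_item( link => 'http://www.example.com/perl/1.html', title => 'Perl news' );
$feed->add_item( link => 'http://www.example.com/perl/2.html', title => 'Perl tips' );
$feed->add_item( link => 'http://www.example.com/misc/3.html', title => 'Other news' );

# Array context: every item whose link matches.
my @perl = $feed->match_item( link => qr{/perl/} );

# Scalar context: the first item matching both patterns.
my $tips = $feed->match_item( link => qr{/perl/}, title => qr/tips/ );

print scalar(@perl), " items under /perl/\n";
print "tips: ", $tips->link(), "\n";
```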

  $feed->remove_item( $index );
    This method removes an item/entry from $feed, where $index is a valid
    zero-based array index.

  $feed->clear_item();
    This method removes all items/entries from the $feed.

  $feed->sort_item();
    This method sorts the order of items in $feed by "pubDate".

  $feed->uniq_item();
    This method makes items unique. The second and succeeding items that
    have the same link URL are removed.

  $feed->normalize();
    This method calls both the "sort_item()" and "uniq_item()" methods.

  $feed->limit_item( $num );
    This method removes items in excess of the specified numeric limit.
    Items at the end of the list are removed. When preceded by
    "sort_item()" or "normalize()", which place the most recent items
    first, this deletes the oldest items.
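
    The clean-up methods are typically chained. The sketch below adds four
    placeholder items, one of them a duplicate link, then normalizes and
    trims the feed. It prints the surviving links so that the resulting
    order can be inspected rather than assumed.

```perl
use strict;
use warnings;
use XML::FeedPP;

my @entries = (
    [ 'http://www.example.com/a.html', '2006-02-21T14:43:43+09:00' ],
    [ 'http://www.example.com/b.html', '2006-02-23T14:43:43+09:00' ],
    [ 'http://www.example.com/c.html', '2006-02-22T14:43:43+09:00' ],
    [ 'http://www.example.com/b.html', '2006-02-23T14:43:43+09:00' ],  # duplicate link
);

my $feed = XML::FeedPP::RSS->new();
for my $entry (@entries) {
    my ( $link, $date ) = @$entry;
    $feed->add_item( $link )->pubDate( $date );
}

$feed->normalize();                       # sort_item() and uniq_item()
my $after_normalize = $feed->get_item();  # scalar context: item count

$feed->limit_item( 2 );                   # keep at most two items
my $after_limit = $feed->get_item();

print "after normalize: $after_normalize, after limit: $after_limit\n";
print $_->link(), "\n" for $feed->get_item();
```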

  $feed->xmlns( "xmlns:media" => "http://search.yahoo.com/mrss" );
    Adds an XML namespace at the document root of the feed.

  $url = $feed->xmlns( "xmlns:media" );
    Returns the URL of the specified XML namespace.

  @list = $feed->xmlns();
    Returns the list of all XML namespaces used in $feed.
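
    The three "xmlns()" forms, put together in one small sketch that
    reuses the namespace from the examples above:

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();

# Declare the namespace on the document root.
$feed->xmlns( 'xmlns:media' => 'http://search.yahoo.com/mrss' );

# Read it back by name, then list every declared namespace.
my $url  = $feed->xmlns( 'xmlns:media' );
my @list = $feed->xmlns();

print "xmlns:media = $url\n";
print "declared: @list\n";
```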

METHODS FOR CHANNEL
  $feed->title( $text );
    This method sets/gets the feed's "title" element, returning its
    current value when $text is undefined.

  $feed->description( $html );
    This method sets/gets the feed's "description" element in plain text
    or HTML, returning its current value when $html is undefined. It is
    mapped to the "content" element for Atom 0.3/1.0.

  $feed->pubDate( $date );
    This method sets/gets the feed's "pubDate" element for RSS, returning
    its current value when $date is undefined. It is mapped to the
    "dc:date" element for RDF, "modified" for Atom 0.3, and "updated" for
    Atom 1.0. See also the "DATE AND TIME FORMATS" section below.

  $feed->copyright( $text );
    This method sets/gets the feed's "copyright" element for RSS,
    returning its current value when $text is undefined. It is mapped to
    the "dc:rights" element for RDF, "copyright" for Atom 0.3, and
    "rights" for Atom 1.0.

  $feed->link( $url );
    This method sets/gets the URL of the web site as the feed's "link"
    element, returning its current value when $url is undefined.

  $feed->language( $lang );
    This method sets/gets the feed's "language" element for RSS, returning
    its current value when $lang is undefined. It is mapped to the
    "dc:language" element for RDF, and to the "xml:lang" attribute of the
    "feed" element for Atom 0.3/1.0.

  $feed->image( $url, $title, $link, $description, $width, $height );
    This method sets/gets the feed's "image" element and its child nodes,
    returning a list of current values when any arguments are undefined.

METHODS FOR ITEM
  $item->title( $text );
    This method sets/gets the item's "title" element, returning its
    current value when $text is undefined.

  $item->description( $html );
    This method sets/gets the item's "description" element in HTML or
    plain text, returning its current value when $html is undefined. It is
    mapped to the "content" element for Atom 0.3/1.0.

  $item->pubDate( $date );
    This method sets/gets the item's "pubDate" element, returning its
    current value when $date is undefined. It is mapped to the "dc:date"
    element for RDF, "modified" for Atom 0.3, and "updated" for Atom 1.0.
    See also the "DATE AND TIME FORMATS" section below.

  $item->category( $text );
    This method sets/gets the item's "category" element, returning its
    current value when $text is undefined. It is mapped to the
    "dc:subject" element for RDF, and ignored for Atom 0.3.

  $item->author( $name );
    This method sets/gets the item's "author" element, returning its
    current value when $name is undefined. It is mapped to the
    "dc:creator" element for RDF, and "author" for Atom 0.3/1.0.

  $item->guid( $guid, isPermaLink => $bool );
    This method sets/gets the item's "guid" element, returning its current
    value when $guid is undefined. It is mapped to the "id" element for
    Atom, and ignored for RDF. The second argument is optional.

  $item->set( $key => $value, ... );
    This method sets customized node values or attributes. See also the
    "ACCESSOR AND MUTATORS" section below.

  $value = $item->get( $key );
    This method returns the node value or attribute. See also the
    "ACCESSOR AND MUTATORS" section below.

  $link = $item->link();
    This method returns the item's "link" element.
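
    The item accessors above all follow the same set/get pattern, as this
    sketch on an RSS item suggests. The author name, category and URLs are
    placeholders.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
my $item = $feed->add_item( 'http://www.example.com/1.html' );

# Called with an argument, each accessor sets the element.
$item->title( 'First post' );
$item->description( '<p>Hello, <b>world</b>.</p>' );
$item->author( 'Jane Doe' );
$item->category( 'perl' );
$item->guid( 'http://www.example.com/id/1', isPermaLink => 'true' );

# Called without an argument, each accessor returns the current value.
print "title:  ", $item->title(),  "\n";
print "author: ", $item->author(), "\n";
print "guid:   ", $item->guid(),   "\n";
print "link:   ", $item->link(),   "\n";
```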

ACCESSOR AND MUTATORS
    This module understands only a subset of the "rdf:*" and "dc:*"
    modules and the default namespaces of RSS/RDF/Atom by itself. There
    are NO native methods for any other external modules, such as
    "media:*". But the "set()" and "get()" methods are available to
    get/set the value of any element or attribute for these modules.

  $item->set( "module:name" => $value );
    This sets the value of the child node:

        <item><module:name>$value</module:name>...</item>

  $item->set( "module:name@attr" => $value );
    This sets the value of the child node's attribute:

        <item><module:name attr="$value" />...</item>

  $item->set( "@attr" => $value );
    This sets the value of the item's attribute:

        <item attr="$value">...</item>

  $item->set( "hoge/pomu@hare" => $value );
    This sets the value of the child node's child node's attribute:

        <item><hoge><pomu hare="$value" /></hoge>...</item>
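
    As a sketch of how these keys might be used with an external module,
    the following sets and reads back two "media:*" values. The element
    names "media:title" and "media:content" and the URLs are chosen for
    illustration only.

```perl
use strict;
use warnings;
use XML::FeedPP;

my $feed = XML::FeedPP::RSS->new();
$feed->xmlns( 'xmlns:media' => 'http://search.yahoo.com/mrss' );

my $item = $feed->add_item( 'http://www.example.com/1.html' );

# "module:name" addresses a child node, "module:name@attr" its attribute.
$item->set( 'media:title'       => 'A short clip' );
$item->set( 'media:content@url' => 'http://www.example.com/clip.mp4' );

# get() accepts the same key syntax.
print $item->get( 'media:title' ), "\n";
print $item->get( 'media:content@url' ), "\n";
```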

DATE AND TIME FORMATS
    "XML::FeedPP" allows you to describe date/time using any of the three
    following formats:

  $date = "Thu, 23 Feb 2006 14:43:43 +0900";
    This is the HTTP protocol's preferred format and RSS 2.0's native
    format, as defined by RFC 1123.

  $date = "2006-02-23T14:43:43+09:00";
    W3CDTF is the native format of RDF, as defined by ISO 8601.

  $date = 1140705823;
    The last format is the number of seconds since the epoch,
    "1970-01-01T00:00:00Z". This is the native format of Perl's "time()"
    function.
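
    To see how each input format is handled, the sketch below passes all
    three to "pubDate()" on RSS items and prints whatever the getter
    returns. The returned representation may differ between feed types and
    module versions, so no particular output is assumed here.

```perl
use strict;
use warnings;
use XML::FeedPP;

my @dates = (
    'Thu, 23 Feb 2006 14:43:43 +0900',   # RFC 1123
    '2006-02-23T14:43:43+09:00',         # W3CDTF
    1140705823,                          # seconds since the epoch
);

my $feed = XML::FeedPP::RSS->new();
my $n    = 0;
my @read;

for my $date (@dates) {
    my $item = $feed->add_item( 'http://www.example.com/' . ++$n . '.html' );
    $item->pubDate( $date );             # set in one of the three formats
    push @read, $item->pubDate();        # read back through the getter
    print "$date => $read[-1]\n";
}
```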

MODULE DEPENDENCIES
    "XML::FeedPP" requires only XML::TreePP, which is likewise a pure Perl
    implementation. The standard LWP::UserAgent is required to download
    feeds from remote web servers. "Jcode.pm" is required to convert
    Japanese encodings on Perl 5.005 and 5.6.1, but is NOT required on
    Perl 5.8.x and later.

AUTHOR
    Yusuke Kawasaki, http://www.kawa.net/

COPYRIGHT AND LICENSE
    Copyright (c) 2006-2009 Yusuke Kawasaki. All rights reserved. This
    program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

perl v5.12.1                      2009-11-21                    XML::FeedPP(3)