XML::FeedPP(3) User Contributed Perl Documentation XML::FeedPP(3)

XML::FeedPP -- Parse/write/merge/edit RSS/RDF/Atom syndication feeds

Get an RSS file and parse it:

    use XML::FeedPP ();
    my $source = 'http://use.perl.org/index.rss';
    my $feed = XML::FeedPP->new( $source );
    print "Title: ", $feed->title(), "\n";
    print "Date: ", $feed->pubDate(), "\n";
    foreach my $item ( $feed->get_item() ) {
        print "URL: ", $item->link(), "\n";
        print "Title: ", $item->title(), "\n";
    }

Generate an RDF file and save it:

    use XML::FeedPP ();
    my $feed = XML::FeedPP::RDF->new();
    $feed->title( "use Perl" );
    $feed->link( "http://use.perl.org/" );
    $feed->pubDate( "Thu, 23 Feb 2006 14:43:43 +0900" );
    my $item = $feed->add_item( "http://search.cpan.org/~kawasaki/XML-TreePP-0.02" );
    $item->title( "Pure Perl implementation for parsing/writing xml file" );
    $item->pubDate( "2006-02-23T14:43:43+09:00" );
    $feed->to_file( "index.rdf" );

Convert some RSS/RDF files to Atom format:

    use XML::FeedPP ();
    my $feed = XML::FeedPP::Atom::Atom10->new();        # create empty atom file
    $feed->merge( "rss.xml" );                          # load local RSS file
    $feed->merge( "http://www.kawa.net/index.rdf" );    # load remote RDF file
    my $now = time();
    $feed->pubDate( $now );                             # touch date
    my $atom = $feed->to_string();                      # get Atom source code

"XML::FeedPP" is an all-purpose syndication utility that parses and
publishes RSS 2.0, RSS 1.0 (RDF), Atom 0.3 and 1.0 feeds. It allows
you to add new content, merge feeds, and convert among these various
formats. It is a pure Perl implementation and does not require any
other module except for XML::TreePP.

$feed = XML::FeedPP->new( "index.rss" );
This constructor method creates an "XML::FeedPP" feed instance. The
only argument is the local filename. The file must be in one of the
supported feed formats -- RSS, RDF or Atom -- or execution is halted.

$feed = XML::FeedPP->new( "http://use.perl.org/index.rss" );
A URL on a remote web server may also be given as the first argument.
LWP::UserAgent is required to download it.

$feed = XML::FeedPP->new( '<rss version="2.0">...' );
XML source code may also be given as the first argument.

$feed = XML::FeedPP->new( $source, -type => $type );
The "-type" argument specifies the type of $source explicitly. It must
be one of 'file', 'url' or 'string'.
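As an illustration of the "-type" argument, the following sketch parses
a small inline RSS document. The feed content and the example.com URLs
are made up for this example; only the "new()", "title()", "get_item()"
and "link()" methods from this document are used.

```perl
use strict;
use warnings;
use XML::FeedPP ();

# A tiny RSS 2.0 document, invented for this example.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://www.example.com/</link>
    <item>
      <title>First entry</title>
      <link>http://www.example.com/1.html</link>
    </item>
  </channel>
</rss>
XML

# State explicitly that $xml holds XML source, not a filename or URL.
my $feed = XML::FeedPP->new( $xml, -type => 'string' );

my $feed_title = $feed->title();
my @links      = map { $_->link() } $feed->get_item();
print "$feed_title: @links\n";
```

Stating the type explicitly avoids relying on the module guessing
whether the first argument is a filename, a URL or XML source.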

$feed = XML::FeedPP->new( $source, utf8_flag => 1 );
This turns the utf8 flag on for all feed elements. Perl 5.8.1 or later
is required to use this.

Note that any other options for the "XML::TreePP" constructor may also
be passed in this way. See XML::TreePP for details.

$feed = XML::FeedPP::RSS->new( $source );
This constructor method creates an instance for an RSS 2.0 feed. The
first argument is optional, but must be a valid RSS source if
specified. This method returns an empty instance when $source is
undefined.

$feed = XML::FeedPP::RDF->new( $source );
This constructor method creates an instance for an RSS 1.0 (RDF) feed.
The first argument is optional, but must be an RDF source if specified.
This method returns an empty instance when $source is undefined.

$feed = XML::FeedPP::Atom->new( $source );
This constructor method creates an instance for an Atom 0.3/1.0 feed.
The first argument is optional, but must be an Atom source if
specified. This method returns an empty instance when $source is
undefined.

Atom 1.0 feeds have been supported since "XML::FeedPP" version 0.30.
Atom 0.3 is still the default; however, a future version of this module
may create Atom 1.0 by default.

$feed = XML::FeedPP::Atom::Atom03->new();
This explicitly creates an empty Atom 0.3 instance.

$feed = XML::FeedPP::Atom::Atom10->new();
This explicitly creates an empty Atom 1.0 instance.

$feed = XML::FeedPP::RSS->new( link => $link, title => $title, ... );
This creates an RSS instance with "link", "title" and other elements
already set.

$feed->load( $source );
This method loads an RSS/RDF/Atom file, much like the "new()" method
does.

$feed->merge( $source );
This method merges an RSS/RDF/Atom file into the existing $feed
instance. Top-level metadata from the imported feed is incorporated
only if missing from the present feed.
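The metadata rule for "merge()" can be sketched as follows. This is a
minimal example under stated assumptions: both feeds are built in
memory, the titles and example.com/example.org URLs are invented, and
the second feed is handed to "merge()" as XML source produced by
"to_string()".

```perl
use strict;
use warnings;
use XML::FeedPP ();

# The feed we keep: it has a title but no link yet.
my $feed = XML::FeedPP::RSS->new();
$feed->title('Combined');
$feed->add_item('http://www.example.com/a.html');

# The feed we import: it has both a title and a link.
my $other = XML::FeedPP::RSS->new();
$other->title('Other');
$other->link('http://www.example.org/');
$other->add_item('http://www.example.org/b.html');

# merge() accepts XML source, so serialize the second feed first.
$feed->merge( $other->to_string() );

my $title = $feed->title();       # already set, so it should be kept
my $link  = $feed->link();        # was missing, so it should be imported
my $count = $feed->get_item();    # scalar context: number of items
print "$title <$link> has $count items\n";
```

The existing title is expected to survive the merge, while the missing
link is filled in from the imported feed.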

$string = $feed->to_string( $encoding );
This method generates the XML source as a string and returns it. The
output $encoding is optional, and the default encoding is 'UTF-8'. On
Perl 5.8 and later, any encoding supported by the Encode module is
available. On Perl 5.005 and 5.6.1, only the four encodings supported
by the Jcode module are available: 'UTF-8', 'Shift_JIS', 'EUC-JP' and
'ISO-2022-JP'. 'UTF-8' is recommended for overall compatibility.

$string = $feed->to_string( indent => 4 );
This makes the output more human readable by indenting it
appropriately. This does not strictly follow the XML specification but
does look nice.

Note that any other options for the "XML::TreePP" constructor may also
be passed in this way. See XML::TreePP for details.
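A short sketch comparing the default output with indented output. The
feed title and URL are placeholders, and the exact whitespace of the
generated XML is left to XML::TreePP, so the example only prints both
forms for inspection.

```perl
use strict;
use warnings;
use XML::FeedPP ();

my $feed = XML::FeedPP::RSS->new();
$feed->title('Example');
$feed->link('http://www.example.com/');

# Default serialization, then the same feed with a 4-space indent.
my $plain    = $feed->to_string();
my $indented = $feed->to_string( indent => 4 );

print "--- default ---\n$plain\n";
print "--- indent => 4 ---\n$indented\n";
```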

$feed->to_file( $filename, $encoding );
This method generates an XML file. The output $encoding is optional,
and the default is 'UTF-8'.

$item = $feed->add_item( $link );
This method creates a new item/entry and returns its instance. The
mandatory $link argument is the URL of the new item/entry.

$item = $feed->add_item( $srcitem );
This method duplicates an item/entry and adds it to $feed. $srcitem is
an instance of an "XML::FeedPP::*::Item" class, as returned by the
"get_item()" method described below.
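Duplicating an item from one feed into another might look like the
sketch below. Both feeds are RSS instances built in memory, and the
link and title are invented for the example.

```perl
use strict;
use warnings;
use XML::FeedPP ();

# Source feed with a single item.
my $src  = XML::FeedPP::RSS->new();
my $orig = $src->add_item('http://www.example.com/entry.html');
$orig->title('Original entry');

# Passing an item object (rather than a URL) duplicates that item.
my $dst  = XML::FeedPP::RSS->new();
my $copy = $dst->add_item( $src->get_item(0) );

my $copied = $dst->get_item();    # scalar context: number of items
print $copy->link(), " ", $copy->title(), "\n";
```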

$item = $feed->add_item( link => $link, title => $title, ... );
This method creates a new item/entry with "link", "title" and other
elements already set.

$item = $feed->get_item( $index );
This method returns item(s) in a $feed. A valid zero-based array
$index returns the corresponding item in the feed. An invalid $index
yields undef. If $index is undefined in array context, it returns an
array of all items. If $index is undefined in scalar context, it
returns the number of items.
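The different calling conventions of "get_item()" can be sketched as
follows. The three item URLs are placeholders.

```perl
use strict;
use warnings;
use XML::FeedPP ();

my $feed = XML::FeedPP::RSS->new();
$feed->add_item("http://www.example.com/$_.html") for 1 .. 3;

my @all   = $feed->get_item();      # array context: all items
my $count = $feed->get_item();      # scalar context: number of items
my $first = $feed->get_item(0);     # zero-based index: a single item
my $none  = $feed->get_item(99);    # out of range: undef

print "count: $count\n";
print "first: ", $first->link(), "\n";
print "index 99 is ", ( defined $none ? "defined" : "undef" ), "\n";
```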

@items = $feed->match_item( link => qr/.../, title => qr/.../, ... );
This method finds item(s) which match all of the regular expressions
given. In array context it returns an array of all matched items. In
scalar context it returns the first matched item.
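A minimal sketch of "match_item()" in both contexts. The links and
titles are invented for this example.

```perl
use strict;
use warnings;
use XML::FeedPP ();

my $feed = XML::FeedPP::RSS->new();
my %entries = (
    'http://www.example.com/perl.html' => 'Perl news',
    'http://www.example.com/misc.html' => 'Other news',
    'http://www.example.org/perl.html' => 'More Perl',
);
for my $link ( sort keys %entries ) {
    my $item = $feed->add_item($link);
    $item->title( $entries{$link} );
}

# Array context: every item whose title matches.
my @perl = $feed->match_item( title => qr/Perl/ );

# Scalar context: the first item matching BOTH conditions.
my $hit = $feed->match_item( link => qr/example\.com/, title => qr/Perl/ );

print scalar(@perl), " items mention Perl\n";
print "first example.com match: ", $hit->link(), "\n";
```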

$feed->remove_item( $index or $link );
This method removes an item/entry specified by zero-based array index
or link URL.

$feed->clear_item();
This method removes all items/entries from the $feed.

$feed->sort_item();
This method sorts the order of items in $feed by "pubDate".

$feed->uniq_item();
This method reduces the list of items so that it contains no
duplicates. RDF and Atom have a guid attribute to define uniqueness;
for RSS the link is used.

$feed->normalize();
This method calls both the "sort_item()" and "uniq_item()" methods.

$feed->limit_item( $num );
This method removes items in excess of the specified numeric limit.
Items at the end of the list are removed. When preceded by
"sort_item()" or "normalize()", this deletes more recent items.
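The following sketch combines "normalize()" and "limit_item()". The
URLs and dates are placeholders, and one link is deliberately added
twice. The example prints whichever items remain rather than assuming
a particular order.

```perl
use strict;
use warnings;
use XML::FeedPP ();

my $feed = XML::FeedPP::RSS->new();

# Four items, two of which share the same link.
for my $n ( 1, 2, 2, 3 ) {
    my $item = $feed->add_item("http://www.example.com/$n.html");
    $item->pubDate("2006-02-0${n}T00:00:00Z");
}

$feed->normalize();                # sort_item() plus uniq_item()
my $unique = $feed->get_item();    # scalar context: number of items

$feed->limit_item(2);              # keep at most two items
my $kept = $feed->get_item();

print "unique: $unique, kept: $kept\n";
print $_->link(), "\n" for $feed->get_item();
```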

$feed->xmlns( "xmlns:media" => "http://search.yahoo.com/mrss/" );
Adds an XML namespace at the document root of the feed.

$url = $feed->xmlns( "xmlns:media" );
Returns the URL of the specified XML namespace.

@list = $feed->xmlns();
Returns the list of all XML namespaces used in $feed.
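The three forms of "xmlns()" can be exercised together as in this
sketch, which uses the Media RSS namespace that also appears elsewhere
in this document.

```perl
use strict;
use warnings;
use XML::FeedPP ();

my $feed = XML::FeedPP::RSS->new();

# Two arguments: register a namespace on the document root.
$feed->xmlns( 'xmlns:media' => 'http://search.yahoo.com/mrss/' );

# One argument: look up the URL of a namespace.
my $url = $feed->xmlns('xmlns:media');

# No arguments: list every namespace declared on the feed.
my @list = $feed->xmlns();

print "media namespace: $url\n";
print "declared: @list\n";
```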

$feed->title( $text );
This method sets/gets the feed's "title" element, returning its current
value when $text is undefined.

$feed->description( $html );
This method sets/gets the feed's "description" element in plain text or
HTML, returning its current value when $html is undefined. It is
mapped to the "content" element for Atom 0.3/1.0.

$feed->pubDate( $date );
This method sets/gets the feed's "pubDate" element for RSS, returning
its current value when $date is undefined. It is mapped to the
"dc:date" element for RDF, "modified" for Atom 0.3, and "updated" for
Atom 1.0. See also the "DATE AND TIME FORMATS" section below.

$feed->copyright( $text );
This method sets/gets the feed's "copyright" element for RSS, returning
its current value when $text is undefined. It is mapped to the
"dc:rights" element for RDF, "copyright" for Atom 0.3, and "rights" for
Atom 1.0.

$feed->link( $url );
This method sets/gets the URL of the web site as the feed's "link"
element, returning its current value when $url is undefined.

$feed->language( $lang );
This method sets/gets the feed's "language" element for RSS, returning
its current value when $lang is undefined. It is mapped to the
"dc:language" element for RDF, and to the "xml:lang" attribute of the
"feed" element for Atom 0.3/1.0.

$feed->image( $url, $title, $link, $description, $width, $height );
This method sets/gets the feed's "image" element and its child nodes,
returning a list of current values when any arguments are undefined.

$item->title( $text );
This method sets/gets the item's "title" element, returning its current
value when $text is undefined.

$item->description( $html );
This method sets/gets the item's "description" element in HTML or plain
text, returning its current value when $html is undefined. It is
mapped to the "content" element for Atom 0.3/1.0.

$item->pubDate( $date );
This method sets/gets the item's "pubDate" element, returning its
current value when $date is undefined. It is mapped to the "dc:date"
element for RDF, "modified" for Atom 0.3, and "updated" for Atom 1.0.
See also the "DATE AND TIME FORMATS" section below.

$item->category( $text );
This method sets/gets the item's "category" element, returning its
current value when $text is undefined. It is mapped to the "dc:subject"
element for RDF, and ignored for Atom 0.3.

$item->author( $name );
This method sets/gets the item's "author" element, returning its
current value when $name is undefined. It is mapped to the "dc:creator"
element for RDF, and "author" for Atom 0.3/1.0.

$item->guid( $guid, isPermaLink => $bool );
This method sets/gets the item's "guid" element, returning its current
value when $guid is undefined.

It is mapped to the "id" element for Atom, and ignored for RDF. In the
case of RSS, it will return a HASH. The "isPermaLink" option is
supported by RSS only, and is optional.

$item->set( $key => $value, ... );
This method sets customized node values or attributes. See also the
"ACCESSOR AND MUTATORS" section below.

$value = $item->get( $key );
This method returns the node value or attribute. See also the "ACCESSOR
AND MUTATORS" section below.

$link = $item->link();
This method returns the item's "link" element.

This module understands only a subset of the "rdf:*" and "dc:*" modules
and the default namespaces of RSS/RDF/Atom by itself. There are NO
native methods for any other external modules, such as "media:*".
However, the "set()" and "get()" methods are available to set/get the
value of any element or attribute for these modules.

$item->set( "module:name" => $value );
This sets the value of the child node:

    <item><module:name>$value</module:name>...</item>

$item->set( "module:name@attr" => $value );
This sets the value of the child node's attribute:

    <item><module:name attr="$value" />...</item>

$item->set( "@attr" => $value );
This sets the value of the item's attribute:

    <item attr="$value">...</item>

$item->set( "hoge/pomu@hare" => $value );
This sets the value of the child node's child node's attribute:

    <item><hoge><pomu hare="$value" /></hoge>...</item>
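A round trip through "set()" and "get()" might look like the sketch
below. The keys reuse the patterns shown above; the values, the
"dc:subject" element and the example.com URLs are chosen only for
illustration.

```perl
use strict;
use warnings;
use XML::FeedPP ();

my $feed = XML::FeedPP::RSS->new();
my $item = $feed->add_item('http://www.example.com/index.html');

# Child node value, child node attribute, and nested child attribute.
$item->set( 'dc:subject'        => 'perl' );
$item->set( 'media:content@url' => 'http://www.example.com/image.jpg' );
$item->set( 'hoge/pomu@hare'    => 'fuga' );

# Read the same keys back with get().
my $subject = $item->get('dc:subject');
my $url     = $item->get('media:content@url');
my $nested  = $item->get('hoge/pomu@hare');

print "subject: $subject\n";
print "media url: $url\n";
print "nested attribute: $nested\n";
```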

"XML::FeedPP" allows you to describe date/time using any of the
following three formats:

$date = "Thu, 23 Feb 2006 14:43:43 +0900";
This is the HTTP protocol's preferred format and RSS 2.0's native
format, as defined by RFC 1123.

$date = "2006-02-23T14:43:43+09:00";
W3CDTF is the native format of RDF, as defined by ISO 8601.

$date = 1140705823;
The last format is the number of seconds since the epoch,
"1970-01-01T00:00:00Z". This is the native format of Perl's "time()"
function.
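As a sketch, each of the three formats can be passed to "pubDate()" in
the same way. The item URLs are placeholders. The example prints
whatever "pubDate()" returns for each input, without assuming which
output format the getter uses.

```perl
use strict;
use warnings;
use XML::FeedPP ();

my $feed  = XML::FeedPP::RSS->new();
my @dates = (
    "Thu, 23 Feb 2006 14:43:43 +0900",    # RFC 1123
    "2006-02-23T14:43:43+09:00",          # W3CDTF
    1140705823,                           # seconds since the epoch
);

my @seen;
for my $i ( 0 .. $#dates ) {
    my $item = $feed->add_item("http://www.example.com/entry$i.html");
    $item->pubDate( $dates[$i] );         # set using one of the formats
    push @seen, $item->pubDate();         # read the stored date back
    print "$dates[$i] => $seen[-1]\n";
}
```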

To publish Media RSS, add the "media" namespace and then use the
"set()" setter method to manipulate the "media:content" element, etc.

    my $feed = XML::FeedPP::RSS->new();
    $feed->xmlns('xmlns:media' => 'http://search.yahoo.com/mrss/');
    my $item = $feed->add_item('http://www.example.com/index.html');
    $item->set('media:content@url' => 'http://www.example.com/image.jpg');
    $item->set('media:content@type' => 'image/jpeg');
    $item->set('media:content@width' => 640);
    $item->set('media:content@height' => 480);

"XML::FeedPP" requires only XML::TreePP, which is likewise a pure Perl
implementation. The standard LWP::UserAgent is required to download
feeds from remote web servers. "Jcode.pm" is required to convert
Japanese encodings on Perl 5.005 and 5.6.1, but is NOT required on Perl
5.8.x and later.

Yusuke Kawasaki, http://www.kawa.net/

The following copyright notice applies to all the files provided in
this distribution, including binary files, unless explicitly noted
otherwise.

Copyright 2006-2011 Yusuke Kawasaki

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

perl v5.32.0 2020-07-28 XML::FeedPP(3)