mojo-feed's Introduction

NAME

Mojo::Feed - Mojo::DOM-based parsing of RSS & Atom feeds

SYNOPSIS

use Mojo::Base -strict;    # strict, warnings and features such as say
use Mojo::Date;
use Mojo::Feed;
use Mojo::File qw(path);

my $feed = Mojo::Feed->new->parse(file => path("atom.xml"));
print $feed->title, "\n",
  $feed->items->map('title')->join("\n");

$feed = Mojo::Feed->new( body => $string );
$feed = Mojo::Feed->new( url => $rss_url );

$feed = Mojo::Feed->new(
  url => "https://github.com/dotandimet/Mojo-Feed/commits/master.atom");
say $feed->title;
$feed->items->each(
  sub { say $_->title, q{ }, Mojo::Date->new($_->published); });

DESCRIPTION

Mojo::Feed is an object-oriented module for identifying, fetching and parsing RSS and Atom feeds. It relies on Mojo::DOM for XML/HTML parsing. Date parsing is done with HTTP::Date.

Mojo::Feed represents the parsed RSS/Atom feed; you can construct it by setting an XML string as the body attribute, by setting the file or url attributes to a Mojo::File or Mojo::URL respectively, or by using a Mojo::Feed::Reader object.

ATTRIBUTES

Mojo::Feed implements the following attributes.

body

The original decoded string of the feed.

dom

The parsed feed as a Mojo::DOM object.

source

The source of the feed; either a Mojo::File or Mojo::URL object, or undef if the feed source was a string.

title

Returns the feed's title.

description

Description of the feed, filled from channel description (RSS), subtitle (Atom 1.0) or tagline (Atom 0.3)

link

Web page URL associated with the feed

items

Mojo::Collection of Mojo::Feed::Item objects representing feed news items

entries

Alias name for items.

subtitle

Optional feed description

author

Name from author, dc:creator or webMaster field

published

Time in epoch seconds (may be filled with pubDate, dc:date, created, issued, updated or modified)

url

A Mojo::URL object from which to load the file. If set, it will set source. The url attribute may change when the feed is loaded if the user agent receives a redirect.

file

A Mojo::File object from which to read the file. If set, it will set source.

is_valid

True if the top-level element of the DOM is a valid RSS (0.9x, 1.0, 2.0) or Atom tag. Otherwise, false.

feed_type

Detect type of feed - returns one of "RSS 1.0", "RSS 2.0", "Atom 0.3", "Atom 1.0" or "unknown"
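
A minimal sketch of checking a document before reading it, assuming the body constructor shown in the synopsis. The inline RSS document is made up for illustration.

```perl
use Mojo::Base -strict;
use Mojo::Feed;

# Made-up RSS 2.0 document, used only for illustration
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>An example channel</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
    </item>
  </channel>
</rss>
XML

my $feed = Mojo::Feed->new(body => $xml);

if ($feed->is_valid) {
  # Only read feed fields once the top-level element looks like RSS or Atom
  say 'Type:  ', $feed->feed_type;
  say 'Title: ', $feed->title;
  say 'Items: ', $feed->items->size;
}
else {
  warn "Not a recognised RSS or Atom document\n";
}
```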

METHODS

Mojo::Feed inherits all methods from Mojo::Base and implements the following new ones.

new

my $feed = Mojo::Feed->new;
my $feed = Mojo::Feed->new( body => $string);

Construct a new Mojo::Feed object.

to_hash

my $hash = $feed->to_hash;
print $hash->{title};

Return a hash reference representing the feed.

to_string

Return an XML-serialized text of the feed's Mojo::DOM node. Note that this can be different from the original XML text in the feed.

is_feed_content_type

Accepts a MIME type string as an argument; returns true if it is one of the accepted MIME types for RSS/Atom feeds, undef otherwise.

find_feed_links

Accepts a Mojo::Message::Response returned from an HTML page, uses its dom() method to find either LINK elements in the HEAD or links (A elements) that link to a possible RSS/Atom feed.
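
The general idea of feed auto-discovery can be sketched with Mojo::DOM alone. This is not the module's implementation: the helper name discover_feed_links and the short list of MIME types are assumptions made for the example, and it only looks at LINK elements in the HEAD.

```perl
use Mojo::Base -strict;
use Mojo::DOM;
use Mojo::URL;

# Assumed subset of feed MIME types; the module's own list may differ
my %FEED_TYPE = map { $_ => 1 } qw(application/rss+xml application/atom+xml);

# Hypothetical helper: collect absolute feed URLs advertised in an HTML page
sub discover_feed_links {
  my ($html, $base_url) = @_;
  my $base = Mojo::URL->new($base_url);
  my @feeds;
  Mojo::DOM->new($html)->find('head link[rel~="alternate"][href]')->each(sub {
    my $link = shift;
    my $type = lc($link->attr('type') // '');
    return unless $FEED_TYPE{$type};    # skip stylesheets, icons and so on
    push @feeds, Mojo::URL->new($link->attr('href'))->to_abs($base)->to_string;
  });
  return @feeds;
}

my $html = <<'HTML';
<html><head>
  <title>Example</title>
  <link rel="alternate" type="application/atom+xml" href="/atom.xml">
  <link rel="stylesheet" href="/style.css">
</head><body></body></html>
HTML

say for discover_feed_links($html, 'http://example.com/blog/');
```

Relative href values are resolved against the page URL, since feed links in HTML are often site-relative.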

CREDITS

Dotan Dimet

Mario Domgoergen

Some tests adapted from Feed::Find and XML::Feed; feed auto-discovery adapted from Feed::Find.

COPYRIGHT AND LICENSE

This software is Copyright (c) Dotan Dimet.

This library is free software; you can redistribute it and/or modify it under the terms of the Artistic License version 2.0.

Test data (web pages, feeds and excerpts) included in this package is intended for testing purposes only, and is not meant in any way to infringe on the rights of the respective authors.

AUTHOR

Dotan Dimet

mojo-feed's Issues

Extending Mojo::Feed

Mojo::Feed and Mojo::Feed::Item both expose a dom object, so it should be simple to extract more specialized fields from these objects. It would be nice to allow a simple extension mechanism for this.

Both Mojo::Feed and Mojo::Feed::Item derive from Mojo::Base, so we can use roles to extend their functionality. I added a simple role example, with a method that returns the type of a feed:
https://github.com/dotandimet/Mojo-Feed/blob/master/examples/extending.pl
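
As a rough illustration of the role approach (separate from the linked example), the sketch below adds a hypothetical language method that reads the RSS language element through dom. The role name and method are assumptions, and Mojo::Base roles require Role::Tiny to be installed.

```perl
package Mojo::Feed::Role::Language;
use Mojo::Base -role;

# Hypothetical accessor: read the RSS <language> element through the dom attribute
sub language {
  my $self = shift;
  my $node = $self->dom->at('channel > language');
  return $node ? $node->text : undef;
}

package main;
use Mojo::Base -strict;
use Mojo::Feed;

# Made-up feed, used only for illustration
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>An example channel</description>
    <language>en</language>
  </channel>
</rss>
XML

# '+Language' expands to Mojo::Feed::Role::Language
my $feed = Mojo::Feed->new(body => $xml)->with_roles('+Language');
say $feed->language // 'no language declared';
```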

However, there are complications. To extend Mojo::Feed::Item, you need to override items in Mojo::Feed; to smoothly extend Mojo::Feed, you need to modify the internals of Mojo::Feed::Reader.

Alternatively, we could make all these classes injectable (i.e., Mojo::Feed has item_class => 'Mojo::Feed::Item', etc.).
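
The injectable-class idea could look roughly like this. Every class here (My::Feed, My::Item, My::Item::Podcast) and the raw_items attribute are hypothetical stand-ins written for the sketch; none of them exist in Mojo::Feed.

```perl
use Mojo::Base -strict;

# All class names below are hypothetical stand-ins, not part of Mojo::Feed
package My::Item {
  use Mojo::Base -base;
  has 'title';
}

package My::Item::Podcast {
  use Mojo::Base 'My::Item';
  sub label { 'Podcast: ' . shift->title }
}

package My::Feed {
  use Mojo::Base -base;
  use Mojo::Collection;

  has item_class => 'My::Item';    # injectable item class
  has raw_items  => sub { [] };    # plain hashes standing in for parsed entries

  # Build item objects from whichever class was injected
  sub items {
    my $self  = shift;
    my $class = $self->item_class;
    return Mojo::Collection->new(map { $class->new(%$_) } @{$self->raw_items});
  }
}

my $feed = My::Feed->new(
  item_class => 'My::Item::Podcast',
  raw_items  => [{title => 'Episode 1'}, {title => 'Episode 2'}],
);
say $_->label for @{$feed->items};
```

With this shape, a caller swaps the item class at construction time instead of overriding items in a subclass.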

Tests fail with new Mojolicious (>= 8.03)

The t/24-plugin.t test fails:

    #   Failed test 'undef isa 'HASH''
    #   at t/24-plugin.t line 46.
    #     undef isn't defined
    
    #   Failed test at t/24-plugin.t line 47.
    #          got: undef
    #     expected: 'First Weblog'
    
    #   Failed test 'undef isa 'HASH''
    #   at t/24-plugin.t line 52.
    #     undef isn't defined
    
    #   Failed test at t/24-plugin.t line 53.
    #          got: undef
    #     expected: 'First Weblog'
    
    #   Failed test 'undef isa 'HASH''
    #   at t/24-plugin.t line 57.
    #     undef isn't defined
    
    #   Failed test 'string ref from body'
    #   at t/24-plugin.t line 58.
    #          got: undef
    #     expected: 'First Weblog'
Mojo::Transaction::success is DEPRECATED in favor of Mojo::Transaction::result and Mojo::Transaction::error at /home/cpansand/.cpan/build/2018110316/Mojo-Feed-0.16-mhgEZe/blib/lib/Mojo/Feed/Reader.pm line 73.
Error getting feed from url http://127.0.0.1:53713/atom.xml: Not Found at /home/cpansand/.cpan/build/2018110316/Mojo-Feed-0.16-mhgEZe/blib/lib/Mojolicious/Plugin/FeedReader.pm line 62.
    # Child (parse) exited without calling finalize()

#   Failed test 'parse'
#   at /usr/perl5.20.1Dp/lib/site_perl/5.20.1/Test/Builder.pm line 279.
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 255 just after 1.
t/24-plugin.t ...... 
Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 1/1 subtests 

Statistical analysis suggests that this happens with newer Mojolicious versions (8.03 and later):

****************************************************************
Regression 'mod:Mojolicious'
****************************************************************
Name           	       Theta	      StdErr	 T-stat
[0='const']    	      1.0000	      0.0000	9097471055264232.00
[1='eq_7.57']  	      0.0000	      0.0000	   3.57
[2='eq_7.58']  	     -0.0000	      0.0000	  -1.24
[3='eq_7.62']  	      0.0000	      0.0000	   2.06
[4='eq_7.64']  	      0.0000	      0.0000	   0.82
[5='eq_7.65']  	      0.0000	      0.0000	   1.65
[6='eq_7.68']  	      0.0000	      0.0000	   4.29
[7='eq_7.70']  	      0.0000	      0.0000	   4.96
[8='eq_7.71']  	     -0.0000	      0.0000	  -0.49
[9='eq_7.72']  	      0.0000	      0.0000	   3.91
[10='eq_7.75'] 	      0.0000	      0.0000	   4.37
[11='eq_7.81'] 	      0.0000	      0.0000	   1.75
[12='eq_7.83'] 	      0.0000	      0.0000	   0.00
[13='eq_7.84'] 	      0.0000	      0.0000	   4.37
[14='eq_7.85'] 	     -0.0000	      0.0000	  -1.65
[15='eq_7.88'] 	      0.0000	      0.0000	   0.00
[16='eq_8.0']  	     -0.0000	      0.0000	  -2.86
[17='eq_8.03'] 	     -1.0000	      0.0000	-7878641044052450.00
[18='eq_8.04'] 	     -1.0000	      0.0000	-8674098307939065.00
[19='eq_8.05'] 	     -1.0000	      0.0000	-8973690171151438.00

R^2= 1.000, N= 139, K= 20
****************************************************************

Also I see a deprecation warning:

Mojo::Transaction::success is DEPRECATED in favor of Mojo::Transaction::result and Mojo::Transaction::error at /home/cpansand/.cpan/build/2018110316/Mojo-Feed-0.16-mhgEZe/blib/lib/Mojo/Feed/Reader.pm line 73.
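
One way to avoid the deprecated call is to test error on the transaction instead of success, as the warning suggests. The sketch below is not the code in Reader.pm; fetch_body is a hypothetical helper, and the error text only mirrors the message seen in the test output above.

```perl
use Mojo::Base -strict;
use Mojo::UserAgent;

# Hypothetical helper: return the decoded body, or die with the error message
sub fetch_body {
  my ($ua, $url) = @_;
  my $tx = $ua->get($url);

  # Old style: if (my $res = $tx->success) { ... } else { my $err = $tx->error; ... }
  # New style: check error first, then read the response
  if (my $err = $tx->error) {
    die "Error getting feed from url $url: $err->{message}\n";
  }
  return $tx->res->text;
}

# Only touches the network when a URL is passed on the command line
if (my $url = shift @ARGV) {
  my $body = fetch_body(Mojo::UserAgent->new->max_redirects(5), $url);
  say length $body;
}
```

Calling result on the transaction is the other option named in the warning; it returns the response and throws an exception on connection errors, so it is usually wrapped in eval.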

Can't locate object method "delay" via package "Mojo::IOLoop"

The test suite started to fail:

Can't locate object method "delay" via package "Mojo::IOLoop" at t/24-plugin.t line 77.
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 255 just after 1.
t/24-plugin.t ............... 
Dubious, test returned 255 (wstat 65280, 0xff00)
All 1 subtests passed 

Statistical analysis suggests that this problem is caused by Mojolicious 9.x:

****************************************************************
Regression 'mod:Mojolicious'
****************************************************************
Name           	       Theta	      StdErr	 T-stat
[0='const']    	      1.0000	      0.0581	  17.20
[1='eq_8.12']  	      0.0000	      0.0712	   0.00
[2='eq_8.17']  	      0.0000	      0.0822	   0.00
[3='eq_8.26']  	      0.0000	      0.0822	   0.00
[4='eq_8.40']  	     -1.0000	      0.0822	 -12.17
[5='eq_8.41']  	     -0.5000	      0.0712	  -7.02
[6='eq_8.43']  	      0.0000	      0.0822	   0.00
[7='eq_8.51']  	      0.0000	      0.0621	   0.00
[8='eq_8.52']  	      0.0000	      0.0650	   0.00
[9='eq_8.55']  	      0.0000	      0.0822	   0.00
[10='eq_8.56'] 	      0.0000	      0.0602	   0.00
[11='eq_8.57'] 	      0.0000	      0.0587	   0.00
[12='eq_9.0']  	     -1.0000	      0.0650	 -15.39
[13='eq_9.01'] 	     -1.0000	      0.0593	 -16.86
[14='eq_9.02'] 	     -1.0000	      0.0587	 -17.03

R^2= 0.988, N= 163, K= 15
****************************************************************
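
The error suggests that Mojo::IOLoop no longer provides delay in that release. Assuming so, sequential asynchronous steps in a test can be expressed with Mojo::Promise instead. The sketch below is not taken from t/24-plugin.t; async_step is a hypothetical stand-in for a non-blocking operation.

```perl
use Mojo::Base -strict;
use Mojo::IOLoop;
use Mojo::Promise;

# Hypothetical stand-in for an asynchronous step, resolved on the next loop tick
sub async_step {
  my $value   = shift;
  my $promise = Mojo::Promise->new;
  Mojo::IOLoop->next_tick(sub { $promise->resolve($value) });
  return $promise;
}

my @results;
async_step('first')
  ->then(sub { push @results, shift; return async_step('second') })
  ->then(sub { push @results, shift })
  ->catch(sub { warn "Step failed: $_[0]" })
  ->wait;    # run the event loop until the chain settles

say join ', ', @results;
```

Returning a promise from a then callback makes the next step wait for it, which covers the common use of chained delay steps.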

Support full Atom (and RSS) spec

@mdom said:

Why not just add type to Mojo::Feed? I would even go so far as to add all attributes mentioned in the spec. That shouldn't be too much work and has a negligible runtime cost. And even if we worry about that, we could just use a specialised import list to generate a minimal set of accessors or an extended list. I would love to use atoms expired to save updating feeds if they run their course. :)

I say, +1.

Document enclosures

Write documentation for Mojo::Feed::Item::Enclosure. Feel free to assign that to me... :)
