Parsing dates in an RSS 2.0 XML feed (RFC 822)


If you parse an RSS 2.0 feed by hand and extract dates from it, you first need to know that the RSS 2.0 specification bases its date format on RFC 822. See this link on the date formats used by the various RSS/Atom feeds.

An RFC 822 date (with a four-digit year) looks like this: Thu, 01 Jan 2004 19:48:21 GMT.

The solution is to use the datetime class from Python's standard datetime module, as in the code snippet below.

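[sourcecode language="python"]
import datetime

# Example pubDate value from an RSS 2.0 feed
date_str = "Thu, 17 Dec 2009 08:00:00 GMT"
# %Z matches zone names such as GMT or UTC; the result is a naive datetime
dt = datetime.datetime.strptime(date_str, "%a, %d %b %Y %H:%M:%S %Z")
print(dt)
[/sourcecode]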

See the Python documentation for the full list of date format directives. The same approach works for dates in RSS 1.0 and Atom feeds: build the format string from the directives that match the date format the feed uses.
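As a rough illustration of that last point, here is a minimal sketch for Atom-style dates. It assumes the feed writes timestamps in the RFC 3339 form (for example 2009-12-17T08:00:00Z) without fractional seconds, and that you are on Python 3.7 or later, where %z accepts both "Z" and offsets such as "+05:30". The helper name parse_atom_date is hypothetical, made up for this example.

```python
from datetime import datetime, timezone


def parse_atom_date(text):
    """Parse an RFC 3339 timestamp (no fractional seconds) into a UTC datetime.

    Hypothetical helper for illustration; real feeds may need extra handling.
    """
    # %z (Python 3.7+) accepts "Z" as well as offsets like "+05:30"
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")
    # Normalize to UTC so dates from different feeds compare consistently
    return parsed.astimezone(timezone.utc)


print(parse_atom_date("2009-12-17T08:00:00Z"))
print(parse_atom_date("2009-12-17T13:30:00+05:30"))
```

Both calls should describe the same instant, since 13:30 at +05:30 is 08:00 UTC. If a feed includes fractional seconds or some other variation, the format string would need adjusting to match.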