A brief overview...
For those who are diving straight in, let me explain the old code and how I will update it.
I have a script, feed.py, that I have mapped using app.yaml. A cron script (configured by cron.yaml) simply connects to my Twitter account and converts my status updates into an RSS feed. It then stores the RSS feed in a Google Datastore object.
The feed script takes the Datastore object and displays it. We use another script (entity.py) to define the Datastore object.
We will now configure the system so that it can convert multiple Twitter accounts into RSS feeds, one per account. To display a particular RSS feed, we will pass the account name as a parameter of an HTTP GET request.
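To make the GET request concrete, here is a small sketch of how the feed URL for an account could be built and read back. It only uses the standard library; the helper names `feed_url` and `account_from_url` are made up for this illustration, and the base URL is assumed to be the local development address.

```python
# Works on Python 2.7 and Python 3: use whichever urllib layout exists
try:
    from urllib.parse import urlencode, urlparse, parse_qs
except ImportError:
    from urllib import urlencode
    from urlparse import urlparse, parse_qs

# Hypothetical helper: builds the URL that asks for one account's feed
def feed_url(base, account):
    return base + "?" + urlencode({"account": account})

# Hypothetical helper: reads the account name back out of such a URL
def account_from_url(url):
    values = parse_qs(urlparse(url).query).get("account", [""])
    return values[0]

url = feed_url("http://localhost:8080/", "founding")
print(url)
print(account_from_url(url))
```

On the server side, webapp2 does the parsing for us: the handler only has to ask the request for the 'account' value.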
The main application
We will create a file called feed.py. This script will be our controller; it simply receives the HTTP requests and maps them to handler classes. These classes then call other functions to perform the required tasks.
# The webapp2 framework
import webapp2
# Our datastore interface
from google.appengine.ext import db
# Our entity library (must be imported so the datastore knows the Rss model)
import entity
# Our XML2RSS library
import XML2RSS

# Refreshes the stored RSS feed for each Twitter account
class Cron(webapp2.RequestHandler):
    # Respond to an HTTP GET request
    def get(self):
        # A try-except statement
        try:
            XML2RSS.getTweets("almightyolive")
            XML2RSS.getTweets("founding")
            XML2RSS.getTweets("ABCNews24")
            XML2RSS.getTweets("SBSNews")
        # Our exception code
        except (TypeError, ValueError):
            self.response.out.write("<html><body><p>Invalid inputs</p></body></html>")

# Looks up the stored RSS feed for an account and displays it
class MainPage(webapp2.RequestHandler):
    # Respond to an HTTP GET request
    def get(self):
        # A try-except statement
        try:
            # The account name comes from the query string, e.g. /?account=founding
            account = self.request.get('account')
            # Build the datastore key from the account name and fetch the entity
            feed_k = db.Key.from_path('Rss', account)
            feed = db.get(feed_k)
            # Outputs the RSS
            self.response.out.write(feed.content)
        # Our exception code
        except (TypeError, ValueError):
            self.response.out.write("<html><body><p>Invalid inputs (Type Error)</p></body></html>")
        # Anything else, including an account that has no stored feed yet
        except Exception:
            self.response.out.write("<html><body><p>Unspecified Error</p></body></html>")

# Create our application instance that maps the root to our
# MainPage handler and /cron to our Cron handler
app = webapp2.WSGIApplication([('/', MainPage), ('/cron', Cron)], debug=True)
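The MainPage handler assumes that an account was supplied and that a feed has already been stored for it. As a rough sketch of how those cases could be told apart, the function below uses a plain dict as a stand-in for the Datastore. The name `lookup_feed` and its status codes are my own illustration, not part of webapp2 or the App Engine API.

```python
# Hypothetical helper: a dict stands in for the Datastore, mapping
# account names to stored RSS strings
def lookup_feed(store, account):
    # No account parameter was supplied in the request
    if not account:
        return 400, "Missing 'account' parameter"
    content = store.get(account)
    # The cron job has not stored a feed for this account (yet)
    if content is None:
        return 404, "No feed stored for " + account
    return 200, content

store = {"founding": "<rss version='2.0'></rss>"}
print(lookup_feed(store, "founding"))
print(lookup_feed(store, "unknown"))
print(lookup_feed(store, ""))
```

In the real handler the same checks would sit in front of the line that writes feed.content, since db.get returns None for a key that does not exist.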
The XML2RSS script
As you may have noticed, the feed.py script makes reference to an XML2RSS module. This is a separate script that moves the conversion of XML to RSS into easy-to-call functions. Create a new file called XML2RSS.py and add the following:
# The minidom library for XML parsing
from xml.dom.minidom import parseString
# The URL Fetch library
from google.appengine.api import urlfetch
# Our entity library
import entity

# Wraps a word in an HTML link if it is a URL, a mention or a hashtag
def linkify(text):
    # If http is present, link to the URL itself
    if "http" in text:
        text = "<a href='" + text + "'>" + text + "</a>"
    # An @ marks a mention, so link to that user's profile
    elif "@" in text:
        text = "<a href='http://twitter.com/#!/" + text.split("@")[1] + "'>" + text + "</a>"
    # A # marks a hashtag, so link to a search for it
    elif "#" in text:
        text = "<a href='https://twitter.com/#!/search/%23" + text.split("#")[1] + "'>" + text + "</a>"
    return text

# Builds the RSS document for an account from the parsed XML
def outputRSS(xml, account):
    # Get the list of statuses
    statuses = xml.getElementsByTagName("status")
    # Our return string (the channel link now points at the requested account)
    outputString = "<?xml version='1.0'?>\n<rss version='2.0'>\n\t<channel>\n\t\t<title>Twitter: " + account + "</title>\n\t\t"
    outputString += "<link>https://twitter.com/#!/" + account + "</link>\n\t\t<description>The twitter feed for " + account + "</description>"
    # Cycle through the statuses
    for status in statuses:
        # Gets the text, date and ID of the status
        text = status.getElementsByTagName("text")[0].firstChild.data
        date = status.getElementsByTagName("created_at")[0].firstChild.data
        tweet = status.getElementsByTagName("id")[0].firstChild.data
        # Insert links into the text, one word at a time
        words = text.split()
        for i in range(len(words)):
            words[i] = linkify(words[i])
        # Recompile words
        text = " ".join(words)
        # Creates our output; the values are already strings, so no str() call
        # is needed (str() would fail on non-ASCII tweets in Python 2)
        string = "\n\t\t<item>\n\t\t\t<title>" + date + "</title>\n\t\t\t<link>https://twitter.com/" + account + "/status/" + tweet + "</link>\n\t\t\t<description>" + text + "</description>\n\t\t</item>"
        outputString += string
    # Close the channel and the feed
    outputString += "\n\t</channel>\n</rss>"
    return outputString

# Our RSS storage function
def getTweets(account):
    # Grabs the XML
    response = urlfetch.fetch('https://api.twitter.com/1/statuses/user_timeline.xml?screen_name=' + account + '&count=10&trim_user=true')
    # Parses the document
    xml = parseString(response.content)
    # Converts the XML into RSS
    content = outputRSS(xml, account)
    # Our RSS storage entity, keyed by the account name
    rssStore = entity.Rss(key_name=account)
    # Elements of our RSS
    rssStore.feed = account
    rssStore.content = content
    # Stores our RSS feed in the datastore
    rssStore.put()
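One thing to watch for in outputRSS is that tweet text can contain characters such as & or <, and the linkified text contains HTML tags, all of which end up inside the XML. A possible approach, sketched here with the standard library's xml.sax.saxutils.escape, is to escape each value before it goes into the item. `build_item` is a hypothetical helper and is not part of the script above.

```python
from xml.dom.minidom import parseString
from xml.sax.saxutils import escape

# Hypothetical helper: returns one RSS <item>, escaping every value so
# that the surrounding XML stays well-formed
def build_item(date, link, html_text):
    return (
        "<item>"
        "<title>" + escape(date) + "</title>"
        "<link>" + escape(link) + "</link>"
        "<description>" + escape(html_text) + "</description>"
        "</item>"
    )

text = "Tom & Jerry <a href='http://example.com'>http://example.com</a>"
item = build_item("Mon Jan 01", "https://twitter.com/founding/status/1", text)
print(item)

# The escaped item parses cleanly, and the parser hands back the original text
description = parseString(item).getElementsByTagName("description")[0]
print("".join(node.data for node in description.childNodes))
```

Feed readers unescape the description again, so the links still show up as links.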
The pieces to make it all work
If you have been following on from my previous work, then you should already have most of this code. I won't bother explaining it here because it is mostly self-explanatory.
app.yaml:
application: almightynassar
version: 1
runtime: python27
api_version: 1
threadsafe: yes
handlers:
- url: /cron
  script: feed.app
  login: admin
- url: /.*
  script: feed.app
cron.yaml:
cron:
- description: hourly feed refresh
  url: /cron
  schedule: every 1 hours
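Because the cron job calls /cron once an hour, it can help if one failing account does not stop the rest from refreshing. The sketch below loops over the accounts and records failures instead of aborting. `refresh_all` is a hypothetical helper, and the `fetch` argument stands in for XML2RSS.getTweets so that the example runs without App Engine.

```python
# Hypothetical helper: refresh every account, collecting errors per account
def refresh_all(accounts, fetch):
    failures = {}
    for account in accounts:
        try:
            fetch(account)
        except Exception as error:
            # Remember what went wrong and carry on with the next account
            failures[account] = str(error)
    return failures

# A stand-in for XML2RSS.getTweets that fails for one account
def fake_fetch(account):
    if account == "broken":
        raise ValueError("could not parse feed")

print(refresh_all(["almightyolive", "broken", "founding"], fake_fetch))
```

The returned dict could be written to the response or logged, so a bad account is easy to spot.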
entity.py:
# Our datastore interface
from google.appengine.ext import db
# Our RSS entity object
class Rss(db.Model):
    feed = db.StringProperty()
    content = db.TextProperty()
And that's it! You now have a fully functional application that just uses the webapp2 framework!
If you navigate to http://localhost:8080/?account=almightyolive you should now see the RSS feed. You can test that your mapping works by navigating to http://localhost:8080/?account=founding; you should see the Founding Institute Twitter account instead!
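If you would rather check the result from code than in a browser, a sketch along these lines parses an RSS document and lists its items with minidom. The sample string is made up to match the shape that outputRSS produces; in practice you would pass in the body returned from one of the URLs above. `item_links` is a name I picked for this example.

```python
from xml.dom.minidom import parseString

# Hypothetical helper: returns a (title, link) pair for every item in the feed
def item_links(rss_text):
    doc = parseString(rss_text)
    links = []
    for item in doc.getElementsByTagName("item"):
        title = item.getElementsByTagName("title")[0].firstChild.data
        link = item.getElementsByTagName("link")[0].firstChild.data
        links.append((title, link))
    return links

# A made-up feed with a single item
sample = (
    "<?xml version='1.0'?><rss version='2.0'><channel>"
    "<title>Twitter: founding</title>"
    "<item><title>Mon Jan 01</title>"
    "<link>https://twitter.com/founding/status/1</link>"
    "<description>Hello</description></item>"
    "</channel></rss>"
)
print(item_links(sample))
```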