Pitted Knowledge of the Almighty Olive: Handling HTTP GET requests with webapp2 and Google App Engine: Python

This is a continuation of my Python 2.7 and Google App Engine series. This post builds on the code from my previous posts, URL Routing and Cron and Datastore in Google App Engine: Python, which in turn build on my earlier work. If parts of the code are unclear, I highly suggest browsing those earlier posts so you can understand some of the design decisions I have made.

A brief overview...

For those who are diving straight in, let me explain the old code and how I will update it:

I have a script feed.py that I have mapped using app.yaml. A cron script (configured by cron.yaml) simply connects to my Twitter account and converts my status updates into an RSS feed. It then stores the RSS feed into a Google Datastore object.

The feed script takes the Datastore object and displays it. We use another script (entity.py) to define the Datastore object.

We will now configure the system so that it can convert multiple Twitter accounts into RSS feeds. To choose which RSS feed to display, we will pass the account name as a parameter of an HTTP GET request.
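
To make the GET mechanism concrete, here is a minimal sketch, independent of App Engine, of how an account name travels in the query string of a URL. The helper name account_from_url is hypothetical and exists only for this illustration; inside a webapp2 handler the framework does this parsing for you when you call self.request.get('account').

```python
# Python 2.7 keeps these helpers in urlparse; newer Pythons moved them.
try:
    from urlparse import urlparse, parse_qs
except ImportError:
    from urllib.parse import urlparse, parse_qs


def account_from_url(url, default=""):
    """Return the first 'account' query parameter, or default if absent.

    Hypothetical helper for illustration only.
    """
    query = urlparse(url).query
    values = parse_qs(query).get("account", [])
    return values[0] if values else default


print(account_from_url("http://localhost:8080/?account=almightyolive"))
print(repr(account_from_url("http://localhost:8080/")))
```

Returning an empty string when the parameter is missing mirrors the default behaviour of self.request.get, so the handler has to decide what to do when no account is given.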

The main application

We will create a file called feed.py. This script will be our controller; it simply gets the HTTP requests and maps them to certain classes. These classes will then call other functions to perform the required tasks.

# The webapp2 framework
import webapp2

# Our datastore interface
from google.appengine.ext import db

# Our entity library (defines the Rss model that db.get returns)
import entity

# Our XML2RSS library
import XML2RSS

# Fetches the tweets for each account and stores them as RSS
class Cron(webapp2.RequestHandler):
    # Respond to an HTTP GET request
    def get(self):
        # A try-except statement
        try:
            XML2RSS.getTweets("almightyolive")
            XML2RSS.getTweets("founding")
            XML2RSS.getTweets("ABCNews24")
            XML2RSS.getTweets("SBSNews")

        # Our exception code
        except (TypeError, ValueError):
            self.response.out.write("<html><body>Invalid inputs</body></html>")

# Looks up the stored RSS feed for the requested account and outputs it
class MainPage(webapp2.RequestHandler):
    # Respond to an HTTP GET request
    def get(self):
        # A try-except statement
        try:
            # The account name comes from the query string (?account=...)
            account = self.request.get('account')

            # Build the key for the stored feed and fetch it
            feed_k = db.Key.from_path('Rss', account)
            feed = db.get(feed_k)

            # Outputs the RSS, or a message if nothing is stored for this account
            if feed is None:
                self.response.out.write("<html><body>No feed stored for this account</body></html>")
            else:
                self.response.out.write(feed.content)

        # Our exception code
        except (TypeError, ValueError):
            self.response.out.write("<html><body>Invalid inputs (Type Error)</body></html>")
        except Exception:
            self.response.out.write("<html><body>Unspecified Error</body></html>")

# Create our application instance that maps the root to our
# MainPage handler and /cron to our Cron handler
app = webapp2.WSGIApplication([('/', MainPage), ('/cron', Cron)], debug=True)
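
The route list passed to WSGIApplication pairs a path pattern with a handler class. As a rough mental model only (this is a simplified stand-in, not webapp2's actual code), the lookup can be sketched in plain Python. The ROUTES table and the resolve function are hypothetical names, and handler names are stored as strings so the sketch runs without App Engine installed.

```python
import re

# Hypothetical route table mirroring the one in feed.py.
ROUTES = [
    (r"/", "MainPage"),
    (r"/cron", "Cron"),
]


def resolve(path, routes=ROUTES):
    """Return the name of the first handler whose pattern matches the whole path."""
    for pattern, handler_name in routes:
        # Anchor the pattern so "/" does not also match "/cron".
        if re.match("^" + pattern + "$", path):
            return handler_name
    return None


print(resolve("/"))
print(resolve("/cron"))
print(resolve("/missing"))
```

Note that the query string (?account=...) is not part of the path, which is why a single '/' route can serve every account.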

The XML2RSS script

As you may have noticed, the feed.py script refers to an XML2RSS module. This is a separate script that wraps the conversion of XML to RSS in easy-to-call functions. Create a new file called XML2RSS.py and add the following:

# The minidom library for XML parsing
from xml.dom.minidom import parseString

# The URL Fetch library
from google.appengine.api import urlfetch

# Our entity library
import entity

# Detects if a word is a URL, @mention or #hashtag and adds the HTML tags
def linkify(text):
    # If http is present, add the link tag
    if "http" in text:
        text = "<a href='" + text + "'>" + text + "</a>"
    # Link an @mention to the user's profile
    elif "@" in text:
        text = "<a href='http://twitter.com/#!/" + text.split("@")[1] + "'>" + text + "</a>"
    # Link a #hashtag to a search for that tag
    elif "#" in text:
        text = "<a href='https://twitter.com/#!/search/%23" + text.split("#")[1] + "'>" + text + "</a>"

    return text

# Builds the RSS document from the parsed Twitter XML
def outputRSS(xml, account):
    # Get the list of statuses
    statuses = xml.getElementsByTagName("status")

    # Our return string (the channel link now uses the requested account)
    outputString = "<?xml version='1.0'?>\n<rss version='2.0'>\n\t<channel>\n\t\t<title>Twitter: " + account + "</title>\n\t\t"
    outputString += "<link>https://twitter.com/#!/" + account + "</link>\n\t\t<description>The twitter feed for " + account + "</description>"

    # Cycle through the statuses
    for status in statuses:
        # Gets the text, date and ID of the status
        text = status.getElementsByTagName("text")[0].firstChild.data
        date = status.getElementsByTagName("created_at")[0].firstChild.data
        tweet = status.getElementsByTagName("id")[0].firstChild.data

        # Insert links into each word of the text
        words = [linkify(word) for word in text.split()]

        # Recompile words
        text = " ".join(words)

        # Creates our output. The values are left as unicode rather than
        # wrapped in str(), which fails on non-ASCII tweets in Python 2.7
        string = "\n\t\t<item>\n\t\t\t<title>" + date + "</title>\n\t\t\t<link>https://twitter.com/" + account + "/status/" + tweet + "</link>\n\t\t\t<description>" + text + "</description>\n\t\t</item>"
        outputString += string

    # Output string
    outputString += "\n\t</channel>\n</rss>"
    return outputString

# Our RSS storage function
def getTweets(account):
    # Grabs the XML
    url = urlfetch.fetch('https://api.twitter.com/1/statuses/user_timeline.xml?screen_name=' + account + '&count=10&trim_user=true')

    # Parses the document
    xml = parseString(url.content)

    # Converts the XML into RSS
    content = outputRSS(xml, account)

    # Our RSS storage entity, keyed by the account name
    rssStore = entity.Rss(key_name=account)

    # Elements of our RSS
    rssStore.feed = account
    rssStore.content = content

    # Stores our RSS Feed into the datastore
    rssStore.put()
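
If you want to check the parsing step without calling Twitter or running App Engine, the following sketch applies the same minidom calls to a small hand-written document. The SAMPLE string is made up for illustration and only assumes the status, created_at, id and text elements that outputRSS reads; the function names field and extract_statuses are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical sample, shaped like the elements outputRSS reads.
SAMPLE = (
    "<statuses>"
    "<status><created_at>Tue Jun 26 10:00:00 +0000 2012</created_at>"
    "<id>1</id><text>Hello #python</text></status>"
    "<status><created_at>Tue Jun 26 11:00:00 +0000 2012</created_at>"
    "<id>2</id><text>Reading http://example.com</text></status>"
    "</statuses>"
)


def field(node, tag):
    """Return the text inside the first <tag> element found under node."""
    return node.getElementsByTagName(tag)[0].firstChild.data


def extract_statuses(xml_text):
    """Return a list of (date, id, text) tuples, one per <status> element."""
    document = parseString(xml_text)
    return [
        (field(status, "created_at"), field(status, "id"), field(status, "text"))
        for status in document.getElementsByTagName("status")
    ]


for date, tweet_id, text in extract_statuses(SAMPLE):
    print("%s | %s | %s" % (date, tweet_id, text))
```

Keeping the extraction separate from the string building makes it easier to test each half on its own.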

The pieces to make it all work

If you have been following on from my previous work, then you should already have most of this code. I won't bother explaining it here because it is mostly self-explanatory.

app.yaml:

application: almightynassar
version: 1
runtime: python27
api_version: 1
threadsafe: yes

handlers:
- url: /cron
  script: feed.app
  login: admin

- url: /.*
  script: feed.app

cron.yaml:

cron:
- description: hourly Twitter feed refresh
  url: /cron
  schedule: every 1 hours

entity.py:

# Our datastore interface
from google.appengine.ext import db

# Our RSS entity object
class Rss(db.Model):
    feed = db.StringProperty()
    content = db.TextProperty()
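
Because each Rss entity is stored under a key name equal to the account, every hourly run replaces the previous feed for that account instead of adding a new row, and the lookup in feed.py needs nothing more than the account name. The sketch below imitates that behaviour with an ordinary dictionary. FakeStore is a hypothetical stand-in written for this post, not part of the App Engine API.

```python
class FakeStore(object):
    """In-memory stand-in for the datastore (hypothetical, for illustration)."""

    def __init__(self):
        self._rows = {}

    def put(self, kind, key_name, content):
        # Writing with the same (kind, key_name) replaces the earlier value.
        self._rows[(kind, key_name)] = content

    def get(self, kind, key_name):
        # Unknown keys come back as None, like db.get on a missing entity.
        return self._rows.get((kind, key_name))


store = FakeStore()
store.put("Rss", "almightyolive", "<rss>first fetch</rss>")
store.put("Rss", "almightyolive", "<rss>second fetch</rss>")
store.put("Rss", "founding", "<rss>other account</rss>")

print(store.get("Rss", "almightyolive"))
print(store.get("Rss", "nobody"))
```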

And that's it! You now have a fully functional application that just uses the webapp2 framework!

Once the cron job has run at least once (you can trigger it yourself by visiting http://localhost:8080/cron as an admin), navigate to http://localhost:8080/?account=almightyolive and you should see the RSS feed. You can test that your mapping works by navigating to http://localhost:8080/?account=founding; you should see the Founder Institute Twitter account instead!


Jun 26, 2012