Jun 25, 2012

URL Routing through WebApp2 in Google App Engine: Python

This is a continuation of my Python 2.7 and Google App Engine series. This particular blog post builds upon the code given in my previous post, Cron and Datastore in Google App Engine: Python, which in turn builds upon my earlier work. If you don't understand parts of the code, I highly suggest you browse my earlier blog posts so you can understand some of the design decisions I have made.

A brief overview...

For those who are diving straight in, let me explain the old code and how I will update it:

I have two scripts (feed.py and cron.py) that I have mapped using app.yaml. The cron script simply connects to my Twitter account and converts my status updates into an RSS feed. It then stores the RSS feed into a Google Datastore object.

The feed script takes the Datastore object and displays it. Both scripts use a third script (entity.py) to define the Datastore object.
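
To make the conversion step a little more concrete, here is a minimal sketch of how minidom pulls the date, ID and text out of each status. It assumes a hand-written sample in the same shape as the XML that the code later in this post reads (status elements with created_at, id and text children). The sample data and the names child_text and extract_statuses are made up for this illustration, and no App Engine libraries are needed to run it.

```python
from xml.dom.minidom import parseString

# Hypothetical sample data, shaped like the XML the cron script parses
SAMPLE = (
    "<statuses>"
    "<status><created_at>Mon Jun 25 10:00:00 +0000 2012</created_at>"
    "<id>1</id><text>Hello world</text></status>"
    "<status><created_at>Mon Jun 25 11:00:00 +0000 2012</created_at>"
    "<id>2</id><text>Second tweet</text></status>"
    "</statuses>"
)

def child_text(node, tag):
    # Returns the text inside the first <tag> element below node
    return node.getElementsByTagName(tag)[0].firstChild.data

def extract_statuses(xml_string):
    # Returns a list of (date, tweet id, text) tuples, one per status
    document = parseString(xml_string)
    return [
        (child_text(s, "created_at"), child_text(s, "id"), child_text(s, "text"))
        for s in document.getElementsByTagName("status")
    ]

for date, tweet, text in extract_statuses(SAMPLE):
    print(tweet + " | " + date + " | " + text)
```

Each tuple holds the same three values that the RSS items are built from.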

The current set-up is not thread-safe, because two separate scripts handle my incoming requests. The plan is to replace it with a set-up that is thread-safe, by handing both URLs to a single application through the URL routing functionality provided by the webapp2 framework.

Combining the scripts

The first thing we will do is combine both cron.py and feed.py into one script. The following code should be saved to a file called feed.py:

# The webapp2 framework
import webapp2

# Our datastore interface
from google.appengine.ext import db

# The minidom library for XML parsing
from xml.dom.minidom import parseString

# The URL Fetch library
from google.appengine.api import urlfetch

# Our entity library
import entity

# Wraps a single word in an HTML link if it looks like a URL,
# a Twitter @mention or a #hashtag
def linkify(text):
    # If "http" is present, link to the URL itself
    if "http" in text:
        text = "<a href='" + text + "'>" + text + "</a>"
    # If "@" is present, link to the user's Twitter profile
    elif "@" in text:
        text = "<a href='http://twitter.com/#!/" + text.split("@")[1] + "'>" + text + "</a>"
    # If "#" is present, link to a Twitter search for the hashtag
    elif "#" in text:
        text = "<a href='https://twitter.com/#!/search/%23" + text.split("#")[1] + "'>" + text + "</a>"

    return text

# Converts the parsed Twitter XML into an RSS 2.0 string
def outputRSS(xml):
    # Get the list of statuses
    statuses = xml.getElementsByTagName("status")
   
    # Our return string
    outputString = "<?xml version='1.0'?>\n<rss version='2.0'>\n\t<channel>"
    outputString+= "\n\t\t<title>Almightyolive Twitter</title>\n\t\t"
    outputString+= "<link>https://twitter.com/#!/almightyolive</link>\n"
    outputString+= "\t\t<description>The twitter feed for the Almighty "
    outputString+= "Olive</description>"
   
    # Cycle through the statuses
    for status in statuses:
        # Get the text, date and ID of the status
        text = status.getElementsByTagName("text")[0].firstChild.data
        date = status.getElementsByTagName("created_at")[0].firstChild.data
        tweet = status.getElementsByTagName("id")[0].firstChild.data

        # Insert links into the text, one word at a time
        words = [linkify(word) for word in text.split()]

        # Recompile words
        text = " ".join(words)
       
        # Creates our output. The values from minidom are already
        # unicode strings, so we concatenate them directly; calling
        # str() on them fails for tweets with non-ASCII characters
        string = "\n\t\t<item>\n\t\t\t<title>" + date + "</title>\n"
        string += "\t\t\t<link>https://twitter.com/AlmightyOlive/status/" + tweet
        string += "</link>\n\t\t\t<description>" + text + "</description>\n"
        string += "\t\t</item>"
        outputString += string

    # Output string
    outputString += "\n\t</channel>\n</rss>"
    return outputString

# Fetches the Twitter timeline, converts it to RSS and stores it
class Cron(webapp2.RequestHandler):
    # Respond to a HTTP GET request
    def get(self):
        # A try-except statement
        try:
            # Grabs the XML
            result = urlfetch.fetch('https://api.twitter.com/1/statuses/user_timeline.xml?screen_name=almightyolive&count=10&trim_user=true')

            # Parses the document
            xml = parseString(result.content)

            # Converts the parsed XML into our RSS string
            content = outputRSS(xml)

            # Our RSS storage entity
            rssStore = entity.Rss(key_name='almightyolive')

           
            # Elements of our RSS
            rssStore.feed = "almightyolive"
            rssStore.content = content

            # Stores our RSS Feed into the datastore
            rssStore.put()
       
        # Our exception code
        except (TypeError, ValueError):
            self.response.out.write("<html><body><p>Invalid inputs</p></body></html>")

# Reads the stored RSS feed from the datastore and displays it
class MainPage(webapp2.RequestHandler):
    # Respond to a HTTP GET request
    def get(self):
        # A try-except statement
        try:
            # Looks up the entity that the Cron handler stored
            feed_k = db.Key.from_path('Rss', 'almightyolive')
            feed = db.get(feed_k)

            # db.get returns None if the cron job has not run yet
            if feed is None:
                self.response.out.write("<html><body><p>No feed stored yet</p></body></html>")
                return

            # Outputs the RSS
            self.response.out.write(feed.content)

        # Our exception code
        except (TypeError, ValueError):
            self.response.out.write("<html><body><p>Invalid inputs</p></body></html>")

# Create our application instance that maps the root to our
# MainPage handler and /cron to our Cron handler
app = webapp2.WSGIApplication([('/', MainPage),('/cron', Cron)], debug=True)

The big changes are:
  • We have added a new class called Cron, which contains all of the loose code that used to live in cron.py
  • We have added a new URL mapping to our WSGI Application. This will hand over any request for '/cron' to our new Cron class
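
To picture what the URL mapping does, here is a rough sketch in plain Python. It is not webapp2's actual code. It only assumes that each (pattern, handler) pair is tried in order as a regular expression that must match the whole path, and that the first match wins. The names ROUTES and dispatch are made up for this illustration, and the handlers are stand-in functions rather than real RequestHandler classes.

```python
import re

def main_page():
    # Stand-in for the MainPage handler
    return "feed"

def cron():
    # Stand-in for the Cron handler
    return "cron job"

# Same shape as the list passed to webapp2.WSGIApplication
ROUTES = [('/', main_page), ('/cron', cron)]

def dispatch(path, routes=ROUTES):
    # Try each pattern in order; the whole path has to match
    for pattern, handler in routes:
        if re.match('^' + pattern + '$', path):
            return handler()
    # No route matched
    return "404 Not Found"

print(dispatch('/'))
print(dispatch('/cron'))
print(dispatch('/missing'))
```

Under these assumptions a request for '/cron' never reaches the MainPage stand-in, even though both live in the same script.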

The pieces to make it all work

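If you have been following on from my previous work, then you should already have most of this code. The only thing you need to touch is one line in app.yaml, so that /cron maps to our feed webapp (feed.app) instead of the old cron script. Keep the /cron handler above the catch-all /.* handler: App Engine uses the first handler whose pattern matches, so the admin-only login restriction would never apply if the catch-all came first.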

app.yaml:
application: almightynassar
version: 1
runtime: python27
api_version: 1
threadsafe: yes

handlers:
- url: /cron
  script: feed.app
  login: admin
 
- url: /.*
  script: feed.app

cron.yaml:


cron:
- description: hourly Twitter feed refresh
  url: /cron
  schedule: every 1 hours

entity.py:

# Our datastore interface
from google.appengine.ext import db

# Our RSS entity object
class Rss(db.Model):
    feed = db.StringProperty()
    content = db.TextProperty()

And that's it! You now have a fully functional application that handles every request through the webapp2 framework!
