DNS Woes with Amazon’s EC2 May 24, 2011

Posted by PythonGuy in Web Technologies.

If you are going to use a load balancer with Amazon’s EC2 system (highly recommended), you will notice that no static IP address is exposed for the load balancer. You may be tempted to look up the domain name they provide and use the IP address it resolves to as if it were static. While this will work for a while, DON’T DO IT. I speak from experience here. Amazon can and will change the IP address behind the load balancer’s domain name, and they won’t even give you a heads up.

You’re going to have to set up a CNAME for your domain that points to the load balancer. There’s just no way around it.

Let’s say you own a domain, for example, example.com. If you are tempted to put a CNAME for your root domain (example.com), you are making a huge mistake. If a domain has a CNAME record, then all the other records, such as the MX and TXT, are generally ignored. I say generally because some programs and email systems are smart enough to figure things out, while others are not.

So you are left with using a subdomain that doesn’t have anything else attached to it. The ideal candidate is, of course, www.example.com. Set up a CNAME for that.

What do you do with example.com? Get an elastic IP address from Amazon, and have it point to one of your web server instances. That will be the A record for your root domain (example.com).

If you are running Apache’s absolutely wonderful HTTP server, you’re going to have to redirect all traffic that reaches your website under any other host name to http://www.example.com. The exception is, of course, the heartbeat URL that your load balancer will poke at to see if the web server is up. Here’s what the directives look like.

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/ping$
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^/?(.*) http://www.example.com/$1 [NE,R=301,L]

You can read more about what these directives do. Basically, it says:

  1. Ignore /ping
  2. Ignore anything coming through http://www.example.com
  3. Redirect everything else (301) to http://www.example.com, keeping the arguments and URL intact.
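
As a rough cross-check of those three rules, the same decision can be modeled in a few lines of Python. This is only a sketch of the logic, not how Apache evaluates the directives. The function name redirect_target and its parameters are made up for illustration, and the query string is left out to keep it short.

```python
def redirect_target(host, path, canonical="www.example.com"):
    """Return the URL to redirect to, or None to serve the request as-is.

    Hypothetical helper that mirrors the three rewrite rules above.
    """
    # Rule 1: the load balancer's heartbeat URL is never redirected.
    if path == "/ping":
        return None
    # Rule 2: requests already addressed to the canonical host pass through.
    # The comparison ignores case, like the [NC] flag.
    if host.lower() == canonical:
        return None
    # Rule 3: everything else goes to the same path on the canonical host.
    return "http://%s%s" % (canonical, path)
```

For example, redirect_target("example.com", "/about") gives "http://www.example.com/about", while a request for /ping on any host is left alone.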

Hopefully, this will be useful for someone.

Teaching Programming With Python January 7, 2011

Posted by PythonGuy in Advanced Python, Beginning Programming, Mako, Python.

Python Guy taught Python Guy Jr. (9 years old) about Boolean Algebra. (It’s really easy because there are only two numbers, and so you have a 50% chance of getting any problem right by guessing. I guess that’s why they don’t teach it in Kindergarten, although if we did, I think a lot of kids would learn to like math a lot more.)

Python Guy popped open python on the konsole in KDE on Fedora 13. Although this box doesn’t have Flash installed (on purpose), it is, to Python Guy’s spawnlings, the magic box that does things that blow their mind every day before they even wake up late for school.

Python Guy watched Python Guy Jr. learn all about short-circuiting. When Python Guy was a young one, practicing on the Commodore 64 (or was it Borland C several years later?), the thing he was constantly in awe of was the short-circuit operators: “and” and “or” in Python, && and || in C.

Ah, the magic!


>>> 7 and 4
4
>>> 0 and 5
0
>>> 7 or 8
7
>>> 0 or 10
10

Of course, when you understand what is happening, you are tempted to do fancy things like this in Mako:


You have bought ${n} ${n == 1 and "item" or "items"}.

Until you mess up, that is, and foolishly choose something false for the positive case:


You have bought ${n} item${n == 1 and "" or "s"}.

(Can you spot it? Neither can I when I do it.)

So Python has the infamous and hotly debated “a if b else c” expression, which does the right thing:


You have bought ${n} ${"item" if n == 1 else "items"}.
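
To see the trap outside of a template, here is a small sketch in plain Python. The two function names are made up for the comparison. The first uses the and/or idiom with an empty string in the positive case, the second uses the conditional expression.

```python
def plural_suffix_and_or(n):
    # The "cond and a or b" idiom: when `a` is falsy (here the empty
    # string), the `or` falls through to `b` even though cond was true.
    return n == 1 and "" or "s"


def plural_suffix_conditional(n):
    # The conditional expression picks a branch from the condition alone.
    return "" if n == 1 else "s"


for n in (1, 2):
    print("and/or:      You have bought %d item%s." % (n, plural_suffix_and_or(n)))
    print("conditional: You have bought %d item%s." % (n, plural_suffix_conditional(n)))
```

With n equal to 1, the and/or version still appends the "s", which is exactly the mistake described above.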

But is that really the right thing? No, it’s not. If you want to do plurals, you need to be really, really smart about it. Even if you have a library that will pluralize your English nouns, you’ve got to think of the Germans, Slovenians, and Japanese who might not like ordering their stuff in English.

Back to the original topic, Boolean Algebra is a great way to teach Algebra and a great way to play with some of the most advanced features of Python. It’s almost magical how it all works, at least in the imagination of a 9-year-old.

Python MUD Reborn April 15, 2008

Posted by PythonGuy in Advanced Python, MUD, Python, Web Technologies.

I was prodded by one Jose from Spain who wanted to learn Python to revisit SimPyMUD. Of course, that project is old and dead, but I had learned enough about how to code up servers in Python in the meantime to write a new MUD from scratch in a matter of hours.

The model I used is a threaded one: every connection gets its own thread. It is based on the SocketServer module. Here’s some code to poke at.

Let me explain a little about how it works.

First, I use a threaded TCP socket server, ThreadingTCPServer. For each connection, this server spawns a new thread that calls the request handler you set, passing it the socket, the (client address, client port) pair, and the server instance.
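
A minimal sketch of that arrangement follows. It assumes Python 3, where the module is spelled socketserver (it was SocketServer in Python 2). The handler, the greeting, and the helper names are invented for illustration and are not the MUD's actual code.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: greet the client, then echo one line back."""

    def handle(self):
        # self.request is the socket, self.client_address is the
        # (host, port) pair, and self.server is the server instance.
        self.wfile.write(b"Welcome!\n")
        line = self.rfile.readline()
        self.wfile.write(b"You said: " + line)


def start_server(host="127.0.0.1", port=0):
    """Run the server in a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_line(address, text):
    """Connect, read the greeting, send one line, and return both replies."""
    with socket.create_connection(address, timeout=5) as conn:
        stream = conn.makefile("rwb")
        greeting = stream.readline()
        stream.write(text.encode("ascii") + b"\n")
        stream.flush()
        reply = stream.readline()
    return greeting, reply
```

Calling start_server() and then send_line(server.server_address, "look") exercises one connection, which the server handles on its own thread.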

The request handler requires a bit of magic to read and write at the same time. That magic is provided by an asyncore dispatcher. Each thread gets its own dispatch map to manage that dispatcher.

Around the dispatcher is a session object. This manages all the state of the current session. It starts off by writing out a welcome message and prompting for the login name.

The dispatcher handles writes by accumulating them in a buffer. When the dispatcher’s handle_write method is called, that’s a sign that the socket is ready to write to. The dispatcher writes as much data as it can, pulling it off of the buffer.

When an incoming bit of text arrives, the handle_read() method of the dispatcher is called. This will read as many bytes as it can, storing them in the incoming buffer. Then the buffer is split into lines, and each line is passed off to the handle_read_line() method. This is conveniently overridden with an attribute assignment to the dispatcher instance. Depending on the state of the session, that input will be rerouted to one of several places.
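
The buffering and line splitting can be sketched without any sockets at all. The LineBuffer class below is a hypothetical stand-in for the dispatcher's incoming side, with feed() playing the part of handle_read().

```python
class LineBuffer(object):
    """Hypothetical stand-in for the dispatcher's incoming buffer."""

    def __init__(self, handle_read_line):
        self.incoming = b""
        # Reassigning this attribute later reroutes input to another handler.
        self.handle_read_line = handle_read_line

    def feed(self, data):
        """Append newly read bytes and hand off every complete line."""
        self.incoming += data
        while b"\n" in self.incoming:
            line, self.incoming = self.incoming.split(b"\n", 1)
            self.handle_read_line(line.rstrip(b"\r").decode("utf-8"))
```

A partial line simply stays in the buffer until the rest of it arrives, so a command split across two reads is still delivered as one line.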

(An alternate implementation could keep track of the state and then have methods for all of the states. I would rather take advantage of Python’s dynamic environment than keep the rules of C.)

In the login state, input is read and when it is non-blank, it is stored as the login name. Then the state moves to the password input state. There, input is read that represents the password. The corresponding player is looked up in the database (really, a dict) and if one matches with the login and password, then the state moves to the playing state with the player assigned accordingly. Otherwise, an error message is shown and the state transfers back to the login mode.
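
Here is a rough sketch of those transitions, using the attribute-assignment trick to switch states. The Session class, the PLAYERS dict, and the messages are all made up for illustration. Output is collected in a list instead of being written to a socket.

```python
PLAYERS = {"jose": "secret"}  # made-up login -> password table


class Session(object):
    def __init__(self):
        self.output = []
        self.login = None
        self.player = None
        self.start_login()

    def write(self, text):
        self.output.append(text)

    def start_login(self):
        self.write("Login: ")
        self.handle_read_line = self.login_state

    def login_state(self, line):
        if not line.strip():
            self.write("Login: ")  # blank input: ask again
            return
        self.login = line.strip()
        self.write("Password: ")
        self.handle_read_line = self.password_state

    def password_state(self, line):
        if PLAYERS.get(self.login) == line:
            self.player = self.login
            self.write("Welcome, %s!\n" % self.player)
            self.handle_read_line = self.play_state
        else:
            self.write("Bad login or password.\n")
            self.start_login()

    def play_state(self, line):
        self.write("You typed: %s\n" % line)
```

Whoever feeds lines to the session only ever calls session.handle_read_line(line); which method that is depends on where the session is in the conversation.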

In the play mode, commands are parsed. The commands dispatch to one of the command methods, which are supposed to do further parsing of the input. Based on the command, objects in the universe are modified, messages are sent out to different connections, and basically, stuff happens.
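
One simple way to do that dispatch is to look up a method by name. The sketch below is an assumption about how it could look, not the MUD's real parser. The do_ prefix and the two sample commands are invented.

```python
class PlayState(object):
    """Hypothetical play-mode parser: 'verb rest' calls self.do_verb(rest)."""

    def __init__(self):
        self.messages = []

    def handle_read_line(self, line):
        # Split off the first word as the verb; the rest is left for the
        # command method to parse further.
        verb, _, rest = line.strip().partition(" ")
        method = getattr(self, "do_" + verb.lower(), None)
        if method is None:
            self.messages.append("Huh?")
        else:
            method(rest.strip())

    def do_say(self, text):
        self.messages.append('You say, "%s"' % text)

    def do_look(self, text):
        self.messages.append("You see nothing special.")
```

Adding a new command is then a matter of adding one more do_ method.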

Thus, you have a MUD.

Now, in building a MUD, there are two bits you have to work on: (1) the functionality and (2) the world. The functionality, for the most part, is going to be implemented in Python. New commands are new methods in the play state. Objects also have some methods that are invoked by the commands. If you want a new feature, you’ll have to change existing commands or add new ones.

The world, however, is represented with data. These are the object instances that have been created and linked to each other.

Python doesn’t do a tremendous job of allowing functionality to be represented with data. Ideally, I can imagine a mud universe where everything is code or everything is data because data and code are the same thing. (I guess I have been reading too many Lisp books.) You could log into this world and, as an administrator, change the very rules of nature. But with Python, I don’t quite see a way to do that without either implementing a new language inside of Python, or finding some way to record the Python program state. When I figure it out, I am sure there is some Python award or something.

An alternative method of implementing a MUD, one that I would like to explore, involves using greenlets to simulate threads while truly doing asynchronous socket programming. (The eventlet module is very interesting.) This is an interesting idea because the functionality doesn’t have to be implemented with states. It can be implemented with plain functions that are interrupted and resumed appropriately. It is kind of like the anti-Twisted environment.
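
To give a feel for "functions that are interrupted and resumed", here is a sketch that uses plain Python 3 generators as a stand-in. It does not use greenlets or eventlet, and the names login_flow and run are made up. The point is only that the whole login conversation fits in one function with no explicit state variable.

```python
def login_flow(players):
    """One plain function covers the whole login conversation.

    Each `yield` hands a prompt to the caller and suspends until the next
    line of input is sent back in.
    """
    while True:
        name = yield "Login: "
        if not name.strip():
            continue
        password = yield "Password: "
        if players.get(name) == password:
            return name
        # Failed attempt: loop around and prompt for the login again.


def run(flow, lines):
    """Drive the generator with a list of input lines (no socket involved)."""
    prompts = [next(flow)]
    for line in lines:
        try:
            prompts.append(flow.send(line))
        except StopIteration as done:
            return prompts, done.value
    return prompts, None
```

With greenlets the yield would be hidden inside the read call, but the shape of the code is the same: it reads top to bottom like a conversation.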

Python Guy Gets WSGI February 25, 2008

Posted by PythonGuy in Advanced Python, Python, Web Technologies, WSGI.

Thanks to a patient explanation in an article by James Gardner, this old web programmer gets the point of WSGI.

Let me say, WSGI rocks my world. It’s one of those things that says, “There’s a whole fresh new world of opportunity out there, and you’re invited.”

Before WSGI, the web world was strange and convoluted. You either had to edit a slew of Apache config files (which, while better than the predecessor, was still a major pain), or you had to build your own web server (which, while better than before, was still painful).

Apache was nice because you could use something like mod_perl or mod_python and actually get in there and change the way the request was handled. It was bad because the request stack was extremely complicated, and the config files didn’t make any sense to the new Apache hacker.

Hand-coded web servers were not nice because each one was different, and most were not that flexible. In the end, you had a new hand-coded web server that was incompatible with the last one. Hence, we have seen Zope and Django succeed.

Ruby on Rails merely opened the world’s eyes to what a good set of options can give you in the world of hand-coded web servers. It is, in a way, the apex of that kind of technology.

However, WSGI is a whole new experience. It is a combination of the best parts of hand-coded web servers with the best parts of the Apache request stack. Now, before you start pulling your hair out because it sounds so complicated (you mean I have to wrestle with config files and hack on web servers?), realize that WSGI is the simplest solution one could ever possibly come up with.

It has, in its interface, three parts.

First, WSGI handlers are functions. (That’s not rocket science. Apache request handlers were functions. RoR controllers are, ultimately, functions.)

Second, the function takes exactly two parameters. The parameters are “environ” and “start_response”. “environ” is a dictionary, a map. It can list whatever you want. Some things are expected, but they don’t have to be. In different parts of the stack, people will expect different things. “environ” is by far the most complicated part of WSGI, and it is only complicated because different people will need and provide different keys. “start_response”, on the other hand, is a function. It takes a response code (“200 OK”) and a list of HTTP headers.

Third, the WSGI handler must return something iterable. This is going to be the body of the HTTP response.

Pretty simple, huh?
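
Those three parts fit in a handful of lines. The sketch below assumes Python 3, where the body must be bytes. The names hello_app and call_app are made up. call_app is just a helper that invokes the handler by hand, using wsgiref from the standard library to fill in a plausible environ.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A complete WSGI handler: two parameters in, an iterable out."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [("Content-Type", "text/plain; charset=utf-8"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app directly, without any web server."""
    environ = {}
    setup_testing_defaults(environ)  # fills in keys a server would provide
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

call_app(hello_app, "/mud") returns the status, the header list, and the body, which is all a server needs to build the HTTP response.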

Now, let’s consider what you can do with this.

Obviously, you can create your hand-coded web application. It will simply be a WSGI handler. It will call “start_response” with the appropriate parameters, and then send back the body of the response. Plug this in to any default web server that supports WSGI, and voilà! You have a working web application where you can replace the underlying technology. One solution fits all.

Or, you can create a WSGI server. This would take incoming HTTP requests and translate them into WSGI requests. It would take the response and send it out as an HTTP response. Pretty simple, huh? Well, the possibilities are endless. In fact, there are already WSGI handlers for pretty much anything you can imagine, as well as CGI-WSGI translators if you just want to work from the CGI environment. Your options are endless—and they are all interchangeable. One solution fits all.

Or, you can write WSGI middleware. What is middleware? It is what you can put between the web server (which handles the HTTP request and sends an HTTP response) and the web application (which does something interesting with the request.) For instance, if you wanted to do HTTP AUTH, you could write a WSGI middleware app that takes the incoming WSGI request, checks to see if it has been authenticated, and returns an auth request if not. Otherwise, it just passes it along, adding who was authenticated to the environ. You could write an error handler. It would catch exceptions and show a very neatly formatted web page, one that could even allow people to play with the program state. The possibilities are endless. And best of all, middleware is universally useful. One solution fits all.
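
As a sketch of the auth example, here is a small middleware class that checks HTTP Basic credentials against a dict. Everything named here (RequireAuth, whoami_app, check_basic_auth, the realm) is hypothetical, and a real deployment would not keep plain-text passwords in a dict.

```python
import base64


def check_basic_auth(header, users):
    """Return the user name if the Basic credentials match, else None."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(header[6:]).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    name, _, password = decoded.partition(":")
    return name if users.get(name) == password else None


def whoami_app(environ, start_response):
    """Innermost application: reports the user the middleware recorded."""
    body = ("Hello, %s\n" % environ["REMOTE_USER"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


class RequireAuth(object):
    """Middleware: a WSGI app that wraps another WSGI app."""

    def __init__(self, app, users):
        self.app = app
        self.users = users

    def __call__(self, environ, start_response):
        user = check_basic_auth(environ.get("HTTP_AUTHORIZATION"), self.users)
        if user is None:
            # Not authenticated: answer directly, never calling the wrapped app.
            start_response("401 Unauthorized",
                           [("Content-Type", "text/plain"),
                            ("WWW-Authenticate", 'Basic realm="example"')])
            return [b"Authentication required\n"]
        # Authenticated: record who it was and pass the request along.
        environ["REMOTE_USER"] = user
        return self.app(environ, start_response)
```

Because RequireAuth is itself a WSGI handler, RequireAuth(whoami_app, users) can be handed to any WSGI server, or wrapped in yet another layer of middleware.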

In short, thanks to WSGI, the world of web application development just got a lot easier, and your options just got a lot bigger. You can literally remove one component of the request stack and replace it with another. And the interface is quite simple.

Now, once you understand this, and once you actually get around to building a web application with six or seven different pieces, you are going to find that the hardest part is configuration. Getting all the pieces configured to behave just the way you want is going to be a bear, because each one is configured in its own way. Never fear, that’s what Python Paste is for. Python Paste takes a config file (in .ini format) and configures each component on the stack. You just have to write the code that puts it together, informing each part that there is a configuration for it from the paste infrastructure.

Python Paste also handles some fairly common tasks, tasks that many different components will want to do. For instance, given a WSGI request, try to build the URL that generated that request. These util functions are actually quite simple. Take a look at them if you get a chance. But they are useful for anyone who is doing any work anywhere on the WSGI stack.

So Python Paste makes it really easy to put together your WSGI app and configure it.

With Python Paste and WSGI, your possibilities have grown significantly, and the overhead has dropped just as much. Now, more than ever, it is going to be possible to build web apps in an afternoon.

Content-type for Pylons Static Content February 6, 2008

Posted by PythonGuy in Pylons, Web Technologies.

So I was playing with my shiny new Firefox3 beta 2. I was admiring the wonderful features of the canvas object. And I started to want to really get into SVG.

Well, I ran into a problem. My pylons server wasn’t serving SVG files as SVG files.

I put “test.svg” into my project’s “public” directory. I browsed to the URL http://localhost:5000/test.svg . I noticed that Firefox didn’t recognize the file and asked me to open another app to render it. (eog is nice, and it does a terrific job rendering it, but I wanted to have Firefox do the hard work this time around.)

Just to be sure that Firefox could render the file, I browsed to the absolute path on my local hard drive. That is, “file:///home/pythonguy/MyProject/myproject/public/test.svg”. Sure enough, it renders fine.

What else to do? Well, I realized by reading the FAQ on SVG for Firefox that if the server doesn’t send the correct Content-type response, then Firefox won’t guess. Sure enough, this telnet session showed me what my server was saying:

$ telnet localhost 5000
GET /test.svg HTTP/1.0

HTTP/1.0 200 OK
Server: PasteWSGIServer/0.5 Python/2.5.1
Date: Wed, 06 Feb 2008 03:24:32 GMT
content-type: application/octet-stream
...

So there’s the problem. My Pylons server doesn’t recognize the .svg ending on the file.

Digging deeper, I found this part in “config/middleware.py”:

    # Static files
    javascripts_app = StaticJavascripts()
    static_app = StaticURLParser(config['pylons.paths']['static_files'])
    app = Cascade([static_app, javascripts_app, app])
    return app

Well, this certainly is interesting. The “Cascade” object lives in “paste/cascade.py”. All it does is try out the different apps until it gets one that doesn’t give an error response. The “StaticURLParser” object is the one that serves the static files.

Looking in “paste/urlparser.py” (thank goodness for explicit imports in Python!), I found a reference to “FileApp” in “paste/fileapp.py”. There, I see that the mime type is determined by the mimetypes module, part of the standard packages for Python.

Poking around mimetypes, I am not surprised to see that files ending in “.svg” are not recognized. (I am surprised, very happily surprised, that Pylons doesn’t try to reinvent the wheel when it comes to MIME types.) The documentation points out where the module learns which file names represent which MIME types. On my system, that is “/etc/mime.types”.

Sure enough, SVG is missing from the list of MIME types in /etc/mime.types. Maybe it is out of date? (Unlikely, but possible.) Nope, it turns out that SVG isn’t registered yet as a valid MIME type.
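
For reference, the lookup can be reproduced straight from the mimetypes module, and a type can also be registered in code rather than in the system file. This is only a sketch of the standard library calls. Newer Python releases may already know about .svg, in which case the add_type() call changes nothing.

```python
import mimetypes

# Register the extension in the module-level table, the same table that
# mimetypes.guess_type() consults.
mimetypes.add_type("image/svg+xml", ".svg")

# guess_type() returns a (type, encoding) pair for a file name or URL.
svg_type, svg_encoding = mimetypes.guess_type("test.svg")
print(svg_type)
```

The registration only lasts for the life of the process, so it would have to run somewhere at application start-up.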

I hand-edited the file to recognize SVG. Telnet now gives me:

$ telnet localhost 5000
GET /test.svg HTTP/1.0

HTTP/1.0 200 OK
Server: PasteWSGIServer/0.5 Python/2.5.1
Date: Wed, 06 Feb 2008 03:24:32 GMT
content-type: image/svg+xml
...

But Firefox still doesn’t recognize it. Nuts.

SQLAlchemy Rocks! Part II August 16, 2007

Posted by PythonGuy in Advanced Python, Myghty, Pylons, SQLAlchemy, Web Technologies.

As a follow up to the previous article, SQLAlchemy Rocks!, here’s how I used the page object.

Remember, I’m using Pylons. Here’s the controller.

    def index(self):
        page = int(request.params.get('p', 1))
        results_per_page = int(request.params.get('rpp', 10))

        c.portfolio_page = model.portfolios_page(page, results_per_page)

        return render_response('/home/index.myt')

And portfolios_page is defined as:

def portfolios_page(page, results_per_page):
    return Page(Portfolio.query().order_by(asc('name')),
            page, results_per_page)

Here, Portfolio is an ORM object, not a table.

The corresponding Myghty template (just the relevant bits):

Items <% c.portfolio_page.first+1 %>
to <% c.portfolio_page.last+1 %>
of <% c.portfolio_page.total %>
<table>
  <tr>
    <th>Portfolio</th>
    <th>Net Worth</th>
    <th>Permissions</th>
  </tr>
%   for p in c.portfolio_page:
  <tr>
    <td><a href="<% h.url_for(controller="portfolio", portfolio=p.id) %>"><%
        p.name %></a></td>
    <td><% p.net_worth |money %></td>
    <td><% p.permissions(c.user) %></td>
  </tr>
%   # end for
</table>
Results per page: <% c.portfolio_page.results_per_page %>
Page: <% c.portfolio_page.page %>
Total pages: <% c.portfolio_page.pages %>

Hope this isn’t too much code. Take your time to grok it all. Feel free to ask specific questions in the comments.

SQLAlchemy Rocks! July 27, 2007

Posted by PythonGuy in Advanced Python, SQLAlchemy, Web Technologies.

I’ve been a Python guy for quite a while. The number of web frameworks and options is dizzying. The hardest thing about creating a website in Python is choosing the framework. Once that choice is made, the rest of the ride is easy.

I’ve recently picked up Pylons and SQLAlchemy. I chose Pylons because a good friend of mine (a Ruby fanatic) suggested I learn Rails. I can’t bring myself to pick up Ruby (the syntax is too confusing and ugly for my tastes, reminding me too much of the bad memory that perl is.) But I don’t mind playing with the next best thing, and apparently Pylons is it.

I think I’ve finally gotten over the initial learning hump. I say that because the TODO list is shrinking. I can’t think of things to do faster than I can do them now.

This article isn’t about Pylons though. This is about SQLAlchemy. SQLAlchemy is the ORM I would’ve written if I were smart enough and had enough time and motivation. I have yet to find even one thing in the framework that I would have chosen to do differently. Even the naming is pleasantly consistent and predictable.

Anyway, let me share with you one of the pleasant surprises that SQLAlchemy had in store for me. I wanted to do pagination of results from a database. Normally, this is painful. That’s because SQL isn’t a fun thing to write, and it certainly isn’t fun to write a program to write SQL queries.

Well, I threw together this simple Pagination class in a few minutes.

class Subset(object):

    def __init__(self, query, first, num):
        self.query = query

        self.first = int(first)
        assert self.first >= 0, \
                "first must not be negative. (first=%r)" % self.first

        self.num = int(num)
        assert self.num > 0, \
                "num must be greater than 0. (num=%r)" % self.num

        self.total = 0      # Should be the SQL COUNT() on the full query
        self.results = ()   # Should be the range of results.

        # Index of the last result in this subset (first - 1 if it is empty).
        self.last = self.first + len(self.results) - 1

    def __len__(self): return len(self.results)
    def __getitem__(self, i): return self.results[i]
    def __iter__(self): return iter(self.results)

class Page(Subset):

    def __init__(self, query, page, results_per_page):
        self.page = int(page)
        assert self.page > 0, \
            "page must be greater than 0. (page=%r)" % self.page

        self.results_per_page = int(results_per_page)
        assert self.results_per_page > 0, \
            "results_per_page must be greater than 0. " \
            "(results_per_page=%r)" % self.results_per_page

        Subset.__init__(self,
                query=query,
                first=(self.page-1)*self.results_per_page,
                num=self.results_per_page)

        # Floor division, so the page count stays an integer.
        self.pages = (self.total-1)//self.results_per_page + 1

I removed the __doc__ strings and other comments to condense the code a bit.

Let me walk through the code a bit. There are two classes—Subset and Page. Subset is a subset of a larger query—from first to first+num items. Page is a page of results—given the page number and the results per page.

Page is really something tacked on top of the Subset. The only difference is that you are talking about pages with Page and a range of results for Subset. They are both useful in a web framework. Page is used in traditional, non-AJAX type paged results. But Subset will be useful if I do something AJAX-y, like a scrollbar that picks up a group of results at a time.
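
The arithmetic that Page layers on top of Subset can be checked on its own, with no database involved. The helper below is a made-up function for illustration. It follows the same formulas as the classes above, with `total` passed in instead of coming from a query.

```python
def page_bounds(page, results_per_page, total):
    """Return (first, last, pages) for a 1-based page number.

    `first` and `last` are 0-based offsets into the full result set;
    `last` is first - 1 when the page holds no results.
    """
    first = (page - 1) * results_per_page
    # Number of results that actually land on this page.
    count = max(0, min(results_per_page, total - first))
    last = first + count - 1
    pages = (total - 1) // results_per_page + 1
    return first, last, pages


print(page_bounds(3, 10, 25))
```

With 25 results and 10 per page, page 3 covers offsets 20 through 24 out of 3 pages, which a template would display as "Items 21 to 25 of 25".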

Anyway, there is a missing bit in Subset.__init__(). I am supposed to set self.total to the total number of results. And I am supposed to set the results to just this page of results. I wrote this code like this originally, dreading the moment when I had to actually implement the SQL bits. I mean, I had been disappointed by every ORM system I had seen up until this point. Either they had no concept of LIMIT and OFFSET, or they made it all but impossible to use without using a subselect. I was prepared to be disappointed in a big way.

As I read the documentation in SQLAlchemy, I discovered that this is trivial. Read what it looks like below:

        self.total = self.query.count()
        self.results = tuple(self.query 
                .limit(self.num) 
                .offset(self.first).all())

Yes, that’s it. It really is that easy. You add a “.count()” to the query to get the count. You add a limit() and offset() clause in similar manner, and then grab the results with all().

This worked, flawlessly, and I can’t imagine anything better. I don’t know what else to say.

(The code sample above is Copyright © 2007.)

The Future of GUI April 14, 2007

Posted by PythonGuy in Web Technologies.

I spend a great deal of time with web applications. Writing applications with HTTP and HTML really, really sucks. Javascript magnifies that suckiness. AJAX isn’t too bad, but it is even more limiting.

I have spent a great deal of time playing with Qt, PyQt, OpenGL, and PyOpenGL. I believe that in the not-so-distant future, web apps will become quaint antiques. Just like we say today, “email is for old people,” we will one day be saying, “HTML is for old people.”

Well, what will replace it then?

I believe we will be writing apps once more.

.NET is the future, not necessarily because it is a language or a platform, but because it is a way to build great apps really easily. They have placed an emphasis on making development easy, as well as making distribution easy. These are two areas that I don’t believe Microsoft has really tried hard at in the past. (It is doubtful if they did a really good job with .NET this time around, either.)

I remain unconvinced that .NET, or any other GUI platform, has anything on top of Python, Qt, PyQt, OpenGL, and PyOpenGL when it comes to writing applications. With these tools, the world is my canvas.

Python is the easiest language to program in. I don’t even have to think about the language–I just write down what I am thinking and it works.

Qt is a sweet GUI library. The more I learn about it the more excited I get. More importantly, the more I use it the more I like it. It has a lot of pleasant surprises, and I have yet to find an unpleasant surprise (except when I have built PyQt wrongly.)

OpenGL is, in my opinion, the leader for 3D graphics. Sure, it probably doesn’t compete with Direct3D in some regards, but the world of 3D graphics is not all games and framerates. 3D is really about data visualization, and OpenGL is certainly sufficient for almost everything I can imagine.

I’m not going to say that Python and Qt and OpenGL are the best there ever will be. I am saying that whatever is going to replace these technologies has got to be pretty dang good.

No, it has to be freakin’ awesome.

Universal ID January 19, 2006

Posted by PythonGuy in Web Technologies.

I have had an idea off and on for a universal ID system. This is the thumbnail sketch.

IDs come in the form of simple email addresses. They may or may not be an actual email address. The form is basically user@domain.
Each DNS domain publishes an ID server address. This is the server used to verify that a user is who they say they are.

When you log in to a service at domain X, you provide your ID, which says it is part of domain Y. The server at X needs to authenticate you. So it sends a challenge request to the ID server at Y. You need to go to the ID server at Y, log in, and then authenticate the domain X’s challenge as being issued by you. When you return to X, it will recognize you.

The ID server can require various methods of authentication. For instance, a simple web form with a username-password can be used. A plugin to your Firefox web browser could be used also. Or a service-level “If you can decode this, you have the right password” type challenge could be issued. Or maybe something even more secure. The point is that it is up to the ID server to decide exactly how to authenticate the user.

The communication between the server at domain X and the ID server goes something like this:

  1. X says: My session id 123 is trying to authenticate user pythonguy@example.com.
  2. X asks: Is session 123 authenticated?
  3. ID replies: No / Not yet / I forgot about 123 / Yes

The user-X interaction looks like this:

  1. User to X: Hi! I’m pythonguy@example.com
  2. (X to ID server: Session 123 is trying to authenticate pythonguy@example.com)
  3. X to User: Please authorize session 123@X.com
  4. User to ID: I would like to authorize 123@X.com
  5. ID to User: You’re authorized. Go back to X.
  6. User to X: Hi! I’m pythonguy@example.com and I authorized session 123.
  7. (X to ID: Is session 123 authorized? ID to X: Yes)
  8. X to User: Welcome, pythonguy@example.com!
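
The ID server's side of that exchange can be sketched with an in-memory dict. This is only an illustration of the bookkeeping. The IDServer class and its method names are made up, the user is assumed to have already logged in to the ID server by whatever method it requires, and nothing here deals with transport or expiry.

```python
class IDServer(object):
    """Hypothetical ID server for one domain; state lives in a dict."""

    def __init__(self):
        # (service, session_id) -> [user, authorized?]
        self.pending = {}

    def register(self, service, session_id, user):
        # Step 2: X announces that a session is trying to authenticate a user.
        self.pending[(service, session_id)] = [user, False]

    def authorize(self, user, service, session_id):
        # Step 4: the logged-in user approves the session that X issued.
        entry = self.pending.get((service, session_id))
        if entry is None or entry[0] != user:
            return False
        entry[1] = True
        return True

    def status(self, service, session_id):
        # Step 7: X asks whether the session has been authorized.
        entry = self.pending.get((service, session_id))
        if entry is None:
            return "forgotten"
        return "yes" if entry[1] else "not yet"
```

The status values line up with the replies listed earlier: "forgotten" for an unknown session, "not yet" while the user has not approved it, and "yes" afterwards.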

People may be tempted to use this only for website interaction, but I think we can go far beyond that.

Apache or Python? January 19, 2006

Posted by PythonGuy in Web Technologies.

I saw a post on whether to use Apache or Python for a web service. I want to share my experience on the subject.

In the beginning, everyone wrote their own webserver from scratch in C or similar technology. Then, a lot of the better, open servers merged into the Apache (= a patchy) project. Apache was really, really good for its time. But that time was an era of CGI. Apache handled CGI probably better than anyone else, and it was really good with standard files as well.

Fast forward a few years, and mod_perl hits the scene. Now you can whip together an industrial-strength web server using perl. Better still, your perl CGI scripts could get a tremendous performance improvement out of the box, with only minor tweaks. This was way cool, and it rocketed the performance of the standard web server.

But as time went on, people started wising up. Why run Apache when writing your own webserver was easy enough? And you could just hide that webserver behind Apache to get Apache’s robust front-end with your app’s backend.

Today, I use Apache for web-facing stuff. Everything is either served up statically or by proxy through Apache. But behind that, I use individualized web servers.

Apache is difficult to work with, even after all the years I’ve logged with it. I’d prefer to avoid it where possible.

Maybe one day I’ll feel comfortable putting something other than Apache facing the WWW. I’ll let you know when I do.