jump to navigation

Bjoern September 26, 2016

Posted by PythonGuy in Uncategorized.
add a comment

A recent blog post where the speeds of various WSGI servers was compared piqued my interest. Among the most surprising results a new WSGI server named Bjoern. It is written in C and compatible with Python 2.7. Taking advantage of the famous libev, it provides unparalleled performance. It is difficult to imagine how you can make python servers run any faster.

If you don’t know about libevent and libev, you really should be making comments on performance. Nowadays, the best way to get performance out of your hardware with coroutines and microthreads. Multithreading is now a dinosaur of a foregone era, making the GIL completely irrelevant.

Thinking about the problem I am trying to solve, I think a WSGI server is exactly what I need. I can write my own framework for my servers rather easily. So I’m investigating Bjoern and others at the moment.

Writing PyPy Compatible Python September 24, 2016

Posted by PythonGuy in Uncategorized.
add a comment

Increasing, I see projects boasting support for PyPy. It’s time for a refresher on what PyPy is and why you should be writing PyPy-compliant code.

PyPy, as opposed to PyPI (the Python Package Index), is a project aiming to compile Python with Python. This sounds absurd and make a practical joke of some sort, but it’s important. When you consider the massive success that Google’s V8 Engine for Javascript has been, you wonder why Python can’t do the same thing, and then you realize that if you could just compile Python to native machine code (or any other kind of code) with Python code, then you would be well on your way to achieving V8’s performance, and maybe beating it because the compiler is written in Python, not C, and is thus easier to understand and iterate on.

PyPy has been around for a long time and has been very visible. It has struggled to achieve the level of performance of Python itself (called CPython since it is the Python engine written in C) but lately, it is increasingly showing that not only can it meet CPython, it can beat it.

Now, Python, the language, has a problem. It was written to make the job of the programmer super-easy, but in doing so, has made it incredibly difficult to turn it into machine code. The Python VM makes it all possible, but we don’t want to make a faster VM, we want to take Python code and turn it into the lowest-level machine code, highly optimized for the CPU it is running on. In order to make that happen, you have to modify the definition of Python slightly, or rather, embrace some differences to the CPython standard implementation.

This page documents the changes you need to make.It used to be that you couldn’t do things like assign different types to a variable, but those days seem to be long gone. Now, the only major difference is that things are not garbage collected like they are in CPython. So you need to explicitly close files and generators when you are done using them. Thankfully, the “with” statement makes this trivial. It is a good pattern you should always be using.

There are some other low-level details you probably won’t run into listed here. If you go through the list, you can see how far PyPy has come.

If you want to take advantage of PyPy’s speed, you’ll need to write your code a certain way. This page lists some of the ways you can code your program to make it run a lot faster in PyPy. Basically, it’s all about being aware of what’s happening at the silicon level and working with that. Notably, the first kind of optimization you should do is choosing the right algorithm. You might make your O(N**2) algorithm run as fast as you like, it will still lose to even poorly optimized O(N log N) algorithms when you have large datasets.

PyPy is reported to give about a 7x performance boost. It is production ready, today.

Two other projects to keep your eye on:

  • Nuitka, which complies Python to C++.
  • Pyston, which uses LLVM.

Finally, a word on the GIL. People talk about the GIL as if it’s a bad thing. It’s not. It’s shifting the cost of multi-threading from every object access to the entire process. If you were to naively remove the GIL and add lock-checks on every access, Python would run a bajillion times slower in single-threaded mode. Perhaps someone will figure out a way to get the best of both worlds, but I highly doubt it.

If the GIL is your bottleneck, use multiple processes, invest in building a SOA architecture, and remind yourself that eventually, you’ll have to start running your jobs on more than one server. In other words, with other languages that support multi-threading well, your evolution is single-threaded -> multi-threaded -> multi-process. With Python, we just cut out the middle man, skip multi-threading, and start investing in multi-process development sooner rather than later. In the end, we get to ignore a whole class of errors that are notoriously difficult to detect, diagnose, and repair.

Guido van Rossum has basically said, “Remove the GIL over my dead body (or by proving me wrong.)” Folks, if you can’t show Guido that you can remove the GIL and make Python better, you have no business saying that the GIL is the problem.


Picking the Best Python Web Framework September 24, 2016

Posted by PythonGuy in Uncategorized.

I’m at the point in my job where I get to pick an entirely new web framework for Python. There are so many out there, it’s really hard to choose.

The first choice I need to look at is whether I need a “full” web framework, or a “minimal” web framework. But first, what do I mean by “web framework”?

A Web Framework is a library that provides the following features:

  • A way of mapping URLs to methods
  • A way of maintaining state across web requests (IE, database connections.)
  • A way of rendering HTML templates.
  • Several other goodies that you typically use in a web server.

A full web framework provides all of the above and lots more. This would be frameworks like Pyramid, Django, and Turbogears.

A minimal web framework provides a lot less. These are things like CherryPy and Flask.

By comparison, things like gevent, twisted, and tornado are not web frameworks. They are simply web servers. You’ll have to build the framework bits yourself.

Since I’m not building a user-facing website but a backend REST server, I don’t need a full web framework. This means Django is out of the question, and Pyramid and Turbogears are less desirable because they are so big.

The next question to consider is what version of Python do I intend to use, and whether I want to support things like Cython and PyPy. Since I am interested in performance, I will likely want to experiment with PyPy, anything that doesn’t run on PyPy is out of the question. I also want to support Python 2 AND 3. My team is transitioning to Python 3 so I don’t want to hold them back with my choice.

I then consider whether I need to interface with a database. If so, then I always choose SQLAlchemy. For those who are not familiar with SQLAlchemy, you have no idea what you are missing. Once you experience SQLAlchemy, you will never, ever want to interface with a database in any other way ever again. SQLAlchemy provides features that are all but impossible in other languages, and it does it seamlessly and effortlessly.

Thankfully, SQLAlchemy is a very well-maintained and mature product, so it supports Python 2 and 3 and PyPy.

Now that I’ve narrowed down the field quite a bit, I need to consider the last requirement. Since I’ll be competing with Node.js and other languages that provide coroutines, I want to be able to use gevent. Gevent is one of those hidden gems in Python that no one seems to know about. They say, “Python doesn’t support coroutines” but with gevent, it really does and it is awesome. Gevent makes Python competitive with many languages. PyPy seals the deal and makes Python the best language ever.

And now, let’s look at my options.

  • CherryPy, which has been around a long time and been a favorite of mine. I like the logo and the name, but it is engineered very well and supports all of the features I need. CherryPy also supports SSL natively.
  • Pylons is old, stable, and incredibly powerful. I have spent a lot of time in Pylons and I loved every minute of it.
  • Pyramid is new and I’ve tried to use it a few times but I’ve always chosen Pylons instead. Maybe I should give it another shot.
  • web2py is not Python 3 compatible.
  • Wheezy.web’s claim to fame was being the fastest back in 2012. It hasn’t been updated since 2015.
  • Bottle seems intriguing and simple. I wonder whether it supports PyPy though.
  • Flask is very popular and deserves inspection. It doesn’t seem to support Python 3 well, though. Nor does it seem to support PyPy.
  • Hug is also intriguing.
  • Falcon is what Hug is built on. So I need to take a look.

In terms of a web server, I’m going to use something off the shelf. Here are my options:

  • Nginx. I have a long history with Nginx and I really don’t like it.
  • Apache. People don’t like Apache for some reason. It is not as fast as Nginx but, in my book, much easier to configure and use. Also, they don’t hide useful features behind a paywall. Apache also has mod_wsgi.
  • Gunicorn is almost synonymous with Python web development. I’ll have to consider it.
  • Spawning seems interesting. It is worthy of more investigation.
  • Pylon’s Waitress also appears intriguing. It requires more investigation.

I am going to continue to investigate and I’ll try to keep my blog updated with my latest findings.

I should add: The reason why Python has so many web frameworks is because Python is awesome. It’s not hard for people to try out new ideas and get them production ready, and so there are always going to be tons of options out there, and they are going to be quite different from each other. This is overwhelming to some, but I prefer choice and I love experimenting with new things.