Why I Don’t Use REST October 16, 2015

Posted by PythonGuy in HTTP, Networking, REST, Ruby, TCP/IP.

It seems REST is pretty entrenched nowadays. Somewhere along the line, someone decided that all web developers must use REST for their API because there’s really no better way to do it.

I disagree.

I’ll begin with my criticisms of REST and end with my proposal for a replacement, a proposal that is neither new nor unique.

My first criticism is that REST is emphatically NOT a standard. There is no paper or publication that I can find that proposes a REST standard, let alone one that has been agreed upon by many vendors. As far as I can tell, the best definition of REST is “the way Ruby on Rails does it”. The only reason that definition wins is that Ruby on Rails is more popular than the numerous other platforms that provide something they call REST. This criticism is often ignored, but I believe wrongly so. Without a standard, how can one learn it, let alone comply with it? Someone somewhere should’ve documented by now what REST means, and the fact that no one has means that probably no one can. Or, in my mind, that once they set their keyboard to documenting it, they realize how flawed and horrible it really is.

The second criticism is that it doesn’t acknowledge or even comply with a critical component of the internet known as network layers. I can’t blame new programmers and developers for their ignorance, but I can blame experienced people for not knowing the basics of networking. It seems in developing the internet, its architects made it too seamless and easy to use such that people who have no business expanding it are fully capable of doing so.

Let me explain this to you in a nutshell. If you are a developer, you should read OSI Model at Wikipedia to get the bigger picture.

You may have heard of HTTP and TCP/IP. These aren’t just fancy acronyms; they are critical components that make the web possible. See, when two computers attempt to communicate with one another, there is a lot that needs to happen. At the hardware level, pulses of light or electrical potential are sent down cables, or radio waves are emitted. Those pulses are received, oftentimes distorted, by another bit of hardware. What is important is that both computers agree not only on what kind of wire or cable or radio waves are used, but also on what those pulses mean and how they are to be interpreted. These pulses contain all the information you care about: a photo of your grandmother, a message from your boss, a request to load a customer record from a server. However, at this level, the Physical Layer, the architects who design the protocols and the engineers who hook it all up don’t care. All they care about is that bits of data are sent back and forth according to those protocols. You may have heard of some of these protocols but only because you’ve had to connect the wires or buy the hardware.

Above this layer sits the Data Link Layer. Now the data sent by the Physical Layer has meaning in terms of which computer is speaking and which is supposed to listen. This layer consists of acronyms you’ve never heard of unless you work at an ISP.

Above that layer sits the “IP” of “TCP/IP”. It is the Network Layer. This layer handles routing, the conveying of a message from one computer to another through a chain of other computers. You may have heard of IPv4 and IPv6. These are two protocols that can be described as envelopes with a “to” address and a bunch of data inside. It is important to note that, like the other layers, it doesn’t care what is in the packet or how the packets are flying from one machine to another. In fact, in the journey of a packet from your computer to the server, it is likely that several different kinds of Physical and Data Link Layers are used, and IP doesn’t care. This layer is where we get IP addresses from (port numbers arrive one layer up, with TCP). Everything below this doesn’t even know an IP address exists.

The “TCP” in “TCP/IP” refers to the Transport Layer. This is how data that cannot fit into a single packet is conveyed between machines. Above TCP are a few other layers, some you may have heard of.

On the top of the hill of the OSI model is the Application Layer. Here sits HTTP, the granddaddy of them all. This is where you can finally tell another computer to send you an HTML document. But this is not the end. Above HTTP sits your web browser or your web server, and ultimately, your application. When the user types in your website’s URL, they are interacting at a level where they don’t even know HTTP exists.

This is a long detour, but I took it for a purpose. See, at each of these layers, there is an abstraction going on that allows all the complexities of the lower layers to be bottled up. This frees the architects, engineers and programmers working at higher levels to write the simplest code possible that is still robust and efficient. If you had to worry about what kind of internet connection the client has at the HTTP level, for instance, sending different data if they were on WiFi or if they had an ethernet cable, or if they were communicating over a cable modem or DSL, why, your job would be impossible.
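To make the envelope idea concrete, here is a toy sketch in Python. It is not how any real network stack is written; the `Envelope` class, the header fields and the addresses are all made up for illustration. The only point is that each layer wraps whatever the layer above hands it and never looks inside.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """One layer's wrapper. The payload is opaque to the layer that adds it."""
    layer: str
    header: dict
    payload: object


def send(http_request: bytes) -> Envelope:
    # Each layer adds its own header and treats everything above as a blob.
    # All header values here are invented placeholders.
    segment = Envelope("transport", {"seq": 0}, http_request)
    packet = Envelope("network", {"dst_ip": "203.0.113.7"}, segment)
    frame = Envelope("data link", {"dst_mac": "02:00:00:00:00:01"}, packet)
    return frame


def receive(frame: Envelope) -> bytes:
    # The receiving side peels the envelopes off again, outermost first.
    unit = frame
    while isinstance(unit, Envelope):
        unit = unit.payload
    return unit


request = b"GET / HTTP/1.1\r\n\r\n"
assert receive(send(request)) == request
```

Swap the "data link" envelope for a different one and neither the packet nor the HTTP request inside it changes, which is the whole benefit of layering.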

When I see people using HTTP as if it were part of the app that they were writing, I want to scream. What happens when HTTP is replaced with something better that doesn’t have HTTP codes in it? I mean, we could be using IP packet headers to send our error codes, but we don’t, because we know that IP can change (and indeed, it is changing, from IPv4 to IPv6!). Or rather, I believe REST developers don’t use packet headers because the authors of the software that implements IP have packaged everything up so nicely that they don’t even know it exists.

Now, sitting atop HTTP, REST developers see the nuts and bolts and say, “Gee, I’d like to use some of that for my own purposes.” No, this is bad behavior. Just because you have knobs and levers doesn’t mean you should pull them. You should use the simplest subset of features you can get away with, and leave the knobs and levers to be pulled by those who truly understand what they are for and how they work.

The final criticism of REST is that it is object-centric. To programmers in languages like Java, this seems natural, even elegant. But to everyone else, it is horrifying and difficult. See, it has been proven that the fundamental element of programming is the function. Meaning, if you have functions alone, you have everything you need to write any type of program. Objects, on the other hand, are not sufficient. Only when you wrap functions into objects (which we call methods) do they become powerful enough to write any type of program. You can create objects with functions, but you don’t have useful objects without them.

I understand why REST is object-centric. HTTP was originally written to be a document store, and documents are a kind of object. We have since repurposed HTTP to be far more than a document store, and browsers to be far more than document fetchers and readers, but that is what it was built to be. As we’ve repurposed HTTP, we’ve left behind certain features that a document store might need. But a document store is not enough to build a full-fledged application. It is only part of an application.

Had we done things right, we might have created a new protocol for building apps on the internet. Indeed, you could say that internet architects are hard at work developing that protocol right now. Unfortunately, they are hamstrung because they have to build it on top of HTTP, because there are so many people who built their application not on top of HTTP, but deeply integrated with it. I can explain why this happened, but that is another story. There was a time when people were inventing new Application Layers to suit their needs, a time when SMTP and HTTP and more were brand new. If we want a new way to communicate over TCP/IP, we should be inventing new protocols, not torturing existing protocols.

Here is an important example of why having an object-centric API doesn’t work. Consider how you might fetch a customer record. You would access /customers/234 for customer 234, right? Well, what should that return?

  • The customer might want to see all of their information, even the bits which he wouldn’t want to share with anyone else, including the company.
  • A Customer Service Representative (CSR) would like to see the history of all communications with that customer, as well as all actions taken on that customer account. Maybe there is an additional REST URL like /customers/234/history? How would the CSR filter out which history he’d like to see? /customers/234/history?from_date=20150101&type=csr_interactions maybe.
  • The person who ships the package to the customer is only interested in their name and address. They might want to know if there are any special flags, such as holds on shipping to the customer, or maybe flags that indicate there should be an email or text message sent when the package is shipped.

As you can see, the idea of “customer” is different for different people in different roles. Some would like some data, others would like other data. How do you specify which data you need? You have to add additional parameters to the REST protocol, and things start to get really messy. What happens when you want to move all the queries for full customer information to one class of server, and the queries for customer interactions to a different cluster? Do you set up a fleet of servers just to reverse proxy the requests based on the URL, or do you set up different services entirely?

REST does not give an answer to this, and indeed, trying to make things “RESTful” only makes the complicated situations far more complicated than they would have otherwise been.

Now, my proposals.

We live in a world where HTTP is the de facto communication standard. People run their apps in browsers, and the browsers only know how to speak HTTP. WebSockets exist (which behave much like basic TCP/IP connections) but for the same reasons that people were forced into HTTP, WebSockets are unlikely to succeed. It seems that we have to plug all of our applications into a single port (443) and it seems we have to stick to a single protocol because too many people do not understand the full power of the internet. (Maybe games programmers can save us. They’re the last hope for building an internet on HTTP.) So my first proposal is to stop using HTTP except as a document store. We should make “web server” synonymous with some other protocol, and leave “HTTP server” to be a special kind of server that only stores, retrieves, and updates documents.

However, even though we cannot decouple our apps from HTTP yet, we CAN start writing apps that are HTTP-unaware. Here’s what such an app would look like.

  • The app code would be retrieved from some static source. This would contain all the instructions on how to start the app, including any libraries needed to bootstrap the app. It needs to be static so that it can be cached. As much of the app as possible must be static for this reason.
  • App data would be retrieved over an API layer.

That’s pretty much it.

Now, HTTP fits the bill for retrieving the static app data to a T. It is the perfect solution for this, provided that the data is truly static. Given the demand for things like CDNs, this is rapidly coming true. If you can’t store your app in a CDN, it’s not static enough and you need to make it that way. But keep in mind that if someone came up with a better way to store and retrieve static documents, you should be able to quickly and easily port to that system because you only had static files in your HTTP servers! And don’t think it isn’t coming!

The benefit of having a static file be the base of your app is that it can be cached, distributed, shared, etc. at no cost to you. We’ve seen things like BitTorrent excel at things like this. If your app can’t be distributed through BitTorrent, it’s no good to you or anyone else. When you wake up one morning and half the world wants your app, if it’s not static and distributed, you’re going to have a very bad day. This need is going to drive us away from HTTP one day. We’re already publishing our apps by uploading them to the Google and Apple stores. HTTP is going to disappear just like so many protocols before it disappeared!

The API, on the other hand, should exhibit the following features:

  • Function, not object, based.
  • Functions take any kind of parameters in any configuration. (Python is a good model for this. Many languages are similar.)
  • Functions may return a result (a blob of any kind of data) or raise an exception (which is also any kind of data.)
  • A session which would contain the global or dynamic context in which the function should operate.

That’s pretty much all the API has to do. Anything on top of this is unnecessary complexity.
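Here is a minimal sketch of what those four features could look like, reusing the customer example from earlier. Everything in it is an assumption for illustration: the JSON message shape, the `api` decorator, `handle`, `ApiError`, the `customer_shipping_info` function and the in-memory `CUSTOMERS` table are names I made up, not part of any existing library. Note that nothing in it mentions HTTP.

```python
import json

REGISTRY = {}  # function name -> callable


class ApiError(Exception):
    """Application-level failure carrying arbitrary JSON-serialisable data."""

    def __init__(self, data):
        super().__init__(data)
        self.data = data


def api(func):
    """Expose a plain function through the API under its own name."""
    REGISTRY[func.__name__] = func
    return func


# Hypothetical data store standing in for a real database.
CUSTOMERS = {
    234: {"name": "Ada Example", "address": "1 Sample St",
          "shipping_hold": False, "notes": "prefers email"},
}


@api
def customer_shipping_info(session, customer_id):
    # The session carries the context (here, the caller's role).
    if session.get("role") not in {"shipper", "csr"}:
        raise ApiError({"code": "forbidden", "role": session.get("role")})
    try:
        customer = CUSTOMERS[customer_id]
    except KeyError:
        raise ApiError({"code": "no_such_customer",
                        "customer_id": customer_id}) from None
    # Return only what this role needs, not "the customer object".
    return {key: customer[key] for key in ("name", "address", "shipping_hold")}


def handle(message: str) -> str:
    """Decode a call, run it, and encode either a result or an exception."""
    request = json.loads(message)
    try:
        func = REGISTRY.get(request["function"])
        if func is None:
            raise ApiError({"code": "no_such_function",
                            "function": request["function"]})
        result = func(request.get("session", {}),
                      *request.get("args", []),
                      **request.get("kwargs", {}))
        reply = {"result": result}
    except ApiError as err:
        reply = {"exception": err.data}
    return json.dumps(reply)
```

A call is then just a message such as `{"function": "customer_shipping_info", "session": {"role": "shipper"}, "args": [234]}`, and the reply holds either a `result` or an `exception` key. A CSR-oriented history query would be another function with its own parameters rather than another URL.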

Now, some of the API calls return static data. Those should be called documents and stored in HTTP, and retrieved not through the API but through HTTP. So don’t put static documents in your API; keep them separate and have your API point the client to them.

WebSockets can be the foundation for this API. We can fall back to communicating over HTTP, but we must be very careful not to tie our APIs to HTTP. Meaning, we should be able to switch the API to use WebSockets, raw TCP/IP, or HTTP, with the flick of a switch.
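One way to get that switch, sketched with nothing but the Python standard library: keep the API core as a function from bytes to bytes, and make each transport a thin adapter around it. The names here (`handle`, `InProcessTransport`, `HttpTransport`, `serve_http`, `remote_add`) and the toy `add` function are assumptions for the sake of the example. A WebSocket or raw TCP adapter would have the same shape; I left them out only because the standard library has no WebSocket support.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle(message: bytes) -> bytes:
    """Transport-independent API core: bytes in, bytes out."""
    request = json.loads(message)
    if request["function"] == "add":
        reply = {"result": sum(request["args"])}
    else:
        reply = {"exception": {"code": "no_such_function"}}
    return json.dumps(reply).encode()


class InProcessTransport:
    """No network at all; handy for tests."""

    def call(self, message: bytes) -> bytes:
        return handle(message)


class HttpTransport:
    """HTTP used purely as a pipe: one POST, body in, body out."""

    def __init__(self, host, port):
        self.host, self.port = host, port

    def call(self, message: bytes) -> bytes:
        conn = http.client.HTTPConnection(self.host, self.port, timeout=5)
        try:
            conn.request("POST", "/", body=message)
            return conn.getresponse().read()
        finally:
            conn.close()


class _ApiHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        reply = handle(body)
        # Always 200: success or failure lives in the reply body,
        # so the API never depends on HTTP status codes.
        self.send_response(200)
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        pass  # keep the example quiet


def serve_http():
    """Start the HTTP adapter on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), _ApiHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_add(transport, *numbers):
    """Client code: identical no matter which transport it is given."""
    message = json.dumps({"function": "add", "args": numbers}).encode()
    reply = json.loads(transport.call(message))
    if "exception" in reply:
        raise RuntimeError(reply["exception"])
    return reply["result"]
```

The client calls `remote_add(InProcessTransport(), 1, 2, 3)` or `remote_add(HttpTransport("127.0.0.1", port), 1, 2, 3)` and cannot tell the difference, which is the property the paragraph above asks for.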

So, build your application on HTTP today, but build it in such a way that it doesn’t have to run on HTTP. That way, when the future comes, you’ll be ready. And you’ll also be building something that others can build upon reliably.

In conclusion, I wanted to show you why REST is not a good solution. I wanted to teach you what we can replace REST with. I don’t give any specific guidance on what the future should look like, just broad pointers on where it can and probably should be.

I do appreciate comments, positive or negative. Keep in mind that ad hominem, red herring and other logical fallacies should not be employed. If you don’t know what those are and why it is bad to use them, you should go study up on logic and logical fallacies before commenting on anything ever again.

2011 Predictions for Python January 13, 2011

Posted by PythonGuy in Pylons, Python, Ruby, SQLAlchemy.

I guess you’re supposed to make predictions in January. Here’s mine.

At the end of 2011, Python 2 will still be more popular than Python 3, but some important projects will migrate to Python 3 anyway.

Sometime in 2011, a lot of Ruby coders are going to wake up and start playing with something else. Not that Ruby’s bad, it’s just not the best. And it has some serious issues.

Sometime in 2011, Guido van Rossum will say at least one really wise thing that will completely redefine how we look at some aspect of day-to-day programming. Maybe two.

Sometime in 2011, PyPy will get a version of Python that runs faster than CPython.

Sometime in 2011, serious discussion will be had about whether we should all program in the PyPy dialect—just in case it does take off.

Sometime in 2011, Pyramid will be released and probably ignored for a long time. Who knows? Maybe it will be better than Pylons. Maybe not. It’s not really important because there won’t be a huge reason to shift to it. Regardless, the team developing Pyramid is going to learn a lot about the problems they were trying to solve, and propose something even shinier and better to solve all the problems they tried to solve in the past and a whole lot of new problems they invented because they are actually from the distant future.

Sometime in 2011, SQLAlchemy will become so intelligent it may actually be considered as a running mate in the 2012 presidential elections.

Sometime in 2011, someone’s going to give example code for their research paper in Python. And someone else is going to read that paper and understand it even though they don’t know Python. And then they are going to go through the rest of the month thinking that Python is really this cribbage people use for writing research papers. When someone finally points out it’s a real language and the example code was actually production code in a working system, he’s going to do a double-take and spit his coffee all over. It’ll be a mess.

Python is Not a Stepping Stone to Lisp June 13, 2010

Posted by PythonGuy in Java, Lisp, perl, Python, Ruby.

I’ve dabbled a bit in Lisp-land. I left frustrated and annoyed. Not with the language, per se, but more so with the community and its support, or rather, the lack thereof. I’ve also taken up full-time residence for a number of years in C and C++ land. I’ve tinkered in quite a few languages and today, I’m forced to write in Java. (Thanks, Java. My project is now a full two months late because your memory management sucks and I cannot do proper caching.)

Brian Carper is an ex-pat of the nation of Ruby now strangely finding residence in the nation of Clojure. He talks about why Ruby is a natural step towards Clojure, and unwittingly exposes Ruby’s fatal flaws, flaws which I find simply abhorrent. One day we’ll be reading about Brian Carper’s adventures in some other language, wherein he discovers, after all, that previous language X had it all wrong to begin with.

For reasons I cannot imagine, he hasn’t tried Python out, at least enough to find it satisfying all the weaknesses of all the languages he has tried before. Yes, super() is not, and bad Python sucks as bad as bad anything. Python isn’t a trivial language to master, although with no foreknowledge you can get pretty deep into Python without realizing it.

Anyway, perhaps this bit of arguing will help him see the error of his ways. I like to classify languages into two categories: great languages and terrible languages. There really is no middle ground.

Great languages are languages designed to solve a problem and that subsequently solve that problem. C went about trying to provide some kind of structure to assembler without getting too far away from assembler, and succeeded brilliantly. Lisp set out to prove that a language built on pure mathematics can solve the world’s problems and do it quickly and it succeeded wildly. Perl set out to show that “scripting” in a higher level language can actually make some problems really easy to solve.

Terrible languages set out to solve a problem and fall short. These are languages like Ruby and Java and pretty much everything out there except a few languages.

When you finally realize what makes a great language a terrible language, you have reached a certain level of understanding. It’s like waking up one morning and seeing, for the first time, that Lincoln probably had body odor!

So all languages are terrible. Even Python. Python sucks, a lot! I mean, I had to fight with the dang thing for hours because I happened to name a script “mymodule.py” in my bin path and it wasn’t picking up the right package path! We have a name for these things in the Python community (warts), and we show them off like trophies, waving them proudly and emphatically in front of programmers new and old.

It’s really odd seeing a language community proudly and boldly declare what their language is terrible at. It’s even more odd to see them do it with a smile.

What makes one language better than others is that it sucks less. When you compare the long list of Python’s failures with the long list of failures of every other language out there, you’ll quickly see that Python isn’t too bad. In fact, it’s kind of nice.

We don’t have to weigh our benefits against our costs. We know that every language has some huge benefits.

What we do assert, however, is that our costs are much less than other languages’ costs, and so you’ll end up with Python because we suck less.

In the end, all languages only provide one benefit: Helping you get your program correct.

On Ruby February 17, 2010

Posted by PythonGuy in Python, Ruby.

Some Pythonistas are glorifying Ruby. It seems Northwest Python Day 2010 has opened the eyes of some to what Ruby really is. (Psst—it’s Lispy.)

Alright folks, it’s time to come clean with Python Guy.

Haven’t you already studied Lisp and Scheme? Don’t you already know about Haskell and Smalltalk?

If you don’t, you’re not really a Python programmer.

I look at Ruby with disgust and pity. I pity that so many people get trapped in its siren song. I am disgusted that it tries to implement so many nice features in the wrong way.

Python and Ruby are not similar, any more than C and Java and Perl are.

Python is all about doing things right. If your code is good, then everything else is good. Programming languages shouldn’t make your job any harder.

Ruby is all about doing things—at all. If you can somehow get things to work, then perhaps you did it the right way, but who can tell? It works, doesn’t it?

TIMTOWTDI is a bane to all good programmers. Yes, there are many ways to do things, but only one of them can be considered “correct”. Even in math, although you can solve the same problem in a number of different ways, one of the ways is superior to the others. When programming, your goal is to get as close to that superior solution as possible with as little work as possible. Your goal is to minimize effort on the part of everyone involved, including anthropomorphic processors, memory devices, and network connections.

The very fact that things like Rake and RSpec and monkey patching exist means someone is doing something horribly wrong.

Concerning Rake, why on earth is it so hard to build your package that you need an entirely different programming language to describe how to put the thing together? Data structures should be enough. (Cf. Lisp, Scheme.)
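As a rough illustration of what “data structures should be enough” might mean, here is a build described as a plain dictionary, with the ordering worked out by Python’s standard `graphlib` module (Python 3.9 or later). The target names and the `build_order` helper are invented for the example; this is a sketch of the idea, not a replacement for any real build tool.

```python
from graphlib import TopologicalSorter

# The whole build is data: each target lists the targets it needs first.
BUILD = {
    "package": {"needs": ["compile", "docs"]},
    "compile": {"needs": ["fetch_deps"]},
    "docs": {"needs": []},
    "fetch_deps": {"needs": []},
}


def build_order(spec):
    """Return the targets in an order that respects every 'needs' entry."""
    graph = {target: set(rule["needs"]) for target, rule in spec.items()}
    return list(TopologicalSorter(graph).static_order())


order = build_order(BUILD)
assert order.index("fetch_deps") < order.index("compile") < order.index("package")
```

The description says nothing about how to run the steps; that part is left to whatever generic machinery reads the data.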

Concerning RSpec, if you’re not programming at the descriptive level, with occasional bouts of literalism, you’re doing it wrong. (Cf. Lisp, Scheme, Prolog, Haskell, and pretty much every other programming language not based on ALGOL.) Again, your program should be a data structure describing the things the program is supposed to do. The hard bits of figuring out how to do those things should be left to smarter (or simply less busy) programmers than yourself.

Concerning monkey patching, your API is insufficient if people have to monkey patch to get stuff done. Why on earth is that seen as a good thing—programming to an API that the provider doesn’t explicitly expose? Rather, build a common system that exposes the common bits that people need to get at.
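The same point in Python terms, as a small sketch. Both classes are hypothetical stand-ins for a third-party library: `Mailer` offers no extension point, so the only way in is to overwrite a private method, while `HookedMailer` exposes the common need as a documented parameter.

```python
class Mailer:
    """Hypothetical library class with no extension point."""

    def _subject(self, text):
        return text

    def send(self, text):
        return "Subject: " + self._subject(text)


class HookedMailer:
    """The same hypothetical library with the common need in its API."""

    def __init__(self, format_subject=lambda text: text):
        self.format_subject = format_subject

    def send(self, text):
        return "Subject: " + self.format_subject(text)


def tag(text):
    return "[shop] " + text


# Monkey patching: replace a private method for every Mailer in the
# process, relying on a name the provider never promised to keep.
Mailer._subject = lambda self, text: tag(text)
patched = Mailer().send("Your order shipped")

# Exposed hook: the same result through a supported parameter,
# affecting only this one instance.
hooked = HookedMailer(format_subject=tag).send("Your order shipped")

assert patched == hooked
```

If the provider renames `_subject` in the next release, the first version breaks silently and the second does not.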

I am a Python Guy because Python is really, really good for programming. Yes, by definition it has to be different from Ruby and Perl and C and Java and all those other crummy languages. By the same token, it has to be different from Lisp and Scheme and Haskell and Smalltalk, which provide superior features but a difficult-to-use environment. Python strikes the right balance between making things easy for the programmer and doing things the right way.

Folks, about ten years from now, a whole lot of Ruby programmers will be missing. They will have left Ruby for Python, or whatever new language that will be created to replace Python’s niche. Python has that odd habit of bringing in refugees from about every other programming language while driving few, if any, out.

Those who know the most about programming as an art and science are naturally drawn to Python. When old Lispers go to Python to end their career, that’s saying something really important. When MIT switches its curriculum from Scheme to Python, you’ve got to see there’s a reason for that.

Anyway, enough ranting.

By the way, the reason why Python programmers weren’t programming during the talks? Because Python isn’t the type of language that consumes massive amounts of programming time just to get things done. Ruby, on the other hand, requires significant investments of time to simply get the easiest things working.