RabbitChat - Chat Server in Python using Tornado, SockJS & RabbitMQ - also talking about TCP sockets, WebSockets & the Python GIL

ANIRBAN ROY DAS
May 20, 2016


Chat is eating the world! If you are not convinced, then I am going to drop some names and try to lure you into believing me anyway.

Have you heard of WhatsApp, Snapchat, Messenger, WeChat? They are the giants that rule the world at present. They are like the kings of startups, and every other guy who wants to start their own startup is probably dreaming of these giants' success stories.

Facebook M (Personal Assistant)

Still no? Have you heard of the sudden phenomenon that started with one company called Magic, the one-stop messaging app that set off a wave of similar companies built on the idea of a personal assistant app? Yes, this idea has taken over the world. Being in India, we know for certain that there are close to 50 companies trying to do this. Chat is their Asgard. Their holy place. Without it they cannot do anything. Their whole business is based on the fundamentals of chat. Even the mighty Facebook has announced its candidacy on this personal assistant runway.

See, I just made chat and instant messaging cool for you. I made it important for you. Now you definitely need to read on and later go ahead and try to start up a new company based on it.

Note

I generally write my blog posts with a certain wit, in words as simple as possible and without the tech jargon, just to stay on the lighter side. Though I intend to do so for all my posts, some of them deal with a real discussion of the project that I am blogging about. In those posts, I will try to be more technical and talk from a technical perspective, with code examples and diagrams.


Introducing RabbitChat

Today's post is one such post, where I would like to talk about a project that I recently did. It's a (guess what?) chat server. It's written in Python (CPython). It uses Tornado to handle the networking and web requests. It does instant messaging over WebSockets using the SockJS library. It also uses RabbitMQ as a message broker to queue and deliver messages over the AMQP protocol. Pika is the Python AMQP client library that is used to talk to RabbitMQ from Python.

That's the gist of the tech specs. I will talk about each of them in detail, one by one. For the impatient ones, go ahead and play with it already. GitHub link.

Let’s do the engineering now

Here is a diagram that I drew on my whiteboard while trying to engineer the system. It may be hazy and disgusting, and my apologies for not coming up with anything better. But I always like the concept of keeping it raw. Raw means the exact brainstorming that went into an idea or a project. Also, I was too lazy to draw diagrams in some word processor. So please bear with me. I will try to come up with more professional diagrams in my future blog posts.

So what we see is a high-level diagram of the entire system. At the top are the web browsers, or web clients: B1, B2, B3, B4, etc. These boxes, the clients, are outside the dotted black circle.

The dotted circle is the backend, the server side. In the backend there are two major groups. The one on top is the Tornado server. All web requests hit Tornado first. Some of them, like API requests, auth requests, etc., are answered directly by Tornado. The instant messaging requests are first sent on to the lower box; the lower box responds with some data to the upper box, i.e. the Tornado server, and Tornado then sends the appropriate response back to the clients.

Now, let's talk a little more about the lower box. The lower box is our RabbitMQ message broker. We will now talk about each part in detail, try to understand the entire workflow and also see some code examples.

Tornado

This is not the tornado we keep hearing about in the news. It's a Python web and networking library developed by the folks at FriendFeed, which no longer exists, but the project continues to grow as an open source project on GitHub.

Tornado is like a unicorn in the Python world. It is the Python community's answer to Node.js, and they stand proud in representing Tornado, as it works like Node with a similar single-threaded event loop model. That model is what makes both of them highly concurrent, and it is the reason for the buzz. The only difference is that Tornado is a Python library, which means it runs on a Python VM, e.g. the CPython VM, whereas Node.js is itself a runtime that runs the JavaScript language. The two aren't really comparable like for like, and Node.js is unique because nobody can explain it to you directly. Most people end up comparing Node.js with a language counterpart like Python, Ruby or Java (which is funny, since JavaScript is the language and not Node.js, but hey! you know what we computer science engineers are talking about). Python is a very human-friendly language and easy to understand. You can become a Python pro in much less time than with other languages like Java (sigh…!).

But Python has its drawbacks. It makes everything easy to work with up front, but it does so at the expense of being pretty slow at computational work behind the scenes. Compared with statically typed, compiled languages like C/C++, Java (and JVM siblings such as Scala) and Go, it is slower by many times on CPU-heavy code. Since Python is dynamically typed (which makes work easy for developers), it has to deal with run-time type checks, extra memory allocations and layers of code before it actually makes the real system calls or C calls (for CPython). It also has the deadly GIL (Global Interpreter Lock), which is present in a few other language implementations too, like Ruby's, but is talked about more widely and loudly in the Python community than anywhere else.

To give you some perspective on whether a GIL exists in other languages, please follow these links and amaze and educate yourself.

GIL

The GIL is what brings Python to its knees. The GIL is a global lock held at the interpreter (the Python process) level. It doesn't allow more than one thread to execute Python code at the same time. Let's see what this actually means.

It means that when Python is running and processing something, it is using CPU cycles and the Python process is executing some code in a thread. That thread holds the GIL, and no other thread can execute Python code while it holds it. Now suppose the process creates another thread to run part of the code. Unfortunately, the two threads cannot run in parallel, even if the system you are running on has multiple CPU cores; the interpreter makes them take turns, handing the GIL back and forth. This is just the GIL. The evil GIL. But if a thread is blocked on I/O, the GIL is released and other threads can run. C extension code can also release the GIL explicitly while it does work that doesn't touch Python objects. It's pretty weird, right? But it is what it is. The reason why, and how this is done, is beyond the scope of this post. Maybe sometime in the future.

Now if Python is so slow and it also has the evil GIL, how come people still code in Python, how come it's one of the most loved and favoured languages to work in, and how are there so many startups that prefer Python to Java, C++, etc.?

Folks, the thing is, most of the startups or projects that we see nowadays are web based, and all of them require some sort of networking. And networking means more I/O and less CPU-bound work. And as I mentioned, the Python GIL is released while doing I/O. Waiting or blocking on I/O in Java or C/C++ takes just as long as waiting in Python, so why not wait in Python and enjoy its human-friendly, easy-to-code syntax?
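To make the "GIL is released while doing I/O" point concrete, here is a minimal sketch. It assumes that time.sleep() is a fair stand-in for a blocking network call (it blocks, and CPython releases the GIL while it does); the function names are made up for this illustration and are not part of RabbitChat.

```python
import threading
import time


def fake_io(seconds):
    # time.sleep() blocks the way a slow network read would. CPython
    # releases the GIL while the thread is blocked, so others can run.
    time.sleep(seconds)


def run_in_threads(n_threads, seconds):
    """Start n_threads waiting threads and return the wall-clock time taken."""
    threads = [threading.Thread(target=fake_io, args=(seconds,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_in_threads(4, 0.2)
    print(f"4 threads, 0.2 s of waiting each: {elapsed:.2f} s in total")
```

The four waits overlap, so the total should come out close to 0.2 seconds rather than 0.8. Swap the sleep for a tight CPU loop and the threads would have to take turns instead.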

Here are a few links that will really take you from null to master on the GIL and Python's threads and processes:


Why Tornado? Why not Django or Flask?

Guys, please don’t do this. If I say one of them is better than the other, there will be task forces from each community behind me. But let me give you some perspective. And I will try to explain my slight bias toward one with much humility.

At the end of the day, every server has two things. One, a number of raw TCP sockets listening on some ports at an IP address. Two, a protocol to communicate with the client. An HTTP server is a server that uses the HTTP protocol to communicate with the client, and by convention it listens on port 80.

In general, when we start an HTTP server, we are actually creating a TCP socket that listens on port 80 at a particular IP address. The server.listen() or similar command is just sugar for the entire process of creating a socket, binding it and listening on it.

Django and Flask are web frameworks that help you do the above work in a much more abstract way. They give you an HTTP server and also a library, or a bunch of modules, to accept, listen to, process and respond to HTTP requests. Simple. Well, Tornado does almost the same thing.
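Here is a rough sketch of what that sugar hides, using only Python's standard socket module. It is an illustration rather than how Tornado or any framework actually does it: the helper names are made up, the "HTTP server" answers exactly one request, and it binds to port 0 (let the OS pick a free port) because port 80 normally needs elevated privileges.

```python
import socket
import threading


def open_listening_socket(host="127.0.0.1", port=0):
    """The three steps that a server's listen() call wraps up."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # 1. create
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))                                   # 2. bind
    sock.listen(5)                                            # 3. listen
    return sock


def serve_once(server_sock):
    """Accept one connection and answer it with a minimal HTTP response."""
    conn, _addr = server_sock.accept()
    with conn:
        conn.recv(1024)  # read the request (ignored in this sketch)
        body = b"hello"
        head = ("HTTP/1.1 200 OK\r\n"
                "Content-Length: %d\r\n"
                "Connection: close\r\n\r\n" % len(body))
        conn.sendall(head.encode("ascii") + body)


def fetch(host, port):
    """Tiny client: send a GET request and read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        chunks = []
        while True:
            data = client.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    server = open_listening_socket()
    port = server.getsockname()[1]  # the port the OS picked for us
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    response = fetch("127.0.0.1", port)
    worker.join()
    server.close()
    return response


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

HTTP is just the text that travels over the socket; everything below it is plain TCP.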

The difference is Tornado is a single threaded and evented server. Wait, wait, don’t worry. I am here to explain this.

When a Django or Flask app starts, it starts a process, and when you send it a request, a worker in that process handles the request from start to finish. The model is synchronous: if the worker blocks on something, further requests to that worker may be delayed until it is done. uWSGI and Gunicorn are Python WSGI servers that help run Django or Flask apps in a production environment by forking multiple worker processes, so that several HTTP requests can be served at once, each by one of these processes.

But in Tornado, every request is served by a single process. It works in an evented model, where each request that is waiting on I/O waits on some kind of polling mechanism like epoll (Linux) or kqueue (BSD and Mac OS X) if it is available, falling back on select() otherwise. This allows much higher concurrency.

To know more, here are a few links; they are a really good read.
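The evented model can be sketched with Python's standard selectors module, whose DefaultSelector picks the most efficient polling mechanism available on the platform, with select() as the fallback. This is a toy loop written for this post, not Tornado's IOLoop: it uses socket pairs instead of real network clients, and the names run_echo_loop and demo are made up.

```python
import selectors
import socket


def run_echo_loop(server_ends, expected_messages):
    """Serve every socket in server_ends from one thread, event by event."""
    sel = selectors.DefaultSelector()  # epoll, kqueue, ... or plain select()
    for sock in server_ends:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)

    handled = 0
    while handled < expected_messages:
        events = sel.select(timeout=5)  # sleep until some socket is readable
        if not events:
            break  # nothing arrived in time, give up
        for key, _mask in events:
            data = key.fileobj.recv(1024)
            if data:
                key.fileobj.sendall(data.upper())  # "process" and respond
                handled += 1
    sel.close()


def demo():
    # Each pair is one connection: a client end and a server end.
    pairs = [socket.socketpair() for _ in range(3)]
    clients = [client for client, _server in pairs]
    servers = [server for _client, server in pairs]

    for i, client in enumerate(clients):
        client.sendall(f"hello from b{i + 1}".encode())

    run_echo_loop(servers, expected_messages=len(clients))

    replies = [client.recv(1024).decode() for client in clients]
    for sock in clients + servers:
        sock.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

One thread serves all three connections because it only ever touches a socket that the selector has reported as ready, so it never sits blocked on a single slow client.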

RabbitMQ, AMQP and Pika

AMQP (Advanced Message Queuing Protocol) is a protocol for passing messages between applications through a broker. It's fast and it's tailor made for message communication. RabbitMQ is a message broker, a message queue used to queue and deliver messages. RabbitMQ talks using the AMQP protocol, so any client that wants to communicate with RabbitMQ should talk AMQP as well. Pika is a client library used to talk to RabbitMQ using the AMQP protocol.

WebSocket and SockJS

WebSocket is a protocol that runs over TCP. A connection starts out as an HTTP request that gets upgraded to a WebSocket, and it then stays open so that either side can send a message at any time. That is unlike plain HTTP, which is a simple request/response protocol where the server only speaks when it answers a request.

WebSockets are the best tool for instant messaging, where the connection needs to remain active during the chat session. SockJS is a library that helps you talk over the WebSocket protocol (or falls back to other AJAX, long-polling or Comet-based communication methods if WebSocket is not available in the browser). The communication is between a browser client and a server that has a WebSocket implementation, i.e. one that is ready to accept WebSocket requests just like it accepts HTTP requests.
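To show how close the start of a WebSocket connection is to HTTP, here is a small sketch of the server side of the opening handshake as described in RFC 6455. The browser sends a Sec-WebSocket-Key header, and the server proves it understood the request by hashing that key together with a fixed GUID from the RFC. The function names are made up for this illustration; SockJS and Tornado do all of this for you.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    raw = (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    return base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")


def handshake_response(sec_websocket_key):
    """Build the HTTP 101 response that switches the connection over."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    # Sample key taken from RFC 6455, section 1.3.
    print(handshake_response("dGhlIHNhbXBsZSBub25jZQ=="))
```

After this one HTTP exchange the same TCP connection stays open, and both sides switch to sending WebSocket frames.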

The Workflow

If you look again at the figure that shows the high-level overview, it all starts with the web browsers (the clients). A browser sends the first WebSocket request using the SockJS JS library on the client side; Tornado accepts it and opens a WebSocket connection between the browser (client) and itself (server).

Tornado itself starts an independent AMQP connection with RabbitMQ, using the pika library, for each active WebSocket connection it creates. This creates a whole message flow passage, from the client to Tornado to RabbitMQ and back to the client.

Note that RabbitMQ is not exposed to the clients directly, only via Tornado. This is important from a security point of view.

Now let's do a dry run of people chatting using this workflow. When a user wants to chat in a public chatroom, first the WebSocket connection is established, then Tornado establishes an AMQP connection with RabbitMQ. In this process, Tornado creates a pika channel, uses the channel to talk to an exchange, and binds a queue to this exchange with a particular topic. The topic determines which of the messages passing through the exchange are routed to that queue; a message that matches no queue's topic is discarded.

What we are using here is the publish/subscribe mechanism. A client subscribes to a channel, which is determined by a topic, and when some other client publishes a message whose topic matches that pattern, the exchange passes the message on to the subscribed clients via the queues that are attached to the exchange.

This way multiple users/clients can subscribe to a public channel, and every time a client sends a message to the public channel, everybody subscribed to the channel receives it.

Similarly, if two persons/clients want to talk privately, each of them subscribes to the other's channel, and hence when anybody sends a message with a topic matching the topic of the receiver's attached queue, the receiver receives the message.

There are mechanisms like acknowledgements, which can help you determine things like whether a message has already been read or not.

Then there are ways, tricks really, of using the WebSocket connection to show the typing indication. All of these are implemented in this project.
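The pattern matching that a topic exchange performs can be sketched in a few lines of plain Python. In AMQP, topics are words separated by dots; in a binding pattern, `*` stands for exactly one word and `#` for zero or more words. The function below is my own illustration of that rule, not RabbitMQ's implementation, and the topic names in the examples are hypothetical rather than the ones RabbitChat uses.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style binding pattern matches a routing key."""

    def match(p, k):
        # p and k are the remaining pattern words and key words.
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' matches zero words, or one word and then tries again.
            return match(rest, k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


if __name__ == "__main__":
    print(topic_matches("chat.public.*", "chat.public.lobby"))    # True
    print(topic_matches("chat.public.*", "chat.private.alice"))   # False
    print(topic_matches("chat.#", "chat.private.alice"))          # True
```

A queue bound with "chat.public.*" would therefore receive every message published to any public room, while a queue bound with a full, wildcard-free topic only receives exact matches.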
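The public and private cases can be played through with a tiny in-memory stand-in for the broker. This is only a sketch under loud assumptions: TinyExchange is a made-up class, not pika or RabbitMQ; it matches topics exactly (no wildcards); and the topic names "public.lobby" and "private.<user>" are hypothetical.

```python
from collections import deque


class TinyExchange:
    """In-memory stand-in for an exchange with exact-match bindings."""

    def __init__(self):
        self.bindings = {}  # topic -> list of queues bound to it

    def bind(self, queue, topic):
        self.bindings.setdefault(topic, []).append(queue)

    def publish(self, topic, body):
        # Copy the message into every queue bound to this topic.
        # A message whose topic has no binding is simply dropped.
        for queue in self.bindings.get(topic, []):
            queue.append(body)


def demo():
    exchange = TinyExchange()
    queues = {name: deque() for name in ("alice", "bob", "carol")}

    for name, queue in queues.items():
        exchange.bind(queue, "public.lobby")     # everyone joins the room
        exchange.bind(queue, "private." + name)  # plus a personal topic

    exchange.publish("public.lobby", "hi all")   # reaches all three
    exchange.publish("private.bob", "psst, bob") # reaches only bob
    exchange.publish("private.dave", "anyone?")  # no binding: dropped

    return {name: list(queue) for name, queue in queues.items()}


if __name__ == "__main__":
    for name, inbox in demo().items():
        print(name, inbox)
```

In the real system each of these queues would live inside RabbitMQ, and Tornado would forward whatever lands in a user's queue down that user's WebSocket connection.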

Go ahead and check out the source code on GitHub, or install the app from PyPI using pip install rabbitChat. Let me know if you have any doubts or suggestions.

Documentation for the project is hosted here.

I have not included any code walkthrough in detail here. I am planning to do so in a future post. For now, go ahead and get familiar with the project and read about AMQP and how RabbitMQ works. Pretty interesting stuff.


ANIRBAN ROY DAS

I believe in knights and 50 other things. An observer, listener, storyteller, make believer and writes colourful texts on a dark background for a living.