Centrifugo v2: The Future of the Real-Time Server and Its Library for Go

Some readers may have already heard of Centrifugo. In this article, we discuss the development of the second version of the server and the new real-time library for Go that lies at its core.

Last summer we joined Avito’s team, where we now help develop the backend of Avito’s messenger. A new project directly related to the rapid delivery of messages to users, together with new colleagues, inspired us to continue working on the open-source project Centrifugo.

In a nutshell, Centrifugo is a server that takes on the task of maintaining persistent connections from the users of your application. WebSocket or SockJS is used as the transport. If a WebSocket connection cannot be established, SockJS falls back to EventSource, XHR-streaming, long-polling and other HTTP-based transports. Clients subscribe to channels, and the backend publishes new messages into those channels through the Centrifugo API as they arise, after which the messages are delivered to the subscribed users. In other words, it is a PUB/SUB server.
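To make the PUB/SUB idea concrete, here is a minimal in-memory sketch in Go. It is not how Centrifugo is implemented; the `Hub` type and its methods are hypothetical names chosen for illustration, and real clients would be network connections rather than Go channels.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub is a toy in-memory PUB/SUB broker (hypothetical, for illustration only):
// clients subscribe to named channels and every published message is fanned
// out to all current subscribers of that channel.
type Hub struct {
	mu   sync.RWMutex
	subs map[string][]chan string // channel name -> subscriber inboxes
}

func NewHub() *Hub {
	return &Hub{subs: make(map[string][]chan string)}
}

// Subscribe registers a new subscriber on the channel and returns its inbox.
func (h *Hub) Subscribe(channel string) <-chan string {
	inbox := make(chan string, 16) // buffered so Publish does not block in this demo
	h.mu.Lock()
	h.subs[channel] = append(h.subs[channel], inbox)
	h.mu.Unlock()
	return inbox
}

// Publish delivers msg to every subscriber of the channel and returns how
// many subscribers received it.
func (h *Hub) Publish(channel, msg string) int {
	h.mu.RLock()
	defer h.mu.RUnlock()
	delivered := 0
	for _, inbox := range h.subs[channel] {
		select {
		case inbox <- msg:
			delivered++
		default: // inbox full: drop instead of blocking the publisher
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	alice := hub.Subscribe("news")
	bob := hub.Subscribe("news")
	n := hub.Publish("news", "hello")
	fmt.Println(n, <-alice, <-bob)
}
```

The publisher never needs to know who the subscribers are; it only names a channel, which is the property the server builds on.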

Currently, the server is used in quite a few projects. Among them are some Hotmail.com projects; Centrifugo powers a beautiful dashboard at Badoo; and at the spot.im service, 350 thousand users are connected to Centrifugo simultaneously.

We began work on the second version in December of last year and continue to this day. Let’s see what comes of it. We are writing this article not only to popularize the project, but also to get a bit more constructive feedback before the release of Centrifugo v2: right now there is still room for maneuver and for backward-incompatible changes.

Real-time library for Go

In the Go community, the question arises from time to time: are there alternatives to socket.io in Go? Sometimes we noticed developers answering it with advice to look at Centrifugo.

However, Centrifugo is a standalone server, not a library, so the comparison is not fair. We were also asked several times whether the Centrifugo code could be reused to write real-time applications in Go. The answer was: theoretically yes, but at your own risk, because we could not guarantee backward compatibility of the internal packages' API. Clearly, nobody has a reason to take that risk, and forking is a so-so option too. Besides, we would not say that the API of the internal packages was really prepared for such use.

Therefore, one of the ambitious tasks we would like to solve while working on the second version of the server is to extract the core of the server into a separate Go library. We believe this makes sense, considering how many features Centrifugo has that make it production-ready. Many features are available out of the box to help build scalable real-time applications, sparing the developer from writing their own solution. We wrote about these features earlier and will also mention some of them below.

We will try to justify one more advantage of having such a library. Most Centrifugo users are developers who write their backend in languages/frameworks with weak concurrency support (for example, Django/Flask/Laravel/…), where working with a large number of persistent connections is possible, if at all, only in a non-obvious or inefficient way. Accordingly, not all users can help with the development of a server written in Go (simply because they do not know the language). Therefore, even a very small community of Go developers around the library can help the development of the Centrifugo server that uses it.

As a result, we got the Centrifuge library. It is still a WIP, but absolutely all the features stated in the description on GitHub are implemented and working. Since the library provides a fairly rich API, before guaranteeing backward compatibility we would like to hear about several successful examples of its use in real Go projects. There are none yet. Nor are there unsuccessful ones.

We understand that by naming the library almost the same as the server, we will always have to deal with confusion. But we think this is the right choice, since clients (such as centrifuge-js and centrifuge-go) work with both the Centrifuge library and the Centrifugo server. Besides, the name is already firmly entrenched in the minds of users, and we do not want to lose these associations. Still, for a bit more clarity, let us spell it out once more:

  • Centrifuge is a library for the Go language,
  • Centrifugo is a ready-made solution, a separate service, which in version 2 will be built on the Centrifuge library.

Because of its design (a standalone service that knows nothing about your backend), Centrifugo assumes that real-time messages flow from the server to the client. What does this mean? If, for example, a user writes a message to a chat, this message must first be sent to the application’s backend (for example, via AJAX in the browser), validated on the backend side, stored in the database if necessary, and then sent to the Centrifugo API. The library removes this restriction, allowing you to organize a bidirectional exchange of asynchronous messages between the server and the client, as well as RPC calls.
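The backend-side flow described above (validate, store, then publish) can be sketched roughly as follows. The `Store` and `Publisher` interfaces and the `HandleChatMessage` function are hypothetical stand-ins for your database layer and for a call to the server API; they are not part of Centrifugo or Centrifuge.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Publisher abstracts "send the message to the real-time server API".
// Hypothetical interface, not part of Centrifugo.
type Publisher interface {
	Publish(channel string, data string) error
}

// Store abstracts the application's database. Hypothetical as well.
type Store interface {
	Save(channel, text string) error
}

// HandleChatMessage is what the backend does when a user submits a chat
// message: validate it, persist it, and only then publish it to subscribers.
func HandleChatMessage(s Store, p Publisher, channel, text string) error {
	text = strings.TrimSpace(text)
	if text == "" {
		return errors.New("empty message")
	}
	if err := s.Save(channel, text); err != nil {
		return fmt.Errorf("save: %w", err)
	}
	return p.Publish(channel, text)
}

// memory is an in-memory stand-in for both the store and the publisher.
type memory struct{ saved, published []string }

func (m *memory) Save(_, text string) error    { m.saved = append(m.saved, text); return nil }
func (m *memory) Publish(_, data string) error { m.published = append(m.published, data); return nil }

func main() {
	m := &memory{}
	fmt.Println(HandleChatMessage(m, m, "chat", " hello "))
	fmt.Println(HandleChatMessage(m, m, "chat", "   "))
	fmt.Println(m.saved, m.published)
}
```

With the library, the same handler could be invoked directly from a client message callback, without the extra hop through a separate HTTP request.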

Let’s look at a simple example: we will implement a small Go server using the Centrifuge library. The server will receive messages from browser clients via WebSocket; the client will have a text field where you can enter a message and press Enter, and the message will be sent to all users subscribed to the channel. That is, the simplest possible chat. It seemed most convenient to us to publish it as a gist.

You can start as usual:


git clone https://gist.github.com/2f1a38ae2dcb21e2c5937328253c29bf.git
cd 2f1a38ae2dcb21e2c5937328253c29bf
go get -u github.com/centrifugal/centrifuge
go run main.go

Then go to http://localhost:8000 and open several browser tabs.

As you can see, the entry point into the application’s business logic is the callback function attached via On().Connect():


// Called once for every new client connection.
node.On().Connect(func(ctx context.Context, client *centrifuge.Client, e centrifuge.ConnectEvent) centrifuge.ConnectReply {

    // Register a per-client handler that fires when this client disconnects.
    client.On().Disconnect(func(e centrifuge.DisconnectEvent) centrifuge.DisconnectReply {
        log.Printf("client disconnected")
        return centrifuge.DisconnectReply{}
    })

    log.Printf("client connected via %s", client.Transport().Name())
    return centrifuge.ConnectReply{}
})

The callback-based approach seemed to us the most convenient for interacting with the library. Besides, a similar approach, only weakly typed, is used in the Go implementation of the socket.io server. If you have thoughts on how the API could be made more idiomatic, we will be happy to hear them.

This is a very simple example that does not demonstrate all the features of the library. Someone may note that for such purposes it is easier to take a library for working with WebSocket, for example Gorilla WebSocket. That is indeed so. However, even in this case, you will have to copy a decent piece of server code from the example in the Gorilla WebSocket repository.

What if:

  • You need to scale the application to multiple machines,
  • Or you need not one common channel but several, with users dynamically subscribing and unsubscribing as they navigate through your application,
  • Or you need to keep working when a WebSocket connection cannot be established (no support in the client’s browser, a browser extension interferes, or some proxy between the client and the server cuts the connection),
  • Or you need to restore messages that the client missed during short network disconnections without putting load on the main database,
  • Or you need to control user authorization in a channel,
  • Or you need to drop persistent connections of users who have been deactivated in the application,
  • Or you need information about who is currently present in a channel, or events about someone subscribing to or unsubscribing from a channel,
  • Or you need metrics and monitoring?

The Centrifuge library can help you with this: in fact, it inherited all the main features that were previously available in Centrifugo. More examples demonstrating the items above can be found on GitHub.
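To give a feel for one item from the list, recovery of missed messages, here is a simplified sketch of the general idea: keep a bounded per-channel history with sequence numbers, and let a reconnecting client ask for everything after the last sequence number it saw. This is an illustration under our own assumptions, not the library's actual implementation; `History`, `Message` and `Recover` are hypothetical names.

```go
package main

import "fmt"

// Message is a publication tagged with a per-channel sequence number.
type Message struct {
	Seq  uint64
	Data string
}

// History keeps the last `size` messages of one channel in memory.
type History struct {
	size int
	next uint64 // sequence number the next message will get
	msgs []Message
}

func NewHistory(size int) *History { return &History{size: size, next: 1} }

// Add stores a message, evicting the oldest one when the buffer is full.
func (h *History) Add(data string) Message {
	m := Message{Seq: h.next, Data: data}
	h.next++
	h.msgs = append(h.msgs, m)
	if len(h.msgs) > h.size {
		h.msgs = h.msgs[1:]
	}
	return m
}

// Recover returns the messages with Seq > lastSeen. ok is false when some of
// the missed messages were already evicted, in which case the client has to
// reload the full state from the backend.
func (h *History) Recover(lastSeen uint64) (missed []Message, ok bool) {
	if lastSeen+1 >= h.next { // client is already up to date
		return nil, true
	}
	if len(h.msgs) == 0 || h.msgs[0].Seq > lastSeen+1 {
		return nil, false // gap between what the client saw and what we kept
	}
	for _, m := range h.msgs {
		if m.Seq > lastSeen {
			missed = append(missed, m)
		}
	}
	return missed, true
}

func main() {
	h := NewHistory(3)
	for _, s := range []string{"a", "b", "c", "d", "e"} {
		h.Add(s)
	}
	fmt.Println(h.Recover(3)) // client saw everything up to seq 3
	fmt.Println(h.Recover(1)) // client is too far behind
}
```

Because the history lives next to the real-time server, a short disconnect can be repaired without a query to the main database.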

The strong legacy of Centrifugo can also be a drawback, since the library inherited all the mechanics of the server, which are quite peculiar and may seem non-obvious or overloaded with unnecessary features to some. We tried to organize the code so that unused features do not affect overall performance.

The library contains some optimizations that allow resources to be used more efficiently. For example, several messages are merged into one WebSocket frame to save on write system calls, and gogoprotobuf is used to serialize Protobuf messages. Speaking of Protobuf.
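As an illustration of the frame-merging optimization, the sketch below joins all queued messages and sends them with a single `Write` call. The newline delimiter, the `flushBatched` function and the `countingWriter` type are our assumptions for the example; a real connection would be a WebSocket writer, not a byte buffer.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// countingWriter counts Write calls, standing in for a network connection
// where every Write would cost a system call.
type countingWriter struct {
	calls int
	buf   bytes.Buffer
}

func (w *countingWriter) Write(p []byte) (int, error) {
	w.calls++
	return w.buf.Write(p)
}

// flushBatched joins all queued messages with a newline delimiter (an
// assumption of this sketch) and sends them with one Write.
func flushBatched(w io.Writer, queue [][]byte) error {
	if len(queue) == 0 {
		return nil
	}
	_, err := w.Write(bytes.Join(queue, []byte("\n")))
	return err
}

func main() {
	queue := [][]byte{[]byte(`{"id":1}`), []byte(`{"id":2}`), []byte(`{"id":3}`)}
	w := &countingWriter{}
	if err := flushBatched(w, queue); err != nil {
		panic(err)
	}
	fmt.Println(w.calls, w.buf.String())
}
```

Three queued messages cost one write instead of three, at the price of the receiver having to split the frame again.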

Binary Protobuf Protocol

We really wanted Centrifugo to be able to work with binary data (and we were not the only ones), so in the new version we wanted to add a binary protocol alongside the existing JSON-based one. Now the entire protocol is described as a Protobuf schema. This made it possible to make it more structured and to rethink some non-obvious decisions in the protocol of the first version.

We think there is no need to explain at length what advantages Protobuf has over JSON: compactness, serialization speed, and a strict schema. There is also a disadvantage in the form of unreadability, but now users can decide what matters more in a given situation.

In general, the traffic generated by the Centrifugo protocol when using Protobuf instead of JSON should be about 2 times lower (not counting application data). CPU consumption in our synthetic load tests also decreased by about 2 times compared with JSON. These figures do not really say much; in practice, everything depends on the load profile of the particular application.

Out of curiosity, we ran a benchmark on a machine with Debian 9.4 and 32 Intel® Xeon® Platinum 8168 CPU @ 2.70GHz CPUs, which allowed us to compare the throughput of client-server interaction with the JSON protocol and the Protobuf protocol. There were 1000 subscribers on one channel. Four publishers published messages into this channel, and every message was delivered to all subscribers. The size of each message was 128 bytes.

Results for JSON:


$ go run main.go -s ws://localhost:8000/connection/websocket -n 1000 -ns 1000 -np 4 channel
Starting benchmark [msgs=1000, msgsize=128, pubs=4, subs=1000]
Centrifuge Pub/Sub stats: 265,900 msgs/sec ~ 32.46 MB/sec
Pub stats: 278 msgs/sec ~ 34.85 KB/sec
[1] 73 msgs/sec ~ 9.22 KB/sec (250 msgs)
[2] 71 msgs/sec ~ 9.00 KB/sec (250 msgs)
[3] 71 msgs/sec ~ 8.90 KB/sec (250 msgs)
[4] 69 msgs/sec ~ 8.71 KB/sec (250 msgs)
min 69 | avg 71 | max 73 | stddev 1 msgs
Sub stats: 265,635 msgs/sec ~ 32.43 MB/sec
[1] 273 msgs/sec ~ 34.16 KB/sec (1000 msgs)
...
[1000] 277 msgs/sec ~ 34.67 KB/sec (1000 msgs)
min 265 | avg 275 | max 278 | stddev 2 msgs

The results for the Protobuf case:


$ go run main.go -s ws://localhost:8000/connection/websocket?format=protobuf -n 100000 -ns 1000 -np 4 channel
Starting benchmark [msgs=100000, msgsize=128, pubs=4, subs=1000]

Centrifuge Pub/Sub stats: 681,212 msgs/sec ~ 83.16 MB/sec
Pub stats: 685 msgs/sec ~ 85.69 KB/sec
[1] 172 msgs/sec ~ 21.57 KB/sec (25000 msgs)
[2] 171 msgs/sec ~ 21.47 KB/sec (25000 msgs)
[3] 171 msgs/sec ~ 21.42 KB/sec (25000 msgs)
[4] 171 msgs/sec ~ 21.42 KB/sec (25000 msgs)
min 171 | avg 171 | max 172 | stddev 0 msgs
Sub stats: 680,531 msgs/sec ~ 83.07 MB/sec
[1] 681 msgs/sec ~ 85.14 KB/sec (100000 msgs)
...
[1000] 681 msgs/sec ~ 85.13 KB/sec (100000 msgs)
min 680 | avg 680 | max 685 | stddev 1 msgs

You can see that the throughput of such a setup is more than 2 times greater in the case of Protobuf (681,212 versus 265,900 msgs/sec, a factor of about 2.56). The client script can be found here: it is the NATS benchmark script adapted to Centrifuge.

It is also worth noting that JSON serialization performance on the server can be improved using the same approach as in gogoprotobuf: a buffer pool and code generation. Currently, JSON is serialized with the standard Go library package, which is built on reflect. For example, in the first version of Centrifugo, JSON is serialized manually using a library that provides a buffer pool. Something similar can be done in the second version in the future.
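The buffer pool half of that idea can be sketched with `sync.Pool` from the standard library. Note that this example still uses the reflect-based `encoding/json` encoder; it only shows how buffers are reused between messages, and `encodeJSON` is a hypothetical helper, not a function of the library.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sync"
)

// bufPool recycles buffers so that encoding does not allocate a fresh one
// for every message.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// encodeJSON serializes v using a pooled buffer. It returns a copy of the
// bytes because the buffer itself goes back to the pool.
func encodeJSON(v interface{}) ([]byte, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	if err := json.NewEncoder(buf).Encode(v); err != nil {
		return nil, err
	}
	// Encode appends a trailing newline; drop it and copy the result out.
	out := make([]byte, buf.Len()-1)
	copy(out, buf.Bytes())
	return out, nil
}

func main() {
	data, err := encodeJSON(map[string]int{"id": 1})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```

In a server, the pooled buffer would typically be written straight to the connection before being returned to the pool, which also avoids the final copy made here.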

It is worth emphasizing that Protobuf can also be used when communicating with the server from the browser. The JavaScript client uses the protobuf.js library for this. Since the protobuf.js library is quite heavy and the number of users of the binary format will be small, we use webpack and its tree-shaking algorithm to generate two client builds: one with JSON protocol support only, and the other with support for both JSON and Protobuf. For other environments, where the size of resources does not play such a critical role, clients need not worry about this split.

JSON Web Token (JWT)

One of the problems with using a standalone server such as Centrifugo is that it knows nothing about your users, how they are authenticated, or what session mechanism your backend uses. Yet connections have to be authenticated somehow.

For this, the first version of Centrifugo used an SHA-256 HMAC signature at connection time, based on a secret key known only to the backend and Centrifugo. This guaranteed that the user ID passed by the client really belonged to that client.

Perhaps the correct passing of connection parameters and the generation of the token were among the main difficulties in integrating Centrifugo into a project.

When Centrifugo appeared, the JWT standard was not yet so popular. Now, several years later, there are libraries for JWT generation in most popular languages. The main idea of JWT is exactly what Centrifugo needs: confirming the authenticity of the transmitted data. In the second version, the hand-generated HMAC signature gave way to JWT. This eliminated the need to maintain helper functions for correct token generation in libraries for different languages.

For example, in Python, a token for connecting to Centrifugo can be generated as follows:


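import jwt
import time

# Sign the claims with HMAC SHA-256 (HS256) using the shared secret.
token = jwt.encode({"user": "42", "exp": int(time.time()) + 10*60}, "secret", algorithm="HS256")

# PyJWT 1.x returns bytes, PyJWT 2.x returns str; normalize to str.
if isinstance(token, bytes):
    token = token.decode()

print(token)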

It is important to note that if you use the Centrifuge library, you can authenticate the user in a way native to Go: inside a middleware. Examples are in the repository.

GRPC

During development, we tried GRPC bidirectional streaming as a transport for communication between the client and the server (in addition to WebSocket and HTTP-based SockJS). What can we say? It worked. However, we did not find a single scenario where GRPC bidirectional streaming would be better than WebSocket. We mainly looked at server metrics: the traffic generated through the network interface, the server's CPU consumption with a large number of incoming connections, and the memory consumption per connection.

GRPC lost to WebSocket on all counts:

  • GRPC generates 20% more traffic in similar scenarios,
  • GRPC consumes 2-3 times more CPU (depending on the configuration of connections: all subscribed to different channels, or all subscribed to one channel),
  • GRPC consumes 4 times more RAM per connection. For example, with 10k connections, the WebSocket server used 500 MB of memory and GRPC used 2 GB (roughly 50 KB versus 200 KB per connection).

The results were quite … expected. In general, we did not see much sense in GRPC as a client transport, and deleted the code with a clear conscience until, perhaps, better times.

However, GRPC is good at what it was primarily created for: generating code that lets you make RPC calls between services according to a predefined schema. Therefore, in addition to the HTTP API, Centrifugo will now support a GRPC-based API, for example to publish new messages into a channel and to call the other available server API methods.

Difficulties with clients

The changes made in the second version removed the need to maintain libraries for the server API, so integration on the server side became easier. However, the client protocol in the project is custom and has quite a number of features. This makes implementing clients rather difficult. For the second version, we now have a JavaScript client, which works in browsers and should work with NodeJS and React Native. There is also a Go client, and, built on top of it with the gomobile project, bindings for iOS and Android.

For complete happiness, native libraries for iOS and Android are missing. For the first version of Centrifugo, they were written by people from the open-source community. We would like to believe that something similar will happen now.

Recently, we tried our luck by applying for a MOSS grant from Mozilla, intending to invest it in the development of clients, but were refused. The reason: an insufficiently active community on GitHub. Unfortunately, this is true, but, as you can see, we are taking some steps to improve the situation.

Conclusion

We did not mention all the features that will appear in Centrifugo v2; a bit more information is in the issue on GitHub. The server has not been released yet, but it will be soon. There are still unfinished parts, including the documentation, which needs to be completed. A prototype of the documentation can be viewed via the link. If you are a Centrifugo user, now is the right time to influence the second version of the server: a time when it is not so scary to break something in order to do it better later. For those interested: development is concentrated in the c2 branch.

It is hard for us to judge how much the Centrifuge library, which underlies Centrifugo v2, will be in demand. At the moment, we are pleased that we were able to bring it to its current state. The most important indicator for us now is the answer to the question “Would we ourselves use this library for a personal project?” Our answer is yes. At work? Yes. Therefore, we believe that other developers will appreciate it too.