Micro services are fun and easy to set up, and the servers behind them are an important part of any micro service architecture.
One consideration in choosing which language to use for a given micro service/REST API is the speed of the servers that are readily available for that language.
Given this, I have decided to test out a variety of minimal web servers that could serve as a solid base for a REST API / micro service.
All tests are performed on an (obviously weak) machine with the following stats:
To start with testing, we will serve a simple "Hello, World" response from each server and run the following benchmark test against them:
ab -c10 -n1000 'http://127.0.0.1:<port>/'
This is Apache Bench, with 10 concurrent users performing 1000 total requests (so, in the output, you will see progress reported in batches of 100 requests).
<port> is going to be the port the server is running on.
It should go without saying, but for all these benchmarking tests, please disable any special debugging options that may have dramatic performance hits in each given language (such as xdebug in PHP).
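Incidentally, the headline numbers in ab's report are linked by simple arithmetic: requests per second is total requests divided by total elapsed time, the "across all concurrent requests" time is the reciprocal of that, and the mean time per request multiplies it by the concurrency level. Here is a quick shell sanity check using the figures from the first PHP bench below (ab's own printed 1620.24 differs slightly only because it works from an unrounded elapsed time):

```shell
# Example figures: 1000 requests completed in 0.617 seconds at concurrency 10.
total=0.617; n=1000; c=10
rps=$(awk -v t="$total" -v n="$n" 'BEGIN { printf "%.0f", n / t }')
ms_each=$(awk -v t="$total" -v n="$n" 'BEGIN { printf "%.3f", t * 1000 / n }')
ms_mean=$(awk -v t="$total" -v n="$n" -v c="$c" 'BEGIN { printf "%.3f", c * t * 1000 / n }')
echo "Requests per second: ~$rps"                      # ~1621
echo "Time per request (across all concurrent): $ms_each ms"  # 0.617
echo "Time per request (mean): $ms_mean ms"            # 6.170
```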
To test PHP, we will begin by benchmarking a standard "Hello, World" script served from Apache 2 with the default configuration, followed by a "Hello, World" served from a persistent daemon (React PHP).
After that, we'll compare how much difference a persistent daemon makes when using a framework versus not.
Add the following script into a file named 'hw.php' in a directory accessible via Apache:
<?php echo 'Hello, World';
Then (assuming your Apache is already set up) test with CURL to verify the response, test with ab to generate the bench:
curl http://127.0.0.1/hw.php   # Should produce 'Hello, World'
ab -c10 -n1000 'http://127.0.0.1/hw.php'
We end up with the following (truncated) output (obviously this will differ on your machine):
Concurrency Level:      10
Time taken for tests:   0.617 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      154000 bytes
HTML transferred:       12000 bytes
Requests per second:    1620.24 [#/sec] (mean)
Time per request:       6.172 [ms] (mean)
Time per request:       0.617 [ms] (mean, across all concurrent requests)
Transfer rate:          243.67 [Kbytes/sec] received
Not bad, but it could probably be faster if we weren't having Apache handle loading the PHP interpreter on each request.
So, let's try this out. For our purposes, the single page of setup here will work to get us started:
Once you have it set up and the server started, benchmark with the following:
ab -c10 -n1000 'http://127.0.0.1:1337/'
Concurrency Level:      10
Time taken for tests:   0.423 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      122000 bytes
HTML transferred:       22000 bytes
Requests per second:    2361.98 [#/sec] (mean)
Time per request:       4.234 [ms] (mean)
Time per request:       0.423 [ms] (mean, across all concurrent requests)
Transfer rate:          281.41 [Kbytes/sec] received
Well, that isn't terrible: over 700 more requests per second, roughly a 45% performance boost. Apache could probably be tuned to perform better in this use case, but this does illustrate how removing the interpreter start-up cost from each request can have a significant impact on performance.
Now, let's see if running as a daemon makes more of a difference under a different use case.
Symfony is a great framework, and I love it, but for our micro service setup it may not be a good choice: it is a pretty heavy framework, and we're testing on a pretty weak machine.
Regardless, let's see how it performs (if we wanted to use it to gain the code structure/cleanliness/libraries it provides):
Set up a copy (refer to http://symfony.com/doc/current/setup.html), then update the DefaultController.php file in the codebase as follows:
public function indexAction(Request $request)
{
    return new Response('Hello, World');
}
Don't forget to disable the debug features in app_dev.php (they will tear up your performance, and really make sure xdebug isn't turned on!), and then bench as with our other projects:
ab -c10 -n1000 'http://local.symfony/app_dev.php'
and the results:
Concurrency Level:      10
Time taken for tests:   174.178 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      380000 bytes
HTML transferred:       12000 bytes
Requests per second:    5.74 [#/sec] (mean)
Time per request:       1741.775 [ms] (mean)
Time per request:       174.178 [ms] (mean, across all concurrent requests)
Transfer rate:          2.13 [Kbytes/sec] received
Pretty terrible (about 280x slower than plain PHP; quite a heavy tax to pay for what a framework offers).
Could it be better if we didn't have to spend all that time bootstrapping the framework on every request? Let's give it a shot with PHP-PM (React PHP for frameworks).
Follow the instructions at https://github.com/php-pm/php-pm to set it up with Symfony, and onward to the benchmark:
Concurrency Level:      10
Time taken for tests:   7.371 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      244000 bytes
HTML transferred:       22000 bytes
Requests per second:    135.67 [#/sec] (mean)
Time per request:       73.710 [ms] (mean)
Time per request:       7.371 [ms] (mean, across all concurrent requests)
Transfer rate:          32.33 [Kbytes/sec] received
Hmm, not bad! About 24x faster than plain old Symfony over Apache; unfortunately, it is still about 12x slower than a plain PHP script (but that is to be expected, as we're loading an immense number of libraries/classes/scripts).
So, Symfony had some issues; how does a very minimal framework fare in a language touted for its non-blocking I/O?
Let's test out Express with node/npm (https://www.npmjs.com/package/express), using the Hello, World sample listed on that page:
Concurrency Level:      10
Time taken for tests:   1.019 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      205000 bytes
HTML transferred:       11000 bytes
Requests per second:    981.60 [#/sec] (mean)
Time per request:       10.187 [ms] (mean)
Time per request:       1.019 [ms] (mean, across all concurrent requests)
Transfer rate:          196.51 [Kbytes/sec] received
Not too shabby, but are we failing to break 1k RPS due to the framework, or is that just as fast as node is going to go for us?
Let's try without a framework. Create the following node.js program:
var app = require('http').createServer(handler);
app.listen(3000);

function handler(req, res) {
  res.writeHead(200);
  res.end('Hello, World');
}
Concurrency Level:      10
Time taken for tests:   0.660 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      87000 bytes
HTML transferred:       12000 bytes
Requests per second:    1515.78 [#/sec] (mean)
Time per request:       6.597 [ms] (mean)
Time per request:       0.660 [ms] (mean, across all concurrent requests)
Transfer rate:          128.78 [Kbytes/sec] received
Not bad! About 50% faster than going through Express.
If you're reading this, you're interested in programming, so you've likely heard (or come to understand) that C++ is pretty fast; after all, it's used for lots of important things requiring high performance, manual memory management, and fine tuning.
Unfortunately, C++ doesn't have nearly the same footprint in the web/REST world that PHP does, so options for a C++ based web server are a bit more limited.
Fortunately, there is a project here:
https://github.com/eidheim/Simple-Web-Server
That can quickly get us up and running for the purpose of this test.
Go ahead and clone the repository, but before building it, make the following edit to the http_examples.cpp file to add a plain old 'Hello, World' route:
//GET-example for the path /hw
//Responds with Hello, World
server.resource["^/hw"]["GET"] = [](shared_ptr<HttpServer::Response> response,
                                    shared_ptr<HttpServer::Request> /*request*/) {
    *response << "HTTP/1.1 200 OK\r\nContent-Length: " << "12"
              << "\r\n\r\n" << "Hello, World";
};
Cool: they have a routing mechanism/URL pattern matcher similar to Symfony's, and they make use of C++11 lambdas to do it!
Well, let's see how this one performs. Build it with:
cmake .
make
and run it with:
./http_examples
and bench it with:
ab -c10 -n1000 'http://127.0.0.1:8080/hw'
which gives us the results:
Concurrency Level:      10
Time taken for tests:   0.395 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      51000 bytes
HTML transferred:       12000 bytes
Requests per second:    2530.52 [#/sec] (mean)
Time per request:       3.952 [ms] (mean)
Time per request:       0.395 [ms] (mean, across all concurrent requests)
Transfer rate:          126.03 [Kbytes/sec] received
Not bad, not bad at all (if you just got done with the Symfony tests, this is like getting into a jet after walking with weights strapped to your back).
However, it's just a little faster than the RPS we had running PHP as a daemon under React PHP (roughly 2530 vs 2360).
To make use of this, though, we'll be writing C++, which some people may find more difficult than PHP and a little too low level (but hey, the Simple-Web-Server project already includes Boost, which has many high level functions/classes built right in).
As always, I've got to bring in one of my personal favorite languages (Common Lisp) and see how it performs in the context of these other languages.
Pop open a Common Lisp REPL (I assume you have a working copy of SBCL and Quicklisp set up, otherwise go do that first) and type:
(ql:quickload :woo)

(woo:run
  (lambda (env)
    (declare (ignore env))
    '(200 (:content-type "text/plain") ("Hello, World"))))
Voila, a working web server to serve "Hello, World" requiring nothing more than 10 seconds of time in the Common Lisp REPL.
Let's see how it performs with the following:
ab -c10 -n1000 'http://127.0.0.1:5000/'
and we see the output:
Concurrency Level:      10
Time taken for tests:   0.332 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      114000 bytes
HTML transferred:       12000 bytes
Requests per second:    3015.20 [#/sec] (mean)
Time per request:       3.317 [ms] (mean)
Time per request:       0.332 [ms] (mean, across all concurrent requests)
Transfer rate:          335.68 [Kbytes/sec] received
Wow! We finally broke the 3000 RPS mark on this old rig.
The author of this package is amazing, and I suggest you check out its GitHub page (scroll down a little to see how it compares to servers in other languages):
https://github.com/fukamachi/woo
On a good machine, such as the one he benches on, he hits 40,000/RPS (with the next highest being a server written in Go reaching 30,000).
Is Common Lisp faster than C++? Not necessarily in every context, but if you look at general language benchmarks, it actually does surpass C++ and Java in some (and gets beaten in others), leaving only pure, minimal C as a language that can consistently come in at number one. So, let's go find and test out a C based web server that can serve minimal responses.
First, though, let's try some Common Lisp with an actual framework (Caveman2 in this case, using the bare skeleton produced by the following):
(ql:quickload :caveman2)
(caveman2:make-project (pathname "~/src/lisp/cm2-bench"))
(cm2-bench:start :server :woo :port 5000)
Concurrency Level:      10
Time taken for tests:   1.166 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      307000 bytes
HTML transferred:       12000 bytes
Requests per second:    857.86 [#/sec] (mean)
Time per request:       11.657 [ms] (mean)
Time per request:       1.166 [ms] (mean, across all concurrent requests)
Transfer rate:          257.19 [Kbytes/sec] received
Hmm, alright: that is in line with the speed of node + Express. Definitely not as impressive as the plain Woo bench of a simple request, but still decent.
OK, this one is a little trickier to find: if you thought people were reluctant to write REST based services in C++, pure C servers are even rarer.
Ultimately, I didn't see a good one posted out there, but to give you an idea, a very amateurish one I put together (using plain old forking, not threading) was able to get the following Hello, World bench:
Concurrency Level:      10
Time taken for tests:   0.618 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      71000 bytes
HTML transferred:       12000 bytes
Requests per second:    1617.66 [#/sec] (mean)
Time per request:       6.182 [ms] (mean)
Time per request:       0.618 [ms] (mean, across all concurrent requests)
Transfer rate:          112.16 [Kbytes/sec] received
If and when I get time, I'll attempt a similar test with a C server that uses actual threading and thread scheduling, which will hopefully show some good results (although I do not expect it to pass the benchmarks set by Woo).
If you anticipate a micro service being a bottleneck in your architecture (a central point many other components depend on), try to avoid cumbersome frameworks in any language, as all the extra routing/features they provide will have a dramatic impact on your RPS throughput.
The big takeaway is this: Common Lisp wins once again!