Session stickiness is a load balancing technique that uses some piece of information, typically a session cookie, to route recurring visits (HTTP calls) from one visitor to the same machine.
This is a very common approach, and the typical scenario is an Apache web server acting as a load balancer in front of a set of Apache Tomcat instances. The load balancer usually works as the SSL endpoint and does all the gzip and HTTP header magic.
This approach does have a set of issues:
- power users don’t get their traffic load balanced: if a few of them get stuck onto one instance, they will kill that instance and everyone else stuck on it, while the other instances still have plenty of free resources;
- you cannot restart an instance, as users will get their sessions killed;
- you cannot add and remove instances on the fly, as you will be killing sessions again.
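To make the issues above concrete, here is a minimal sketch of how sticky routing behaves. It is not the Apache implementation; the class name `StickyBalancer` and the instance names are made up for illustration. It assumes session ids carry a route suffix in the form `<id>.<route>`, which is how Tomcat marks sessions when a `jvmRoute` is configured.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sticky balancer, for illustration only.
class StickyBalancer {
    private final List<String> instances = new ArrayList<>();
    private int next = 0; // round-robin cursor for visitors without a usable route

    StickyBalancer(List<String> initial) {
        instances.addAll(initial);
    }

    void remove(String instance) {
        instances.remove(instance);
    }

    // Session ids are assumed to look like "<id>.<route>", e.g. "A1B2.tomcat2".
    String route(String sessionId) {
        if (sessionId != null) {
            int dot = sessionId.lastIndexOf('.');
            if (dot >= 0) {
                String pinned = sessionId.substring(dot + 1);
                if (instances.contains(pinned)) {
                    return pinned; // sticky: always the same machine, however busy it is
                }
            }
        }
        // No session yet, or the pinned instance is gone: fall back to round-robin.
        // In the second case the visitor's in-memory session is lost.
        String chosen = instances.get(next % instances.size());
        next++;
        return chosen;
    }
}
```

A visitor holding `A1B2.tomcat2` always lands on `tomcat2`, no matter how loaded it is, and the moment `tomcat2` is removed the same cookie is routed to a machine that has never seen that session.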
One way of dealing with the killed sessions problem is to start synchronizing sessions across your Tomcat instances. This works fine as long as you don’t have too many instances, too many sessions, or sessions that are too large. Otherwise, either you won’t scale, or you will end up creating groups of synchronized Tomcat instances and maintaining several of those groups. This is something I would rather not have to maintain.
One common solution is to introduce an external session storage, let’s say, the database. While the database works, it might become the bottleneck sooner rather than later. Another solution is to use an external cache, let’s say, memcached. There are many cache solutions out there, and some of them will probably fit your specific case better, but in our case memcached worked just fine. To be honest, we are using the ElastiCache flavor of memcached provided by the Amazon AWS infrastructure; for more details go visit http://aws.amazon.com/elasticache/.
The setup of a memcached node is quite well documented, so I won’t go into any details here. As for our web application, we are using Spring Security.
So all we had to do was to:
- provide our own session storage implementation that stores the security context in memcached rather than in the HTTP session;
- forbid the storage of ANY information in the HTTP session;
- forbid access to the file system or any other machine-bound resources (the next call may land on another machine, so simply don’t).
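The first point can be sketched roughly as follows. This is not our production code, just a minimal illustration of the idea. The `SessionCache` interface, the `InMemoryCache` stand-in, the `ExternalContextStore` class, the `ctx:` key prefix and the 30 minute timeout are all assumptions made for the example. In a real Spring Security setup the hook for this kind of storage would most likely be the `SecurityContextRepository` interface, with a memcached client behind the cache abstraction.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache abstraction; a real one would wrap a memcached client.
interface SessionCache {
    void set(String key, int ttlSeconds, Object value);
    Object get(String key);
    void delete(String key);
}

// In-memory stand-in so the sketch runs without a memcached node (it ignores the TTL).
class InMemoryCache implements SessionCache {
    private final Map<String, Object> data = new ConcurrentHashMap<>();
    public void set(String key, int ttlSeconds, Object value) { data.put(key, value); }
    public Object get(String key) { return data.get(key); }
    public void delete(String key) { data.remove(key); }
}

// Keeps the security context outside the HTTP session, keyed by a token
// that travels with the request (for example in a cookie).
class ExternalContextStore {
    private static final int TTL_SECONDS = 1800; // assumed 30 minute timeout
    private final SessionCache cache;

    ExternalContextStore(SessionCache cache) {
        this.cache = cache;
    }

    // Stores the context and returns the token the client has to present later.
    String save(Object securityContext) {
        String token = UUID.randomUUID().toString();
        cache.set("ctx:" + token, TTL_SECONDS, securityContext);
        return token;
    }

    // Any instance can resolve the token, since the state lives in the shared cache.
    Object load(String token) {
        if (token == null) {
            return null;
        }
        Object context = cache.get("ctx:" + token);
        if (context != null) {
            cache.set("ctx:" + token, TTL_SECONDS, context); // sliding expiry
        }
        return context;
    }

    void invalidate(String token) {
        cache.delete("ctx:" + token);
    }
}
```

With two `ExternalContextStore` objects sharing one cache, a context saved through the first can be loaded through the second, which is exactly what lets the load balancer send the next call to any machine.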
We ended up using memcached for cluster synchronization of jobs, session authentication storage and, for a short period of time, the PDFs we generate asynchronously in the application. These changes to our web application were crucial for making the move to the Amazon cloud infrastructure. The fact that the web application is based on a GWT + Spring stack did help, but it was not enough.
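As for the job synchronization mentioned above, one common way to do it with memcached is to rely on its `add` command, which stores a key only if it does not exist yet, so only one instance wins. The sketch below illustrates that general technique and is an assumption about how such a lock could look, not a description of our actual code. The `JobLock` class, the `lock:` key prefix and the instance names are made up, and a `ConcurrentMap` stands in for the shared cache.

```java
import java.util.concurrent.ConcurrentMap;

// Hypothetical cluster-wide job lock built on an atomic "add if absent" operation.
class JobLock {
    private final ConcurrentMap<String, String> cache; // stand-in for the shared cache
    private final String owner;                        // name of this instance

    JobLock(ConcurrentMap<String, String> cache, String owner) {
        this.cache = cache;
        this.owner = owner;
    }

    // Returns true only for the first instance that claims the job.
    // With memcached, "add" would also take an expiry, so a crashed
    // instance cannot hold the lock forever; the map here has no expiry.
    boolean tryAcquire(String jobName) {
        return cache.putIfAbsent("lock:" + jobName, owner) == null;
    }

    // Releases the lock only if this instance still owns it.
    void release(String jobName) {
        cache.remove("lock:" + jobName, owner);
    }
}
```

Each instance calls `tryAcquire` before starting a scheduled job and simply skips the run when it returns false.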