In most businesses, email and collaboration services are essential to the performance of the company. Delivering a communication platform that scales with the growth of the company and provides its services anytime, anywhere, even in the event of failures, is hard to achieve at low cost. The literature shows that simplified email storage is scalable. Interactive collaboration services built on the MAPI protocol, however, are limited in their scalability by the complexity of their data. In this thesis, I analysed the groupware use case and data structure for possible solutions to this problem. Based on the service requirements and the storage layers available, I propose a key-value data structure. The literature presents benchmark results for this database category, but these are outdated and use small 1 KB values. In this thesis, I benchmarked MySQL Cluster, Cassandra, Riak, Voldemort, and HBase using 10 KB values, simulating email characteristics and focussing on the I/O subsystem throughput and failure tolerance of these databases. The proposed solution uses Riak with ZooKeeper to provide a scalable, fault-tolerant communication service with a single point of entry. I developed a prototype service and a load simulator to demonstrate its scalability and failure tolerance through an extensive load simulation of 32 thousand users. The results show how failures are handled and how the cluster expands, all without disrupting user interaction with the service.