This article will probably interest less than 0.001% of the population, but if you are someone who thinks about integrating things like OpenLDAP, Wackamole and Zimbra, then this blog entry is for you!
Over the last 15 years our company has acquired more than 10 email domains and now manages email for over 10,000 users. We took on the task of building a single high-availability Zimbra cluster and then consolidating several different mail server systems onto this platform.
We started with a network built for high availability: a pair of Layer 3 Cisco routers use HSRP to present a single virtual gateway IP address for the servers, in a “Net A/B” design with OSPF running internally. This gives us redundancy all the way to our backbone, and the same routers also provide the Ethernet ports for the servers.
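As a rough illustration, here is what the HSRP side of that design looks like on one of the routers. The VLAN number, group number, and addresses are made up for this example rather than taken from our production config:

    ! Router A -- preferred active gateway for this server subnet
    interface Vlan100
     ip address 10.10.100.2 255.255.255.0
     ! the virtual gateway IP that every server uses as its default gateway
     standby 100 ip 10.10.100.1
     ! higher priority than Router B, plus preempt so Router A reclaims
     ! the active role once it recovers from a failure
     standby 100 priority 110
     standby 100 preempt
    !
    ! Router B carries 10.10.100.3 on the same VLAN at the default priority
    ! (100), so it only answers for 10.10.100.1 while Router A is down.

If either router fails, the survivor takes over the virtual IP and the servers keep using the same gateway address without noticing the failover.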
To build a scalable mail platform we designed the system around a set of distinct roles, so we could adjust the resources for each role independently. The alternatives are a pile of identically powered servers for every role, wasting resources, or the more common scenario where you stack multiple functions on a single server and then have a hard time breaking the system apart when one of those functions outgrows it.
We divided the roles into Proxy, Mailbox, Database, MTA Inbound MX, and MTA Outbound Relay, with each role deployed as either an active/active or an active/standby pair. The proxies handle all IMAP, POP3, and HTTP/HTTPS (webmail) traffic. The mailbox servers store the emails, attachments, calendars, and so on. The database tier (OpenLDAP) stores the user account information: passwords, preferences, and email addresses. Both MTA roles run Postfix with antivirus/antispam filtering (amavisd-new with SpamAssassin and ClamAV).

To give each role its active/standby or active/active behavior we integrated Heartbeat and Wackamole. Heartbeat works great for managing Zimbra and its shared resources on the mailbox servers (virtual IP, shared storage, and custom scripts). We chose Wackamole to provide simple network-level fault tolerance for the virtual IPs, and combined it with multiple DNS records to get basic load balancing (more on this in Part 2 of this blog). Every server needed a network card on each of the “Net A/B” networks it connects to, so we used the Linux bonding driver to combine those interfaces into a single logical link with active load balancing. Minimal sketches of the Heartbeat and bonding pieces follow below.
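For the mailbox pair, the Heartbeat piece boils down to a single haresources line (Heartbeat v1 style). The node name, address, device path, and mount point here are invented for illustration, and “zimbra” stands in for a start/stop script for the Zimbra services living in /etc/init.d or /etc/ha.d/resource.d:

    # /etc/ha.d/haresources -- identical copy on both mailbox nodes.
    # While mbx1 is healthy it owns the virtual IP, mounts the shared
    # mail store, and runs the zimbra script; if it fails, Heartbeat
    # brings the same resources up, in order, on the standby node.
    mbx1 IPaddr::192.168.100.25/24/bond0 Filesystem::/dev/mapper/zimbra-store::/opt/zimbra::ext3 zimbra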
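For the bonding side, here is a sketch for a Red Hat-style system. The interface names and addresses are assumptions, and balance-alb (adaptive load balancing) is just one reasonable mode choice, since it balances traffic without requiring any special switch configuration:

    # /etc/modprobe.conf -- load the bonding driver for bond0
    alias bond0 bonding
    options bond0 mode=balance-alb miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0 -- the logical interface
    DEVICE=bond0
    IPADDR=192.168.100.21
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-eth0 -- the physical "Net A" port
    # (ifcfg-eth1 for "Net B" is the same apart from DEVICE=eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none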
In terms of systems hardware, we used a combination of IBM BladeCenters and commodity rack-mount servers. The IBM blades run under the Linux Xen hypervisor, with an EMC Clariion providing iSCSI-connected storage. For mailbox storage we used a Fibre Channel HP MSA with RAID 10, plus a second unit using external SAS, also RAID 10; both storage systems provide multipath connectivity. Xen virtualization gave us hardware consolidation and a way to live-migrate running Zimbra instances across our blade network. The mail system now consists of about 20 servers, each engineered to handle a specific role. In Part 2 of this blog, I’ll discuss the good, the bad and the ugly of migrating 10,000 mailboxes from multiple heterogeneous mail systems.