ePOST Serverless Email System

A Peer-to-Peer Platform for Reliable, Secure Communication

Technical Overview

At a high level, ePOST operates on top of a distributed hash table (DHT). Messages sent in ePOST are first split into their MIME components. Each component is encrypted and stored separately in the DHT. Emails are delivered by sending a small notification message to the message recipient, containing the location and decryption keys of the message components.
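As a rough illustration of the splitting step, the following Python sketch breaks a message into its MIME leaf components and stores each one separately. It is only a sketch under stated assumptions: the dictionary stands in for the DHT, keying each component by the SHA-1 hash of its content is an assumption made for the example, and the function name store_components is hypothetical. Encryption is deliberately left out, so the returned list carries only locations, whereas a real notification would also carry the decryption keys.

```python
import hashlib
from email import message_from_string
from typing import Dict, List


def store_components(raw_email: str, dht: Dict[str, bytes]) -> List[str]:
    """Split a message into MIME leaf parts and store each part separately.

    Returns the storage locations, which is roughly what a notification
    would carry (minus decryption keys, since nothing is encrypted here).
    """
    message = message_from_string(raw_email)
    locations: List[str] = []
    for part in message.walk():
        if part.is_multipart():
            continue  # containers hold no data of their own
        payload = part.get_payload(decode=True) or b""
        key = hashlib.sha1(payload).hexdigest()  # assumed keying scheme
        dht[key] = payload  # the dict stands in for the DHT
        locations.append(key)
    return locations
```

Keying by content hash means that identical components map to the same location in this sketch, so an attachment sent twice would be stored once.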

Folders are maintained in ePOST using single-writer logs, similar to those of the Ivy peer-to-peer file system. Changes to a folder are recorded as log entries. The folder has one small piece of mutable data, the log head, which points to the most recent log entry. Each entry in turn points to the previous one.
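The log structure can be sketched in a few lines of Python. This is a minimal illustration, not ePOST's implementation: the names LogEntry and FolderLog, the string encoding of changes, and the use of a SHA-1 hash as the entry key are all assumptions made for the example, and a dictionary stands in for the DHT.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass(frozen=True)
class LogEntry:
    change: str              # e.g. "insert msg-1" (hypothetical encoding)
    previous: Optional[str]  # key of the previous entry, None for the first


class FolderLog:
    """Single-writer log: immutable entries plus one mutable log head."""

    def __init__(self, store: Dict[str, LogEntry]) -> None:
        self.store = store                # stands in for the DHT
        self.head: Optional[str] = None   # the only mutable piece of data

    def append(self, change: str) -> str:
        """Record a change as a new entry and move the head to it."""
        entry = LogEntry(change, self.head)
        key = hashlib.sha1(repr(entry).encode()).hexdigest()
        self.store[key] = entry
        self.head = key
        return key

    def entries(self) -> Iterator[LogEntry]:
        """Walk from the log head back to the oldest entry."""
        key = self.head
        while key is not None:
            entry = self.store[key]
            yield entry
            key = entry.previous
```

Reading a folder then amounts to fetching the log head and following the previous pointers, so the entries come back newest first.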

In the sections below, the components of ePOST are discussed in more detail.

Routing Infrastructure

ePOST is based on the FreePastry peer-to-peer overlay, which provides a highly scalable and fault-tolerant routing infrastructure. In ePOST, all nodes and objects are assigned 160-bit identifiers (Ids) from a circular id space consisting of the 160-bit integers, wrapped at 2^160-1. Pastry provides a single route primitive, which takes the form
route(Key, Message)
and routes the given message efficiently to the live node with node id numerically closest to the given key.
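To make "numerically closest" in a circular id space concrete, here is a small Python sketch. It computes the destination from a global list of live nodes, which no real node has; Pastry reaches the same node hop by hop using each node's local routing state. The names ring_distance and closest_node are hypothetical and not part of the FreePastry API, and breaking ties toward the smaller id is an assumption.

```python
from typing import List

ID_BITS = 160
ID_SPACE = 1 << ID_BITS  # ids are the integers 0 .. 2^160 - 1


def ring_distance(a: int, b: int) -> int:
    """Distance between two ids, measured the short way around the ring."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def closest_node(key: int, live_nodes: List[int]) -> int:
    """The live node whose id is numerically closest to the key."""
    # Ties are broken toward the smaller id (an assumption of this sketch).
    return min(live_nodes, key=lambda n: (ring_distance(n, key), n))
```

Because the space wraps around, a key just above 0 can be closer to a node id just below 2^160 than to a node with a small id.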

DHT

To store data, ePOST uses a modified version of the PAST distributed hash table (DHT). PAST stores objects using a technique called consistent hashing, mapping each object to the live node whose Id is numerically closest to the object's Id. Thus, each node in the network is responsible for storing all objects whose Ids are closest to its own Id.

To ensure that node churn (nodes joining and leaving the overlay) does not cause data loss, PAST replicates each object on multiple nodes. In the case of ePOST, PAST is configured to store each object on the 3 nodes closest to the object's identifier. This choice provides a good balance between replication overhead and object availability. Protection against correlated node failures is provided by a system called Glacier, described below.
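The placement rule can be illustrated with a short Python sketch that picks the replica holders for an object and shows how the set shifts when a node leaves. It is a simplified, global-view illustration: the function names are hypothetical, and the small integer ids are used only for readability.

```python
from typing import List

ID_SPACE = 1 << 160  # size of the circular id space


def ring_distance(a: int, b: int) -> int:
    """Distance between two ids, measured the short way around the ring."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def replica_set(object_id: int, live_nodes: List[int], k: int = 3) -> List[int]:
    """The k live nodes closest to the object's id, nearest first."""
    ranked = sorted(live_nodes, key=lambda n: (ring_distance(n, object_id), n))
    return ranked[:k]


nodes = [100, 200, 300, 400, 500]
before = replica_set(290, nodes)                          # holders while node 300 is alive
after = replica_set(290, [n for n in nodes if n != 300])  # holders once node 300 has left
```

When node 300 leaves, the next-closest node joins the replica set, which is why the object has to be copied to it for the replication level to stay at 3.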

Since many emails are deleted, ePOST must be able to eventually garbage collect objects stored in the DHT, or the garbage will quickly overwhelm the live data. To do so, ePOST uses a lease-based version of PAST to store its data. This version supports automatic, lightweight garbage collection without the overhead of attaching signatures or backpointer lists to each object. Each object in the DHT has an attached expiration time, which is the time after which replica holders are allowed to collect the object. Your ePOST proxy periodically extends the lifetime of all the objects that are still referenced from users' folders. Deleted objects are no longer refreshed, so they eventually expire and are collected.
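The lease mechanism can be sketched as follows. This is a minimal single-node illustration under assumed names (LeaseStore, insert, refresh, collect); times are plain numbers, and the real system applies the same idea independently at every replica holder.

```python
from typing import Dict, Iterable, List


class LeaseStore:
    """Objects carry an expiration time; unrefreshed objects get collected."""

    def __init__(self) -> None:
        self.expiry: Dict[str, float] = {}  # object key -> expiration time

    def insert(self, key: str, now: float, lease: float) -> None:
        self.expiry[key] = now + lease

    def refresh(self, keys: Iterable[str], now: float, lease: float) -> None:
        """Extend the lifetime of objects that are still referenced."""
        for key in keys:
            if key in self.expiry:
                # A refresh never shortens an existing lease.
                self.expiry[key] = max(self.expiry[key], now + lease)

    def collect(self, now: float) -> List[str]:
        """Remove and return every object whose expiration time has passed."""
        expired = sorted(k for k, t in self.expiry.items() if t < now)
        for key in expired:
            del self.expiry[key]
        return expired
```

Deleting a message in this model requires no action on the stored object at all: the proxy simply stops including its key in later refresh calls.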

POST and ePOST

POST is the layer on top of PAST and Pastry that provides the security and user-level functions. Specifically, POST provides three primitives to applications. Using these three primitives, advanced and complex applications can easily be built. ePOST was built as a proving application for POST, using just these three primitives to provide peer-to-peer email services. For more technical information about POST, please see the POST website.

Glacier

In distributed systems, correlated failures among nodes are common, whether they are due to shared network connections, shared power grids, geographic locality or a common security vulnerability. In a DHT-based system such as ePOST, a correlated failure due to a virus has the potential to permanently erase a large portion of the storage in the network.

To prevent such a widespread correlated failure from deleting DHT data, ePOST employs the Glacier data durability system, which can provide 99.9999% data survival even under a 60% correlated failure. Glacier works by erasure coding objects and storing a large number of fragments throughout the network. Glacier maintains guarantees for each object by ensuring that a fraction of its fragments exists at all times, regardless of node churn.
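The effect of spreading many fragments can be estimated with a simple binomial model, sketched below in Python. The model is an assumption made for illustration: every fragment sits on a distinct node, each node is lost independently with probability equal to the failure fraction, and the object survives if at least the required number of fragments remains. The fragment counts used in the example (48 stored, any 5 sufficient) are illustrative values and are not taken from the text above.

```python
from math import comb


def survival_probability(total_fragments: int,
                         needed_fragments: int,
                         failure_fraction: float) -> float:
    """Probability that at least `needed_fragments` fragments survive.

    Assumes each fragment is lost independently with probability
    `failure_fraction` (a simplification of a correlated failure).
    """
    p_alive = 1.0 - failure_fraction
    # Probability that fewer than `needed_fragments` fragments survive.
    p_lost = sum(
        comb(total_fragments, i)
        * p_alive ** i
        * failure_fraction ** (total_fragments - i)
        for i in range(needed_fragments)
    )
    return 1.0 - p_lost


# Illustrative parameters (assumed): 48 fragments, any 5 rebuild the object.
estimate = survival_probability(48, 5, 0.60)
```

Under these assumed parameters the loss probability comes out at roughly one in a million, while storing only 3 whole replicas under the same model would lose the object whenever all 3 holders fail, which happens with probability 0.6^3.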