This repository was archived by the owner on Jun 7, 2023. It is now read-only.

Description
A quick analysis of a sample of 1M emails has shown that:
- 97% of mails are less than 128K
- 95% of mails are less than 64K
These figures will vary depending on the use case (e.g. free webmail vs corporate email), but in most cases we would expect >90% of emails to be less than 128K.
Assuming these messages compress at roughly a 2:1 ratio, a 64K-128K message shrinks to about 32K-64K, so storing blobs of 32K-64K in Cassandra should be practical.
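The size arithmetic can be sketched as below. This is only an illustration: it assumes "K" means 1024 bytes and treats 2:1 as a fixed ratio, while real ratios depend on message content. The function names are hypothetical, and zlib is a stand-in codec, not necessarily the one the project would use.

```python
import zlib

KIB = 1024  # assumption: "K" in the text means 1024 bytes


def estimated_compressed_size(raw_size: int, ratio: float = 2.0) -> int:
    """Estimate the stored size of a message, assuming a fixed compression ratio."""
    return int(raw_size / ratio)


def compress_message(raw: bytes) -> bytes:
    """Compress a raw message with zlib (stand-in codec for illustration)."""
    return zlib.compress(raw)


# Estimated stored size for the two message sizes quoted in the analysis.
for raw_kib in (64, 128):
    stored_kib = estimated_compressed_size(raw_kib * KIB) // KIB
    print(f"{raw_kib}K raw -> about {stored_kib}K stored")
```

Measuring `len(compress_message(raw))` on real messages would show how close the actual ratio comes to the assumed 2:1.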
The internal implementation of blobs can be modeled on CassandraFS to provide good scalability.
To begin with, we can use a single-block (row), single-sub-block (column) implementation in which the maximum block (and sub-block) size is 128K. It should be possible to add multi-block support in the future.
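A minimal sketch of the single-block, single-sub-block layout follows. It uses an in-memory dict in place of Cassandra, so it shows only the shape of the data (one row holding one column per blob), not a real driver call. All names (`write_blob`, `read_blob`, `BlobTooLargeError`) and the `(blob_id, block_number)` row key are hypothetical.

```python
MAX_BLOCK_SIZE = 128 * 1024  # max block (and sub-block) size, assuming K = 1024 bytes


class BlobTooLargeError(ValueError):
    """Raised when a blob does not fit into a single block."""


def write_blob(store: dict, blob_id: str, data: bytes) -> None:
    """Store a blob as block 0 (row) containing sub-block 0 (column)."""
    if len(data) > MAX_BLOCK_SIZE:
        # Multi-block support would split the data here instead of failing.
        raise BlobTooLargeError(
            f"blob of {len(data)} bytes exceeds {MAX_BLOCK_SIZE} bytes"
        )
    row_key = (blob_id, 0)       # hypothetical row key: blob id + block number
    store[row_key] = {0: data}   # column name 0 = sub-block number


def read_blob(store: dict, blob_id: str) -> bytes:
    """Read the only sub-block of the only block."""
    return store[(blob_id, 0)][0]


store = {}
write_blob(store, "msg-1", b"compressed message bytes")
print(read_blob(store, "msg-1"))
```

Keeping the block number in the row key and the sub-block number in the column name leaves room to add more blocks later without changing how existing blobs are addressed.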
The blob-size threshold between internal (C*) and external (cloud) blob storage should be configurable.
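One way the configurable threshold could look is sketched below. The names `BlobStoreConfig`, `internal_max_size`, and `choose_backend` are hypothetical. The 128K default and the inclusive comparison are assumptions, as is the choice to compare the size of the blob as stored (after compression).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobStoreConfig:
    # Hypothetical setting: largest blob (in bytes) kept inside Cassandra.
    internal_max_size: int = 128 * 1024


def choose_backend(blob_size: int, config: BlobStoreConfig) -> str:
    """Pick the storage backend for a blob of the given size in bytes."""
    if blob_size <= config.internal_max_size:
        return "cassandra"  # internal (C*) storage
    return "cloud"          # external blob storage


default_config = BlobStoreConfig()
small_only = BlobStoreConfig(internal_max_size=32 * 1024)

print(choose_backend(40 * 1024, default_config))
print(choose_backend(40 * 1024, small_only))
```

With the default configuration a 40K blob stays in Cassandra, while lowering the threshold to 32K sends the same blob to external storage.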