May 2016 time series storage roundup

A recent slew of blog posts and articles has been shedding new light on time series storage. I thought I’d list some of what is in the zeitgeist.

Facebook came up with Gorilla: A fast, scalable, in-memory time series database. It’s reviewed by Adrian Colyer here.

The paper includes:

In the future, we hope that Gorilla enables more advanced data mining techniques on our monitoring time series data, such as those described in the literature for clustering and anomaly detection [10, 11, 16].

All three cited papers describe very cool techniques that are either directly applicable to time series or very interesting when brought into that context. Adrian Colyer reviews the first and the second.

Secondly, UC Berkeley came up with BTrDB: Optimizing Storage System Design for Timeseries Processing. Again, Adrian Colyer reviews this paper.

More recently, Samsung came up with A Fast Lightweight Time-Series Store for IoT Data: a data store designed to leverage the characteristics of time-series data in an IoT application context (think smart cities).

Finally, Chronix presents itself as an Apache Solr-inspired new kid on the block for the processing of time-series data. It touts a blog post and two talks at the upcoming ApacheCon.

These recent options add to the already existing:

The shape of event processing is changing, and it’s nice to see the corresponding tools changing as well. On the processing side, these newcomers can be linked to the already well-known ways of dealing with this sort of data, such as Flink’s complex event processing or Spark’s frequent pattern mining. There is also Cloudera’s spark-ts library.

See anything I’ve missed? Shoot me a comment below!