A recent slew of blog posts and articles has shed new light on time series storage. I thought I’d list some of what’s in the zeitgeist.
Facebook came up with Gorilla: A fast, scalable, in-memory time series database. It’s reviewed by Adrian Colyer here.
The paper includes this note on future work:
In the future, we hope that Gorilla enables more advanced data mining techniques on our monitoring time series data, such as those described in the literature for clustering and anomaly detection [10, 11, 16].
Secondly, UC Berkeley came up with BTrDB: Optimizing Storage System Design for Timeseries Processing. Again, Adrian Colyer reviews this result.
More recently, Samsung came up with A Fast Lightweight Time-Series Store for IoT Data: a data store designed to leverage the characteristics of time-series data in an IoT application context (think smart cities).
These recent options add to the already existing:
The shape of event processing is changing, and it’s nice to see the corresponding tools changing as well. On the processing side, these newcomers can be paired with already well-known ways of dealing with this sort of data, such as Flink’s complex event processing or Spark’s frequent pattern mining. There is also Cloudera’s spark-ts library.
See anything I’ve missed? Shoot me a comment below!