In this meetup you will learn how to build your own Twitter backed by Cassandra and how to save memory by using probabilistic data structures.
After the two talks at ETH HG E 41, we will go over to the Polybar to socialize.
Cassandra, by Sabine
Big data has brought about many new data systems, such as Hadoop, MongoDB and Cassandra. This talk will focus on Cassandra, the database that Facebook invented. As you might guess from this, Cassandra specializes in interactive, scalable applications.
Data modeling in Cassandra comes with many surprises when you compare it to a traditional RDBMS.
In this talk, Cassandra will be illustrated by walking you through Twissandra, a Twitter clone built with Cassandra and meant to be used as an example. Twitter has the advantage that its use case is simple and widely known, so we can focus on data modeling with Cassandra rather than on explaining the use case.
Bloom filters and other probabilistic data structures, by Tim Head
When processing a stream of strings, how do you check whether you have seen a string before if your stream is much larger than your memory? Use a Bloom filter! I will talk about Bloom filters and other probabilistic data structures. We will use genomics as an example application.
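To give a flavor of the idea before the talk, here is a minimal Bloom filter sketch in Python. It is not taken from the talk: the class name `BloomFilter`, the helper `first_occurrences`, the use of SHA-256 with double hashing, and the short DNA strings are all assumptions made for illustration. The filter never misses a string it has seen, but it may occasionally claim to have seen a string that is in fact new.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter; names and hashing scheme are assumptions."""

    def __init__(self, capacity, error_rate=0.01):
        # Standard sizing formulas for the expected number of items
        # and the target false-positive rate.
        self.num_bits = max(
            1, math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
        )
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "probably seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def first_occurrences(stream, capacity, error_rate=0.01):
    """Yield strings the filter has not seen yet (hypothetical helper)."""
    seen = BloomFilter(capacity, error_rate)
    for s in stream:
        if s not in seen:
            seen.add(s)
            yield s
```

For example, `list(first_occurrences(["ACGT", "TTGA", "ACGT"], capacity=100))` drops the repeated "ACGT" while storing only a fixed-size bit array instead of every string. The price is that a small fraction of genuinely new strings may be dropped as false positives.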