In this meetup we will integrate Storm and Apache Cassandra to build a distributed web crawling system.
http://storm-project.net/
Storm allows you to build long-running, distributed services that scale and offer processing guarantees. While it holds some simple state in process, Storm usually relies on a third-party datastore to store its results. Storm is used successfully by many companies (often using Cassandra) and was recently accepted into the Apache Incubator.
I will cover the basics of Storm in the context of a simple web crawling system that relies on Cassandra to store its metadata and the web content.
About the Presenter:
Jake Luciani is an Apache Cassandra Committer and PMC member. You can follow him at http://twitter.com/tjake