BDAM: Distributed Rules Engine & Exactly-once processing with Apache Kafka!

Sep 20, 2017 · Palo Alto, United States of America

Shoutout to Cask for kindly sponsoring and hosting this meetup!

Cask will also be giving away an Amazon Dot! Enter the raffle on the day of the event for a chance to win.


6:00 - 6:30 - Socialize over food and beverages  

6:30 - 8:00 - Talks


Talk #1: Introducing a horizontally scalable, inference-based business Rules Engine for Big Data processing, by Nitin Motgi from Cask  

Talk #2: Building Stream Processing Applications with Apache Kafka's Exactly-Once processing guarantees, by Matthias Sax from Confluent

Unfortunately, we had a last-minute cancellation of tonight's planned talk, "Advanced Data Engineering Patterns with Apache Airflow". It will be rescheduled for a future meetup.


Talk #1: Introducing a horizontally scalable, inference-based business Rules Engine for Big Data processing, by Nitin Motgi from Cask  

Business rules are statements that describe the business policies or procedures used to process data. Rules engines, or inference engines, execute business rules in a runtime production environment and have become commonplace in many IT applications. The exception is the world of big data, where there has been a gap: no horizontally scalable, lightweight, inference-based business rules engine for big data processing.

In this session, you will learn about Cask’s new business rules engine built on top of CDAP. It is a sophisticated if-then-else statement interpreter that runs natively on big data systems such as Spark, Hadoop, Amazon EMR, Azure HDInsight and GCE. It provides an alternative computational model for transforming your data while empowering business users to specify and manage the transformations and policy enforcements.

In his talk, Nitin Motgi, Cask co-founder and CTO, will demonstrate this new, distributed rules engine and explain how business users in big data environments can make decisions on their data, enforce policies, and be an integral part of the data ingestion and ETL process. He will also show how business users can write, manage, deploy, execute and monitor business data transformations and policy enforcements.
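
To make the "if-then-else statement interpreter" idea a little more concrete, here is a minimal sketch in plain Python. It is not CDAP's API or Cask's rule syntax: `Rule`, `apply_rules`, the field names and the example rules are all hypothetical, and the sketch does a single ordered pass over the rules rather than the inference (rule chaining) a real engine would perform.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    """A hypothetical business rule: a condition plus a transformation."""
    name: str
    when: Callable[[Record], bool]    # the "if" part
    then: Callable[[Record], Record]  # the "then" part, returns a new record


def apply_rules(record: Record, rules: List[Rule]) -> Record:
    """Fire every rule whose condition matches, in list order (single pass)."""
    for rule in rules:
        if rule.when(record):
            record = rule.then(record)
    return record


# Example rules a business user might express (illustrative only).
rules = [
    Rule("flag-large-transfer",
         when=lambda r: r["amount"] > 10_000,
         then=lambda r: {**r, "needs_review": True}),
    Rule("normalize-country",
         when=lambda r: r.get("country") == "USA",
         then=lambda r: {**r, "country": "US"}),
]

records = [
    {"amount": 25_000, "country": "USA"},
    {"amount": 40, "country": "DE"},
]
processed = [apply_rules(r, rules) for r in records]
```

Because `apply_rules` looks at one record at a time and keeps no shared state, the same function could in principle be mapped over partitions of a large dataset, which is one plausible way such an engine scales horizontally.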

Talk #2: Building Stream Processing Applications with Apache Kafka's Exactly-Once processing guarantees, by Matthias Sax from Confluent

Kafka 0.11 added a new feature called "exactly-once guarantees". In this talk, we will explain what "exactly-once" means in the context of Kafka and data stream processing and how it affects application development. The talk will go into some detail about exactly-once, namely the new idempotent producer and transactions, and how both can be exploited to simplify application code: for example, you don't need complex deduplication code in your input path, as you can rely on Kafka to deduplicate messages when data is produced by an upstream application. Transactions can be used to write multiple messages into different topics and/or partitions and commit all writes in an atomic manner (or abort all writes so none will be read by a downstream consumer in read-committed mode). Thus, transactions allow for applications with strong consistency guarantees, like in the financial sector (e.g., either send both a withdrawal and a deposit message to transfer money, or neither of them). Finally, we will talk about Kafka's Streams API, which makes exactly-once stream processing as simple as it can get.
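
The two building blocks named in the abstract can be illustrated with a toy in-memory model. This is a sketch only, not Kafka's implementation or client API: `ToyLog` and its methods are hypothetical, and it tracks one sequence number per producer, which is a simplification. It shows (1) retried sends being deduplicated and (2) writes to several topics becoming visible to a read-committed reader all together or not at all.

```python
from typing import Dict, List, Tuple


class ToyLog:
    """In-memory stand-in for a set of topics (hypothetical, not Kafka itself)."""

    def __init__(self) -> None:
        self.committed: List[Tuple[str, str]] = []  # (topic, message) visible to readers
        self.pending: List[Tuple[str, str]] = []    # writes of the open transaction
        self.last_seq: Dict[str, int] = {}          # highest sequence accepted per producer

    def send(self, producer_id: str, seq: int, topic: str, message: str) -> bool:
        """Buffer a write; ignore it if this sequence number was already accepted."""
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # duplicate, e.g. caused by a retry after a lost ack
        self.last_seq[producer_id] = seq
        self.pending.append((topic, message))
        return True

    def commit(self) -> None:
        """Make all pending writes visible at once."""
        self.committed.extend(self.pending)
        self.pending.clear()

    def abort(self) -> None:
        """Discard all pending writes; readers never see them."""
        self.pending.clear()

    def read_committed(self, topic: str) -> List[str]:
        """Return only messages from committed transactions."""
        return [m for t, m in self.committed if t == topic]


log = ToyLog()

# Transfer 1: the withdrawal is sent twice (a retry) but stored once.
log.send("bank-app", 0, "withdrawals", "alice:-100")
log.send("bank-app", 0, "withdrawals", "alice:-100")  # duplicate, ignored
log.send("bank-app", 1, "deposits", "bob:+100")
log.commit()

# Transfer 2: fails halfway, so the withdrawal is aborted.
log.send("bank-app", 2, "withdrawals", "alice:-50")
log.abort()

withdrawals = log.read_committed("withdrawals")
deposits = log.read_committed("deposits")
```

In this model the reader sees exactly one withdrawal and one deposit. In Kafka itself, the corresponding behaviour is enabled through client settings such as `enable.idempotence` and `transactional.id` on the producer and `isolation.level=read_committed` on the consumer.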


• Nitin Motgi is Co-Founder and CTO of Cask, where he is responsible for developing the company’s long-term technology, driving company engineering initiatives and collaboration. Prior to Cask, Nitin was at Yahoo! working on a large-scale content optimization system externally known as C.O.R.E.
Prior to Yahoo!, Nitin led the development of a large-scale fabrication analysis system at Altera, and he previously held senior engineering roles at FedEx. Nitin holds a Master’s degree in computer science from University of Central Florida (UCF). 

• Matthias Sax is a Software Engineer at Confluent working mainly on Kafka's Streams API (aka Kafka Streams) and was involved in the exactly-once development efforts. Before Confluent he was a PhD student at Humboldt-University of Berlin, Germany, focusing on distributed stream processing systems. Matthias is also a committer at Apache Flink and Apache Storm.


Cask HQ is a few minutes' walk from the California Avenue Caltrain Station.

Also, Cask HQ has its own parking lot, but it will certainly not accommodate all guests. Please use parking lots available nearby:

Event organizers
  • Big Data Application Meetup

    This is a group for everyone interested in building applications using Apache Hadoop and other open-source, big data technologies. Come and learn how to apply big data technologies to solve real world problems! Meetup topics are focused on use cases, building end-to-end solutions, and making different technologies work together. The topics include technical presentations from open-source projects, open-source vendors and open-source users building big data applications. Topics include: • Describing the tec
