Training Session: A Beginner’s Guide to Big Data Analytics with Hadoop

Dec 4, 2012 · Cambridge, United States of America

You must sign up HERE: Responding via is not sufficient.
Excited about big data, and want to get hands-on playing with data sets and popular data tools? It’s time to Code Big or Go Home!

Date / time: Session 1: 6:30pm - 8pm; Session 2: 8:30pm - 10pm

Location: hack/reduce (275 3rd St, Cambridge, MA 02142)

Cost: free (limited to 50 attendees)

Background: Basic programming skills (e.g. with Java or Python). Knowledge of Big Data or Hadoop is NOT required.

Laptop: Bring a Linux or Mac laptop with the software tools mentioned in the Prerequisites section here preinstalled. Users of a Windows laptop can install a Linux virtual machine (e.g. see notes here).

Mindset: Excited to learn new tools and gain relevant skill sets

In this session, we will teach you how to program and operate Hadoop, the poster-child technology of Big Data. Why should you care? Take a look at this chart of exploding Hadoop job trends. Even CEOs are starting to care about Hadoop. And by the way, it’s open source and free to use.
By attending this session, you will be able to:

Gain hands-on big data experience with the experts
Learn a cutting-edge tool that may help you tackle open problems at your current job, or open up new career opportunities
Network with fellow hack / reduce technologists and find ways to work together on big data problems

Session Agenda
In this 90-minute session, we will cover the following ground:

Write and execute Java-based MapReduce (MR) jobs to analyze the data at hand
Program MR jobs in other languages (e.g. Python, Ruby) via Hadoop Streaming
Basic usage of HDFS, the Hadoop Distributed File System, to deploy and run MR jobs
Introduction to monitoring and performance tuning on Hadoop
Declarative data processing on Hadoop via Hive
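
For a taste of the Hadoop Streaming item above, here is a minimal sketch of the mapper/reducer contract, simulated locally in Python. It is not taken from the session materials: the word-count task and the names mapper, reducer and run_locally are hypothetical, chosen only for illustration. On a real cluster the mapper and reducer would typically be separate scripts that read standard input and print tab-separated records, with Hadoop doing the sorting in between.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated (word, 1) record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def record_key(record):
    """Return the key part (text before the first tab) of a record."""
    return record.split("\t", 1)[0]


def reducer(sorted_records):
    """Sum the counts of consecutive records that share a key; input must be sorted by key."""
    for word, group in groupby(sorted_records, key=record_key):
        total = sum(int(record.split("\t", 1)[1]) for record in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Hypothetical stand-in for the framework: map, sort by key (the shuffle), then reduce."""
    shuffled = sorted(mapper(lines), key=record_key)
    return list(reducer(shuffled))


if __name__ == "__main__":
    for record in run_locally(["big data big", "Data tools"]):
        print(record)
```

The reducer relies on its input arriving sorted by key, which is why the local stand-in sorts the mapper output before reducing.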

By the end of this session, you will be able to program and run Hadoop jobs on your own computer as well as in the cloud.
Mingsheng Hong
Chief Data Scientist at Hadapt
hack / reduce Contributor
Mingsheng Hong is Chief Data Scientist at Hadapt, driving the product roadmap and incubating analytic use cases. Prior to this role, Mingsheng was Field CTO at Vertica, an HP company, and was instrumental in its product development and positioning.
Mingsheng obtained his Computer Science Ph.D. degree at Cornell University, where he built Cayuga, the world's first expressive and scalable CEP engine. Mingsheng also co-founded the Microsoft CEDR event processing project, which became the Microsoft StreamInsight technology shipped with SQL Server 2008 and 2012.
Mingsheng is a frequent speaker on Big Data, and has given talks, lectures and demos at Hadoop World, TDWI conferences, the Cube and Harvard Business School.
Greg Lu
Software Engineer at Hopper
Founding Member of hack / reduce Hackathons
Greg Lu is a software engineer at Hopper, a travel search engine company. He is well versed in Java, Hadoop/MapReduce (distributed filesystem and computation), Cassandra and HBase (distributed databases), Heritrix (web crawling), and Solr/Lucene (search and indexing).
Greg is also the technical organizer of HackReduce, where he implemented an automated cluster management system in EC2 for creating and expanding the multiple Hadoop clusters ([masked] instances), and he mentors the participants during the one-day hackathon events.
Greg began his software career as a web developer, a role he held for over 6 years, starting with PHP and then using Ruby on Rails for the latter 5. He has also worked with many other languages and technologies throughout his own explorations and studies.
