This event has passed.
Apache Spark Workshop
April 22 @ 8:00 am - 12:30 pm
A half-day introduction to Apache Spark from scratch. We will look at:
0. Getting Spark ready on your computer.
1. Apache Spark computation model: RDDs (Resilient Distributed Datasets).
– Lazy nature of RDDs.
– RDD transformations vs actions.
– How to create RDDs from different sources (let’s play with some datasets).
– RDD API. How to bend it to our needs.
2. Spark SQL
– Interoperability with RDDs.
– SQL on top of RDDs.
– Accessing tabular data sources.
– DataFrame optimizations on top of RDDs.
– Distributed SQL engine case study.
3. Spark Streaming.
– Streaming API.
– Streaming sources.
– Extending the API.
– Twitter case study.
– Interoperability with SQL and RDDs.
You are expected to follow along with the code examples, so we will help you install Spark locally during the first part of the session. More details coming soon.