Workshop on “Data Science with [a] Spark”
The Swiss National Supercomputing Centre (CSCS) and IDIDS Institute (USI) are delighted to announce their upcoming workshop, Data Science with [a] Spark, to be held from Tuesday, September 13 to Thursday, September 15, 2016, at CSCS in Lugano, Switzerland.
Data Science with [a] Spark is a two-and-a-half-day workshop that addresses high-level parallelization for data analytics workloads using the Apache Spark framework. The learning objectives are:
- to understand the value of parallelization
- to understand the value of a high-level framework like Apache Spark
- to understand the MapReduce paradigm, which is central to Spark
- to get hands-on experience in applying the MapReduce paradigm for various applications, ranging from statistical analysis to machine learning
Additionally, participants will learn how to prototype with Spark and how to exploit large HPC machines like the Piz Daint CSCS flagship system.
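For readers unfamiliar with the MapReduce paradigm named in the learning objectives, the idea can be sketched in a few lines of plain Python. This is only an illustrative sketch, not workshop material and not Spark code: the function names (map_phase, reduce_phase, word_count) and the sample sentences are made up for this example. Each line of text is mapped independently to a partial word count, and the partial counts are then reduced pairwise into a single total.

```python
from collections import Counter
from functools import reduce


def map_phase(line):
    """Map step: turn one line of text into a partial word count."""
    return Counter(line.lower().split())


def reduce_phase(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(lines):
    """Apply the map step to every line, then fold the results together."""
    partial_counts = map(map_phase, lines)
    return reduce(reduce_phase, partial_counts, Counter())


# Hypothetical sample input, invented for illustration only.
lines = [
    "spark makes parallel code simple",
    "parallel code scales",
    "spark scales",
]
totals = word_count(lines)
print(totals)
```

Because the map step handles each line on its own and the reduce step is associative, a framework such as Spark can distribute both steps across many workers; the sketch above runs them sequentially on a single machine.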
Participants should have a background in a science such as Computer Science, Mathematics, Physics, Statistics, or Data Science, with at least a master’s degree. A basic understanding of programming languages is expected. Experience with UNIX/Linux operating systems will be beneficial.
Theory and exercises will be mainly in Python, but possibly also in Scala and R; however, no deep understanding of these languages is required. In order to work on the exercises, participants are requested to bring their own notebook computers (laptops).
The last half day is reserved for a hackathon, where participants can either work on their individual problems or join forces in groups. Participants can start to parallelize existing code and directly apply the concepts they have learned.
Furthermore, there will be an opportunity to discuss parallelization strategies for individual problems with a broader audience. Interested participants should indicate their interest in the “notes” field of the registration form, with a concise description of their application/code.
Keynote Speakers
Antonietta Mira (IDIDS / USI)
Juergen Schmidhuber (IDSIA)
Instructors
Izabela Moise (ETH Zurich)
Rito Dutta (IDIDS / USI)
Maxime Martinasso (ETH Zurich / CSCS)
Marcel Schöngens (ETH Zurich / CSCS)
Logistical details
The course starts at 09:30 on Tuesday, September 13, 2016, and ends at noon on Thursday, September 15, 2016.
The registration fee is CHF 160, which includes lunches on September 13 and 14, and coffee breaks throughout the two-and-a-half-day event.
Deadline for registration: Friday, September 2, 2016
Kindly note that no parking space is available at the Swiss National Supercomputing Centre. The closest bus stop to the centre is Lugano, Stadio. From Lugano railway station, you should take bus number 4.
You are encouraged to travel by public transportation or to use the Park & Ride Resega parking lot, which is within a five-minute walk of CSCS.