
Distributed Machine Learning

Machine Learning with Big Data

Paid Course


Learn how to apply statistical learning techniques to big data in Python by building, interpreting, visualising and evaluating distributed machine learning models optimised for massive data volumes.

Course Details

This course provides a hands-on, in-depth exploration of the industry-standard Apache Spark unified analytics engine, and specifically its MLlib distributed machine learning library, which is used to build, visualise and evaluate distributed machine learning models applied to real-world business problems and use-cases that require learning from massive data volumes, ranging from gigabytes (GB) to petabytes (PB) in size. The course follows on from our Statistical Learning course and enables senior data scientists to apply the mathematical techniques introduced there to real-world use-cases, so that they can make predictions and derive actionable insights from big data. It covers how to build and evaluate linear models for regression and classification, tree-based models and clustering models, as well as applied techniques for model selection and fine-tuning on big data volumes.

Requirements

Outcomes

  • The ability to apply statistical learning techniques in Apache Spark.
  • The ability to build, interpret, visualise and evaluate supervised and unsupervised distributed machine learning models for real-world business problems and use-cases that require learning from massive amounts of structured and unstructured data, ranging from gigabytes (GB) to petabytes (PB).
  • The ability to select and fine-tune distributed models applied to big data business problems.
  • Advanced knowledge of the industry-standard Apache Spark MLlib machine learning library.
