Pavneet Singh

Cloud Certifications: AWS Certified Machine Learning - Specialty

  • Jun 30, 2020
  • 15 Min read
  • 754 Views

Introduction

This guide covers the essential details of the AWS Certified Machine Learning - Specialty certification and machine learning terminologies, and provides recommended resources for best practices.

Terms to Know

Artificial intelligence (AI) is an extensive branch of computer science primarily focused on developing systems that can think intelligently, like humans. There are various techniques used to develop an AI system that are categorized into different subsets such as machine learning (ML) and deep learning (DL).

Machine learning is the process of empowering computer systems to automatically learn and progressively improve their performance over time. Often the machine is trained by using a sample of structured data, like vegetables with details such as weight, color, shape, and type (label) properties. Then the trained model applies the learning on unlabeled data to identify the type of vegetables.
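The vegetable example above can be sketched in a few lines of Python. This is only an illustration, assuming a made-up dataset with two features (weight and a color score) and a simple 1-nearest-neighbor rule; the names `TRAINING_DATA`, `MAX_WEIGHT`, and `predict` are hypothetical and not part of any AWS service.

```python
# Hypothetical labeled samples: ((weight in grams, color score 0 = green .. 1 = red), label)
TRAINING_DATA = [
    ((150.0, 0.9), "tomato"),
    ((120.0, 0.8), "tomato"),
    ((300.0, 0.1), "cucumber"),
    ((280.0, 0.2), "cucumber"),
]

MAX_WEIGHT = 300.0  # assumed scale so that weight and color contribute comparably


def distance(a, b):
    """Euclidean distance between two samples after scaling the weight."""
    weight_gap = (a[0] - b[0]) / MAX_WEIGHT
    color_gap = a[1] - b[1]
    return (weight_gap ** 2 + color_gap ** 2) ** 0.5


def predict(features):
    """Return the label of the closest labeled sample (1-nearest neighbor)."""
    _, label = min(TRAINING_DATA, key=lambda sample: distance(sample[0], features))
    return label


# Unlabeled vegetables: the model assigns the type it learned from the labels.
print(predict((140.0, 0.85)))
print(predict((290.0, 0.15)))
```

The "training" here is just storing the labeled samples; real models generalize from far more data and features.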

Deep learning is a subset of machine learning that uses multi-layered neural networks, loosely inspired by the human brain, to identify patterns in data and perform tasks such as classifying the type of animal in an image.

Types of Data

Data is a critical aspect of machine learning. Data can be categorized into two types:

  • Labeled data contains details about the object as well as the desired result. The values in the result (label or tag) column are often provided by a human, which is why labeled datasets are more expensive to obtain.

  • Unlabeled data does not contain any result column, so it cannot be used directly for supervised training. Instead, various machine learning techniques can be applied to identify similar data points and create groups, for example, groups of profitable or loss-making stocks.
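As a rough sketch of how unlabeled data can still be grouped, the snippet below splits a list of stock returns into two clusters with a basic k-means loop (k = 2). The return values and the function name `two_means` are invented for illustration.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two groups using a basic k-means loop (k = 2)."""
    centers = [min(values), max(values)]  # simple deterministic starting centers
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for value in values:
            # Assign each value to the nearer of the two centers.
            index = 0 if abs(value - centers[0]) <= abs(value - centers[1]) else 1
            groups[index].append(value)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            sum(group) / len(group) if group else center
            for group, center in zip(groups, centers)
        ]
    return groups


# Hypothetical yearly returns in percent; note that there is no label column.
returns = [12.5, -8.0, 9.1, -3.4, 15.0, -11.2]
low, high = two_means(returns)
print("loss-making group:", sorted(low))
print("profitable group:", sorted(high))
```

The algorithm never sees the words "profitable" or "loss-making"; it only finds two groups, and a person interprets them afterwards.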

Machine Learning Techniques

There are many types of ML techniques that can be applied to different types of data and problems.

  • Supervised learning is the most common method. It uses a labeled sample dataset to train a model and is often used in image recognition, recommendation systems, data analysis, etc. It uses regression techniques for numerical targets and classification techniques for categorical targets. Some of the popular algorithms for supervised learning are linear regression, logistic regression, decision trees, and random forests.
  • Unsupervised learning is used on unlabeled data to group similar data by identifying common patterns or structures. It is often used when there is no labeled dataset available, for example, to group articles of the same type or to build product recommendation groups. It uses clustering techniques such as k-means, mean-shift, DBSCAN, expectation-maximization, and agglomerative hierarchical clustering, as well as association techniques to find relationships between data. Other techniques are anomaly detection to find problematic data and latent variable models for dimensionality reduction.
  • Reinforcement learning trains an agent through feedback from its environment (provided by a human or a machine) in the form of positive and negative scores for every step. A simple example is a chess program that receives positive or negative scores for its moves and gradually learns which strategies win. It is often used in gaming, navigation, and robotics. The most common algorithms are Q-learning, SARSA, DQN, and DDPG.
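To make the reinforcement learning idea more concrete, here is a minimal tabular Q-learning sketch. It assumes a toy environment invented for this example: a corridor of five cells where the agent starts in cell 0, receives a small negative score for every step, and a positive score for reaching cell 4. The constants and function names are illustrative only.

```python
import random

N_STATES = 5                # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9     # learning rate and discount factor (illustrative values)


def step(state, action):
    """Toy environment: move one cell and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # positive score for reaching the goal
    return next_state, -0.1, False      # small negative score for any other step


def train(episodes=300, seed=0):
    """Learn a Q-table from randomly explored episodes."""
    rng = random.Random(seed)
    q = [{action: 0.0 for action in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            # Explore uniformly at random; Q-learning is off-policy, so it can
            # still learn the values of the greedy policy from this behavior.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[next_state].values())
            # Q-learning update: nudge the estimate toward reward + discounted best future value.
            q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy read from the Q-table should prefer moving right in every non-goal cell, since that is the shortest way to the positive score.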

Prerequisites and Recommended Skills

The AWS Certified Machine Learning - Specialty certification is intended for data scientists or professional machine learning developers. This certification focuses on deep aspects of data manipulation and optimal machine learning solution development using AWS services and tools.

There are no prerequisites to take the exam, though this certification requires basic STEM knowledge as well as problem-solving and analytical skills. The skills recommended to be successful in the machine learning field are:

  • Math: This includes solid knowledge of mathematical topics such as linear algebra, matrices, statistics, probability, multivariate calculus, etc.
  • Data Structures: Data is often processed and manipulated via the help of predefined data structures such as matrices, arrays, lists, maps, graphs, etc.
  • Data Management: Data is a vital component of developing machine learning solutions. Data should be prepared using data modeling, data cleaning (imputation, outlier handling, binning, and normalization), and data transformation (between formats such as CSV, streams, and JSON) techniques to achieve optimal results.
  • Algorithms: A machine learning model can be optimized using various mathematical optimization techniques such as convex optimization, coordinate or gradient descent, quadratic programming, etc.
  • Field-Specific Knowledge: This is not a hard requirement, but often machine learning solutions require knowledge of environment-specific skills, for example, knowledge of the pharmaceutical industry to develop machine learning-based drug discovery solutions.
  • Performance: The evaluation of machine learning solutions is mandatory. There are various methods available to evaluate the performance of machine learning solutions such as precision and recall, F1 score, confusion matrix, etc.
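The evaluation metrics named in the last item can be computed directly from a confusion matrix. The sketch below assumes a binary classifier and uses invented label lists; the function names are hypothetical helpers, not a library API.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true positives, false positives, false negatives, and true negatives."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = their harmonic mean."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ground truth and model predictions (1 = positive class).
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print("confusion matrix (tp, fp, fn, tn):", (tp, fp, fn, tn))
print("precision:", precision, "recall:", recall, "F1:", f1)
```

In this made-up sample there are 3 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all come out to 0.75.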

Tech Stack for Certification

AWS offers a wide variety of tools and services for machine learning solution development, including:

  • SageMaker: A fully managed service to build, train, and deploy machine learning models. It offers the built-in SageMaker Studio IDE, notebook instances (Jupyter notebooks), the Ground Truth service to build labeled datasets, automatic model tuning (hyperparameter tuning), built-in algorithms, a debugger, and model monitoring services.
  • Storage: Data storage services include data lakes to store structured or unstructured data, S3 for object (file) storage, DynamoDB as a NoSQL database, Relational Database Service (RDS) for SQL databases, Elastic Block Store (EBS) for block-level storage (like hard drives), and Redshift as a data warehouse.
  • Data Processing: Data can be processed using AWS Glue to extract, transform, and load (ETL) data from various sources, Kinesis (Data Firehose, Data Streams, Data Analytics) for real-time (streaming) data processing, Athena for SQL-based data analysis, QuickSight for visualizations and analysis, and Elastic MapReduce (EMR) for big data frameworks such as Apache Spark, Presto, Hive, etc.
  • Machine Learning Services: AWS includes various machine learning services for different domains, such as Comprehend for language processing (text), Rekognition for image and video analysis (including face detection), Translate for language translation, Lex to create digital assistants, Polly for text-to-speech, and Transcribe for speech-to-text conversion.
  • Other Services: It is recommended to understand the overall AWS and machine learning landscape. Knowledge of services like Identity and Access Management (IAM), Elastic Compute Cloud (EC2), VPC for private networks, Data Pipeline for data transfer, Step Functions, Batch, etc. helps when building a custom or hybrid setup.

Apart from AWS machine learning services, the AWS Certified Machine Learning - Specialty certification also requires the knowledge of other domains such as programming languages, databases, etc.

Certification Process Details

Once you study the training and practice material thoroughly, the final step is to schedule the test. The crucial attributes for the test are:

  • Format: The exam consists of multiple-choice questions (one correct answer) and multiple-response questions (more than one correct answer). Marks are given only if exactly the correct choices are selected.

  • Scores: The passing score is set using statistical analysis (a scaled scoring model) and is subject to change. Points are not given for incorrect answers.

  • Method: The exam can be taken online (proctored exam) or given at a physical test center provided by PSI or Pearson VUE. The benefit of opting for a physical test center is the opportunity to meet other developers and make new connections.

For an online proctored exam, applicants must be able to speak English to communicate with a proctor, who will monitor the testing environment. Online proctored exams are not available for candidates in mainland China, Japan, Slovenia, or South Korea. More details are available here. Find additional information about system requirements and policies here.

Due to COVID-19, test delivery providers have released strict guidelines for safety measures. Follow PSI guidelines here and Pearson VUE guidelines here for testing center availability and safety measures.

  • Time: The duration of the exam is 170 minutes, though it could vary in the future depending on the content.

  • Charges: There is a one-time fee of US$300 for an AWS Certified Machine Learning - Specialty exam, and the practice exam fee is US$40.

  • Beta Program: Amazon runs a beta program for new certifications and for exams with changes to their outline. Early access is available to a limited number of candidates (on a first come, first served basis), who can take the beta exam as well as the stable exam once it’s out of beta. This allows applicants to take the exam twice without any additional fee. The beta program also provides the benefit of 50% off the standard exam pricing.

  • Additional Details: The exam can be rescheduled up to 24 hours before the scheduled exam time; otherwise there will be no refund and the next exam can be scheduled only after 24 hours. In case of unsuccessful attempts, the next exam can be scheduled after 14 days with the same fee, though you can use vouchers to retake the exam.

The AWS Certified Machine Learning - Specialty certification is valid for three years. The certificates will be available within five working days after a positive exam result.

Details about content and pricing vary, so make sure to verify it here.

Job Market

Machine learning is one of the most in-demand skills in the job market, and the AWS Certified Machine Learning - Specialty certification qualifies you for various positions such as Data Scientist and Machine Learning Engineer.

  • According to payscale.com and LinkedIn, the average pay for a Machine Learning Engineer is between US$111,297 and US$132,000.

  • There were more than 9,000 jobs posted on LinkedIn and more than 3,000 on indeed.com as of June 2020.

Pluralsight Resources

Pluralsight offers great resources on AWS Certified Machine Learning Specialty. The curated learning paths are:

Pluralsight’s Role IQ and Skill IQ are also great resources for measuring your skill level; based on that analysis, they recommend learning opportunities to fill the gaps and reach the next level:

Conclusion

  • The primary focus of the AWS Certified Machine Learning - Specialty certification is on the overall architecture of machine learning solutions and the optimal approach for specific problems.
  • Machine learning solutions work great with highly structured and clean data. Kaggle is a great resource to explore and use different datasets for practice.
  • Closely study the AWS whitepapers, which are quite important for the exam.
  • Make specific and structured notes for the last-minute review.

Hopefully, this guide explained the necessary details to get started with the AWS Certified Machine Learning - Specialty certification. Good luck with your certification.
