
# Building Statistical and Mathematical Models with R

This skill explores advanced mathematical and statistical models and their implementation in the R language. These modeling approaches are relevant to machine learning and data science.

## What You Will Learn

- Implementation of Numerical Methods with R
- Understanding Linear Algebra implementation with R
- Applying Mathematical Mass Balance Models (integration, steady state, least squares)
- Solving Differential Equation Problems
- Building Linear Inverse Models with R

### Pre-requisites

- Basic Mathematics
- Basic Statistics
- R Programming Fundamentals
- R Functions

### Beginner

Understand the Conceptual framework of Statistical and Mathematical Models.

#### Understanding Statistical Models and Mathematical Models

2h 36m

##### Description

Data science and data modeling are fast emerging as crucial capabilities for every enterprise and every technologist, and it is important to choose the type of model most appropriate to your use case. In this course, Understanding Statistical Models and Mathematical Models, you will gain the ability to differentiate between mathematical models and statistical models and pick the right type of model for your scenario.

First, you will learn the important characteristics of mathematical and statistical models and their applications. Next, you will discover how classic mathematical models find wide applicability in solving differential equations and modeling deterministic systems.

Then, you will learn how statistical models are great for modeling systems with randomness, using business-based use cases from risk management and Monte Carlo simulations. Finally, you will round out your knowledge by performing hypothesis testing using t-tests and z-tests on real-world data.

When you’re finished with this course, you will have the skills and knowledge to use powerful techniques from both mathematical and statistical modeling, including solving simple ordinary differential equations, using simulated annealing and classic hill climbing, and performing hypothesis testing with statistical tests such as the t-test.
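As a taste of the optimization techniques mentioned above, here is a minimal sketch of simulated annealing in base R, using `optim()` with `method = "SANN"` (the objective function is an illustrative example, not one taken from the course):

```r
# Minimize a bumpy 1-D function with simulated annealing (base R optim, method = "SANN").
# The stochastic search occasionally accepts worse moves, helping it escape local minima.
f <- function(x) (x - 2)^2 + 3 * sin(5 * x)   # illustrative function with many local minima

set.seed(42)
fit <- optim(par = 0, fn = f, method = "SANN",
             control = list(maxit = 20000, temp = 10))
fit$par    # candidate minimizer found by the stochastic search
fit$value  # objective value at that point
```

A plain hill climber would stop at the first local minimum it reached; the temperature schedule is what lets `SANN` keep exploring.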

##### Table of contents

- Course Overview
- Understanding Statistical and Mathematical Models
- Case Studies on Statistical and Mathematical Models
- Applying Mathematical Models in R
- Applying Statistical Models in R

#### Solving Problems with Numerical Methods

3h 40m

##### Description

The growth in computing power means that problems that were once hard to solve can now be tackled using numerical techniques. These are algorithms that seek numerical approximations to mathematical problems rather than relying on symbolic manipulation, i.e. fitting a closed-form formula. Symbolic manipulation is often very hard and may not always be tractable. Numerical analysis, on the other hand, allows us to give approximate answers to hard problems such as predicting the weather, computing the trajectory of a spacecraft, and setting prices for goods in real time. In this course, Solving Problems with Numerical Methods, you will explore a wide variety of numerical techniques for different kinds of problems and learn how to apply them using the R programming language.

First, you will learn how numerical methods differ from analytical methods and why it is important to be able to solve problems using numerical procedures. You will work with direct and iterative numerical techniques to solve a system of linear equations, and perform interpolation and extrapolation using a variety of methods.

Next, you will discover how graphs can be represented and how graph algorithms are applied in the real world. You will then move on to local search techniques to solve the N-queens problem, studying variants of classic local search such as stochastic local search, simulated annealing, and threshold accepting algorithms. These techniques allow locally bad moves in order to avoid getting stuck in local optima.

Finally, you will learn how to formulate a linear programming problem by setting up your objective, constraints, and decision variables, and then implement a solution using R utilities. You will round off this course by understanding and implementing differentiation and integration in R.
When you’re finished with this course, you will have the skills and knowledge to apply a variety of numerical procedures to solve mathematical problems using the R programming language.
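Several of the numerical building blocks covered here are available directly in base R; a minimal sketch with illustrative inputs:

```r
# Direct solution of a linear system A x = b
A <- matrix(c(2, 1, 1, 3), nrow = 2)  # filled column by column: rows (2,1) and (1,3)
b <- c(3, 5)
x <- solve(A, b)                      # x = (0.8, 1.4)

# Linear interpolation between known points
pts <- approx(x = c(0, 1, 2), y = c(0, 1, 4), xout = 1.5)

# Numerical integration of f(x) = x^2 on [0, 1] (exact value is 1/3)
area <- integrate(function(x) x^2, lower = 0, upper = 1)

# Central-difference approximation of the derivative of sin at 0 (exact value is 1)
h <- 1e-6
deriv_est <- (sin(0 + h) - sin(0 - h)) / (2 * h)
```

`solve()` is a direct method; iterative solvers and more elaborate interpolation schemes come from add-on packages.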

##### Table of contents

- Course Overview
- Understanding Numerical Methods
- Applying Numerical Methods to Solve Problems
- Working with Graphs Using Numerical Techniques
- Implementing Local Search and Optimizations
- Implementing Integration and Differentiation

### Intermediate

Implement Dimension Analysis, Differential Equations, Linear Algebra and Mathematical MASS models using R.

#### Applying the Mathematical MASS Model with R

2h 20m

##### Description

Before machine learning and Python made statistics a subject of MASS popular appeal, an entire generation of applied statisticians learned their craft from the famous textbook named “Modern Applied Statistics with S” by Venables and Ripley. The “S” referred to in the book’s title is the precursor of the R statistical software, which is so popular and effective for statistical analysis. The influence of this seminal work is so strong, that R actually contains a package named MASS, an acronym for the book’s title.

In this course, Applying the Mathematical MASS Model with R, you will gain the ability to use the datasets, predictive models, and specialized functions available in the MASS package in R.

First, you will learn how the classic t-test can be used in a variety of common scenarios around estimating means and also learn about using ANOVA, a powerful statistical technique used to measure statistical properties across different categories of data. This exploration will involve variants of the t-test such as one-sample and two-sample t-tests, as well as one-way ANOVA, which is used to compare means of a target variable across different groups, based on the value of a single categorical variable.
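A minimal sketch of these two tests in base R, using simulated data and the built-in `PlantGrowth` dataset as illustrative inputs:

```r
# One-sample t-test: is the mean of this sample different from 5?
set.seed(1)
x <- rnorm(30, mean = 5.3, sd = 1)
t.test(x, mu = 5)

# One-way ANOVA: does mean weight differ across the three groups in PlantGrowth?
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)  # F statistic and p-value for the group effect
```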

Next, you will discover three powerful techniques in data analysis, namely linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and robust regression. LDA and QDA are classification techniques that both seek to re-orient the original data using new, optimized axes such that points belonging to different classes lie as far apart as possible. QDA is preferable to LDA when the x-variables that correspond to different y-variable values have differing covariances. MASS includes support for three powerful robust regression techniques, Huber, Bisquare, and Hampel; each of these is a useful way to fit a regression model even when data is heavily contaminated by outliers.
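A rough sketch of LDA and Huber robust regression with the MASS package (the `iris` and `stackloss` datasets are illustrative choices, not necessarily the course's examples):

```r
library(MASS)  # ships with standard R distributions

# Linear discriminant analysis on the iris data
lda_fit <- lda(Species ~ ., data = iris)
pred <- predict(lda_fit, iris)$class
mean(pred == iris$Species)  # in-sample accuracy

# Robust regression: rlm defaults to Huber's M-estimator,
# which down-weights observations with large residuals
rob_fit <- rlm(stack.loss ~ ., data = stackloss)
coef(rob_fit)
```

Passing `psi = psi.bisquare` or `psi = psi.hampel` to `rlm()` switches to the other two weighting schemes.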

Finally, you will explore how to model complex systems using multi-state models, which represent the result of a stochastic process as a succession of states. You will understand the differences - and similarities - between transition probabilities and transition intensities, and then apply all of that knowledge to a special class of multi-state models: survival models. Such models find wide applications in medical domains such as modeling outcomes of different treatment regimens, and you will learn how to do so, and also how to model hazard rates and survival probabilities. When you’re finished with this course, you will have the skills and knowledge of several specialized statistical techniques that are featured in the MASS library in R.

##### Table of contents

- Course Overview
- Performing Qualitative and Quantitative Analysis Using MASS Datasets
- Performing Predictive Analytics Using MASS Models
- Implementing Multi-state Models in R

#### Applying Differential Equations and Inverse Models with R

2h 23m

##### Description

Differential equations are a topic rich in history (several important results date back to the 18th and 19th centuries), but their importance is not confined to the history books. Differential equations still have wide and varied applications: did you know, for instance, that the famous S-curve, which we often obtain using logistic regression, can also be obtained by solving a differential equation? Likewise, the Black-Scholes equation, which lies at the foundation of modern quantitative finance, can be solved conveniently by conversion to the heat equation.

In this course, Applying Differential Equations and Inverse Models with R, you will explore a wide variety of differential equations, as well as a distinct technique known as inverse modeling, and learn how you can apply these techniques using the R programming language.

First, you will learn how many different physical, chemical, and financial phenomena can be modeled using Differential Equations. You will see how population growth, the spread of infectious diseases, the pricing of complex financial derivatives, and the equilibrium in a chemical reaction can all be modeled using Differential Equations.
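As an illustration, logistic population growth can be solved numerically with the CRAN package `deSolve` (the parameter values here are made up for the example):

```r
# Logistic population growth dN/dt = r * N * (1 - N / K), solved with deSolve.
# deSolve is a CRAN package: install.packages("deSolve") if needed.
library(deSolve)

logistic <- function(t, state, parms) {
  with(as.list(c(state, parms)), {
    dN <- r * N * (1 - N / K)
    list(dN)
  })
}

out <- ode(y = c(N = 10), times = seq(0, 50, by = 0.5),
           func = logistic, parms = c(r = 0.2, K = 1000))
tail(out)  # N approaches the carrying capacity K, tracing the familiar S-curve
```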

Next, you will discover how different types of differential equations are modeled and solved numerically. You will see how a mix of algebraic and differential equations forms a system known as a DAE, or Differential Algebraic Equation; and how a time-varying relationship between the dependent and independent variables can be modeled using Delay Differential Equations.

Finally, you will explore how to solve both initial value and boundary value differential equations. You will see how the temperature varies with time in a rod that is being heated by a heat source, has one end insulated, and has the other end exposed to the atmosphere. This use case might seem arcane, but it is the famous diffusion equation, which is also the basis of the Black-Scholes PDE from quantitative finance. You will round off this course by understanding even-determined, under-determined, and over-determined systems, and working with such systems in R.

When you’re finished with this course, you will have the skills and knowledge to apply a variety of numerical procedures to solve differential equations using the R programming language.

##### Table of contents

- Course Overview
- Getting Started with Differential Equations
- Understanding Types of Differential Equations
- Solving Differential Equations
- Understanding and Applying Linear Inverse Models

#### Performing Dimension Analysis with R

2h 11m

##### Description

Dimensionality Reduction is a powerful and versatile unsupervised machine learning technique that can be used to improve the performance of virtually every ML model. Using dimensionality reduction, you can significantly speed up model training and validation, saving both time and money, as well as greatly reducing the risk of overfitting.

In this course, Performing Dimension Analysis with R, you will gain the ability to design and implement an exhaustive array of feature selection and dimensionality reduction techniques in R. First, you will learn the importance of dimensionality reduction and understand the pitfalls of working with data of excessively high dimensionality, often referred to as the curse of dimensionality. Next, you will discover how to implement simple feature selection techniques to decide which subset of the existing features to use while losing as little information from the original, full dataset as possible.

You will then learn important techniques for reducing dimensionality in linear data. Such techniques, notably Principal Components Analysis and Linear Discriminant Analysis, seek to re-orient the original data using new, optimized axes. The choice of these axes is driven by numeric procedures such as Eigenvalue and Singular Value Decomposition.
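A minimal sketch of PCA in base R, showing the link to the eigendecomposition (the `iris` data is an illustrative stand-in):

```r
# PCA on the numeric columns of iris; prcomp uses singular value decomposition internally
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
summary(pca)        # proportion of variance explained per component
head(pca$x[, 1:2])  # data re-expressed on the first two principal axes

# The same axes can be obtained from the eigendecomposition of the correlation matrix
ev <- eigen(cor(iris[, 1:4]))
ev$values           # eigenvalues = variances along each principal axis
```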

You will then move to dealing with manifold data, which is non-linear and often takes the form of Swiss rolls and S-curves. Such data presents an illusion of complexity but is actually easily simplified by unrolling the manifold.

Finally, you will explore how to implement a wide variety of manifold learning techniques including multi-dimensional scaling (MDS), Isomap, and t-distributed Stochastic Neighbor Embedding (t-SNE). You will round out the course by comparing the results of these manifold unrolling techniques with artificially generated data. When you are finished with this course, you will have the skills and knowledge of Dimensionality Reduction needed to design and implement ways to mitigate the curse of dimensionality in R.
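Of the techniques listed, classical MDS is available in base R via `cmdscale()`; Isomap and t-SNE require CRAN packages. A minimal MDS sketch on illustrative data:

```r
# Classical multi-dimensional scaling (MDS) with base R:
# embed points into 2-D while preserving pairwise distances as well as possible
d <- dist(scale(iris[, 1:4]))  # pairwise Euclidean distances on standardized data
emb <- cmdscale(d, k = 2)      # 2-D coordinates for each observation
plot(emb, col = iris$Species, pch = 19,
     xlab = "MDS 1", ylab = "MDS 2")
```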

##### Table of contents

- Course Overview
- Understanding the Importance of Reducing Complexity in Data
- Performing Dimensional Analysis for Continuous Data
- Performing Dimensional Analysis for Categorical Data
- Performing Dimensional Analysis for Non-linear Data

#### Applying Linear Algebra with R

1h 15m

##### Description

Would you like to better understand the basics of linear algebra so that you can better understand the techniques used in regression and machine learning? In this course, Applying Linear Algebra with R, you will learn foundational knowledge to understand what is going on in predictive models, how to extract important information from large data sets, and the basics of linear regression in R. First, you will learn basic matrix arithmetic. Next, you will discover advanced matrix mathematics that will help build your foundation. Finally, you will explore how to put this math together into real world applications. When you are finished with this course, you will have the skills and knowledge of Linear Algebra in R needed to better implement basic machine learning techniques and springboard into more advanced topics like generalized linear models.
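The matrix operations described above map directly onto base R operators; a minimal sketch:

```r
A <- matrix(c(2, 0, 1, 3), nrow = 2)  # filled column by column: rows (2,1) and (0,3)
B <- diag(2)                          # 2x2 identity matrix

A %*% B           # matrix multiplication
t(A)              # transpose
solve(A)          # inverse
solve(A, c(4, 6)) # solve A x = b directly, without forming the inverse
eigen(A)          # eigenvalues and eigenvectors
```

For least squares, `lm()` (or `qr.solve()`) handles over-determined systems without explicit matrix inversion.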

##### Table of contents

- Course Overview
- Working with Vectors and Matrices in R
- Understanding Operations on Matrices
- Getting Weird: Inverting, Transposing, and Row Equivalence
- Solving Linear Equations
- Understanding and Calculating Eigenvalues and Eigenvectors
- Calculating the kth Item in a Series
- Implementing Matrix Decomposition
- Using Least Squares Calculations

### Advanced

Learn how to create statistical summaries and implement the Monte Carlo method using R.

#### Building Statistical Summaries with R

3h 2m

##### Description

The tools of machine learning (algorithms, solution techniques, and even neural network architectures) are becoming commoditized. Everyone is using the same tools these days, so your edge needs to come from how well you adapt those tools to your data. Today, more than ever, it is important that you really know your data well.

In this course, Building Statistical Summaries with R, you will gain the ability to harness the full power of inferential statistics, which are truly richly supported in R.

First, you will learn how hypothesis testing, which is the foundation of inferential statistics, helps posit and test assumptions about data. Next, you will discover how the classic t-test can be used in a variety of common scenarios around estimating means. You will also learn about related tests such as the Z-test, the Pearson’s Chi-squared test, Levene’s test and Welch’s t-test for dealing with populations that have unequal variances.
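A brief sketch of two of these tests in base R on simulated data (Welch's t-test is R's default two-sample test; Levene's test itself lives in the CRAN package `car`, so the variance comparison below uses base R's `var.test` instead):

```r
set.seed(7)
a <- rnorm(40, mean = 10, sd = 1)
b <- rnorm(40, mean = 10.6, sd = 3)

# Welch's t-test: R's default (var.equal = FALSE) handles unequal variances
t.test(a, b)

# Compare the two variances; var.test assumes normality, unlike Levene's test
var.test(a, b)
```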

Finally, you will round out your knowledge by using ANOVA, a powerful statistical technique used to measure statistical properties across different categories of data. Along the way, you will explore several variants of ANOVA, including one-way, two-way, Kruskal-Wallis, and Welch’s ANOVA.

You will build predictive models using linear regression and classification, and finally you will understand A/B testing, implementing both the frequentist and the Bayesian approaches to this incredibly powerful technique.

When you’re finished with this course, you will have the skills and knowledge to use powerful techniques from hypothesis testing, including t-tests, ANOVA and Bayesian A/B testing in order to measure the strength of statistical relationships within your data.

##### Table of contents

- Course Overview
- Understanding Statistical Summaries
- Solving Problems Using Statistical Inference
- Implementing Statistical Models
- Implementing Bayesian A/B Testing

#### Implementing Bootstrap Methods in R

2h 10m

##### Description

Perhaps the most common type of problem in statistics involves estimating some property of a population, and quantifying how confident you can be in that estimate. Indeed, the very name of the field, statistics, derives from the word statistic, which is a property of a sample; using that statistic, you wish to estimate the parameter, which is the same property for the population as a whole.

Now if the property you wish to estimate is a simple one - say the mean - and if the population has nice, and known properties - say it is normally distributed - then this problem is often quite easy to solve. But what if you wish to estimate a very complex, arcane property of a population about which you know almost nothing? In this course, Implementing Bootstrap Methods in R, you will explore an almost magical technique known as the bootstrap method, which can be used in exactly such situations.

First, you will learn how the Bootstrap method works and how it basically relies on collecting one sample from the population, and then subsequently re-sampling from that sample - exactly as if that sample were the population itself - but crucially, doing so with replacement. You will learn how the Bootstrap is a non-parametric technique that almost seems like cheating, but in fact, is both theoretically sound as well as practically robust and easy to implement.
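A minimal base R sketch of this resampling-with-replacement idea, bootstrapping the sample median (the data here is simulated for illustration):

```r
# Non-parametric bootstrap of the sample median, using only base R
set.seed(123)
sample_data <- rexp(50, rate = 1)  # one observed sample from an assumed population

boot_medians <- replicate(5000, {
  resample <- sample(sample_data, replace = TRUE)  # resample WITH replacement
  median(resample)
})

# Percentile confidence interval for the population median
quantile(boot_medians, c(0.025, 0.975))
```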

Next, you will discover how different variations of the bootstrap approach mitigate specific problems that can arise when using this technique. You will see how the conventional Bootstrap can be tweaked so that it fits into a Bayesian approach that goes one step beyond giving us just confidence intervals and actually yields likelihood estimates. You will also see how the smooth bootstrap is equivalent to the use of a Kernel Density Estimator and helps smooth out outliers from the original sample.

Finally, you will explore how regression problems can be solved using the bootstrap method. You will learn the specific advantages of the bootstrap - for instance in calculating confidence intervals around the R-squared, which is something that is quite difficult to do using conventional parametric methods. You will explore two variants of the bootstrap method in the context of regression - case resampling and residual resampling, and understand the different assumptions underlying these two approaches.
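A rough sketch of case resampling for an R-squared confidence interval, using base R and simulated data:

```r
# Case-resampling bootstrap of R-squared for a linear model (base R)
set.seed(99)
df <- data.frame(x = runif(60))
df$y <- 2 * df$x + rnorm(60, sd = 0.5)

r2_boot <- replicate(2000, {
  idx <- sample(nrow(df), replace = TRUE)  # resample whole (x, y) cases together
  summary(lm(y ~ x, data = df[idx, ]))$r.squared
})

quantile(r2_boot, c(0.025, 0.975))  # bootstrap CI for R-squared
```

Residual resampling would instead fit the model once, then resample and re-add the residuals, which assumes the model form is correct and the errors are exchangeable.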

When you’re finished with this course, you will have the skills and knowledge to identify situations where the bootstrap method can be used to estimate population parameters along with appropriate confidence intervals, as well as to implement statistically sound bootstrap algorithms in R.

##### Table of contents

- Course Overview
- Getting Started with Bootstrapping in R
- Implementing Bootstrap Methods for Summary Statistics
- Implementing Bootstrap Methods for Regression Models

#### Implementing Monte Carlo Method in R

1h 42m

##### Description

Repeated sampling using the Monte Carlo method can be a much more efficient approach to solving difficult problems than standard mathematical or statistical techniques. In this course, Implementing Monte Carlo Method in R, you’ll gain the ability to build your own Monte Carlo simulations using a variety of approaches and know which solution is most effective. First, you’ll explore the basics behind Monte Carlo and the fundamental functions in R. Next, you’ll discover some simple methods, followed by simulations on stock and commodities data for estimating return probabilities. Finally, you’ll learn how to use Monte Carlo methods on A/B tests. When you’re finished with this course, you’ll have the skills and knowledge of Monte Carlo methods needed to implement these methods yourself.
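As a flavor of the method, here is a minimal base R sketch that estimates a simple probability by repeated sampling (the dice example is illustrative, not from the course):

```r
# Monte Carlo estimate of a probability: chance that the sum of two dice exceeds 9
set.seed(2024)
n <- 1e5
rolls <- replicate(n, sum(sample(1:6, 2, replace = TRUE)))
mean(rolls > 9)  # converges to the exact value 6/36 as n grows
```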

##### Table of contents

- Course Overview
- Understanding Monte Carlo Basics
- Making Predictions with Monte Carlo
- Using Monte Carlo for Value at Risk
- Utilizing MC for A/B Testing