Course info
Rating: (25)
Level: Beginner
Updated: Jun 22, 2018
Duration: 1h 43m
Description

As machine learning and deep learning techniques become popular, getting a dataset into the right numeric form and engineering the right features to feed into ML models becomes critical. In this course, Working with Multidimensional Data Using NumPy, you'll learn the simple and intuitive functions and classes that NumPy offers to work with data of high dimensionality. First, you will get familiar with basic operations for exploring multidimensional data, such as creating and printing arrays and performing basic mathematical operations on them. You'll study indexing and slicing of array data and iterating over arrays, and see how images are essentially 3D arrays that can be manipulated with NumPy. Next, you will move on to complex indexing functions: NumPy arrays can be indexed with Boolean conditions as well as with arrays of indices. You'll then see how broadcasting rules work, which allow NumPy to perform operations on arrays with different shapes, and study array operations such as np.argmax(), which are very common when working with ML problems. Finally, you'll study how NumPy integrates with other libraries in the PyData stack, covering specific implementations with SciPy and with Pandas. At the end of this course, you will be comfortable using the array manipulation techniques that NumPy has to offer to get your data into the right form for extracting insights.

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Section Introduction Transcripts

Course Overview
Hi. My name is Janani Ravi and welcome to this course on Working with Multidimensional Data Using NumPy. A little about myself. I have a master's degree in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google I was one of the first engineers working on real-time collaborative editing in Google Docs and I hold four patents for its underlying technologies. I am currently working on my own startup, Loonycorn, a studio for high-quality video content. In this course we learn the simple and intuitive functions and classes that NumPy offers to work with data of high dimensionality. We start off with basic operations to explore multidimensional data such as creating, printing, and performing basic mathematical operations on arrays. We study indexing and slicing of array data and iterating over these arrays. We'll see how images are basically three-dimensional arrays and how they can be manipulated with NumPy. We then move on to complex indexing functions. NumPy arrays can be indexed with Boolean conditions as well as arrays of indices. We'll then see how broadcasting rules work. These allow NumPy to perform operations on arrays with different shapes. We'll study array operations such as np.argmax, which are very commonly used when working with ML problems. Then we'll move on to studying how NumPy integrates with other libraries in the PyData stack. We cover specific implementations with SciPy as well as with Pandas. At the end of this course you will be very comfortable using the array manipulation techniques that NumPy has to offer to get your data in the right form for extracting insights.

Exploring Multidimensional Data Using NumPy
Hi and welcome to this course on Working with Multidimensional Data Using NumPy. If you've used the Python programming language for data analysis or for machine learning algorithms, chances are you've used NumPy. NumPy is a very cool scientific computing package used in all kinds of data analytics. NumPy supports a wide variety of mathematical computations such as linear algebra, Fourier transforms, random number generation, and so on. The basic building block used by NumPy is a very powerful n-dimensional array. In addition to the ability to represent these n-dimensional arrays, the NumPy package also contains a whole suite of operations that can be performed very efficiently on these arrays. There is a suite of open source software available in the Python programming language for math, science, and engineering, and NumPy forms the core of this suite. Many of the other packages, such as Pandas, SciPy, Statsmodels, etc., are built on top of NumPy. In addition to multidimensional array representations, NumPy offers a large collection of high-level mathematical functions which can operate on these arrays.
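To make the n-dimensional array idea concrete, here is a minimal sketch of the kind of basic operations this module describes. The array values and variable names are made up for illustration and are not taken from the course material.

```python
import numpy as np

# Create a 2-D array (2 rows, 3 columns) from nested Python lists.
a = np.array([[1, 2, 3],
              [4, 5, 6]])
print(a.shape)  # (2, 3)
print(a.ndim)   # 2

# Arithmetic is applied element by element.
doubled = a * 2

# Aggregations can run over the whole array or along one axis.
total = a.sum()              # sum of all six elements
col_means = a.mean(axis=0)   # mean of each column

# Slicing: every row, last two columns.
last_two = a[:, 1:]

# Linear algebra: matrix product of a (2x3) with its transpose (3x2).
gram = a @ a.T

# An image can be held as a 3-D array: height x width x color channels.
# This is a hypothetical 2x3 RGB image, not a real picture.
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[:, :, 0] = 255         # set the red channel of every pixel
print(image.shape)           # (2, 3, 3)
```

The same slicing syntax works for any number of dimensions, which is why an image can be cropped or have a channel modified with one indexing expression.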

Complex Indexing Using NumPy
Hi, and welcome to this module on Complex Indexing Using NumPy. Indexing involves accessing specific elements within an array and NumPy gives you a wide variety of ways in which these elements can be accessed. NumPy has this really cool feature where you can access specific elements by specifying Boolean conditions. You specify the condition, it will generate a Boolean array, and this Boolean array can be used to index into another array. We tend to use NumPy when we are working with numeric data; however, NumPy arrays can also be used to store structured data. This can be thought of as a precursor to DataFrames as in Pandas. A very powerful feature that NumPy offers is the idea of broadcasting. Broadcasting allows you to work with scalars and arrays and with mismatched arrays provided they match certain broadcasting rules.
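The ideas in this module can be sketched in a few lines. The example below is an illustration under assumed data (the numbers, names, and field layout are hypothetical), showing Boolean indexing, indexing with an array of indices, a structured array, and broadcasting.

```python
import numpy as np

values = np.array([3, 8, 1, 9, 4])

# Boolean indexing: the condition produces a Boolean array,
# which is then used to select the matching elements.
mask = values > 3
large = values[mask]

# Indexing with a list (or array) of indices picks those positions.
picked = values[[0, 2, 4]]

# np.argmax returns the position of the largest element.
best = np.argmax(values)

# A structured array stores named, typed fields per record.
# The names and ages are made-up sample records.
people = np.array(
    [("Ann", 31), ("Bob", 25)],
    dtype=[("name", "U10"), ("age", "i4")],
)
ages = people["age"]

# Broadcasting: a scalar is stretched to match the array's shape.
matrix = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
plus_ten = matrix + 10

# Broadcasting: a length-3 row is applied to each row of a (2, 3) array.
row = np.array([100, 200, 300])
shifted = matrix + row

# Shapes that do not satisfy the broadcasting rules raise a ValueError.
try:
    matrix + np.array([1, 2])         # (2, 3) with (2,) is incompatible
    compatible = True
except ValueError:
    compatible = False
```

Two shapes are compatible when, comparing dimensions from the last one backwards, each pair is either equal or one of them is 1. That is why the length-3 row works against a (2, 3) array and the length-2 array does not.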

Leveraging Other Python Libraries with NumPy
Hi and welcome to this module on Leveraging Other Python Libraries with NumPy. NumPy is closely integrated with many other Python libraries used for scientific computing and data analysis such as SciPy and Pandas. We'll see an example of how we can use NumPy arrays to work with the interpolation function in the SciPy library. We'll then use NumPy functions along with the Pandas library to analyze the Titanic dataset.
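As a rough illustration of passing NumPy arrays to SciPy, the sketch below assumes the interpolation function in question is scipy.interpolate.interp1d; the transcript does not name the exact function, and the sample points are made up.

```python
import numpy as np
from scipy.interpolate import interp1d

# Known sample points (hypothetical data): y = x squared at x = 0, 1, 2, 3.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = x ** 2

# interp1d returns a callable that interpolates between the samples.
# The default kind is linear interpolation.
f = interp1d(x, y)

# Estimate y at x = 1.5, halfway between the points (1, 1) and (2, 4).
estimate = float(f(1.5))

# NumPy's own np.interp performs the same linear interpolation.
estimate_np = float(np.interp(1.5, x, y))
```

Both calls take plain NumPy arrays as input, which is the integration point the module refers to. For simple linear interpolation, np.interp avoids the extra dependency.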