Automating the Web Using PhantomJS and CasperJS

Learn to use PhantomJS and CasperJS to automate your interaction with the web to perform numerous tasks such as data scraping, network monitoring, page rendering, and browser testing in a programmatic manner.
Course info
Rating: (50)
Level: Intermediate
Updated: May 10, 2016
Duration: 58m
Description

When done manually, data scraping, monitoring, and testing are labor-intensive and time-consuming. This course, Automating the Web Using PhantomJS and CasperJS, teaches web developers various ways that PhantomJS (a "headless" scriptable web browser) and CasperJS (a utility wrapper around PhantomJS) can be utilized to automate these kinds of interactions with websites. First, you'll learn how to programmatically scrape target information from a webpage by creating a navigation script, allowing you to automatically gather the links that would be tested using your toolset. Next, you will implement a script to visit all the links that are gathered and capture screenshots from them, afterwards building a script that would monitor a page's network activity to check the state of the resources for catching potential failures. Finally, you will implement a testing suite to check the markup of a given web page against a few accessibility requirements. By the end of this course, you'll understand how to use PhantomJS and CasperJS to automate these tasks in order to save yourself time and effort.

About the author

Engin Arslan is a front end developer with a Bachelor of Science in Materials Engineering and a Postgraduate Degree in Visual Effects. Before becoming a front end developer, he worked as a visual effects artist / technical director on films and TV shows including Resident Evil, Tron, Mama, Pompeii, Vikings, and Strain. He received an Emmy nomination and won a Canadian Screen Award for his Visual Effects work on the TV show Vikings. During his time in VFX, he fell in love with Python and with programming in general. As a result he changed careers to be able to immerse himself completely in software development. Engin currently works at Myplanet, a Toronto-based digital services company, where he helps develop solutions for clients ranging from Fortune 500 companies to top technology brands. He also works at Seneca College as a part-time professor.

Section Introduction Transcripts

Course Overview
Hello everyone. My name is Engin Arslan, and welcome to my course, Automating the Web Using PhantomJS and CasperJS. I'm a front-end developer based in Toronto, Canada. Using a headless browser, such as PhantomJS, allows for lots of different opportunities in interacting with the web. Users can query and scrape data, monitor network activity, capture screenshots, and perform functional testing of web pages and web applications in a programmatic manner. This course will teach you about PhantomJS and CasperJS, which is a navigation scripting and testing utility for PhantomJS. We will be focusing mostly on CasperJS, as it makes using PhantomJS much easier and less error-prone. You will learn about the kinds of tools that can be created with this technology, and walk through example projects that demonstrate real-life use cases. By the end of this course, you will know how to utilize a headless browser in your own web development workflow. Before beginning the course, you need to be comfortable with JavaScript, so it is recommended to visit the introductory JavaScript titles in the Pluralsight library if you need a refresher. Familiarity with npm, the Node Package Manager, is also recommended but not mandatory. Thank you, and enjoy the course.

Project Introduction & Data Collection
In our first example, we will be building a navigation and scraping script that programmatically manipulates a web page and fetches the desired data from it using a given CSS selector. In particular, we will visit the Google home page, submit a search query, and fetch the links from the first page of the results using CasperJS. This is a specific example, but you can think of the script as a template to query any data from a web page using a CSS selector. I'm using google.com as the target URL for this example since it's readily available, and also because it will present an interesting challenge when using a headless browser that is worth talking about. Let's get started. Starting out with CasperJS is easy: we just need to instantiate a Casper object from the casper module. PhantomJS, and hence CasperJS, uses the CommonJS module syntax that you might know from Node.js. It also uses some libraries that are named similarly to those of Node.js, such as fs, which is a module for file system operations. We have even used Node's package manager to install both PhantomJS and CasperJS. It is important to note, however, that PhantomJS is not Node.js-based; it's a Qt WebKit-based library that has a JavaScript runtime. Note also that Node.js is a JavaScript runtime itself. Don't let the similarities trick you into believing that they are the same thing.
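As a rough illustration of the selector-based scraping described above, the following minimal sketch opens a page and prints the href of every element matching a CSS selector. The target URL and the selector are placeholders rather than the values used in the course, and the search-form submission step is left out. The wiring only runs under PhantomJS/CasperJS, where the global `phantom` object exists.

```javascript
// Sketch only: the URL and selector are placeholders, not the course's values.
var TARGET_URL = 'https://example.com/'; // hypothetical target page
var LINK_SELECTOR = 'a[href]';           // hypothetical CSS selector

// Runs inside the page context (via casper.evaluate), so it can only use
// the DOM and the arguments passed to it.
function extractHrefs(selector) {
  var nodes = document.querySelectorAll(selector);
  return Array.prototype.map.call(nodes, function (node) {
    return node.getAttribute('href');
  });
}

// Only wire things up when running under PhantomJS/CasperJS.
if (typeof phantom !== 'undefined') {
  // Instantiate a Casper object from the casper module.
  var casper = require('casper').create();

  casper.start(TARGET_URL, function () {
    var links = this.evaluate(extractHrefs, LINK_SELECTOR);
    this.echo(JSON.stringify(links, null, 2));
  });

  // With no callback, run() exits the process once all steps have finished.
  casper.run();
}
```

Keeping the extraction logic in a standalone function makes it easy to swap in a different selector or to reuse the function for other pages.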

Page Rendering
In this third module of the course, we will be building a script that will get a list of URLs from a JSON file and visit every page in the list to capture screenshots in different screen sizes. By capturing screenshots in different screen sizes, we will essentially be documenting the responsive behavior of the visited pages. This is valuable not only for assessing the visuals, but also for assuring the accessibility of a page, as responsiveness is an important factor in accessibility. I will then talk a bit about visual regression testing and why you might want to utilize it in your own web development pipeline. As you know at this point, we start off our Casper scripts by creating a Casper object. If I were to execute the script from the command line, though, it would hang, since CasperJS or PhantomJS scripts require an exit method to be called for process termination. I will also create an array that contains the viewport sizes that I would like to scale my screen to. I will be using the viewport breakpoints that the popular front-end framework Bootstrap uses. And I will create another array for the pages that we would like to visit. I will eventually refactor the script so that we can use a JSON file for the configuration of parameters such as the desired viewport sizes or the URLs that we would like our script to visit. Just to make sure everything is working, I will add console.log statements to my script to print out the entered data to the screen. Perfect. Now that we have the foundation for our script ready, we can start building the actual.
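The two arrays and the capture loop described above might look something like the sketch below. The viewport widths are loosely modeled on Bootstrap 3's breakpoints, and the heights, page URLs, and output file names are all assumptions made for illustration; none of them come from the course itself.

```javascript
// Assumed values: widths follow Bootstrap 3 breakpoints, heights are arbitrary.
var viewports = [
  { name: 'small',  width: 768,  height: 1024 },
  { name: 'medium', width: 992,  height: 768 },
  { name: 'large',  width: 1200, height: 800 }
];
var pages = ['https://example.com/', 'https://example.org/']; // hypothetical

// Turn a URL into a file-system-friendly name.
function slugify(url) {
  return url
    .replace(/^https?:\/\//, '')
    .replace(/[^a-z0-9]+/gi, '-')
    .replace(/^-+|-+$/g, '');
}

// Build one capture job per page/viewport combination.
function buildJobs(pageList, viewportList) {
  var jobs = [];
  pageList.forEach(function (url) {
    viewportList.forEach(function (vp) {
      jobs.push({
        url: url,
        width: vp.width,
        height: vp.height,
        file: 'screenshots/' + slugify(url) + '-' + vp.name + '.png'
      });
    });
  });
  return jobs;
}

// Only wire things up when running under PhantomJS/CasperJS.
if (typeof phantom !== 'undefined') {
  var casper = require('casper').create();
  casper.start();
  buildJobs(pages, viewports).forEach(function (job) {
    // Resize first, then load the page and save the screenshot.
    casper.then(function () { this.viewport(job.width, job.height); });
    casper.thenOpen(job.url, function () { this.capture(job.file); });
  });
  casper.run(); // exits the process when all steps are done
}
```

Because the job list is plain data, the same structure could later be loaded from a JSON configuration file instead of being hard-coded.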

Network Monitoring
In this fourth module of the course, we will be building a script that visits every URL in a given URL list and inspects the network traffic as PhantomJS is loading the page. Automating this kind of monitoring can be really useful for a couple of purposes. You can use the data to get information about things like how long the page takes to load, how many resource requests take place, how big the requested assets are, what the distribution of the transferred asset types is, et cetera. Essentially, the information you are monitoring is very similar to what you would see in the developer tools of your browser, for example in Google Chrome, but using the power of your headless browser, you can gather this data programmatically for any given URL. Of all the things that we could be doing with the data that we gather by monitoring the network requests, I will be building a script for a very specific purpose. I want my script to detect if there are any resource errors in any given page that I'm providing. You can imagine that this kind of script can be really handy if you have lots of pages that need to be frequently checked against these kinds of resource errors. I will be using an event listener that listens to resource.received events to build my script. Let's get started.
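One possible shape for such a listener is sketched below. It assumes that a "resource error" means a response whose final chunk arrives with an HTTP status of 400 or above; that threshold, the helper names, and the URL list are illustrative assumptions, not the course's implementation. Requests that fail without any HTTP status are not caught by this simple check.

```javascript
// Assumption: a response is an error when its final chunk ("end" stage)
// carries an HTTP status of 400 or above.
function isResourceError(resource) {
  return resource.stage === 'end' && resource.status >= 400;
}

// Attach a listener to anything with an on(eventName, handler) method,
// such as a casper instance, and collect failures into `errors`.
function watchResources(emitter, errors) {
  emitter.on('resource.received', function (resource) {
    if (isResourceError(resource)) {
      errors.push({ url: resource.url, status: resource.status });
    }
  });
  return errors;
}

// Only wire things up when running under PhantomJS/CasperJS.
if (typeof phantom !== 'undefined') {
  var casper = require('casper').create();
  var urls = ['https://example.com/']; // hypothetical URL list
  var errors = watchResources(casper, []);

  casper.start();
  casper.each(urls, function (self, url) {
    self.thenOpen(url);
  });
  casper.run(function () {
    this.echo(JSON.stringify(errors, null, 2));
    // A custom run() callback has to end the process itself.
    this.exit(errors.length > 0 ? 1 : 0);
  });
}
```

Exiting with a non-zero status when errors are found makes the script usable from a scheduled job or a build step.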

Testing & Conclusion
One of the primary benefits of using CasperJS is that it comes with its own testing framework, providing us with an easy-to-use API to test web applications. Using this testing framework, we can create browser tests that verify whether our web application is behaving as expected from a user's point of view. Examples of browser tests include making sure that the user can interact with a form that you have on the page, or that submitting the form in question produces the expected results in the user interface. In this module, to demonstrate the testing API of CasperJS, I will be writing a test suite to check a given URL against certain accessibility requirements. There are many conditions that need to be satisfied for accessibility compliance, and not all accessibility requirements can be validated through automated means. But there are a few checks that we can still employ that will assure us, at the very least, that we are getting the easy things right. So I will be developing a small script that checks whether the html element in the page has a language attribute and whether that attribute has a value, whether all the images in the page have an alt attribute, whether there is a title element in the head section, and whether there are any anchor elements that contain an input element. This is nowhere near a complete list of accessibility requirements, but it should be sufficient to demonstrate the CasperJS testing facilities. So, let's get started.
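The four checks listed above can be expressed as CSS selectors that either must or must not match, as in the minimal sketch below. The target URL, the selectors, and the `runChecks` helper are assumptions chosen for illustration, and the course's own test suite may be structured differently. The file is meant to be run with `casperjs test`, where `casper` is available as a global.

```javascript
var TARGET_URL = 'https://example.com/'; // hypothetical page under test

// Each check is a CSS selector plus whether a match is required or forbidden.
var checks = [
  { selector: 'html[lang]:not([lang=""])', mustExist: true,
    message: 'html element has a non-empty lang attribute' },
  { selector: 'img:not([alt])', mustExist: false,
    message: 'every img element has an alt attribute' },
  { selector: 'head title', mustExist: true,
    message: 'head section contains a title element' },
  { selector: 'a input', mustExist: false,
    message: 'no anchor element contains an input element' }
];

// Run every check against a tester object that exposes
// assertExists and assertDoesntExist.
function runChecks(test, checkList) {
  checkList.forEach(function (check) {
    if (check.mustExist) {
      test.assertExists(check.selector, check.message);
    } else {
      test.assertDoesntExist(check.selector, check.message);
    }
  });
}

// `casper` is a global when the file is executed with `casperjs test`.
if (typeof casper !== 'undefined') {
  casper.test.begin('Basic accessibility checks', checks.length, function (test) {
    casper.start(TARGET_URL, function () {
      runChecks(test, checks);
    });
    casper.run(function () {
      test.done();
    });
  });
}
```

Describing the checks as data keeps the suite short, and adding another selector-based rule only requires one more entry in the list.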