At the risk of overusing a cliche, the world is drowning in data. Every mouse click made online feeds algorithms that are hard at work predicting consumer behavior. Loyalty cards at retail stores are used to send promotions that will drive engagement. We carry phones in our pockets that are constantly connected to high-speed networks and that consume and produce data in the background. Each of us has essentially become a device on the "Internet of Things." Data is relatively easy to find and collect. Getting to the story behind that data is a different problem.
There's more to the story, no pun intended. The aforementioned mouse clicks are often tracked with the goal of analyzing the data they generate. In other words, data stories are expected to be told on the web. Data analytics and data science are influencing the jobs of everyone in web development and software engineering. But it's unreasonable to expect every web developer and designer to be an expert in data science.
The user experience must also be taken into account. It's easy enough to toss giga/tera/petabytes of data onto a server and give people access to it. It's another thing to make that access worthwhile. Amazon makes their entire product catalog available via a public API for consumption in third-party apps. But the mere existence of that data does not give Amazon a strategic advantage; otherwise they would keep it a secret.
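There are three problems here: getting from raw data to the story behind it, telling that story on the web without requiring every developer to be a data scientist, and making access to the data a worthwhile experience for the user.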
All three of these problems can be addressed by a free and open source software package called D3.js.
D3.js (or just D3) is a library for creating data visualizations on the web. A data visualization is a graphical representation of interesting features of a dataset. Put another way, it tells the story of data in a visual format.
Getting the right visualization to accurately and effectively tell the story of a dataset is in some ways more of an art than it is a science. This guide (or any guide) by itself will not be sufficient. What should be reassuring to you is that D3 is flexible enough to produce the right visualization without placing a lot of burden on the developer.
Take a look at this visualization, called a histogram, that was generated with D3.
The histogram shows frequency counts along the left side (also called the y-axis) and values along the bottom (also called the x-axis). Looking at this histogram you can instantly see that the highest counts are for values between four and six. You might also recognize that the overall shape of the visualization resembles a bell curve with a long right tail, that is, a right-skewed distribution. For statisticians, that instantly implies something regardless of what the underlying data is. At a glance, you can quickly see a number of interesting points about the data that generated this visualization. That's the power of visualizations.
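The counts behind a histogram like this come from sorting values into equal-width bins. The sketch below is plain JavaScript rather than D3's own API; the `binCounts` helper and the sample values are hypothetical, made up purely for illustration. D3 ships its own binning utility, so in practice you would probably not write this by hand.

```javascript
// Hypothetical helper: count how many values fall into each fixed-width bin.
// Bins are half-open [x0, x1), except the last one, which also includes `max`.
function binCounts(values, min, max, binCount) {
  const width = (max - min) / binCount;
  const bins = Array.from({ length: binCount }, (_, i) => ({
    x0: min + i * width,
    x1: min + (i + 1) * width,
    count: 0,
  }));
  for (const v of values) {
    if (v < min || v > max) continue; // ignore out-of-range values
    const i = Math.min(Math.floor((v - min) / width), binCount - 1);
    bins[i].count += 1;
  }
  return bins;
}

// Made-up sample data, for illustration only.
const sample = [1, 3, 4, 4, 5, 5, 5, 6, 7, 9];
const bins = binCounts(sample, 0, 10, 5);
console.log(bins.map((b) => `${b.x0}-${b.x1}: ${b.count}`));
```

With this sample, the bin covering four to six holds the most values, which is the kind of feature a histogram makes visible at a glance.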
Take a look at this D3 gallery. It shows over 100 different examples of visualizations (and the code to create them). The range of uses is incredible. There are the basic types such as bar charts, pie charts and histograms. There are more advanced types such as maps and box plots. There are even animated and interactive visualizations. These make it possible for the end user to shape their experience by selecting from multiple options and seeing the results in real time.
But developers are not limited to the visualizations shown in the gallery. Take a look at this bracket layout visualization.
Basketball fans will immediately recognize this. The interactive live version can be seen at http://bl.ocks.org/jdarling/2503502. Clicking on the dots in the bracket layout will collapse or expand a bracket to make the chart more accessible by showing the level of detail the user requires. The code for the bracket layout is also shown in the above link. Notice that nowhere is there a call to a bracketLayout() function or method provided by D3. In fact, if you look at any D3 example you won't find an API that constructs specific visualization types. There may be third-party libraries built on D3 that provide concrete implementations of specific visualizations, but D3 resists this urge. Instead, it gives you the fundamental building blocks and the foundation to assemble them to meet the needs of the data story you are telling.
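To give a feel for what "building blocks" means, here is a minimal sketch in plain JavaScript, not D3's actual API. It assumes two hypothetical helpers, `linearScale` and `barLayout`: a small, general-purpose scale that maps data values to pixel positions, and a layout that composes it to describe the bars of a chart. D3's own scale module offers a far more complete version of the first piece.

```javascript
// Hypothetical building block: map a value from a data domain to a pixel range.
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Hypothetical building block: turn counts into bar rectangles for a chart
// of the given size. The y-axis grows downward on screen, so the scale maps
// larger counts to smaller y positions. Assumes at least one positive count.
function barLayout(counts, width, height) {
  const barWidth = width / counts.length;
  const y = linearScale([0, Math.max(...counts)], [height, 0]);
  return counts.map((count, i) => ({
    x: i * barWidth,
    y: y(count),
    width: barWidth,
    height: height - y(count),
  }));
}

// Made-up counts, for illustration only.
const bars = barLayout([1, 1, 5, 2, 1], 100, 50);
console.log(bars[2]); // the tallest bar spans the full chart height
```

Nothing here knows about "bar charts" as a chart type. The same scale could position dots in a scatter plot or nodes in a bracket, which is the spirit of the approach described above.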
D3 relies heavily on SVG, or Scalable Vector Graphics, a widely supported W3C standard. With SVG, web apps can create very precise and intricate illustrations in the browser. You could make the claim that D3 is an abstraction over the SVG standard. D3 doesn't try to predict or force the future; it relies on existing standards.
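For readers who have not seen SVG before, the sketch below shows roughly what a chart looks like at that level: a handful of `rect` elements inside an `svg` root. It is plain JavaScript that builds the markup as a string, not how D3 itself manipulates the page, and the `barsToSvg` helper and the bar data are hypothetical.

```javascript
// Hypothetical helper: serialize bar rectangles into a standalone SVG document.
function barsToSvg(bars, width, height) {
  const rects = bars
    .map(
      (b) =>
        `<rect x="${b.x}" y="${b.y}" width="${b.width}" height="${b.height}" fill="steelblue"/>`
    )
    .join("");
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">${rects}</svg>`;
}

// Made-up bar geometry in pixels, for illustration only.
const svg = barsToSvg(
  [
    { x: 0, y: 40, width: 20, height: 10 },
    { x: 20, y: 0, width: 20, height: 50 },
  ],
  40,
  50
);
console.log(svg);
```

Saving the printed string to a file with an .svg extension and opening it in a browser should show two bars. Because the output is standard markup, it can be styled with CSS and inspected with ordinary browser developer tools.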
D3 is not a nascent offering itself; the first release was a decade ago. D3 builds on lessons from earlier web visualization tools such as Protovis. The team behind Protovis now works on D3. Many prominent companies such as Coursera, Square, New Relic, Weebly, and Mapbox depend on D3. However, D3 is equally beneficial to smaller companies and individuals working in academia, government, and industry.
If it sounds like D3 can help your next data science or data analytics project tell data stories on the web, head over to d3js.org for more examples. D3 is widely discussed online, so Stack Overflow and your favorite search engine are your friends when looking for answers to more specific questions. Also, D3 is open source, with the code maintained on GitHub. There you can also find the extensive documentation.
Keep these points in mind when evaluating D3 for yourself or your team:
I hope this guide shows that you have little reason not to try D3.js for your next project. If you use just a fraction of its potential, the benefits will come through. And it's fun! Thanks for reading!