DataOps: Using Logentries, Tableau, PowerShell (and more) to find answers

As part of the Pluralsight data team, I can safely say that we're all constantly on the lookout for new and more efficient ways to do things and to find the answers we need--as are most of you. So when it came to figuring out where our Tableau users were coming from, we had to dig into what was actually needed. In what follows, we'll take a quick look at some of the ways DataOps is actually practiced and why we used the tools we did to answer this simple question:

Where are our Tableau users coming from?

While it's a simple question, and valuable for many obvious reasons, answering it takes some scripting and parsing of logs. Here were our first thoughts, er, brainstorm:

  • Apache logs from Tableau contain the IP addresses of users (among other bits), but they're not neatly formatted.
  • User location requires an IP address lookup, so let's find a good public API for grabbing that info.
  • If we can get lat/long from the IP, that would make an excellent visualization in Tableau.

Perfect! We have three major pieces to acquire and tackle. Once those are done, we'll have an answer.

Part 1: Getting the IP address from the Tableau logs

[Image: DataOps - parsing the logs]

As the image above reflects, we used PowerShell and the open Logentries API to pull down the Apache logs we'd been storing in Logentries. The logs aren't in a format that makes parsing easy; they are, however, delimited, and we used that delimiter to build our first table for the question above.
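The team's actual scripts are PowerShell; as an illustration, here's a minimal Python sketch of the same pull step. The account key, host name, and log name are placeholders, and the `pull.logentries.com` endpoint shape is an assumption based on how the Logentries pull API worked at the time:

```python
# Sketch: downloading raw log events from the Logentries pull API.
# "MY-ACCOUNT-KEY", "webserver", and "access.log" are placeholder values,
# not from the original post.
from urllib.parse import urlencode

PULL_BASE = "https://pull.logentries.com"  # historical pull-API host (assumption)

def build_pull_url(account_key, host, log, start_ms, end_ms):
    """Return the URL for pulling events between start_ms and end_ms (epoch millis)."""
    query = urlencode({"start": start_ms, "end": end_ms})
    return f"{PULL_BASE}/{account_key}/hosts/{host}/{log}/?{query}"

url = build_pull_url("MY-ACCOUNT-KEY", "webserver", "access.log",
                     1420070400000, 1420156800000)
# The response body is plain text: one Apache log line per event, ready to parse.
```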

[Image: DataOps - database]

While we didn't need every field, we chose a few that might add insight to the original question and let PowerShell do the heavy lifting (and parsing). Analytics, done correctly, is more of a path than a destination; we moved toward an answer, but stayed open to other potential insights.
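The post doesn't list exactly which fields the team kept, but IP, timestamp, and request are a reasonable minimal set for this question. A Python sketch of splitting a standard space-delimited Apache access-log line into those fields (the real pipeline does this in PowerShell):

```python
# Sketch: pulling a few useful fields out of a space-delimited Apache
# access-log line. Which fields to keep is an assumption for illustration.
import shlex

def parse_line(line):
    # Normalize the [timestamp] brackets to quotes so shlex treats the
    # timestamp and the quoted request as single tokens.
    parts = shlex.split(line.replace("[", '"').replace("]", '"'))
    return {"ip": parts[0], "timestamp": parts[3], "request": parts[4]}

row = parse_line('10.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
                 '"GET /views/Sales HTTP/1.0" 200 2326')
# row["ip"] is the field we need for the geo lookup in Part 2.
```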

Part 2: Getting geographic location from the Tableau logs

Now that we've parsed the logs and stored them in a database, we can look up geographic data for each IP address. We used freegeoip.net; it's efficient and well within our daily limit for requests.
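A real call is a simple GET to `https://freegeoip.net/json/<ip>` (the service has since shut down, so the JSON below is a canned example shaped like its response rather than a live call). A Python sketch of extracting lat/long from that payload:

```python
# Sketch: turning a freegeoip.net-style JSON response into lat/long.
# The sample payload is illustrative, not captured from the live service.
import json

def extract_location(payload):
    data = json.loads(payload)
    return data["ip"], data["latitude"], data["longitude"]

sample = ('{"ip": "8.8.8.8", "country_name": "United States", '
          '"latitude": 37.386, "longitude": -122.0838}')
ip, lat, lon = extract_location(sample)
# (lat, lon) is exactly what Tableau needs for the map in Part 3.
```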

[Image: DataOps - IP lookup]

Once all the processing is complete, we add that data to the database via PowerShell. It goes in a separate table because we only want a distinct list of IPs; that way we never make duplicate calls for information we've already acquired.
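The "separate table of distinct IPs" idea can be sketched as a cache check before each lookup. This is a Python illustration, not the team's PowerShell; `lookup` stands in for the freegeoip call and the names are hypothetical:

```python
# Sketch: only look up IPs we haven't already resolved.
def resolve_new_ips(log_ips, known_locations, lookup):
    """known_locations: dict ip -> (lat, lon), acting as the distinct-IP table."""
    for ip in set(log_ips):            # distinct IPs only
        if ip not in known_locations:  # skip IPs already in the table
            known_locations[ip] = lookup(ip)
    return known_locations

cache = {"1.1.1.1": (-33.49, 143.21)}  # already resolved on a previous run
calls = []
resolve_new_ips(["1.1.1.1", "2.2.2.2", "2.2.2.2"], cache,
                lambda ip: calls.append(ip) or (0.0, 0.0))
# Only "2.2.2.2" triggers an API call; "1.1.1.1" is served from the table.
```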

[Image: DataOps - database]

The above steps run daily and automatically via scheduled jobs in PowerShell. If we needed real-time alerting, we'd use Logentries' built-in alerting feature and some regex to parse the logs. This, however, is an analytical operation, which isn't necessarily the same aim as finding out what's happening right now.
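To make the regex idea concrete, here's the kind of pattern an alerting rule could match against Apache combined-format lines--in this case flagging any 5xx response. The pattern targets the standard combined log format; it's an illustration, not a rule from the post:

```python
# Sketch: regex for real-time alerting on Apache combined-format lines.
# Matches any line whose status code is 5xx (server error).
import re

SERVER_ERROR = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>5\d\d) '
)

line = '203.0.113.9 - - [10/Oct/2000:13:55:36 -0700] "GET /vizql/x HTTP/1.1" 503 194'
m = SERVER_ERROR.match(line)
# m captures ip, timestamp, request, and status for the alert payload.
```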

Part 3: Visualize this data in Tableau

In the end, the data always seems to find its way back into Tableau. We can repeat the process, but this time with usage data on the dashboard.

[Image: DataOps - user location]

Simple, right? Tableau --> Logentries --> PowerShell (csv/databases) --> Tableau.

And the circle closes.

-Mike, Pluralsight data team


Contributor

Mike Roberts

Mike comes to Pluralsight with a wealth of experience and a varied background encompassing databases, analytics, visualization, and scripting. He has worked for Fortune 500 companies as well as small businesses, helping them understand their vast data troves. His analytical style revolves around people and context, and in an industry consumed with 'defaults', he believes that data intelligence starts with giving people access to their information with collaboration in mind.