DataOps: Using Logentries, Tableau, PowerShell (and more) to find answers
- select the contributor at the end of the page -
Where are our Tableau users coming from?
While a simple question, and valuable for many obvious reasons, it will take some scripting and parsing of logs. Here were our first thoughts, er, brainstorm:
- Apache logs from Tableau contain IP address of users (amongst other bits), but it it's not neatly formatted.
- User location would require a IP address lookup, so let's find a good public API for grabbing that info.
- If we were able to get lat/long from the IP, then that would make an excellent visualization in Tableau.
Perfect! We have three major pieces to acquire and tackle. Once those are done, we'll have an answer.
Part 1: Getting the IP address from the Tableau logs
As the above image reflects, we used PowerShell to pull down the Apache logs, which we've stored via Logentries, and its open API. As it were, the logs aren't in the format that would make parsing easy; they are, however, delimited and we've chosen that character to make our first table for the above question.
While we didn't need every field, we did choose a few that might add some insight to the original question and let PowerShell do the heavy lifting (and parsing). Analytics, if done correctly, are more of a path and not necessarily a destination; we moved toward an answer, but remained open to other potential insights.
Part 2: Getting geographic location from the Tableau logs
Now that we've parsed the logs and stored them in a database, we can grab the geo data from the IP address. We used the following service, freegeoip.net. It's efficient and well within our daily limit for requests.
Once all the processing is complete, we'll add that data to a database via PowerShell. The reason we've put it in a separate table is because we only wanted a distinct list of IPs and didn't want to make duplicate calls when we've already acquired the necessary information.
The above steps happen daily and automatically via scheduled jobs in PowerShell. If we need real-time alerting, we'd use the built-in feature of Logentries and some Regex to parse the logs. However, this is an analytical operation, which isn't necessarily the same aim as finding out what's happening right now.
Part 3: Visualize this data in Tableau
In the end, the data always seems to find its way back into Tableau. We can repeat the process over again, but this time with usage data on the dashboard.
Simple, right? Tableau --> Logentries --> PowerShell (csv/databases) --> Tableau.
And the circle closes.
-Mike, Pluralsight data team