One of the best ways to understand data is through descriptive statistics: the minimum and maximum values, the median, and the quartiles. With smaller datasets this is easy, but with larger datasets you need to parse a lot of data to get these metrics. Luckily, you can use SQL to get descriptive statistics for your data directly from the database.
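As a rough illustration of computing these metrics in the database itself, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for a real database server so it runs anywhere; the `cases` table, the `daily_count` column, and the `describe` helper are hypothetical names chosen for the example. The quartiles are read from the sorted row at position floor((n - 1) * p) with no interpolation, which is one simple convention among several, and the sketch assumes a non-empty column with no NULL values.

```python
import sqlite3


def describe(conn, table, column):
    """Return min, max, and quartiles for one numeric column, computed in SQL."""
    # Identifiers cannot be bound as parameters, so table and column
    # are assumed to come from trusted code, not user input.
    low, high = conn.execute(
        f"SELECT MIN({column}), MAX({column}) FROM {table}"
    ).fetchone()

    def percentile(p):
        # Take the row at position floor((n - 1) * p) in sorted order.
        # This is an approximation: no interpolation between rows.
        row = conn.execute(
            f"SELECT {column} FROM {table} ORDER BY {column} "
            f"LIMIT 1 OFFSET "
            f"(SELECT CAST((COUNT(*) - 1) * :p AS INTEGER) FROM {table})",
            {"p": p},
        ).fetchone()
        return row[0]

    return {
        "min": low,
        "q1": percentile(0.25),
        "median": percentile(0.5),
        "q3": percentile(0.75),
        "max": high,
    }


# Hypothetical sample data: the integers 1 through 9.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (daily_count INTEGER)")
conn.executemany("INSERT INTO cases VALUES (?)", [(n,) for n in range(1, 10)])

stats = describe(conn, "cases", "daily_count")
print(stats)  # {'min': 1, 'q1': 3, 'median': 5, 'q3': 7, 'max': 9}
```

Only five values come back from the database however many rows the table holds, which is the main appeal of doing this work in SQL. Other database engines offer their own percentile or window functions, so the query text would likely differ there.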
I’m working on a dashboard to track COVID-19 cases per capita in Calgary, and while the government’s open data API provides daily case counts within the city, it doesn’t have any history available. The easy solution is to download the data on a daily basis and archive it myself, but I want to automate the download and the loading into the database so I don’t have to think about it. Luckily, MySQL and a bit of shell script go a long way.
If you need a database for a project, MySQL is one of the most popular choices. It’s free and open-source, and it’s a core part of the popular LAMP (Linux, Apache, MySQL, PHP) web application stack. If you want to get started using MySQL for a project, here’s a guide on how to install it on a fresh installation of Ubuntu 20.04.
While R is the real workhorse of data analysis and modeling, if you want to share the results of your work with other people you’re probably going to send them in a spreadsheet. Luckily, the xlsx package for R makes it easy to export simple spreadsheets and also has advanced functionality to create workbooks with professional design and formatting.
A key component in data acquisition or reporting is the ability to trigger your script to run at a set time each day. Whether you are attempting to download the latest stock prices or update corporate earnings reports, once you’ve created the script to do the actual work, you need to find a way for it to be run on the correct schedule.
Power BI is Microsoft’s data exploration and dashboarding tool. While it hasn’t risen to desktop prominence like Excel and Outlook have for the majority of knowledge workers, it is an incredibly capable tool which allows you to quickly visualize data from a number of data sources and explore the data using a graphical interface.
Previously we looked at how you can combine R and Markdown to create reports directly from your R scripts, and also how to send email from R using Microsoft Outlook. In this post, we’ll take these concepts a step further and look at how we can use R to embed images in email messages or even use Markdown to create entire messages.
Robotic process automation (or RPA) is transforming the way many businesses handle their repetitious, labour-intensive tasks such as reporting, making basic decisions, and providing services. Using software, these tasks can be automated, reducing the time to complete them while also improving their accuracy and consistency. If you want to get started down the RPA path without incurring licensing costs, there are free tools you can start using today.
One underappreciated feature in R is the ability to easily create beautiful reports using Markdown. Markdown files contain a combination of code and text, allowing you to write your analysis alongside your code and publish both the analysis document and code in a wide variety of formats with little effort.
One of the biggest benefits of creating an automatic reporting framework is that you no longer need to directly supervise the creation and distribution of reports. However, when something fails it can be difficult to understand what went wrong and why. Luckily, R’s tryCatchLog package makes it easy to trap and log errors as they occur.