Category: Data Science and Analytics
-
SQL for accounting and finance professionals
If you’re an accounting or finance professional wanting to learn SQL, I’ve got good news: most of the knowledge you have of Excel or Google Sheets can be translated into SQL. Whether you’re looking at a spreadsheet or a database table, the core fundamentals are the same: it’s all about manipulating a big grid of…
-
Calculating medians and quartiles across groups in SQL
One of the best ways to understand data is through the use of descriptive statistics, figuring out the minimum and maximum values, the median value, and the quartiles. When you’re working with smaller datasets this is easy, but with larger datasets you need to parse a lot of data to get these metrics. Luckily, you…
-
ETL in MySQL – How to automatically load data into a table every day
I’m working on a dashboard to track COVID-19 cases per capita in Calgary, and while the government’s open data API provides daily case counts within the city it doesn’t have any history available. The easy solution to this is to download the data on a daily basis and archive it myself, but I want to…
-
Installing MySQL 8 on Ubuntu 20.04
If you need a database for a project, MySQL is one of the most popular choices. It’s free, open-source and is a core part of of the popular LAMP (Linux, Apache, MySQL, PHP) web application stack. If you want to get started using MySQL for a project, here’s a guide of how to install it…
-
Create Professional Reports in Excel using R and XLSX
While R is the real workhorse of data analysis and modeling, if you want to share the results of your work with other people your probably going to send it in a spreadsheet. Luckily the xlsx package for R makes it easy to export simple spreadsheets and also has advanced functionality to create workbooks with…
-
CRISP-DM in depth: modeling
Once the data preparation phase is complete, its time to move on to the fun part of the CRISP-DM framework: modeling. Here you’ll chose which modeling technique to use, create some tests to assess the accuracy of your model, build the model, and then assess the model using the tests you created.
-
Scheduling R scripts to run automatically in Windows
A key component in data acquisition or reporting is the ability to trigger your script to run at a set time each day. Whether you are attempting to download the latest stock prices or update corporate earnings reports, once you’ve created the script to do the actual work, you need to find a way for…
-
CRISP-DM in depth: data preparation
After developing business understanding and data understanding, the next big objective in the CRISP-DM methodology is to prepare the data for modelling and analysis. This involves selecting, cleaning and transforming the data which will be used for the project. While this isn’t flashy work, it typically accounts for 60% to 80% of the effort for…
-
Save time and improve accuracy by automating the repetitive (boring) stuff
Corporate reporting is a prime candidate for automation if you can clearly explain the process to produce it, and the process remains consistent over time. Automating your reports has many potential benefits, it can save time, reduce errors, and alleviate the boredom caused by performing repetitive tasks.
-
Exploring public data with Power BI – abandoned mines in Ontario
Power BI is Microsoft’s data exploration and dashboarding tool. While it hasn’t risen to desktop prominence like Excel and Outlook have for the majority of knowledge workers, it is an incredibly capable tool which allows you to quickly visualize data from a number of data sources and explore the data using a graphical interface.