Calculating medians and quartiles across groups in SQL

One of the best ways to understand data is through descriptive statistics: the minimum and maximum values, the median, and the quartiles. With smaller datasets this is easy, but with larger datasets you have to process a lot of data to get these metrics. Luckily, you can use SQL to compute descriptive statistics for your data directly in the database.
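
As a rough illustration of the idea, here is a minimal sketch that computes the minimum, median, and maximum per group with SQL window functions. It assumes SQLite 3.25 or newer (run through Python's built-in sqlite3 module) and a hypothetical table `measurements` with columns `grp` and `value`; these names are invented for the example, and other databases may offer a more direct route to the same numbers.

```python
import sqlite3


def group_stats(rows):
    """Return {group: (min, median, max)} for (group, value) rows, computed in SQL."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, used only for this illustration.
    conn.execute("CREATE TABLE measurements (grp TEXT, value REAL)")
    conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
    query = """
        WITH ranked AS (
            SELECT grp,
                   value,
                   ROW_NUMBER() OVER (PARTITION BY grp ORDER BY value) AS rn,
                   COUNT(*)     OVER (PARTITION BY grp)                AS n
            FROM measurements
        )
        SELECT grp,
               MIN(value) AS min_value,
               -- Integer division picks the middle row for odd n and the
               -- two middle rows for even n; AVG skips the NULLs.
               AVG(CASE WHEN rn IN ((n + 1) / 2, (n + 2) / 2) THEN value END)
                   AS median_value,
               MAX(value) AS max_value
        FROM ranked
        GROUP BY grp
        ORDER BY grp
    """
    result = {grp: (lo, med, hi) for grp, lo, med, hi in conn.execute(query)}
    conn.close()
    return result


if __name__ == "__main__":
    sample = [("a", v) for v in (1, 2, 3, 4, 5)] + [("b", v) for v in (10, 20, 30, 40)]
    for grp, stats in group_stats(sample).items():
        print(grp, stats)
```

The sketch stops at the median. Quartiles can be approached the same way by selecting different row positions within each group, though the exact positions depend on which quartile definition you choose.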

Defining big data

Buzzwords have the unfortunate tendency to be used often but seldom clearly defined. Today we are going to tackle the popular phrase “big data” and strip it down to a clear definition. The term is fairly self-explanatory: it refers to large datasets. But five defining characteristics differentiate big data from the datasets of yesterday. These five characteristics are known as the 5 V’s of big data.

Book review: Weapons of Math Destruction by Cathy O’Neil

As big data transforms our businesses, governments, and society, it also presents us with new moral and ethical dilemmas that we need to consider. As is typical with new technology, we tend to implement first and consider the ethical issues later. Cathy O’Neil’s book Weapons of Math Destruction is an introduction to the ethical issues raised by the widespread use of data to drive decisions in our lives.