The Cross-Industry Standard Process for Data Mining

While analysis tools and algorithms have evolved at a rapid pace, the overall business process for analytics has remained remarkably stable. One seminal work on the analytic process is IBM’s Cross-Industry Standard Process for Data Mining (CRISP-DM). At over 20 years old, it remains a relevant and useful tool for describing the overall data science workflow.

Supporting skills for data science: relational databases & SQL

A successful data scientist needs to draw on skills from many disciplines, and one of the core skill sets is knowledge of relational databases and querying with Structured Query Language (SQL). Relational databases are the most common way to store structured data, so a firm understanding of databases is key to obtaining data and performing simple analysis and reporting quickly.
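As a rough illustration of the kind of simple analysis and reporting a single SQL query can handle, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The sales table, its region and amount columns, the revenue_by_region helper, and the sample rows are all hypothetical and invented for this example; they do not come from any real dataset.

```python
import sqlite3


def revenue_by_region(rows):
    """Load (region, amount) rows into a temporary table and total them per region.

    The table and column names are hypothetical, chosen only for illustration.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO sales (region, amount) VALUES (?, ?)", rows
        )
        # One GROUP BY query yields a small report: order count and revenue per region.
        query = """
            SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
            FROM sales
            GROUP BY region
            ORDER BY revenue DESC
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Made-up sample data, not real figures.
sample_rows = [
    ("East", 120.0),
    ("West", 80.0),
    ("East", 40.0),
    ("West", 10.0),
    ("North", 25.0),
]

report = revenue_by_region(sample_rows)
for region, orders, revenue in report:
    print(region, orders, revenue)
```

The aggregation is done by the database rather than in Python loops, which is the main reason SQL is useful for quick reporting: the same query would work unchanged against a much larger table in a production relational database, assuming the same schema.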