Learn how to use SQL to understand the characteristics of data sets destined for data science and machine learning. The course begins with an introduction to exploratory data analysis and how it differs from hypothesis-driven statistical analysis. Instructor Dan Sullivan explains how SQL queries, statistical calculations, and visualization tools like Excel and R can help you verify data quality and avoid incorrect assumptions. Next, find out how to perform data-quality checks, reveal and recover missing values, and check business logic. Discover how to use box plots to understand non-normal distributions of data and histograms to understand the frequency of data values in particular attributes. Dan also explains how to use the chi-square test to understand dependencies and measure correlations between attributes. The course concludes with a collection of tips and best practices for exploratory data analysis.
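As a rough illustration of the kind of checks the description mentions (not taken from the course itself), the sketch below counts missing values and tallies value frequencies with plain SQL, run through Python's built-in `sqlite3` module. The `orders` table, its columns, and the sample rows are hypothetical and exist only for this example.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "north", 10.0), (2, "south", None), (3, "north", 25.5), (4, None, 7.0)],
)

# Data-quality check: COUNT(column) skips NULLs, so the difference
# from COUNT(*) is the number of missing values in that column.
missing_region, missing_amount = conn.execute(
    "SELECT COUNT(*) - COUNT(region), COUNT(*) - COUNT(amount) FROM orders"
).fetchone()

# Frequency of each value in one attribute: the numbers behind a histogram.
region_counts = conn.execute(
    "SELECT region, COUNT(*) FROM orders GROUP BY region"
).fetchall()

print(missing_region, missing_amount)
print(region_counts)
```

Note that `GROUP BY` places NULLs in a group of their own, so missing values also show up in the frequency table.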
Further information
People: Sullivan, Dan
Sullivan, Dan:
SQL for Exploratory Data Analysis Essential Training : LinkedIn, 2018. - 00:44:07.00