Uncover Hidden Patterns In Your Tabular Datasets: All You Need Is The Right Statistics.
Tabular datasets can be challenging and time-consuming to analyze. However, with the right techniques and tools, you can get the most out of your data with minimal effort. I will demonstrate the techniques and statistics that reveal new insights and produce explainable results with beautiful (interactive) plots.
Tabular datasets are one of the most common forms of data and consist of a mix of variable types, such as binary, categorical, textual, and continuous values. A well-known example is the Titanic dataset. The major challenge in such datasets is that each variable type has to be analyzed in its own way: categorical values need different statistics and/or models than continuous values, and so on. It is also key to detect multicollinearity in the dataset, because variables with statistically similar behavior can affect the reliability of models. In this blog, I will demonstrate the steps from pre-processing tabular datasets to statistical testing, PCA, and network analysis, which together reveal deeper insights across the variables. In addition, I will explain the importance of multiple test corrections…
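To make the idea of testing variable pairs and then correcting for multiple tests a bit more concrete, here is a minimal sketch using only the Python standard library. It is not necessarily the method used later in this blog: it assumes a chi-square test of independence for two binary variables, and a Benjamini-Hochberg adjustment as one common form of multiple test correction. The toy columns `female` and `survived`, the helper names `chi2_2x2` and `benjamini_hochberg`, and the list of p-values are all hypothetical and only serve as illustration.

```python
import math
from collections import Counter


def chi2_2x2(x, y):
    """Chi-square test of independence for two binary (0/1) variables.

    No continuity correction is applied. Returns (statistic, p_value).
    """
    counts = Counter(zip(x, y))
    a, b = counts[(0, 0)], counts[(0, 1)]
    c, d = counts[(1, 0)], counts[(1, 1)]
    n = a + b + c + d
    row0, row1, col0, col1 = a + b, c + d, a + c, b + d
    if 0 in (row0, row1, col0, col1):
        raise ValueError("each variable must take both values")
    stat = n * (a * d - b * c) ** 2 / (row0 * row1 * col0 * col1)
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data (20 passengers), not taken from the real Titanic dataset.
female = [1] * 8 + [0] * 12
survived = [1] * 7 + [0] * 1 + [1] * 2 + [0] * 10

stat, p_value = chi2_2x2(female, survived)
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")

# Hypothetical raw p-values from several pairwise tests.
raw_p = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw_p))
```

The point of the second function is that testing many variable pairs inflates the number of false positives, so the raw p-values are adjusted before deciding which associations to keep. For continuous variables a different test would be needed, which is exactly why the variable type matters.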