Common strategies for choosing the most relevant features in your data set

The importance of feature selection

Overview of this post

Introduction to Louvain graph community detection

Louvain algorithm for community detection

What are graph communities?

Summary of Apriori, Eclat and FP tree algorithms

What are frequent patterns?

Challenges in high dimensional spaces

  1. What are the challenges of working with high dimensional data?
  2. What is subspace clustering?
  3. How to implement a subspace clustering algorithm in Python
  • It makes visualizing, and thus understanding, the input difficult; it often requires applying a dimensionality reduction technique beforehand…

Also an intro to reservoir computing

  1. What are Echo State Networks?
  2. Why and when should you use an Echo State Network?
  3. Simple implementation example in Python

How to visualize joint distributions

  • Use a Gaussian Kernel to estimate the PDF of 2 distributions
  • Use Matplotlib to represent the PDF with labelled contour lines around density plots
  • How to extract the contour lines
  • How to plot the above Gaussian kernel in 3D
  • How to use 2D histograms to plot the same PDF
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st
# make_blobs is imported from sklearn.datasets; the older
# sklearn.datasets.samples_generator path was removed from scikit-learn.
from sklearn.datasets import make_blobs

n_components = 3
X, truth = make_blobs(n_samples=300, centers=n_components,
                      cluster_std=[2, 1.5, 1],
plt.scatter(X[:, 0], X[…
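
The excerpt above is cut off, so here is a minimal, self-contained sketch of the first two bullets: estimating a 2D density with a Gaussian kernel and drawing labelled contour lines over it. It is not the post's own code. The `random_state`, the 100 x 100 grid, the one-unit padding, and the colormap are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless-safe backend; remove to open a window
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as st
from sklearn.datasets import make_blobs

# Synthetic 2D data with three blobs (random_state is an assumption, for reproducibility).
X, truth = make_blobs(n_samples=300, centers=3,
                      cluster_std=[2, 1.5, 1], random_state=42)
x, y = X[:, 0], X[:, 1]

# Regular grid covering the data, padded by one unit on each side (assumed padding).
pad = 1.0
xx, yy = np.mgrid[x.min() - pad:x.max() + pad:100j,
                  y.min() - pad:y.max() + pad:100j]
positions = np.vstack([xx.ravel(), yy.ravel()])

# gaussian_kde expects the data as an array of shape (n_dims, n_points).
kernel = st.gaussian_kde(np.vstack([x, y]))
density = kernel(positions).reshape(xx.shape)

# Filled density plot with labelled contour lines on top.
fig, ax = plt.subplots(figsize=(8, 8))
ax.contourf(xx, yy, density, cmap="coolwarm")
contours = ax.contour(xx, yy, density, colors="k")
ax.clabel(contours, inline=True, fontsize=8)
ax.set_xlabel("X")
ax.set_ylabel("Y")
# Call fig.savefig("kde_contours.png") or plt.show() to view the result.
```

The same `density` array could also feed a 3D surface plot, and `np.histogram2d` on `x` and `y` would give a binned view to compare against the kernel estimate.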

How to choose the right distribution to model your data

Finding the Maximum Likelihood Estimate of a model depending on unobserved latent variables

Hybrid scale plots and a custom violinboxplot

1. How can we create comprehensive visualizations of data distributions with far outliers?

2. How can we combine the best of boxplots, violinplots and dynamic scales that follow the data distribution?

An overview of spectral graph clustering and a Python implementation of the eigengap heuristic

Adjacency matrix (A)
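
Only the opening of that post is excerpted here, so the following is a rough sketch of how the eigengap heuristic is commonly formulated, not the author's implementation. It builds the symmetric normalized Laplacian from an adjacency matrix A and picks the number of clusters at the largest gap between consecutive eigenvalues. The function name `eigengap_num_clusters` and the `max_k` parameter are hypothetical, and the sketch assumes an undirected graph in which every node has at least one edge.

```python
import numpy as np


def eigengap_num_clusters(A, max_k=10):
    """Estimate the number of clusters from adjacency matrix A (hypothetical helper).

    Assumes A is symmetric and every node has degree > 0.
    """
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    L = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(L)

    # Gap i sits between eigenvalue i and eigenvalue i + 1 (0-based),
    # so the largest gap at index i suggests i + 1 clusters.
    gaps = np.diff(eigenvalues[:max_k + 1])
    return int(np.argmax(gaps)) + 1, eigenvalues


# Usage: two disconnected triangles, so the expected estimate is 2.
triangle = np.ones((3, 3)) - np.eye(3)
A = np.block([[triangle, np.zeros((3, 3))],
              [np.zeros((3, 3)), triangle]])
k, eigenvalues = eigengap_num_clusters(A)
print(k)
```

On noisy real-world graphs the largest gap is often less clear-cut, so it is worth inspecting the sorted eigenvalues rather than trusting the single argmax.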

Madalina Ciortan

Computer science engineer, bioinformatician, researcher in data science

