This publication collects posts from NJIT’s IS688 course. Topics include machine learning, data mining, text mining, and clustering, applied to extract useful knowledge from the web and other unstructured or semi-structured, hypertextual, distributed information repositories.