Xiao Wei, "Architecting a credit risk modeling process in Scikit-Learn and PMML" (Apr 24, 2019): Sklearn is designed for machine learning, not credit risk modeling. Some common variable transformations found in credit risk models that…
Xiao Wei, "Creating dummy variables from status codes in Scikit-Learn and PMML" (Apr 24, 2019): Sometimes the raw data comes in the form of status codes. For example, the codes come in four columns and each value represents some kind…
Xiao Wei, "(woe)Binning Discrete Values and Avoiding the Dummy Variable Trap in Scikit-Learn & PMML" (Apr 19, 2019): One Hot Encoding of categorical variables is something that happens in almost every model. Sklearn provides two functions that can handle…
Xiao Wei, "Weight of Evidence Binning in Scikit-Learn & PMML" (Apr 18, 2019): Frequently in credit risk modeling it makes sense to transform a continuous variable into one or more discrete variables. A binned…
Xiao Wei, "Extreme/default value specification in Scikit-Learn & PMML utilizing ContinuousDomain()" (Apr 18, 2019): Frequently a modeler will want to limit the range of values a predictor can take. For example, many real-life variables, like income or time…
Xiao Wei, "Using KS-stat as a model evaluation metric in Scikit-Learn’s GridSearchCV" (Apr 4, 2019): The KS-test of significance is frequently used in response or credit risk modeling. Sklearn doesn’t provide this metric in its list of…
Xiao Wei, "Simple clustering algorithm utilizing PCA and K-means clustering in scikit-learn" (Nov 8, 2018): Understanding the state of a business and its customer base visually
Xiao Wei, "Putting sci-kit learn models into production with PMML" (Oct 21, 2018): From pipelining to modeling to production