Can Adding Partitions Improve The Performance of Your Spark Job On Skewed Data Sets?
After reading a number of on-line articles on how to handle ‘data skew’ in one’s Spark cluster, I ran some experiments on my own ‘single…
Apr 22, 2019

Assessing Which of A Series of Pairwise Combinations Significantly Differ via the Marascuilo…
Some time back, while tinkering with R, I coded up a version of the Marascuilo procedure and wrote up the results in a post to my old blog…
Apr 16, 2019

How To Inspect Attribute Info of Nodes in a JQuery Select List
I haven’t done front-end programming for a while, but assuming JQuery is not yet dead, it might be worth resurrecting this post from my…
Apr 9, 2019

Configuring the Xerces XML Parser With Content Model Defaults
My previous post on JSON schema included a slight dig at XML, which perhaps wasn’t really warranted. True, XML is clunkier and more…
Apr 8, 2019

Reducing Integration Hassles With JSON Schema Contracts
I recently worked on a project where the ‘contract’ between service consumers and providers consisted primarily of annotated mock-ups of…
Apr 3, 2019