I came across an article from NVIDIA about their TPCx-BB benchmark results on A100. As a data scientist, I was immediately intrigued because I’m a big fan of the Transaction Processing Performance Council (TPC) benchmarks, which provide reasonable and objective performance metrics. The TPC also has clear rules about how its benchmarks are used and how results are reported, so that results from different vendors can be directly compared. I’ll say more about this later, but first let’s talk about the end-to-end data analytics workflow.
I’ve drawn a rough sketch of the end-to-end data analytics workflow based on…
Benchmarking isn’t my favorite topic, but I have a passing interest in graph analytics benchmarking:
I’ll occasionally dissect benchmarks that I think are inaccurate or misleading:
And I’ll also dissect benchmarks that only tell part of the story. I was half-listening to Jensen Huang’s NVIDIA GTC 2020 Keynote from May 14, 2020 when one of his performance claims caught my attention. At about the 19:30 mark of Part 6, the presentation turns to large-scale graph analytics and claims that a DGX A100 rack can compute PageRank (PR) on a 128-billion-edge web graph at 688 billion edges per second. I…
If you’ve read my last two articles, Measuring Graph Analytics Performance and Adventures in Graph Analytics Benchmarking, you know that I’ve been harping on graph analytics benchmarking a lot lately. You also know that I use the GAP Benchmark Suite from the University of California, Berkeley, because it’s easy to run, tests multiple graph algorithms and topologies, provides good coverage of the graph analytics landscape, and, most important, gives comprehensive, objective, and reproducible results. However, GAP doesn’t cover community detection in social networks.
The Louvain algorithm for finding communities in large networks is a possible candidate to…
With all the attention graph analytics is getting lately, it’s increasingly important to measure its performance in a comprehensive, objective, and reproducible way. I covered this in a previous article, in which I recommended using an off-the-shelf benchmark like the GAP Benchmark Suite from the University of California, Berkeley. There are other graph benchmarks, of course, like LDBC Graphalytics, but they can’t beat GAP for ease of use. There’s significant overlap between GAP and Graphalytics, but the latter is an industrial-strength benchmark that requires a special software configuration.
A graph is a good way to represent a set of objects and the relations between them (Figure 1). Graph analytics is the set of techniques for extracting information from the connections between entities.
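To make the idea concrete, here is a minimal Python sketch of one common way to hold a graph in memory, an adjacency list, along with about the simplest piece of information you can extract from the connections: each node's degree. The names and edges are hypothetical example data, and `build_adjacency` is an illustrative helper I made up for this sketch, not part of any benchmark or library.

```python
from collections import defaultdict

# Hypothetical example data: each pair is an undirected relation
# between two entities.
edges = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "alice"),
    ("carol", "dave"),
]


def build_adjacency(edge_list):
    """Return a dict mapping each node to the set of its neighbors."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        # Undirected graph: record the relation in both directions.
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


graph = build_adjacency(edges)

# Degree = number of neighbors, a basic measure of how connected a node is.
degree = {node: len(neighbors) for node, neighbors in graph.items()}

# The node with the most connections in this toy graph.
most_connected = max(degree, key=degree.get)
```

Real graph frameworks typically use more compact layouts than a dictionary of sets, but the principle is the same: store the objects, store the relations, then compute over the connections.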
Senior Principal Engineer at Intel Corporation