Graph Processing: Tricks
Heuristic Code Switch, Delay Update, Speculation, Approximation and More
ATC15: GraphQ: Graph Query Processing with Abstraction Reﬁnement
(Single node out-of core) Quey can be done on partial graph. This paper partitions the graph and find answers first in subgraphs. If we are lucky to have a correct ouput, it stops. If not successful, it merges partitions (add back missing edges) and goes back to check. To enjoy better luck/performance and avoid whole-graph degradation, the attempts are guided by a smaller abstraction graph.
ASPLOS12: Green-Marl: A DSL for Easy and Efﬁcient Graph Analysis
Domain Specific Languages (DSL) has one (and maybe only one) intrinsic weakness-lack of generallity (that is what Specific means). DSLs share 2 other important advantages: 1. Programmability. You can stays within the programming model, focus on the algorithms and forget about the implementation details like boosting parallelism or handling communication 2. Performance. One would be amazed at this, because common sense informs us that another layer of abstraction often brings about overhead. It is not true: DSL is superiror to hand written lower level code, because you are free to play with the compiler and do high-level domain-specific optimizations.
architecture-independent: loop fusion, definition hoisting, . architecture-specific: selection of parallel regions. domain-specific: edge flipping.
VLDB16: Using Domain-Speciﬁc Languages For Analytic Graph Databases
ATC16: Load the Edges You Need: A Generic I/O Optimization for
Disk-based Graph Processing
This papers works on out-of-core graph systems to reduce the amount of disk IO (Challenge 2 stated in previous post). It observes that as graph algorithms converges to the end, some edges no longer contribute to updating vertex value and thus render useless. Previous work can skip an edge partition if every edge in it is not useful, but that is way too coarse-grained. It proposes dynamic edge partition.
HPDC16: Efficient Processing of Large Graphs via Input Reduction
Althought it is also a reduction technique, it targets in-memory (single node) architecture. The whole processing is divided into 2 phases. The first phase computes on a transformed subgraph in the hope tht the second phase can take advantage of these results and executes faster.
The essensial part of this paper is about how the transformation are choosen and applied. The author proposes 6 styles of legal transforms. All of them are simply combinations of operations, like adding, deleting, merging vertexes and replacing edges. By heuristic, they preserve path/connectivity strcuture of the original graph. If we want approximate results, some requirements can be relaxed. For different applications, we choose corresponding tranforms that will not violate program correctness.
The claims are convincing and this input reduction concept is promising. My only concerns are that some reduction parameters (e.g. to what extent?) need careful tuning (sound very impractical) and otherwise will not have such experiment results.