Big Bang Initialization of Variable Activity in a SAT Solver

Investigating (almost) all variables leads to a better search path.

Shuji Narazaki
Backjump technologies
Oct 29, 2019


Last update: 2019-10-29.

The Effect of Restarts

Restarts, proposed by [Gomes98] in the context of SAT-solver research, are known to be a useful mechanism for solving satisfiability problems quickly; these problems have enormous search spaces and are NP-complete. Why are restarts so effective? I used to think that a single restart drastically changes which part of the search space is being explored. However, variable activities, which are not reset by a restart, are used to select the next decision variable, so a restart is really just a way of reordering the decision variables. One might still expect this reordering to have a big impact, but in fact the set of variables assigned after a restart seems very similar to the set assigned before it. Here I try to give some evidence.

The figures below were drawn from test runs using a modified version of my SAT solver, splr, which is based on Glucose 4.1. During execution, the solver collects some statistics:

  • The number of variables that have been selected as a first unique implication point (FUP), shown in gold in the figures.
  • The number of variables that have been selected as an FUP multiple times, shown in red.
  • The number of conflicting variables, shown in blue.
  • The number of variables that conflicted multiple times, shown in green.

These counts are bounded above by the number of variables in the problem; they aren't conflict counters.
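To make these four counters concrete, here is a small illustration in Rust; the structure and names are mine, not splr's actual implementation.

    // Illustrative counters for the statistics described above.
    // These names are mine, not splr's actual implementation.
    struct VarStats {
        fup_count: Vec<usize>,      // times each variable was a first UIP (FUP)
        conflict_count: Vec<usize>, // times each variable appeared in a conflict
    }

    impl VarStats {
        fn new(num_vars: usize) -> Self {
            VarStats {
                fup_count: vec![0; num_vars],
                conflict_count: vec![0; num_vars],
            }
        }

        // The four counts plotted in the figures, each bounded by the number
        // of variables: FUP at least once, FUP more than once, conflicting
        // at least once, conflicting more than once.
        fn summary(&self) -> (usize, usize, usize, usize) {
            let count = |v: &Vec<usize>, threshold: usize| v.iter().filter(|&&n| n > threshold).count();
            (
                count(&self.fup_count, 0),
                count(&self.fup_count, 1),
                count(&self.conflict_count, 0),
                count(&self.conflict_count, 1),
            )
        }
    }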

The first figure below shows the log-scaled ratios of these counts to the total number of variables, plotted against the number of conflicts, for T56.2.0.cnf, a benchmark used in recent SAT competitions with 3,145,220 variables.

I'm sorry the figure lacks information on the number of restarts, but the solver should execute many restarts based on a heuristic known as Glucose's dynamic forcing/blocking restart. In any case, the number of variables selected as important increases very slowly: even after 30,000 conflicts, only exp(-1.7) ≈ 0.18, about 20%, of the variables have caused conflicts or become FUPs.

Here is another run, on CNP-5-100.cnf, a problem with 27,585 variables and 214,401 clauses. Although the solver seems to make good progress several times, there are no big changes in terms of these criteria. The ratio of conflicting variables stays around exp(-0.6) ≈ 0.55, about 55%, for 300,000 conflicts.

And here is 10-3-13.cnf, which the solver used here cannot solve. Its numbers look different: almost all variables are eventually selected, and there are several distinct stages, but the numbers stagnate for long periods.

What do these results mean? Is this a limit of restarts? Do we need a more drastic kind of restart?

I'd like to have all variables assigned in the early stage. What would happen then?

Introducing an out-of-range initialization: big bang

An easy way to make the solver explore all variables is to keep their activities high enough during the early steps. While the range of variable activity in the most widely used heuristic, EVSIDS, is unbounded, the range in the conflict history-based branching heuristic, CHB, isn't [CHB]. CHB is a kind of exponential moving average (EMA), and its activity values usually lie in [0, 1].

The original CHB algorithm, cited from [CHB]
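For concreteness, here is a rough Rust sketch of the CHB update as described in [CHB]; the names and structure are mine, not splr's actual code. The point is that each activity is an exponential moving average of bounded rewards, so it stays in [0, 1] as long as it starts there.

    // A rough sketch of the CHB update, following the description in [CHB].
    // These names are illustrative; they are not splr's actual API.
    struct Chb {
        q: Vec<f64>,               // per-variable activity, normally in [0, 1]
        last_conflict: Vec<usize>, // conflict index at which each variable last conflicted
        alpha: f64,                // step size: starts at 0.4, decays toward 0.06
        num_conflicts: usize,
    }

    impl Chb {
        fn new(num_vars: usize) -> Self {
            Chb {
                q: vec![0.0; num_vars],
                last_conflict: vec![0; num_vars],
                alpha: 0.4,
                num_conflicts: 0,
            }
        }

        // Called for each variable touched during propagation or conflict
        // analysis. Variables on the conflict side get the larger multiplier.
        fn bump(&mut self, var: usize, in_conflict: bool) {
            let multiplier = if in_conflict { 1.0 } else { 0.9 };
            let reward = multiplier / (self.num_conflicts - self.last_conflict[var] + 1) as f64;
            // exponential moving average; stays in [0, 1] if it starts there
            self.q[var] = (1.0 - self.alpha) * self.q[var] + self.alpha * reward;
            if in_conflict {
                self.last_conflict[var] = self.num_conflicts;
            }
        }

        // Called once per conflict.
        fn on_conflict(&mut self) {
            self.num_conflicts += 1;
            if self.alpha > 0.06 {
                self.alpha -= 1e-6;
            }
        }
    }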

So, if a solver uses CHB or something like it, an initial value of a few digits, well above the normal [0, 1] range, seems to work well. To give the variables a reasonable initial ordering, their occurrence counts may serve; these are already computed by the clause/variable elimination preprocessor. After some conflicts, which can be regarded as the early stage in which the solver sparsely explores a vast search space, the activities converge back to normal CHB values.

Copying occurrence values from the eliminator's heap
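As a rough sketch of what this initialization could look like (the function and the scaling constant here are my illustration, not splr's actual code):

    // A minimal sketch of the big bang initialization, assuming a CHB-like
    // heuristic whose activities normally stay in [0, 1]. The occurrence
    // counts would come from the clause/variable elimination preprocessor;
    // the scaling below is only an example, not splr's actual setting.
    fn big_bang_initialize(activity: &mut [f64], occurrences: &[usize]) {
        let max_occ = occurrences.iter().copied().max().unwrap_or(1).max(1) as f64;
        for (v, &occ) in occurrences.iter().enumerate() {
            // Start every variable far above the normal range, ordered by its
            // occurrence count, so each one is picked as a decision variable
            // early on; the EMA update then pulls the values back into [0, 1].
            activity[v] = 1.0 + 10.0 * occ as f64 / max_occ;
        }
    }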

I’d like to call this activity configuration big bang initialization. (It may remind you of simulated annealing, since the system starts with a high energy that gradually dissipates.)

Evaluation of the proposed scheme

This big bang works well. The following figure shows the results of my new solver, splr-0.1.5, and of the previous one on the problem suite used in SAT Competition 2018. Note: this run used a 1,000-second timeout and ran 4 solver instances in parallel, due to limited computational resources.

The big bang solver (which also accumulated various small changes over 5 months) solves 155 problems (94 satisfiable and 61 unsatisfiable), while the previous one (splr-0.1.3) solves 137 problems (82 satisfiable and 55 unsatisfiable). An improvement of over 10% is the biggest progress in my solver's history. Many variants of the initial value setting are possible; as far as I have tried, the value described above is the best. Values that are too large keep the bias for too long, and giving the initial values in reverse order reduces the number of solved problems. I'm still exploring better settings.

References

  • [splr-0.1.5] https://github.com/shnarazk/splr/releases/tag/splr-0.1.5
  • [CHB] Liang, J. H., Ganesh, V., Poupart, P., & Czarnecki, K. (2016). Exponential Recency Weighted Average Branching Heuristic for SAT Solvers. Thirtieth AAAI Conference on Artificial Intelligence, 3434–3440.
  • [Gomes98] Gomes, C. P., Selman, B., & Kautz, H. (1998). Boosting Combinatorial Search Through Randomization. Proceedings of AAAI-98, 431–437.
