Quantum Annealing (/Monte Carlo)

For my final project, I decided to implement a simplified version of quantum annealing on Ising spin glasses, following [Santoro, Martoňák, Tosatti, Car. arXiv:cond-mat/0205280, 2002], which has been referenced in several more recent papers on quantum annealing.

Quantum annealing is the quantum version of simulated annealing, a simple and very widely applicable optimization heuristic. The basic idea of simulated annealing is to begin with some (random) solution, change a small part of it, keep the change if it improves the score or with some small probability if it worsens the score, and repeat. This small probability of occasionally making out solution worse actually makes our solution better in the end, by avoiding local optima.

For this project, the probability of accepting a change that increases the score by some amount d will be exp(-d/T), where T is a “temperature” value that is selected beforehand and is usually made to decrease over time. (We want to minimize the score, so a positive increase in score is actually bad.)

Quantum annealing promises to have an additional advantage over simulated annealing: it can tunnel through high but thin barriers in the state space, which simulated annealing has trouble crossing over. This can help it escape certain local optima more easily.

The Ising spin glass problem is to assign each element of an N-length vector s to +1 or -1, in order to minimize

for some arbitrary matrix J.

There are two things that [Santoro et. al. 2002] do: first, (after replacing the spin variables with Pauli-z operators) they add a transverse field term, which allows for quantum tunneling:

The ⟨i,j⟩ means that we only iterate over “adjacent” pairs of nodes (i,j); this is relevant when we restrict the nonzero elements of J to connect along the edges of a specific graph.

The coefficient for the transverse field is made to decay linearly from a positive number to 0 throughout the annealing.

Second, [Santoro et. al. 2002] run a Path-Integral Monte Carlo (PIMC) by maintaining multiple (P) solutions, together with a coupling factor that gets stronger over time, gradually forcing the solutions to become more similar to each other. By maintaining 20 solutions and testing on several Ising spin glass instances, consisting of square grids up to 80x80, they found that QA consistently outperformed simulated annealing, even when multiplying the number of iterations QA took by P to account for the fact that it maintains P solutions:

Results from [Santoro et. al., 2002] for a 80x80 Ising grid, averaged results over 45 annealings. the x-axis is the number of annealing iterations; the y-axis is the resulting score minus the optimal score (found with a separate algorithm).

I wanted to see if only using the multi-solution coupling and discarding everything else would yield similar improvements over classical annealing. The quantum-inspired annealing I ended up with was:

with S as the P x N matrix containing our P solutions. After annealing, I output the row s of S with the smallest score H(s). This annealing is completely classical, and there is no transverse field here.

I tested for N=100 nodes, first with all elements of J chosen in U(-2,2) (i.e. chosen uniformly randomly between -2 and 2), where elements at cells (i,j) and (j,i) of J are both chosen separately; then with nodes placed in a 10x10 grid and only vertically/horizontally adjacent pairs of nodes having their corresponding elements in J chosen in U(-2,2), where this time only one of the elements at (i,j) and (j,i) are chosen randomly (the other is set to 0). The distribution U(-2,2) is the same one that [Santoro et. al., 2002] used in their paper.

I fixed the number of annealing iterations at R=10⁵. For the conventional simulated annealing, the temperature at iteration t is set as T(t) = exp(-5*t/R), with starting temperature T(0)=1. Both R and T(t) were selected manually, as the final score for R=10⁵ and R=10⁶ does not seem to differ too much, and the constant 5 in the temperature function seems to work best.

For the quantum-inspired annealing, the coupling at iteration t is L(t) = — ln(tanh(1-t/R)) (borrowed from but not exactly the same as [Santoro et. al., 2002]), and the number of maintained solutions to be P=20 (same as the aforementioned paper). L(t) is

An important note is that in the original paper, the temperature was fixed to a small constant, and the coupling factor was actually dependent on the transverse field factor Γ, which would decay linearly. In our version, L directly depends on t, and temperature is fixed at T=0.01:

L(t) does not reach infinity because t goes from 0 to R-1, so t never reaches R.

For regular simulated annealing (SA), a single change (mutation) consists of flipping the sign of a single spin variable (element of s). For our quantum-inspired annealing, a single mutation flips a single variable of a single solution (so, a single element of the matrix S).

Our results, averaged over 50 annealings, are:

QA(min) is the minimum (best) score out of any of the P solutions we maintain during the quantum-inspired annealing; QA(med) is the median score; and QA(max) is the maximum (worst) score. Note that now the x axis is on a logarithmic scale.

The quantum-inspired annealing does slightly better in the all-random test case but slightly worse on the 10x10 grid.

Additionally, fixing the coupling factor for our annealing to L(t)=0 yields similar results, with a slight improvement on the all-random test case. This seems to indicate that coupling might not actually be helpful:

Here, our quantum-inspired annealing with zero coupling factor is labeled MSA for multi-simulated annealing (since we still maintain multiple solutions).

There is a big discrepancy between our results and those of [Santoro et. al., 2002], as that paper shows that QA is consistently better than SA. Furthermore, in a later paper [Denchev et. al., arXiv:1512.02206, 2016] on quantum annealing, the authors studied a “weak-strong cluster network” version of Ising spin glasses and ran SA, Quantum Monte Carlo (QMC, a classical imitation of quantum tunneling), and D-Wave QA, and found that QA was significantly faster than the other two algorithms. In the below chart, instead of plotting the score over time, the authors plotted the number of iterations needed for each problem size instance until each annealing algorithm obtained the optimal solution with 99% probability:

Clearly, more work needs to be done to improve simulated annealing. An immediate next step would be to implement quantum Monte Carlo and quantum annealing more properly. Future steps would be to focus on other combinatorial optimization problems, such as higher-order Ising spin glasses, traveling salesman, knapsack, or even matrix multiplication tensor decomposition (as some least-squares methods have been used in the past to find new exact low-rank decompositions, and there has been some recent surprising progress in this field; perhaps quantum annealing or other quantum algorithms could be useful here?).

References:

https://arxiv.org/pdf/cond-mat/0205280.pdf

https://arxiv.org/pdf/1512.02206.pdf

https://cs.uwaterloo.ca/~eschost/Exam/Smirnov.pdf

https://www.nature.com/articles/s41586-022-05172-4

https://arxiv.org/pdf/2212.01175.pdf

Source code:

import numpy as np, matplotlib.pyplot as plt, time, matplotlib as mpl

#### generating test case

_MODE=0
TITLE,J,score,dscore=(None,)*4
if _MODE==0:
N=100
J=np.random.uniform(-2,2,(N,N))
TITLE=f'$N={N},\\ '+r'J_{i,j}\sim U(-2,2)$'+'\naverage over 50 runs'

else:
_DR,_DC=[1,0,-1,0],[0,1,0,-1]
R,C=10,10
pts=[(r,c) for r in range(R) for c in range(C)]
N=len(pts)
pt2id={p:i for i,p in enumerate(pts)}
np.random.seed(1)
wedges=[]
for i in range(N):
r,c=pts[i]
neighbors=[]
if r<R-1: neighbors.append((r+1,c))
if c<C-1: neighbors.append((r,c+1))
for nr,nc in neighbors:
j=pt2id[(nr,nc)]
wedges.append((np.random.uniform(-2,2),(i,j)))
TITLE=r'$J_{i,j}\sim U(-2,2)$'+'\nfor adjacent i,j in 10x10 grid; all other $J_{i,j}=0$\naverage over 50 runs'
J=np.zeros((N,N))
for w,(i,j) in wedges:
J[i][j]=w

def score(J,s):
return -( s.reshape((1,-1))@J@s.reshape((-1,1)) )[0,0]
def dscore(J,s,i,v):
out=-score(J,s)
s0=s[i]
s[i]=v
out+=score(J,s)
s[i]=s0
return out

s=np.random.uniform(-1,1,size=N)
scr0=score(J,s)
for i in range(N):
s0=s[i]
v=np.random.uniform(-1,1)
s[i]=v
ans=score(J,s)-scr0
s[i]=s0
assert abs(ans-dscore(J,s,i,v))<1e-6
plt.figure()
plt.imshow(J)
plt.colorbar()

#### SA, QA

def SA(N,J,reps,temp,seed,verbose=True):
np.random.seed(seed)
sol=np.random.choice([1,-1],size=N)
scr=score(J,sol)
st,mark,accn=time.time(),0,0
scrs=[scr]
for _rep in range(reps):
T=temp(_rep/reps)
i=np.random.randint(N)
dscr=dscore(J,sol,i,-sol[i])
if dscr<=0 or np.random.random()<np.exp(-dscr/T):
scr+=dscr
accn+=1
sol[i]*=-1
scrs.append(scr)

if verbose:
t=time.time()-st
if t>=mark or _rep==reps-1:
print(f'{t=} {scr=} nreps={_rep+1} {accn=} {T=}')
mark+=10 if mark<100 else 100
return {'sol':sol,'scrs':scrs}

def QMC_pm1(N,J,P,reps,T0,Lfunc,seed,verbose=True):
np.random.seed(seed)
coll=np.random.choice([1,-1],size=(P,N))
scrs=np.array([score(J,s) for s in coll])
coups=np.array([np.sum(a*b) for a,b in zip(coll,coll[1:])])
st,mark,accn=time.time(),0,0
stats={k:[] for k in {'min','max','med'}}
Ls=[]
for _rep in range(reps):
L=Lfunc(_rep/reps)
qscr=np.sum(scrs)-L*np.sum(coups)
p,i=np.random.randint(P),np.random.randint(N)
s0,c0=scrs[p],( (coups[p-1] if p>0 else 0) , (coups[p] if p<P-1 else 0) )
scrs[p]+=dscore(J,coll[p],i,-coll[p][i])
coll[p][i]*=-1
if p>0: coups[p-1]=np.sum(coll[p-1]*coll[p])
if p<P-1: coups[p]=np.sum(coll[p]*coll[p+1])
nscr=np.sum(scrs)-L*np.sum(coups)
if nscr<=qscr or np.random.random()<np.exp(-(nscr-qscr)/T0):
accn+=1
else:
coll[p][i]*=-1
scrs[p]=s0
if p>0: coups[p-1]=c0[0]
if p<P-1: coups[p]=c0[1]

stats['min'].append(min(scrs))
stats['max'].append(max(scrs))
stats['med'].append(sorted(scrs)[P//2])
Ls.append(L)

if verbose:
t=time.time()-st
if t>=mark or _rep==reps-1:
print(f'{t=} {qscr=} mscr={stats["min"][-1]} nreps={_rep+1} {accn=} {L=}')
mark+=10 if mark<100 else 100
return {'sol':min(coll,key=lambda s:score(J,s)),'stats':stats,'Ls':Ls}

R=100_000
T0=1
temp=lambda a:T0*np.exp(-a*5)
Lfunc=lambda a:-np.log(np.tanh(1-a))

seeds=list(range(1,51))
sas,qds=[],[]
st=time.time()
for s in seeds:
sas.append(SA(N,J,R,temp,s,verbose=False))
qds.append(QMC_pm1(N,J,20,R,0.01,Lfunc,s,verbose=False))
print(s,time.time()-st)

#### plotting output

def mean_arr(As):
return np.array(list(As)).mean(axis=0)
mpl.rcParams['figure.dpi']=300
cmap=mpl.colormaps['tab10']
plt.figure()
plt.plot(mean_arr(sa['scrs'] for sa in sas),label='SA',c=cmap(0))
for k in ['min','med','max']:
plt.plot(mean_arr(qd['stats'][k] for qd in qds),label=f'QA({k})',c=cmap(1),
linestyle={'min':'-','med':'--','max':'dotted'}[k])
plt.title(TITLE)
plt.ylabel('score')
plt.xlabel('iters')
plt.xscale('log')
plt.legend()

--

--