Extra Credit: Parallel Tempering

Extra Credit

This entire page is extra credit! You will receive 20 points extra credit for completing this page.

You can do it in C++ or Python (but only the parallelization instructions for C++ are currently here).

Background

One of the problems with Monte Carlo methods, which we saw briefly before, is that the autocorrelation time can be long. Significant autocorrelation reduces the number of independent samples generated by the simulation, worsening the statistics of its predictions or (equivalently) extending the run-times needed for reliable results.

One place where long autocorrelation times are common is in systems with multiple low-energy states. We’ve seen one example of this in our Ising model simulator. Mathematically, our Markov chain ought to sample the “all-spin-up” and “all-spin-down” states with equal probability. But if, at any step, the chain is in the “all-spin-up” state, it has a very hard time getting to “all-spin-down.” In practice this sampling issue isn’t much of a problem for the Ising model, because the real world exhibits it as well (see spontaneous symmetry breaking). But there are other cases where it is severely problematic.

A more pernicious issue arises near a critical point. There we encounter critical slowing down: correlations exist on all length scales, and the chain must be able to move between large spin-up clusters and large spin-down clusters.

There are various ways to get around poor sampling, but parallel tempering is one of the most effective. In parallel tempering, we don’t run just one Monte Carlo simulation; we run many simulations, at different temperatures, and occasionally swap their configurations. Note the difference from simply running multiple independent simulations in parallel: we occasionally allow the simulations to exchange configurations, so a high-temperature simulation might get a configuration from a low-temperature one and vice versa.

If we are going to do this, we have to be a bit careful. In particular, any naive application of this idea is going to mess up detailed balance. Suppose, for example, we ran one simulation at infinite temperature and one at zero temperature and simply swapped their configurations every 100 steps. Then the zero-temperature simulation would spend roughly \(1\%\) of its time in a wildly improbable configuration.

Instead we need to accept the swap move only with the appropriate probability; that is, we need to build a Markov chain which has the right stationary distribution.

The first thing we should ask, then, is what stationary distribution we want to achieve. Essentially, what we want is that each temperature, considered on its own, has the right probability distribution. Suppose we are working with two inverse temperatures, \(\beta_1\) and \(\beta_2\). We can accomplish this by taking the joint probability distribution of configurations to be

\[P(c_1,c_2) \propto \exp[-\beta_1 E(c_1)] \exp[-\beta_2 E(c_2)]\]

We are going to use all the same Monte Carlo moves as before (within each simulation), but we are going to add one additional move that swaps the current configurations between the two temperatures. Writing \(\pi_i(c) \propto \exp[-\beta_i E(c)]\) for the stationary distribution at \(\beta_i\), the probability of accepting such a swap is

\[ A(\textrm{swap}) = \min\left(1, \frac{\pi_2(c_1)\pi_1(c_2)}{\pi_2(c_2)\pi_1(c_1)} \frac{T(n \rightarrow o)}{T(o \rightarrow n)}\right) \]
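Since the swap proposal is symmetric, the \(T\) ratio is 1, and the acceptance probability simplifies to \(\min\left(1, \exp[(\beta_1-\beta_2)(E(c_1)-E(c_2))]\right)\). As a minimal sketch (the function name is hypothetical):

#include <algorithm>
#include <cmath>

// Probability of accepting a swap of the configurations at beta1 and
// beta2, given their current energies E1 and E2. The proposal is
// symmetric, so the T ratio drops out.
double SwapAcceptProb(double beta1, double beta2, double E1, double E2){
   return std::min(1.0, std::exp((beta1-beta2)*(E1-E2)));
}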

Implementing in C++

In the process we will also learn how to do parallelization using MPI.

Parallel Tempering:

  1. Run, in parallel, many different Ising models.

  2. Occasionally swap configurations using the Monte Carlo acceptance rule above.

To go about this you are going to use MPI.

First, let’s make sure that you can compile with MPI. Check that you can type
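a compile line of this shape, where mpic++ is the standard MPI compiler wrapper and ising.cpp stands in for your own source file:

mpic++ ising.cpp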

(or your equivalent) and that it compiles successfully.

Secondly, to run things you want to do
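a launch command of this shape, which starts two processes on the local machine:

mpirun -np 2 ./a.out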

(It seems that on Ubuntu under bash you want to do mpirun -np 2 -host localhost ./a.out.)

This runs two copies of your Ising model simulation.

The next step is to make sure each processor knows what “number” it is.

To do this, download Communication.tar.

Go ahead and include the communication header from that archive in your code.
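As a minimal test, assuming the archive provides a header named Communication.h defining the CommunicatorClass used below (both names are guesses at the archive’s contents):

#include "Communication.h"  // assumed header name from Communication.tar
#include <iostream>

int main(int argc, char** argv){
   CommunicatorClass myComm;  // communicator wrapper assumed to come with the archive
   std::cout << "I am process " << myComm.MyProc()
             << " of " << myComm.NumProcs() << std::endl;
   return 0;
}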

When you run this, you will see that each process writes out a different number.

Now, we just need to write some code that looks like this:

// Give each process its own temperature based on its rank.
double temperature=0.0;
if (myComm.MyProc()==0){
   temperature=1.0;
}
else if (myComm.MyProc()==1){
   temperature=2.0;
}
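Hard-coding one branch per rank doesn’t scale to many processes. A sketch that instead spreads the ranks over an evenly spaced temperature grid (the bounds Tmin and Tmax are arbitrary illustrative choices):

// Evenly spaced temperatures across all processes; Tmin and Tmax are
// illustrative bounds, not required values.
double Tmin=1.0, Tmax=4.0;
int np=myComm.NumProcs();
double temperature = (np>1) ? Tmin + (Tmax-Tmin)*myComm.MyProc()/(np-1)
                            : Tmin;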

Now you should be able to write code in which the simulations don’t talk to each other but simply run a bunch of different temperatures at once.
The next step is to get the processes talking to each other so that we can swap the configurations.

Let’s add

// Gather the current energy from every process so that each process
// ends up with the full list of energies, indexed by rank.
vector<double> send_energy(1);
send_energy[0]=myEnergy;
vector<double> energies(myComm.NumProcs());
myComm.AllGather(send_energy,energies);

Then every processor has a copy of all the energies.

Other useful commands include

// Broadcast the betas from process 0 to all other processes.
vector<double> betas;
myComm.BroadCast(0,betas);

A good model for how to go about this is to have the root process (number zero) make the decisions about what to do. It can then relay those decisions to the other processes, which carry them out.

We have talked about this conceptually as moving around configurations. But the configurations are actually pretty large, so it is better to move around the temperatures instead. The only catch is that this makes it a bit tricky to keep track of the magnetization and other observables: each process must bin its measurements by whichever temperature it currently holds.
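Putting these pieces together, here is a minimal sketch of a single swap attempt in which the root gathers everyone’s energy and beta, picks a random pair of processes, applies the Metropolis rule, and broadcasts the (possibly permuted) beta list back. The names (AttemptSwap, myBeta, myEnergy) and the random-pair proposal are illustrative assumptions, not a required design; in practice one usually proposes swaps between neighboring temperatures to keep the acceptance rate high.

#include <algorithm>  // std::min, std::swap
#include <cmath>      // std::exp
#include <cstdlib>    // rand, RAND_MAX
#include <vector>
using std::vector;

void AttemptSwap(CommunicatorClass &myComm, double &myBeta, double myEnergy){
   int np=myComm.NumProcs();
   if (np<2) return;

   // Collect every process's current energy and beta, indexed by rank.
   vector<double> send_energy(1), send_beta(1);
   send_energy[0]=myEnergy;
   send_beta[0]=myBeta;
   vector<double> energies(np), betas(np);
   myComm.AllGather(send_energy,energies);
   myComm.AllGather(send_beta,betas);

   if (myComm.MyProc()==0){
      // Root proposes swapping the betas of two distinct random processes.
      int i=rand()%np;
      int j=(i+1+rand()%(np-1))%np;
      double A=std::min(1.0,
         std::exp((betas[i]-betas[j])*(energies[i]-energies[j])));
      if ((double)rand()/RAND_MAX < A)
         std::swap(betas[i],betas[j]);
   }

   // Root relays its decision: everyone receives the updated beta list
   // and adopts the entry for its own rank.
   myComm.BroadCast(0,betas);
   myBeta=betas[myComm.MyProc()];
}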

Grading

For this page, you need to show that you have run your simulation successfully using parallel tempering and, from this single parallel-tempering run, produced a magnetization curve as well as a \(\langle m^2 \rangle\) curve as a function of temperature. To check this, you should be able to compare against your previous runs. Your \(\langle m^2 \rangle\) curve should look the same, but your magnetization curve probably looks different. Why is this?

Simulated Tempering

Extra Extra Credit (2 points)

This section is extra extra credit.

So far we’ve described an algorithm that runs in parallel and simultaneously generates samples at all temperatures. There is an alternative algorithm that accomplishes something very similar but runs in series.

Start with a normal Markov chain (at inverse temperature \(\beta\)). Then add a Monte Carlo move which changes the value of \(\beta\) according to the standard accept/reject criterion. Record your observables as a function of \(\beta\). This will give you a similar result to parallel tempering, although you may have to add importance-sampling weights to compensate for the unequal amount of time the chain spends at each \(\beta\).
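A minimal sketch of such a \(\beta\)-change move, assuming a fixed grid of betas and per-beta weights g (all zeros means no correction); the function name and grid are illustrative. Taking \(\pi_i(c) \propto \exp[-\beta_i E(c) + g_i]\), the acceptance ratio for hopping \(\beta_i \rightarrow \beta_j\) at a fixed configuration is \(\exp[-(\beta_j-\beta_i)E + g_j - g_i]\).

#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <vector>

// Propose hopping one step up or down a fixed beta grid; returns true
// and updates betaIndex in place if the move is accepted. E is the
// energy of the current configuration.
bool TemperingMove(int &betaIndex, double E,
                   const std::vector<double> &betas,
                   const std::vector<double> &g){
   int proposed = betaIndex + (rand()%2 ? 1 : -1);
   if (proposed<0 || proposed>=(int)betas.size())
      return false;  // reject moves off the ends of the grid
   double logA = -(betas[proposed]-betas[betaIndex])*E
                 + g[proposed]-g[betaIndex];
   if ((double)rand()/RAND_MAX < std::exp(std::min(0.0,logA))){
      betaIndex=proposed;
      return true;
   }
   return false;
}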

Simulated Annealing

Extra Extra Credit (2 points)

This section is extra extra credit.

So far we’ve been doing MCMC on the Ising model to sample from the Boltzmann distribution. One can also use the inspiration of simulated tempering to do optimization. Instead of sampling \(\beta\), one simply starts at small \(\beta\) (high temperature) and then slowly increases it (i.e., cools down). This process is called simulated annealing and can be applied to find not only the ground state of the Ising model but also the solutions of many other optimization problems.
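A minimal sketch of the cooling loop, assuming you already have a Metropolis sweep routine Sweep(config, beta) and a configuration object config from your Ising code (both names are assumptions); the schedule itself is an arbitrary illustrative choice:

// Geometric cooling schedule: run ordinary Metropolis sweeps while
// slowly increasing beta. Sweep() and config come from your Ising code.
double beta=0.1;              // start hot (small beta)
const double betaMax=10.0;    // effectively frozen by here
const double coolingRate=1.05;
while (beta<betaMax){
   for (int sweep=0; sweep<100; ++sweep)
      Sweep(config,beta);     // your existing Metropolis sweep at this beta
   beta*=coolingRate;
}
// config now approximates the minimum-energy (ground-state) configuration.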