Markov chain Monte Carlo:
Markov chain Monte Carlo (MCMC) is based on a random walk through parameter space in which new steps are accepted or rejected according to how well the model fits the dataset (here, ESPAT profiles). An MCMC realization includes a burn-in period, during which successive iterations move toward a subdomain of parameter space close to the most likely solution. After the burn-in period, the algorithm randomly samples the parameter distributions around the best solution. While there is no definitive test for the convergence of an MCMC model, diagnostics that may suggest convergence include trace plots and post-burn-in marginal posterior distributions and covariance plots. Trace plots yield distinctive “caterpillar” patterns as the model samples a given parameter until it converges upon the solution space. The trace plots and post-burn-in marginal posterior distributions and covariance plots for each of the DMDs investigated are provided in Figures S3–S4.
The error between the observed ESPAT, f , and that of the model, p , is calculated as:
\(L_{k}=\sum{(\mathbf{f}-\mathbf{p}_{k})}^{2}\) (s6)
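A minimal numerical sketch of the misfit in Eq. (s6), assuming the observed and modeled profiles are stored as equal-length NumPy arrays (the names `f` and `p` follow the text; the function name `misfit` is illustrative):

```python
import numpy as np

def misfit(f, p):
    """Sum-of-squared-residuals misfit L_k (Eq. s6) between the
    observed ESPAT profile f and the model prediction p."""
    f = np.asarray(f, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sum((f - p) ** 2))
```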
where k is the kth step in the Markov chain. The model prediction p is computed from m ’, a perturbation of the current proposal of model parameters written in vector form as m (Eqs. (1) and (16) in the main text). The new candidate vector, m ’, is then accepted into the chain and saved as m for the next step if the criterion below is met:
\(\zeta>log(\varepsilon)\) (s7)
where \(\varepsilon\) is a number sampled from a random uniform distribution between 0 and 1 and \(\zeta\) is defined as:
\(\zeta=\gamma\varphi\) (s8)
where \(\gamma\) is an arbitrary pre-factor and \(\varphi\) is defined as:
\(\varphi=L_{k-1}-L_{k}\) (s9)
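The acceptance test of Eqs. (s7)–(s9) can be sketched as follows. This is an illustrative sketch, assuming the sign convention under which a step that lowers the misfit always passes the test (since then \(\zeta>0>log(\varepsilon)\)); the function name `accept_step` is not from the original:

```python
import numpy as np

def accept_step(L_prev, L_new, gamma=1.0, rng=None):
    """Metropolis-style acceptance rule of Eqs. (s7)-(s9).
    L_prev, L_new : misfits (Eq. s6) at the previous and candidate steps.
    gamma         : arbitrary positive pre-factor (Eq. s8).
    A candidate that lowers the misfit is always accepted, since
    zeta > 0 >= log(eps); an uphill step is accepted with some probability."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.uniform(0.0, 1.0)          # eps ~ U(0, 1)
    zeta = gamma * (L_prev - L_new)      # Eqs. (s8)-(s9)
    return bool(zeta > np.log(eps))      # Eq. (s7)
```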
In this way, if the error is lowered between successive steps, the new candidate vector is always accepted as the next step in the chain. If the error is not lowered, there is still some probability that the candidate vector will be accepted, allowing the chain to escape local extrema. The acceptance rate, defined as the percentage of candidate vectors m ’ accepted over the duration of the Markov chain, optimizes the MCMC inversion at values near ~20% for higher-dimensional problems [Roberts et al., 1997]; this can be achieved by adjusting the pre-factor \(\gamma\) until a proper acceptance rate is reached. A similar method was also used to perform MCMC inversions of the data of Saal et al. [2008] in order to constrain evaporation and cooling rates [Roberts et al., 1997].
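Putting Eqs. (s6)–(s9) together, a schematic Metropolis-style chain might look like the sketch below. The Gaussian proposal width `step`, the toy linear forward model, and the synthetic data are all illustrative assumptions, not the ESPAT model of the main text; the returned acceptance rate is what one would tune via \(\gamma\):

```python
import numpy as np

def run_chain(f, forward, m0, n_steps=5000, step=0.1, gamma=1.0, seed=0):
    """Schematic Metropolis random walk implementing Eqs. (s6)-(s9).
    f       : observed data vector
    forward : function mapping a parameter vector m to a model prediction p
    m0      : initial parameter vector
    Returns the chain of parameter vectors and the acceptance rate."""
    rng = np.random.default_rng(seed)
    m = np.asarray(m0, dtype=float)
    L = np.sum((f - forward(m)) ** 2)                    # Eq. (s6)
    chain, n_accept = [m.copy()], 0
    for _ in range(n_steps):
        m_new = m + rng.normal(0.0, step, size=m.size)   # candidate m'
        L_new = np.sum((f - forward(m_new)) ** 2)
        zeta = gamma * (L - L_new)                       # Eqs. (s8)-(s9)
        if zeta > np.log(rng.uniform()):                 # Eq. (s7)
            m, L = m_new, L_new
            n_accept += 1
        chain.append(m.copy())
    return np.array(chain), n_accept / n_steps

# Illustrative use: recover slope and intercept of a noiseless line.
x = np.linspace(0.0, 1.0, 20)
f_obs = 2.0 * x + 1.0
chain, rate = run_chain(f_obs, lambda m: m[0] * x + m[1], m0=[0.0, 0.0])
```

Inspecting the columns of `chain` as a function of step index gives exactly the “caterpillar” trace plots described above, and discarding the early portion of `chain` implements the burn-in.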