• Note: All plots should be performed using ggplot2.

Part 1: Data wrangling

Here we will recursively treat some x-ray diffraction (XRD) data obtained during a high-pressure experiment at the ESRF (the synchrotron in Grenoble). In this experiment, we measured the XRD patterns as a function of the pressure for two samples of the same rock, nicknamed PY02. PY02 is a pyrobitumen, i.e. some coal-like amorphous and porous carbon structure. The goal was to measure the bulk modulus \(\kappa\) of this rock, defined as: \[\frac{1}{\kappa}=-\frac{1}{V}\frac{\partial V}{\partial P}.\] The bulk modulus characterizes the volume reduction of a sample as a function of the applied pressure, it’s unit being a pressure unit (usually in the GPa range).

The bulk modulus is usually obtained during a high pressure experiment thanks to the Murnaghan equation of state:

\[ \frac{V}{V_0}=\left(1+P\frac{\kappa'}{\kappa}\right)^{-1/\kappa'} \] where \(V\) is the volume of the unit cell, \(V_0\) the volume at ambient pressure, and \(\kappa'\) is the pressure derivative of \(\kappa\). In first approximation, we will consider that \(\kappa'\) is constant and that \(\kappa'\simeq4\).

  1. Define the Murnaghan(P, K, Kp) function that, given the pressure \(P\), the bulk modulus \(K\) and its derivative \(Kp\) (defaulting to \(Kp=4\)), returns the relative volume \(\frac{V}{V_0}\).

  2. Find in the Data folder all the files containing the experimental diffraction patterns. They are under the form PY02_Pxx.csv or PY02bis_Pxx.csv. Store this list of files in a vector called flist. Attention, there are some .csv files that are note wanted in this list.

  3. Read and store the two files containing the pressures of each run, PY02_pressures.dat and PY02bis_pressures.dat. Combine these two tables into a single tidy one called Pressures, containing three columns run, P.kbar and sample (sample being either PY02 or PY02bis).

  4. Define the norm01(x) function that given a vector, returns this vector normalized to [0,1]

  5. Now let’s read all the data files in flist and store the result in a tidy tibble called data with three columns, file, q and int (the data files containing the diffracted intensity as a function of the scattering vector \(q\)). Either use the tidyverse-friendly version using purrr::map(), or look into the read_csv() function and notice that it can take a vector of file names as argument, and then check the id parameter. Add the int_n column containing the normalized intensity for each file (i.e. each file’s data should be normalized to [0,1], do not normalize the whole int column directly, you need to use a grouping operation).

In case you struggled to get there, I provide the wanted data tibble as a csv file in the Data folder. Simply run:

data <- read_csv("Data/data.csv")
  1. Plot all the normalized diffractograms on top of each other (shifted vertically by 1), with lines of a different color for each file. Define nice axis labels, such as Scattering Vector q [1/Å] and Intensity [arb. units].

  2. We are interested in the position of the first peak around \(q=1.8\) Å\(^{-1}\). Starting from data, create the pospeak tibble that stores the position Q of the maximum of each diffractogram. You will do so by recursively (use the pipe %>%):

    • Make sure to filter data for scattering vectors below 2.5 Å\(^{-1}\).
    • For each file, summarize the data by creating the columns
      • Q, storing the position of the maximum in intensity
      • dQ, storing the error on the value of Q. We will estimate it as 0.5% of the Q values.
    • Using the function separate(column_name, c("column","name")), add the columns run and sample that transforms the column file to the run number and sample name while getting rid of the extension. Make sure the column run contains a number (so, remove the string “P”).
    • Finally, order the tibble by ascending run for each sample
  3. Now join this table together with the Pressures one in order to get the sample names and pressures in this table. Finally, add the new column P containing the pressure in GPa (1 GPa = 10 kbar).

In case you struggled to get there, I provide the wanted pospeak tibble as a csv file in the Data folder. Simply run:

pospeak <- read_csv("Data/pospeak.csv")

Part 2: Fitting

  1. We now need to estimate the volume variation and the error on its measurement. Normally, we would need to get the crystal lattice volume, but as mentioned earlier, the PY02 sample is an amorphous rock, and thus it has no crystalline order. In first approximation, we can consider that the volume reduction is linked to the inter-atomic distance reduction through: \[ \frac{V}{V_0}=\left(\frac{r}{r_0}\right)^3, \] where \(r\) is the most representative inter-atomic distance (i.e. the one whose Fourier transform will have the highest intensity), the scattering vector \(q\) being linked to the distance \(r\) by \(q=2\pi/r\). In our case, \(r \simeq 2\pi/1.8\simeq 3.5\) Å is the distance between two graphitic-like planes. Using this, add to the pospeak tibble the columns VV0 and dVV0 containing, respectively, the volume reduction and an estimation of its measurement error.
    • Make sure that the V0 part refers to the value at ambient pressure (i.e. \(P=0\)) of the corresponding sample.
    • Make sure you propagate the errors in a proper way.

In case you struggled to get there, I provide the wanted pospeakVV0 tibble as a csv file in the Data folder. Simply run:

pospeakVV0 <- read_csv("Data/pospeakVV0.csv")
  1. Using nls(), fit the experimental data of the volume relative variation with the Murnaghan equation of state. Make sure to use the proper weighing (\(1/\sigma^2_{V/V_0}\)). We fix the derivative value at \(Kp=4\). The bulk modulus \(K\) should be of the order of a few GPa, so work with pressures in GPa.

  2. Store the resulting values of \(K\) and it’s fitting standard error \(dK\) in two variables. Round the values to the appropriate floating number. You may use the broom::tidy() function.

  3. Finally, make the plot of the experimental points (a color per sample) and the fit. Make it look like so: