The aim of this exercise is to determine the correlations between experimental parameters and the size distributions of nickel nanoparticles obtained by dewetting. The nanoparticles were obtained by heat treatment, at various temperatures and under a hydrogen (H₂) atmosphere, of a silicon wafer on which a nickel layer of 2, 4.5, or 8 nm had previously been deposited. The treatment was carried out for 5, 10, 15, 30, or 60 minutes.
These wafers were then observed by scanning electron microscopy, and several images of the surface were taken. These images were analyzed with ImageJ, as shown in Figure 1.1. From this analysis, we obtained a set of files named `[x]nmNi-T[y]-[z]min-[u].csv`, where:

- `[x]` is the Ni thickness (in nm) in the sample before treatment,
- `[y]` is the treatment temperature in °C,
- `[z]` is the treatment time in minutes, and
- `[u]` is the image number.

Area values are given in pixel², with scales stored in a separate file, `Data/scales.csv`.
In this tutorial, we'll see how to import data from a large number of files and aggregate them into a single tidy table. This table can then be exported in csv format, or used to generate graphs.
Load the `tidyverse` package. Set the global `ggplot2` theme to black and white. Also, make it so that the `strip.background` (the background of the facet titles) is blank, and that the `strip.text` is bold.
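One way to do this, using `theme_set()` and `theme_update()`:

```r
library(tidyverse)

# Set the black-and-white theme globally
theme_set(theme_bw())
# Blank background and bold text for the facet titles
theme_update(
    strip.background = element_blank(),
    strip.text = element_text(face = "bold")
)
```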
Find all `[x]nmNi-T[y]-[z]min-[u].csv` files in the `Data` folder and store them in `flist`. You could use the function `glob2rx()` to help write a regular expression using the wildcard sign `*`.
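For instance, combining `list.files()` with `glob2rx()` (the exact pattern below is just one possibility):

```r
# glob2rx() turns the wildcard pattern into a regular expression
flist <- list.files(
    path = "Data",
    pattern = glob2rx("*nmNi-T*-*min-*.csv"),
    full.names = TRUE
)
```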
The pixel ↔ length correspondence for each image has been stored in the file `Data/scales.csv`. Import this file into a `scales` tibble.
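With `readr`:

```r
scales <- read_csv("Data/scales.csv")
scales
```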
Now modify the `scales` tibble:

- Create a column `pix2_to_nm2` which will contain the pixel² → nm² conversion value for each image.
- Modify the `file` column to contain the file name without extension.
- Separate the `file` column into 4 columns: `thickness`, `temperature`, `time`, and `img`.
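A possible sketch of these three operations. The `pixel` and `size` column names and their meaning are assumptions (they are inferred from the columns we remove later on), with a length of `size` nm spanning `pixel` pixels on each image; parsing the labels to numbers is an extra step that makes the later join and regression easier:

```r
scales <- scales %>%
    mutate(
        # assumed: a length of `size` nm spans `pixel` pixels on the image
        pix2_to_nm2 = (size / pixel)^2,
        # drop the ".csv" extension
        file = tools::file_path_sans_ext(file)
    ) %>%
    # "2nmNi-T400-5min-1" -> "2nmNi", "T400", "5min", "1"
    separate(file, into = c("thickness", "temperature", "time", "img"), sep = "-") %>%
    # keep only the numeric part of each label, e.g. "2nmNi" -> 2
    mutate(across(c(thickness, temperature, time, img), parse_number))
```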
Let's now import all our data files into a tibble called `data`, and modify this tibble to also store the information written in the file names. We will do this in a succession of pipe operations:
- Read all the files of `flist` into `data`. You can use the `read_csv()` function to do so, and look into the `id` parameter to store the file name in a column `file`. Also, we are only interested in the `Area` column.
- Nest the `Area` column into a column `data` using the `nest()` function. This will make the next operations faster.
- Modify the `file` column so that it contains the file name without extension and path.
- Separate the `file` column into 4 columns: `thickness`, `temperature`, `time`, and `img`.
- Unnest the `data` column to get a single column `Area` containing the area values.
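A possible pipeline (the `-` separator and the numeric parsing mirror what we did for `scales`):

```r
data <- read_csv(flist, id = "file") %>%  # `id` stores each file's path in a `file` column
    select(file, Area) %>%
    # nest the Area values so the string operations below run once per file
    # instead of once per particle
    nest(data = Area) %>%
    mutate(file = tools::file_path_sans_ext(basename(file))) %>%
    separate(file, into = c("thickness", "temperature", "time", "img"), sep = "-") %>%
    mutate(across(c(thickness, temperature, time, img), parse_number)) %>%
    unnest(data)
```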
Now we want to convert the areas in pixel² to areas in nm²:
- Join the `data` and `scales` tibbles into a tibble called `alldata`.
- Modify the `Area` column to convert it to nm², and create a `diameter` column containing the diameter of the particles in nm.
- Remove the `pix2_to_nm2`, `pixel`, and `size` columns, which are now useless.
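A sketch of these operations, assuming the join key is the four identifier columns created above and that the particles can be treated as disks:

```r
alldata <- data %>%
    # bring in the conversion factor computed for each image
    left_join(scales, by = c("thickness", "temperature", "time", "img")) %>%
    mutate(
        Area = Area * pix2_to_nm2,      # pixel^2 -> nm^2
        diameter = 2 * sqrt(Area / pi)  # diameter of a disk of the same area, in nm
    ) %>%
    select(-pix2_to_nm2, -pixel, -size)
```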
In case you didn't manage to get there, the resulting tibble is provided in `Data/alldata.csv`; you can read it with `alldata <- read_csv("Data/alldata.csv")`.
Plot the particle diameter distributions using `geom_density()`, which is basically a histogram convolved with a Gaussian distribution of bandwidth `bw`. This allows for smoother graphs. Make this plot and play with the `bw` parameter.
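For example (the color mapping and faceting below are one possible choice):

```r
alldata %>%
    ggplot(aes(x = diameter, color = factor(time))) +
    geom_density(bw = 2) +                # try e.g. bw = 0.5, 2, 10
    facet_grid(thickness ~ temperature) +
    labs(x = "Diameter (nm)", color = "Time (min)")
```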
Now, store in `particles_ave` the average particle diameter and its standard deviation per substrate thickness, time, and temperature of reaction.
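For instance (the column names `diam` and `sd` are chosen here to match the regression step below):

```r
particles_ave <- alldata %>%
    group_by(thickness, temperature, time) %>%
    summarise(
        diam = mean(diameter), # average diameter (nm)
        sd   = sd(diameter),   # its standard deviation (nm)
        .groups = "drop"
    )
```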
Let's now see which parameters influence the particle diameters the most. For this, we'll perform a multiple linear regression using `lm()`, fitting the `diam` variable as a function of a combination of `thickness`, `temperature`, and `time`. We can also set the weights for the `diam` values as the inverse of their standard deviation squared (by convention). Store the result in `fit`, and print the summary of the fit.
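A minimal version with a purely additive combination of the three parameters (interaction terms could also be explored):

```r
fit <- lm(
    diam ~ thickness + temperature + time,
    data    = particles_ave,
    weights = 1 / sd^2  # inverse-variance weighting
)
summary(fit)
```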
How would you interpret the result? What are the most important parameters?