Exercise 1

  • Print the first 6 rows of the built-in R data frame trees

  • Print only the column names

  • What is the dimension of trees?

  • Plot the trees' height and volume as functions of their girth in two separate graphs. Make sure the axis labels are clear

  • In each graph, add a red dashed line showing the relationship that you observe (constant average value, linear correlation…)

  • Explain your choice and report the corresponding values (average value and standard deviation, or slope and intercept with their standard errors). Round the values to 2 decimal places.
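
One possible way to approach these steps in base R is sketched below. It is a starting point rather than a model answer: the variable names fit_height and fit_volume are made up for illustration, the axis units are taken from the ?trees help page, and fitting a straight line to both graphs is an assumption that you should check against what the plots actually show (a horizontal line at the mean, drawn with abline(h = ...), is the alternative).

```r
# Sketch for Exercise 1, base R only.
data(trees)

head(trees)    # first rows; head() shows 6 rows by default
names(trees)   # column names only
dim(trees)     # number of rows, number of columns

# Linear models of height and volume against girth.
# Assumption: a straight line is the relevant description in both cases.
fit_height <- lm(Height ~ Girth, data = trees)
fit_volume <- lm(Volume ~ Girth, data = trees)

# Two separate graphs with explicit axis labels (units from ?trees).
plot(Height ~ Girth, data = trees,
     xlab = "Girth (in)", ylab = "Height (ft)")
abline(fit_height, col = "red", lty = 2)   # lty = 2 draws a dashed line

plot(Volume ~ Girth, data = trees,
     xlab = "Girth (in)", ylab = "Volume (cubic ft)")
abline(fit_volume, col = "red", lty = 2)

# Intercept and slope with their standard errors, rounded to 2 decimals.
round(summary(fit_height)$coefficients[, c("Estimate", "Std. Error")], 2)
round(summary(fit_volume)$coefficients[, c("Estimate", "Std. Error")], 2)

# If a constant value describes the data better, report these instead.
round(c(mean = mean(trees$Height), sd = sd(trees$Height)), 2)
```

The coefficient table returned by summary() has one row per fitted parameter, so the intercept and slope and their errors can be read off together.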


Exercise 2

  • Print the first 3 rows of the built-in R data frame USArrests. This data set contains statistics about violent crime rates by US state. The numbers are given per 100 000 inhabitants, except for UrbanPop, which is a percentage.

  • What is the average murder rate in the whole country?

  • Which state has the highest assault rate?

  • Create a subset of USArrests containing the data for states with an urban population of 80% or more.

  • How many states does that correspond to?

  • Within these states, which state has the lowest rape rate?

  • Print this subset ordered by decreasing urban population.

  • Print this subset ordered by decreasing urban population and, for states with equal urban population, by increasing murder rate.

  • Plot a histogram of the percentage of urban population with a bin width of 5%. Add a vertical red line marking the average value. Make sure the x axis shows the [0,100] range.

  • Is there a correlation between the percentage of urban population and the various violent crime rates? Support your answer with plots.
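
The steps of this exercise could be sketched in base R as follows. This is a hedged outline, not a reference solution: the name urban is hypothetical, the "country average" is computed as a plain mean over states (an assumption, since it ignores differences in state population), and pairs() with cor() is only one of several ways to look for correlations.

```r
# Sketch for Exercise 2, base R only.
data(USArrests)

head(USArrests, 3)   # first 3 rows

# Assumption: unweighted mean over the states, not weighted by population.
mean(USArrests$Murder)

# State names are stored as row names.
rownames(USArrests)[which.max(USArrests$Assault)]

# States with an urban population of 80% or more.
urban <- subset(USArrests, UrbanPop >= 80)
nrow(urban)                              # how many states
rownames(urban)[which.min(urban$Rape)]   # lowest rape rate in the subset

# Decreasing urban population; the minus sign reverses the order.
urban[order(-urban$UrbanPop), ]
# Decreasing urban population, ties broken by increasing murder rate.
urban[order(-urban$UrbanPop, urban$Murder), ]

# Histogram with 5%-wide bins over the full [0, 100] range.
hist(USArrests$UrbanPop, breaks = seq(0, 100, by = 5), xlim = c(0, 100),
     xlab = "Urban population (%)", main = "Urban population by state")
abline(v = mean(USArrests$UrbanPop), col = "red")

# All pairwise scatter plots, plus the matching correlation coefficients.
pairs(USArrests)
round(cor(USArrests), 2)
```

Passing several keys to order() sorts by the first key and uses the later ones only to break ties, which is what the two-criteria ordering relies on.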