Topic: graphical data analysis / visualization


EXERCISE 1.1 Continuous variable

Open the file with house prices in London in January 2019 (HP_LONDON_JAN19.xlsx), which can be downloaded from the course website.

  1. Explore the PRICE variable using histograms.
  2. Compare the distribution of the house prices for the different property types using boxplots.
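
As a starting point, both parts can be sketched in base R as follows. This is only a sketch under assumptions: the data frame is a made-up stand-in for the course file, and the column name PROPERTY_TYPE is hypothetical (only PRICE is named in the exercise), so check the actual column names after loading the file.

```r
# Toy stand-in for the course data (made-up values). The real file could be
# read with, for example: hp <- readxl::read_excel("HP_LONDON_JAN19.xlsx")
set.seed(1)
hp <- data.frame(
  PRICE = round(rlnorm(200, meanlog = 13, sdlog = 0.5)),
  # PROPERTY_TYPE is a hypothetical column name
  PROPERTY_TYPE = sample(c("Detached", "Flat", "Terraced"), 200, replace = TRUE)
)

# Part 1: histogram of PRICE; the visible shape depends on the number of
# bins, so it is worth trying a few values of `breaks`
hist(hp$PRICE, breaks = 30, main = "House prices", xlab = "PRICE")

# Prices are often right-skewed; a log scale can make the bulk easier to read
hist(log10(hp$PRICE), breaks = 30,
     main = "House prices (log scale)", xlab = "log10(PRICE)")

# Part 2: side-by-side boxplots of PRICE per property type
bp <- boxplot(PRICE ~ PROPERTY_TYPE, data = hp,
              xlab = "Property type", ylab = "PRICE")
```

The object returned by boxplot() holds the five-number summaries per group (bp$stats) and the group sizes (bp$n), which can help when describing the differences in words.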


EXERCISE 1.2 Categorical variable

This exercise uses the same data file as exercise 1.1 (HP_LONDON_JAN19.xlsx).
Use R to create the graphs in Figure 1 and Figure 2 below.

Figure 1

Figure 2


EXERCISE 1.3 Categorical variable

This exercise uses the same data file as exercise 1.1 (HP_LONDON_JAN19.xlsx).
Create a new variable ‘WEEKDAY’ (values: SUN, MON and so on).

  1. Create a barplot with the number of properties sold on each weekday in January 2019.
  2. Why is the graph created in part 1 misleading?
  3. Create a graph that correctly shows the distribution of the number of properties sold per weekday (Sunday, Monday, etc.).
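
One possible approach is sketched below in base R. It rests on assumptions: the sales data are a made-up stand-in, and the column name DATE is hypothetical (the exercise does not name the date column). The sketch derives WEEKDAY in a locale-independent way and then counts how often each weekday occurs in January 2019, which is a useful quantity to look at for parts 2 and 3.

```r
days <- c("SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT")

# Toy stand-in: one row per sale, with a hypothetical DATE column
set.seed(1)
jan <- seq(as.Date("2019-01-01"), as.Date("2019-01-31"), by = "day")
hp <- data.frame(DATE = sample(jan, 500, replace = TRUE))

# as.POSIXlt()$wday is 0 for Sunday up to 6 for Saturday, whatever the locale
hp$WEEKDAY <- factor(days[as.POSIXlt(hp$DATE)$wday + 1], levels = days)

# Part 1: raw number of sales per weekday
sales <- table(hp$WEEKDAY)
barplot(sales, main = "Sales per weekday", ylab = "Number of sales")

# Number of times each weekday occurs in January 2019
n_days <- table(factor(days[as.POSIXlt(jan)$wday + 1], levels = days))
print(n_days)

# Part 3 (one option): average number of sales per calendar day of each type
barplot(sales / n_days, main = "Average sales per day, by weekday",
        ylab = "Sales per day")
```

Dividing by n_days is only one way to make the weekdays comparable; whether it is the intended answer is left to the exercise.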


EXERCISE 1.4 Bivariate analysis: two categorical variables

Navigate to the DUO website.
Choose: Databestanden/Hoger Onderwijs/Ingeschreven/hbo.
Download the file 01a.Ingeschrevenen hbo 2019.xlsx.

  1. Create a table with the number of male (GESLACHT == ‘man’) and female (GESLACHT == ‘vrouw’) students per master program (use the variable CROHO ONDERDEEL) at Dutch Universities of Applied Sciences in 2021.
  2. Create barplots with the number of male and female students per master program at Dutch Universities of Applied Sciences in 2021.
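
The table and the barplot can be sketched in base R as follows. The layout of the DUO file is not described here, so this is an illustration under assumptions: the counts are made up, the data are assumed to be already restricted to master programs, the space in CROHO ONDERDEEL is replaced by an underscore, and the count column N is hypothetical.

```r
# Toy counts (made-up numbers), assumed already restricted to master programs.
# N is a hypothetical column holding the number of enrolled students.
hbo <- data.frame(
  CROHO_ONDERDEEL = rep(c("economie", "onderwijs", "techniek"), each = 2),
  GESLACHT = rep(c("man", "vrouw"), times = 3),
  N = c(120, 150, 80, 210, 160, 40)
)

# Part 1: cross-table of student counts, CROHO ONDERDEEL by GESLACHT
tab <- xtabs(N ~ CROHO_ONDERDEEL + GESLACHT, data = hbo)
print(tab)

# Part 2: grouped barplot; t(tab) puts GESLACHT in the rows, so each group
# of bars is one CROHO ONDERDEEL with one bar per gender
barplot(t(tab), beside = TRUE, legend.text = TRUE, las = 2,
        ylab = "Number of students")
```

If the real file has one row per institution and program, xtabs() sums the counts within each cell, so no separate aggregation step should be needed.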


EXERCISE 1.5 Bivariate analysis: two numerical variables

The file RoomsForRentNeth.xlsx contains information about rooms for rent in three large cities in the Netherlands: Amsterdam, Rotterdam and The Hague.
The data were sampled from https://directwonen.nl/ on April 24, 2018.

  1. Generate a scatterplot with PRICE as Y-variable and AREA as X-variable.
  2. Is there a relationship between PRICE and AREA for the rooms for rent in the three cities?
  3. Adjust the plot by mapping CITY to color. What stands out?
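
The three steps can be sketched in base R as follows. The data frame is a made-up stand-in that uses the column names from the exercise (PRICE, AREA, CITY); its values say nothing about the real answers. The color choices are arbitrary.

```r
# Toy stand-in for RoomsForRentNeth.xlsx (made-up values)
set.seed(1)
cities <- c("Amsterdam", "Rotterdam", "The Hague")
rooms <- data.frame(
  CITY = factor(sample(cities, 150, replace = TRUE), levels = cities),
  AREA = round(runif(150, min = 8, max = 40))
)
rooms$PRICE <- round(200 + 15 * rooms$AREA + rnorm(150, sd = 80))

# Part 1: scatterplot with AREA on the x-axis and PRICE on the y-axis
plot(PRICE ~ AREA, data = rooms, xlab = "AREA", ylab = "PRICE")

# Part 2: the Pearson correlation gives a rough numeric summary of a
# linear relationship; the plot itself remains the main evidence
r <- cor(rooms$AREA, rooms$PRICE)
print(r)

# Part 3: one color per city; indexing by the factor uses its level codes
cols <- c("steelblue", "darkorange", "forestgreen")
plot(PRICE ~ AREA, data = rooms, col = cols[rooms$CITY], pch = 19,
     xlab = "AREA", ylab = "PRICE")
legend("topleft", legend = levels(rooms$CITY), col = cols, pch = 19)
```

With ggplot2, the same mapping would be written as aes(x = AREA, y = PRICE, color = CITY).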