---
title: 'Session 03: Data Visualization'
author: ''
output: html_document
---

== Task 1. Configure R and load data

=== Install and load relevant packages

```{r, configure-r, eval=FALSE, echo=TRUE}
# Load required packages. The install.packages() calls are commented out
# because the packages are already installed on lab machines.
# install.packages("tidyverse")
library(tidyverse)
# install.packages("sf") # Simple Features package for working with spatial data.
library(sf)
# install.packages("tmap") # tmap library, which uses syntax very similar to ggplot2.
library(tmap)
```

=== Read in the dataset

```{r, download-data, eval=FALSE, echo=TRUE}
# A pre-prepared simple features data frame containing Local Authority outlines
# (as sfc_MULTIPOLYGON) and 2011 Census data that we'll use
# to model the vote in the remaining sessions.
data_gb <- st_read("http://homepages.see.leeds.ac.uk/~georjb/geocomp/03_datavis/data/data_gb.geojson")
```

== Task 2. Perform some exploratory data analysis

=== Do some data wrangling

```{r, data-wrangle, eval=FALSE, echo=TRUE}
# Calculate the share of the Leave vote by Region.
# Geometry is dropped first because only the summary table is needed.
region_summary <- data_gb %>%
  st_drop_geometry() %>%
  group_by(Region) %>%
  summarise(share_leave = sum(Leave) / sum(Leave, Remain)) %>%
  arrange(desc(share_leave))
print(region_summary)
```

=== Create some summary graphics

```{r, summary-graphics, eval=FALSE, echo=TRUE}
# Create a bar chart of the Leave margin by LA, ordered by descending Leave share.
data_gb %>%
  ggplot(aes(x = reorder(lad15nm, -share_leave), y = margin_leave, fill = margin_leave)) +
  geom_bar(stat = "identity", width = 1) +
  scale_fill_distiller(palette = 5, type = "div", direction = 1, guide = "colourbar", limits = c(-0.3, 0.3)) +
  # Label only three LAs on the x-axis to keep it readable.
  scale_x_discrete(breaks = c("Lambeth", "Slough", "Boston")) +
  geom_hline(yintercept = 0) +
  theme_classic() +
  xlab("LAs by Leave (desc)") +
  ylab("Margin Leave/Remain")
```

== Task 3. Perform some exploratory spatial data analysis

=== Explore spatial variation in the Leave:Remain vote

```{r, choropleth-leave, eval=FALSE, echo=TRUE}
# Generate a choropleth displaying share of Leave vote by LAD.
tm_shape(data_gb) +
  tm_fill(col = "share_leave", style = "cont", id = "geo_label", palette = "Blues", title = "") +
  tm_borders(col = "#bdbdbd", lwd = 0.5) +
  tm_layout(
    title = "LA share of Leave vote",
    title.snap.to.legend = TRUE,
    title.size = 0.8,
    legend.text.size = 0.6,
    title.position = c("right", "center"),
    legend.position = c("right", "center"),
    frame = FALSE,
    legend.outside = TRUE)
```

== Task 4. Explore bivariate associations using correlation

```{r, scatterplot, eval=FALSE, echo=TRUE}
# Calculate the correlation coefficient between share Leave and degree-educated.
# Geometry is dropped so that summarise() returns a plain one-row table.
data_gb %>%
  st_drop_geometry() %>%
  summarise(r = cor(share_leave, degree_educated))
# Generate a scatterplot of share Leave against degree-educated.
data_gb %>%
  ggplot(aes(x = share_leave, y = degree_educated)) +
  geom_point(colour = "#525252", pch = 21, alpha = 0.8) +
  theme_bw()
```

== Task 5. Generate graphical small multiples

```{r, small-multiples, eval=FALSE, echo=TRUE}
# Reshape to one row per LA and explanatory variable, then draw one
# scatterplot panel per variable against share Leave.
data_gb %>%
  st_drop_geometry() %>%
  pivot_longer(cols = younger_adults:eu_born, names_to = "expl_var", values_to = "la_prop") %>%
  ggplot(aes(x = la_prop, y = share_leave)) +
  geom_point(colour = "#525252", pch = 21) +
  facet_wrap(~expl_var, scales = "free") +
  theme_bw()
```
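
The small multiples show each association graphically. A numeric companion is one correlation coefficient per explanatory variable, extending the single `cor()` call from Task 4. The chunk below is a minimal sketch of that idea, not part of the original session: `toy_la` is a hypothetical stand-in for `data_gb` with randomly generated values (not Census or referendum data), so that the chunk runs without downloading anything.

```{r, correlation-summary, eval=FALSE, echo=TRUE}
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for data_gb with geometry dropped.
# Values are randomly generated for illustration only.
set.seed(42)
toy_la <- tibble(
  share_leave     = runif(50, min = 0.2, max = 0.8),
  degree_educated = 0.6 - 0.5 * share_leave + rnorm(50, sd = 0.03),
  eu_born         = runif(50, min = 0, max = 0.15)
)

# Reshape to one row per LA and explanatory variable, then compute
# Pearson's r between each variable and share_leave.
cor_summary <- toy_la %>%
  pivot_longer(cols = c(degree_educated, eu_born), names_to = "expl_var", values_to = "la_prop") %>%
  group_by(expl_var) %>%
  summarise(r = cor(la_prop, share_leave)) %>%
  arrange(r)

print(cor_summary)
```

To try this on the session data, one option is to replace `toy_la` with `st_drop_geometry(data_gb)` and widen `cols` to `younger_adults:eu_born`, assuming those columns are laid out as in Task 5.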