name: module3
class: title-slide, center, middle, hide-count, hide-logo
background-image: url("https://images.unsplash.com/photo-1617164924207-40e6ee7c3ffe?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1740&q=80")
background-size: cover

# .white.big-text[Data Visualisation]

## .white[Session - 3]

.footnote[
.white[Image credits:][Kayvan Mazhar](https://unsplash.com/photos/SfxhjdST3Qs)
]

---
class: center

# Course Progress

<img src="images/data-science-communicate.png" width="100%" style="display: block; margin: auto;" />

---
class: left, middle, hide-count, hide-logo
background-image: url("images/not_normal.png")
background-size: 70%
background-position: right

# .big-text[Data]

.footnote[
[Artwork Source](https://www.allisonhorst.com/)
]

---

# Variable types in R:

--

- `int` stands for integers, like 4, 55, 300.

--

- `dbl` stands for doubles, or real numbers like 3, 7.45, 1.565, 12.

--

- `chr` stands for character vectors, or strings like names.

--

- `dttm` stands for date-times (a date + a time).

--

- `lgl` stands for logical, vectors that contain only TRUE or FALSE.

--

- `fct` stands for factors, which R uses to represent **categorical variables** with fixed possible values like occupation: student, professional, government, business.

--

- `date` stands for dates.
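
The types listed above can be tried out in the console. This is a minimal sketch in base R; all object names and values are made up for illustration and do not come from the penguins data.

```r
# Hypothetical example values, one object per variable type
count      <- c(4L, 55L, 300L)                  # int: the L suffix makes an integer
weight     <- c(3, 7.45, 1.565)                 # dbl: plain numbers are doubles
pet_name   <- c("Pingu", "Pinga", "Robby")      # chr
is_adult   <- c(TRUE, FALSE, TRUE)              # lgl
occupation <- factor(
  c("student", "business", "student"),
  levels = c("student", "professional", "government", "business")
)                                               # fct: fixed set of possible values
visit_day  <- as.Date("2021-09-15")             # date
visit_time <- as.POSIXct("2021-09-15 10:30:00", tz = "UTC")  # dttm

class(count)        # "integer"
typeof(weight)      # "double"
class(occupation)   # "factor"
levels(occupation)  # all four allowed values, including the unused ones
class(visit_day)    # "Date"
class(visit_time)   # "POSIXct" "POSIXt"
```

Note that a factor keeps every declared level, even those that do not occur in the data, which is what makes it suitable for categorical variables.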
---
class: hide-logo
background-image: url("images/culmen_depth.png")
background-size: 25%
background-position: 95% 5%

# Data of Palmer Penguins

- It comes with the R package `palmerpenguins`

--

- The dataset is named `penguins`

--

- To learn more about the data, run `?penguins`

--

- Included variables are:
  - species, island, bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g, sex, year

.footnote[
[Artwork Source](https://www.allisonhorst.com/)
]

---

# An Overview of Data

.panelset[
.panel[.panel-name[Codes]

```r
*glimpse(penguins)
```
]

.panel[.panel-name[Output]

```
## Rows: 344
## Columns: 8
## $ species           <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
## $ island            <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
## $ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
## $ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
## $ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
## $ body_mass_g       <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
## $ sex               <fct> male, female, female, NA, female, male, female, male…
## $ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…
```
]
]

---

### An Overview of Data

.panelset[
.panel[.panel-name[Codes]

```r
*summary(penguins)
```
]

.panel[.panel-name[Output]

```
##       species          island    bill_length_mm  bill_depth_mm  
##  Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10  
##  Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60  
##  Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30  
##                                  Mean   :43.92   Mean   :17.15  
##                                  3rd Qu.:48.50   3rd Qu.:18.70  
##                                  Max.   :59.60   Max.   :21.50  
##                                  NA's   :2       NA's   :2      
##  flipper_length_mm  body_mass_g       sex           year     
##  Min.   :172.0     Min.   :2700   female:165   Min.   :2007  
##  1st Qu.:190.0     1st Qu.:3550   male  :168   1st Qu.:2007  
##  Median :197.0     Median :4050   NA's  : 11   Median :2008  
##  Mean   :200.9     Mean   :4202                Mean   :2008  
##  3rd Qu.:213.0     3rd Qu.:4750                3rd Qu.:2009  
##  Max.   :231.0     Max.   :6300                Max.   :2009  
##  NA's   :2         NA's   :2                                 
```
]
]

---

# Packages required:

```r
library(palmerpenguins) # to access penguin data
library(tidyverse) # to use ggplot2 pkg
```

- Packages recommended:

```r
install.packages(c(
  "directlabels", "dplyr", "gameofthrones", "ggforce", "gghighlight",
  "ggnewscale", "ggplot2", "ggraph", "ggrepel", "ggtext", "ggthemes",
  "hexbin", "mapproj", "maps", "munsell", "ozmaps", "paletteer",
  "patchwork", "rmapshaper", "scico", "seriation", "sf", "stars",
  "tidygraph", "tidyr", "wesanderson"
))
```

---
class: left, middle, hide-count, hide-logo
background-image: url("images/ggplot-logo.png")
background-size: contain
background-position: 100% 50%

# .big-text[R<br>Package]

---

# ggplot2 by [Hadley Wickham](http://hadley.nz/)

<br>

- "is a system for declaratively creating graphics, based on [The Grammar of Graphics](https://www.springer.com/gp/book/9780387245447)" (book by the late Leland Wilkinson)

.pull-left[
<div class="figure" style="text-align: center">
<img src="https://upload.wikimedia.org/wikipedia/en/b/b5/Leland_Wilkinson.png" alt="Late Leland Wilkinson" width="40%" />
<p class="caption">Late Leland Wilkinson</p>
</div>
]

.pull-right[
<div class="figure" style="text-align: center">
<img src="images/hadley.jpg" alt="Hadley Wickham" width="58%" />
<p class="caption">Hadley Wickham</p>
</div>
]

.footnote[
[Source](https://ggplot2.tidyverse.org/)
]

---
class: hide-count, hide-logo
background-image: url("images/layer7.png")
background-size: contain
background-position: 50% 50%

.footnote[
[Source](https://www.ericchowkokyew.com/data-visualization-with-ggplot2-in-r/)
]

---

# Key Components for ggplot2 Plot

1. data,
1. aesthetic mapping,
1. at least one layer of a geom function

---

.panelset[
.panel[.panel-name[Task]

<img src="images/layer1.png" width="45%" style="display: block; margin: auto;" />

]

.panel[.panel-name[Codes]

```r
*ggplot(data = penguins)
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-4-1.png" width="504" style="display: block; margin: auto;" />

]
]

---

.panelset[
.panel[.panel-name[Task]

<img src="images/layer2.png" width="45%" style="display: block; margin: auto;" />

]

.panel[.panel-name[Codes]

```r
*ggplot(data = penguins, mapping = aes(x = species))
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-5-1.png" width="504" style="display: block; margin: auto;" />

]
]

---

.panelset[
.panel[.panel-name[Task]

<img src="images/layer3.png" width="45%" style="display: block; margin: auto;" />

]

.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = species)) +
* geom_bar()
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-6-1.png" width="504" style="display: block; margin: auto;" />

]
]

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(penguins, aes(x = species)) +
  geom_bar()
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-7-1.png" width="504" style="display: block; margin: auto;" />

]
]

---
class: your-turn, hide-logo

# 🧠 YOUR TURN
.panelset[
.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-9-1.png" width="38%" style="display: block; margin: auto;" />

]

.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = island)) +
  geom_bar()
```
]
]

---
class: center, middle

# How to export a plot to your computer?

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = species)) +
  geom_bar()

*ggsave("peng-species.pdf") # also try jpg/jpeg/png
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-10-1.png" width="504" style="display: block; margin: auto;" />

```
## Saving 7 x 7 in image
```
]
]

---
class: center, middle

# How to add color to bars?

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = species)) +
* geom_bar(fill = "blue")
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-11-1.png" width="504" style="display: block; margin: auto;" />

]
]

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = species)) +
* geom_bar(fill = c("orange", "white", "green"))
# the number of color names should equal the number of factor levels;
# the factor species has three levels:
# Adelie, Chinstrap & Gentoo
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-12-1.png" width="504" style="display: block; margin: auto;" />

]
]

---
class: center, middle

# How to add color using a palette? 🎨

---

## 🎨 Color Palette

- R packages `RColorBrewer` & `wesanderson`

<img src="viz_files/figure-html/unnamed-chunk-13-1.png" width="1224" style="display: block; margin: auto;" />

---

.panelset[
.panel[.panel-name[Codes]

```r
library(RColorBrewer)
ggplot(data = penguins,
       mapping = aes(x = species,
*                    fill = species)) +
  geom_bar() +
* scale_fill_brewer(palette = "Dark2")
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-14-1.png" width="504" style="display: block; margin: auto;" />

]
]

---
class: center, middle

# How to remove the legend or change its position?

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = species,
                     fill = species)) +
  geom_bar() +
  scale_fill_brewer(palette = "Dark2") +
  theme(legend.position = "none") # other options: "top", "bottom", "left", "right"
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-15-1.png" width="504" style="display: block; margin: auto;" />

]
]

---
class: center, middle

# How to add a plot title and axis titles?

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = species,
                     fill = species)) +
  geom_bar() +
  scale_fill_brewer(palette = "Dark2") +
  theme(legend.position = "none") +
  labs(
    title = "Species of Palmer penguins",
    subtitle = "This data is about penguins",
    x = "Species",
    y = "Frequency"
  )
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-16-1.png" width="504" style="display: block; margin: auto;" />

]
]

---
class: center, middle

# How to control size of text?
---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = species,
                     fill = species)) +
  geom_bar() +
  scale_fill_brewer(palette = "Dark2") +
  theme(legend.position = "none",
*       text = element_text(size = 20)) +
  labs(
    title = "Species of Palmer penguins",
    subtitle = "This data is about penguins",
    x = "Species",
    y = "Frequency"
  )
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-17-1.png" width="504" style="display: block; margin: auto;" />

]
]

---
class: center, middle

# How to plot two numeric variables?

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = bill_depth_mm,
                     color = species)) +
  geom_point() +
  # species is mapped to color, so the color scale (not the fill scale) applies
  scale_color_brewer(palette = "Dark2") +
  theme(legend.position = "none",
*       text = element_text(size = 20)) +
  labs(
    title = "Relationship between bill length \n& depth of Palmer penguins",
    subtitle = "This data is about penguins",
    x = "Bill length (mm)",
    y = "Bill depth (mm)"
  )
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-18-1.png" width="504" style="display: block; margin: auto;" />

]
]

---
class: center, middle

# How to add themes to ggplot?
---

# ggplot2 themes

<https://ggplot2.tidyverse.org/reference/ggtheme.html>

- theme_gray()
- theme_bw()
- theme_linedraw()
- theme_light()
- theme_dark()
- theme_minimal()
- theme_classic()
- theme_void()
- theme_test()

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = bill_depth_mm,
                     color = species)) +
  geom_point() +
  scale_color_brewer(palette = "Dark2") +
  # add the complete theme first: placed after theme(), it would reset those tweaks
* theme_bw() +
  theme(legend.position = "none",
        text = element_text(size = 20)) +
  labs(
    title = "Relationship between bill length \n& depth of Palmer penguins",
    subtitle = "This data is about penguins",
    x = "Bill length (mm)",
    y = "Bill depth (mm)"
  )
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-19-1.png" width="504" style="display: block; margin: auto;" />

]
]

---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = bill_length_mm,
                     y = bill_depth_mm,
                     color = species)) +
  geom_point() +
  scale_color_brewer(palette = "Dark2") +
  # complete theme first, then the individual theme() adjustments
  theme_classic() +
  theme(legend.position = "none",
*       text = element_text(size = 20)) +
  labs(
    title = "Relationship between bill length \n& depth of Palmer penguins",
    subtitle = "This data is about penguins",
    x = "Bill length (mm)",
    y = "Bill depth (mm)"
  )
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-20-1.png" width="504" style="display: block; margin: auto;" />

]
]

---
class: center, middle

# How to add regression line to ggplot?
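
A note on the smoother: a bare `geom_smooth()` picks its method automatically, which for a dataset of this size is a loess curve, so a straight least-squares line needs `method = "lm"`. The following is a minimal sketch, assuming `ggplot2` and `palmerpenguins` are installed; `peng` and `p_lm` are names introduced here for illustration only.

```r
library(ggplot2)
library(palmerpenguins)

# Keep only rows where both measurements are present,
# so no rows are silently dropped while plotting
peng <- subset(penguins,
               !is.na(flipper_length_mm) & !is.na(body_mass_g))

p_lm <- ggplot(data = peng,
               mapping = aes(x = flipper_length_mm,
                             y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm")  # straight line with a confidence band

p_lm

# The same line fitted directly, to read off intercept and slope
coef(lm(body_mass_g ~ flipper_length_mm, data = peng))
```

Setting `se = FALSE` inside `geom_smooth()` hides the confidence band if only the line is wanted.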
---

.panelset[
.panel[.panel-name[Codes]

```r
ggplot(data = penguins,
       mapping = aes(x = flipper_length_mm,
                     y = body_mass_g)) +
  geom_point() +
  # complete theme first, so the text size set in theme() is kept
  theme_classic() +
  theme(legend.position = "none",
        text = element_text(size = 24)) +
  labs(
    title = "Relationship between flipper length \n& body mass of Palmer penguins",
    subtitle = "This data is about penguins",
    x = "Flipper length (mm)",
    y = "Body mass (g)"
  ) +
  # default smoother (loess for data of this size);
  # use method = "lm" for a straight regression line
* geom_smooth()
```
]

.panel[.panel-name[Output]

<img src="viz_files/figure-html/unnamed-chunk-21-1.png" width="504" style="display: block; margin: auto;" />

]
]

---

# More resources

- ggplot2 book https://ggplot2-book.org/
- Cédric Scherer https://www.cedricscherer.com/
- ggplot2 cookbook http://www.cookbook-r.com/

---
class: center, middle, hide-count

# 🙋🏽‍♀️🙋‍♂️<br>.big-text[Q&A]

---
class: center, middle, inverse, hide-logo

# Dynamic Wrangling<br>Using dplyr

### .orange[Next Module - 4]