background-image: url(./img/title.png)
background-position: 100% 100%
background-size: 50%
class: left, bottom, title-slide

# Wrangling and Visualizing
# Data in R

.left[
[.white[Amit_Levinson]](https://twitter.com/Amit_Levinson) <br>
[.white[Amit_Levinson]](https://github.com/AmitLevinson) <br>
[.white[amitlevinson.com]](https://amitlevinson.com/) <br>
]

---
class: left

# About **today**

.pull-left[
### We'll talk about

- SPSS: How *I* learned to work with data
- Some good alternatives
- R as a **recommended** alternative
  - What is R
  - Some cool plotting features
  - Why I fell in love ❤️
]

--

.pull-right[
### We won't talk about

- Which alternative is better
- Practical code
- Coconut, we definitely won't talk about that (I mean, who likes it? 🤮)
]

--

.left[.footnote[.tiny[Anything said is based on my personal views and experience working with R]]]

---

# About **me**

--

- Graduate student in Sociology & Anthropology @ Ben-Gurion University of the Negev

--

- Research assistant for [Dr. Jennifer Oser](https://www.jenniferoser.com/), researching online & offline political participation

--

- Political activist who likes to disseminate data as a form of advocacy

--

- Using `R` for about 7 months

---

# About **SPSS**

- Learned and used it in quantitative courses in my BA & MA

--

.pull-left[
## Pros

- Has a solid infrastructure (IBM)
- Many functions
- Our faculty uses it
- Knowledge of it is sometimes required in industry
]

--

.pull-right[
## Cons

- **Costs money**
- An **inefficient workflow**
- **Difficult to tidy data** in it
- Plots are nice (?), but you can **make nicer plots**.
- Its graphical user interface **(GUI) is overloaded**
]

---

## SPSS... Remind me?
--

<img src="./img/spss-2.png" width="80%" height="500px" class="center"/>

---
class: center
background-image: url("https://media.giphy.com/media/bWM2eWYfN3r20/giphy.gif")
background-size: 100% 100%

---
class: top, left

## Quick Alternatives

--

.pull-left[
### [Jamovi](https://www.jamovi.org/)

<img src="./img/jamovi.jpg" width="150px"/>

### [JASP](https://jasp-stats.org/)

<img src="https://upload.wikimedia.org/wikipedia/commons/0/0d/JASP_logo.svg" width="150px"/>
]

--

.pull-right[
### [Tableau](https://www.tableau.com/)

<img src="./img/tableu.jpg" width="200px"/>

<br> <br> <br>

### [Power BI](https://powerbi.microsoft.com/en-us/)

<img src="./img/power-bi.png" width="200px"/>
]

???

- Show what Jamovi/JASP look like

---

## R is...

--

- A "software environment for **statistical computing and graphics**" [(r-project)](https://www.r-project.org/)<br>

--

- **Free** to use

--

- Open source

--

- Has an **amazing community**

--

.content-box-blue[
```
## 
##  ----- 
## Did you mean are? 
##  ------ 
##     \   ^__^ 
##      \  (oo)\ ________ 
##         (__)\         )\ /\ 
##              ||------w|
##              ||      ||
```
]

---
class: title-slide, middle, center, inverse

## Some of the basics

---

### Basics - Math operations

--

- You can do simple **calculations**:

```r
1+3
```

```
## [1] 4
```

--

```r
4^3
```

```
## [1] 64
```

--

Use objects to store vectors and operate on them:

```r
x <- c(1:10)
```

--

```r
mean(x)
```

```
## [1] 5.5
```

---

### Basics - Analyzing text

+ It's easy to manipulate and work with **text**

--

+ We can use **reg**ular **exp**ressions (**regex**) to work out the magic

--

+ For example, imagine you want to extract every word that doesn't contain a vowel:

--

> "**Why** this is some random text with some words that don't have vowels such as **myth**, **shy**, or **gym**"

--

+ We want an expression that matches words made up only of characters **that aren't vowels**, and then use it to filter:

--

```r
library(stringr)  # provides str_split() and boundary()
words <- unlist(str_split("Why this is some random text with some words that don't have vowels such as myth, shy, or gym", 
boundary("word"))) *grep("^[^aeiou]+$", x= words, value = TRUE) ``` ``` ## [1] "Why" "myth" "shy" "gym" ``` --- ### Basics - Reading data .footnote[.small[[*] Data from [Project datasets](https://perso.telecom-paristech.fr/eagan/class/igr204/datasets)]] Read data from **online sources**<sup>*</sup> ```r countries <- read_delim("https://perso.telecom-paristech.fr/eagan/class/igr204/data/factbook.csv", delim = ";") ``` -- Let's have a look at our top 6 rows: -- <div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:300px; "><table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> country </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> area_sq_km </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> birth_rate_births_1000_population </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> current_account_balance </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> death_rate_deaths_1000_population </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> debt_external </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> electricity_consumption_k_wh </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> electricity_production_k_wh </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> exports </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> gdp </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> gdp_per_capita </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> gdp_real_growth_rate_percent </th> <th style="text-align:right;position: sticky; 
top:0; background-color: #FFFFFF;"> hiv_aids_adult_prevalence_rate_percent </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> hiv_aids_deaths </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> hiv_aids_people_living_with_hiv_aids </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> highways_km </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> imports </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> industrial_production_growth_rate_percent </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> infant_mortality_rate_deaths_1000_live_births </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> inflation_rate_consumer_prices_percent </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> internet_hosts </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> internet_users </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> investment_gross_fixed_percent_of_gdp </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> labor_force </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> life_expectancy_at_birth_years </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> military_expenditures_dollar_figure </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> military_expenditures_percent_of_gdp_percent </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> natural_gas_consumption_cu_m </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> natural_gas_exports_cu_m </th> <th style="text-align:right;position: sticky; top:0; 
background-color: #FFFFFF;"> natural_gas_imports_cu_m </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> natural_gas_production_cu_m </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> natural_gas_proved_reserves_cu_m </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> oil_consumption_bbl_day </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> oil_exports_bbl_day </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> oil_imports_bbl_day </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> oil_production_bbl_day </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> oil_proved_reserves_bbl </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> population </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> public_debt_percent_of_gdp </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> railways_km </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> reserves_of_foreign_exchange_gold </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> telephones_main_lines_in_use </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> telephones_mobile_cellular </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> total_fertility_rate_children_born_woman </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> unemployment_rate_percent </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Afghanistan </td> <td style="text-align:right;"> 647500 </td> <td style="text-align:right;"> 47 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 20.8 
</td> <td style="text-align:right;"> 8.0e+09 </td> <td style="text-align:right;"> 6.5e+08 </td> <td style="text-align:right;"> 5.4e+08 </td> <td style="text-align:right;"> 4.5e+08 </td> <td style="text-align:right;"> 2.2e+10 </td> <td style="text-align:right;"> 800 </td> <td style="text-align:right;"> 7.5 </td> <td style="text-align:right;"> 0.01 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 21000 </td> <td style="text-align:right;"> 3.8e+09 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 163.1 </td> <td style="text-align:right;"> 10.3 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 1000 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 1.2e+07 </td> <td style="text-align:right;"> 43 </td> <td style="text-align:right;"> 1.9e+08 </td> <td style="text-align:right;"> 2.6 </td> <td style="text-align:right;"> 2.2e+08 </td> <td style="text-align:right;"> 0.0e+00 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 2.2e+08 </td> <td style="text-align:right;"> 5.0e+10 </td> <td style="text-align:right;"> 3500 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0.0e+00 </td> <td style="text-align:right;"> 3.0e+07 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 33100 </td> <td style="text-align:right;"> 15000 </td> <td style="text-align:right;"> 6.8 </td> <td style="text-align:right;"> NA </td> </tr> <tr> <td style="text-align:left;"> Akrotiri </td> <td style="text-align:right;"> 123 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td 
style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> <tr> <td style="text-align:left;"> Albania </td> <td style="text-align:right;"> 28748 </td> <td style="text-align:right;"> 15 </td> <td style="text-align:right;"> -5.0e+08 </td> <td style="text-align:right;"> 5.1 </td> <td style="text-align:right;"> 1.4e+09 </td> <td style="text-align:right;"> 6.8e+09 </td> <td style="text-align:right;"> 5.7e+09 </td> <td style="text-align:right;"> 5.5e+08 </td> <td style="text-align:right;"> 1.7e+10 </td> <td style="text-align:right;"> 
4900 </td> <td style="text-align:right;"> 5.6 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 18000 </td> <td style="text-align:right;"> 2.1e+09 </td> <td style="text-align:right;"> 3.1 </td> <td style="text-align:right;"> 21.5 </td> <td style="text-align:right;"> 3.2 </td> <td style="text-align:right;"> 455 </td> <td style="text-align:right;"> 30000 </td> <td style="text-align:right;"> 18 </td> <td style="text-align:right;"> 1.1e+06 </td> <td style="text-align:right;"> 77 </td> <td style="text-align:right;"> 5.6e+07 </td> <td style="text-align:right;"> 1.5 </td> <td style="text-align:right;"> 3.0e+07 </td> <td style="text-align:right;"> 0.0e+00 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 3.0e+07 </td> <td style="text-align:right;"> 3.3e+09 </td> <td style="text-align:right;"> 7500 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 5500 </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 1.9e+08 </td> <td style="text-align:right;"> 3.6e+06 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 447 </td> <td style="text-align:right;"> 1.2e+09 </td> <td style="text-align:right;"> 255000 </td> <td style="text-align:right;"> 1100000 </td> <td style="text-align:right;"> 2.0 </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:left;"> Algeria </td> <td style="text-align:right;"> 2381740 </td> <td style="text-align:right;"> 17 </td> <td style="text-align:right;"> 1.2e+10 </td> <td style="text-align:right;"> 4.6 </td> <td style="text-align:right;"> 2.2e+10 </td> <td style="text-align:right;"> 2.4e+10 </td> <td style="text-align:right;"> 2.6e+10 </td> <td style="text-align:right;"> 3.2e+10 </td> <td style="text-align:right;"> 2.1e+11 </td> <td style="text-align:right;"> 6600 </td> <td style="text-align:right;"> 6.1 </td> <td 
style="text-align:right;"> 0.10 </td> <td style="text-align:right;"> 500 </td> <td style="text-align:right;"> 9100 </td> <td style="text-align:right;"> 104000 </td> <td style="text-align:right;"> 1.5e+10 </td> <td style="text-align:right;"> 6.0 </td> <td style="text-align:right;"> 31.0 </td> <td style="text-align:right;"> 3.1 </td> <td style="text-align:right;"> 897 </td> <td style="text-align:right;"> 500000 </td> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 9.9e+06 </td> <td style="text-align:right;"> 73 </td> <td style="text-align:right;"> 2.5e+09 </td> <td style="text-align:right;"> 3.2 </td> <td style="text-align:right;"> 2.2e+10 </td> <td style="text-align:right;"> 5.8e+10 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 8.0e+10 </td> <td style="text-align:right;"> 4.7e+12 </td> <td style="text-align:right;"> 209000 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 1200000 </td> <td style="text-align:right;"> 1.2e+10 </td> <td style="text-align:right;"> 3.3e+07 </td> <td style="text-align:right;"> 37 </td> <td style="text-align:right;"> 3973 </td> <td style="text-align:right;"> 4.4e+10 </td> <td style="text-align:right;"> 2199600 </td> <td style="text-align:right;"> 1447310 </td> <td style="text-align:right;"> 1.9 </td> <td style="text-align:right;"> 25 </td> </tr> <tr> <td style="text-align:left;"> American Samoa </td> <td style="text-align:right;"> 199 </td> <td style="text-align:right;"> 23 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 1.2e+08 </td> <td style="text-align:right;"> 1.3e+08 </td> <td style="text-align:right;"> 3.0e+07 </td> <td style="text-align:right;"> 5.0e+08 </td> <td style="text-align:right;"> 8000 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td 
style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 185 </td> <td style="text-align:right;"> 1.2e+08 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 9.3 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 1.4e+04 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 3800 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 5.8e+04 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 15000 </td> <td style="text-align:right;"> 2377 </td> <td style="text-align:right;"> 3.2 </td> <td style="text-align:right;"> 6 </td> </tr> <tr> <td style="text-align:left;"> Andorra </td> <td style="text-align:right;"> 468 </td> <td style="text-align:right;"> 9 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 6.1 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 5.8e+07 </td> <td style="text-align:right;"> 1.9e+09 </td> <td style="text-align:right;"> 26800 </td> <td style="text-align:right;"> 2.0 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 269 </td> <td 
style="text-align:right;"> 1.1e+09 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 4.0 </td> <td style="text-align:right;"> 4.3 </td> <td style="text-align:right;"> 4144 </td> <td style="text-align:right;"> 24500 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 3.3e+04 </td> <td style="text-align:right;"> 84 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 7.1e+04 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 35000 </td> <td style="text-align:right;"> 23500 </td> <td style="text-align:right;"> 1.3 </td> <td style="text-align:right;"> 0 </td> </tr> </tbody> </table></div>

---

### Basics - plot

R's simple plotting features:

```r
library(ggplot2)  # provides ggplot() and the geoms used below
ggplot(countries, aes(x = death_rate_deaths_1000_population, y = unemployment_rate_percent)) +
  geom_smooth(method = "lm") +
  geom_point() +
  theme_minimal()
```

<img src="index_files/figure-html/unnamed-chunk-10-1.png" width="80%" style="display: block; margin: auto;" />

---

### Basics - reports

**Rmarkdown's** **reproducible** and **automated** workflow makes it easy to work with reports and documents:

.pull-left[
For example, this:

```markdown
The lowest GDP per capita is
`r min(countries$gdp_per_capita, na.rm = TRUE)`
and the highest unemployment rate is
`r max(countries$unemployment_rate_percent, na.rm = TRUE)`.
The average birth rate per 1000 people is
`r mean(countries$birth_rate_births_1000_population, na.rm = TRUE)`.
The correlation of unemployment and GDP per capita is
`r cor(countries$gdp_per_capita, countries$unemployment_rate_percent, use = "complete.obs")`.
```
]

--

.pull-right[
Will render as this:

The lowest GDP per capita is **400** and the highest unemployment rate is **90**. The average birth rate per 1000 people is **22.15**. The correlation of unemployment and GDP per capita is **-0.44**.

.rotate-right[.center[<img src="https://media.giphy.com/media/lpHPFVpk65qpbH2XY5/giphy.gif" width="200px"/>
]]]

---
class: title-slide, center, middle, inverse

# Let's look at some .bolder[cool stuff]
# you can do with R

---

## Rmarkdown Efficiency

--

### We can use code output inline with our text

--

### No more <s>Copy+Paste</s> 😱

--

### 'Print' documents in one click

--

.rotate-right[.pull-right[
<img src="https://media.giphy.com/media/Q8IYWnnogTYM5T6Yo0/giphy.gif" width="275px"/>
]]

???

Show the report rendering

---

## Maps

- You can make some neat and easy maps in R

--

<img src="index_files/figure-html/unnamed-chunk-11-1.png" width="100%" style="display: block; margin: auto;" />

.footnote[.tiny[Data: USArrests]]

---

## Interactive maps

--

- When missiles are fired towards Israel

--

- And your city has open data such as bomb shelter locations

--

<iframe src="lm.html" width = "100%" height="500" id="igraph" scrolling="no" seamless="seamless" frameBorder="0"></iframe>

---

## Interactive plots

--

Make interactive graphs with **{plotly}**

.footnote[.tiny[Data: gapminder]]

--

<iframe src="plotly.html" width="100%" height="500" id="igraph" scrolling="no" seamless="seamless" frameBorder="0"></iframe>

---

<br> <br>

Or with **{highcharter}**

.footnote[.tiny[Data: gapminder]]

--

<iframe src="hc1.html" width="95%" height="500" id="igraph" scrolling="no" seamless="seamless" frameBorder="0"> </iframe>

---

## Animated plots

--

- Use with caution

--

```r
library(gganimate)
```

--

```r
library(readr)  # provides read_delim()
chat_raw <- read_delim("chat.txt", delim = "-")
head(chat_raw)
```

```
## # A tibble: 6 x 4
## `04/08/2017, 12:50 ` ` Messages 
to this g~ to `end encryption. Tap f~
## <chr> <chr> <chr> <chr>
## 1 "12/08/2016, 12:32 " " +972 52" 374 "9319 created group \"~
## 2 "04/08/2017, 12:50 " " +972 52" 374 "9319 added you"
## 3 "04/08/2017, 12:50 " " +972 52" 374 "9319 added +972 52"
## 4 "04/08/2017, 12:52 " " +972 54" 760 "2588: וולקאם וולקאם ו~
## 5 "שמחים מאוד שהצטרפתם לכאן~ <NA> <NA> <NA>
## 6 "המטרה של קבוצת הרשת היא ~ <NA> <NA> <NA>
```

.footnote[.tiny[Data: WhatsApp group]]

--

---

## Animated plots

- How does it work?

--

```r
g + transition_reveal(date)
```

--

.center[
<img src="./graphs/chat.gif" width = "425px"/>
]

---
background-image: url(https://media.giphy.com/media/rVVFWyTINqG7C/giphy.gif)
background-position: center
background-size: 100% 100%

---
class: inverse, center, middle

# Let's talk some TwitteR

---

## TwitteR <img src="./img/twitter.png" width = "35px"/>

### We can use the {rtweet} package:

--

- Search for tweets containing a word, hashtag, etc.

--

- Get a user's list of friends

--

- Stream live tweets

--

- Get timelines from a user

--

- And more [here](https://rtweet.info/)...
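---

### A quick {rtweet} sketch

Roughly how the features listed on the previous slide map to {rtweet} functions. This is a minimal sketch, not code from the talk: the wrapper `rtweet_tour()` and its arguments are made up for illustration, and actually calling it assumes {rtweet} is installed and Twitter API credentials are already configured.

```r
# Hypothetical wrapper (not from the talk): one call per feature listed above.
# Assumes {rtweet} is installed and API credentials are set up before calling.
rtweet_tour <- function(query, user, n = 100) {
  list(
    # Search recent tweets containing a word or hashtag
    search   = rtweet::search_tweets(q = query, n = n),
    # Accounts that `user` follows ("friends" in Twitter's terminology)
    friends  = rtweet::get_friends(user),
    # Collect live tweets matching `query` for 30 seconds
    stream   = rtweet::stream_tweets(q = query, timeout = 30),
    # Most recent tweets posted by `user`
    timeline = rtweet::get_timeline(user, n = n)
  )
}
```

Nothing touches the network until the function is called, e.g. `rtweet_tour("#rstats", "yairlapid")`.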
--- ## TwitteR <img src="./img/twitter.png" width = "35px"/> Let's get the past tweets for some political candidates in the past elections<sup>*</sup> -- ```r candidates_rtweet <- rtweet::get_timeline(c("netanyahu", "gantzbe", "yairlapid"), n = 3200) ``` .footnote[.tiny[[*]: Data collected on April 11, 2020.]] -- Which gives us a lot of information: <div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:300px; overflow-x: scroll; width:100%; "><table> <thead> <tr> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> x </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> user_id </td> </tr> <tr> <td style="text-align:left;"> status_id </td> </tr> <tr> <td style="text-align:left;"> created_at </td> </tr> <tr> <td style="text-align:left;"> screen_name </td> </tr> <tr> <td style="text-align:left;"> text </td> </tr> <tr> <td style="text-align:left;"> source </td> </tr> <tr> <td style="text-align:left;"> display_text_width </td> </tr> <tr> <td style="text-align:left;"> reply_to_status_id </td> </tr> <tr> <td style="text-align:left;"> reply_to_user_id </td> </tr> <tr> <td style="text-align:left;"> reply_to_screen_name </td> </tr> <tr> <td style="text-align:left;"> is_quote </td> </tr> <tr> <td style="text-align:left;"> is_retweet </td> </tr> <tr> <td style="text-align:left;"> favorite_count </td> </tr> <tr> <td style="text-align:left;"> retweet_count </td> </tr> <tr> <td style="text-align:left;"> quote_count </td> </tr> <tr> <td style="text-align:left;"> reply_count </td> </tr> <tr> <td style="text-align:left;"> hashtags </td> </tr> <tr> <td style="text-align:left;"> symbols </td> </tr> <tr> <td style="text-align:left;"> urls_url </td> </tr> <tr> <td style="text-align:left;"> urls_t.co </td> </tr> <tr> <td style="text-align:left;"> urls_expanded_url </td> </tr> <tr> <td style="text-align:left;"> media_url </td> </tr> <tr> <td style="text-align:left;"> media_t.co </td> </tr> <tr> <td style="text-align:left;"> 
media_expanded_url </td> </tr> <tr> <td style="text-align:left;"> media_type </td> </tr> <tr> <td style="text-align:left;"> ext_media_url </td> </tr> <tr> <td style="text-align:left;"> ext_media_t.co </td> </tr> <tr> <td style="text-align:left;"> ext_media_expanded_url </td> </tr> <tr> <td style="text-align:left;"> ext_media_type </td> </tr> <tr> <td style="text-align:left;"> mentions_user_id </td> </tr> <tr> <td style="text-align:left;"> mentions_screen_name </td> </tr> <tr> <td style="text-align:left;"> lang </td> </tr> <tr> <td style="text-align:left;"> quoted_status_id </td> </tr> <tr> <td style="text-align:left;"> quoted_text </td> </tr> <tr> <td style="text-align:left;"> quoted_created_at </td> </tr> <tr> <td style="text-align:left;"> quoted_source </td> </tr> <tr> <td style="text-align:left;"> quoted_favorite_count </td> </tr> <tr> <td style="text-align:left;"> quoted_retweet_count </td> </tr> <tr> <td style="text-align:left;"> quoted_user_id </td> </tr> <tr> <td style="text-align:left;"> quoted_screen_name </td> </tr> <tr> <td style="text-align:left;"> quoted_name </td> </tr> <tr> <td style="text-align:left;"> quoted_followers_count </td> </tr> <tr> <td style="text-align:left;"> quoted_friends_count </td> </tr> <tr> <td style="text-align:left;"> quoted_statuses_count </td> </tr> <tr> <td style="text-align:left;"> quoted_location </td> </tr> <tr> <td style="text-align:left;"> quoted_description </td> </tr> <tr> <td style="text-align:left;"> quoted_verified </td> </tr> <tr> <td style="text-align:left;"> retweet_status_id </td> </tr> <tr> <td style="text-align:left;"> retweet_text </td> </tr> <tr> <td style="text-align:left;"> retweet_created_at </td> </tr> <tr> <td style="text-align:left;"> retweet_source </td> </tr> <tr> <td style="text-align:left;"> retweet_favorite_count </td> </tr> <tr> <td style="text-align:left;"> retweet_retweet_count </td> </tr> <tr> <td style="text-align:left;"> retweet_user_id </td> </tr> <tr> <td style="text-align:left;"> 
retweet_screen_name </td> </tr> <tr> <td style="text-align:left;"> retweet_name </td> </tr> <tr> <td style="text-align:left;"> retweet_followers_count </td> </tr> <tr> <td style="text-align:left;"> retweet_friends_count </td> </tr> <tr> <td style="text-align:left;"> retweet_statuses_count </td> </tr> <tr> <td style="text-align:left;"> retweet_location </td> </tr> <tr> <td style="text-align:left;"> retweet_description </td> </tr> <tr> <td style="text-align:left;"> retweet_verified </td> </tr> <tr> <td style="text-align:left;"> place_url </td> </tr> <tr> <td style="text-align:left;"> place_name </td> </tr> <tr> <td style="text-align:left;"> place_full_name </td> </tr> <tr> <td style="text-align:left;"> place_type </td> </tr> <tr> <td style="text-align:left;"> country </td> </tr> <tr> <td style="text-align:left;"> country_code </td> </tr> <tr> <td style="text-align:left;"> geo_coords </td> </tr> <tr> <td style="text-align:left;"> coords_coords </td> </tr> <tr> <td style="text-align:left;"> bbox_coords </td> </tr> <tr> <td style="text-align:left;"> status_url </td> </tr> <tr> <td style="text-align:left;"> name </td> </tr> <tr> <td style="text-align:left;"> location </td> </tr> <tr> <td style="text-align:left;"> description </td> </tr> <tr> <td style="text-align:left;"> url </td> </tr> <tr> <td style="text-align:left;"> protected </td> </tr> <tr> <td style="text-align:left;"> followers_count </td> </tr> <tr> <td style="text-align:left;"> friends_count </td> </tr> <tr> <td style="text-align:left;"> listed_count </td> </tr> <tr> <td style="text-align:left;"> statuses_count </td> </tr> <tr> <td style="text-align:left;"> favourites_count </td> </tr> <tr> <td style="text-align:left;"> account_created_at </td> </tr> <tr> <td style="text-align:left;"> verified </td> </tr> <tr> <td style="text-align:left;"> profile_url </td> </tr> <tr> <td style="text-align:left;"> profile_expanded_url </td> </tr> <tr> <td style="text-align:left;"> account_lang </td> </tr> <tr> <td 
style="text-align:left;"> profile_banner_url </td> </tr> <tr> <td style="text-align:left;"> profile_background_url </td> </tr> <tr> <td style="text-align:left;"> profile_image_url </td> </tr> </tbody> </table></div> --- ### Tweet frequency -- <img src="index_files/figure-html/unnamed-chunk-25-1.png" width="80%" style="display: block; margin: auto;" /> --- ## Most favorited tweet .left-column[ ### Benny Gantz ] -- .right-column[ <blockquote class="twitter-tweet"><p lang="iw" dir="rtl">ישראל לפני הכל.</p>— בני גנץ - Benny Gantz (@gantzbe) <a href="https://twitter.com/gantzbe/status/1243268257760460800?ref_src=twsrc%5Etfw">March 26, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> ] --- ## Most favorited tweet .left-column[ ### <s>Benny Gantz</s> ### Yair Lapid ] -- .right-column[ <blockquote class="twitter-tweet"><p lang="iw" dir="rtl">ברשת רצות תמונות של בני גנץ עם המילה ״בוגד״. יש גבול. מבקש מכל תומכינו לא להפיץ, לא להשתמש בביטוי הזה.</p>— יאיר לפיד - Yair Lapid (@yairlapid) <a href="https://twitter.com/yairlapid/status/1243897590351106048?ref_src=twsrc%5Etfw">March 28, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> ] --- ## Most favorited tweet .left-column[ ### <s>Benny Gantz</s> ### <s>Yair Lapid</s> ### Benjamin Netanyahu ] -- .right-column[ <blockquote class="twitter-tweet"><p lang="hi" dir="ltr">मेरे दोस्त <a href="https://twitter.com/narendramodi?ref_src=twsrc%5Etfw">@narendramodi</a> आपके प्रभावशाली चुनावी जीत पर हार्दिक बधाई! 
ये चुनावी नतीजे एक बार फिर दुनिया के सबसे बड़े लोकतंत्र में आपके नेतृत्व को साबित करते हैं। हम साथ मिलकर भारत और इज़राइल के बीच घनिष्ट मित्रता को मजबूत करना जारी रखेंगे । बहुत बढ़िया, मेरे दोस्त 🇮🇱🤝🇮🇳</p>— Benjamin Netanyahu (@netanyahu) <a href="https://twitter.com/netanyahu/status/1131472416872509441?ref_src=twsrc%5Etfw">May 23, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
]

---

## TwitteR <img src="./img/twitter.png" width = "35px"/>

--

- We can also search Twitter for a word or phrase. Let's do that for 'בחירות' (elections):

--

```r
elections <- search_tweets("בחירות", n = 25000, retryonratelimit = TRUE)
```

--

<center> <img src="https://media.giphy.com/media/3oriNYAEFtQLJgkKnS/giphy.gif" width="300px"/> </center>

---

### Election hashtags

- {rtweet} comes with a `hashtags` column containing only the hashtags 😮

--

<img src="index_files/figure-html/unnamed-chunk-30-1.png" width="100%" style="display: block; margin: auto;" />

---

### Frequency of words?

--

- We could look at a word cloud, bigrams (2 words), trigrams, etc.

--

<img src="index_files/figure-html/unnamed-chunk-32-1.png" width="100%" style="display: block; margin: auto;" />

---

## Final <s> slides </s> words

--

### Why R?

--

- It's **free**

--

- It's **open source**, so anyone and everyone can contribute

--

- It enabled me to **tackle quantitative questions** I was interested in

--

- It's an **all-in-one program**: prepare data, analyze, visualize, report

--

- A **skill** sought after in industry

--

- .content-box.blue[THE COMMUNITY]

---

## The community!

--

- Israeli R community on **Facebook**

--

- R community on **Twitter**

--

- **Sharing** code

--

- **\#TidyTuesday** - A weekly project for improving exploratory data analysis and visualizations

---

.content-box-blue[
```
##
## -------------
## R you in?
## --------------
## \
## \
## \
##
## .="=.
## _/.-.-.\_ _
## ( ( o o ) ) ))
## |/ " \| //
## \'---'/ //
## jgs /`"""`\\ ((
## / /_,_\ \\ \\
## \_\_'__/ \ ))
## /` /`~\ |//
## / / \ /
## ,--`,--'\/\ /
## '-- "--' '--'
```
]

---
class: inverse, center, middle

# There is so much more...

--

### Packages

--

### Websites

--

### CV

--

### Posters

--

### Interactive applications

--

### Presentations (like this one)

---
class: inverse, center, middle

# Thank you!

---
class: center, middle

## And thanks to the many {xaringan} tutorials:

--

### [Yihui Xie](https://slides.yihui.org/xaringan/#1)

--

### [Alison Hill's concise](https://apreshill.github.io/data-vis-labs-2018/slides/06-slides_xaringan.html#1) and [elaborated](https://arm.rbind.io/slides/xaringan.html#1) versions

--

### [Zhi Yang](https://zhiyang.netlify.com/tags/xaringan/)

--

### [Garth Tarr](https://garthtarr.github.io/sydney_xaringan/#1)

--

### [Garrick Aden-Buie's xaringanthemer](https://pkg.garrickadenbuie.com/xaringanthemer/articles/singles/themes.html) 📦