Dplyr

Heat Maps

    library(dplyr)        # Data manipulation & magrittr pipe
    library(ggplot2)      # General plotting
    library(NMF)          # aheatmap()
    library(gplots)       # heatmap.2()
    library(RColorBrewer) # Brewer palettes
    set.seed(123)

    ############################
    #      2D histograms       #
    ############################

    # simulate data that consists of paired observations in two experiments
    covar_mat <- matrix(c(5, 4, 4, 5), ncol = 2) # Covariance matrix
    data <- MASS::mvrnorm(n = 10000, mu = c(0, 0), Sigma = covar_mat) %>% # Simulate correlated data
      rbind(matrix(rnorm(20000, sd = 0.
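To make the 2D-histogram idea concrete, here is a minimal, self-contained sketch that bins correlated pairs into a grid and draws the counts as a heat map. It reuses the covariance matrix from the code above but relies only on base R and `MASS`. The bin count of 20, the variable names, and the axis labels are assumptions made for illustration, and this is not necessarily how the original post builds its plot.

```r
# Sketch only: a 2D histogram built with base R and MASS.
# The bin count (20) is an arbitrary choice for this illustration.
set.seed(123)

covar_mat <- matrix(c(5, 4, 4, 5), ncol = 2)  # same covariance matrix as above
xy <- MASS::mvrnorm(n = 10000, mu = c(0, 0), Sigma = covar_mat)

n_bins <- 20
# Shared, equally spaced bin edges that cover both columns
edges <- seq(min(xy), max(xy), length.out = n_bins + 1)

# Assign every observation to a bin along each axis
x_bin <- cut(xy[, 1], breaks = edges, include.lowest = TRUE)
y_bin <- cut(xy[, 2], breaks = edges, include.lowest = TRUE)

# Cross-tabulate: counts[i, j] is the number of points in x-bin i and y-bin j
counts <- table(x_bin, y_bin)

# image() wants a plain numeric matrix plus the bin midpoints
count_mat <- matrix(as.vector(counts), nrow = n_bins)
mids <- (head(edges, -1) + tail(edges, -1)) / 2
image(mids, mids, count_mat, xlab = "Experiment 1", ylab = "Experiment 2")
```

Because the two simulated experiments are positively correlated, most of the counts should fall along the diagonal of the grid. Sharing one set of bin edges across both axes keeps the cells square, which makes that diagonal easier to read.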

Web Scraping

Take data formatted for display in a web browser and reformat for analysis.

It helps to know…

- a little about HTML and XML
- how to manipulate strings in R
- a little something about regular expressions
- how to write a function and do some basic conditional looping

Web scraping is mostly cleaning data.

Strategy

Every web page is different, but a basic procedure in R (for a single web page) is as follows:
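As a rough illustration of how the skills listed above fit together (string handling, a regular expression, a small function, and a conditional loop), here is a minimal base-R sketch. The HTML snippet, the `extract_cells` helper, and the column names are all invented for this example, and it is not necessarily the procedure the original post goes on to describe.

```r
# Sketch only: the HTML below is a made-up stand-in for a downloaded page.
# For a real page, readLines() on the page's URL would supply these lines.
html <- c(
  "<table>",
  "<tr><td>Alice</td><td>34</td></tr>",
  "<tr><td>Bob</td><td>n/a</td></tr>",
  "<tr><td>Carol</td><td>29</td></tr>",
  "</table>"
)

# Hypothetical helper: return the text of every <td> cell found in one line
extract_cells <- function(line) {
  cells <- regmatches(line, gregexpr("<td>[^<]*</td>", line))[[1]]
  gsub("</?td>", "", cells)  # strip the tags, keep the contents
}

# Keep only the lines that contain table cells
rows <- grep("<td>", html, value = TRUE)

# Loop over the rows, cleaning each one as it is added
people <- data.frame(name = character(0), age = numeric(0),
                     stringsAsFactors = FALSE)
for (row in rows) {
  cells <- extract_cells(row)
  # Conditional cleaning: anything that is not a whole number becomes NA
  age <- if (grepl("^[0-9]+$", cells[2])) as.numeric(cells[2]) else NA_real_
  people <- rbind(people,
                  data.frame(name = cells[1], age = age,
                             stringsAsFactors = FALSE))
}

print(people)
```

Most of the code deals with the one malformed value rather than with fetching the page, which matches the point that web scraping is mostly cleaning data. Regular expressions are fragile on real HTML, so a dedicated parser is usually the safer choice once a page gets more complicated than this.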

Kickoff Meetup: dplyr demo

This is an R script ported from here.

    # Load packages
    library("dplyr")
    library("ggplot2")
    library("nycflights13")
    library("lubridate")

    # it's a data.frame, but also a tbl_df.
    # doesn't print entire thing to screen.
    flights
    ## # A tibble: 336,776 x 19
    ##     year month   day dep_time sched_dep_time dep_delay arr_time
    ##    <int> <int> <int>    <int>          <int>     <dbl>    <int>
    ##  1  2013     1     1      517            515         2      830
    ##  2  2013     1     1      533            529         4      850
    ##  3  2013     1     1      542            540         2      923
    ##  4  2013     1     1      544            545        -1     1004
    ##  5  2013     1     1      554            600        -6      812
    ##  6  2013     1     1      554            558        -4      740
    ##  7  2013     1     1      555            600        -5      913
    ##  8  2013     1     1      557            600        -3      709
    ##  9  2013     1     1      557            600        -3      838
    ## 10  2013     1     1      558            600        -2      753
    ## # .