How to create interactive flexdashboards in r was one of my best discoveries in 2016. Thanks a lot to the ladies who explained it to me mid-November 2016.
One needs to have a little bit of previous experience with R Markdown and RStudio. The final result is an html-file containing visualisations that can be distributed within a company via e-mail or immediately published on an internal website. Flexdashboards are very powerful and I highly recommend using them or at least start learning on how to use them. You can use flexdashboard to publish groups of related data visualisations as a dashboard.
A flexdashboard can either be static (a standard web page) or dynamic (a Shiny interactive document). A wide variety of components can be included in flexdashboard layouts. In the following I will show how to create a Storyboard layout for presenting a sequence of visualisations and related commentary.
If you have not already done so, download and install the flexdashboard package from CRAN in your RStudio as follows:
install.packages("flexdashboard")
Next you will need to open a blank r markdown file and copy the code below.
--- title: "Working time declared by staff members" output: flexdashboard::flex_dashboard --- Analysis {.storyboard} ========================================= ### Working time entries by all staff members (rows) and per day (columns) ```{r} setwd("~/NewWorkingtime/script") library(dplyr) library(htmlwidgets) library(flexdashboard) library(ggplot2) library(googleVis) #install.packages("devtools") #require(devtools) #install_github('ramnathv/rCharts') library(rCharts) library(openxlsx) hours <- read.csv("../input/hours.csv", stringsAsFactors = FALSE) staff <- read.csv("../input/staff.csv", stringsAsFactors = FALSE) funnel <- read.xlsx("~/Sankey2/Octobercsv-2.xlsx", sheet = 'Octobercsv', cols = c(13,14,15)) staff$Department <- sapply(strsplit(staff$Object.Abbreviation, "-"), "[", 2) staff$Division <- sapply(strsplit(staff$Object.Abbreviation, "-"), "[", 1) staff$Service.Office <- sapply(strsplit(staff$Object.Abbreviation, "-"), "[", 3) committees <- read.csv("../input/committees.csv", stringsAsFactors = FALSE) org_committees <- read.csv("../input/org_committees.csv", stringsAsFactors = FALSE) flexitimereport <- read.csv("../input/99.csv", stringsAsFactors = FALSE) dataCATS <- read.csv("../input/dataCATS.csv", stringsAsFactors = FALSE) staff <- staff[!duplicated(staff$Personnel.Number),] itemizeDates <- function(startDate="01-10-16", endDate="09-12-16", format="%d-%m-%y") { out <- seq(as.Date(startDate, format=format), as.Date(endDate, format=format), by="days") format(out, format) } colnames(hours) <- c("Personnel_Number","First_name", "Last_name", itemizeDates(startDate="01-10-16", endDate="09-12-16")) hours2 = inner_join(hours, staff, by = c("Personnel_Number"="Personnel.Number")) hours2[,c("Personnel.Area","Pers..subarea.text", "Description", "Division", "Department","Service.Office")] <- list(NULL) #colnames(hours2) hours2 <- hours2[c(1,2,3,74,75,76,4:73)] hours2[c(7:76)] <- lapply(hours2[c(7:76)], as.numeric) #rownames(hours2) <- hours2$Employee.name #Uncommenting this will show staff names library(d3heatmap) map1 <- d3heatmap(hours2[c(7:76)],Rowv = FALSE, Colv=FALSE, show_grid = TRUE, margin= c(3, 9), colors = c('#3288bd','#99d594','#e6f598','#fee08b','#fc8d59','#d53e4f')) map1 #saveWidget(map1, "map1.html") ``` ### Working time entries by org units with more than 3 staff members (rows) and per day (columns) ```{r} Org.Units <- subset(hours2[6], !duplicated(hours2[6])) #colnames(Org.Units) Org.Units <- arrange(Org.Units, Object.Abbreviation) Org.Units$numerosity <- table(hours2$Object.Abbreviation) Org.Units2 <- Org.Units[apply(Org.Units[, -1], MARGIN = 1, function(x) all(x > 3)), ] hours3 <- hours2 #colnames(hours3) hours3[,c("Personnel_Number","First_name","Last_name","Employee.name","Name.of.EE.subgroup")] <- list(NULL) rownames(hours3) <- NULL colnames(hours3) <- as.character(colnames(hours3)) meanorgunit <- aggregate(. ~ Object.Abbreviation,hours3, FUN=mean) meanorgunit2 <- subset(meanorgunit, Org.Units$numerosity > 3) #colnames(meanorgunit) orgunitorder <- c("ED","ED-EXO","DED","DED-DAP","DED-ILO","DED-MBN","DED-PCM", "AF-AUD","AF-INT","AF-LD","AF-LD-DH","AF-LD-LTO","AF-LD-SPO","AF-PB","AF-SI","AF-SI-OSR", "AF-SI-SCC","AF-SMO", "D","D-DS","D-DS-OME","D-DS-PME","D-DS-SCA","D-DS-SIS","D-SD","D-SD-BMS","D-SD-CPN","D-SD-QME", "D-SD-WPS", "E","E-PM","E-PM-EPA","E-PM-EPB","E-PM-EPC","E-PM-EPD","E-PM-EPE","E-PM-EPF","E-SR","E-SR-AIV", "E-SR-CNS","E-SR-ECV","E-SR-LRS","E-SR-ONC","E-SR-REA","E-SR-RGI", "A","A-FI","A-FI-ACC","A-FI-CSO","A-FI-FSS","A-FI-PCS","A-FI-SPO","A-FI-VFO", "A-ICR","A-SG","A-SG-DEV","A-SG-PFO","A-SG-QRM","A-SG-SPB","A-ST","A-ST-FSS", "A-ST-HCS","A-ST-MSS","A-ST-TAS", "I","I-BD","I-BD-BUS","I-BD-DIL","I-BD-DSA","I-DV","I-DV-DAT","I-DV-PRO","I-DV-QCT","I-DV-SOL", "I-INS","I-OP","I-OP-ADM","I-OP-APP","I-OP-CUS","I-PAQ","I-TEL", "P","P-CI","P-CI-CNC","P-CI-MQC","P-CI-PDC","P-CI-SCS","P-PE","P-PE-SIM","P-PE-SVE", "V","V-VM","V-VM-APH","V-VM-DEM","V-VM-ROS", "S","S-CO","S-CO-MHI","S-CO-MPR","S-CO-OLD","S-CS","S-CS-SME","S-PH") meanorgunit2$Object.Abbreviation <- ordered(meanorgunit2$Object.Abbreviation, orgunitorder) meanorgunit3 <- meanorgunit2[with(meanorgunit2, order(Object.Abbreviation)),] rownames(meanorgunit3) <- meanorgunit3$Object.Abbreviation map2 <- d3heatmap(meanorgunit3[c(2:71)],Rowv = FALSE, Colv=FALSE, show_grid = TRUE, margins= c(5, 10), colors = c('#3288bd','#66c2a5','#abdda4','#e6f598','#ffffbf','#fee08b','#fdae61','#f46d43','#d53e4f')) map2 ``` ### Sankey diagram of declared working hours from October to December (all employees) ```{r results='asis'} op <- options(gvis.plot.tag = "chart") s <- gvisSankey(funnel, from="Origin", to="Destination", weight="Weight", options=list(height = 400, width = 600, title='Excess hours time flow', sankey="{link:{color:{fill:'lightblue'}}}")) #print(s, 'chart') plot(s) options(op) ``` ### Working time entered by staff who have 8 planned working hours ```{r} dataCATS <- read.csv("../input/dataCATS.csv", stringsAsFactors = FALSE) dataCATS$pureworkingtime <- (dataCATS$Declared.hours - dataCATS$Absence.hours) dataCATS2 <- dataCATS[(dataCATS[,5]>0) | (dataCATS[,8]>0),] library(dplyr) library(ggplot2) library(scales) library(plyr) dataCATS8b <- filter(dataCATS, Planned.time == 8 & Absence.hours == 0) dataCATS4b <- filter(dataCATS, Planned.time == 4 & Absence.hours == 0) dataCATS6b <- filter(dataCATS, Planned.time == 6 & Absence.hours == 0) bell8b <- ggplot(dataCATS8b, aes(Declared.hours))+ geom_histogram(aes(y = (..count..)/sum(..count..)), binwidth = 0.5) + scale_x_continuous(breaks = c(4,5,6,7,8,9,10,11,12), name = "Declared working hours")+ scale_y_continuous(breaks = c(0.01,0.02,0.05,0.10,0.2,0.3,0.4), labels = percent_format(), name = "Percentage of staff")+ ggtitle("Working time entered by staff who have 8 planned working hours") bell8b ``` *** Comments ### Working time entered by staff who have 6 planned working hours ```{r} bell6b <- ggplot(dataCATS6b, aes(Declared.hours))+ geom_histogram(aes(y = (..count..)/sum(..count..)), binwidth = 0.5) + scale_x_continuous(breaks = c(4,5,6,7,8,9,10,11,12), name = "Declared working hours")+ scale_y_continuous(breaks = c(0.01,0.02,0.05,0.10,0.2,0.3,0.4), labels = percent_format(), name = "Percentage of staff")+ ggtitle("Working time entered by staff who have 6 planned working hours") bell6b ``` *** Comments ### Working time entered by staff who have 4 planned working hours ```{r} bell4b <- ggplot(dataCATS4b, aes(Declared.hours))+ geom_histogram(aes(y = (..count..)/sum(..count..)), binwidth = 0.5) + scale_x_continuous(breaks = c(4,5,6,7,8,9,10,11,12), name = "Declared working hours")+ scale_y_continuous(breaks = c(0.01,0.02,0.05,0.10,0.2,0.3,0.4), labels = percent_format(), name = "Percentage of staff")+ ggtitle("Working time entered by staff who have 4 planned working hours") bell4b ``` *** Comments ### Working time entered by Human Divisions ```{r} library(stringr) library(dplyr) library(ggplot2) # # extract numbers only committees$order <- as.numeric(str_extract(committees$Dayorder, "[0-9]+")) library(plyr) plyr11 <- join(org_committees, meanorgunit3, by = "Object.Abbreviation") ply12 <- plyr11[,-c(1:12)] ply12[c(2:71)] <- lapply(ply12[c(2:71)], as.numeric) meanorgunit5 <- aggregate(. ~ Cluster,ply12, FUN=mean) PRAC <- committees[c(338:393),] CVMP <- committees[c(170:225),] PDCO <- committees[c(282:337),] COMP <- committees[c(114:169),] CHMP <- committees[c(58:113),] SAWP <- committees[c(394:449),] OTHER <- committees[c(450:505),] tcluster1 <- t(meanorgunit5[1,]) tcluster1 <- data.frame(tcluster1[-1,]) colnames(tcluster1) <- "Cluster1" tcluster1$Day <- rownames(tcluster1) rownames(tcluster1) <- NULL tcluster1$Day <- as.Date(tcluster1$Day, "%d-%m-%y") CHMP$Day <- as.Date(CHMP$Day, "%d/%m/%Y") tcluster1b <- inner_join(tcluster1, CHMP, by = c("Day"="Day")) tcluster1c <- aggregate(Cluster1~order, data=tcluster1b, mean) tcluster2 <- t(meanorgunit5[2,]) tcluster2 <- data.frame(tcluster2[-1,]) colnames(tcluster2) <- "Cluster2" tcluster2$Day <- rownames(tcluster2) rownames(tcluster2) <- NULL tcluster2$Day <- as.Date(tcluster2$Day, "%d-%m-%y") CVMP$Day <- as.Date(CVMP$Day, "%d/%m/%Y") tcluster2b <- inner_join(tcluster2, CVMP, by = c("Day"="Day")) tcluster2c <- aggregate(Cluster2~order, data=tcluster2b, mean) tcluster3 <- t(meanorgunit5[3,]) tcluster3 <- data.frame(tcluster3[-1,]) colnames(tcluster3) <- "Cluster3" tcluster3$Day <- rownames(tcluster3) rownames(tcluster3) <- NULL tcluster3$Day <- as.Date(tcluster3$Day, "%d-%m-%y") PDCO$Day <- as.Date(PDCO$Day, "%d/%m/%Y") tcluster3b <- inner_join(tcluster3, PDCO, by = c("Day"="Day")) tcluster3c <- aggregate(Cluster3~order, data=tcluster3b, mean) tcluster4 <- t(meanorgunit5[4,]) tcluster4 <- data.frame(tcluster4[-1,]) colnames(tcluster4) <- "Cluster4" tcluster4$Day <- rownames(tcluster4) rownames(tcluster4) <- NULL tcluster4$Day <- as.Date(tcluster4$Day, "%d-%m-%y") COMP$Day <- as.Date(COMP$Day, "%d/%m/%Y") tcluster4b <- inner_join(tcluster4, COMP, by = c("Day"="Day")) tcluster4c <- aggregate(Cluster4~order, data=tcluster4b, mean) tcluster6 <- t(meanorgunit5[5,]) tcluster6 <- data.frame(tcluster6[-1,]) colnames(tcluster6) <- "Cluster6" tcluster6$Day <- rownames(tcluster6) rownames(tcluster6) <- NULL tcluster6$Day <- as.Date(tcluster6$Day, "%d-%m-%y") OTHER$Day <- as.Date(OTHER$Day, "%d/%m/%Y") tcluster6b <- inner_join(tcluster6, OTHER, by = c("Day"="Day")) tcluster6c <- aggregate(Cluster6~order, data=tcluster6b, mean) GCluster1 <- ggplot(tcluster1c, aes (x=order, y=Cluster1)) + geom_bar(fill="#E69F00", stat="identity") + labs(title="Human Divisions, CHMP") GCluster1 ``` *** CHMP related seasonality Day 1 corresponds to the first day of the CHMP. Given that from October to December the CVMP met three times, it is the average working time entered on those three days by the staff members from the Human Divisions (except Pediatric, Orphans). Only working days are considered. ### Working time entered Pediatric Medicines Office ```{r} GCluster3 <- ggplot(tcluster3c, aes (x=order, y=Cluster3)) + geom_bar(fill="green", stat="identity") + labs(title="Pediatric Medicines Office, PDCO", x = "Working days from day 1", y = "Average recorded hours") GCluster3 ``` *** PDCO related seasonality Day 1 corresponds to first day of PDCO. Given that from October to December the COMP met three times, it is the average working time entered on those three days by all staff members from the Pediatric Office. Only working days are considered. ### Working time entered Orphan Medicines Office ```{r} GCluster4 <- ggplot(tcluster4c, aes (x=order, y=Cluster4)) + geom_bar(fill="#CC6666", stat="identity") + labs(title="Orphan Medicines Office, COMP", x = "Working days from day 1", y = "Average recorded hours") GCluster4 ``` *** COMP related seasonality Day 1 corresponds to the first day of the COMP. Given that from October to December the COMP met three times, it is the average working time entered on those three days by the staff members from the Orphan Medicines Office. Only working days are considered. ### Working time entered Veterinary Division ```{r} GCluster2 <- ggplot(tcluster2c, aes (x=order, y=Cluster2)) + geom_bar(fill="red", stat="identity") + labs(title="Veterinary Division, CVMP", x = "Working days from day 1", y = "Average recorded hours") GCluster2 ``` *** CVMP related seasonality Day 1 corresponds to the first day of the CVMP. Given that from October to December the CVMP met three times, it is the average working time entered on those three days by the staff members from the Veterinary Division. Only working days are considered. ### Working time entered by all other offices ```{r} GCluster6 <- ggplot(tcluster6c, aes (x=order, y=Cluster6)) + geom_bar(fill="#009E73", stat="identity") + labs(title="All other offices", x = "Calendar days from day 1", y = "Average recorded hours") GCluster6 ``` *** Non-committee related seasonality Day 1 corresponds to first day of the month. Only working days are considered, but are averaged on the basis of the calendar day of month. Comments to first and second page ========================================= The first two pages use d3heatmaps technology. Please note: - The maps show the row/column/value under the cursor. Staff members are coded to anonymise. - Click row/column labels to highlight only one staff member (or office) per day. - Drag a rectangle over the image to zoom in. - Double-click anywhere to go back. The colours are diverging, i.e they are going from blue, green, yellow, orange to red. Blue days have little declared working time. Red days have many working hours declared. It is not possible to offer a legend due to the interactivity. Committee work ========================================= ### Participation of organisational entities to committee work ```{r} knitr::kable(org_committees) ```
How are you getting on? So far so good? Do you want the r code and the sample data file to replicate the example on your computer? Please drop me a line.
(Explanations to be continued.)