Losing the plots

When losing my plotting code turned into a learning opportunity.

Zoë Turner
2021-11-24

Don’t Panic

I started the day thinking my task would be easy. I had been asked to update a report I’d created a couple of years ago and, although I had a week to do it (much longer than such requests usually allow), I really wanted to get this data together quickly. The report had previously been used, I was told, to make a case for better CCTV equipment. That case had been won and now patients and staff had equipment that could provide evidence to validate complaints and support prosecutions. Who wouldn’t want to pull out all the stops for that kind of feedback!

Including code in Rmarkdown

The day didn’t go as planned. I’d got the old (html) report but I could not remember where I’d saved the code files. If I’d only known about version control when I’d first written it, the code would be in Git/VSTS.

Even without that, I wish I’d known that putting the following into the top YAML:

output:
  html_document:
    code_download: yes
    

would produce a little button in the top right hand corner of the report from which anyone, including me importantly, could get the code. Instead, whilst I searched for my code, I also started looking at reproducing the whole document.

Finding SPC charts with signals

The re-coding wasn’t going so well though. I’d found the data I needed but I couldn’t for the life of me remember how to get the SPC charts for only those wards/incidents with signals, as I’d done originally:

library(qicharts2)
library(tidyverse)
library(lubridate)

# data from the qicharts2 package, vignette https://cran.r-project.org/web/packages/qicharts2/vignettes/qicharts2.html#faceting-readmission-rates-by-gender
cabg_by_month_gender <- cabg %>% 
  mutate(month = lubridate::floor_date(date, 'month')) %>% 
  group_by(month, gender) %>% 
  summarise(readmissions = sum(readmission),
            n            = n())

# create run chart as an object so that the data can be extracted with plot$data
plot <- qic(month, readmissions, n,
    data      = cabg_by_month_gender,
    facets    = ~ gender, 
    chart     = 'run',
    y.percent = TRUE,
    title     = 'Readmissions within 30 days (run chart)',
    ylab      = '',
    xlab      = 'Month')

# keep the data for the category with a signal
plot_signal <- plot$data %>% 
  filter(runs.signal == TRUE) %>% 
  select(facet1,y.sum, x) %>% 
  group_by(facet1) 

# rename back to original column names so that the later chart code doesn't need to change
plot_signal <- rename(plot_signal, 
                       'gender' = 'facet1',
                       'start_of_month' = 'x',
                       'total' = 'y.sum')  

# run charts for only where there is a signal
signal_plot <- qic(start_of_month, total,
                         data     = plot_signal,
                         facets   = ~ gender,
                         chart    = 'run',
                         #y.percent = TRUE,
                         title    = 'Signal runs',
                         ylab     = '',
                         xlab     = 'Month'
) 

# print chart in rmd
signal_plot
# print list of what is shown
signal_plot$data %>% 
  select(facet1) %>% 
  group_by(facet1) %>% 
  slice(1) %>% 
  pull(facet1)
[1] Female
Levels: Female
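If all that is needed is the list of facets with a signal, the same idea can be sketched in base R. This is only an illustration: the data frame below is a made-up stand-in for plot$data (just the facet1 and runs.signal columns, with invented values) so that the snippet runs without {qicharts2}, and chart_data and facets_with_signal are names invented for the example.

```r
# Stand-in for plot$data: invented values, only the two columns used here
chart_data <- data.frame(
  facet1      = factor(rep(c("Female", "Male"), each = 3)),
  runs.signal = rep(c(TRUE, FALSE), each = 3)
)

# Keep the facet label of every row flagged with a signal, then de-duplicate
facets_with_signal <- unique(as.character(chart_data$facet1[chart_data$runs.signal]))

facets_with_signal
```

With a real chart object, the assumption is that plot$data could be used in place of chart_data.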

I knew that @_johnmackintosh had written the original blog post that I’d used to do this, but it was on a different site to his blog. He’s also built two great packages (runcharter and spccharter) that I really should be using, but my panic was rising and I couldn’t think straight enough to do anything that constructive!

But then, suddenly, I just found it.

Microsoft explorer search of Rmarkdown

Strangely though, I couldn’t find it using Microsoft search, even in the very folder that I knew it was in.

As I had a complaint to make, of course I turned to Twitter:

Image of tweet saying: “I learned today that MS explorer search is not my friend. It can’t find text from Rmarkdown so I couldn’t find my files. I don’t have these problems with GitHub and yet… Microsoft owns GitHub”

and I really should have moaned sooner, as I got great responses from people on why this happens and how to resolve it!

Solutions suggested were:

grep probably works on Git Bash in Windows I’m guessing?

@ChrisBeeley and @ERDonnachie confirmed this does work

You probably need to change the indexing options in explorer to include Rmd files. Link

@TomJemmett

You can search cloned repositories locally using ‘git-grep’. If you have Linux bash on Win, you can also try ‘grep -r …’ and qualify it however you need (don’t do the entire HDD or it’ll take forever). See the docs. If you don’t have bash, you should Link

@Pouriaaa
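The grep suggestions above can also be approximated from inside R itself. What follows is a minimal sketch rather than any of the methods the replies describe: find_in_rmd() is a hypothetical helper name, it uses only base R, and the demo folder and file names are invented so the example runs anywhere.

```r
# Return the paths of .Rmd files under `path` that contain `text`.
# find_in_rmd() is a hypothetical helper, not part of any package.
find_in_rmd <- function(text, path = ".") {
  files <- list.files(path, pattern = "\\.[Rr]md$",
                      recursive = TRUE, full.names = TRUE)
  # fixed = TRUE treats `text` as a literal string, not a regular expression
  has_match <- vapply(files, function(f) {
    any(grepl(text, readLines(f, warn = FALSE), fixed = TRUE))
  }, logical(1))
  files[has_match]
}

# Demo on two throwaway files in a temporary folder
demo_dir <- file.path(tempdir(), "rmd_search_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("---", "title: report", "---", "library(qicharts2)"),
           file.path(demo_dir, "report.Rmd"))
writeLines("nothing relevant here", file.path(demo_dir, "notes.Rmd"))

basename(find_in_rmd("qicharts2", demo_dir))
```

As with the advice about grep, it is worth pointing path at a specific project folder rather than a whole drive, since every matching file is read in full.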

Now I know that I will forget this and I’ll never find it again on Twitter, so I’ve got it right here for future me.

Plus if anyone is reading this blog, do follow these people as they are just the best!

Time on my hands

Now that I had my code it was a ridiculously quick thing to update it to the new data and also change my horrific naming conventions which, quite frankly, were all over the place.

With that done, I shared the report and got instant feedback:

Thank you for this! I think this report will do the trick. Any chance you could do this by individual wards?

Now, the wards in question numbered just 10, but I had presented the data 4 ways. I’d created functions for the charts but I’d still typed out a title for each ward and then the function code. Doing that 40 times is quite a bit of typing.

Automatic tabs in Rmarkdown

My solution was therefore to move from functions that are run individually to running it through a loop, but I also particularly wanted Rmarkdown tabs so that the overall Rmarkdown report isn’t too long. Scrolling down an html page through 40 charts is about as much fun as typing out the titles for them.

As I do in such “is this even possible?” circumstances, I asked Google and got a Stack Overflow answer. It looked really good but didn’t work for my {qicharts2} charts.

A bit more googling and lo, it appeared to be a problem with the code being base R specific.

A bit of moving code about and I eventually got:

# To work correctly in Rmarkdown there must be a level two header (##) before this chunk with {.tabset .tabset-fade} after it, for example: ## Cause Group Incidents by Wards {.tabset .tabset-fade}
# The chunk also needs the option results = 'asis' so that the cat() output is rendered as markdown headers

for (i in wards) {

  # level three header that becomes the tab for this ward
  cat("###", i, '<br>', '\n')

  # use a new object for each ward: overwriting `data` would leave nothing to filter for the next ward
  ward_data <- data %>%
    filter(department == i) %>%
    group_by(start_of_month, cause_group) %>%
    summarise(count = n_distinct(incident_number), .groups = 'drop') %>%
    arrange(start_of_month)

  chart <- qic(start_of_month, count,
               data     = ward_data,
               subtitle = 'All incidents',
               chart    = 'run',
               facets   = ~ cause_group,
               title    = paste("Run chart for incidents by Cause Group for", 
                                i, sep = " "),
               ylab     = 'Number of incidents',
               xlab     = 'Month',
               caption  = 'Only incident number is counted not the number of people affected by an incident')

  # charts built on ggplot2 must be printed explicitly inside a loop
  print(chart)

  cat('\n', '<br>', '\n\n')

}

This code won’t run on its own but this one (Code) in the NHS-R Community GitHub Demos and How Tos does.
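For a version of the loop that runs without the report's data, here is a minimal sketch under several assumptions: the incidents data frame is invented, tab_heading() is a hypothetical helper, and base graphics stand in for {qicharts2} so that no extra packages are needed. In an Rmarkdown document the chunk would still need results = 'asis' and a preceding ## header with {.tabset}.

```r
# Invented example data: one row per incident, for illustration only
set.seed(1)
incidents <- data.frame(
  department     = sample(c("Ward A", "Ward B", "Ward C"), 90, replace = TRUE),
  start_of_month = sample(seq(as.Date("2021-01-01"), by = "month", length.out = 6),
                          90, replace = TRUE)
)

# Markdown for one tab: a level three header sits under the "##" tabset header
tab_heading <- function(title) {
  paste0("\n\n### ", title, "\n\n")
}

wards <- sort(unique(incidents$department))

for (ward in wards) {
  cat(tab_heading(ward))

  # Subset into a new object so the full data set is untouched for the next ward
  ward_data <- incidents[incidents$department == ward, ]
  monthly   <- table(ward_data$start_of_month)

  # Base graphics draw as soon as plot() is called, so no print() is needed here
  plot(as.Date(names(monthly)), as.vector(monthly), type = "b",
       main = paste("Incidents by month for", ward),
       xlab = "Month", ylab = "Number of incidents")

  cat("\n\n")
}
```

The blank lines around each heading are there so that the markdown header is not run together with the output of the previous plot.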


Citation

For attribution, please cite this work as

Turner (2021, Nov. 24). Blog: Losing the plots. Retrieved from https://philosopher-analyst.netlify.app/posts/2021-11-24-losing-the-plots/

BibTeX citation

@misc{turner2021losing,
  author = {Turner, Zoë},
  title = {Blog: Losing the plots},
  url = {https://philosopher-analyst.netlify.app/posts/2021-11-24-losing-the-plots/},
  year = {2021}
}