htmlwidgets - highcharter #
The Bigger Picture #
In this document we learn how to create interactive charts with Highcharter. Simply put, we are learning how to transform tidy data into visually clear graphs. In the overall context of the workflow, this falls into the category of transforming our data into data visualisation.
What is Highcharter? #
library("tidyverse")
library("highcharter")
- An htmlwidget used to make interactive graphs and charts of several types
  - Bar graphs and stacked bar graphs
  - Scatterplots and bubble charts
  - Interactive time series
  - Tree maps
  - Choropleths
  - And many more
- The package is particularly well regarded for its effective presentation of time series
- The package is an R wrapper for the Highcharts JavaScript library
- Highcharter relies heavily on the pipe operator (`%>%`) to create charts
Creating Bar Charts and Stacked Bar Charts #
For our examples we will use the same data from the Australian Environmental-Economic Accounts (2016), now including data from 2008-2014. The data relates to water consumption by state.
load(file = "tidy_EnvAcc_data/consumption.rdata")
consumption
## # A tibble: 48 x 3
## State year water_consumption
## <chr> <chr> <dbl>
## 1 NSW 2008–09 4555
## 2 VIC 2008–09 2951
## 3 QLD 2008–09 3341
## 4 SA 2008–09 1179
## 5 WA 2008–09 1361
## 6 TAS 2008–09 466
## 7 NT 2008–09 160
## 8 ACT 2008–09 48
## 9 NSW 2009–10 4323
## 10 VIC 2009–10 2904
## # … with 38 more rows
- We begin by piping our data into a function, `hchart()`
  - This function is how we inform Highcharter that we wish to create a visualisation
- The `type` argument specifies the type of visualisation we wish to create
  - Here we specify “bar” for a bar chart
consumption %>%
hchart(type = "bar")
- Currently the chart is completely empty, since we have not specified any “aesthetics”
  - In other words, we haven’t told Highcharter what we wish to plot
- For the second argument of `hchart()`, we use the function `hcaes()` (highcharter aesthetics)
  - This sub-function itself takes several arguments
  - We first consider `x` and `y` for the x-axis and y-axis variables respectively
consumption %>%
  group_by(year) %>%
  # Collapse to one row per year so that each bar is drawn once
  summarise(consumption_total = sum(water_consumption)) %>%
  hchart(type = "bar",
         hcaes(x = year,
               y = consumption_total))
Hover over the bar chart to see its interactivity!
Also consider the `color` argument of `hchart()`, which can be used to set custom colours.
consumption %>%
  group_by(year) %>%
  # Collapse to one row per year so that each bar is drawn once
  summarise(consumption_total = sum(water_consumption)) %>%
  hchart(type = "bar",
         hcaes(x = year,
               y = consumption_total),
         color = "red")
Grouped Bar Charts #
- We can instead create a grouped bar chart by using the `group` argument of `hcaes()`
  - This determines a variable we can use to split our data
- We see that we have split the long bars into several shorter ones by category
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "bar",
hcaes(x = year,
y = water_consumption,
group = State))
Stacked Bar Charts #
- We can also create stacked bar charts
- This requires piping our chart into the `hc_plotOptions()` function
  - This new function can take many arguments (see the Highcharts `plotOptions` API reference)
- Here we use the `bar` argument, which takes a list of sub-arguments as its value
  - We’ll use the `stacking` sub-argument for our purposes
We can set it to “normal” for a regular stacked bar chart…
consumption %>%
  group_by(year) %>%
  mutate(consumption_total = sum(water_consumption)) %>%
  ungroup() %>%
  hchart(type = "bar",
         hcaes(x = year,
               y = water_consumption,
               group = State)) %>%
  # "normal" is the documented Highcharts value for stacking by value
  hc_plotOptions(bar = list(stacking = "normal"))
…or “percent” for a percentage breakdown!
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "bar",
hcaes(x = year,
y = water_consumption,
group = State)) %>%
hc_plotOptions(bar = list(stacking = "percent"))
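To see what “percent” stacking does to the numbers, the same shares can be worked out by hand. The sketch below uses a small made-up tibble (`toy`, with hypothetical values rather than the real accounts data) that has the same columns as `consumption`, and divides each state's value by its yearly total. It illustrates the arithmetic only and makes no claim about how Highcharts computes the shares internally.

```r
library(dplyr)

# Hypothetical toy data with the same columns as `consumption`
toy <- tibble(
  State = c("NSW", "VIC", "NSW", "VIC"),
  year = c("2008–09", "2008–09", "2009–10", "2009–10"),
  water_consumption = c(300, 100, 150, 50)
)

# Share of each year's total contributed by each state, in percent
shares <- toy %>%
  group_by(year) %>%
  mutate(share = 100 * water_consumption / sum(water_consumption)) %>%
  ungroup()

shares
```

Within each year the `share` column sums to 100, which is why every bar in the percentage chart has the same length.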
Also note that vertical bar charts are just column charts, so to convert between the charts we simply change “bar” to “column” where relevant:
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "column",
hcaes(x = year,
y = water_consumption,
group = State)) %>%
hc_plotOptions(column = list(stacking = "percent"))
Lastly, be aware that the `color` argument can be vectorised for custom colours, including HTML colour codes!
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "column",
hcaes(x = year,
y = water_consumption,
group = State),
color = c("gold", "blue", "pink", "orange", "green", "purple", "red", "violet")) %>%
hc_plotOptions(column = list(stacking = "percent"))
Changing Hover Info - hc_tooltip() #
- The `hc_tooltip()` function changes the information displayed when we mouse over our visualisation
  - We pipe our chart into this function
- Some useful arguments:

Argument | Possible values | Function |
---|---|---|
`valueDecimals` | A number | Changes the number of decimal places to which our data displays |
`valueSuffix` | Any string | Adds the specified string as a suffix to the data |
`shared` | `TRUE` or `FALSE` | Specifies whether the hover data is for all bars or just the one we mouse over |
Consider how each of the following arguments has modified our mouse-over display:
consumption %>%
group_by(year) %>%
mutate(consumption_total = sum(water_consumption)) %>%
ungroup() %>%
hchart(type = "bar",
hcaes(x = year,
y = water_consumption,
group = State)) %>%
hc_plotOptions(bar = list(stacking = "percent")) %>%
hc_tooltip(valueDecimals = 2,
valueSuffix = "GL",
shared = TRUE)
Scatter, Line and Bubble Charts #
For our examples we will use data from the ABARES Agricultural Census of 2015-2016. The data relates to the average climate-adjusted productivity of all cropping farms between 1977 and 2015.
load("tidy_ABARES_data/farm_data.rdata")
head(farm_data, 5)
## # A tibble: 5 x 4
## year Total.factor.productivity Climate.effect Climate.adjusted.TFP
## <chr> <dbl> <dbl> <dbl>
## 1 1978 95.9 89.7 103.
## 2 1979 113. 113. 102.
## 3 1980 112. 106. 103.
## 4 1981 84.2 92.5 101.
## 5 1982 104. 105. 101.
Creating a scatter chart follows a similar method to the bar chart.
- In the `hchart()` function, we set the `type` to “scatter”
- We specify the `hcaes()` arguments `x` and `y` for the data we wish to plot
farm_data %>%
hchart(type = "scatter",
hcaes(x = Climate.effect,
y = Total.factor.productivity))
We may also use the `color` argument of `hcaes()` to colour our points by some variable.
farm_data %>%
hchart(type = "scatter",
hcaes(x = Climate.effect,
y = Total.factor.productivity,
color = Climate.adjusted.TFP))
For line charts, we use the `type` of “line”.
farm_data %>%
hchart(type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP))
- If we wish to toggle the display of the markers, we use the `marker` argument of `hchart()`
  - This argument takes a list of sub-arguments
  - We toggle the `enabled` sub-argument to `TRUE` or `FALSE`
farm_data %>%
hchart(type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP),
marker = list(enabled = FALSE))
For bubble charts, we use the `type` of “bubble”. The `size` argument of `hcaes()` is used to determine which variable influences the size of the bubble.
farm_data %>%
hchart(type = "bubble",
hcaes(x = Climate.effect,
y = Total.factor.productivity,
size = Climate.adjusted.TFP))
We can also re-scale all the bubble sizes as we like:
- Pipe the chart into the `hc_plotOptions()` function
  - We use the `bubble` argument of this function
  - The argument takes a list as its value
- We set the `maxSize` sub-argument of this list to be a percentage, for example “10%”
  - This percentage represents the maximum size of a bubble relative to chart size
farm_data %>%
hchart(type = "bubble",
hcaes(x = Climate.effect,
y = Total.factor.productivity,
size = Climate.adjusted.TFP)) %>%
hc_plotOptions(bubble = list(maxSize = "10%"))
A Note on Scaling the Axes #
- If we want, say, a logarithmic axis, we use the `hc_xAxis()` or `hc_yAxis()` functions
  - The `type` argument can be set to “logarithmic”
x0 = seq(1, 10, 0.1)
y0 = log(x0)
dataXY = cbind(x0,y0)
as_tibble(dataXY) %>%
hchart(type = "scatter",
hcaes(x = x0,
y = y0))
as_tibble(dataXY) %>%
hchart(type = "scatter",
hcaes(x = x0,
y = y0)) %>%
hc_xAxis(type = "logarithmic")
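One way to read the second chart: on a logarithmic x-axis, points are positioned by the logarithm of x, so the curve y = log(x) should straighten into a line. The base-R check below illustrates this with the same `x0` and `y0` (the name `x_position` is introduced here for illustration only). It also shows why zero or negative values cannot be placed on such an axis, since they have no finite logarithm.

```r
x0 <- seq(1, 10, 0.1)
y0 <- log(x0)

# On a log-scaled axis, horizontal position is proportional to log10(x)
x_position <- log10(x0)

# y0 is an exact linear function of the log-scaled position,
# so the correlation is 1 (up to floating-point error)
cor(x_position, y0)

# Zero has no finite logarithm, so it cannot be drawn on a log axis
log10(0)
```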
Interactive Time Series Charts #
In this section we introduce a new function, `highchart()`:
- The function is similar to `hchart()`
- However, we don’t pipe our data into this function
- Instead we pipe this function into other ‘adding’ functions which use our data
To create an interactive time series:
- We begin with the `highchart()` function, with the `type` argument of “stock”
  - This is our chart format
- We then pipe this into the `hc_add_series()` function
  - This function uses the first argument `data`, which we set to be our data
  - The function also requires a `type`, which can be “point” or “line”
  - We must also specify the `hcaes()` argument to instruct the function which variables to plot
highchart(type = "stock") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Total.factor.productivity))
We have a highly interactive time series plot, including various zoom settings, and a scroll to select particular portions of the graph for viewing.
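Stock charts in Highcharts are built around a datetime x-axis, while `year` in `farm_data` is stored as text. If the axis labels or zoom buttons behave oddly, one option is to convert the years to dates first. The sketch below is a hedged illustration: `toy_farm` is a hypothetical stand-in for `farm_data`, treating each year as 1 January is an assumption made here, and the chart call is guarded so that the conversion can be run without highcharter installed.

```r
library(dplyr)

# Hypothetical stand-in for `farm_data`, with `year` stored as text
toy_farm <- tibble(
  year = c("1978", "1979", "1980"),
  Total.factor.productivity = c(95.9, 113, 112)
)

# Assumption: treat each year as 1 January of that year
toy_farm_dated <- toy_farm %>%
  mutate(date = as.Date(paste0(year, "-01-01")))

toy_farm_dated

# The converted column can then be used as x (requires highcharter)
if (requireNamespace("highcharter", quietly = TRUE)) {
  library(highcharter)
  highchart(type = "stock") %>%
    hc_add_series(data = toy_farm_dated,
                  type = "line",
                  hcaes(x = date,
                        y = Total.factor.productivity))
}
```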
We can additionally plot multiple time series variables on the same graph. This is done simply by piping another `hc_add_series()` function into the mix.
highchart(type = "stock") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Total.factor.productivity)) %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP)) %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.effect))
There is no legend to distinguish the different curves, but we can add one with the `hc_legend()` function and by setting `enabled` to `TRUE`.
highchart(type = "stock") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Total.factor.productivity)) %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP)) %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.effect)) %>%
hc_legend(enabled = TRUE)
However the series labels are generic and the colours are distasteful! We can fix this using the `name` argument of `hc_add_series()`, which names the series (and hence its legend entry). We can also specify custom colours. Overall our chart becomes much nicer.
highchart(type = "stock") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Total.factor.productivity),
name = "TFP",
color = "orange") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.adjusted.TFP),
name = "Climate-Adjusted TFP",
color = "red") %>%
hc_add_series(data = farm_data,
type = "line",
hcaes(x = year,
y = Climate.effect),
name = "Climate Effect",
color = "lightblue") %>%
hc_legend(enabled = TRUE)
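Repeating `hc_add_series()` once per column works, but the calls grow with the number of variables. A possible alternative, sketched below under the assumption that reshaping the data is acceptable, is to pivot the measures into long format with `tidyr::pivot_longer()` and let the `group` argument of `hcaes()` split the series. The names `toy_farm`, `farm_long`, `measure` and `index` are hypothetical and introduced only for this example; series names and colours would then come from the `measure` values and the default palette rather than from `name` and `color`.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for `farm_data`
toy_farm <- tibble(
  year = c("1978", "1979"),
  Total.factor.productivity = c(95.9, 113),
  Climate.effect = c(89.7, 113),
  Climate.adjusted.TFP = c(103, 102)
)

# One row per year and measure, instead of one column per measure
farm_long <- toy_farm %>%
  pivot_longer(cols = -year,
               names_to = "measure",
               values_to = "index")

farm_long

# A single series call, split by `measure` (requires highcharter)
if (requireNamespace("highcharter", quietly = TRUE)) {
  library(highcharter)
  highchart(type = "stock") %>%
    hc_add_series(data = farm_long,
                  type = "line",
                  hcaes(x = year, y = index, group = measure)) %>%
    hc_legend(enabled = TRUE)
}
```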
Treemaps #
Treemaps are used to visualise the comparative sizes of a single quantitative variable among observations. For example, if we wish to see which Australian state consumed what amount of water in 2013–14, we might use a treemap for comparison.
- We first pipe the data into `hchart()` as usual
  - We set the `type` to “treemap”
- We also use the special arguments of `hcaes()`, `name` for the observation and `value` for the quantitative variable
consumption %>%
filter(year == "2013–14") %>%
hchart(type = "treemap",
hcaes(name = State,
value = water_consumption))
We may set colours according to a variable:
- First we introduce the `colorValue` argument of `hcaes()` and set this to be the variable to colour by
- We must then pipe our entire chart into the `hc_colorAxis()` function
  - This takes a `minColor` and `maxColor` argument
consumption %>%
filter(year == "2013–14") %>%
hchart(type = "treemap",
hcaes(name = State,
value = water_consumption,
colorValue = water_consumption)) %>%
hc_colorAxis(minColor = "lightblue",
maxColor = "darkblue")
Note: here we use named shades of blue, but we may also use HTML colour codes such as “#000EFF”.
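Colour names are interpreted by the browser, and not every name that R recognises is necessarily a valid HTML colour name, so hex codes are the safer choice when in doubt. A small base-R sketch for converting R colour names to hex codes follows; `to_hex()` is a hypothetical helper written for this example, not part of Highcharter.

```r
# Convert named R colours to "#RRGGBB" strings using base R
to_hex <- function(colour_names) {
  # col2rgb() returns a 3 x n matrix; rgb() expects one colour per row
  rgb(t(col2rgb(colour_names)), maxColorValue = 255)
}

to_hex(c("lightblue", "darkblue"))
# → "#ADD8E6" "#00008B"
```

The resulting strings can be passed to `minColor` and `maxColor` in place of the names.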
Interactive Choropleths #
- A choropleth is a map-based chart in which regions are shaded with colours to reflect some variable
- Creating choropleths with Highcharter requires us to manipulate ‘shapefiles’
- These are files which contain information about points, lines, polygons (etc) necessary to visually depict shapes, such as countries of the world
- Before we can create a choropleth, we must learn how to prepare these shape files
There are two types of shapefiles:
- ESRI shapefiles - the older standard for shapefiles
  - To use them we must have at least one of each of the below:
    - A `.dbf` file
    - A `.shp` file
    - A `.shx` file
- GeoJson shapefiles - a newer type
  - To use them we only require one `.json` file
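Before reading an ESRI shapefile it can save time to confirm that all three required files are actually in the folder. The base-R sketch below does this for a temporary demo directory. The directory name, the placeholder file names and the variable names are all hypothetical; the empty files exist only so that the check has something to find.

```r
# Hypothetical directory; replace with your own shapefile folder
shape_dir <- file.path(tempdir(), "shapefiles_demo")
dir.create(shape_dir, showWarnings = FALSE)

# Create empty placeholder files purely for the sake of the demonstration
file.create(file.path(shape_dir, c("states.shp", "states.shx", "states.dbf")))

# Which of the required extensions are missing from the folder?
required <- c("shp", "shx", "dbf")
found <- tolower(tools::file_ext(list.files(shape_dir)))
missing_ext <- setdiff(required, found)

missing_ext
```

An empty result means every required extension is present.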
Good sources of global shapefiles are NaturalEarthData.com and Johan’s repository.
Note that for Highcharter, our process requires that we convert our shapefiles to GeoJson. We now work through an example of this.
To prepare these shapefiles we require the library `sf`.
- We call upon `read_sf()` to read an entire directory of shapefiles and save the result
- For our example we will use an Australian shapefile released by the Australian Government
library("sf")
shapefile_map <- read_sf(dsn = "shapefiles")
# Note: for file path, do not include a '/' at the end
class(shapefile_map)
## [1] "sf" "tbl_df" "tbl" "data.frame"
We have our shapes - we will mutate our shape data so that they are named by state.
shapefile_map$State <- c("NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT")
We then use the `geojsonio` library to convert these files.
library("geojsonio")
geojson_file <- geojson_list(shapefile_map)
class(geojson_file)
## [1] "geo_list"
We are now set to make our chart.
- We begin with the `highchart()` function
  - We set the `type` to “map”
- We pipe this into the `hc_add_series_map()` function
  - This importantly takes the `map` argument of our geojson file
    - This is how Highcharter knows what our shapes are
  - It also takes the `df` argument of our data
    - This is how Highcharter knows what data to use for the map
  - The `value` argument is a string naming the column of the data to plot
  - We use the `joinBy` argument to join the map and data
    - It takes a vector of strings as its value
    - The first string is the name of the column in the geojson file that holds the location names
    - The second string is the name of the column in the data that holds the location names
consumption14 <- consumption %>%
filter(year == "2013–14")
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"))
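If some regions come out blank, a likely cause is that the location names in the map and in the data do not match exactly. A quick base-R check with `setdiff()` is sketched below. The two vectors are hypothetical stand-ins for `shapefile_map$State` and `consumption14$State`, with a deliberate mismatch added for illustration.

```r
# Hypothetical key columns standing in for shapefile_map$State
# and consumption14$State (note the deliberate "Tas" typo and missing ACT)
map_keys <- c("NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT")
data_keys <- c("NSW", "VIC", "QLD", "SA", "WA", "Tas", "NT")

# Regions on the map with no matching data row
setdiff(map_keys, data_keys)
# → "TAS" "ACT"

# Data rows with no matching region on the map
setdiff(data_keys, map_keys)
# → "Tas"
```

Both results being empty suggests every region can be joined.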
Mouse over the chart! We observe that the label is a bit strange. We have ways around this:
- We can use the `name` argument of `hc_add_series_map()` to change the “Series 1” label
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"),
name = "Water Consumption (GL)")
We can also change colours as we have seen before with hc_colorAxis()
consumption14 <- consumption %>%
filter(year == "2013–14")
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"),
name = "Water Consumption (GL)") %>%
hc_colorAxis(minColor = "#C5C000", maxColor = "#434000")
Note: the following material is more advanced and harder to follow.
If we want even more customisation for hover information, we can use `hc_tooltip()`
- If we use the `pointFormat` argument, we can set hover text to be whatever we like, including text and variable values
- Follow the formatting below:
  - The entire argument is a string
  - Where we wish to see the value of a variable, we use `{point.variable_name}`
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"),
name = "Water Consumption (GL)") %>%
hc_tooltip(pointFormat = "Welcome to {point.State}: {point.water_consumption} GL consumed!")
We can also use the `headerFormat` argument to remove the heading completely by setting it to `""`
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"),
name = "Water Consumption (GL)") %>%
hc_tooltip(headerFormat = "",
pointFormat = "Welcome to {point.State}: {point.water_consumption} GL consumed!")
Lastly we can use the `dataLabels` argument of `hc_add_series_map()`, which takes a list as its value. If we use the sub-argument `enabled` and set it to `TRUE`, our values will display on the map itself:
highchart(type = "map") %>%
hc_add_series_map(map = geojson_file,
df = consumption14,
value = "water_consumption",
joinBy = c("State", "State"),
name = "Water Consumption (GL)",
dataLabels = list(enabled = TRUE))
Specifying Custom Colours - “Map-Values” Method #
We have already seen some usage of custom colours, but the methods already introduced can require some trial and error to get colours as we require. If we wish to use existing colour palettes, we use a “map-values” method. In short, this method mutates our original data to introduce a colour column. We then configure Highcharter to accept and use these colours.
The `plyr::mapvalues()` function is used with `mutate()`. We map a colour to a State in this example.
colour_consumption <- consumption14 %>%
mutate(colour = plyr::mapvalues(State,
from = c("NSW",
"VIC",
"QLD",
"SA",
"WA",
"TAS",
"NT",
"ACT"),
to = c("red",
"yellow",
"pink",
"green",
"purple",
"orange",
"blue",
"brown")))
colour_consumption
## # A tibble: 8 x 4
## State year water_consumption colour
## <chr> <chr> <dbl> <chr>
## 1 NSW 2013–14 7508 red
## 2 VIC 2013–14 3988 yellow
## 3 QLD 2013–14 4145 pink
## 4 SA 2013–14 1077 green
## 5 WA 2013–14 1317 purple
## 6 TAS 2013–14 390 orange
## 7 NT 2013–14 167 blue
## 8 ACT 2013–14 53 brown
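If installing `plyr` is not desirable, the same mapping can be sketched with a named vector used as a lookup table. This is offered as an alternative under the assumption that every State appears in the lookup; `state_colours`, `toy` and `toy_coloured` are hypothetical names, and only three states are included to keep the example short.

```r
library(dplyr)

# Named vector acting as a lookup table: names are states, values are colours
state_colours <- c(NSW = "red", VIC = "yellow", QLD = "pink")

# Hypothetical toy data in place of `consumption14`
toy <- tibble(State = c("QLD", "NSW", "VIC"),
              water_consumption = c(4145, 7508, 3988))

# Index the lookup by State; unname() drops the state names from the result
toy_coloured <- toy %>%
  mutate(colour = unname(state_colours[State]))

toy_coloured
```

A State missing from the lookup would receive `NA` as its colour, which makes gaps easy to spot.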
When we plot our chart, we now simply instruct Highcharter to recognise these colours.
- We do this by using the `color` argument of `hchart()`
  - Importantly, it takes the value of the data’s colour column (specified using the `$`) wrapped in the `unique()` function
  - This prevents duplicate colours in the event that we have multiple observations of the same colour
colour_consumption %>%
hchart(type = "bar",
hcaes(x = State,
y = water_consumption,
group = State),
color = unique(colour_consumption$colour)) %>%
hc_plotOptions(bar = list(stacking = "normal"))
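One thing worth checking: the `color` vector is matched to the series by position, so the colours only land on the intended states if the vector is in the same order as the series. Assuming the series are drawn in alphabetical order of the grouping variable (an assumption to verify against the legend of your own chart), the colours can be put in that order before being passed in. The sketch below uses a hypothetical three-row `toy` tibble to show the difference between row order and alphabetical order.

```r
library(dplyr)

# Hypothetical toy data: rows are not in alphabetical order of State
toy <- tibble(State = c("NSW", "VIC", "ACT"),
              colour = c("red", "yellow", "brown"))

# Colours in row order
unique(toy$colour)
# → "red" "yellow" "brown"

# Colours in alphabetical order of State
colours_by_state <- toy %>%
  arrange(State) %>%
  pull(colour) %>%
  unique()

colours_by_state
# → "brown" "red" "yellow"
```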
This method will also work with existing colour palettes, for example those from the `RColorBrewer` package.
library("RColorBrewer")
colour_consumption <- consumption14 %>%
mutate(colour = plyr::mapvalues(State,
from = c("NSW",
"VIC",
"QLD",
"SA",
"WA",
"TAS",
"NT",
"ACT"),
to = brewer.pal(8, "Paired")))
colour_consumption
## # A tibble: 8 x 4
## State year water_consumption colour
## <chr> <chr> <dbl> <chr>
## 1 NSW 2013–14 7508 #A6CEE3
## 2 VIC 2013–14 3988 #1F78B4
## 3 QLD 2013–14 4145 #B2DF8A
## 4 SA 2013–14 1077 #33A02C
## 5 WA 2013–14 1317 #FB9A99
## 6 TAS 2013–14 390 #E31A1C
## 7 NT 2013–14 167 #FDBF6F
## 8 ACT 2013–14 53 #FF7F00
colour_consumption %>%
hchart(type = "bar",
hcaes(x = State,
y = water_consumption,
group = State),
color = unique(colour_consumption$colour)) %>%
hc_plotOptions(bar = list(stacking = "normal"))