2024 Voter Turnout: Hotspots
1 INTRODUCTION
While most media coverage has focused on the coalition negotiations,¹ there is an equally interesting development in the 2024 South African National and Provincial Election results, namely voter turnout. In each successive election cycle since 1999, the country has recorded lower voter turnout. Turnout peaked in 1999 at 89.3%, and the latest cycle registered the lowest voter turnout yet at 58.6% (O’Regan 2024). There are bound to be spatial variations in voting patterns and turnout as well.
In this post, we analyse differences in voter turnout at ward level, which presents several challenges. Firstly, voting districts and wards are mutable: they are subject to change from one election cycle to the next. As a consequence, we can either account for changes from previous ward demarcations to their current version, or impute some other value to measure the differences between the 2019 and 2024 National and Provincial Elections.
Secondly, we do not include Out-of-Country votes in the analysis, as those votes are not linked to a ward. Out-of-Country votes do not include Provincial or Regional ballots. They were also affected by the Electoral Court's decision (The Electoral Court of South Africa 2024) on the role of honorary consulates, high commissions and consulates as voting stations. Effectively, the decision introduces a large set of new voting stations, a marked departure from previous election cycles.
Voter turnout is defined as a percentage of the registered population: \(\text{voter turnout} = 100 \times \text{total votes} / \text{registered population}\). It is applied to both election years, and the turnout difference is \(\text{voter turnout}_{2024} - \text{voter turnout}_{2019}\).
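As a quick illustration of the definition, the sketch below computes turnout and the turnout difference for three made-up wards. The data frame, its column names and the `turnout()` helper are assumptions for illustration only; none of the figures are IEC results.

```r
# Hypothetical ward-level totals (illustrative numbers, not IEC data)
wards <- data.frame(
  ward_id         = c("A", "B", "C"),
  registered_2019 = c(4000, 5200, 3100),
  votes_2019      = c(2800, 3380, 2170),
  registered_2024 = c(4200, 5000, 3300),
  votes_2024      = c(2310, 3100, 1980)
)

# Turnout as a percentage of the registered population
turnout <- function(total_votes, registered) {
  100 * total_votes / registered
}

wards$turnout_2019 <- turnout(wards$votes_2019, wards$registered_2019)
wards$turnout_2024 <- turnout(wards$votes_2024, wards$registered_2024)

# Difference in percentage points; negative values mean turnout fell
wards$turnout_diff <- wards$turnout_2024 - wards$turnout_2019

wards[, c("ward_id", "turnout_2019", "turnout_2024", "turnout_diff")]
```

Ward A, for example, goes from 70% to 55%, a difference of -15 percentage points.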
1.1 COLLECTING DATA
To collect the required data, we rely on two main data sources. The first is the Municipal Demarcation Board, the body responsible for drawing districts throughout the country. The second is the Independent Electoral Commission of South Africa (IEC), which in turn determines the appropriate voting districts (voting station boundaries). The IEC is unambiguous about the independence of the voting districts from the work of the Municipal Demarcation Board: voting districts are logistically sound regions aimed at minimising voter inconvenience and limiting voter fraud (‘About Voting Districts and Stations - Electoral Commission of South Africa’ n.d.).
Unlike in previous years, sourcing voting district and voting station coordinates proved markedly more difficult in 2024. Fortunately, SANEF’s election dashboard (‘Elections Dashboard » SANEF Elections Portal 2024’ 2024) has a handy data export feature. The voting station location data can be joined to their respective wards, and the voting station results are then aggregated to ward level for the 2019 and 2024 elections.
There are a few sanity checks in the data preprocessing. Newly demarcated wards are excluded, since they don’t have a 2019 baseline. Voting stations with turnout greater than 100% are removed; this pattern is particularly glaring at voting stations that were housed in temporary structures such as tents. We do not include data from Provincial and Regional ballots, since we aren’t necessarily interested in voting patterns per se but rather in whether voters showed up.
The final dataset is an sf object with the aggregate turnout results across wards in both the 2019 and the 2024 National and Provincial Elections. Our variable of interest is the change in turnout from 2019 to 2024.
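The preprocessing steps described above can be sketched roughly as follows. This is a minimal base R illustration under stated assumptions, not the code used for the post: the `stations` table, its column names and all numbers are hypothetical, and each station is assumed to already carry the `ward_id` it was joined to.

```r
# Hypothetical station-level results, already linked to a ward
stations <- data.frame(
  station    = c("s1", "s2", "s3", "s4", "s5", "s6", "s7"),
  ward_id    = c("W1", "W1", "W2", "W2", "W1", "W2", "W3"),
  year       = c(2019, 2019, 2019, 2019, 2024, 2024, 2024),
  registered = c(1000, 800, 1200, 50, 1100, 1300, 900),
  votes      = c(700, 520, 780, 60, 640, 700, 500)
)

# Sanity check: drop stations reporting more votes than registered voters
stations$turnout <- 100 * stations$votes / stations$registered
clean <- stations[stations$turnout <= 100, ]

# Aggregate station totals to ward level for each election year
ward_totals <- aggregate(cbind(registered, votes) ~ ward_id + year,
                         data = clean, FUN = sum)
ward_totals$turnout <- 100 * ward_totals$votes / ward_totals$registered

# Inner join on ward_id: wards without a 2019 baseline (W3 here) drop out
t2019 <- ward_totals[ward_totals$year == 2019, c("ward_id", "turnout")]
t2024 <- ward_totals[ward_totals$year == 2024, c("ward_id", "turnout")]
ward_turnout <- merge(t2019, t2024, by = "ward_id",
                      suffixes = c("_2019", "_2024"))
ward_turnout$turnout_diff <- ward_turnout$turnout_2024 - ward_turnout$turnout_2019

ward_turnout
```

Turnout is recomputed from the summed votes and registrations rather than averaged across stations, so large stations carry proportionally more weight in the ward figure.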
2 EXPLORATORY ANALYSIS
library(ggplot2)
library(sf)  # geom_sf() relies on sf to draw the ward geometries

# Choropleth of the ward-level change in turnout between 2019 and 2024
TurnOut |>
  ggplot() +
  geom_sf(aes(fill = turnout_diff),
          color = "black") +
  scale_fill_viridis_c(breaks = c(-40, -20, 0, 20, 40)) +
  labs(subtitle = "2019 - 2024 Turnout Difference (%)",
       fill = "Turnout Difference (%)") +
  theme_void() +
  theme(
    text = element_text(family = "IBM Plex Sans"),
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(face = "italic", hjust = 0.5),
    legend.position = "bottom"
  )
Figure 1 illustrates the turnout differences across wards. There are at least 4454 wards and, as a result, insights are lost in the noise. For example, the metropolitan areas are indistinguishable from the rest of the country because their differences are hidden by ward boundaries. It is possible to ‘zoom’ into these areas of interest.
Figure 2 illustrates differences in turnout across five metropolitan municipalities. This approach provides a more granular view of outcomes while focusing on regions with higher population densities. Some distinct patterns emerge at both ward level and metropolitan level. One approach to quantifying these patterns is hotspot analysis. Effectively, we can rely on a number of statistics to assess spatial autocorrelation. Kopczewska (2021, pp. 149-211) provides a succinct summary of spatial autocorrelation, global and local statistics, and their visualisation.
2.1 CREATING HOTSPOTS
In the code below, we complete a couple of tasks. First, we create a neighbours list from the ward polygons using the poly2nb function. Next, the neighbours lists are converted to spatial weights (nb2listw) and lagged (lag.listw).
The lagged spatial weights are used as input in the estimation of a local spatial statistic (Getis-Ord G), which helps us identify clusters of high or low voter turnout. The hotspot function then classifies whether the observed patterns are of interest. Finally, we can visualise the results.
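The workflow described above can be sketched with the spdep package as follows. This is a hedged illustration, not the code behind the post: the `wards` grid, its synthetic `turnout_diff` values, the row-standardised weights (`style = "W"`), the 0.05 cutoff and the unadjusted p-values are all assumptions chosen for a small self-contained example. Note that in spdep the `localG` function takes the variable and the weights object directly; the spatial lag is computed here only to show what it looks like.

```r
library(sf)
library(spdep)

set.seed(2024)

# Hypothetical stand-in for the ward polygons: a 10 x 10 grid of unit squares
bbox <- st_bbox(c(xmin = 0, ymin = 0, xmax = 10, ymax = 10))
grid <- st_make_grid(st_as_sfc(bbox), n = c(10, 10))
wards <- st_sf(turnout_diff = rnorm(length(grid), mean = -5, sd = 3),
               geometry = grid)

# Plant an artificial cluster of higher values in the lower-left corner
xy <- st_coordinates(st_centroid(grid))
in_cluster <- xy[, 1] < 3 & xy[, 2] < 3
wards$turnout_diff[in_cluster] <- wards$turnout_diff[in_cluster] + 15

# 1. Neighbours list: polygons that touch are neighbours (queen contiguity)
ward_nb <- poly2nb(wards, queen = TRUE)

# 2. Spatial weights built from the neighbours list
ward_lw <- nb2listw(ward_nb, style = "W")

# 3. Spatial lag: the weighted average of each ward's neighbouring values
wards$turnout_lag <- lag.listw(ward_lw, wards$turnout_diff)

# 4. Local Getis-Ord G statistic (returned as z-values)
g_local <- localG(wards$turnout_diff, ward_lw)

# 5. Label wards whose statistic passes the cutoff as "High" or "Low"; the rest are NA
classes <- hotspot(g_local, Prname = "Pr(z != E(Gi))",
                   cutoff = 0.05, p.adjust = "none")
table(classes, useNA = "ifany")
```

Wards that do not pass the cutoff are returned as NA, which is why a later step can filter on missing classifications before mapping.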
# Attach the hotspot classification (computed in the previous step) to the wards
TurnOut$hotspot_classification <- G_Local_Classy

# Map the classified wards; unclassified wards use ggplot2's default NA colour
TurnOut |>
  ggplot() +
  geom_sf(aes(fill = hotspot_classification),
          color = "black") +
  scale_fill_manual(
    values = c("High" = "#0f204b",
               "Low" = "#A71930")
  ) +
  labs(fill = "Hotspot Classification") +
  theme_void() +
  theme(
    text = element_text(family = "IBM Plex Sans"),
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(face = "italic", hjust = 0.5),
    legend.position = "bottom"
  )
G_Local_Classy
Low High
401 314
Figure 3 illustrates a map of the hotspots throughout South Africa. However, it has the same flaw observed in Figure 1: the hotspots are sparsely distributed throughout the country. As such, it can be difficult to extract meaningful information from the visualisation. In addition, the map as-is does not contain any contextual information such as cities, the built environment or roads.
Accordingly, we enhance the visualisation using the rdeck package (North 2024), which offers an interface to Mapbox's visualisation capabilities. The code below is adapted from Walker (2024).
The Mapbox service requires an account and an access token. It offers a generous free tier.
2.2 INTERACTIVE VISUALISATION
library(dplyr)        # filter()
library(rdeck)
library(mapdeck)
library(viridisLite)  # cividis()

# Keep only the wards classified as a hot or cold spot
TurnOut_Subset <- TurnOut |>
  filter(!is.na(hotspot_classification))

# Interactive map on a satellite basemap, centred on South Africa
# (requires a Mapbox access token to be configured)
rdeck(map_style = mapbox_satellite_streets(),
      initial_view_state = view_state(
        center = c(24.0850297, -29.6978701),
        zoom = 5)) |>
  add_polygon_layer(
    data = TurnOut_Subset,
    pickable = TRUE,
    visible = TRUE,
    get_polygon = geometry,
    opacity = 0.6,
    # One colour per classification level
    get_fill_color = rdeck::scale_color_category(
      col = hotspot_classification,
      palette = cividis(n = table(TurnOut$hotspot_classification) |> length(),
                        direction = -1)
    )
  )
3 CONCLUSION
Our primary aim was to assess turnout differences between the 2019 and 2024 National and Provincial Elections in South Africa. This involved some data preprocessing, merging and aggregation of turnout for each election cycle. After some initial visualisations, we relied on the local Getis-Ord G statistic to find clusters of hotspots. The final visualisation is interactive and includes satellite imagery of South Africa for added context.
This walk-through is fairly superficial: we do not include covariates to explain differences in turnout, nor do we consider events that may have occurred in those regions. Rule (2018), Fransman and Fintel (2024) and others have considered a broader spectrum of variables that could explain voting patterns.
References
Footnotes
1. See Bloomberg, Daily Maverick and others.