This document illustrates the preprocessing of the dataset visualized in this article on srf.ch.
SRF Data attaches great importance to transparent and reproducible data preprocessing and analysis. SRF Data believes in the principles of open data, but also in open and reproducible methods. Third parties should be empowered to build on the work of SRF Data and to generate new analyses and applications.
The preprocessing and analysis of the data was conducted in the R project for statistical computing. The RMarkdown script used to generate this document and all the resulting data can be downloaded under this link. By executing main.Rmd, the process described here can be reproduced and this document can be generated. In the course of this, data from the folder input will be processed and results will be written to output.
Attention: Please set your working directory in the first code chunk!
The code for the process described here can also be freely downloaded from https://github.com/srfdata/2015-10-elections-list-apparentments. Criticism in the form of GitHub issues and pull requests is very welcome!
2015-10-elections-list-apparentments by SRF Data is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The published information has been collated carefully, but no guarantee is offered of its completeness, correctness or up-to-date nature. No liability is accepted for damage or loss incurred from the use of this script or the information drawn from it. This exclusion of liability also applies to third-party content that is accessible via this offer.
All code & data from SRF Data is available under http://srfdata.github.io.
- input/lv9511n.csv: Effects of list apparentments on the cantonal level, 1995-2011. Data source: Daniel Bochsler, NCCR Democracy.
- input/lv9515n.csv: Effects of list apparentments on the cantonal level, 1995-2015. Data source: Daniel Bochsler, NCCR Democracy.
- input/nrwresultate_sim9515.xls: Effects of list apparentments on the national level, 1995-2015. Data source: Daniel Bochsler, NCCR Democracy.

The following sections describe the results of the data preprocessing as stored in the output folder.
output/parties.csv
Contains party classifications made by SRF Data with the help of political scientists, used throughout all projects related to elections.
Attribute | Type | Description |
---|---|---|
ID | Integer | Unique identifier |
Abbr_* | String | Abbreviation in German (D), French (F), English (E), Romansh (R), Italian (I), respectively |
Legend_* | String | Abbreviation, but with slightly more information, used for frontend purposes |
Name_* | String | Full name |
Sortorder | Integer | Used for frontend purposes solely |
OLD_ID | String | “Official” ID as given in https://github.com/srfdata/2015-06-elections-partystrengths/blob/master/analysis/input/parteienstaerke_mod_2.xlsx (sheet “Parteien”), used for combining party strengths for party groupings |
output/lv_2015.csv
Contains effects of list apparentments on the cantonal level, 2015, as derived from input/lv9515n.csv.
Attribute | Type | Description |
---|---|---|
year | Integer | Election year |
canton | String | Official cantonal abbreviation |
party | String | Contains the party name, but only if it belongs to a group in output/parties.csv (e.g. id == 8 ) |
party_id | Integer | Party or party grouping, referencing ID in output/parties.csv |
party_strength | Double | Party strength in percent |
seats_with | String | Actual, resulting seats |
seat_difference | String | seats_with - seats_without |
seats_without | String | Seats that would have resulted without the possibility of list apparentments |
list_id | String | The cantonal list the party was on in 2015 |
output/lv_2015_national.csv
Contains effects of list apparentments on the national level, 2015, as derived from input/nrwresultate_sim9515.xls.
Attribute | Type | Description |
---|---|---|
year | Integer | Election year |
party | String | Contains the party name, but only if it belongs to a group in output/parties.csv (e.g. id == 8 ) |
party_id | Integer | Party or party grouping, referencing ID in output/parties.csv |
party_strength | Double | Party strength in percent |
seats_with | String | Actual, resulting seats |
seats_without | String | Seats that would have resulted without the possibility of list apparentments |
seat_difference | String | seats_with - seats_without |
output/lv_historical.csv
Contains effects of list apparentments on the cantonal level, 1995 - 2011, as derived from input/lv9511n.csv.
Attribute | Type | Description |
---|---|---|
year | Integer | Election year |
canton | String | Official cantonal abbreviation |
party | String | Contains the party name, but only if it belongs to a group in output/parties.csv (e.g. id == 8 ) |
party_id | Integer | Party or party grouping, referencing ID in output/parties.csv |
party_strength | Double | Party strength in percent |
seats_with | String | Actual, resulting seats |
seat_difference | String | seats_with - seats_without |
seats_without | String | Seats that would have resulted without the possibility of list apparentments |
output/lv_historical_national.csv
Contains effects of list apparentments on the national level, 1995 - 2011, as derived from input/nrwresultate_sim9515.xls.
Attribute | Type | Description |
---|---|---|
year | Integer | Election year |
party | String | Contains the party name, but only if it belongs to a group in output/parties.csv (e.g. id == 8 ) |
party_id | Integer | Party or party grouping, referencing ID in output/parties.csv |
party_strength | Double | Party strength in percent |
seats_with | String | Actual, resulting seats |
seat_difference | String | seats_with - seats_without |
seats_without | String | Seats that would have resulted without the possibility of list apparentments |
# from https://mran.revolutionanalytics.com/web/packages/checkpoint/vignettes/using-checkpoint-with-knitr.html
cat("library(magrittr)
library(tidyr)
library(dplyr)
library(readxl)
library(ggplot2)",
file = "manifest.R")
package_date <- "2015-08-01"
if(!require(checkpoint)) {
if(!require(devtools)){
install.packages("devtools", repos = "http://cran.us.r-project.org")
require(devtools)
}
devtools::install_github("checkpoint", username = "RevolutionAnalytics", ref = "v0.3.2", repos = "http://cran.us.r-project.org")
require(checkpoint)
}
## Loading required package: checkpoint
##
## checkpoint: Part of the Reproducible R Toolkit from Microsoft
## https://mran.microsoft.com/documents/rro/reproducibility/
if(!dir.exists("~/.checkpoint")){
dir.create("~/.checkpoint")
}
checkpoint(snapshotDate = package_date, project = path_to_wd, verbose = T, scanForPackages = T, use.knitr = F)
## Scanning for packages used in this project
## rmarkdown files found and will not be parsed. Set use.knitr = TRUE
## - Discovered 6 packages
## All detected packages already installed
## checkpoint process complete
## ---
rm(package_date)
source("manifest.R")
##
## Attaching package: 'tidyr'
## The following object is masked from 'package:magrittr':
##
## extract
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
unlink("manifest.R")
The comments in the following code were originally written in German.
Convert data to CSV
# 1 List apparentments in the National Council elections 1995-2011, by canton and party
listenverbindungen_kantonal <- read.csv2(file = "input/lv9511n.csv", sep = ",", stringsAsFactors = F)
# There are two errors in the cantonal data; fix them right here
listenverbindungen_kantonal[listenverbindungen_kantonal$party == "glp" & listenverbindungen_kantonal$year == 2011 & listenverbindungen_kantonal$kt == "TG",]$ap_s <- 1
listenverbindungen_kantonal[listenverbindungen_kantonal$party == "fdp" & listenverbindungen_kantonal$year == 2011 & listenverbindungen_kantonal$kt == "TG",]$ap_s <- -1
listenverbindungen_kantonal %<>%
mutate(vote = as.double(vote))
listenverbindungen_national <- read_excel(path = "input/nrwresultate_sim9515.xls")[1:158,]
# There are two errors in the national data; fix them right here
listenverbindungen_national[listenverbindungen_national$year == 2011 & listenverbindungen_national$party == "GLP",]$ap_s <- 5
listenverbindungen_national[listenverbindungen_national$year == 2011 & listenverbindungen_national$party == "GLP",]$s_noap <- 7
listenverbindungen_national[listenverbindungen_national$year == 2011 & listenverbindungen_national$party == "FDP",]$ap_s <- -1
listenverbindungen_national[listenverbindungen_national$year == 2011 & listenverbindungen_national$party == "FDP",]$s_noap <- 30
# doublecheck
listenverbindungen_kantonal %>% group_by(id_kt) %>%
summarize(total_seats = sum(abs(ap_s))) %>% arrange(desc(total_seats))
## Source: local data frame [130 x 2]
##
## id_kt total_seats
## 1 BE1995 4
## 2 BE2003 4
## 3 FR1999 4
## 4 SG1995 4
## 5 ZH2003 4
## 6 ZH2007 4
## 7 AG1995 2
## 8 AG1999 2
## 9 AG2003 2
## 10 AG2007 2
## .. ... ...
# check that sum is always 100
rundungsfehler <- listenverbindungen_kantonal %>%
group_by(year, kt) %>%
summarize(total_vote = sum(vote)) %>%
ungroup() %>%
arrange(desc(total_vote))
# mostly ok
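The check above that vote shares sum to 100 per canton and year is inspected by eye. As a minimal sketch, it could be automated with a tolerance using base R only. The toy data frame `votes`, the helper `check_vote_totals` and the tolerance of 0.5 percentage points are illustrative assumptions and not part of the original script.

```r
# Hypothetical toy data with the same columns as the cantonal input (year, kt, vote)
votes <- data.frame(
  year = c(2011, 2011, 2011, 2011, 2011),
  kt   = c("ZH", "ZH", "ZH", "BE", "BE"),
  vote = c(50.1, 30.0, 19.9, 60.0, 38.0),
  stringsAsFactors = FALSE
)

# Sum the vote shares per year and canton and flag totals that deviate
# from 100 by more than `tol` percentage points (assumed tolerance)
check_vote_totals <- function(df, tol = 0.5) {
  totals <- aggregate(vote ~ year + kt, data = df, FUN = sum)
  names(totals)[names(totals) == "vote"] <- "total_vote"
  totals$ok <- abs(totals$total_vote - 100) <= tol
  totals
}

vote_check <- check_vote_totals(votes)
# Show only the canton/year combinations that fail the check
print(vote_check[!vote_check$ok, ])
```

Comparing against a tolerance rather than testing for exact equality avoids false alarms from rounding in the source data.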
2015 data: cantonal
listenverbindungen_kantonal_2015 <- read.csv2(file = "input/lv15n.csv", sep = ",", stringsAsFactors = F)
# which rows are LPS in the Bochsler data?
listenverbindungen_kantonal_2015 %>%
filter(party == "lps" & vote > 0)
## [1] year kt party id_kt vote
## [6] seats others_name ap_s ap_i kname
## [11] s
## <0 rows> (or 0-length row.names)
# Wikipedia: The cantonal parties that still exist are now, together with their Free Democratic sister parties, part of the new liberal federal party.
listenverbindungen_kantonal_2015 %<>%
select(year, kt, party, ap_s, s, ap_i, vote, others_name ) %>%
rename(party_bochsler = party, party_bochsler_description = others_name) %>%
mutate(party_bochsler = as.factor(party_bochsler))
# Bochsler data: which "uebrige" (others) won seats?
listenverbindungen_kantonal_2015 %>%
filter(party_bochsler == "uebrige" & s > 0)
## [1] year kt
## [3] party_bochsler ap_s
## [5] s ap_i
## [7] vote party_bochsler_description
## <0 rows> (or 0-length row.names)
listenverbindungen_kantonal_2015 %>%
filter(party_bochsler == "uebrige" & ap_s != 0)
## [1] year kt
## [3] party_bochsler ap_s
## [5] s ap_i
## [7] vote party_bochsler_description
## <0 rows> (or 0-length row.names)
# this concerns the CSP
## Plausibility checks
# total number of seats
sum(listenverbindungen_kantonal_2015$s, na.rm = T)
## [1] 200
# the CSP seat in the canton of OW is still missing
# transform variables
listenverbindungen_kantonal_2015 %<>%
mutate(canton = kt, list_id = ap_i, seats_with = s, seat_difference = ap_s, seats_without = seats_with - seat_difference, party_strength = as.numeric(vote), party_abbr = party_bochsler, party_id = NA, party = party_bochsler_description)
# drop unneeded variables
listenverbindungen_kantonal_2015 %<>%
select(-ap_s, -s, -ap_i, -vote)
# we still need the list apparentment (LV) for matching with the others
# Plausibility checks
# sums to 100%
as.data.frame(listenverbindungen_kantonal_2015 %>%
group_by(year, canton) %>%
summarize(total_vote = sum(party_strength)) %>%
ungroup() %>%
arrange(desc(total_vote)))
## year canton total_vote
## 1 2015 GL 100
## 2 2015 SG 100
## 3 2015 AG 100
## 4 2015 NE 100
## 5 2015 FR 100
## 6 2015 VD 100
## 7 2015 BE 100
## 8 2015 GE 100
## 9 2015 SO 100
## 10 2015 OW 100
## 11 2015 UR 100
## 12 2015 ZG 100
## 13 2015 BL 100
## 14 2015 SH 100
## 15 2015 AI 100
## 16 2015 GR 100
## 17 2015 JU 100
## 18 2015 VS 100
## 19 2015 SZ 100
## 20 2015 BS 100
## 21 2015 TG 100
## 22 2015 AR 100
## 23 2015 ZH 100
## 24 2015 TI 100
## 25 2015 NW 100
## 26 2015 LU 100
# 200 seats
sum(listenverbindungen_kantonal_2015$seats_with)
## [1] 200
sum(listenverbindungen_kantonal_2015$seats_without)
## [1] 200
# determine party_id
listenverbindungen_kantonal_2015$party_id <- 8
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "bdp",]$party_id <- 32
# listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "ldu",]$party_id <- 8
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "svp",]$party_id <- 4
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "cvp",]$party_id <- 2
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "fdp",]$party_id <- 1
# listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "fga",]$party_id <- 9
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "gps",]$party_id <- 13
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "sp",]$party_id <- 3
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "cspo",]$party_id <- 8
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "pda",]$party_id <- 9
# listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "fps",]$party_id <- 16
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "evp",]$party_id <- 7
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "sd",]$party_id <- 16
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "sol",]$party_id <- 9
# listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "lps",]$party_id <- 1
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "lega",]$party_id <- 18
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "edu",]$party_id <- 16
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "glp",]$party_id <- 31
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "cspo",]$party <- "Karl Vogler (Christlichsoziale Partei Obwalden)"
listenverbindungen_kantonal_2015[listenverbindungen_kantonal_2015$party_abbr == "cspo",]$party_abbr <- "CSP"
## Warning in `[<-.factor`(`*tmp*`, iseq, value = "CSP"): invalid factor
## level, NA generated
# finalize
listenverbindungen_kantonal_2015 %<>%
select(year, canton, party, party_id, party_strength, seats_with, seats_without, seat_difference, list_id)
# save
write.csv(listenverbindungen_kantonal_2015, file = "output/lv_2015.csv", na = "", row.names = F)
#
# data <- read.csv("output/lv_2015.csv")
# fix(data)
# data[data$party_abbr == "AL" & data$canton == "ZH",]$party_strength <- data[data$party_abbr == "AL" & data$canton == "ZH",]$party_strength + data[data$party_abbr == "AL2" & data$canton == "ZH",]$party_strength
#
# data %<>% filter(party_abbr != "AL2")
#
# data %>% group_by(canton) %>% summarize(sum = sum(party_strength)) %>% as.data.frame()
#
# write.csv(data, file = "output_for_vis/lv_2015.csv", na = "", row.names = F)
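The repeated `party_id` assignments above could also be expressed as a single lookup. The following is a minimal sketch, not the authors' method: the IDs are copied from the assignments above (including the default of 8 for anything not listed), while the names `party_ids`, `lookup_party_id` and `example_abbr` are illustrative assumptions.

```r
# Mapping from Bochsler abbreviation to SRF party_id, copied from the
# assignments above; abbreviations not listed fall back to 8
party_ids <- c(bdp = 32, svp = 4, cvp = 2, fdp = 1, gps = 13, sp = 3,
               cspo = 8, pda = 9, evp = 7, sd = 16, sol = 9,
               lega = 18, edu = 16, glp = 31)

lookup_party_id <- function(abbr, default = 8) {
  # as.character() guards against factor input: indexing a named vector
  # with a factor would use the integer codes instead of the labels
  ids <- unname(party_ids[as.character(abbr)])
  ids[is.na(ids)] <- default
  ids
}

# Hypothetical example input; "mcr" is not in the table and gets the default
example_abbr <- factor(c("svp", "glp", "mcr"))
print(lookup_party_id(example_abbr))
```

Working with character vectors instead of factors would also avoid the "invalid factor level" warning shown above, which occurs when a value that is not an existing level is assigned to a factor column.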
Historical data: cantonal
lv_historisch <- listenverbindungen_kantonal
sum(lv_historisch$s)
## [1] 1000
# sum should be 1000 (5 elections x 200 seats)
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(s))
## Source: local data frame [5 x 2]
##
## year total_sitze
## 1 1995 200
## 2 1999 200
## 3 2003 200
## 4 2007 200
## 5 2011 200
# in 1995 the sum is not 200
as.data.frame(lv_historisch %>%
filter(year == 1995) %>%
group_by(kt) %>%
summarize(total_sitze = sum(s)) %>%
ungroup() %>%
arrange(desc(total_sitze)))
## kt total_sitze
## 1 ZH 34
## 2 BE 27
## 3 VD 17
## 4 AG 15
## 5 SG 12
## 6 GE 11
## 7 LU 10
## 8 TI 8
## 9 BL 7
## 10 SO 7
## 11 VS 7
## 12 BS 6
## 13 FR 6
## 14 TG 6
## 15 GR 5
## 16 NE 5
## 17 SZ 3
## 18 ZG 3
## 19 AR 2
## 20 JU 2
## 21 SH 2
## 22 AI 1
## 23 GL 1
## 24 NW 1
## 25 OW 1
## 26 UR 1
# Zurich seems to have one too many
as.data.frame(lv_historisch %>%
filter(year == 1995 & kt == "ZH") %>%
group_by(party) %>%
summarize(total_sitze = sum(s)) %>%
ungroup() %>%
arrange(desc(total_sitze)))
## party total_sitze
## 1 sp 9
## 2 svp 9
## 3 fdp 6
## 4 cvp 2
## 5 gps 2
## 6 ldu 2
## 7 evp 1
## 8 fga 1
## 9 fps 1
## 10 sd 1
## 11 csp 0
## 12 edu 0
## 13 glp 0
## 14 lega 0
## 15 lps 0
## 16 pda 0
## 17 sol 0
## 18 uebrige 0
# the FGA has one too many
# lv_historisch[lv_historisch$year == 1995 & lv_historisch$kt == "ZH" & lv_historisch$party == "fga" & lv_historisch$vote < 1,]$s <- 0
# BL as well
as.data.frame(lv_historisch %>%
filter(year == 1995 & kt == "BL") %>%
group_by(party) %>%
summarize(total_sitze = sum(s)) %>%
ungroup() %>%
arrange(desc(total_sitze)))
## party total_sitze
## 1 sp 2
## 2 cvp 1
## 3 fdp 1
## 4 gps 1
## 5 sd 1
## 6 svp 1
## 7 csp 0
## 8 edu 0
## 9 evp 0
## 10 fga 0
## 11 fps 0
## 12 glp 0
## 13 ldu 0
## 14 lega 0
## 15 lps 0
## 16 pda 0
## 17 sol 0
## 18 uebrige 0
# the FDP has one too many
# lv_historisch[lv_historisch$year == 1995 & lv_historisch$kt == "BL" & lv_historisch$party == "fdp" & lv_historisch$vote < 2,]$s <- 0
sum(lv_historisch$s)
## [1] 1000
# now it is fine
lv_historisch %<>%
group_by(year, kt, party) %>%
summarize(seats_with = sum(s), seat_difference = sum(ap_s)) %>%
ungroup() %>%
mutate(seats_without = seats_with - seat_difference)
sum(lv_historisch$seats_without)
## [1] 1000
# remove those that would not have won any seats either with or without list apparentments
lv_historisch %<>%
filter(seats_with > 0 | seats_without > 0)
# transform
lv_historisch %<>%
rename(canton = kt) %>%
mutate(party_abbr = party, party = NA, party_id = NA)
unique(lv_historisch$party_abbr)
## [1] "cvp" "fdp" "fps" "gps" "ldu" "sp" "svp"
## [8] "edu" "evp" "fga" "sd" "lps" "csp" "pda"
## [15] "lega" "sol" "glp" "bdp" "mcr" "uebrige"
# assign party_id
lv_historisch[lv_historisch$party_abbr == "bdp",]$party_id <- 32
lv_historisch[lv_historisch$party_abbr == "ldu",]$party_id <- 8
lv_historisch[lv_historisch$party_abbr == "svp",]$party_id <- 4
lv_historisch[lv_historisch$party_abbr == "cvp",]$party_id <- 2
lv_historisch[lv_historisch$party_abbr == "fps",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "fdp",]$party_id <- 1
lv_historisch[lv_historisch$party_abbr == "fga",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "gps",]$party_id <- 13
lv_historisch[lv_historisch$party_abbr == "sp",]$party_id <- 3
lv_historisch[lv_historisch$party_abbr == "csp",]$party_id <- 8
lv_historisch[lv_historisch$party_abbr == "pda",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "fps",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "evp",]$party_id <- 7
lv_historisch[lv_historisch$party_abbr == "sd",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "sol",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "lps",]$party_id <- 1
lv_historisch[lv_historisch$party_abbr == "lega",]$party_id <- 18
lv_historisch[lv_historisch$party_abbr == "edu",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "glp",]$party_id <- 31
lv_historisch[lv_historisch$party_abbr == "uebrige",]$party_id <- 99
lv_historisch[lv_historisch$party_abbr == "mcr",]$party_id <- 8
# count LPS as part of FDP
lv_historisch %<>%
mutate(lpsfdp = ifelse(party_abbr == "fdp" | party_abbr == "lps", "fdplps", party_abbr))
lv_historisch %<>%
group_by(lpsfdp, year, canton) %>%
summarise(party = first(party), party_abbr = first(party_abbr), party_id = first(party_id), seats_with = sum(seats_with), seat_difference = sum(seat_difference), seats_without = sum(seats_without)) %>%
ungroup() %>%
select(-lpsfdp)
lv_historisch %>%
filter(party_abbr == "lps")
## Source: local data frame [0 x 8]
##
## Variables not shown: year (int), canton (chr), party (lgl), party_abbr
## (chr), party_id (dbl), seats_with (int), seat_difference (dbl),
## seats_without (dbl)
# Plausibility checks
# 200 seats per election, i.e. 1000 in total
sum(lv_historisch$seats_with)
## [1] 1000
sum(lv_historisch$seats_without)
## [1] 1000
sum(lv_historisch$seat_difference)
## [1] 0
# write the result (party_id was assigned above)
write.csv(lv_historisch, file = "output/lv_historical.csv", na = "", row.names = F)
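The step that counts LPS as part of FDP boils down to relabelling and then summing the seat columns per group. A minimal base R sketch of that idea follows; the toy data frame `seats` contains invented numbers, and `aggregate` is used here instead of the dplyr pipeline above purely for illustration.

```r
# Hypothetical toy rows in the shape of lv_historisch before the merge
seats <- data.frame(
  year            = c(2007, 2007, 2007),
  canton          = c("VD", "VD", "VD"),
  party_abbr      = c("fdp", "lps", "sp"),
  seats_with      = c(3, 1, 4),
  seat_difference = c(0, 1, -1),
  stringsAsFactors = FALSE
)
seats$seats_without <- seats$seats_with - seats$seat_difference

# Relabel LPS rows as FDP, then sum the seat columns per year/canton/party
seats$party_abbr[seats$party_abbr == "lps"] <- "fdp"
merged <- aggregate(
  cbind(seats_with, seat_difference, seats_without) ~ year + canton + party_abbr,
  data = seats, FUN = sum
)
print(merged)
```

Because all three seat columns are summed, the relation seats_without = seats_with - seat_difference still holds for the merged row.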
2015 data: national
# how many seats go to the CSP?
listenverbindungen_national %>%
filter(party == "CSP")
## Source: local data frame [6 x 9]
##
## year party s ap_s s_noap apdummy voteap v ap_np
## 1 1995 CSP 1 1 0 2 22.20000 0.3 0.1538462
## 2 1999 CSP 1 1 0 3 34.40000 0.4 0.2692308
## 3 2003 CSP 1 1 0 1 38.86719 0.4 0.1538462
## 4 2007 CSP 1 1 0 3 29.92079 0.4 0.4230769
## 5 2011 CSP 0 0 0 4 26.70000 0.3 0.3461539
## 6 2015 CSP 0 0 0 2 NA NA NA
# Problem: in 2015 Karl Vogler is not listed as CSP but under "others", also in the BFS (Federal Statistical Office) data
# In the cantonal data for 2015, however, he is listed as CSP; OW is a majority-vote canton anyway
listenverbindungen2015_national <- listenverbindungen_national %>%
filter(year == 2015) %>%
mutate(party = tolower(party), party_abbr = party, party = NA, party_strength = v, seats_with = s, seats_without = s_noap, seat_difference = ap_s, canton = "CH", party_id = NA, list_id = NA) %>%
select(-apdummy, -voteap)
# Plausibility checks
# sums to 100%
sum(listenverbindungen2015_national$v)
## [1] NA
# not exactly 100%
# match against the original data, source: BFS (Federal Statistical Office)
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "svp",]$party_strength <- 29.4
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "cvp",]$party_strength <- 11.6
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "fdp",]$party_strength <- 16.4
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "gps",]$party_strength <- 7.1
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "sp",]$party_strength <- 18.8
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "csp",]$party_strength <- 0.2
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "cspo",]$party_strength <- 0
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "pda",]$party_strength <- 0.4
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "evp",]$party_strength <- 1.9
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "sd",]$party_strength <- 0.1
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "sol",]$party_strength <-0.5
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "lega",]$party_strength <- 1.0
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "edu",]$party_strength <- 1.2
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "glp",]$party_strength <- 4.6
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "uebrige",]$party_strength <- 2.4
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "bdp",]$party_strength <- 4.1
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "mcr",]$party_strength <- 0.3
# check again
sum(listenverbindungen2015_national$party_strength, na.rm = T)
## [1] 100
# 200 seats
sum(listenverbindungen2015_national$seats_with)
## [1] 200
sum(listenverbindungen2015_national$seats_without)
## [1] 200
# remove those with party_strength == 0
listenverbindungen2015_national %<>% filter(party_strength > 0)
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "uebrige",]$seats_with <- 1
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "uebrige",]$seats_without <- 1
sum(listenverbindungen2015_national$party_strength)
## [1] 100
sum(listenverbindungen2015_national$seats_with)
## [1] 200
sum(listenverbindungen2015_national$seats_without)
## [1] 200
sum(listenverbindungen2015_national$seat_difference)
## [1] 0
#
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "bdp",]$party_id <- 32
# listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "ldu",]$party_id <- 8
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "svp",]$party_id <- 4
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "cvp",]$party_id <- 2
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "fdp",]$party_id <- 1
# listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "fga",]$party_id <- 9
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "gps",]$party_id <- 13
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "sp",]$party_id <- 3
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "pda",]$party_id <- 9
# listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "fps",]$party_id <- 16
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "evp",]$party_id <- 7
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "sd",]$party_id <- 16
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "sol",]$party_id <- 9
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "lega",]$party_id <- 18
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "edu",]$party_id <- 16
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "glp",]$party_id <- 31
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "csp",]$party_id <- 8
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "mcr",]$party_id <- 8
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "uebrige",]$party_id <- 99
# add full names
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "sd",]$party <- "SD"
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "sol",]$party <- "Sol."
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "edu",]$party <- "EDU"
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "csp",]$party <- "CSP"
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "mcr",]$party <- "MCR"
listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "pda",]$party <- "PdA"
# listenverbindungen2015_national[listenverbindungen2015_national$party_abbr == "uebrige",]$party <- ""
# select variables
listenverbindungen2015_national %<>% select(year, canton, party, party_id, party_strength, seats_with, seats_without, seat_difference)
write.csv(listenverbindungen2015_national,file = "output/lv_2015_national.csv", na = "", row.names = F)
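The plausibility checks in this section (seats sum to 200, seat differences cancel out) are printed and compared by eye. As a hedged sketch, the same invariants could be asserted so that a violation stops the script. The function name `validate_seats` and the two-party toy data are assumptions made for illustration only.

```r
# Stop with an error if any of the seat invariants is violated
validate_seats <- function(df, total_seats = 200) {
  stopifnot(
    sum(df$seats_with) == total_seats,
    sum(df$seats_without) == total_seats,
    sum(df$seat_difference) == 0,
    all(df$seat_difference == df$seats_with - df$seats_without)
  )
  invisible(TRUE)
}

# Hypothetical two-party example, not real election data
toy <- data.frame(
  party_abbr      = c("a", "b"),
  seats_with      = c(120, 80),
  seats_without   = c(118, 82),
  seat_difference = c(2, -2)
)
validate_seats(toy)
```

For the historical files, which span five elections, the function would be called once per election year or with `total_seats = 1000`.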
Historical data: national
# Plausibility check
listenverbindungen_national %>%
group_by(year) %>%
summarize(total_sitze = sum(s))
## Source: local data frame [6 x 2]
##
## year total_sitze
## 1 1995 200
## 2 1999 200
## 3 2003 200
## 4 2007 200
## 5 2011 200
## 6 2015 200
# same problem as in the cantonal data: 1995 has two seats too many
lv_historisch <- listenverbindungen_national %>%
filter(year < 2015)
# the FGA has one too many
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FGA",]$s <- 2
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FGA",]$ap_s <- 1
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FGA",]$s_noap <- 1
# the FDP has one too many
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FDP",]$s <- 45
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FDP",]$s_noap <- 48
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(s))
## Source: local data frame [5 x 2]
##
## year total_sitze
## 1 1995 200
## 2 1999 200
## 3 2003 200
## 4 2007 200
## 5 2011 200
# Plausibility check
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(ap_s))
## Source: local data frame [5 x 2]
##
## year total_sitze
## 1 1995 0
## 2 1999 0
## 3 2003 0
## 4 2007 0
## 5 2011 0
# zero everywhere, good
# keep only those that won at least one seat
lv_historisch %<>%
filter(s > 0)
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(s))
## Source: local data frame [5 x 2]
##
## year total_sitze
## 1 1995 200
## 2 1999 200
## 3 2003 200
## 4 2007 200
## 5 2011 200
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(s_noap))
## Source: local data frame [5 x 2]
##
## year total_sitze
## 1 1995 200
## 2 1999 200
## 3 2003 200
## 4 2007 200
## 5 2011 200
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(ap_s))
## Source: local data frame [5 x 2]
##
## year total_sitze
## 1 1995 0
## 2 1999 0
## 3 2003 0
## 4 2007 0
## 5 2011 0
# good
# transform variables
lv_historisch %<>%
mutate(party = tolower(party), party_abbr = party, party = NA, party_strength = v, seats_with = s, seats_without = s_noap, seat_difference = ap_s, canton = "CH", party_id = NA, list_id = NA) %>%
select(-apdummy, -voteap)
lv_historisch[lv_historisch$party_abbr == "ldu",]$party_id <- 8
lv_historisch[lv_historisch$party_abbr == "svp",]$party_id <- 4
lv_historisch[lv_historisch$party_abbr == "cvp",]$party_id <- 2
lv_historisch[lv_historisch$party_abbr == "fps",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "fdp",]$party_id <- 1
lv_historisch[lv_historisch$party_abbr == "fga",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "gps",]$party_id <- 13
lv_historisch[lv_historisch$party_abbr == "sp",]$party_id <- 3
lv_historisch[lv_historisch$party_abbr == "csp",]$party_id <- 8
lv_historisch[lv_historisch$party_abbr == "pda",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "fps",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "evp",]$party_id <- 7
lv_historisch[lv_historisch$party_abbr == "sd",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "sol",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "lps",]$party_id <- 1
lv_historisch[lv_historisch$party_abbr == "lega",]$party_id <- 18
lv_historisch[lv_historisch$party_abbr == "edu",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "glp",]$party_id <- 31
lv_historisch[lv_historisch$party_abbr == "uebrige",]$party_id <- 99
lv_historisch[lv_historisch$party_abbr == "bdp",]$party_id <- 32
lv_historisch[lv_historisch$party_abbr == "mcr",]$party_id <- 8
# select variables
lv_historisch %<>% select(year, canton, party, seats_with, seat_difference, seats_without, party_abbr, party_id)
# count LPS as part of FDP
lv_historisch %<>%
mutate(lpsfdp = ifelse(party_abbr == "fdp" | party_abbr == "lps", "fdplps", party_abbr))
lv_historisch %<>%
group_by(lpsfdp, year, canton) %>%
summarise(party = first(party), party_id = first(party_id),seats_with = sum(seats_with), seat_difference = sum(seat_difference), seats_without = sum(seats_without)) %>%
ungroup() %>%
select(-lpsfdp)
# lv_historisch %>%
# filter(party_abbr == "lps")
# Plausibility checks
# 1000 seats
sum(lv_historisch$seats_with)
## [1] 1000
sum(lv_historisch$seats_without)
## [1] 1000
sum(lv_historisch$seat_difference)
## [1] 0
write.csv(lv_historisch,file = "output/lv_historical_national.csv", na = "", row.names = F)
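All output files are written with `na = ""`, so missing values such as an unset party name end up as empty fields. The following sketch shows one way a reader might load such a file so that empty fields come back as NA. It uses a temporary file and invented rows rather than the real output, and the object names are illustrative.

```r
# Hypothetical rows in the shape of the national output files
out <- data.frame(
  year     = c(2011L, 2011L),
  canton   = "CH",
  party    = c(NA, "EDU"),
  party_id = c(4L, 16L),
  stringsAsFactors = FALSE
)

# Write the same way as the script above: NA becomes an empty field
path <- tempfile(fileext = ".csv")
write.csv(out, file = path, na = "", row.names = FALSE)

# Empty fields are read back as NA when na.strings includes ""
back <- read.csv(path, na.strings = "", stringsAsFactors = FALSE)
print(back)
```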