This document illustrates the preprocessing of the dataset visualized in this article on srf.ch.
SRF Data attaches great importance to transparent and reproducible data preprocessing and -analysis. SRF Data believes in the principles of open data but also open and reproducible methods. Third parties should be empowered to build on the work of SRF Data and to generate new analyses and applications.
The preprocessing and analysis of the data was conducted in the R project for statistical computing. The RMarkdown script used to generate this document and all the resulting data can be downloaded under this link. Through executing main.Rmd
, the herein described process can be reproduced and this document can be generated. In the course of this, data from the folder ìnput
will be processed and results will be written to output
.
Attention: Please set your working directory in the first code chunk!
The code for the herein described process can also be freely downloaded from https://github.com/srfdata/2015-06-elections-list-apparentments. Criticism in the form of GitHub issues and pull requests are very welcome!
2015-06-elections-list-apparentments by SRF Data is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The published information has been collated carefully, but no guarantee is offered of its completeness, correctness or up-to-date nature. No liability is accepted for damage or loss incurred from the use of this script or the information drawn from it. This exclusion of liability also applies to third-party content that is accessible via this offer.
All code & data from SRF Data is available under http://srfdata.github.io.
input/lv9511n.csv
- Effects of list apparentments on the cantonal level, 1995-2011. Data source: Daniel Bochsler, NCCR Democracy.input/nrwresultate_sim9511.xls
- Effects of list apparentments on the national level, 1995-2011. Data source: Daniel Bochsler, NCCR Democracy.input/stimmen_der_wahllisten_nach_parteien_und_kantonen_clean.xls
- Party strengths and list definitions on the cantonal level, 2011. Data source: BFS. The formatting was removed manually in order to make reading the data into R easier.input/listenverbindungen_bfs_others_unedited.csv
- intermediary file that was used because data needed to be corrected manually, not relevant.input/listenverbindungen_bfs_others_edited.csv
- intermediary file that was used because data needed to be corrected manually, not relevant. If you need to know what manual corrections were made, look at the diff between *_unedited.csv
and *_edited.csv
.The following sections describe the results of the data preprocessing as stored in the output
folder.
output/parties.csv
Contains party classifications made by SRF Data with the help of political scientists, used throughout all projects related to elections.
Attribute | Type | Description |
---|---|---|
ID | Integer | Unique identifier |
Abbr_* | String | Abbreviation in German (D), French (F), English (E), Romansh (R), Italian (I), respectively |
Legend_* | String | Abbreviation, but with slightly more information, used for frontend purposes |
Name_* | String | Full name |
Sortorder | Integer | Used for frontend purposes solely |
OLD_ID | String | “Official” ID as given in https://github.com/srfdata/2015-06-elections-partystrengths/blob/master/analysis/input/parteienstaerke_mod_2.xlsx (sheet “Parteien”), used for combining party strengths for party groupings |
output/lv_2011.csv
Contains effects of list apparentments on the cantonal level, 2011, as derived from input/lv9511n.csv
, with the help of input/stimmen_der_wahllisten_nach_parteien_und_kantonen_clean.xls
.
Attribute | Type | Description |
---|---|---|
year | Integer | Election year |
canton | String | Official cantonal abbreviation |
party | String | Contains the party name, but only if it belongs to a group in output/parties.csv (e.g. id == 8 ) |
party_id | Integer | Party or party grouping, referencing ID in output/parties.csv |
party_strength | Double | Party strength in percent |
seats_with | String | Actual, resulting seats |
seat_difference | String | seats_with - seats_without |
seats_without | String | Seats that would have resulted without the possibility of list apparentments |
list_id | String | The cantonal list the party was on in 2011 |
output/lv_2011_national.csv
Contains effects of list apparentments on the national level, 2011, as derived from input/nrwresultate_sim9511.xls
.
Attribute | Type | Description |
---|---|---|
year | Integer | Election year |
party | String | Contains the party name, but only if it belongs to a group in output/parties.csv (e.g. id == 8 ) |
party_id | Integer | Party or party grouping, referencing ID in output/parties.csv |
party_strength | Double | Party strength in percent |
seats_with | String | Actual, resulting seats |
seats_without | String | Seats that would have resulted without the possibility of list apparentments |
seat_difference | String | seats_with - seats_without |
output/lv_historical.csv
Contains effects of list apparentments on the cantonal level, 1995 - 2007, as derived from input/lv9511n.csv
.
Attribute | Type | Description |
---|---|---|
year | Integer | Election year |
canton | String | Official cantonal abbreviation |
party | String | Contains the party name, but only if it belongs to a group in output/parties.csv (e.g. id == 8 ) |
party_id | Integer | Party or party grouping, referencing ID in output/parties.csv |
party_strength | Double | Party strength in percent |
seats_with | String | Actual, resulting seats |
seat_difference | String | seats_with - seats_without |
seats_without | String | Seats that would have resulted without the possibility of list apparentments |
output/lv_historical_national.csv
Contains effects of list apparentments on the national level, 1995 - 2007, as derived from input/nrwresultate_sim9511.xls
.
Attribute | Type | Description |
---|---|---|
year | Integer | Election year |
party | String | Contains the party name, but only if it belongs to a group in output/parties.csv (e.g. id == 8 ) |
party_id | Integer | Party or party grouping, referencing ID in output/parties.csv |
party_strength | Double | Party strength in percent |
seats_with | String | Actual, resulting seats |
seat_difference | String | seats_with - seats_without |
seats_without | String | Seats that would have resulted without the possibility of list apparentments |
# von https://mran.revolutionanalytics.com/web/packages/checkpoint/vignettes/using-checkpoint-with-knitr.html
cat("library(magrittr)
library(tidyr)
library(dplyr)
library(readxl)
library(ggplot2)",
file = "manifest.R")
package_date <- "2015-08-27"
if(!require(checkpoint)) {
if(!require(devtools)){
install.packages("devtools", repos = "http://cran.us.r-project.org")
require(devtools)
}
devtools::install_github("checkpoint", username = "RevolutionAnalytics", ref = "v0.3.2", repos = "http://cran.us.r-project.org")
require(checkpoint)
}
if(!dir.exists("~/.checkpoint")){
dir.create("~/.checkpoint")
}
checkpoint(snapshotDate = package_date, project = path_to_wd, verbose = T, scanForPackages = T, use.knitr = F)
rm(package_date)
source("manifest.R")
unlink("manifest.R")
The following comments are unfortunately in German
# 1 Listenverbindungen in den Nationalratswahlen in den Jahren 1995-2011, nach Kantonen und Parteien
listenverbindungen_kantonal <- read.csv2(file = "input/lv9511n.csv", sep = ",", stringsAsFactors = F)
# Es gibt zwei Fehler in den kantonalen Daten, gleich hier beheben
listenverbindungen_kantonal[listenverbindungen_kantonal$party == "glp" & listenverbindungen_kantonal$year == 2011 & listenverbindungen_kantonal$kt == "TG",]$ap_s <- 1
listenverbindungen_kantonal[listenverbindungen_kantonal$party == "fdp" & listenverbindungen_kantonal$year == 2011 & listenverbindungen_kantonal$kt == "TG",]$ap_s <- -1
listenverbindungen_kantonal %<>%
mutate(vote = as.double(vote))
listenverbindungen_national <- read_excel(path = "input/nrwresultate_sim9511.xls")[1:140,]
# Es gibt zwei Fehler in den nationalen Daten, gleich hier beheben
listenverbindungen_national[listenverbindungen_national$year == 2011 & listenverbindungen_national$party == "GLP",]$ap_s <- 5
listenverbindungen_national[listenverbindungen_national$year == 2011 & listenverbindungen_national$party == "GLP",]$s_noap <- 7
listenverbindungen_national[listenverbindungen_national$year == 2011 & listenverbindungen_national$party == "FDP",]$ap_s <- -1
listenverbindungen_national[listenverbindungen_national$year == 2011 & listenverbindungen_national$party == "FDP",]$s_noap <- 30
# doublecheck
listenverbindungen_kantonal %>% group_by(id_kt) %>%
summarize(total_seats = sum(abs(ap_s))) %>% arrange(desc(total_seats))
## Source: local data frame [130 x 2]
##
## id_kt total_seats
## 1 BE1995 4
## 2 BE2003 4
## 3 FR1999 4
## 4 SG1995 4
## 5 ZH2003 4
## 6 ZH2007 4
## 7 AG1995 2
## 8 AG1999 2
## 9 AG2003 2
## 10 AG2007 2
## .. ... ...
# check that sum is always 100
rundungsfehler <- listenverbindungen_kantonal %>%
group_by(year, kt) %>%
summarize(total_vote = sum(vote)) %>%
ungroup() %>%
arrange(desc(total_vote))
# mostly ok
parties <- read.csv("input/parties.csv") # same here
write.csv(parties, file = "output/parties.csv", row.names = F, na = "")
# neuer Approach: Daten aus BFS-Listen-Datensatz
# Datensatz einlesen
listenverbindungen_bfs <- read_excel(path = "input/stimmen_der_wahllisten_nach_parteien_und_kantonen_clean.xls", col_types = c("numeric", "text", "numeric", "text", "text", "text", "text", "text", "numeric"))[1:384,]
## DEFINEDNAME: 20 00 00 01 0b 00 00 00 01 00 00 00 00 00 00 07 3b 00 00 00 00 00 00 00 00 ff 00
## DEFINEDNAME: 00 00 00 0d 0b 00 00 00 00 00 00 00 00 00 00 4c 49 53 54 45 4e 53 54 49 4d 4d 45 4e 3b 01 00 00 00 32 01 00 00 07 00
## DEFINEDNAME: 00 00 00 1a 0b 00 00 00 01 00 00 00 00 00 00 45 78 63 65 6c 5f 42 75 69 6c 74 49 6e 5f 50 72 69 6e 74 5f 54 69 74 6c 65 73 3b 00 00 00 00 00 00 00 00 fc 00
## DEFINEDNAME: 00 00 00 0d 0b 00 00 00 01 00 00 00 00 00 00 4c 49 53 54 45 4e 53 54 49 4d 4d 45 4e 3b 00 00 00 00 80 01 00 00 08 00
## DEFINEDNAME: 20 00 00 01 0b 00 00 00 01 00 00 00 00 00 00 07 3b 00 00 00 00 00 00 00 00 ff 00
## DEFINEDNAME: 00 00 00 0d 0b 00 00 00 00 00 00 00 00 00 00 4c 49 53 54 45 4e 53 54 49 4d 4d 45 4e 3b 01 00 00 00 32 01 00 00 07 00
## DEFINEDNAME: 00 00 00 1a 0b 00 00 00 01 00 00 00 00 00 00 45 78 63 65 6c 5f 42 75 69 6c 74 49 6e 5f 50 72 69 6e 74 5f 54 69 74 6c 65 73 3b 00 00 00 00 00 00 00 00 fc 00
## DEFINEDNAME: 00 00 00 0d 0b 00 00 00 01 00 00 00 00 00 00 4c 49 53 54 45 4e 53 54 49 4d 4d 45 4e 3b 00 00 00 00 80 01 00 00 08 00
# plausbilitätschecks
rundungsfehler <- listenverbindungen_bfs %>%
group_by(year, canton) %>%
summarize(total_vote = sum(party_strength)) %>%
ungroup() %>%
arrange(desc(total_vote))
# always ok
# party -> kleinschreiben und als faktor
listenverbindungen_bfs %<>%
mutate(party = as.factor(tolower(party)))
# welche parties gibt es?
unique(listenverbindungen_bfs$party)
## [1] svp sp fdp gps cvp glp evp übrige pda edu
## [11] bdp sd csp fdp/lp lega sol mcr
## 17 Levels: bdp csp cvp edu evp fdp fdp/lp glp gps lega mcr pda sd ... übrige
length(unique(listenverbindungen_bfs$party))
## [1] 17
# Datensatz in party == "Übrige" und restliche aufspalten
listenverbindungen_bfs_parties <- as.data.frame(split(listenverbindungen_bfs, listenverbindungen_bfs$party == "übrige")[1]$`FALSE`)
listenverbindungen_bfs_others <- as.data.frame(split(listenverbindungen_bfs, listenverbindungen_bfs$party == "übrige")[2]$`TRUE`)
# Für parties: Summation der party_strength gruppiert nach partei
listenverbindungen_bfs_parties_summed <- listenverbindungen_bfs_parties %>%
group_by(year, canton, party) %>%
summarize(lv, party_strength = sum(party_strength)) %>%
ungroup()
# Plausbilitätschecks
listenverbindungen_bfs_parties_summed %>% filter(canton == "BE")
## Source: local data frame [11 x 5]
##
## year canton party lv party_strength
## 1 2011 BE bdp NA 14.8802936
## 2 2011 BE cvp B 2.0665817
## 3 2011 BE edu NA 3.1039152
## 4 2011 BE evp B 4.1735028
## 5 2011 BE fdp A 8.6555962
## 6 2011 BE glp B 5.2913954
## 7 2011 BE gps C 9.4147215
## 8 2011 BE pda C 0.3426402
## 9 2011 BE sd NA 0.5609540
## 10 2011 BE sp C 19.3274994
## 11 2011 BE svp D 28.9880443
listenverbindungen_kantonal %>% filter(kt == "BE" & year == 2011 & party != "uebrige")
## year kt party id_kt vote seats others_name ap_s ap_i kname s
## 1 2011 BE bdp BE2011 14.9 NA 0 Bern 4
## 2 2011 BE csp BE2011 0.0 NA 0 Bern 0
## 3 2011 BE cvp BE2011 2.1 NA 0 BE11C Bern 0
## 4 2011 BE edu BE2011 3.1 NA 0 Bern 0
## 5 2011 BE evp BE2011 4.2 NA 0 BE11C Bern 1
## 6 2011 BE fdp BE2011 8.7 NA 0 Bern 2
## 7 2011 BE glp BE2011 5.3 NA 1 BE11C Bern 2
## 8 2011 BE gps BE2011 9.4 NA 0 BE11B Bern 3
## 9 2011 BE lega BE2011 0.0 NA 0 Bern 0
## 10 2011 BE lps BE2011 0.0 NA 0 Bern 0
## 11 2011 BE mcr BE2011 0.0 NA 0 Bern 0
## 12 2011 BE pda BE2011 0.3 NA 0 BE11B Bern 0
## 13 2011 BE sd BE2011 0.6 NA 0 Bern 0
## 14 2011 BE sol BE2011 0.0 NA 0 Bern 0
## 15 2011 BE sp BE2011 19.3 NA 0 BE11B Bern 6
## 16 2011 BE svp BE2011 29.0 NA -1 BE11A Bern 8
# ok
# Zusätzliche Spalten hinzufügen
listenverbindungen_bfs_parties_summed %<>% mutate(party_abbr = party, party = NA, party_id = NA, seats_with = NA, seats_without = NA, seat_difference = NA, list_id = ifelse(!is.na(lv), paste(canton, year, lv, sep = ""), ""))
# Differenzen in den Partei-Bezeichnungen
listenverbindungen_kantonal_2011 <- listenverbindungen_kantonal %>% filter(year == 2011)
setdiff(unique(listenverbindungen_bfs_parties_summed$party_abbr), unique(listenverbindungen_kantonal_2011$party))
## [1] "fdp/lp"
setdiff(unique(listenverbindungen_kantonal_2011$party), unique(listenverbindungen_bfs_parties_summed$party_abbr))
## [1] "lps" "uebrige"
# welche sind fdp/lp in den BFS-Daten?
listenverbindungen_bfs_parties_summed %>%
filter(party_abbr == "fdp/lp" & party_strength > 0)
## Source: local data frame [2 x 11]
##
## year canton party lv party_strength party_abbr party_id seats_with
## 1 2011 BS NA A 6.826345 fdp/lp NA NA
## 2 2011 VD NA A 5.727088 fdp/lp NA NA
## Variables not shown: seats_without (lgl), seat_difference (lgl), list_id
## (chr)
# zu lps umbenennen, damit die Daten gematcht werden können
listenverbindungen_bfs_parties_summed$party_abbr <- factor(listenverbindungen_bfs_parties_summed$party_abbr, levels = c(levels(listenverbindungen_bfs_parties_summed$party_abbr), "lps"))
listenverbindungen_bfs_parties_summed[listenverbindungen_bfs_parties_summed$party_abbr == "fdp/lp",]$party_abbr <- "lps"
setdiff(unique(listenverbindungen_bfs_parties_summed$party_abbr), unique(listenverbindungen_kantonal_2011$party))
## character(0)
setdiff(unique(listenverbindungen_kantonal_2011$party), unique(listenverbindungen_bfs_parties_summed$party_abbr))
## [1] "uebrige"
listenverbindungen_bfs_parties_summed %>%
filter(party_abbr == "lps" & party_strength > 0)
## Source: local data frame [2 x 11]
##
## year canton party lv party_strength party_abbr party_id seats_with
## 1 2011 BS NA A 6.826345 lps NA NA
## 2 2011 VD NA A 5.727088 lps NA NA
## Variables not shown: seats_without (lgl), seat_difference (lgl), list_id
## (chr)
# welche sind LPS in den Bochsler-Daten?
listenverbindungen_kantonal_2011 %>%
filter(party == "lps" & vote > 0)
## year kt party id_kt vote seats others_name ap_s ap_i kname s
## 1 2011 BS lps BS2011 6.8 NA 0 BS11A Basel-Stadt 0
## 2 2011 VD lps VD2011 5.7 NA 0 VD11B Waadt 1
# Wikipedia: Die noch bestehenden kantonalen Parteien sind nun zusammen mit ihren freisinnigen Schwesterparteien Teil der neuen liberalen Bundespartei.
listenverbindungen_kantonal_2011 %<>%
select(year, kt, party, ap_s, s, ap_i, vote) %>%
rename(party_bochsler = party) %>%
mutate(party_bochsler = as.factor(party_bochsler))
# Bochsler-Daten: Welche "uebrige" haben Sitze gewonnen?
listenverbindungen_kantonal_2011 %>%
filter(party_bochsler == "uebrige" & s > 0)
## year kt party_bochsler ap_s s ap_i vote
## 1 2011 OW uebrige 0 1 56.9
listenverbindungen_kantonal_2011 %>%
filter(party_bochsler == "uebrige" & ap_s != 0)
## [1] year kt party_bochsler ap_s
## [5] s ap_i vote
## <0 rows> (or 0-length row.names)
# es handelt sich um die csp
# Bochsler-Daten: matchen (Gewonnene Sitze und LV-Auswirkungen)
listenverbindungen_parties_combined <- listenverbindungen_bfs_parties_summed %>% left_join(listenverbindungen_kantonal_2011, by = c("year", "canton" = "kt", "party_abbr" = "party_bochsler"))
## Plausibilitätschecks
# Gesamtsumme Sitze
sum(listenverbindungen_parties_combined$s, na.rm = T)
## [1] 199
# 199, der Sitz der CSP im Kanton OW fehlt noch
# überprüfen, ob party_strength überall ungefähr gleich ist, sonst korrigieren
which((listenverbindungen_parties_combined$party_strength >= listenverbindungen_parties_combined$vote - 0.1 & listenverbindungen_parties_combined$party_strength <= listenverbindungen_parties_combined$vote + 0.1) == F)
## [1] 129 130 131
listenverbindungen_parties_combined[c(129,130,131),]
## Source: local data frame [3 x 15]
##
## year canton party lv party_strength party_abbr party_id seats_with
## 1 2011 SZ NA C 3.831282 gps NA NA
## 2 2011 SZ NA C 15.657458 sp NA NA
## 3 2011 SZ NA D 37.984713 svp NA NA
## Variables not shown: seats_without (lgl), seat_difference (lgl), list_id
## (chr), ap_s (dbl), s (int), ap_i (chr), vote (dbl)
# Erklärung:
# 1. GPS + Atomfrei werden im BFS Datensatz als GPS gerechnet, im Bochsler-Datensatz als Übrige
# 2. SP + Frischer Wind + Gewerkschaftsbund werden im BFS Datensatz als SP gerechnet, im Bochsler-Datensatz als Übrige
# 3. SVP + Aktive Senioren werden im BFS Datensatz als als SVP gerechnet, im Bochsler-Datensatz Aktive Senioren als übrige
# überprüfen, wo nicht in beiden Datensätzen von einer LV gesprochen wird
a <- which(!is.na(listenverbindungen_parties_combined$lv) & listenverbindungen_parties_combined$ap_i == "")
# und umgekehrt
b <- which(is.na(listenverbindungen_parties_combined$lv) & listenverbindungen_parties_combined$ap_i != "")
# Fazit: Viele LV kommen bei Bochsler nicht vor (sind auch keine zwischenparteilichen LV)
# Eine kommt bei Bochsler vor, aber im BFS-Datensatz nicht, überprüfen:
listenverbindungen_parties_combined[c(b),]
## Source: local data frame [1 x 15]
##
## year canton party lv party_strength party_abbr party_id seats_with
## 1 2011 SO NA NA 0.4855171 edu NA NA
## Variables not shown: seats_without (lgl), seat_difference (lgl), list_id
## (chr), ap_s (dbl), s (int), ap_i (chr), vote (dbl)
listenverbindungen_kantonal_2011 %>% filter(ap_i == "SO11A")
## year kt party_bochsler ap_s s ap_i vote
## 1 2011 SO edu 0 0 SO11A 0.5
## 2 2011 SO svp 0 2 SO11A 24.3
listenverbindungen_kantonal_2011 %>% filter(kt == "SO" & ap_s != 0)
## year kt party_bochsler ap_s s ap_i vote
## 1 2011 SO cvp 1 2 SO11C 17.9
## 2 2011 SO fdp -1 1 SO11B 18.4
# Laut Bochsler sind EDU und SVP im Jahr 2011 in Solothurn eine LV eingegangen, es hat aber nichts gebracht, und die LV haben nur CVP und FDP beeinflusst
# Beschluss: wir übernehmen die LV von Bochsler
# variablen transformieren
listenverbindungen_parties_combined %<>%
mutate(list_id = ap_i, seats_with = s, seat_difference = ap_s, seats_without = seats_with - seat_difference)
# unnötige variablen rausschmeissen
listenverbindungen_parties_combined %<>%
select(-ap_s, -s, -ap_i, -vote)
# LV brauchen wir noch für das Matching mit den übrigen
# Übrige: Variablen hinzufügen + manuell bearbeiten
listenverbindungen_bfs_others %<>%
select(-lnr, -lnroff, -ulv) %>%
mutate(party = name, party_abbr = NA, seats_with = 0, seats_without = 0, seat_difference = 0)
# Der Sitz der CSP muss noch eingetragen werden
listenverbindungen_bfs_others[listenverbindungen_bfs_others$canton == "OW",]$seats_with <- 1
listenverbindungen_bfs_others[listenverbindungen_bfs_others$canton == "OW",]$seats_without <- 1
listenverbindungen_bfs_others[listenverbindungen_bfs_others$canton == "OW",]$party_abbr <- "csp"
write.csv(listenverbindungen_bfs_others, file = "input/listenverbindungen_bfs_others_unedited.csv", na = "")
# EDITIEREN
# wieder einlesen
listenverbindungen_bfs_others <- read.csv2(file = "input/listenverbindungen_bfs_others_edited.csv", sep = ",", stringsAsFactors = F)
# transformieren
listenverbindungen_bfs_others %<>%
mutate(party = name) %>%
mutate(year = as.numeric(year), party_strength = as.numeric(party_strength)) %>%
mutate(list_id = NA, party_id = NA) %>%
select(year, canton, party, lv, party_strength, party_abbr, party_id, seats_with, seats_without, seat_difference, list_id)
str(listenverbindungen_bfs_others)
## 'data.frame': 49 obs. of 11 variables:
## $ year : num 2011 2011 2011 2011 2011 ...
## $ canton : chr "ZH" "ZH" "ZH" "ZH" ...
## $ party : chr "Anti PowerPoint Partei" "Tierpartei" "Piratenpartei" "Subitas" ...
## $ lv : chr "" "B" "E" "" ...
## $ party_strength : num 0.0346 0.4002 0.8614 0.0143 0.0615 ...
## $ party_abbr : chr "APPP" "TPS" "PP" "Sub." ...
## $ party_id : logi NA NA NA NA NA NA ...
## $ seats_with : int 0 0 0 0 0 0 0 0 0 0 ...
## $ seats_without : int 0 0 0 0 0 0 0 0 0 0 ...
## $ seat_difference: int 0 0 0 0 0 0 0 0 0 0 ...
## $ list_id : logi NA NA NA NA NA NA ...
str(listenverbindungen_parties_combined)
## Classes 'tbl_df', 'tbl' and 'data.frame': 188 obs. of 11 variables:
## $ year : num 2011 2011 2011 2011 2011 ...
## $ canton : chr "AG" "AG" "AG" "AG" ...
## $ party : logi NA NA NA NA NA NA ...
## $ lv : chr "B" "B" "D" "E" ...
## $ party_strength : num 6.14 10.59 1.17 3.22 11.48 ...
## $ party_abbr : chr "bdp" "cvp" "edu" "evp" ...
## $ party_id : logi NA NA NA NA NA NA ...
## $ seats_with : int 1 1 0 0 2 1 1 0 3 6 ...
## $ seats_without : num 1 1 0 0 2 1 1 0 3 6 ...
## $ seat_difference: num 0 0 0 0 0 0 0 0 0 0 ...
## $ list_id : chr "AG11B" "AG11B" "AG11A" "AG11C" ...
# anderer Datensatz auch noch einmal transformieren
listenverbindungen_parties_combined %<>% mutate(party = as.character(party))
# Datensätze wieder zusammenführen
listenverbindungen_bfs_final <- rbind(listenverbindungen_parties_combined, listenverbindungen_bfs_others)
# Plausibilitätschecks
# Summation zu 100%
as.data.frame(listenverbindungen_bfs_final %>%
group_by(year, canton) %>%
summarize(total_vote = sum(party_strength)) %>%
ungroup() %>%
arrange(desc(total_vote)))
## year canton total_vote
## 1 2011 BE 100
## 2 2011 TI 100
## 3 2011 ZG 100
## 4 2011 GL 100
## 5 2011 FR 100
## 6 2011 UR 100
## 7 2011 AR 100
## 8 2011 SH 100
## 9 2011 SO 100
## 10 2011 AI 100
## 11 2011 VD 100
## 12 2011 GR 100
## 13 2011 JU 100
## 14 2011 NE 100
## 15 2011 NW 100
## 16 2011 TG 100
## 17 2011 BL 100
## 18 2011 SG 100
## 19 2011 BS 100
## 20 2011 GE 100
## 21 2011 OW 100
## 22 2011 VS 100
## 23 2011 AG 100
## 24 2011 ZH 100
## 25 2011 SZ 100
## 26 2011 LU 100
# 200 Sitze
sum(listenverbindungen_bfs_final$seats_with)
## [1] 200
sum(listenverbindungen_bfs_final$seats_without)
## [1] 200
# party_id ermitteln
listenverbindungen_bfs_final$party_id <- 8
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "bdp",]$party_id <- 32
# listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "ldu",]$party_id <- 8
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "svp",]$party_id <- 4
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "cvp",]$party_id <- 2
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "fdp",]$party_id <- 1
# listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "fga",]$party_id <- 9
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "gps",]$party_id <- 13
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "sp",]$party_id <- 3
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "CSP",]$party_id <- 8
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "pda",]$party_id <- 9
# listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "fps",]$party_id <- 16
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "evp",]$party_id <- 7
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "sd",]$party_id <- 16
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "sol",]$party_id <- 9
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "lps",]$party_id <- 1
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "lega",]$party_id <- 18
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "edu",]$party_id <- 16
listenverbindungen_bfs_final[listenverbindungen_bfs_final$party_abbr == "glp",]$party_id <- 31
# Listenverbindungs-Code neu berechnen
list_ids_per_canton <- listenverbindungen_bfs_final %>%
filter(!is.na(lv) & lv != "") %>%
group_by(canton, lv) %>%
slice(1) %>%
select(canton, lv, list_id) %>%
ungroup()
computeListId <- function(row){
# party ist nur bei "übrigen" ausgefüllt
if(is.na(row$party)){
return(row$list_id)
} else {
# perform lookup
return(as.character(list_ids_per_canton[list_ids_per_canton$canton == row$canton & list_ids_per_canton$lv == row$lv,]$list_id))
}
}
list_ids <- as.character(by(listenverbindungen_bfs_final, 1:nrow(listenverbindungen_bfs_final), computeListId))
list_ids[list_ids == "character(0)"] <- NA
list_ids[list_ids == ""] <- NA
# ersetzen
listenverbindungen_bfs_final$list_id <- list_ids
# letzte, manuelle korrekturen
listenverbindungen_bfs_final[listenverbindungen_bfs_final$canton == "BE" & listenverbindungen_bfs_final$lv == "E" & !is.na(listenverbindungen_bfs_final$party),]$list_id <- "BE11D"
listenverbindungen_bfs_final[listenverbindungen_bfs_final$canton == "FR" & listenverbindungen_bfs_final$lv == "E",]$list_id <- "FR11E"
# Abschluss
listenverbindungen_bfs_final %<>%
select(year, canton, party, party_abbr, party_id, party_strength, seats_with, seats_without, seat_difference, list_id)
# LPS zu FDP rechnen
listenverbindungen_bfs_final %<>%
mutate(lpsfdp = ifelse(party_abbr == "fdp" | party_abbr == "lps", "fdplps", party_abbr))
listenverbindungen_bfs_final %<>%
group_by(lpsfdp, year, canton) %>%
summarise(party = first(party), party_abbr = first(party_abbr), party_id = first(party_id), party_strength = sum(party_strength),seats_with = sum(seats_with), seat_difference = sum(seat_difference), seats_without = sum(seats_without), list_id = first(list_id)) %>%
ungroup() %>%
select(-lpsfdp, -party_abbr)
# Plausibilitätschecks
as.data.frame(listenverbindungen_bfs_final %>%
group_by(year, canton) %>%
summarize(total_vote = sum(party_strength)) %>%
ungroup() %>%
arrange(desc(total_vote)))
## year canton total_vote
## 1 2011 BE 100
## 2 2011 TI 100
## 3 2011 ZG 100
## 4 2011 GL 100
## 5 2011 FR 100
## 6 2011 UR 100
## 7 2011 AR 100
## 8 2011 SH 100
## 9 2011 SO 100
## 10 2011 AI 100
## 11 2011 VD 100
## 12 2011 GR 100
## 13 2011 JU 100
## 14 2011 NE 100
## 15 2011 NW 100
## 16 2011 TG 100
## 17 2011 BL 100
## 18 2011 SG 100
## 19 2011 BS 100
## 20 2011 GE 100
## 21 2011 OW 100
## 22 2011 VS 100
## 23 2011 AG 100
## 24 2011 ZH 100
## 25 2011 SZ 100
## 26 2011 LU 100
# 200 Sitze
sum(listenverbindungen_bfs_final$seats_with)
## [1] 200
sum(listenverbindungen_bfs_final$seats_without)
## [1] 200
# party_id ermitteln
write.csv(listenverbindungen_bfs_final, file = "output/lv_2011.csv", na = "", row.names = F)
# Wie viele Sitze gehen an die CSP?
listenverbindungen_national %>%
filter(party == "CSP")
## Source: local data frame [5 x 8]
##
## year party s ap_s s_noap apdummy voteap v
## 1 1995 CSP 1 1 0 2 22.20000 0.3
## 2 1999 CSP 1 1 0 3 34.40000 0.4
## 3 2003 CSP 1 1 0 1 38.86719 0.4
## 4 2007 CSP 1 1 0 3 29.92079 0.4
## 5 2011 CSP 0 0 0 4 26.70000 0.3
# Problem: Karl Vogler wird 2011 nicht als CSP aufgeführt, sondern bei den Übrigen, auch in den BFS-Daten
# In den kantonalen Daten für 2011 wird er jedoch als CSP aufgeführt, jedoch ist OW sowieso ein Majorzkanton
listenverbindungen2011_national <- listenverbindungen_national %>%
filter(year == 2011) %>%
mutate(party = tolower(party), party_abbr = party, party = NA, party_strength = v, seats_with = s, seats_without = s_noap, seat_difference = ap_s, canton = "CH", party_id = NA, list_id = NA) %>%
select(-apdummy, -voteap)
# Problem: LPS hat 1 Sitz, aber national wurde die Parteistärke der FDP zugerechnet. Deshalb: LPS entfernen und der FDP einen Sitz gutschreiben
# der fdp einen zurechnen
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "fdp",]$seats_with <- listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "fdp",]$seats_with + 1
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "fdp",]$seats_without <- listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "fdp",]$seats_without + 1
# Plausibilitätschecks
# Summiert zu 100%
sum(listenverbindungen2011_national$v)
## [1] 99.9
# Nicht genau 100%
# Originaldaten matchen
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "svp",]$party_strength <- 26.5562298833
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "cvp",]$party_strength <- 12.3040121429
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "fdp",]$party_strength <- 15.1045453764
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "gps",]$party_strength <- 8.4328061825
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "sp",]$party_strength <- 18.7221778605
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "csp",]$party_strength <- 0.2557743755
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "pda",]$party_strength <- 0.5409966352
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "evp",]$party_strength <- 1.9973982108
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "sd",]$party_strength <- 0.2003102175
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "sol",]$party_strength <- 0.3384668779
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "lega",]$party_strength <- 0.8047605286
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "edu",]$party_strength <- 1.2714234639
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "glp",]$party_strength <- 5.3808907415
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "uebrige",]$party_strength <- 2.2361843879
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "bdp",]$party_strength <- 5.4154010371
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "mcr",]$party_strength <- 0.4386220785
# nochmals überprüfen
sum(listenverbindungen2011_national$party_strength)
## [1] 100
# 200 Sitze
sum(listenverbindungen2011_national$seats_with)
## [1] 201
sum(listenverbindungen2011_national$seats_without)
## [1] 201
# solche mit party_strength == 0 entfernen
listenverbindungen2011_national %<>% filter(party_strength > 0)
sum(listenverbindungen2011_national$party_strength)
## [1] 100
sum(listenverbindungen2011_national$seats_with)
## [1] 200
sum(listenverbindungen2011_national$seats_without)
## [1] 200
sum(listenverbindungen2011_national$seat_difference)
## [1] 0
#
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "bdp",]$party_id <- 32
# listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "ldu",]$party_id <- 8
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "svp",]$party_id <- 4
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "cvp",]$party_id <- 2
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "fdp",]$party_id <- 1
# listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "fga",]$party_id <- 9
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "gps",]$party_id <- 13
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "sp",]$party_id <- 3
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "csp",]$party_id <- 8
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "pda",]$party_id <- 9
# listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "fps",]$party_id <- 16
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "evp",]$party_id <- 7
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "sd",]$party_id <- 16
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "sol",]$party_id <- 9
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "lega",]$party_id <- 18
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "edu",]$party_id <- 16
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "glp",]$party_id <- 31
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "csp",]$party_id <- 8
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "mcr",]$party_id <- 8
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "uebrige",]$party_id <- 99
# Volle Namen hinzufügen
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "sd",]$party <- "SD"
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "sol",]$party <- "Sol."
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "edu",]$party <- "EDU"
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "csp",]$party <- "CSP"
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "mcr",]$party <- "MCR"
listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "pda",]$party <- "PdA"
# listenverbindungen2011_national[listenverbindungen2011_national$party_abbr == "uebrige",]$party <- ""
# select variables
listenverbindungen2011_national %<>% select(year, party, party_id, party_strength, seats_with, seats_without, seat_difference)
write.csv(listenverbindungen2011_national,file = "output/lv_2011_national.csv", na = "", row.names = F)
# Plausibilitätscheck
listenverbindungen_national %>%
group_by(year) %>%
summarize(total_sitze = sum(s))
## Source: local data frame [5 x 2]
##
## year total_sitze
## 1 1995 202
## 2 1999 200
## 3 2003 200
## 4 2007 200
## 5 2011 200
# gleiches Problem wie bei kantonalen Daten: 1995 hat zwei Sitze zu viel
lv_historisch <- listenverbindungen_national %>%
filter(year < 2011)
# die FGA hat einen zu viel
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FGA",]$s <- 2
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FGA",]$ap_s <- 1
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FGA",]$s_noap <- 1
# die FDP hat einen zu viel
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FDP",]$s <- 45
lv_historisch[lv_historisch$year == 1995 & lv_historisch$party == "FDP",]$s_noap <- 48
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(s))
## Source: local data frame [4 x 2]
##
## year total_sitze
## 1 1995 200
## 2 1999 200
## 3 2003 200
## 4 2007 200
# Plausibilitätscheck
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(ap_s))
## Source: local data frame [4 x 2]
##
## year total_sitze
## 1 1995 0
## 2 1999 0
## 3 2003 0
## 4 2007 0
# gibt überall null, gut
# nur die nehmen, die mindestens einen Sitz gemacht haben
lv_historisch %<>%
filter(s > 0)
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(s))
## Source: local data frame [4 x 2]
##
## year total_sitze
## 1 1995 200
## 2 1999 200
## 3 2003 200
## 4 2007 200
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(s_noap))
## Source: local data frame [4 x 2]
##
## year total_sitze
## 1 1995 200
## 2 1999 200
## 3 2003 200
## 4 2007 200
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(ap_s))
## Source: local data frame [4 x 2]
##
## year total_sitze
## 1 1995 0
## 2 1999 0
## 3 2003 0
## 4 2007 0
# gut
# variablen transformieren
lv_historisch %<>%
mutate(party = tolower(party), party_abbr = party, party = NA, party_strength = v, seats_with = s, seats_without = s_noap, seat_difference = ap_s, canton = "CH", party_id = NA, list_id = NA) %>%
select(-apdummy, -voteap)
lv_historisch[lv_historisch$party_abbr == "ldu",]$party_id <- 8
lv_historisch[lv_historisch$party_abbr == "svp",]$party_id <- 4
lv_historisch[lv_historisch$party_abbr == "cvp",]$party_id <- 2
lv_historisch[lv_historisch$party_abbr == "fps",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "fdp",]$party_id <- 1
lv_historisch[lv_historisch$party_abbr == "fga",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "gps",]$party_id <- 13
lv_historisch[lv_historisch$party_abbr == "sp",]$party_id <- 3
lv_historisch[lv_historisch$party_abbr == "csp",]$party_id <- 8
lv_historisch[lv_historisch$party_abbr == "pda",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "fps",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "evp",]$party_id <- 7
lv_historisch[lv_historisch$party_abbr == "sd",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "sol",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "lps",]$party_id <- 1
lv_historisch[lv_historisch$party_abbr == "lega",]$party_id <- 18
lv_historisch[lv_historisch$party_abbr == "edu",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "glp",]$party_id <- 31
# select variables
lv_historisch %<>% select(year, canton, party, seats_with, seat_difference, seats_without, party_abbr, party_id)
# LPS zu FDP rechnen
lv_historisch %<>%
mutate(lpsfdp = ifelse(party_abbr == "fdp" | party_abbr == "lps", "fdplps", party_abbr))
lv_historisch %<>%
group_by(lpsfdp, year, canton) %>%
summarise(party = first(party), party_abbr = first(party_abbr), party_id = first(party_id),seats_with = sum(seats_with), seat_difference = sum(seat_difference), seats_without = sum(seats_without)) %>%
ungroup() %>%
select(-lpsfdp, -canton, -party_abbr)
# Plausibilitätschecks
# 800 Sitze
sum(lv_historisch$seats_with)
## [1] 800
sum(lv_historisch$seats_without)
## [1] 800
sum(lv_historisch$seat_difference)
## [1] 0
write.csv(lv_historisch,file = "output/lv_historical_national.csv", na = "", row.names = F)
lv_historisch <- listenverbindungen_kantonal %>%
filter(year < 2011)
sum(lv_historisch$s)
## [1] 802
# Summe sollte 800 sein
lv_historisch %>%
group_by(year) %>%
summarize(total_sitze = sum(s))
## Source: local data frame [4 x 2]
##
## year total_sitze
## 1 1995 202
## 2 1999 200
## 3 2003 200
## 4 2007 200
# im Jahr 1995 ist die Summe nicht 200
as.data.frame(lv_historisch %>%
filter(year == 1995) %>%
group_by(kt) %>%
summarize(total_sitze = sum(s)) %>%
ungroup() %>%
arrange(desc(total_sitze)))
## kt total_sitze
## 1 ZH 35
## 2 BE 27
## 3 VD 17
## 4 AG 15
## 5 SG 12
## 6 GE 11
## 7 LU 10
## 8 BL 8
## 9 TI 8
## 10 SO 7
## 11 VS 7
## 12 BS 6
## 13 FR 6
## 14 TG 6
## 15 GR 5
## 16 NE 5
## 17 SZ 3
## 18 ZG 3
## 19 AR 2
## 20 JU 2
## 21 SH 2
## 22 AI 1
## 23 GL 1
## 24 NW 1
## 25 OW 1
## 26 UR 1
# Zürich scheint einen zu viel zu haben
as.data.frame(lv_historisch %>%
filter(year == 1995 & kt == "ZH") %>%
group_by(party) %>%
summarize(total_sitze = sum(s)) %>%
ungroup() %>%
arrange(desc(total_sitze)))
## party total_sitze
## 1 sp 9
## 2 svp 9
## 3 fdp 6
## 4 cvp 2
## 5 fga 2
## 6 gps 2
## 7 ldu 2
## 8 evp 1
## 9 fps 1
## 10 sd 1
## 11 csp 0
## 12 edu 0
## 13 glp 0
## 14 lega 0
## 15 lps 0
## 16 pda 0
## 17 sol 0
## 18 uebrige 0
# die FGA hat einen zu viel
lv_historisch[lv_historisch$year == 1995 & lv_historisch$kt == "ZH" & lv_historisch$party == "fga" & lv_historisch$vote < 1,]$s <- 0
# BL ebenfalls
as.data.frame(lv_historisch %>%
filter(year == 1995 & kt == "BL") %>%
group_by(party) %>%
summarize(total_sitze = sum(s)) %>%
ungroup() %>%
arrange(desc(total_sitze)))
## party total_sitze
## 1 fdp 2
## 2 sp 2
## 3 cvp 1
## 4 gps 1
## 5 sd 1
## 6 svp 1
## 7 csp 0
## 8 edu 0
## 9 evp 0
## 10 fga 0
## 11 fps 0
## 12 glp 0
## 13 ldu 0
## 14 lega 0
## 15 lps 0
## 16 pda 0
## 17 sol 0
## 18 uebrige 0
# die FDP hat einen zu viel
lv_historisch[lv_historisch$year == 1995 & lv_historisch$kt == "BL" & lv_historisch$party == "fdp" & lv_historisch$vote < 2,]$s <- 0
sum(lv_historisch$s)
## [1] 800
# nun ist es gut
lv_historisch %<>%
group_by(year, kt, party) %>%
summarize(seats_with = sum(s), seat_difference = sum(ap_s)) %>%
ungroup() %>%
mutate(seats_without = seats_with - seat_difference)
sum(lv_historisch$seats_without)
## [1] 800
# die entfernen, die weder mit oder ohne LV keine sitze gemacht hätten
lv_historisch %<>%
filter(seats_with > 0 | seats_without > 0)
# transformieren
lv_historisch %<>%
rename(canton = kt) %>%
mutate(party_abbr = party, party = NA, party_id = NA)
unique(lv_historisch$party_abbr)
## [1] "cvp" "fdp" "fps" "gps" "ldu" "sp" "svp" "edu" "evp" "fga"
## [11] "sd" "lps" "csp" "pda" "lega" "sol" "glp"
# party_id reinholen
# lv_historisch[lv_historisch$party_abbr == "bdp",]$party_id <- 32
lv_historisch[lv_historisch$party_abbr == "ldu",]$party_id <- 8
lv_historisch[lv_historisch$party_abbr == "svp",]$party_id <- 4
lv_historisch[lv_historisch$party_abbr == "cvp",]$party_id <- 2
lv_historisch[lv_historisch$party_abbr == "fps",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "fdp",]$party_id <- 1
lv_historisch[lv_historisch$party_abbr == "fga",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "gps",]$party_id <- 13
lv_historisch[lv_historisch$party_abbr == "sp",]$party_id <- 3
lv_historisch[lv_historisch$party_abbr == "csp",]$party_id <- 8
lv_historisch[lv_historisch$party_abbr == "pda",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "fps",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "evp",]$party_id <- 7
lv_historisch[lv_historisch$party_abbr == "sd",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "sol",]$party_id <- 9
lv_historisch[lv_historisch$party_abbr == "lps",]$party_id <- 1
lv_historisch[lv_historisch$party_abbr == "lega",]$party_id <- 18
lv_historisch[lv_historisch$party_abbr == "edu",]$party_id <- 16
lv_historisch[lv_historisch$party_abbr == "glp",]$party_id <- 31
# LPS zu FDP rechnen
lv_historisch %<>%
mutate(lpsfdp = ifelse(party_abbr == "fdp" | party_abbr == "lps", "fdplps", party_abbr))
lv_historisch %<>%
group_by(lpsfdp, year, canton) %>%
summarise(party = first(party), party_abbr = first(party_abbr), party_id = first(party_id), seats_with = sum(seats_with), seat_difference = sum(seat_difference), seats_without = sum(seats_without)) %>%
ungroup() %>%
select(-lpsfdp, -party_abbr)
# Plausibilitätschecks
# 200 Sitze
sum(lv_historisch$seats_with)
## [1] 800
sum(lv_historisch$seats_without)
## [1] 800
sum(lv_historisch$seat_difference)
## [1] 0
# party_id ermitteln
write.csv(lv_historisch, file = "output/lv_historical.csv", na = "", row.names = F)