This document illustrates the preprocessing of the dataset visualized in this article on srf.ch.
SRF Data attaches great importance to transparent and reproducible data preprocessing and analysis. SRF Data believes in the principles of open data, and equally in open and reproducible methods. Third parties should be empowered to build on the work of SRF Data and to generate new analyses and applications.
The data was preprocessed and analyzed in R, the project for statistical computing. The RMarkdown script used to generate this document and all the resulting data can be downloaded under this link. By executing main.Rmd, the process described here can be reproduced and this document can be generated. In the course of this, data from the folder input will be processed and results will be written to
The code for the process described here can also be freely downloaded from https://github.com/srfdata/2015-09-elections-cantonal-budgets. Criticism in the form of GitHub issues and pull requests is very welcome!
2015-09-elections-cantonal-budgets by SRF Data is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The published information has been collated carefully, but no guarantee is offered of its completeness, correctness or up-to-date nature. No liability is accepted for damage or loss incurred from the use of this script or the information drawn from it. This exclusion of liability also applies to third-party content that is accessible via this offer.
The data shown here is the result of an email questionnaire conducted by SRF Data and Radio Télévision Suisse (RTS) in July and August 2015 with over 200 cantonal party sections. See the questions asked here. If a response was unclear, clarification was requested; if no answer was received, follow-up questions were sent.
The data shown is based exclusively on the cantonal party’s own declaration and has not been verified by other sources. A statement on the accuracy of this data can therefore not be made, neither on the total budget nor on the sources of finance. A question mark for the financial sources can have the following meanings: 1. the cantonal party cannot give any declaration, 2. the cantonal party does not want to make any declaration, 3. the figures could not be calculated. In the canton of Jura, the budget cannot always be separated between the federal and cantonal elections, which both take place on October 18.
input/data.csv - The original survey response data, already double-checked, preprocessed and cleaned by SRF Data. Is copied over 1:1 to the
input/parties.csv - Contains party classifications made by SRF Data with the help of political scientists, used throughout all projects related to elections. Is copied over 1:1 to the
The following sections describe the results of the data preprocessing as stored in the
| Name | Type | Description |
|---|---|---|
| party_name | String | Contains the party name, but only if it belongs to a group in |
| canton | String | Official cantonal abbreviation |
| transparency_level | Integer | Level of transparency ( |
| budget_total_lower | Integer | Lower boundary of total budget as declared by the cantonal section |
| budget_total_upper | Integer | Upper boundary of total budget as declared by the cantonal section (if no range is given |
| budget_share_private_donors | String | Share of budget coming from private donors (see survey questions above) |
| budget_share_corporate_donors | String | Share of budget coming from corporate donors (see survey questions above) |
| budget_share_candidates_elected | String | Share of budget coming from candidates OR already elected representatives (e.g. in the form of fees, see survey questions above) |
| budget_share_members | String | Share of budget coming from member fees (see survey questions above) |
| budget_share_others | String | Share of budget coming from other sources (see survey questions above) |
| budget_share_others_description | String | Description of other sources by the cantonal section |
| comment_by_party | String | Additional comments made by the cantonal section |
Essentially the same content as data.csv, but in JSON format. Used directly by the frontend application.
Contains party classifications made by SRF Data with the help of political scientists, used throughout all projects related to elections.
| Name | Type | Description |
|---|---|---|
| message.code | String | Used for frontend purposes solely |
| id | Integer | Unique identifier, referenced from |
| abbr_* | String | Abbreviation, but with slightly more information, used for frontend purposes |
| legend_* | String | Abbreviation, but with slightly more information, used for frontend purposes |
| sortorder | Integer | Used for frontend purposes solely |
# read in and save in output folder
destination_data <- read.csv("input/data.csv")
parties <- read.csv("input/parties.csv")
# save
write.csv(destination_data, "output/data.csv", row.names = FALSE, quote = TRUE, na = "")
write.csv(parties, "output/parties.csv", row.names = FALSE, quote = TRUE, na = "")
# save also in assets folder of application for direct usage
rList <- list(updated = Sys.time(), title = "data", data = destination_data)
jsonObj <- jsonlite::toJSON(rList, pretty = TRUE, auto_unbox = TRUE, na = "null")
writeLines(jsonObj, "../frontend/src/assets/gsheets/data.json")
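As a rough illustration of the resulting file layout, the sketch below serializes a small made-up data frame with the same wrapper (`updated`, `title`, `data`) and reads it back. The toy rows, the `toy` and `wrapped` names and the temporary file path are assumptions for illustration; only the `jsonlite` arguments mirror the chunk above.

```r
library(jsonlite)

# Hypothetical toy rows, NOT survey data
toy <- data.frame(
  canton = c("ZH", "BE"),
  budget_total_lower = c(100000, NA),
  stringsAsFactors = FALSE
)

# Same wrapper structure as in the chunk above
wrapped <- list(updated = Sys.time(), title = "data", data = toy)
json_text <- toJSON(wrapped, pretty = TRUE, auto_unbox = TRUE, na = "null")

# Write to a temporary file instead of the frontend assets folder
path <- tempfile(fileext = ".json")
writeLines(json_text, path)

# Reading the file back yields a list with a data frame under $data;
# missing values were written as null and come back as NA
parsed <- fromJSON(path)
parsed$title
parsed$data
```

Because of `na = "null"`, missing values stay visible in the JSON records as explicit `null` entries rather than being dropped from them.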
library(dplyr)

# number of sections with transparency_level == 2, per party
perparty <- destination_data %>%
  group_by(party_id) %>%
  summarize(count_transparent = sum(transparency_level == 2),
            count_all = sum(transparency_level >= 0),
            count_percentage = count_transparent / count_all) %>%
  left_join(parties, by = c("party_id" = "id")) %>%
  select(abbr_en, count_transparent, count_all, count_percentage) %>%
  arrange(desc(count_transparent), desc(count_percentage))
perparty
## Source: local data frame [12 x 4]
##
##                     abbr_en count_transparent count_all count_percentage
## 1                        SP                23        25        0.9200000
## 2                       GPS                17        20        0.8500000
## 3                       GLP                14        17        0.8235294
## 4                       BDP                13        16        0.8125000
## 5                       CVP                13        24        0.5416667
## 6                       SVP                13        26        0.5000000
## 7                       EVP                12        12        1.0000000
## 8                       FDP                 9        25        0.3600000
## 9   Small left-wing parties                 8         8        1.0000000
## 10                   Others                 6         7        0.8571429
## 11 Small right-wing parties                 3        12        0.2500000
## 12                     Lega                 1         1        1.0000000
# perparty %>%
#   write.csv("output/perparty.csv")
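To make the counting logic concrete, here is a minimal base-R sketch on a small made-up data frame (the rows and the names `toy` and `result` are hypothetical, not survey data). It mirrors the idea of the pipeline above without depending on dplyr: count the sections with `transparency_level == 2` per party and divide by the number of sections with a non-negative level.

```r
# Hypothetical toy data, NOT taken from the survey
toy <- data.frame(
  party_id = c(1, 1, 1, 2, 2),
  transparency_level = c(2, 2, 0, 1, 2)
)

# Count fully transparent sections and all sections per party
count_transparent <- tapply(toy$transparency_level == 2, toy$party_id, sum)
count_all <- tapply(toy$transparency_level >= 0, toy$party_id, sum)

result <- data.frame(
  party_id = as.integer(names(count_all)),
  count_transparent = as.vector(count_transparent),
  count_all = as.vector(count_all),
  count_percentage = as.vector(count_transparent / count_all)
)

# Sort like the original: most transparent sections first, then by share
result <- result[order(-result$count_transparent, -result$count_percentage), ]
result
```

In this toy case, party 1 has 2 of 3 sections at level 2 and party 2 has 1 of 2, so party 1 is listed first.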
# number of sections with transparency_level == 2, per canton
percanton <- destination_data %>%
  group_by(canton) %>%
  summarize(count_transparent = sum(transparency_level == 2),
            count_all = sum(transparency_level >= 0),
            count_percentage = count_transparent / count_all) %>%
  select(canton, count_transparent, count_all, count_percentage) %>%
  arrange(desc(count_transparent), desc(count_percentage))
percanton %>% as.data.frame()
##    canton count_transparent count_all count_percentage
## 1      GE                11        12        0.9166667
## 2      VD                10        10        1.0000000
## 3      BE                 9        10        0.9000000
## 4      BS                 8        10        0.8000000
## 5      FR                 8        10        0.8000000
## 6      NE                 7         8        0.8750000
## 7      VS                 7        10        0.7000000
## 8      ZH                 7        10        0.7000000
## 9      JU                 6         7        0.8571429
## 10     LU                 6         8        0.7500000
## 11     TI                 6         8        0.7500000
## 12     BL                 6         9        0.6666667
## 13     SO                 6         9        0.6666667
## 14     TG                 6         9        0.6666667
## 15     SZ                 5         7        0.7142857
## 16     SG                 5         9        0.5555556
## 17     GR                 4         6        0.6666667
## 18     AG                 4         9        0.4444444
## 19     AI                 2         2        1.0000000
## 20     OW                 2         4        0.5000000
## 21     UR                 2         5        0.4000000
## 22     ZG                 2         6        0.3333333
## 23     AR                 1         3        0.3333333
## 24     NW                 1         3        0.3333333
## 25     GL                 1         4        0.2500000
## 26     SH                 0         5        0.0000000
# percanton %>%
#   write.csv("output/percanton.csv")
# sum of declared budgets (lower and upper boundaries)
overall_budget <- destination_data %>%
  summarize(sum_lower = sum(budget_total_lower, na.rm = TRUE),
            sum_upper = sum(budget_total_upper, na.rm = TRUE))
# share of sections with transparency_level == 2
transparency_rate <- destination_data %>%
  filter(transparency_level == 2) %>%
  summarise(count = n()) / nrow(destination_data)
# share of sections with transparency_level >= 1
response_rate <- destination_data %>%
  filter(transparency_level >= 1) %>%
  summarise(count = n()) / nrow(destination_data)
# extrapolate the declared sums to all sections
estimation <- overall_budget / transparency_rate[1, 1]
Based on the data at hand, the cantonal parties that declared a budget spend a combined total of between 13490410 and 13616410 Swiss francs. Given a transparency rate of 68 %, we can extrapolate that all cantonal parties together spend between 19724614 and 19908841 Swiss francs. Note: This is a very conservative estimate, since the parties that spend the most (SVP, FDP) are usually the ones that are not transparent. It is also an uncertain estimate, because it does not take into account how budgets and transparency are distributed across the cantons.
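The extrapolation is plain arithmetic and can be checked by hand. The sketch below assumes the transparency rate is 132 sections with `transparency_level == 2` out of 193 sections, which are the column sums of `count_transparent` and `count_all` in the per-party table; the variable names are illustrative and not part of the original script.

```r
# Declared totals in Swiss francs, taken from the text above
declared_lower <- 13490410
declared_upper <- 13616410

# Assumption: 132 of 193 sections have transparency_level == 2
# (column sums of count_transparent and count_all in the per-party table)
rate <- 132 / 193

# Scale the declared totals up to all sections
estimate_lower <- declared_lower / rate
estimate_upper <- declared_upper / rate

round(rate * 100)       # 68
floor(estimate_lower)   # 19724614
floor(estimate_upper)   # 19908841
```

Dividing by the rate treats non-transparent sections as if they spent as much on average as transparent ones, which is why the result is best read as a rough lower-end estimate.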