This directory contains the replication data and scripts for Toni Rodon and Tom Paskhalis, "Parliamentary Power in Early Representative Assemblies: Evidence from XVII century England", published in the Journal of Historical Political Economy.
analysis_and_write_up.R - Complete replication script that combines all analysis steps and generates all figures and tables.
The script requires the following data files, all of which must be present in the ./data/ directory (within the replication folder):
- Parliamentary Journals:
  - commons_df.rds - Journal of the House of Commons
  - lords_df.rds - Journal of the House of Lords
- Parliamentary Sessions & Acts:
  - parliament_sessions.csv - Parliamentary session dates and monarchs
  - parliament_royalcontrol.csv - Royal control indicators
  - parliament_acts_smith.csv - Parliamentary acts from Smith (1999)
  - parliament_acts_hoppit.csv - Parliamentary acts from Hoppit (2017)
  - parliament_acts_interregnum_bho.csv - Interregnum acts
- Kings' Speeches:
  - kings_text.rds - Kings' speeches text
- Members of Parliament:
  - mps_data.xlsx - MPs data by parliamentary session (for surnames analysis)
- Constituencies:
  - constituencies_infos.csv.bz2 - Constituency information
  - constituencies_list.csv.bz2 - List of constituencies
- Economic Data:
  - governmental_income_chandaman.csv - Governmental income data
  - national_income_obrien.csv - National income data
The script also requires these files from ./code/ directory:
- ./code/helper_functions.R - LDA fitting and analysis functions
- ./code/plotting_functions.R - Plotting utilities
- ./code/latin_stopwords.R - Latin stopwords for text preprocessing
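Note: All input data and code files are in subdirectories within the replication folder (jhpe_parliament_england_xvii/). Apart from renv activation, which loads the environment from the parent directory, the script is self-contained and does not require access to parent directories.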
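Before starting a run that may take hours, it can help to confirm that every input listed above is in place. The following is a minimal sketch, not part of the replication script: `find_missing` is a hypothetical helper, and it assumes the working directory is the replication folder.

```r
# Required inputs as listed in this README (paths relative to the replication folder).
required_files <- c(
  file.path("data", c(
    "commons_df.rds", "lords_df.rds",
    "parliament_sessions.csv", "parliament_royalcontrol.csv",
    "parliament_acts_smith.csv", "parliament_acts_hoppit.csv",
    "parliament_acts_interregnum_bho.csv",
    "kings_text.rds", "mps_data.xlsx",
    "constituencies_infos.csv.bz2", "constituencies_list.csv.bz2",
    "governmental_income_chandaman.csv", "national_income_obrien.csv"
  )),
  file.path("code", c(
    "helper_functions.R", "plotting_functions.R", "latin_stopwords.R"
  ))
)

# Hypothetical helper: returns the entries of `paths` that do not exist under `root`.
find_missing <- function(paths, root = ".") {
  paths[!file.exists(file.path(root, paths))]
}

missing_files <- find_missing(required_files)
if (length(missing_files) > 0) {
  message("Missing input files:\n", paste(" -", missing_files, collapse = "\n"))
} else {
  message("All required input files are present.")
}
```

An empty result means the script should be able to find all of its inputs; anything listed needs to be restored before running the analysis.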
The script performs the following steps:
- Data Loading - Loads all necessary datasets (Commons, Lords, Acts, Kings' speeches, MPs data, etc.)
- Descriptive Analysis - Calculates summary statistics for Commons and Lords data
- LDA Model Fitting - Fits topic models with 10 and 30 topics for both Commons and Lords journals
- Validation & Entropy - Calculates entropy measures (parliamentary power) and combines with acts data
- Kings' Speeches Analysis - Calculates cosine similarity between Kings' speeches and parliamentary journals
- Surnames Analysis - Analyzes surname diversity among MPs as a measure of political competition
- Constituency Analysis - Counts mentions of constituencies in parliamentary texts
- Regression Analysis - Fits regression models examining the relationship between policy diversity and acts
- Figure Generation - Generates all main manuscript figures
- Table Generation - Generates all main manuscript tables
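Two of the quantities named in the steps above, entropy and cosine similarity, can be illustrated in a few lines of base R. This is a sketch under stated assumptions rather than the script's own code: it assumes the entropy is Shannon entropy (natural log) over a vector of topic proportions and that similarity is computed between two numeric feature vectors. The function names `shannon_entropy` and `cosine_similarity` are hypothetical.

```r
# Shannon entropy of a vector of non-negative weights (e.g., topic proportions).
# Assumption: natural logarithm; the replication script may use a different base.
shannon_entropy <- function(p) {
  p <- p / sum(p)   # normalise so the weights sum to 1
  p <- p[p > 0]     # drop zeros, treating 0 * log(0) as 0
  -sum(p * log(p))
}

# Cosine similarity between two numeric vectors of equal length.
cosine_similarity <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Toy inputs (made up for illustration, not taken from the data):
even_topics   <- rep(1 / 10, 10)    # attention spread evenly over 10 topics
single_topic  <- c(1, rep(0, 9))    # attention concentrated on one topic

entropy_even   <- shannon_entropy(even_topics)
entropy_single <- shannon_entropy(single_topic)
similarity     <- cosine_similarity(c(1, 2, 3), c(2, 4, 6))
```

Under these assumptions, an even spread over 10 topics gives the maximum entropy, log(10), a single dominant topic gives 0, and vectors pointing in the same direction have a cosine similarity of 1.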
IMPORTANT: This project uses renv for reproducible package management. The script automatically activates the renv environment from the parent directory (../renv/) which contains all required package versions used in the original analysis.
Recommended approach:
- The script will automatically load the renv environment from the parent directory
- All package dependencies are managed through renv, ensuring reproducibility
- No need to manually install packages - renv will restore the exact versions used
If you need to restore packages manually or set up the environment for the first time:
# From the main project directory (parent of jhpe_parliament_england_xvii/)
renv::restore()

The following R packages are required and managed by renv:
# Packages managed through renv (auto-installed):
# "readr", "readxl", "dplyr", "tidyr", "stringr", "purrr",
# "lubridate", "quanteda", "quanteda.textstats", "topicmodels",
# "ggplot2", "gridExtra", "modelsummary", "fixest", "stopwords"

Manual Installation (Alternative):
If you prefer to work outside of renv or encounter issues with the renv environment, you can manually install all required packages:
install.packages(c(
"readr", "readxl", "dplyr", "tidyr", "stringr", "purrr",
"lubridate", "quanteda", "quanteda.textstats", "topicmodels",
"ggplot2", "gridExtra", "modelsummary", "fixest", "stopwords"
))

Note: Manual installation may result in different package versions than those used in the original analysis, which could lead to minor variations in results. For exact replication, use renv.
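If you install packages manually, a quick check of what is still missing can save a failed run. The snippet below is a small sketch using only base R; it reports missing packages but does not compare versions against those recorded by renv.

```r
# Packages listed in this README as required by the replication script.
required_pkgs <- c(
  "readr", "readxl", "dplyr", "tidyr", "stringr", "purrr",
  "lubridate", "quanteda", "quanteda.textstats", "topicmodels",
  "ggplot2", "gridExtra", "modelsummary", "fixest", "stopwords"
)

# Compare against the packages visible in the current library paths.
installed_pkgs <- rownames(installed.packages())
missing_pkgs <- setdiff(required_pkgs, installed_pkgs)

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```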
From R or RStudio:
# Navigate to replication directory
setwd("jhpe_parliament_england_xvii")
# Run the script (automatically activates renv from parent directory)
source("analysis_and_write_up.R")

From the command line:
cd /path/to/jhpe_parliament_england_xvii
Rscript analysis_and_write_up.R

What happens when you run the script:
- Automatically activates the renv environment from the parent directory (renv::load(".."))
- Reads input data from ./data/
- Sources helper functions from ./code/
- Writes all outputs to ./figures/, ./results/, ./tables/, and ./log/
The script generates the following outputs (all within the replication directory, jhpe_parliament_england_xvii/):
Figures (in ./figures/):
- figure1.pdf/png - Overview of Commons and Lords data over time
- figure2.pdf/png - Yearly parliamentary acts by house
- figure3a.pdf/png - Scatter plot of power vs acts
- figure3b.pdf/png - Regression coefficient plot
- figure4.pdf/png - Cosine similarity between Kings' speeches and journals
- figure5.pdf/png - Surname diversity and parliamentary power
Results (in ./results/):
- LDA models (in ./results/lda/ subdirectory):
  - lda/commons_10/ - Commons 10-topic models by year (entropies.rds, topics.rds, yearly .rds files)
  - lda/commons_30/ - Commons 30-topic models by year
  - lda/lords_10/ - Lords 10-topic models by year
  - lda/lords_30/ - Lords 30-topic models by year
- Regression models (in ./results/ols/ subdirectory):
  - ols/acts_10_md1.rds through ols/acts_10_md6.rds - 10-topic regression models
  - ols/acts_30_md1.rds through ols/acts_30_md6.rds - 30-topic regression models
- Aggregate results:
  - acts.csv - Combined parliamentary acts data
  - xvii.csv - 17th century data with entropy measures (parliamentary power)
  - commons_kings_similarities.csv - Cosine similarity between Kings' speeches and Commons journals
  - surnames_df.csv - Surname diversity index merged with power measures
Tables (in ./tables/):
- table1.tex - Summary of 17th century parliaments
- table2.tex - Top words by topic for key years (1625, 1642, 1662, 1689)
- table3.tex - Regression models of parliamentary power
Intermediate datasets generated during the analysis (written to ./data/ alongside the input files):
- meta_commons.csv, meta_lords.csv - Descriptive statistics per year
- constituencies_mentions_xvii.csv - Frequency of constituency mentions
- full.csv - Complete merged dataset for regression analysis
Log (in ./log/):
- analysis_YYYYMMDD_HHMMSS.log - Timestamped log of the analysis run with progress and results
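Once a run has finished, the saved regression models can be read back into R for inspection. The following is a hedged sketch: it rebuilds the file names from the pattern listed above (acts_10_md1.rds through acts_30_md6.rds), loads only those that exist, and assumes the working directory is the replication folder. The object names are illustrative, and nothing is assumed about the class of the stored models.

```r
# Rebuild the 12 expected file names: 6 models for each of the 10- and 30-topic fits.
model_grid <- expand.grid(model = 1:6, topics = c(10L, 30L))
model_paths <- file.path(
  "results", "ols",
  sprintf("acts_%d_md%d.rds", model_grid$topics, model_grid$model)
)

# Load whichever model files are present; absent files are skipped silently.
available_paths <- model_paths[file.exists(model_paths)]
models <- lapply(available_paths, readRDS)
names(models) <- basename(available_paths)

message(length(models), " of ", length(model_paths), " regression models loaded.")
```

If fewer than 12 models load, the run was probably interrupted before the regression step completed; the most recent log file should show where it stopped.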
The full script may take several hours to complete due to:
- LDA model fitting (most time-consuming step)
- Large corpus processing
For testing purposes, you can modify the LDA control parameters to reduce iterations:
control = list(alpha = 5, verbose = 25L, seed = 1234,
burnin = 100, thin = 10, iter = 200) # Reduced iterations

The replication directory structure is:
jhpe_parliament_england_xvii/
├── analysis_and_write_up.R ← Main replication script
├── README.md
├── code/ ← Helper functions
│ ├── helper_functions.R
│ ├── plotting_functions.R
│ └── latin_stopwords.R
├── data/ ← Input data & intermediate files
│ ├── commons_df.rds
│ ├── lords_df.rds
│ ├── parliament_sessions.csv
│ ├── parliament_acts_*.csv
│ ├── kings_text.rds
│ ├── mps_data.xlsx
│ ├── governmental_income_chandaman.csv
│ ├── national_income_obrien.csv
│ ├── constituencies_infos.csv.bz2
│ ├── constituencies_list.csv.bz2
│ ├── meta_commons.csv ← Generated
│ ├── meta_lords.csv ← Generated
│ ├── constituencies_mentions_xvii.csv ← Generated
│ └── full.csv ← Generated
├── figures/ ← Generated figures (PDF/PNG)
│ ├── figure1.pdf/png
│ ├── figure2.pdf/png
│ ├── figure3a.pdf/png
│ ├── figure3b.pdf/png
│ ├── figure4.pdf/png
│ └── figure5.pdf/png
├── results/ ← Generated results
│ ├── lda/ ← LDA topic models
│ │ ├── commons_10/
│ │ ├── commons_30/
│ │ ├── lords_10/
│ │ └── lords_30/
│ ├── ols/ ← Regression models
│ │ ├── acts_10_md1.rds through acts_10_md6.rds
│ │ └── acts_30_md1.rds through acts_30_md6.rds
│ ├── acts.csv
│ ├── xvii.csv
│ ├── commons_kings_similarities.csv
│ └── surnames_df.csv
├── tables/ ← Generated LaTeX tables
│ ├── table1.tex
│ ├── table2.tex
│ └── table3.tex
└── log/ ← Analysis logs
└── analysis_YYYYMMDD_HHMMSS.log
- Package Management: The project uses renv for reproducible package management. The script automatically loads the environment from the parent directory (renv::load(".."))
- All paths are relative to the replication directory (jhpe_parliament_england_xvii/)
- The script is self-contained and does not require access to parent directories (except for renv activation)
- All input data files are in ./data/
- Helper functions are in ./code/
- All output directories (figures/, results/, results/lda/, results/ols/, tables/, log/) are created automatically if they don't exist
- Progress and results are logged to both the console and a timestamped log file in ./log/
- The original data files in ./data/ are NOT modified; generated intermediate files are written alongside them
- Log files are timestamped (e.g., analysis_20260302_143025.log) so that multiple runs are preserved
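The automatic directory creation and timestamped log naming described in the notes above can be sketched in base R as follows. This illustrates the general pattern and is not the script's actual code; `ensure_dirs` and `make_log_path` are hypothetical helpers, and the example writes to a temporary folder rather than to the replication directory.

```r
# Output directories named in this README, relative to the replication folder.
output_dirs <- c(
  "figures", "results",
  file.path("results", "lda"), file.path("results", "ols"),
  "tables", "log"
)

# Hypothetical helper: create each directory under `root` if it does not exist yet.
ensure_dirs <- function(dirs, root = ".") {
  full_paths <- file.path(root, dirs)
  for (d in full_paths) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(full_paths)
}

# Hypothetical helper: build a log file name of the form analysis_YYYYMMDD_HHMMSS.log.
make_log_path <- function(time = Sys.time(), dir = "log") {
  file.path(dir, sprintf("analysis_%s.log", format(time, "%Y%m%d_%H%M%S")))
}

# Demonstrate in a temporary folder so nothing in the project is touched.
demo_root <- tempfile("replication_demo_")
ensure_dirs(output_dirs, root = demo_root)
log_path <- make_log_path()
```

Because the timestamp has one-second resolution, two runs started at least a second apart get distinct log files.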
If you encounter any issues running the script, check:
- All required packages are installed
- The working directory is correctly set
- All data files are in their expected locations
- You have write permissions for output directories