- NOTE: I used pyenv and pyenv-virtualenv to set up my Python environment.
- For R, I set it up using renv.
- I used R version 4.2.2.
- I used
To run this, first set up the project as follows:
git clone https://github.com/peterjbachman/causal_project.git
cd causal_project
pyenv virtualenv 3.11.1 causal_project
pyenv local causal_project
pip install -r requirements.txt

Then set up R for this project by running the following in the same terminal:
R

You will need to create a Python script in this directory named secrets_cl.py
formatted as follows:
api_key = {"Authorization": "Token <insert CAP API token here>"}

01_pull_data.py - Pulls the opinion text using the CAP API
02_clean_text.py - Cleans up the text and cleans the opinion author section
03_tidy_text.R - Cleans up the data further by trimming variables in the dataset and adding a few variables
04_split_cases.R - Splits the data into a training and test dataset to learn about the topics in this data
05_train_stm.R - Trains a Structural Topic Model on the training data, performs diagnostics, and explores topics
06_pull_validate.R - Randomly samples 50 paragraphs, realizes some stuff, samples only 49 documents, and then hand-codes and validates the topics they were assigned to
07_final_results.R - Runs the Structural Topic Model on the testing data and creates Figure 1 on the poster
08_map.R - Creates Figure 2 on the poster
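
As a rough illustration of how the api_key dictionary from secrets_cl.py might be attached to a request, here is a minimal sketch. It only builds a request object and sends nothing. The URL is a placeholder and build_request is a hypothetical helper; neither is taken from 01_pull_data.py, which may do this differently.

```python
import urllib.request

# In the project this dict would come from `from secrets_cl import api_key`.
# It is defined inline here only so the sketch is self-contained.
api_key = {"Authorization": "Token <insert CAP API token here>"}

# Placeholder URL (an assumption for illustration, not the real CAP endpoint).
EXAMPLE_URL = "https://example.com/api/cases/"


def build_request(url, headers):
    """Return a Request object carrying the given headers; nothing is sent."""
    return urllib.request.Request(url, headers=headers)


request = build_request(EXAMPLE_URL, api_key)

# Show the header that would accompany the request.
print(request.get_header("Authorization"))
```

Keeping the token in its own file means it can stay out of version control while the scripts import it by name.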
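
The scripts listed above are numbered in the order they are meant to run. The sketch below shows one way to run them in sequence. It assumes the scripts are run from the repository root and that `python` and `Rscript` are on the PATH; the README does not state this, and command_for and run_pipeline are hypothetical helpers. Because 06_pull_validate.R involves hand coding, a fully unattended run may not be appropriate.

```python
import subprocess
from pathlib import Path

# Script names copied from the list above, in running order.
SCRIPTS = [
    "01_pull_data.py",
    "02_clean_text.py",
    "03_tidy_text.R",
    "04_split_cases.R",
    "05_train_stm.R",
    "06_pull_validate.R",
    "07_final_results.R",
    "08_map.R",
]


def command_for(script):
    """Choose an interpreter based on the file extension."""
    suffix = Path(script).suffix
    if suffix == ".py":
        return ["python", script]
    if suffix == ".R":
        return ["Rscript", script]
    raise ValueError(f"Unrecognized script type: {script}")


def run_pipeline(scripts=SCRIPTS):
    """Run each script in turn, stopping at the first failure."""
    for script in scripts:
        subprocess.run(command_for(script), check=True)


# Print the commands without executing them.
for script in SCRIPTS:
    print(" ".join(command_for(script)))
```

Calling run_pipeline() from the repository root would then execute the steps one after another.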