Overview
Scrapy is a Python web scraping framework, but it also provides a good deal of encapsulated async data processing functionality that can be used independently of actual web scraping. I have now implemented the same data processing both with a Scrapy pipeline and without one, and we want to standardize the role of Scrapy in our data work, so I want to reflect on both implementation options and the strengths and tradeoffs of each.
Proposal
I plan to spend an afternoon writing up my notes as a document.