scholar.py is a Python module that implements a querier and parser for Google Scholar's output. Its classes can be used independently, but it can also be invoked as a command-line tool.
The script used to live at http://icir.org/christian/scholar.html, and I've moved it here so I can more easily manage the various patches and suggestions I'm receiving for scholar.py. Thanks, guys, for all your interest! If you'd like to get in touch, email me at christian@icir.org or ping me on Twitter.
Cheers,
Christian
- Extracts publication title, most relevant web link, PDF link, number of citations, number of online versions, link to Google Scholar's article cluster for the work, link to Google Scholar's cluster of all works referencing the publication, and an excerpt of the content.
- Supports the full range of advanced query options provided by Google Scholar, such as title-only search, publication date timeframes, and inclusion/exclusion of patents and citations.
- Supports retrieval of citation details in standard external formats as provided by Google Scholar, including BibTeX and EndNote.
- When run as a command-line tool, outputs a CSV file containing results from pages 1-99.
Hey everyone, the original code has been changed to parse result pages 1-99 instead of just page 1. The output is now a single CSV file, compiled from all results and stored in the outputs directory of biblio_reader.
Best,
Michael
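For readers curious how results from many pages can end up in one CSV file, here is a minimal sketch using only the Python standard library. It is not the code used in this repository: the function name write_results_csv, the column names in FIELDS, and the default file name results.csv are all assumptions made for illustration.

```python
import csv
import os

# Hypothetical column names, loosely modelled on the fields the tool reports.
FIELDS = ["title", "url", "year", "num_citations", "num_versions"]


def write_results_csv(pages, out_dir, filename="results.csv"):
    """Flatten per-page result lists into one CSV file and return its path.

    `pages` is an iterable of lists of dicts, one list per result page.
    Keys missing from a dict become empty cells; unknown keys are ignored.
    """
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, filename)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        for page in pages:
            for article in page:
                writer.writerow(article)
    return path
```

Writing the header once and appending every page's rows to the same writer keeps the whole result set in a single file, which is easier to load into a spreadsheet than one file per page.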
Try scholar.py --help for all available options. Note that the command-line arguments changed considerably in version 2.0! A few examples:
Retrieve one article written by Einstein on quantum theory:
$ scholar.py -c 1 --author "albert einstein" --phrase "quantum theory"
Title On the quantum theory of radiation
URL http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
Year 1917
Citations 184
Versions 3
Cluster ID 17749203648027613321
PDF link http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
Citations list http://scholar.google.com/scholar?cites=17749203648027613321&as_sdt=2005&sciodt=0,5&hl=en
Versions list http://scholar.google.com/scholar?cluster=17749203648027613321&hl=en&as_sdt=0,5
Excerpt The formal similarity between the chromatic distribution curve for thermal radiation [...]
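If you want to post-process the plain-text output shown above in your own scripts, it can be turned into a dictionary with a few lines of Python. This is only a sketch: parse_article is a hypothetical helper, not part of scholar.py, and the label list is copied from the sample output, so it may not cover every field the tool can print.

```python
# Labels copied from the sample output above; sorted longest first so that
# "Citations list" is not mistaken for "Citations".
LABELS = sorted(
    ["Title", "URL", "Year", "Citations", "Versions", "Cluster ID",
     "PDF link", "Citations list", "Versions list", "Excerpt"],
    key=len,
    reverse=True,
)


def parse_article(text):
    """Turn one block of 'Label value' lines into a dict keyed by label."""
    record = {}
    for line in text.splitlines():
        stripped = line.strip()
        for label in LABELS:
            if stripped.startswith(label + " "):
                record[label] = stripped[len(label):].strip()
                break
    return record
```

Matching against a fixed list of labels, rather than splitting on the first space, is what lets multi-word labels such as "Cluster ID" and "PDF link" be recognised. All values are returned as strings, so convert fields such as the year or citation count yourself if you need numbers.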
scholar.py is released under the standard BSD license.