scholar.py

scholar.py is a Python module that implements a querier and parser for Google Scholar's output. Its classes can be used independently, but it can also be invoked as a command-line tool.

The script used to live at http://icir.org/christian/scholar.html; I've moved it here so I can more easily manage the patches and suggestions I receive for scholar.py. Thanks, everyone, for all your interest! If you'd like to get in touch, email me at christian@icir.org or ping me on Twitter.

Cheers,
Christian

Features

  • Extracts the publication title, most relevant web link, PDF link, number of citations, number of online versions, link to Google Scholar's article cluster for the work, link to Google Scholar's cluster of all works referencing the publication, and an excerpt of the content.
  • Supports the full range of advanced query options provided by Google Scholar, such as title-only search, publication date timeframes, and inclusion/exclusion of patents and citations.
  • Supports retrieval of citation details in standard external formats as provided by Google Scholar, including BibTeX and EndNote.
  • The command-line tool outputs a CSV file containing the results from pages 1-99.

Note

Hey everyone, the original code has been changed to parse result pages 1-99 instead of just page 1. The output is now a single CSV file that compiles all results and is stored in the outputs directory of biblio_reader.
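
For readers curious what compiling paged results into one CSV file might look like, here is a minimal sketch. It is not the code in this repository: `compile_pages`, `fetch_page`, `fake_fetch`, and the column names in `FIELDS` are all hypothetical, and stopping at the first empty page is an assumption rather than documented behavior.

```python
import csv
import io

# Hypothetical column names, loosely mirroring the fields listed under Features.
FIELDS = ["title", "url", "year", "citations", "versions", "cluster_id"]


def compile_pages(fetch_page, out, first_page=1, last_page=99):
    """Write the articles from pages first_page..last_page to `out` as CSV.

    `fetch_page` is a hypothetical callable mapping a page number to a list
    of dicts. As an assumption, the loop stops at the first empty page.
    Returns the number of rows written.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    total = 0
    for page in range(first_page, last_page + 1):
        articles = fetch_page(page)
        if not articles:
            break
        for article in articles:
            writer.writerow(article)  # missing keys are written as empty cells
            total += 1
    return total


def fake_fetch(page):
    """Stand-in for a real query: two pages with one article each."""
    if page > 2:
        return []
    return [{"title": f"Paper {page}", "year": 1917, "citations": 184}]


buf = io.StringIO()
count = compile_pages(fake_fetch, buf)
lines = buf.getvalue().splitlines()
```

In a real run, `out` would be a file opened with `newline=""` rather than a `StringIO` buffer, and `fetch_page` would wrap whatever issues the query for a given result page.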

Best,
Michael

Examples

Try scholar.py --help for all available options. Note that the command-line arguments changed considerably in version 2.0! An example:

Retrieve one article written by Einstein on quantum theory:

$ scholar.py -c 1 --author "albert einstein" --phrase "quantum theory"
         Title On the quantum theory of radiation
           URL http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
          Year 1917
     Citations 184
      Versions 3
    Cluster ID 17749203648027613321
      PDF link http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
Citations list http://scholar.google.com/scholar?cites=17749203648027613321&as_sdt=2005&sciodt=0,5&hl=en
 Versions list http://scholar.google.com/scholar?cluster=17749203648027613321&hl=en&as_sdt=0,5
       Excerpt The formal similarity between the chromatic distribution curve for thermal radiation [...]
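
The output above suggests that the "Citations list" and "Versions list" links are both derived from the cluster ID. The sketch below rebuilds the two URLs from that ID. The helper names are hypothetical and not part of scholar.py, and the query parameter values (`as_sdt`, `sciodt`, `hl`) are simply copied from the example output; their meaning is not documented here.

```python
from urllib.parse import urlencode

SCHOLAR_URL = "http://scholar.google.com/scholar"


def citations_list_url(cluster_id):
    """URL of the works citing the given cluster (hypothetical helper)."""
    params = {"cites": cluster_id, "as_sdt": "2005", "sciodt": "0,5", "hl": "en"}
    # safe="," keeps the comma literal, matching the URLs shown above.
    return SCHOLAR_URL + "?" + urlencode(params, safe=",")


def versions_list_url(cluster_id):
    """URL of all versions in the given cluster (hypothetical helper)."""
    params = {"cluster": cluster_id, "hl": "en", "as_sdt": "0,5"}
    return SCHOLAR_URL + "?" + urlencode(params, safe=",")


cluster = "17749203648027613321"
cites_url = citations_list_url(cluster)
versions_url = versions_list_url(cluster)
```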

License

scholar.py is released under the standard BSD license.