[LTER-im] PASTA dataset download tallying tools

Ken Ramsey kramsey at jornada-vmail.nmsu.edu
Tue Feb 14 13:02:07 PST 2017


Hi John,

Thanks!

Ken


>>> John Porter <jhp7e at eservices.virginia.edu> 2017-02-14 01:58 PM >>>
During the VTC yesterday, several folks expressed interest in code to
tally dataset and metadata downloads of data in PASTA.  PASTA keeps
excellent logs, but it is up to us to do the desired aggregations.

https://github.com/lter/VCR 

has several Python programs that may be of help.

PastaUseCountBasic.py (attached) writes a CSV file to standard output
with the columns Scope, Identifier, Revision, Title, Entity,
DownloadCount, StartDate, EndDate for each entity downloaded during a
specified time period.
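
As a rough illustration of working with that output, the sketch below
sums DownloadCount per data package. It assumes the CSV has a header
row using exactly the column names listed above, which I have not
confirmed here; the function name tally_by_package and the sample rows
are made up for the example.

```python
import csv
import io
from collections import Counter


def tally_by_package(csv_text):
    """Sum DownloadCount per (Scope, Identifier, Revision).

    Assumes a header row with the column names listed in the message.
    """
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["Scope"], row["Identifier"], row["Revision"])
        totals[key] += int(row["DownloadCount"])
    return totals


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = (
        "Scope,Identifier,Revision,Title,Entity,DownloadCount,StartDate,EndDate\n"
        "knb-lter-xyz,1,2,Example title,table_a,3,2017-01-01,2017-02-14\n"
        "knb-lter-xyz,1,2,Example title,table_b,4,2017-01-01,2017-02-14\n"
    )
    for package, count in tally_by_package(sample).items():
        print(package, count)
```

If the real output has no header row, csv.DictReader can be given the
column names through its fieldnames argument instead.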

Some notes:

The program produces output to STDOUT based on command line options.  A
typical command line might be:
 python ./PastaUseCountBasic.py --fromdate 2017-01-01 --todate 2017-02-14 knb-lter-jrn > jrn_2016.csv
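
For anyone adapting the idea, a command line of that shape can be parsed
with argparse roughly as follows. This is a hypothetical sketch that
mirrors the options in the example, not the code in
PastaUseCountBasic.py; build_parser and the help strings are my own
assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the example command line."""
    parser = argparse.ArgumentParser(
        description="Tally PASTA downloads (illustrative sketch only)")
    parser.add_argument("--fromdate", help="start date, YYYY-MM-DD")
    parser.add_argument("--todate", help="end date, YYYY-MM-DD")
    parser.add_argument("scope", help="package scope, e.g. knb-lter-jrn")
    return parser


if __name__ == "__main__":
    # Parse the same arguments as the example command line.
    args = build_parser().parse_args(
        ["--fromdate", "2017-01-01", "--todate", "2017-02-14", "knb-lter-jrn"])
    print(args.fromdate, args.todate, args.scope)
```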

The program is NOT particularly fast, because of the large number of web
service calls required and the latency of PASTA processing. Shorter time
periods are processed faster than longer ones because fewer log entries
need to be retrieved.
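
If you prefer to run several shorter periods rather than one long one,
the sketch below splits a date range into consecutive windows and prints
the matching options. It assumes both dates are treated as inclusive,
which I have not checked against the program; split_range is a made-up
helper name. Counts from separate runs would still need to be added
together afterwards.

```python
from datetime import date, timedelta


def split_range(start, end, days=7):
    """Yield (window_start, window_end) pairs covering start..end inclusive."""
    step = timedelta(days=days)
    current = start
    while current <= end:
        # The last window is clipped so it never runs past the end date.
        window_end = min(current + step - timedelta(days=1), end)
        yield current, window_end
        current = window_end + timedelta(days=1)


if __name__ == "__main__":
    for a, b in split_range(date(2017, 1, 1), date(2017, 2, 14), days=14):
        print("--fromdate", a.isoformat(), "--todate", b.isoformat())
```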

The program uses a number of modules (listed at the top in the import
statements) that need to be installed prior to running.

It requires an authorized login to access the needed records. It prompts
you for the credentials, or you can set up an "authorization file" that
eliminates the need to log in manually. Contact me for details....


-- 
John H. Porter
Dept. of Environmental Sciences
University of Virginia
291 McCormick Road
PO Box 400123
Charlottesville, VA 22904-4123
ORCID: http://orcid.org/0000-0003-3118-5784 



