[LTER-im] PASTA+ access audit database downsizing...

Mark Servilla mark.servilla at gmail.com
Tue Oct 9 09:08:03 PDT 2018


Dear IM,


Due to the voluminous size of the PASTA+ access audit database (42 million
records) and its poor response time when queried, we will perform the
following actions on Wednesday 10 October 2018 during our weekly
maintenance window:


1. We will back up the entire database to an offline location for
preservation.

2. We will write out the entire access audit table to an accessible text
file (for asynchronous queries).

3. We will delete all records dated before 2 May 2018 from the active
access audit table.
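
For anyone who wants to picture the three steps above, here is a minimal
sketch. It is not the procedure we will actually run: it uses SQLite from
the Python standard library as a stand-in, and the table name
`access_audit`, the column name `entry_time` (assumed to hold ISO 8601
text timestamps), and the function name `downsize_audit` are all
assumptions made for illustration.

```python
import csv
import sqlite3

# Assumed cutoff: records dated before this day are removed (step 3).
CUTOFF = "2018-05-02"


def downsize_audit(conn, backup_path, export_path, cutoff=CUTOFF):
    """Back up, export, then prune a hypothetical access_audit table.

    Returns the number of rows deleted from the active table.
    """
    # Step 1: copy the entire database to a separate file for preservation.
    backup = sqlite3.connect(backup_path)
    conn.backup(backup)
    backup.close()

    # Step 2: write the whole audit table to a text (CSV) file.
    cur = conn.execute("SELECT * FROM access_audit ORDER BY entry_time")
    with open(export_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow([col[0] for col in cur.description])  # header row
        writer.writerows(cur)

    # Step 3: delete rows dated before the cutoff from the active table.
    # ISO 8601 text timestamps sort chronologically as plain strings.
    with conn:
        deleted = conn.execute(
            "DELETE FROM access_audit WHERE entry_time < ?", (cutoff,)
        ).rowcount
    return deleted
```

The order matters in this sketch: the backup and the export both complete
before any rows are deleted, so nothing is lost if the final step needs to
be revisited.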


Why 2 May 2018, you may ask? In our first five-plus years of operation, we
did not effectively distinguish search engine robots and crawlers from
regular users when recording access events in the access audit table.
This changed with the introduction of a new detection algorithm. As of 2 May
2018, we have high confidence that search engine robots and crawlers are
being distinguished from regular users. This means that the data presented
through the PASTA+ access audit database from 2 May 2018 onward
contains useful and actionable information. We apologize that data prior to
2 May 2018 does not meet a useful level of quality and should be considered
suspect when analyzing access and download events.


If you would like a January 2013 to 2 May 2018 data slice from the access
audit table for your site’s use (based on the scope value), please let us
know at any point in the future.
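
As a rough illustration of what such a slice amounts to, the sketch below
filters an exported text file by scope and date range. The CSV layout, the
column names `scope` and `entry_time`, and the function name
`slice_by_scope` are assumptions for illustration only and may not match
the actual export; the end date is treated as exclusive to line up with
the 2 May 2018 cutoff.

```python
import csv


def slice_by_scope(export_path, out_path, scope,
                   start="2013-01-01", end="2018-05-02"):
    """Copy rows for one scope with start <= entry_time < end.

    Assumes a CSV export with hypothetical 'scope' and 'entry_time'
    columns, where entry_time is an ISO 8601 text timestamp.
    Returns the number of rows written.
    """
    kept = 0
    with open(export_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            # String comparison is chronological for ISO 8601 timestamps.
            if row["scope"] == scope and start <= row["entry_time"] < end:
                writer.writerow(row)
                kept += 1
    return kept
```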


Finally, let us know immediately if you believe this action will adversely
affect your operations. We will do our best to serve our community and keep
PASTA+ running smoothly.


Sincerely,

Mark


---
Mark Servilla
mark.servilla at gmail.com