<div dir="ltr">Hi Don,<div><br></div><div>Regarding your request for "EDI to devise a way to filter bot requests from download reports", we recently changed how request information is captured so that the audit report output now identifies robots. For example,</div><div><br></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">robot: Mozilla/5.0 (compatible; Googlebot/2.1; +<a href="http://www.google.com/bot.html">http://www.google.com/bot.html</a>)</blockquote><div><br></div><div>Any request that we believe originates from a robot is prefixed with "robot:" and recorded that way in the "user" column of the audit log (in lieu of "public"). Duane implemented this identification using the robot suspect list provided by the COUNTER project (<a href="https://www.projectcounter.org/" target="_blank" rel="noopener" tabindex="-1" style="margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-stretch:inherit;line-height:inherit;font-family:open-sans,emojifontface,helvetica,arial,sans-serif;font-size:12px;vertical-align:baseline;outline:none;color:rgb(9,87,164)">https://www.projectcounter.org/</a>), which is updated regularly (about every 2-6 months). The feature was released this past Wednesday evening (12 April) during our weekly updates. We are still fine-tuning the list to avoid false positives, but it is now functioning in the production PASTA environment. I realize that this process does not technically filter out robot requests, but we hope it will suffice to identify them in the audit logs.</div><div><br></div><div>Sincerely,</div><div>Mark</div><div class="gmail_extra"><div><div class="gmail_signature"><br>---<br>Mark Servilla<br><a href="mailto:mark.servilla@gmail.com" target="_blank">mark.servilla@gmail.com</a></div></div>
<br><div class="gmail_quote">On Fri, Apr 14, 2017 at 2:43 PM, Henshaw, Donald <span dir="ltr">&lt;<a href="mailto:don.henshaw@oregonstate.edu" target="_blank">don.henshaw@oregonstate.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div class="gmail-m_8822082104308898435WordSection1">
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">We have been tracking data downloads for each data set from our webpage for over 15 years and have included this information in LTER and USFS PNW annual reports and NSF proposals.
While we have heard little feedback from NSF regarding their perspective on this, we feel this is valuable information. To better account for downloads of Andrews data, we also hope to track downloads from PASTA and DataONE. We are concerned that the download
counts from PASTA include web robot requests along with genuine downloads.</span></p>
<p class="gmail-m_8822082104308898435MsoListParagraph" style="margin-left:51pt">
<span style="font-size:11pt;font-family:symbol">·</span> <span style="font-size:11pt;font-family:calibri,sans-serif">We encourage EDI to devise a way to filter bot requests from download reports</span></p>
</div>
</div>
<br>______________________________<wbr>_________________<br>
Long Term Ecological Research Network<br>
im mailing list<br>
<a href="mailto:im@lternet.edu">im@lternet.edu</a><br>
<br>
<br></blockquote></div><br></div></div></div>