[LTER-im] Water Cooler today (4/10/2017) at noon pacific, 1pm mountain, 2pm central, 3pm eastern

Yang Xia yangx at ksu.edu
Mon Apr 10 07:28:57 PDT 2017


Hello All,
Just a quick reminder that we will have our Water Cooler today at noon pacific, 1pm mountain, 2pm central, 3pm eastern.  Our main topic will be the Quality Checker for PASTA.  

Connection info: https://ucsb.zoom.us/j/322175707
Or iPhone one-tap (US Toll):  +16465588656,322175707# or +14086380968,322175707#
Or Telephone: Dial: +1 646 558 8656 (US Toll) or +1 408 638 0968 (US Toll)
 Meeting ID: 322 175 707

The following is the message Margaret sent out on April 3 for your brainstorming. Hope to see you all. Thanks!
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hi all -
As you know, an LTER IMC watercooler is scheduled for next Monday, April 
10 (3pm EDT). One of the topics will be advances planned for the ECC - 
the system performing dataset checking for PASTA. This msg is a short 
description of what we would like to cover.

In case you have forgotten, back in 2012 an IMC working group finalized 
72 checks, and ~25 of these were running when PASTA went into production 
in 2013. Since then, additional checks have been implemented as 
resources allowed, and other checks have been proposed. Fast 
forward to today: EDI is up and running, and we have resources budgeted 
to work on this more systematically.

The new checks to be implemented are related to specific feature 
requests. These involve data integrity, and they are important to review 
with you because failure will generate an 'error' and block upload of 
the dataset.

1. checksum (2 checks, details on request):
These will confirm entity integrity during upload. The checksum can be 
used later by PASTA to minimize entity duplication.

2. DOIs:
PASTA now adds package DOIs to L1 EML. This means that L0 EML should not 
contain a DOI (e.g., a DOI may have been inadvertently left behind if an 
EML doc was recycled).  This check will prevent confusion due to 
possible conflicting ids.
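
For anyone who wants to pre-screen their own packages, the two checks 
above can be roughly sketched in Python. This is only an illustration 
under stated assumptions, not the ECC implementation: the function names 
are hypothetical, the hash method is a parameter because the message 
does not say which algorithm PASTA uses, and the DOI scan assumes the 
stray DOI sits in an alternateIdentifier element.

```python
import hashlib
import re
import xml.etree.ElementTree as ET
from typing import List

# Assumption: a DOI looks like "10.<registrant>/<suffix>".
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+")


def entity_checksum(data: bytes, method: str = "sha256") -> str:
    """Return the hex digest of an entity's bytes (hypothetical helper)."""
    return hashlib.new(method, data).hexdigest()


def checksum_matches(data: bytes, declared: str, method: str = "sha256") -> bool:
    """Compare the computed digest with the value declared in the metadata."""
    return entity_checksum(data, method) == declared.strip().lower()


def find_dois(eml_text: str) -> List[str]:
    """List DOIs found in alternateIdentifier elements of an EML document.

    Assumption: only alternateIdentifier is scanned, so DOIs that appear
    legitimately elsewhere (e.g., in citations) are not reported.
    """
    root = ET.fromstring(eml_text)
    found = []
    for elem in root.iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if tag == "alternateIdentifier" and elem.text:
            match = DOI_PATTERN.search(elem.text)
            if match:
                found.append(match.group(0))
    return found
```

In this sketch, a False result from checksum_matches, or a non-empty 
list from find_dois on L0 EML, corresponds to the 'error' condition 
that would block upload.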

During the watercooler, we will outline specifics about the checks. As 
with other PASTA improvements, checks will be developed and can be 
tested on portal-d (i.e., you can pre-evaluate your trial EML), and 
portal-s is reserved as the staging platform for production. For a 
summary of the checker, its behavior, and results from the first few 
years with LTER datasets, see this paper in the recent Ecoinformatics 
special issue, DOI: 10.1016/j.ecoinf.2016.08.001

Best,
EDI team



