[LTER-im] Monday's Watercooler, ECC checks
Margaret O'Brien
margaret.obrien at ucsb.edu
Mon Apr 3 16:50:37 PDT 2017
Hi all -
As you know, an LTER IMC watercooler is scheduled for next Monday, April
10 (3pm EDT). One of the topics will be advances planned for the ECC -
the system performing dataset checking for PASTA. This message briefly
describes what we would like to cover.
In case you have forgotten, back in 2012 an IMC working group finalized
72 checks, and ~25 of these were running when PASTA went into production
in 2013. Since then, additional checks have been implemented as
resources allowed, and others have been proposed. Fast
forward to today: EDI is up and running, and we have resources budgeted
to work on this more systematically.
The new checks to be implemented are related to specific feature
requests. These involve data integrity, and they are important to review
with you because a failed check will generate an 'error' and block
upload of the dataset.
1. checksum (2 checks, details on request):
These will confirm entity integrity during upload. The checksum can be
used later by PASTA to minimize entity duplication.
2. DOIs:
PASTA now adds package DOIs to L1 EML. This means that L0 EML should not
contain a DOI (e.g., a DOI may have been inadvertently left behind if an
EML doc was recycled). This check will prevent confusion due to
possible conflicting ids.
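
To make the two kinds of checks concrete, here is a rough sketch in Python. It is only an illustration, not the ECC implementation: the function names, the choice of SHA-256 as the default method, the loose DOI pattern, and the decision to look only in alternateIdentifier elements are all assumptions made for this example.

```python
import hashlib
import re
import xml.etree.ElementTree as ET

# Loose pattern for DOI-like strings (an assumption for this sketch,
# not the rule the ECC will actually apply).
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def checksum_matches(entity_bytes, declared_hex, method="sha256"):
    """Return True if the entity's digest equals the declared checksum.

    `declared_hex` stands in for whatever checksum accompanies the
    entity; where that value comes from is outside this sketch.
    """
    digest = hashlib.new(method, entity_bytes).hexdigest()
    return digest == declared_hex.strip().lower()


def find_dois(eml_text):
    """Return DOI-like strings found in alternateIdentifier elements."""
    root = ET.fromstring(eml_text)
    found = []
    for elem in root.iter():
        # Drop any namespace prefix so the tag compares by local name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name != "alternateIdentifier":
            continue
        # Check the element text first, then its attribute values.
        candidates = [elem.text or ""] + list(elem.attrib.values())
        for value in candidates:
            found.extend(DOI_PATTERN.findall(value))
    return found
```

In this sketch, an upload would be blocked when checksum_matches() returns False or when find_dois() returns a non-empty list for L0 EML. The watercooler will cover what the real checks examine.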
During the watercooler, we will outline specifics about the checks. As
with other PASTA improvements, checks will be developed and can be
tested on portal-d (i.e., you can pre-evaluate your trial EML), and
portal-s is reserved as the staging platform for production. For a
summary of the checker, its behavior, and results from the first few
years with LTER datasets, see this paper in the recent Ecoinformatics
special issue, DOI: 10.1016/j.ecoinf.2016.08.001
Best,
EDI team
--
-----------
Margaret O'Brien
Information Management
Marine Science Institute, UCSB
Santa Barbara, CA 93106
805-893-2071 (voice)
http://environmentaldatainitiative.org
http://sbc.marinebon.org
http://sbc.lternet.edu