[LTER-im] Fwd: EDI Date time and funding elements to go into production May 16

Margaret O'Brien margaret.obrien at ucsb.edu
Tue May 1 16:23:18 PDT 2018

Margaret O'Brien
ORCID: 0000-0002-1693-8322
Information Management
Marine Science Institute, UCSB
Santa Barbara, CA 93106
805-893-2071 (voice)

---------- Forwarded message ----------
From: EDI <info at environmentaldatainitiative.org>
Date: Tue, May 1, 2018 at 4:18 PM
Subject: EDI Date time and funding elements to go into production May 16
To: margaret.obrien at ucsb.edu

Hi all -
You will recall that when time permits, EDI works on EML data package
checks for the EML Congruence Checker (ECC), and our protocol is to
accumulate checks for semi-annual release in November and May. Recently, we
finalized three checks: two related to date-times (in dataTable
attributes), and one for an optional element in the project tree. Below is
a description of the checks we plan to release May 16. Until then, the
checks can be tested in the staging environment, https://portal-s.

For more information, contact EDI. Feel free to visit our git repository,
below, to find meeting notes and more details of these checks (under

ECC Working Group, https://github.com/EDIorg/ECC
Margaret O'Brien, Duane Costa, Sven Bohm, Stevan Earl, Jason Downing,

*Date-time checks*
Date-times are complex. As labels for points in time, they have features of
other types of measurements, and when correctly parsed, can be used in
computations (e.g., to compute duration). Users have requested the ability
to query, filter or plot data values by date and time. But before data
values can be effectively used, they must be parsed and interpreted. Many
programming languages have libraries for date-time parsing, and may also
layer on their own interpretations. The simplest solution would have been
to accept the dateTime interpretation of a single processing language, but
this was not consistent with the language-agnostic spirit of EML.

There are two parts to EML date-time checking, and consequently, two
checks, which work together.

1. examine dateTimeFormatString: EML uses a formatString in metadata to
specify how date-time values will appear in data. For this check, the
working group created a list of preferred dateTime formats,
which generally reflect ISO-accepted date-times. Code reads this list and
creates regular expressions, to which EML dateTime formatStrings can be
compared.

2. dateFormatMatches: Only date-times with preferred formatStrings can be
checked for congruence, that is, a match between the format string and
every individual data value in that column.

Both dateTime checks return a "warn" if their conditions are not met. With
this setting, non-passing data packages are still accepted, but the
submitter is aware of the potential problems. Datasets that meet ECC
standards will be the most easily reused for synthesis.
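
As a rough illustration of how the two date-time checks fit together, here
is a minimal Python sketch. It is not the ECC implementation: the format
list, the token-to-regex mapping, the function names, and the "valid"
status are all assumptions made for this example. Only the "warn" status
comes from the description above.

```python
import re

# Hypothetical subset of preferred formats; the working group's actual
# list is not reproduced here.
PREFERRED_FORMATS = ["YYYY", "YYYY-MM-DD", "YYYY-MM-DDThh:mm:ss", "hh:mm:ss"]

# Assumed mapping from format tokens to regular-expression fragments.
TOKEN_PATTERNS = {
    "YYYY": r"\d{4}",
    "MM": r"(0[1-9]|1[0-2])",
    "DD": r"(0[1-9]|[12]\d|3[01])",
    "hh": r"([01]\d|2[0-3])",
    "mm": r"[0-5]\d",
    "ss": r"[0-5]\d",
}
TOKEN_RE = re.compile("(" + "|".join(TOKEN_PATTERNS) + ")")


def format_to_regex(format_string):
    """Build a compiled regex from a format string such as 'YYYY-MM-DD'."""
    parts = []
    for piece in TOKEN_RE.split(format_string):
        if piece in TOKEN_PATTERNS:
            parts.append(TOKEN_PATTERNS[piece])
        else:
            # Separators such as '-', ':' or 'T' are matched literally.
            parts.append(re.escape(piece))
    return re.compile("".join(parts))


def check_format_string(format_string):
    """Check 1 (sketch): is the formatString on the preferred list?"""
    return "valid" if format_string in PREFERRED_FORMATS else "warn"


def check_date_format_matches(format_string, values):
    """Check 2 (sketch): does every data value match the formatString?

    Returns (status, mismatches). Non-preferred formats are not checked,
    which this sketch signals with a status of None.
    """
    if format_string not in PREFERRED_FORMATS:
        return None, []
    pattern = format_to_regex(format_string)
    mismatches = [v for v in values if not pattern.fullmatch(v)]
    return ("warn" if mismatches else "valid"), mismatches
```

For example, check_date_format_matches("YYYY-MM-DD", ["2018-05-16",
"5/16/2018"]) would report a "warn" and list "5/16/2018" as the value that
does not match the declared format.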

*Funding element check*
Increasingly, funders (e.g., NSF) are asking for datasets to be searchable
by a funding code. In EML 2.1.x, the funding element in the project tree
holds that information as unstructured text. EML 2.2 is planned to add
fields for structuring award information. Currently, the fundingPresence
check simply looks for the element's presence, although this is likely to
be revisited after the EML 2.2 release.
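
A presence test of this kind can be sketched in a few lines of Python. This
is only an illustration, not the ECC code: the function name and the
assumption that the element sits at dataset/project/funding are mine, and
the sample document is a made-up placeholder.

```python
import xml.etree.ElementTree as ET


def funding_present(eml_text):
    """Return True if the EML document has a funding element in its project tree.

    Assumes the element is found at dataset/project/funding below the root.
    """
    root = ET.fromstring(eml_text)
    # Compare against None explicitly: an Element with no children is falsy.
    return root.find("dataset/project/funding") is not None


# Made-up placeholder document for illustration only.
SAMPLE_EML = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"
    packageId="example.1.1" system="example">
  <dataset>
    <title>Example dataset</title>
    <project>
      <title>Example project</title>
      <funding><para>Placeholder funding statement</para></funding>
    </project>
  </dataset>
</eml:eml>"""

print(funding_present(SAMPLE_EML))
```

Because the funding text is unstructured in EML 2.1.x, this sketch checks
only that the element exists and does not try to interpret its content.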

*Copyright © 2018 Environmental Data Initiative, All rights reserved.*

*Our mailing address is:*
Environmental Data Initiative
Center for Limnology - University of Wisconsin
680 North Park Street
