[LTER-im-rep] Poll Question concerning LTER Metacat Content and whether it should be archived and hidden during DataONE search and discovery
Linda A Powell
powell at fiu.edu
Fri Nov 13 12:39:45 MST 2015
I’m glad we’re polling the community about this issue and hopefully we’ll be able to ‘hide’ all the metacat contributed data packages in DataONE. Thank you for the acknowledgement regarding the mistaken timing of the FCE additions. As you know (you and I discussed my making the versioning changes beforehand), it took quite a bit of work to make all those changes to the FCE files and our Oracle database.
Back in 2000 when the FCE began, the LTER network was just listing data in a DTOC that John Porter had created and there was no EML. The FCE was looking for direction from the Network on what was required in terms of data (other than making it available within two years of collection) when we first started, and the Network was at the stage where there were no data ‘Best Practices’ in place. Our example goes to show that sometimes decisions you make early on can become a real problem later as technology progresses!
Florida Coastal Everglades LTER Program
OE 148, Florida International University
Miami, Florida 33199
Phone (Tallahassee, FL): 850-745-0381
Phone(Miami,FL): 305-856-0039 or 305-348-6054
From: Margaret O'Brien <margaret.obrien at ucsb.edu>
Sent: Friday, November 13, 2015 2:19 PM
To: Linda A Powell; im-rep at lists.lternet.edu; Evelyn Gaiser
Subject: Re: [LTER-im-rep] Poll Question concerning LTER Metacat Content and whether it should be archived and hidden during DataONE search and discovery
Hi Linda and Evelyn -
I was mistaken about the timing of the FCE additions, and the effect
that had on their appearance in DataONE. They were the capped-off
datasets in Metacat that reappeared; SBC has a couple of these too, but
not as many as FCE. They are like zombies.
When it is implemented, the archiving of all LTER metacat submissions
will "hide" all the metacat-contributed data packages in DataONE, not
just the zombies. This is entirely appropriate, since we want our PASTA
contributed datasets to be what the public sees. Thanks again for
pushing that ball ahead.
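
The difference between archiving and deleting described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Package` and `Catalog` classes, the `knb-lter-xyz` identifiers, and the titles are hypothetical, and nothing here is the actual Metacat, PASTA, or DataONE API. The point is that an archived package drops out of search results while a direct lookup by identifier still resolves.

```python
from dataclasses import dataclass


@dataclass
class Package:
    pid: str               # hypothetical ID in scope.identifier.revision form
    title: str
    archived: bool = False


class Catalog:
    """Toy in-memory catalog; not a real repository client."""

    def __init__(self):
        self._by_pid = {}

    def add(self, pkg):
        self._by_pid[pkg.pid] = pkg

    def archive(self, pid):
        # Archiving flags the record; it is never removed from storage.
        self._by_pid[pid].archived = True

    def search(self, term):
        # Search and discovery skip archived packages.
        term = term.lower()
        return [
            p for p in self._by_pid.values()
            if not p.archived and term in p.title.lower()
        ]

    def resolve(self, pid):
        # A direct link by identifier still works for archived packages.
        return self._by_pid.get(pid)


catalog = Catalog()
catalog.add(Package("knb-lter-xyz.1.1", "Water quality (older Metacat copy)"))
catalog.add(Package("knb-lter-xyz.2.1", "Water quality (current copy)"))
catalog.archive("knb-lter-xyz.1.1")

visible = [p.pid for p in catalog.search("water quality")]
still_linked = catalog.resolve("knb-lter-xyz.1.1")
```

In this sketch `visible` holds only the current package, while `still_linked` is the archived record, so an existing link keeps working even though the package no longer appears in search.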
The goal of my outburst was to correct some misinformation in Inigo's
response about metacat and pasta revisioning. It's unfair to represent
those products in that way, and I stand by my statements.
Santa Barbara Coastal LTER
Marine Science Institute, UCSB
Santa Barbara, CA 93106
On 11/12/15 10:16 AM, Linda A Powell wrote:
> Hi Margaret,
> With respect, I’d like to mention that the FCE LTER Program implemented a ‘versioning’ protocol at its inception in 2000….long before the adoption of EML and LTER Best Practices for handling data. The FCE saw the importance of never overwriting data and allowing users to always be able to download the exact dataset they may have downloaded years earlier. We were always commended by our NSF review teams for the fact that we followed this protocol.
> One of the reasons we decided to ‘yield’ to the LTER Network practices in 2011 was the fact that a script couldn’t or wouldn’t be written so that only the most current version of the FCE data was displayed on the Metacat Portal (we had such a feature on the FCE Data Resource Page) and the fact that the files weren’t even ‘ordered’ when displayed on that portal. As you can imagine, users had to read a listing of versioned files to find the most recent and in some cases, it appeared in the middle of the list!
> I sat down with Mark Servilla and Duane Costa at the Santa Barbara IM meeting in 2011 to discuss us possibly making changes and doing away with our ‘versioned’ data and at NO time was there a discussion about possible issues with DataONE displaying our ‘deleted’ or ‘archived’ data. In fact, I don’t think we had started migrating the LTER data into DataONE at that point in time. Those FCE data that were deleted from Metacat were NOT incorrect in their scientific value but had to be 'deleted' because of the package ID constraints within Metacat (can't reuse IDs).
> We (FCE) realize that removing datasets is not a common occurrence but in this instance what choice did we have in order to move forward? We should have the right to delete or hide datasets from the search and discovery in Metacat and/or DataOne as we (FCE) are the owners of those data.
> Linda Powell
> Information Manager
> Florida Coastal Everglades LTER Program
> OE 148, Florida International University
> University Park
> Miami, Florida 33199
> Phone (Tallahassee, FL): 850-745-0381
> Phone(Miami,FL): 305-856-0039 or 305-348-6054
> Website: http://fcelter.fiu.edu
> From: im-rep <im-rep-bounces at lists.lternet.edu> on behalf of Margaret O'Brien <margaret.obrien at ucsb.edu>
> Sent: Thursday, November 12, 2015 12:44 PM
> To: im-rep at lists.lternet.edu; Evelyn Gaiser
> Subject: Re: [LTER-im-rep] Poll Question concerning LTER Metacat Content and whether it should be archived and hidden during DataONE search and discovery
> Hi Inigo -
> Some of your comments indicate a basic lack of understanding of past and
> current network systems. I'd like to correct a few of your points here.
> Please see comments inline.
> Thanks -
> im-rep at lists.lternet.edu
> On 11/12/15 8:50 AM, Inigo San Gil wrote:
>> Deleted (metacat) metadata should not have been exposed to dataOne in
>> the first place. I am surprised, even if they call those data
>> 'archived'. There are old versions, and then there are deprecated,
>> deleted data. Any serious IMS can make that distinction. It is good to
>> see metacat is gone, for that, and many other reasons. (it is not
>> really gone, though.)
> Metacat was perfectly capable of distinguishing between revisions. FCE
> had an ad hoc definition of "data set revision" which was inconsistent
> with what the rest of the network was doing. Unfortunately, their
> practice didn't get aligned with the rest of the network till after the
> first DataONE submission. That's why FCE datasets appear the way they do.
>> Also, data that is erased at origin (@ FCE, due to whatever reason),
>> should be deleted at the public repositories -not sure how-.
> I disagree. Data that are in repositories should have undergone
> sufficient review to be appropriate for the public before they were
> submitted. There exists a manual process for removal (for rare cases).
>> Due to shortcomings with the repository systems we have been using,
>> this deletion has been a headache. Always.
> Removing datasets is not a common occurrence, so should not be a trivial
>> At best, we were forced by default to retire our numerical
>> identifiers, this has been quite irritating. At worst, unwanted, bad
>> data remains public through these repos.
> Identifiers, if they were carefully assigned originally and in a robust
> system, were not retired. See any SBC dataset in PASTA, and you will see
> that the same identifier was used for previous revisions in Metacat. I
> know this is true for at least half the sites as well (GCE, MCR, HFR,
> VCR, AND, NTL among others).
>> There is a silver lining to all this, old versions are just that, old
>> versions. Anybody that is able to get their hands on an old version of
>> data ought to realize that this is not the most current version.
> This is true, and has always been how our systems work (metacat and pasta).
>> I would ask dataOne to erase that content if you are concerned.
> It will be archived (not erased), so that if someone has a link, it will
> still work
>> Cheers, Inigo
>> On 11/12/2015 8:58 AM, Linda A Powell wrote:
>>> Dear Information Managers,
>>> As some of you may know, the FCE LTER program discontinued its
>>> practice of file versioning where each updated data file would be
>>> given a different file name (.v1, .v2, etc.) and a new EML package
>>> ID. We initially had 525 data files that got combined into new
>>> data files so our FCE data count decreased to 125 data files. I
>>> started the EML packaging ID numbers for the newly combined data
>>> files at knb-lter-fce.1050 (well beyond the last package ID
>>> knb-lter-fce.525) and I personally deleted all the old versioned data
>>> from the LTER Metacat. I then added the 125 new files back into the
>>> Metacat and PASTA.
>>> Unfortunately, those files were never really deleted from Metacat,
>>> only archived, and when the Metacat files were harvested into
>>> DataONE, *ALL* my files, including those I thought were ‘deleted’ and
>>> the existing, were uploaded. Now the FCE has a big mess! The old
>>> 'deleted' files are listed but none of the files exist any longer so
>>> the links to the data don’t work. I’m sure the DataONE users are
>>> frustrated! There may be Metacat files that other IMs have thought
>>> were deleted that are also showing up in DataONE.
>>> */My question to the LTER IMs is whether the content that existed in
>>> the LTER Metacat should be archived and made hidden during search and
>>> discovery from the DataONE infrastructure (i.e. ONEMercury, CN API,
>>> etc.)? /* I've asked Mark Servilla to help with this issue and we
>>> thought it would be simplest and scale economically if he could
>>> perform this operation at one time and for all LTER site content as
>>> opposed to performing this operation for each site independently. Of
>>> course we want input from the IMC before we move forward.
>>> *I’ve created a Doodle Poll (https://urldefense.proofpoint.com/v2/url?u=http-3A__doodle.com_poll_uw87khqqyhhsiry5&d=AwIGaQ&c=1QsCMERiq7JOmEnKpsSyjg&r=rMjlJunaBgd7uJf-sYlFRA&m=RVv8KlYGFNDj9xLDYUDq5stTIDQI2iy0uF-L1UhVBnQ&s=aiyi-APYh7qQ_iTwA0ej-Yjcuy4vQAZ3gyq2n63drl0&e= )
>>> and would appreciate input from EACH of the LTER site IMs as to
>>> whether the content that existed in the LTER Metacat should be
>>> archived and made hidden during search and discovery from the DataONE
>>> infrastructure? Please select ‘Yes’ or ‘No’. *
>>> Thank you in advance for your participation!
>>> Kindest Regards,
>>> Linda Powell
>>> Information Manager
>>> Florida Coastal Everglades LTER Program
>>> OE 148, Florida International University
>>> University Park
>>> Miami, Florida 33199
>>> Phone (Tallahassee, FL): 850-745-0381
>>> Phone(Miami,FL): 305-856-0039 or 305-348-6054
>>> Website: http://fcelter.fiu.edu
>>> Long Term Ecological Research Network
>>> im-rep mailing list
>>> im-rep at lternet.edu
> Margaret O'Brien
> Information Management
> Santa Barbara Coastal LTER
> Marine Science Institute, UCSB
> Santa Barbara, CA 93106
> 805-893-2071 (voice)