[LTER-im-rep] Non tabular data sources in DEIMS

Inigo San Gil isangil at lternet.edu
Fri Sep 11 14:04:56 MDT 2015


Sometimes we manage data that lives in things other than spreadsheets. 
In genomics, for example, we may have FASTA files. In metagenomics, we 
may have massive SFF files. Bioassay projects are another case.

Sometimes it is much simpler than that: our data is a photo, a gallery 
of photos, or a massive collection of photos. A map, or a 
two-dimensional (or n-dimensional) array. A graph. A shapefile. A zipped 
bundle of GIS dataset assets. A rather unstructured (hopefully not) Word 
document. A punch card (ask Luquillo). An input parameter file for 
modelling software. Output files of such modelling software. A MATLAB 
file. A JSON-encoded collection, or a MongoDB instance capturing 
streaming data pairs. I could go on, but you get it.

All these resources are bona fide examples of data, yet in DEIMS it is 
not all that clear how to handle them, as the DEIMS forms smell like 
EML. We are going to clarify a path forward now.

A simple way to address those data in DEIMS would be to just upload the 
file as is. For that, you may have to relax the types of files that you 
can upload or attach a bit (by default, "only" these types can get in: 
txt xls xlsx csv fhx rtf xml doc docx dbf zip).
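
To illustrate what relaxing that list amounts to, here is a minimal 
Python sketch. It is not DEIMS code: DEIMS is a Drupal site, and the 
real change is made in the site's file upload settings. The default 
list is copied from the message above; the function name is_allowed 
and the extra extensions are made up for the example.

```python
from pathlib import Path

# Default extensions quoted in the message above.
DEFAULT_EXTENSIONS = "txt xls xlsx csv fhx rtf xml doc docx dbf zip".split()


def is_allowed(filename, allowed):
    """Return True if the file's final extension is in the allowed list."""
    ext = Path(filename).suffix.lower().lstrip(".")
    return ext in {a.lower() for a in allowed}


# Hypothetical relaxed list: the defaults plus a few non-tabular formats.
relaxed = DEFAULT_EXTENSIONS + ["fasta", "sff", "mat", "json", "tif"]

print(is_allowed("reads.sff", DEFAULT_EXTENSIONS))  # False
print(is_allowed("reads.sff", relaxed))             # True
```

Which extensions to add is a site-level decision; the point is only 
that non-tabular files are rejected until their extension is listed.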

Since the "variables" and other metadata will not apply in most of those 
cases, just ignore those CSV-centric metadata fields. In practice, a 
good description of the data, the file name, and the actual file will 
do. Concerned about EML? Don't be. The latest commit 
<https://github.com/lter/deims/commit/c642103f5973ae7810094978ff44dd57f3701309> 
of the EML data-source template will ensure that valid EML is produced 
(using the otherEntity branch). If you have an existing DEIMS, this is 
an easy update: all you need to do is replace the current template (in 
profiles/deims/modules/custom/eml/templates) with the one called 
eml--node--data-source.tpl.php that you find on GitHub 
<https://raw.githubusercontent.com/lter/deims/7.x-1.x/modules/custom/eml/templates/eml--node--data-source.tpl.php>.
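
For those who have not looked at the otherEntity branch of EML, the 
following Python sketch builds a bare-bones otherEntity element for a 
single non-tabular file. It is not the output of the DEIMS template, 
only an illustration. The element names and their order follow my 
reading of the EML 2.1 schema, the function name other_entity is 
hypothetical, and all values are placeholders. A real record would 
also carry distribution information, which is left out here.

```python
import xml.etree.ElementTree as ET


def other_entity(name, description, object_name, format_name, entity_type):
    """Build a bare-bones EML otherEntity element for one non-tabular file."""
    entity = ET.Element("otherEntity")
    ET.SubElement(entity, "entityName").text = name
    ET.SubElement(entity, "entityDescription").text = description

    # The physical block names the file and says what format it is in.
    physical = ET.SubElement(entity, "physical")
    ET.SubElement(physical, "objectName").text = object_name
    data_format = ET.SubElement(physical, "dataFormat")
    external = ET.SubElement(data_format, "externallyDefinedFormat")
    ET.SubElement(external, "formatName").text = format_name

    # entityType is added last, after the descriptive and physical parts.
    ET.SubElement(entity, "entityType").text = entity_type
    return entity


# All values below are made-up placeholders.
element = other_entity(
    name="Soil metagenome reads",
    description="Sequences from the example soil survey",
    object_name="reads.fasta",
    format_name="FASTA",
    entity_type="Genomic sequence file",
)
print(ET.tostring(element, encoding="unicode"))
```

Note that there is no attribute ("variables") list here at all, which 
is why the CSV-centric fields can be left empty for this kind of data.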

Of course, this is not the end of what DEIMS can do with data that is 
not a CSV. What you do with ESRI GIS data is, hopefully, fire up some 
Arc software to play with it. If you use SFFs out of a Titanium (454) 
machine, hopefully you are using specialized software and workflows to 
do basic processing (de-noising) or advanced analysis (using MG-RAST 
<http://metagenomics.anl.gov/>, for example). If we are dealing with 
MATLAB, please use MATLAB. Nothing shocking. However, sometimes we can 
offer exploratory tools for some specialized, non-tabular data. Let's 
work with the researchers to find good data-exploration functionality 
that may be suitable to offer through DEIMS. For example, fluorescence 
data (two-dimensional contour plots, linked to their several data 
precursors, and the indexes derived from the raw matrices): how do we 
display this so as to make the visitor's experience a satisfactory one? 
Archives of historical images. Videos. In those cases, we may not be 
able to do anything with EML, but we certainly can make those data 
useful to the public too.

Inigo