[LTER-im-rep] distributed model -- but what distribution?

Thu Jul 2 22:18:25 MDT 2015

Hi John,

Thanks for your kind effort.  In the three points that you touch upon, I 
see something in 1) and 3).

Perhaps I see things differently.  Is centralization something we need to 
stay away from?  My impression is that we can be more effective by 
coordinating and harmonizing better our efforts, energies and abilities. 
I.e., efforts towards common (central?) network goals in a distributed 
environment. The interesting paradigm is that we work towards 
homogenization while respecting the intrinsic distributed character of 
the group. Just fostering an even loosely connected federation seems 
silly, against the tighter network spirit.

I am sorry you feel the process is somewhat tedious. It is definitely 
not an easy process. One way you may find some excitement in the boredom 
is keeping an eye on what could happen if the final outcome is good.  
The way I see it, we need to flesh out the model more to actually 
present something that sounds effective, distributed but network 
oriented.  You may disagree, but as it stands, most of what I see is 
more of the same.  And this may be OK, but beyond the comfort of the 
situation we did spot some obvious, correctable deficiencies 
(repetition, redundancies). If we do not call them out, somebody else 
will do it for us, and then a good chance to re-think the network data 
center will be lost.  Bottom line, here is the opportunity to make a 
better network by providing a great data center - sorry if the process 
(of actually finding the right proposal) feels tedious to you, but not 
all of us see this as clearly as you may yet.

The proposed model is rather similar to what is in place, so we may 
find the same or similar impediments that the current model 
encountered. At times it is a matter of labels, but the substance seems 
the same, for good (and yes, there is a lot of good) and bad.  You may 
say, what's the problem with that? We are good, right? Yes, but we can 
be better, and there is much room for improvement. Here are two things: 
A) We can use more common approaches to information management, thus 
leveraging our common knowledge.  We do that, right? Yes, but this 
time, how do we do this for real (some commitments, without so many 
sticks prodding each other)? B) We need to examine what did not work 
(example: the IM compensation for network time) and avoid the same 
unexpected outcomes in the proposed plans.

In your point 1) we discuss prioritization, what gets done, and who 
directs the LTER Data Center.  You tell us that this new model 
reflects the distributed characteristic because prioritization and 
scheduling would be distributed in the sense that the folks serving on 
the Gov. Committee are indeed distributed and rotating. I can eat that, 
but do tell me how that is more efficient than the same distributed 
aspects of the process we have in place (NISAC, EB, SC, Bob, NSF). The 
key aspects of prioritization at a high level may not be as important 
as the involvement at the very detailed level. So, I see this as a bit 
of a distraction -- perhaps I would like to see articulated how the 
larger community is going to be vested in the project.  Perhaps we can 
agree that we can have a more enthusiastic and supportive steering, and 
that the real point regarding 1) is that the Gov. Committee is not 
going to be "it"; it is the drivers of the work, those who pay 
attention to how real innovation is done, who may feel the excitement 
(if history is an indication).

Here are some details that I draw from experience of what I have seen in 
our own network.

And here is my larger concern.  I hear figures for the budget of 
between 700k and 1mil per year (opinions abound). That is not much IMO.  
The actual team that will make LTER shine in the IM aspects cannot be 
"two programmers and the occasional IM with a Project Manager that may 
or may not be a PI" (opinions abound about the qualifications of such a 
PM).  The real issue is that the whole team should include Project 
Manager(s), System Administrators, Content Strategists, Database 
Administrators, Designers (Web and otherwise), Programmers of several 
flavors, Tech writers, Data Curators, Data Custodians, Outreach and 
liaison folks - a lobbyist in a role such as Brian Wee's would be fine 
too. I am leaving out some crucial roles, forgive me. Obviously 700k 
will not do it.  Some misinformed folks may believe that "this can be 
done with a couple of graduate students", but those of us who can grasp 
the possibilities know that one very good programmer and a good leader 
will not produce it all, and will leave many things undone.  Being 
understaffed and under-equipped may also affect morale, which is a 
compounding problem.

But I should get real.  A team of 15? No money.  Well, perhaps not, but 
first, we need to make clear that 2 programmers will do the level of 
work that we have seen (at best).  There is a way we have been 
operating: we formed teams of 15 before amongst ourselves (mixed with 
LNO sometimes).  Often, the lion's share of that team's work was 
carried by one to three persons, with occasional quality involvement 
from others, and a few, well, spectators (everybody plays a role!) who 
struggled to find a footing for whichever circumstance.  But cool 
projects got off the ground, some even without sticks (do EML or else), 
and with little support.  Other times, real involvement happened with 
all members. I think 15 is not too much to ask, as we have done it, but 
we can and should improve vastly to take the data center to a really 
brilliant position in which we all feel ownership.  I think the IMC can 
work like that, but without us making the commitment, change, or 
adaptation, it will not happen.  We are in front of a very pervasive 
problem; we may not even detect it, as we may convince ourselves that 
we have the right plan for it.  Inertia creeps in, and we will default 
to what we have been doing for the last 10, 20 or 30 years, which is 
not bad, but we can and should do much better. At stake is the rare 
opportunity of taking stock of what we did wrong, and building upon 
that knowledge to make LTER shine through what always made it feel like 
LTER may actually be a network: the IMs.

Here is what I would like to see, and I will contribute to that end. 
For starters, flesh out how the IMC is really going to pad the 
projected budget shortfall.  There is the first item that may convince 
me we can do this.  Also, I would communicate and explain why we really 
need all the roles in a data center (and not all in one person). The 
last is fixing the problem with the financial model.  If we were 
discouraged from working for the network, even when $ was around, let 
us identify the problems and find the model that may work best.  And 
well, if this turns out to be "supplements", well, supplements it is.

Cheers,
Inigo

On 7/2/2015 4:31 PM, John Porter wrote:
> Inigo,
>
> A few quick comments. Many of these issues have been discussed at length
> (some to the point of tedium) during the discussions of the groups
> meeting (virtually) each Friday. I'll run through a few of them and try
> to characterize why, even though there are centralized elements, the
> model truly is distributed.
>
> 1) Who should make decisions about what projects money will be spent on?
> Current model: Bob Waide is the ultimate arbiter regarding how LNO funds
> are spent. He gets input from the EB, SC, NISAC and IMC, but ultimately
> he's the one who needs to make sure the budgets balance.
>
> Proposed model: The current governance plans call for a Governance
> Committee composed of LTER IM's and PIs. Although it is one committee,
> it has distributed membership and will be taking input from the entire
> IM and PI community. I'm hard pressed to think of a more distributed way
> to select priorities for the network.  I suppose we could go with a "Town
> Meeting" approach where all the LTER IMs are required to consider and
> vote on each issue that arises, but it is not clear that every IM wants
> to spend a very large block of time each month doing this. The
> expectation is that governance committee members will need to devote
> significant amounts of time to this effort, and therefore, will rotate
> frequently.
>
> There is a Project Manager whose job it is to support and implement the
> decisions of the Governance Committee. The thinking is that we need
> someone who can devote full time and concentration to tracking the
> progress of individual initiatives, research budget alternatives to be
> presented to the governance board, manage fiscal details and prepare
> materials for NSF reports. Although this role COULD be performed by the
> Governance Committee if they dropped all other activities, it seems to
> make sense to have someone who can wholly concentrate on making sure
> that initiatives move forward. However, the Project Manager does not
> have decision making authority with respect to the major decisions
> needed to support development of LTER systems.
>
> 2) Where should the computational hardware required be housed?
> Current model: The LNO provides its own servers, at its central location.
>
> Proposed model: There is still wide discussion regarding the use of
> commercial cloud services vs. contracting with a particular university
> or company for providing storage, computation and network resources
> needed to support LTER Network databases.  However, it is not currently
> contemplated that the hardware will be physically associated with the
> governance committee or the Project Manager. Although it might be
> possible to implement a widely distributed model (e.g., each LTER site
> runs one or more servers supporting one or more network databases),
> there doesn't seem to be much of a motivation for doing so. I very
> seldom am in the same room or building as servers we use - so moving
> them further away or to the cloud is no problem.
>
> 3) Who should do the actual development of LTER Network Systems?
> Current Model: IM's at the LNO do most of the work. As you noted and
> lauded, occasionally, LTER site IMs are contracted to work on specific
> projects, but this proved difficult because it involved paying double
> overhead (once to UNM, again to the IM's home institution).
>
> Proposed Model: As a group, we would like to see much broader
> participation of LTER site IM's in implementing network-wide solutions
> and have been wrestling with possible solutions (independent contracts,
> interagency personnel agreements) that would allow funding to flow
> without the duplicate overhead problem. We don't want to depend on  work
> diverted from site efforts. A principle has been that people working on
> network projects should be funded to do so, independent of funding from
> their LTER site.  If we can get around the overhead issue, it would
> allow personnel from small groups of sites to receive funding to develop
> tools to meet network priorities. However, there are some tools and
> systems (e.g., PASTA, network database server administration) that may
> require dedicated staff that would not be associated with LTER sites.
> Additionally, there might be some projects that might demand in-depth
> expertise in a particular software stack (e.g., Palantir and DEIMS) that
> does not currently exist at any LTER site. It is not anticipated that
> there would be a large, permanent staff.
>
> I could go on about how the fiscal administration is likely to be
> physically disassociated from all the other parts, and from any other
> entity that has its own priorities independent of LTER, but I think you
> get the point. Decisions: Made by a committee with members drawn from
> the LTER Network; Computer Systems: Cloud or contracted for separate
> from the institution administering the grant; Work: Distributed among
> LTER IM's if appropriate, or use of dedicated or outside staff, if not.
>
> I think you'd have to agree that that sounds more like a distributed
> system than a centralized one! I'll admit that we haven't really
> seriously discussed the ultimate in distributed systems: giving each
> individual LTER site identical amounts of money to work on running and
> improving needed network-wide databases and systems. However, it doesn't
> take too much thinking to realize that, given limited funding, we need
> some way to jointly identify and prioritize activities - and that a
> fully distributed model provides no clear way to do that.
>
> Hope this helps!
>
>   -John Porter
>
> On 7/2/2015 5:34 PM, Inigo San Gil wrote:
>> Dear IMs,
>>
>> Is there a bit of a lull on the DDMS, or perhaps I'm in the dark - I
>> didn't hear about VTCs, or anything.  Well, I keep mulling these ideas,
>> met with Bob Waide (to get feedback), talked to my PI, and will talk some
>> more. It is an interesting opportunity for the LTER to be mum, guys.
>>
>> We are working towards what is being coined "LTER Distributed IM model".
>>
>> As I read through the folders (some documents disappear..), and read the
>> paperwork thus far, I wonder whether a better name would be "LTER Data
>> Center", and not really an "LTER Distributed IM model".  The reason is I
>> fail to identify the "distributed" part of the equation in the "model".
>>
>> Could you please identify for me what parts are "distributed"? I fail to
>> see those clearly.  I see governance schemes, with elaborate diagrams. I
>> see a financial model and I see a service bucket of tasks (I wrote most of
>> these tasks). We had budgeted a few items for the eventuality of hosting
>> solutions in the Cloud (I would stay away from becoming a Guinea pig -
>> re: NSF Dear letter).
>>
>> But in my mind, the LTER IM Distributed model has always played out as IMs
>> working in a coordinated fashion to leverage our collective strengths,
>> hence mitigating our individual gaps.
>>
>> I do not see that vision explicitly in the work thus far.  For me, the
>> "LTER Distributed IM model" may entail a profound change in LTER and
>> specifically, a change in the way we operate.  See, thus far, we are
>> site-centric, and network later (if at all).  Sure, we have EML medal,
>> but it sounds more like the Euro for Europe in terms of integration/
>> distributed (bad analogy, unless you are a Greek, then you may know what
>> I mean).
>>
>> Since I do not know, I would like you to tell each other what you
>> think working in an "LTER Distributed IM model" means.
>>
>> Yes, I am aware sites (PIs) "want to have their IM there 150% and on the
>> cheap".  True, we identified that we need a person at each site to solve
>> the day-to-day issues related to the handling of each site's digital
>> assets (and some hardware, I must add).
>>
>> It sounds to me, that to have a concerted effort, you would have to have
>> those IMs really involved with the persons who are truly network
>> dedicated (the Data Center staff) and I am unsure how this is going to
>> work, given the experience thus far.  One example - after years of
>> asking for compensation for "network-dedicated" time, we got it, and
>> while it lasted (not too long), the resource or instrument was seriously
>> under-used.  (How much, I do not really know).  Point is "money" did not
>> keep us from being immersed in a network project.  Then what is it?
>> Perhaps the weak link was motivation and purpose (those are the main
>> drivers to get engaged in any activity, along with the mastery of the
>> activity). The reason I ask is because the key to an "LTER Distributed IM
>> model" seems to me the collaborative aspect of the human assets, that
>> is, _you_ and _me_ working on a common project.
>>
>> Or perhaps, the "LTER Distributed IM model" means something else for
>> you, I just would like to read what that means for you.  I may be wrong,
>> but there is quite a bit of room for status quo in the ideas that bounce
>> off the documents and watercoolers.
>>
>> Thanks for your comments,
>> Inigo
>>
>>
>>
>>
>> -- 
>>
>> Inigo San Gil
>> +1 505 277 2625
>> http://scholar.google.com/citations?user=foIppL4AAAAJ&hl=en
>>
>>
>>
>> _______________________________________________
>> Long Term Ecological Research Network
>> im-rep mailing list
>> im-rep at lternet.edu
>>