Workshop Feedback; responses & annotations:

Aquatic Macroinvertebrates workshop

28-29 August 2008

Present: Phil Suter (PS), Jeff Webb (JW), David Yeates, Garry Jolley-Rogers (GJR), Jeremy Price (JP), Margaret Cawsey (MC)

nb * indicates actions

~ for further thought & documentation

Responses and Further Annotations

General issues with the flowcharts

  • The "black box" terminology confuses participants because it is open to misinterpretation. The term should be changed to something else.
  • The flowcharts are getting too complicated, and the iteration or non-linearity makes them very difficult to interpret. Although the process can be iterative, in part or in whole, the steps themselves are still linear. Bernard Pfeil's phylogenetic process flowchart demonstrates this: it is far easier to understand and so describes the components of each sub-process more clearly. The original flowcharts are important to illustrate the iteration, but the internal components need to be described by more linear flowcharts.

Components of the process

Curation (specimens and data)

  • ANIC will take the mayfly specimens, depending on the quality of curation.
  • The AMI project suffers from having nowhere to curate or store anything. Researchers are reduced to storing material at home etc.
  • This leads to the recurring issue: the long-term problem of data (= specimens, papers etc.) getting lost when researchers retire, leave, or drop off the perch. This occurs right through the university system and government.
  • Collection and collection maintenance is a major problem in Australia.
  • There are no long-term data repositories or databases in Australian universities; basically, universities are not suitable homes for corporate collections and databases.
  • Poor curation also leads to data loss; e.g. where location is not recorded, specimens are useless.
  • Where research is done and published, it is crucial that specimens do NOT vanish. For ecological collections, if specimens are lost this is unfortunate but not tragic. It IS tragic for published specimens.
  • Given the university-institutional relationship, a partnership with a corporate time-frame is the only way to go, otherwise published specimens will not be universally available to the taxonomic community. Getting this into the universal informatics mindset is crucial.
  • The information cannot necessarily be parsed from publications because descriptions are not standard across taxa (or within taxa?) and there are things missing.
  • We need to institute the technology of web services from corporate databases, i.e. a trusted server on the corporate network that hosts the database and can serve the data to outside aggregators. This means that institutions involved with science need to grasp that they must run scientific databases.


Collecting event

  1. One locality (= site), one time, one GPS coordinate, 1 photo, 1 substrate, flowing yes/no, flow speed fast/slow, 1 netful from which the desired taxa are picked out and put into a jar of ethanol (i.e. only taxa of interest are kept).
  2. Take the jar back to the lab.
  3. Sort the jar into taxa - which can take days.
    1. Sort first into morphological groups
    2. each group gets an accession number
    3. DNA samples taken from each group to see whether they've got single or mixes of species.
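
A minimal sketch of how the site data and accession numbers in steps 1-3 above might be captured as a record. This is an illustration only: the class names, field names and the accession number format (event id plus a running number) are assumptions, not anything agreed at the workshop.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class MorphGroup:
    """One morphological group sorted out of the jar (step 3)."""
    accession_number: str       # hypothetical format, e.g. "E1.01"
    description: str            # free-text morphological label
    dna_sampled: bool = False   # set once a DNA sample has been taken


@dataclass
class CollectingEvent:
    """One locality, one time, one netful (step 1)."""
    event_id: str               # hypothetical identifier for the event
    site_name: str
    collected_at: datetime
    latitude: float
    longitude: float
    substrate: str
    flowing: bool
    flow_speed: Optional[str] = None   # "fast" or "slow" when flowing
    photo_file: Optional[str] = None
    groups: List[MorphGroup] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject records that would be useless later (cf. missing location).
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude out of range")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude out of range")
        if self.flowing and self.flow_speed not in ("fast", "slow"):
            raise ValueError("flow_speed must be 'fast' or 'slow' for flowing water")

    def add_group(self, description: str) -> MorphGroup:
        """Give a newly sorted group the next accession number for this event."""
        number = f"{self.event_id}.{len(self.groups) + 1:02d}"
        group = MorphGroup(accession_number=number, description=description)
        self.groups.append(group)
        return group
```

Validating the coordinates when the record is created is a design choice: it catches unusable records in the field rather than back at the lab.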

  1. One locality, one malaise trap, left out for a week with ethanol.
  2. Take the jar back to the lab.
  3. Sort the taxa of interest and leave the rest of the bycatch in the jar, which is stored. At this time, data are lost because the permit and other constraints pertaining to the capture are not linked to the jar in any way.
  4. Unsure how the jar is documented. David Y. knows where it is. I think the information is in a database. I also think that this information is not readily available to others so the valuable resource i.e. the bycatch might be, to all intents and purposes, lost.
  5. Conclusion: Collection events should be documented on the web so people are generally aware of the availability of valuable bycatches which might contain taxa in which they are interested.
  6. However, David Y. suggests that experts in different taxa manage to collect more of their speciality taxa in traps than others (due to placement for example), so the bycatches are likely to be less valuable for other taxa than the sample was for flies.

Application of informatics to assist

  1. Phil S. and Jeff W. would be amenable to the use of PDAs to collect the site data (point 1) in the field and synch this directly with the database on return to the lab. David Y. would have to be convinced of the value of using PDAs to collect field data.

Plant Taxonomists Workshop


Present: Judy West (JW), Joe Miller (JM), Brendan Lepschi (BL), Bernard Pfeil (BP), Richard Watts (RW), Garry Jolley-Rogers (GJR), Jeremy Price (JP), Margaret Cawsey (MC)

nb * indicates actions

~ for further thought & documentation


  • JW: Can see why we need to start with an institution but need to deal with organisms and disciplines. Doesn't like institutions in the table.
  • BP: Particularly as institutions change. Taxa are the way to go.
    gjr 6/8/08 > * Let's remove the reference to institutions as much as is practical; after all, to a great extent the institutions are organised along organismal lines. Where institutional differences are important they can be noted in the body.


  • JW: likes the notion of a "collection event".
    gjr 6/8/08>the term "collection event" was taken from the ANBG herbarium lexicon - driven, I guess, by the need to avoid linguistic acrobatics (collect, collection, collecting etc.).
  • JM: his project went around in the red circle for 8 years and is now going around in the blue circle.
    gjr 6/8/08> * I think this is best done by looking at Projects. The view of the meeting was that some elements of the processes inside planning and implementing projects need to be defined.
  • 3rd party not clearly understood.
    gjr 6/8/08> * I shall clarify "3rd Party" on diagrams. I understand the confusion: it was not meant to be material on loan, but rather material deposited from outside sources (e.g. duplicates or deposits from Universities) which then goes on to be part of the collection and thus possibly informs future research/curation.
  • See BP's amendment to the flowchart; there needs to be a large circle around Collection events and 3rd party, and a loans process needs to be added. Need to define types (?).
    gjr 6/8/08> ~ A lot of the general discussion was about the process of planning a project and the assembly of material. I think this is where the borrowing of material fits in.
  • BL: "Loans" is an essential bottleneck; people always need materials; BP: often cannot anticipate the need to borrow material until quite a way into the process.
    gjr 6/8/08> ~ There was discussion about the Assessment and Assembly of material available (at hand or in other collections) PRIOR to, or as an alternative to, making a collection. General consensus seemed to be that Assembly and Assessment needed to be treated as important and distinct. Also, that it might merit separate consideration.
    gjr 6/8/08 > lending discussed below. Part of curation? (see below)
    gjr 6/8/08> ~ We also talked about how researchers might use technology to remotely assess the adequacy of material in other collections. In the simplest instance, good images might allow the investigator to assess the state of material at hand and the need to go collecting. (In a perfect world,) annotations or remote measurements might sometimes go as far as substituting for sending the specimen.
  • Need to publish where you've collated material from.
    gjr 6/8/08 > ~ informatics tools to help track materials assessed? tie in with collections DB?
  • Imaging could speed up this "process" (Informatics input).
    gjr 6/8/08> ~ Images could be used i) to document field sites and ii) to record material during the process of assessment. A means to organize and access the images post hoc would be essential. Challenges include storage, indexing, metadata, and curation of images. Images tied to GPS data could be of great use.
  • BP: raised the use of PDAs to collect field data (Informatics input).
    gjr 6/8/08 >
    ~ Use of PDAs brings opportunities for standardized and easily transferable information. There was also some discussion about the difficulties of PDAs in the field, e.g. expense and the importance of design. General consensus seemed to be that there would be occasions where they would be of great assistance.


  • JW: doesn't like the term "reception"; doesn't apply to plants? "Pre-processing"? "Arrival"? Terminology needs to be considered.
    gjr: 6/8/08> * Shall seek a clearer term for reception, or rework this section, as there was some discussion about the sequence of what happens when specimens enter the collection (particulars about process, identification, etc.).
  • The processing circle needs to be a big black box. This box needs to be analysed in its own right, and may need to become a stage of the Taxonomic Process with its own flowchart.
    gjr: 6/8/08> ~ Believe it will include much that is taxon and institution specific but probably can be put into some broad categories. I've sourced texts from specific institutions and will use these to start along with further consultations. Suggestions?
  • Data-basing needs to be a process, with arrows leading to/from it and Assessment, Processing, Curation etc. and leading to/from it and the Refining specimen documentation black box (see also BP's amended flowchart).
    gjr: 6/8/08> * Databasing. The box "Refining Specimen Documentation" was meant to include everything to do with documenting, including databasing. Perhaps it needs to be clarified and further specified. Although documenting, annotating det slips etc. can be placed under the same umbrella as "databasing", it probably does not reflect how people think about it, so I'll change it.
    gjr: 6/8/08> ~ The accession of digital media was discussed i.e. whether digital media might be curated in its own right (not tied to a specimen).
  • Curation is a stage of the Taxonomic Process in its own right, as opposed to merely a part of the Accession stage, and needs its own flowchart. This will more usefully encompass "Further curation" as well.
    gjr: 6/8/08> * Curatorial. Shall get to work on a flowchart & open a page for curation. Will use the same strategy as for processing.
  • Borrowing Material for Assessment, Curation, and Research (came up in the meeting).
    gjr: 6/8/08> * I need to seek clarification. Does material borrowed from an external collection need to be accessioned? Certainly, I can imagine that the interchange with other collections is important in research and the refinement of identity and documentation so curation (through its linkages to research). Are there any inter-institution responsibilities to track material?
  • Lending Material.
    gjr: 6/8/08> I believe lending material is a part of the curatorial process (?). *** I shall seek confirmation and, if so, shall incorporate it.
    BL reported that lending, and managing the lending and return of material, is a bottleneck. A lot of material can be lost, e.g. instances when Universities have thrown away Herbarium specimens at the retirement of academics. Some clear and easy-to-pick opportunities!?
  • The sub-samples/tracking samples will need expansion to take account of entomological collection processes (e.g. tree-fogging) where there are lots and sub-lots as opposed to merely samples and sub-samples. The specimens from tree-fogging could feed into many different projects, so tracking samples, lots, sublots will have to be treated differently. (BP: the individual specimen is more easily associated with sub-samples of itself, whereas all of the other specimens from the same tree may not be easily associated/tracked together.) This area may well have to become a full stage in the Taxonomic Process with its own flowchart.
    gjr: 6/8/08> Lims dealt with in detail below.
    ~ Even so, there are other issues here:-
    i) relationships between specimens (LOTS, sublots, single specimens, samples from specimens) and
    ii) need to seek advice as to what relationships need to be recorded and how best to do so.
    iii) the vexed issue of labelling (some solutions: RFID, bar codes, Aztec codes, Data Matrix codes or similar)
    iv) curating and tracking subsamples.
    * Shall break this into a separate page, as suggested.
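
A minimal sketch of one way the relationships in (i) might be recorded, so that both BP's cases can be answered: what a sub-sample was taken from, and which other material came from the same lot (e.g. the same fogged tree). The class names, the four kinds and the id strings are illustrative assumptions only; ids are deliberately opaque and carry no taxon name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed vocabulary, taken from the wording of the notes above.
KINDS = ("lot", "sublot", "specimen", "sample")


@dataclass
class Item:
    item_id: str                      # opaque label, no taxon name in it
    kind: str                         # one of KINDS
    parent_id: Optional[str] = None   # None for a top-level lot


class Registry:
    """Parent/child links between lots, sublots, specimens and samples."""

    def __init__(self) -> None:
        self._items: Dict[str, Item] = {}

    def add(self, item_id: str, kind: str, parent_id: Optional[str] = None) -> Item:
        if kind not in KINDS:
            raise ValueError(f"unknown kind: {kind}")
        if item_id in self._items:
            raise ValueError(f"duplicate id: {item_id}")
        if parent_id is not None and parent_id not in self._items:
            raise KeyError(f"unknown parent: {parent_id}")
        item = Item(item_id, kind, parent_id)
        self._items[item_id] = item
        return item

    def lineage(self, item_id: str) -> List[str]:
        """Ids from the item itself up to its top-level lot."""
        chain: List[str] = []
        current: Optional[str] = item_id
        while current is not None:
            chain.append(current)
            current = self._items[current].parent_id
        return chain

    def descendants(self, item_id: str) -> List[str]:
        """All ids derived from item_id, breadth-first."""
        found: List[str] = []
        frontier = [item_id]
        while frontier:
            parent = frontier.pop(0)
            children = [i.item_id for i in self._items.values()
                        if i.parent_id == parent]
            found.extend(children)
            frontier.extend(children)
        return found
```

For example, `lineage` on a DNA sample walks back to the lot it came from, while `descendants` on the lot lists everything taken from that tree, whichever project it ended up in.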


The conversation around this subject concentrated a great deal on LIMS.
  • Replace the word "Taxonomic" with "Research" in the Hypothesis box, otherwise the Research stage appears to be relevant only to Alpha taxonomy and not phylogenetic research.
    gjr: 6/8/08> * Shall do.
  • Research Hypothesis process is not predictable. Some steps are linked, other elements need not be linked.
    gjr: 6/8/08> ~ There was general agreement on this, but it was evident from the discussion that it is possible to specify parts of the "Research Process", e.g. the stages of examination and documentation in alpha taxonomy, & some of the genetics methods.
  • JW: HubRIS should not look at characters but at links to other methods.
    gjr: 6/8/08> * clarification needed for me.
  • The inductive reasoning box needs to be expanded to consider elements such as measurements, analysis.
    gjr: 6/8/08> * Shall do. It may merit separate treatment in its own flowchart.
  • JW: contract someone to look at the sample labelling process. Labelling changes as taxonomic revisions occur and the labelling could be assisted by Informatics e.g. barcodes, and other automatic methods of recording specimen/sample/sub-sample identification.
    gjr: 7/8/08> ~ see also notes above (in accession). Clever strategies may be discovered as we consult further afield in other disciplines.
    Abstracting the labels away from taxa will help with revisions.
    We also discussed the criteria that any labelling system needs to meet. These are (in no order of priority or practicality): i) longevity in terms of decades; ii) elements which are readable by both machines and people; iii) portability/extensibility to link subsamples, lots and specimens; & iv) affordability.
  • LIMS - needs to be teased out; varies from lab to lab.
    gjr: 7/8/08> A lot of common elements in applications of LIMS for Genetics seen to date (Kyle, BP, JM) suggest common solutions might work.
    gjr: 7/8/08> Collections DB are really a form of LIMS.
    gjr: 7/8/08> * Role of HubRIS? Needs to be explored. To make recommendations as to standards/software, or more?
    gjr: 7/8/08> ~ For genetics-based LIMS, variation between labs is due to differences in emphasis & needs, driven by differences in material, institutional makeup and research agendas.
    gjr: 7/8/08> ~ Many of the problems that a LIMS might address could be dealt with through other means (instituting procedures & standard protocols). Only when they are taken together are there potential (substantial) gains.
  • RW: dynamics in the workforce means that you end up with freezers full of stuff that nobody knows about.
    gjr: 7/8/08> ~ Could be dealt with procedurally.
  • Corporate databases should store data on primers which didn't work. Corporate protocols should be developed to ensure that these data are recorded as an accepted part of ordinary work practice. Until both databases and protocols are available, it is likely that these data will remain incompletely recorded.
    gjr: 7/8/08> ~ BP (or JM?) suggested that this could be dealt with by some changes in the way primers are ordered, e.g. through a common registry with some means (& gentle insistence) to report on primer efficacy afterwards.
  • Informatics can assist with the assemblage of metadata which should assist in keeping track of samples and sub-samples and the development of a FailBank (as opposed to SuccessBank = GenBank - JP).
    gjr: 7/8/08> ~ Ease of data entry would be important. JM reported that there was a high level of compliance for his (access based) LIMS database. Probably just a matter of getting it into the rhythm of the lab work.
  • It is not necessary to record details of methods, temperatures etc. Only the primer information is of use.
    gjr: 7/8/08> ~ this is true for BP but, for example, not for Kyle with the ancient DNA.
    gjr: 7/8/08> With appropriate design of the GUI, entry or non-entry of these details is not a big issue.
    gjr: 7/8/08> Details of the variables used in methods may be a useful management tool for investigators with many assistants or students, e.g. to identify problems with procedures, equipment or practice.
  • BP: If FailBank was linked to primer ordering accounts, e.g. via the oligoform, this would be a useful way of standardising data capture as well as managing the ordering process. You get the primer information from the same LIMS information and this would hook the DNA to the specimen database.
    gjr: 7/8/08> ~ simple in principle. linkages to existing systems needs investigation.
    gjr: 7/8/08> ~ Technical issues to explore. Reading and entering success or failure from arrays on plates is hard, as the plates are hard to label; tubes are hard to label too. However, the plates form a matrix, and this may be sufficient in careful hands.
  • JW: a link to ANHSIR and APNI is good, but not to be tightly tied as part of the Oracle database.
    gjr: 7/8/08> ~ Could be done by recording the accession no. or sample no. in the LIMS. No need to modify ANHSIR or APNI, just to reference them. Ease of ID would be important... segue back to the need for good labels or some other robust means of sample id.
  • Data capture needs to be standardised and made as easy as possible so it doesn't drive the scientists crazy. If it does drive them crazy, it won't get done.
    gjr: 6/8/08> ~ Biomed labs have this sorted. However, they generally have more staff and less variable material. Still, some ideas can be adopted from them, especially around the "industrial" challenges of sample tracking and the common elements of lab genetics. Standardization would have the advantage of making lab books more transparent to other practitioners, and of potential economies of scale, which may be important as (and if) throughput becomes an issue. Some care would be needed to avoid placing a wet blanket on innovation.... a via media. The collections managers might have a role in this too: where scientists are working on material from collections, some standardization, tracking of derived material, and annotation is part of the deal, and this could be standardized.
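
A minimal sketch of the FailBank idea discussed above: every primer trial is logged, failures included, keyed by plate and well position (the plate "matrix"), and the specimen database is referenced only by accession number rather than modified. The standard 96-well layout (rows A-H, columns 1-12) is assumed; the class, method and id names are hypothetical.

```python
from typing import Dict, List, Tuple

ROWS = "ABCDEFGH"        # assumed 96-well plate: 8 rows
COLUMNS = range(1, 13)   # and 12 columns


def well_name(row_index: int, column_index: int) -> str:
    """Convert zero-based matrix indices to a well name such as 'A1'."""
    return f"{ROWS[row_index]}{COLUMNS[column_index]}"


class PrimerLog:
    """Records every primer trial, whether it worked or not."""

    def __init__(self) -> None:
        # (plate_id, well) -> (sample_accession, primer_id, worked)
        self._wells: Dict[Tuple[str, str], Tuple[str, str, bool]] = {}
        # primer_id -> [successes, failures]
        self._counts: Dict[str, List[int]] = {}

    def record(self, plate_id: str, well: str, sample_accession: str,
               primer_id: str, worked: bool) -> None:
        key = (plate_id, well)
        if key in self._wells:
            raise ValueError(f"well {well} on plate {plate_id} already recorded")
        self._wells[key] = (sample_accession, primer_id, worked)
        counts = self._counts.setdefault(primer_id, [0, 0])
        counts[0 if worked else 1] += 1

    def failure_rate(self, primer_id: str) -> float:
        """Fraction of recorded trials of this primer that failed."""
        successes, failures = self._counts.get(primer_id, (0, 0))
        total = successes + failures
        if total == 0:
            raise KeyError(f"no trials recorded for {primer_id}")
        return failures / total

    def sample_in(self, plate_id: str, well: str) -> str:
        """Accession number of the sample placed in a given well."""
        return self._wells[(plate_id, well)][0]
```

Keeping only the accession number on the LIMS side means the link back to the specimen database is a lookup, not a schema change.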

Web-based tools to facilitate scientific collaboration

  • It is clear that there is some confusion about what the Wiki will be useful for, accompanied by some ignorance as to the benefits of other web-based tools which facilitate communication and collaboration, each in different ways.
  • At the end of this workshop, the participants indicated an interest in finding out about the variety and usage of such tools.
  • * HubRIS will investigate the logistics of putting together (a) workshop(s) aimed at educating TRIN scientists in more sophisticated use of internet capabilities to facilitate their science and collaboration with colleagues.

