extended-data-model's Issues
ABIS adoption technical analysis
Search by dataset name
Add datasetName to the available search fields
Context:
It seems dataset name was previously available in the Advanced Search for Biocache Hub but did not search across the field of the same name. It appears the field was eventually removed from Advanced Search rather than fixing the field used to perform the search.
Q: Was there any technical issue preventing search by dataset name?
Commission test environment
Download Event Data
This has 2 components
- Webservice - see #30
- UI support
Source exemplar datasets
- OBIS Australia
- BushBlitz
- Birdlife Australia UI, API
Seedbank
Seedbank is a good source of datasets that attach measurements. This dataset is also a good example showing that we can't assume that occurrence is always the last element at the bottom of the hierarchy.
Query event data from Galah
Download event data using Galah
Datasets title functionality unintuitive
Design extended data model
Use a Vocabulary Service
Current data being mapped to Event DwCA, such as the Reef ... dataset, includes a sampling method with typical values of 1 or 2, which lack any meaning unless context information about the method is added.
We need the ability to define resolvable URIs in the dataset metadata with context information about the sampling method (any other field or term?).
We need to make use of a vocabulary service as the target for the proposed URIs above and actively maintain those vocabularies.
Prospective services are https://vocabs.ardc.edu.au/ or the vocabulary service managed by GBIF (tbc).
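A minimal sketch of how dataset metadata could give meaning to coded values such as 1 or 2. The field name `samplingProtocol`, the labels and the example.org URIs are all assumptions made for illustration; a real mapping would point at concepts published in whichever vocabulary service is chosen.

```python
# Hypothetical dataset metadata: maps locally coded values to resolvable
# vocabulary URIs. The URIs and labels are placeholders, not real concepts.
SAMPLING_METHOD_VOCAB = {
    "1": {"uri": "https://example.org/vocab/sampling-method/method-a", "label": "Method A"},
    "2": {"uri": "https://example.org/vocab/sampling-method/method-b", "label": "Method B"},
}


def resolve_code(raw_value, vocab):
    """Return the vocabulary entry for a coded value, or None if unmapped."""
    return vocab.get(str(raw_value).strip())


def annotate_event(event, field="samplingProtocol", vocab=SAMPLING_METHOD_VOCAB):
    """Return a copy of the event with the URI and label added next to the raw code."""
    annotated = dict(event)
    entry = resolve_code(event.get(field, ""), vocab)
    if entry is not None:
        annotated[field + "URI"] = entry["uri"]
        annotated[field + "Label"] = entry["label"]
    return annotated
```

Unmapped codes are left untouched, so a missing vocabulary entry does not block ingestion.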
Download service implementation for event data
Exploration required which should include:
- Investigate use of Spark SQL to support download service
- Investigate connector between Spark and Elastic (Elastic SQL) for reading Elasticsearch from Spark to produce exports. See this
There are 4 potential types of download we could support, each with a different implementation complexity.
a) Single dataset download
These would be full exports of the event datasets with our interpretation (taxonomy etc).
These could be pre-generated using pipelines (similar to DwCA export pipeline) and copied to S3 or FS.
These would satisfy the EcoCommons people.
Complexity: LOW
b) Multiple dataset download
Similar to the above, but the ability to package multiple complete datasets (a zip of zips).
Complexity: MEDIUM
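The "zip of zips" idea for option b) could look roughly like the sketch below. It assumes the per-dataset archives have already been generated (as in option a) and are available as bytes; the function name and the file names are hypothetical.

```python
import io
import zipfile


def bundle_archives(archives):
    """Package several already-built dataset archives into one zip.

    `archives` maps a file name (for example "dataset-a.zip") to the
    archive's bytes.
    """
    buffer = io.BytesIO()
    # The inner archives are already compressed, so store them as-is.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as outer:
        for name in sorted(archives):
            outer.writestr(name, archives[name])
    return buffer.getvalue()
```

Storing rather than recompressing keeps the packaging step cheap, which matters if the inner archives are large.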
c) Query based cross dataset download
This would be the sort of download we are familiar with for occurrence data, but I question whether it is a good idea for event data, where the datasets are all quite different.
If AVRO based, then events need (globally) unique eventIDs, which is something we don't have at the moment.
Complexity: HIGH
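One possible way to approach the uniqueness problem, sketched under the assumption that every dataset has its own stable identifier: namespace each dataset-local eventID with that identifier, and check how often local IDs actually collide across datasets. The function names and the `:` separator are illustrative choices, not an agreed convention.

```python
def global_event_id(dataset_uid, event_id):
    """Combine a dataset identifier with a dataset-local eventID."""
    return f"{dataset_uid}:{event_id}"


def find_collisions(events):
    """Return local eventIDs that appear in more than one dataset.

    `events` is an iterable of (dataset_uid, event_id) pairs.
    """
    seen = {}
    for dataset_uid, event_id in events:
        seen.setdefault(event_id, set()).add(dataset_uid)
    return sorted(eid for eid, datasets in seen.items() if len(datasets) > 1)
```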
d) Sites by species download
Elasticsearch based, using facets
Complexity: MEDIUM
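A rough sketch of the final pivot step for a sites by species export, assuming the facet query has already been reduced to (site, species, count) tuples. How those tuples are obtained from the index is not shown, and the function name is hypothetical.

```python
from collections import defaultdict


def sites_by_species(buckets):
    """Pivot (site, species, count) tuples into a header row plus one row per site."""
    counts = defaultdict(dict)
    species = set()
    for site, taxon, count in buckets:
        counts[site][taxon] = counts[site].get(taxon, 0) + count
        species.add(taxon)
    columns = sorted(species)
    rows = [["site"] + columns]
    for site in sorted(counts):
        # Species not recorded at a site get an explicit zero.
        rows.append([site] + [counts[site].get(taxon, 0) for taxon in columns])
    return rows
```

The resulting rows can be written directly with `csv.writer` to produce the download file.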
Integrate authentication
Collectory integration
- Event count
- View events
Biocache integration
- View event details link
Create data producers guide
Investigate models that make use of event core data
What models can be enabled by having event type data?
What datasets (#24) do we need to test them?
Generate DOIs for downloads
- Add metadata to DOI similar to metadata available for Occurrence Download DOIs
Question: Are we going to implement a fallback mechanism similar to Biocache's for when the DOI service is unavailable? If not, how are we going to manage errors?
Prepare dev environment
Generate Event Datasets from Biocollect
Add feature on Biocollect to create event core DwCAs.
Add event fields to facets
Interpret EstablishmentMeans as a String, not a JSON
Originally reported by @nielsklazenga
https://biocache-ws-databox.ala.org.au/ws/occurrences/3afe9be0-f4fd-41d7-b6df-2a5117e08757
The value of processed->occurrence->establishmentMeans is a JSON string: 'establishmentMeans: "{"concept": "vagrant", "lineage": ["vagrant"]}"'
Our current establishmentMeans model has two fields, concept and lineage, but establishmentMeans in the latest DwC core is defined as a controlled-value string.
See: https://dwc.tdwg.org/em/
The same issue occurs on our Biocache production instance.
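The mismatch described above can be illustrated with a small Python sketch. This is not the pipelines code (which is Java); it only shows the intended outcome, assuming the `concept` field holds the controlled value that should be exposed. The function name is hypothetical.

```python
import json


def establishment_means_as_string(value):
    """Return the concept as a plain string.

    Accepts either a plain controlled value ("vagrant") or the JSON-encoded
    form seen in the processed record ('{"concept": "vagrant", ...}').
    """
    if value is None:
        return None
    text = value.strip()
    if text.startswith("{"):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            return text  # not valid JSON after all; leave untouched
        if isinstance(parsed, dict):
            return parsed.get("concept")
    return text
```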
Todo:
Working on
https://github.com/gbif/pipelines/blob/dev/livingatlas/pipelines/src/main/java/au/org/ala/pipelines/transforms/ALABasicTransform.java#L112
TERN ontology to DwC Event Mapping
ABIS adoption data analysis
Provide data for EcoCommons
Accept Data Collections in ABIS format
ABIS to DwC mapping
Stand up UI components on an AWS VM
Required, in order
- Copy GBIF react-components to ALA repo
- Add an ala-graphql-api package
- Add ES backed Event resource
- Add Event and EventSearch react-components
- Add event-ui package using the Event and EventSearch components
- Deploy on AWS
Optional
- Add graphql SOLR backed Occurrence resource and react-components to search and view
- Add graphql downloads-service resource and react-components to submit requests
- Add graphql Collections resource and react-components to search and view
- Add graphql BIE resource and react-components to search and view
- Add graphql Images resource and react-components to view
- Add graphql Lists resource and react-components to view
Filter by Event fields
Ingest Event Data
Search by Event fields
Ingestion exploration
Document basic navigation
Create a user support article to explain the basic navigation starting from a dataset
Search by all event fields - Event search tab
- Add additional search option at the top of the event search tab.
  Search will perform a partial match across Event ID, Parent Event ID, Field number and Dataset name.
  The rationale is that different datasets use different fields for the same information; for example, one dataset uses Event ID while another uses Field number.
- For consistency with other elements of the user interface, rename "Dataset name" to "Dataset / Survey name".
Note: labels on UI must use i18n conventions.
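A minimal sketch of a query that matches a term partially across all four fields, assuming event search is backed by Elasticsearch and that the fields are indexed as keywords under the names listed in `SEARCH_FIELDS`. Those field names and the function name are assumptions; the actual index mapping may differ.

```python
# Assumed index field names; adjust to the real mapping.
SEARCH_FIELDS = ["eventID", "parentEventID", "fieldNumber", "datasetName"]


def partial_match_query(term, fields=SEARCH_FIELDS):
    """Build a bool/should query body with one wildcard clause per field."""
    pattern = f"*{term.strip()}*"
    return {
        "query": {
            "bool": {
                # A record matches if the term appears in any one of the fields.
                "should": [{"wildcard": {field: {"value": pattern}}} for field in fields],
                "minimum_should_match": 1,
            }
        }
    }
```

Leading wildcards can be slow on large indexes, and user input containing `*` or `?` would need escaping, so this is only a starting point for discussion.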
See below for details:
BIE integration
- View list of events for this taxon
- View map of events for this taxon
Search by Event Id
Dataset metadata view too narrow
Pre-ingestion framework produces Event Core DwCAs
Pagination broken
Pagination broken on site tab (Devonport Tasmania insect dataset)
Steps to reproduce
Go to Datasets, scroll down to the "Catches of numerous insect species in Rothamsted 160W light trap at Devonport, Tasmania, 1992-2019" dataset and click Add to filter.
Go to Sites tab
Click on next page ">" button
Expected
Next page will be displayed
Actual
A "No site data available" message is displayed despite there being 3 pages of site information.
Design Initial Model
General UI feedback
- Filters at the top look similar to tabs.
  Tabs design should be more traditional to make it clear what they are.
- There should be some text indicating that no filters are applied when displaying all records.
  "Currently viewing all datasets; apply some filters to narrow down the results."
- A basic inline tutorial would be great, but the second option is to produce documentation to explain the basic navigation starting from a dataset.
Index event data for ALA search
Add event and sites as additional elements to the ALA index.
Currently BIE (ALA search index capability) indexes the concepts below:
Source Biocollect data sets
The initial analysis in #23 already includes some sample datasets with event information.
This activity will expand on the initial work to source exemplar datasets from Biocollect.
Mapping service support
We need to support mapping for events.
The intention is to reuse the Spring Boot based component at https://github.com/gbif/maps for this.
This module uses HBase and Elastic as data sources to render tiles.
Some work is required to allow it to run without HBase being present.
Create Event Page
Include navigation among records and locations (sites)
Feedback OBIS AU
Feedback can be accessed in the PDF below
https://drive.google.com/file/d/1WmrH32NAusw8JSHBVa2_M0qSVk_2bxYy/view?usp=sharing
Facets and record view changes for event data
This issue takes an opinionated approach to moving some fields around in the Biocache Hub to make it more intuitive for users to find event information.
Facets
- Add Event section to facets just below Occurrence
For the filters configuration:
- Create new Event section.
- Move Month, Year, Date Precision, Year (by decade) and Event ID from the Occurrence section to the new Event section.
- Move Dataset name from Attribution to the new Event section.
- Rename "Dataset name" to "Dataset / Survey name".
- Add Parent Event ID and Field number fields to facets.
- Proposed order of fields:

| Left | Right |
|---|---|
| Dataset / Survey name | Month |
| Parent Event ID | Year |
| Field number | Date Precision |
| Event ID | Year (by decade) |
Record view page
- Move Field number and Dataset / Survey name from Dataset section to Event section
- Add Event ID and Parent Event ID
- The proposed order of fields in the Event section is:
- Dataset / Survey name
- Parent Event ID
- Field number
- Event ID
- Existing list of event fields
Source OBIS AU datasets
https://biocache.ala.org.au/occurrences/search?q=data_provider_uid%3Adp5183#tab_mapView
- Add eventType
- Generate DwCA
Prepare presentation/questionnaire for reference group
Create Site/Location Page
Include navigation among records and events
Map Dataset from DwC to ABIS