Comments (11)

damianooldoni avatar damianooldoni commented on August 17, 2024

Website updated. I also added a first draft to tackle this issue: https://trias-project.github.io/indicators/occurrence_indicators_appearing_taxa.html
@timadriaens: could you please check it? Everybody else is welcome to as well, of course.

Something else to discuss is how to present the results: so far I have not found a ranking method to merge the table of appearing taxa with the table of reappearing taxa.

from indicators.

damianooldoni avatar damianooldoni commented on August 17, 2024

Based on @timadriaens's check: Junco hyemalis is NOT present in 48 cells in Belgium.
To do for @damianooldoni: check in the code whether the function still works on the number of cells. Maybe there is a bug and it works on the number of observations instead?

damianooldoni avatar damianooldoni commented on August 17, 2024

About Junco hyemalis, @timadriaens: I checked the data cube, and indeed this taxon is present only in 2018, with 298 observations distributed over 48 cells (= 48 km²).
However, I think everything is fine. The reason is the following: the observations are published with a large coordinate uncertainty. Both GBIF and the data cube give the same information: coordinate uncertainty in metres: 3536. So each observation can lie anywhere within a circle of about 39.3 km² around its centroid. As the observations do not all come from the same cell, the total area can be bigger than 39 km². So I conclude that the 48 cells (= 48 km²) are well-grounded.
In general, the higher the uncertainty and the number of occurrences, the more likely it is that they cover a bigger area.
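To make the arithmetic above concrete, here is a minimal Python sketch. It computes the area of the uncertainty circle and simulates how many 1 km cells get occupied when observations are placed at random inside that circle. The random placement is an assumption made for illustration, not a description of the actual data cube code, and the function names, the centre and the seed are hypothetical.

```python
import math
import random


def uncertainty_area_km2(radius_m: float) -> float:
    """Area (km²) of a circle whose radius is the coordinate uncertainty."""
    r_km = radius_m / 1000.0
    return math.pi * r_km ** 2


def occupied_cells(n_obs, radius_m, centre=(0.0, 0.0), cell_m=1000.0, seed=0):
    """Count distinct grid cells hit by n_obs points drawn uniformly
    inside the uncertainty circle (assumed placement rule)."""
    rng = random.Random(seed)
    cells = set()
    for _ in range(n_obs):
        # sqrt() makes the points uniform over the disc area
        r = radius_m * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x = centre[0] + r * math.cos(theta)
        y = centre[1] + r * math.sin(theta)
        cells.add((math.floor(x / cell_m), math.floor(y / cell_m)))
    return len(cells)


if __name__ == "__main__":
    print(round(uncertainty_area_km2(3536), 1))  # area in km²
    print(occupied_cells(298, 3536))             # distinct 1 km cells
```

Under these assumptions, a few hundred observations sharing one centroid can plausibly occupy several dozen cells, even if all of them were made in the same km square.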

timadriaens avatar timadriaens commented on August 17, 2024

Well, that is clearly wrong: in reality the species, being an extreme rarity, was seen by almost 300 twitchers in the same km square. Anyway, it was most probably a wild bird in this case, so it would not even classify as an alien (another one was found dead in the Antwerp harbour area).

Where is this coordinate uncertainty introduced? In the mapping of wnm.be data to GBIF? Is this coordinate uncertainty the same for many records?

damianooldoni avatar damianooldoni commented on August 17, 2024

288 out of 298 observations come from the Waarnemingen.be dataset "Waarnemingen.be - Bird occurrences in Flanders and the Brussels Capital Region, Belgium". As you can see on the dataset page, the publisher states:

Generalized and/or withheld information: location information is generalized to 5 x 5 km² Universal Transverse Mercator (UTM) grid cells.

Please ask Waarnemingen.be to publish the data for this taxon without masking it at the 5 x 5 km level if you want better results. I cannot do much more about it. As it is an alien, they might agree to do it.
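As a side note, the 5 x 5 km generalization is consistent with the 3536 m uncertainty value if the uncertainty is taken as the distance from the cell centre to a cell corner. That interpretation is an assumption, not something the publisher states here, and the helper name below is hypothetical. A quick check in Python:

```python
import math


def half_diagonal_m(cell_side_m: float) -> float:
    """Distance from the centre of a square cell to one of its corners."""
    return cell_side_m * math.sqrt(2) / 2


if __name__ == "__main__":
    # For a 5 km cell this is roughly 3535.5 m
    print(half_diagonal_m(5000))
```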

qgroom avatar qgroom commented on August 17, 2024

damianooldoni avatar damianooldoni commented on August 17, 2024

About the downsides of our approach:
Using occurrence points without taking uncertainty into account is even worse: theoretically (any measurement has an uncertainty!), methodologically and practically. GPS readings are never perfect, and the uncertainty varies a lot with signal strength. Any iNaturalist user can check this. And I like that it works this way: the point is not at the exact real location, because the GPS signal strength varies. Ignoring uncertainty would introduce an error as well.

About flagging low quality data:
As asked by @amyjsdavis, who needed high quality data with low coordinate uncertainty, I added a column with the minimum coordinate uncertainty to the data cube, so we can filter on coordinate uncertainty. But for emerging indicators I think we should use everything. The "problem" noticed by @timadriaens is restricted to one species out of thousands. So yes, the downsides are relatively small. Improving the data at the source would solve this problem easily.
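A minimal sketch of such a filter, in Python for illustration. The column name `min_coord_uncertainty`, the function name and the example rows are hypothetical and do not reflect the actual data cube schema:

```python
def filter_by_uncertainty(rows, max_uncertainty_m, column="min_coord_uncertainty"):
    """Keep only data cube rows whose minimum coordinate uncertainty
    (in metres) does not exceed the given threshold."""
    return [row for row in rows if row[column] <= max_uncertainty_m]


if __name__ == "__main__":
    # Made-up rows, one per (taxon, year, cell) combination
    cube = [
        {"taxon": "taxon A", "year": 2018, "min_coord_uncertainty": 3536},
        {"taxon": "taxon B", "year": 2018, "min_coord_uncertainty": 25},
    ]
    print(filter_by_uncertainty(cube, max_uncertainty_m=1000))
```

Users who need precise locations can apply a threshold like this, while the emerging indicators keep using all rows.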

About flagging data from the same organism: I don't know how to solve this issue at the moment, and I don't think I will have time to investigate it. But it is good to know it exists.

qgroom avatar qgroom commented on August 17, 2024

timadriaens avatar timadriaens commented on August 17, 2024

These are all good points of course, and indeed any approach has limitations. @damianooldoni @qgroom as you know, thanks to the newly negotiated INBO contract with Natuurpunt, alien species occurrences are now published on GBIF with detailed coordinates, so they are no longer blurred (see this data paper about it, on which I should actually have been an author but was the handling editor).

Yet the actual problem here, as you pointed out, is that the species occurs on the alien bird checklist (and rightly so, because of escapes) and in the dataset of (native) bird observations of waarnemingen.be. So in a way we have "contradicting information" between the occurrences and the checklist.

In fact, we have decided in the metadata to discard such species from the checklist: "Species with populations of mixed origin (escaped and wild) in Belgium (e.g. barnacle goose Branta leucopsis, greylag goose Anser anser) were not considered true alien species and were excluded from the list. Also excluded are native species that are used for restocking (e.g. grey partridge Perdix perdix) or birds that might have popped up occasionally as escapes from collections but that also occur as rare migrants or occur as native breeders. "

I think the junco is an example. Shall I raise the issue in the alien bird checklist repo?

damianooldoni avatar damianooldoni commented on August 17, 2024

Thanks @timadriaens for your comments. It is probably better to move this discussion to the alien bird checklist repo, indeed.

damianooldoni avatar damianooldoni commented on August 17, 2024

The pipeline to detect appearing and reappearing taxa in the present year is done:
https://trias-project.github.io/indicators/occurrence_indicators_appearing_taxa.html

The lists of taxa produced by this pipeline are saved in data/output/appearing_taxa.tsv and data/output/reappearing_taxa.tsv. Taxa without occurrences are saved as well: data/output/alien_taxa_without_occs.tsv.
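For readers who want a rough idea of what such a classification could look like, here is a small Python sketch. It is not the pipeline linked above. The definitions used are assumptions: a taxon is "appearing" if its first occurrence falls in the present year, and "reappearing" if it occurs in the present year after a gap longer than `min_gap_years`. The function and parameter names are hypothetical.

```python
from collections import defaultdict


def classify_taxa(records, present_year, min_gap_years=1):
    """Split taxa seen in present_year into appearing and reappearing.

    records: iterable of (taxon, year) pairs.
    Returns two alphabetically sorted lists: (appearing, reappearing).
    """
    years = defaultdict(set)
    for taxon, year in records:
        years[taxon].add(year)

    appearing, reappearing = [], []
    for taxon, seen in sorted(years.items()):
        if present_year not in seen:
            continue  # not observed in the year being evaluated
        earlier = [y for y in seen if y < present_year]
        if not earlier:
            appearing.append(taxon)    # never seen before
        elif present_year - max(earlier) > min_gap_years:
            reappearing.append(taxon)  # seen again after a gap
    return appearing, reappearing


if __name__ == "__main__":
    records = [("A", 2018), ("B", 2010), ("B", 2018), ("C", 2017), ("C", 2018)]
    print(classify_taxa(records, present_year=2018))
```

In this toy example, taxon A has no earlier records, taxon B returns after an eight-year gap, and taxon C was already present the year before, so it falls in neither list.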

Issue closed.
