The tutorials_marine_sdm from oceanhackweek

Marine Species Distribution Model (SDM) Tutorial

Overview

This tutorial was developed during OceanHackWeek 2023 (OHW23) to provide a simple workflow for developing a marine Species Distribution Model (SDM) in R. To see the project as it stood at the end of OHW23, go to the ohw23_proj branch or see the ohw23_proj release.

Background

Species Distribution Modelling (SDM), also known as niche, environmental, or ecological modelling, uses an algorithm to predict the distribution of a species across space and time from environmental data. Understanding the relationship between the species of interest and the physical environment it occupies informs the selection of the environmental factors included in the model.

SDMs also need biotic information: at the very least, the locations of individuals. Abundances or densities can also be used as inputs, but are not compulsory. Absences, that is, the locations where individuals of a species are NOT present, are just as important because they provide information about the environmental conditions where individuals are not usually sighted. Absences are often not recorded in biological data, but we can use background points (also known as pseudo-absences), which provide information about the full range of environmental conditions available to the species of interest in our study area.
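As a minimal sketch of the background-point idea, the base R snippet below draws random points inside a rectangular study area. The bounding box coordinates, the function name `sample_background`, and the number of points are illustrative assumptions, not values taken from the tutorial.

```r
# Hypothetical study-area bounding box in decimal degrees (an assumption,
# not the extent used in the tutorial).
bbox <- c(xmin = 41, xmax = 70, ymin = -1, ymax = 33)

# Draw n background points uniformly at random in longitude/latitude
# within the bounding box and flag them as non-presences (0).
sample_background <- function(n, bbox, seed = 42) {
  set.seed(seed)
  data.frame(
    lon = runif(n, bbox[["xmin"]], bbox[["xmax"]]),
    lat = runif(n, bbox[["ymin"]], bbox[["ymax"]]),
    presence = 0L
  )
}

background <- sample_background(1000, bbox)
head(background)
```

Sampling uniformly in longitude and latitude is the simplest option but is not equal-area, and a rectangle like this one also covers land. A real workflow would likely drop points that fall outside the ocean or outside the raster cells that have environmental data.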

For a review of the performance of different SDM algorithms, see the following publications:

Valavi, Guillera-Arroita, Lahoz-Monfort, Elith (2021). Predictive performance of presence-only species distribution models: a benchmark study with reproducible code. DOI: 10.1002/ecm.1486
Elith et al (2006). Novel methods improve prediction of species’ distributions from occurrence data. DOI: 10.1111/j.2006.0906-7590.04596.x

For a discussion of the impact of background data on SDMs, see: Phillips et al (2009). Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. DOI: 10.1890/07-2153.1. For background sample generation, refer to the work by Valavi.

Datasets used in the tutorial

Biological Data

Our area of interest is the Indian Ocean, where four species of sea turtles have been reported:

Loggerhead, Caretta caretta
Green, Chelonia mydas
Olive Ridley, Lepidochelys olivacea
Hawksbill, Eretmochelys imbricata

For this tutorial, we will focus on predicting the areas occupied by loggerhead sea turtles. To do this, we will use presence-only data from 2000 to the present, sourced from the Ocean Biodiversity Information System (OBIS) via the robis package.
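A query of this kind might look like the sketch below. The bounding box and the helper names `bbox_to_wkt` and `fetch_loggerheads` are hypothetical; `robis::occurrence()` with its `scientificname`, `startdate`, and `geometry` arguments is the real package function, but the call is left commented out because it needs the robis package and a network connection.

```r
# Build a WKT polygon from a bounding box so it can be used as a
# spatial filter. Vertex order: lower-left, lower-right, upper-right,
# upper-left, back to lower-left.
bbox_to_wkt <- function(xmin, xmax, ymin, ymax) {
  sprintf(
    "POLYGON ((%s %s, %s %s, %s %s, %s %s, %s %s))",
    xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax, xmin, ymin
  )
}

# Hypothetical extent in decimal degrees (an assumption).
study_wkt <- bbox_to_wkt(41, 70, -1, 33)

# Wrapper around the OBIS query for loggerhead records since 2000.
fetch_loggerheads <- function(wkt) {
  robis::occurrence(
    scientificname = "Caretta caretta",
    startdate = as.Date("2000-01-01"),
    geometry = wkt
  )
}

# Not run here: requires robis and internet access.
# occ <- fetch_loggerheads(study_wkt)
```

The returned table would normally be reduced to the coordinate and date columns and checked for duplicates before modelling.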

Environmental Data

This tutorial focuses on regions in the northern Indian Ocean, specifically the western Arabian Sea, Persian Gulf, Gulf of Oman, Gulf of Aden and Red Sea. Environmental predictor variables were sourced via the sdmpredictors R package. The package gives access to the Bio-ORACLE (https://bio-oracle.org/) and MARSPEC (http://marspec.org/) high-resolution layers of various marine variables. Note that these variables are location specific but not time specific: each value is an average over a time period.
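Loading such layers could be sketched as follows. The layer codes and the helper name `load_predictors` are illustrative assumptions, not the predictors chosen in the tutorial; the available codes can be browsed with `sdmpredictors::list_layers()`. The download itself is not run here because it needs the package and a network connection.

```r
# Illustrative layer codes (assumptions): mean sea surface temperature
# and salinity from Bio-ORACLE, bathymetry from MARSPEC.
layer_codes <- c("BO_sstmean", "BO_salinity", "MS_bathy_5m")

# Download the requested layers into datadir and return them as a
# raster stack (the default behaviour of load_layers).
load_predictors <- function(codes, datadir = tempdir()) {
  sdmpredictors::load_layers(codes, datadir = datadir)
}

# Not run here: requires sdmpredictors and internet access.
# available <- sdmpredictors::list_layers(datasets = c("Bio-ORACLE", "MARSPEC"))
# env_layers <- load_predictors(layer_codes)
```

Because the layers are global, the stack would usually be cropped to the study area before values are extracted at the presence and background points.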

Workflow/Roadmap

This tutorial is based on notes by Ben Tupper (Bigelow Lab, Maine) and highlights modeling presence-only data via the maxnet R package.

Tutorial roadmap

Presence Data -- obtain loggerhead sea turtle (C. caretta) presence data from OBIS via robis
Background Points -- create random background points within our area of interest using two methods
Environmental Data -- obtain environmental predictors of interest using sdmpredictors
Model -- run the species distribution model and predict using maxnet
Data Visualizations
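The modelling step in the roadmap can be sketched with simulated data, so that it runs without any downloads. Everything about the data here (variable names, distributions, sample sizes) is made up for illustration. The maxnet calls are guarded so the script still runs when the package is not installed.

```r
set.seed(1)
n_presence <- 200
n_background <- 1000

# Simulated predictor values at presence points: clustered around
# warm, shallow conditions (purely illustrative).
presence_env <- data.frame(
  sst = rnorm(n_presence, mean = 27, sd = 1.5),
  depth = rnorm(n_presence, mean = 50, sd = 20)
)

# Simulated predictor values at background points: spread across the
# full range of conditions available in the study area.
background_env <- data.frame(
  sst = runif(n_background, 20, 32),
  depth = runif(n_background, 0, 300)
)

# maxnet expects a 1/0 vector (presence/background) and a data frame
# of predictors with one row per point, in the same order.
env <- rbind(presence_env, background_env)
p <- c(rep(1, n_presence), rep(0, n_background))

if (requireNamespace("maxnet", quietly = TRUE)) {
  model <- maxnet::maxnet(p, env)
  # cloglog output is scaled between 0 and 1.
  suitability <- predict(model, newdata = env, type = "cloglog")
  print(summary(as.vector(suitability)))
}
```

In the actual workflow, `env` would hold predictor values extracted from the environmental layers at the OBIS presence points and the background points, and predictions would be made over the whole raster rather than only at the training points.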

References

Bosch S, Fernandez S (2022). sdmpredictors: Species Distribution Modelling Predictor Datasets. R package version 0.2.14, http://lifewatch.github.io/sdmpredictors/.
OBIS (2023) Ocean Biodiversity Information System. Intergovernmental Oceanographic Commission of UNESCO. www.obis.org. Accessed: 2023-08-08.
Steven J. Phillips, Miroslav Dudík, Robert E. Schapire. [Internet] Maxent software for modeling species niches and distributions (Version 3.4.1). Available from url: http://biodiversityinformatics.amnh.org/open_source/maxent/. Accessed on 2023-08-10.

Tutorial developers

Eli Holmes: Research Fisheries Biologist, Northwest Fisheries Science Center, NOAA Fisheries.
Catherine Courtier:
Mackenzie Fiss: Fifth-year PhD student at Northeastern University studying carbon cycling and microbial interactions in salt marshes.
Denisse Fierro Arcos: PhD candidate at the Institute for Marine and Antarctic Studies (IMAS) and Data Officer at the Integrated Marine Observing System (IMOS)
Paulo Freire: PhD candidate at the University of North Carolina at Charlotte (UNCC) studying marine microbial ecology.
Jade Hong: Recently finished undergraduate studies majoring in Biology and Marine Science at Boston University.
Tylar Murray: USF IMaRS Software Engineer - code whisperer, data viz enthusiast, scientific generalist, compulsive overengineerer, & UX PhD
Caitlin O'Brien: Research Scientist, Columbia Basin Research, School of Aquatic and Fishery Sciences, University of Washington
Mary Solokas: John A. Knauss Marine Policy Fellow, National Oceanic and Atmospheric Administration
Laura Tsang: Recent Master's graduate from Northeastern University
Ben Tupper: Senior Research Associate at Bigelow Laboratory for Ocean Science

Who is this tutorial intended for?

Some experience programming in R is needed to make the most of this tutorial. To run the tutorial, clone this repository onto your local machine by creating a new project that uses version control (git).

The tutorial content was developed in R version 4.2.2 for Linux.

Additional resources

If you need additional support with R programming, you can check the following resources:

Ben Best's labs on SDMs
R for Data Science - 2nd edition by Wickham, Çetinkaya-Rundel and Grolemund.
Data analysis and visualisation in R for ecologists
For information on how to use git and GitHub with R, Happy Git and GitHub for the useR by Jenny Bryan is a great resource.
