Data on species occurrence

The species occurrence data are a set of coordinates where the species was reported to be present. They act as the training dataset for the models.

Presence data

Obtained from the Global Biodiversity Information Facility (GBIF)
Used the R package CoordinateCleaner to extract the data using the taxon key for S. frugiperda
Pulled 11 120 occurrence points

Absence data

Made use of pseudoabsences
Used the R package sp to create these points
Generated 20 000 points at a resolution of 30 arc-seconds; the number is deliberately large because the set is later cleaned to contain only points on land masses
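
The random-point step can be pictured with a small sketch. This is not the sp code used in the project: it is a standard-library Python illustration, and the function names, the seed, and the choice to snap each point to the centre of a global 30 arc-second grid cell are all assumptions made for the example.

```python
import random

ARCSEC_30 = 30 / 3600  # 30 arc-seconds in decimal degrees (1/120 of a degree)


def snap_to_cell_centre(value, origin, extent, cell=ARCSEC_30):
    """Snap a coordinate to the centre of the grid cell that contains it."""
    n_cells = round(extent / cell)
    index = int((value - origin) // cell)
    index = min(index, n_cells - 1)  # keep the upper edge inside the last cell
    return origin + (index + 0.5) * cell


def random_pseudoabsences(n, seed=42):
    """Draw n random lon/lat points and snap them to the 30 arc-second grid.

    Assumption: plain uniform draws in longitude and latitude. This
    oversamples high latitudes relative to area, which a real workflow
    may want to correct for.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        lon = snap_to_cell_centre(rng.uniform(-180.0, 180.0), -180.0, 360.0)
        lat = snap_to_cell_centre(rng.uniform(-90.0, 90.0), -90.0, 180.0)
        points.append((lon, lat))
    return points


points = random_pseudoabsences(20_000)
```

Most of these points fall in the ocean, which is why the set is made much larger than the number of pseudoabsences that is finally needed.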

Presence cleaning

Filtered to contain only points with both latitude and longitude present
CoordinateCleaner was then used to run the centroids, equal, zeros, gbif, institutions, seas, and duplicates tests
The final presence dataset had 5379 records
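
To give a feel for what some of these tests do, here is a simplified Python sketch of the missing-coordinate filter and of the zeros, equal, and duplicates checks. It is not CoordinateCleaner's implementation: the record layout, the function name, and the exact rules are assumptions for illustration, and the centroids, gbif, institutions, and seas tests are left out because they need reference datasets.

```python
def clean_presences(records):
    """Keep records that pass simplified missing/zeros/equal/duplicates checks."""
    seen = set()
    kept = []
    for rec in records:
        lon, lat = rec.get("lon"), rec.get("lat")
        if lon is None or lat is None:
            continue  # latitude or longitude missing
        if lon == 0 or lat == 0:
            continue  # "zeros": a plain zero coordinate is usually a data-entry default
        if abs(lon) == abs(lat):
            continue  # "equal": identical absolute longitude and latitude
        key = (rec.get("species"), lon, lat)
        if key in seen:
            continue  # "duplicates": same species at the same coordinates
        seen.add(key)
        kept.append(rec)
    return kept


# Made-up example records, not taken from the GBIF download.
records = [
    {"species": "Spodoptera frugiperda", "lon": 28.2, "lat": -25.7},
    {"species": "Spodoptera frugiperda", "lon": 28.2, "lat": -25.7},  # duplicate
    {"species": "Spodoptera frugiperda", "lon": None, "lat": 10.0},   # missing
    {"species": "Spodoptera frugiperda", "lon": 0.0, "lat": 0.0},     # zeros
    {"species": "Spodoptera frugiperda", "lon": 15.0, "lat": 15.0},   # equal
    {"species": "Spodoptera frugiperda", "lon": -47.9, "lat": -15.8},
]
cleaned = clean_presences(records)
```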

Absence cleaning

The cc_sea function from CoordinateCleaner was then used to test for seas. The other tests were omitted because the pseudoabsences were randomly generated.
The final absence dataset was reduced to 3032 records

Data on current and future climate

I chose the GFDL-ESM4 model from the CMIP6 ensemble as it covers Africa, Asia, and (whatever it was in that paper) better than the other models currently available in the CHELSA-BIOCLIM+ database. This model was created by the National Oceanic and Atmospheric Administration, Geophysical Fluid Dynamics Laboratory, Princeton, NJ 08540, USA and has a native resolution of 288 x 180 grid cells. Point values were extracted at a resolution of 30 arc-seconds.

All climate data used in this project were obtained from the Climatologies at High resolution for the Earth’s Land Surface Areas (CHELSA) database.

Current data

The current data provide comprehensive information on historical climate conditions across various regions. They include datasets derived from observations and reanalysis, covering parameters such as temperature, precipitation, and other climatic variables over past time periods.

Researchers and practitioners use these data to analyze past climate trends, understand regional climate variability, and assess the impacts of climate change on ecosystems, agriculture, water resources, and human societies.

Future data

The future data consist of projections and simulations of climate conditions under different greenhouse gas emission scenarios and climate models. These projections describe potential future climates, including changes in temperature, precipitation patterns, and extreme weather events.

Future data allow researchers and policymakers to anticipate and plan for potential climate impacts, develop adaptation strategies, and evaluate the effectiveness of mitigation measures.

Variables

The CHELSA Climate Database provides a wide range of climatic variables that are essential for understanding and analyzing climate patterns and trends.

Due to the large number of variables available on CHELSA, only the following were used in the final model training, as explained further in the Modelling section.

Variable	Name	Explanation
bio1	Mean annual air temperature (°C)	Mean annual daily mean air temperatures averaged over 1 year
bio6	Mean daily minimum air temperature of the coldest month (°C)	The lowest of the monthly mean daily minimum air temperatures
bio9	Mean daily mean air temperatures of the driest quarter (°C)	The driest quarter of the year is determined (to the nearest month)
bio11	Mean daily mean air temperatures of the coldest quarter (°C)	The coldest quarter of the year is determined (to the nearest month)
bio12	Annual precipitation amount (kg m^-2 year^-1)	Accumulated precipitation amount over 1 year
gdd10	Growing degree days heat sum above 10°C (°C)	Heat sum of all days above the 10°C temperature accumulated over 1 year.
ngd10	Number of growing degree days (number of days)	Number of days at which mean daily air temperature > 10°C
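
The two growing-degree variables are easier to follow with a tiny worked example. The Python sketch below assumes the conventional definition of a heat sum (the excess of the daily mean temperature over the 10°C base, summed over the days that exceed it); how CHELSA derives gdd10 and ngd10 internally may differ, and the function name and the temperatures are made up for illustration.

```python
def growing_degree_stats(daily_mean_temps_c, base_c=10.0):
    """Return (heat sum above base, number of days above base).

    Assumption: the heat sum adds (t - base) for every day whose mean
    temperature t is strictly above the base temperature.
    """
    gdd = sum(t - base_c for t in daily_mean_temps_c if t > base_c)
    ngd = sum(1 for t in daily_mean_temps_c if t > base_c)
    return gdd, ngd


# Five illustrative daily mean temperatures in °C; a real series covers a full year.
temps = [8.0, 10.0, 12.5, 15.0, 9.5]
gdd10, ngd10 = growing_degree_stats(temps)
print(gdd10, ngd10)  # 7.5 2
```

Only the two days above 10°C contribute: 2.5 + 5.0 gives a heat sum of 7.5, counted over 2 days.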

Why these variables?

These variables were selected by analysing the feature importances obtained after training a model on the presence and absence data points.

For more info on this, check out the Modelling section.