Corresponding author: Carol X Garzon-Lopez (
Academic editor:
To assess bias related to sampling effort, we use one of the most widely used datasets for studying biodiversity at large spatial extents: the GBIF dataset. GBIF data comprise a wide range of species occurrence observations collected using a variety of sampling approaches, from well-established plot censuses to direct observations made during field trips. Consequently, some data points lie at the centre of sampled grids (each point comprises the species found within a quadrat of a given size), whereas others correspond to a single observation of at least one individual of a species. These differences also depend on the methods used to observe and record occurrences for each taxon: plots and transects are common in vegetation censuses, while transects, point counts, and live traps are preferred for animals. Moreover, variation in factors such as per-country biodiversity monitoring schemes, funding schemes, focal ecosystems, and accessibility of remote areas adds another source of variation, especially at multinational scales (
We aimed to quantify and map the uncertainty arising from variation in observations due to differences in sampling effort. We used cartograms to illustrate this uncertainty: the shape of each object (country) is distorted according to its level of uncertainty. Cartograms build on the standard treatment of diffusion, in which the current density is given by:
where
Cartograms facilitate the visualization of spatial uncertainty in the results by changing the size of the polygons according to the density of information they contain (number of observations, variation, etc.).
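The diffusion treatment mentioned above is not written out in this text. As a sketch only, assuming the standard formulation used by diffusion-based (density-equalising) cartograms, the steps can be summarised as follows. The symbols are chosen here for illustration and are not taken from the source: $\rho$ is the density of observations, $\mathbf{v}$ the velocity of the flow, $\mathbf{J}$ the current density, $\mathbf{r}$ the position and $t$ the time.

```latex
\begin{align}
  \mathbf{J}(\mathbf{r},t) &= \mathbf{v}(\mathbf{r},t)\,\rho(\mathbf{r},t)
    && \text{(current density)} \\
  \mathbf{J} &= -\nabla\rho
    && \text{(Fick's law, unit diffusion constant)} \\
  \nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} &= 0
    && \text{(conservation of the diffusing quantity)} \\
  \Rightarrow\quad \nabla^{2}\rho - \frac{\partial\rho}{\partial t} &= 0,
  \qquad
  \mathbf{v}(\mathbf{r},t) = -\frac{\nabla\rho}{\rho}
\end{align}
```

Under these assumptions, each point of the map is displaced along the velocity field until the density is uniform, so regions holding many observations expand and regions holding few contract.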
The generated maps show differences in the number of species observations per country, both for all taxa combined and for some of the main taxonomic groups (plants, fungi, and animals).
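The per-country counts behind such maps can be sketched as a simple tally. This is a minimal illustration, not the authors' workflow: the records are invented, and the field names `countryCode` and `kingdom` are assumed to mirror the columns of a GBIF occurrence download.

```python
from collections import Counter

# Hypothetical occurrence records; "countryCode" and "kingdom" are assumed
# to mirror column names found in GBIF occurrence downloads.
records = [
    {"countryCode": "NL", "kingdom": "Plantae"},
    {"countryCode": "NL", "kingdom": "Animalia"},
    {"countryCode": "NL", "kingdom": "Animalia"},
    {"countryCode": "SE", "kingdom": "Fungi"},
    {"countryCode": "SE", "kingdom": "Plantae"},
    {"countryCode": "CO", "kingdom": "Animalia"},
]


def count_by_country(records, kingdom=None):
    """Count records per country, optionally restricted to one kingdom."""
    return Counter(
        r["countryCode"]
        for r in records
        if kingdom is None or r["kingdom"] == kingdom
    )


# One tally for all taxa, and one for a single taxonomic group.
all_taxa = count_by_country(records)
animals = count_by_country(records, kingdom="Animalia")
```

Each tally would then serve as the quantity that drives the deformation of the corresponding map.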
The cartograms were developed using free and open source software (
Cartograms are intuitive: the shape and area of each country derive from the difference between the actual size of the country and the size of the sampling (e.g., the number of observations). Hence, small areas that are oversampled will look larger in the cartograms, with a high oversampling value, while large oversampled areas will have a high value but a smaller relative size. The method thereby directly accounts for the area effect, i.e. the size of each country, on the final sampling effort. For instance, the Netherlands and Sweden are both oversampled, but the latter occupies a larger surface area. Hence in the final cartogram (e.g. Fig.
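The area effect can be made concrete with a small calculation. The sketch below assumes that, in a density-equalising cartogram, the drawn area of a country is proportional to its number of observations. The areas are approximate, the observation counts are invented for illustration, and `XX` is a hypothetical undersampled country.

```python
def area_factors(observations, areas_km2):
    """Return, per country, its share of observations divided by its share
    of area. Under the stated assumption this is the factor by which the
    country's area grows (> 1) or shrinks (< 1) in the cartogram."""
    total_obs = sum(observations.values())
    total_area = sum(areas_km2[c] for c in observations)
    return {
        c: (observations[c] / total_obs) / (areas_km2[c] / total_area)
        for c in observations
    }


# Approximate areas; observation counts are invented for illustration only.
areas_km2 = {"NL": 41_500, "SE": 450_000, "XX": 500_000}
observations = {"NL": 4_000_000, "SE": 4_000_000, "XX": 500_000}

factors = area_factors(observations, areas_km2)
```

With identical observation counts, the smaller country receives the larger factor, which is why a small oversampled country is inflated far more than a large one.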
In the proposed method, uncertainty is shown at the country level and corresponds to the deformation of the original country area. In other words, countries drawn larger than their original size require strategies to reduce the effect of oversampling on products derived from GBIF data, while countries drawn smaller than their original size require more sampling effort. Future developments will include the visualization of species distribution model predictions combined with the maps of uncertainty presented here.
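Reading a cartogram in this way amounts to comparing each deformed polygon with its original. The sketch below uses a simple rescaling about the mean of the vertices (a non-contiguous cartogram) in place of the diffusion-based deformation described above. The `tolerance` parameter and the `"balanced"` label are assumptions introduced here for illustration.

```python
import math


def polygon_area(points):
    """Unsigned area of a simple polygon via the shoelace formula."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def rescale(points, factor):
    """Scale a polygon about the mean of its vertices so that its area is
    multiplied by `factor` (lengths scale by sqrt(factor))."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    k = math.sqrt(factor)
    return [(cx + k * (x - cx), cy + k * (y - cy)) for x, y in points]


def sampling_status(original, deformed, tolerance=0.05):
    """Label a country from the ratio of deformed to original area."""
    ratio = polygon_area(deformed) / polygon_area(original)
    if ratio > 1 + tolerance:
        return "oversampled"
    if ratio < 1 - tolerance:
        return "undersampled"
    return "balanced"


# A toy square "country", enlarged and shrunk by hypothetical factors.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
grown = rescale(square, 4.0)
shrunk = rescale(square, 0.25)
```

Here `sampling_status(square, grown)` labels the enlarged polygon as oversampled and `sampling_status(square, shrunk)` labels the reduced one as undersampled, mirroring the interpretation given in the text.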
Cartogram of species occurrences. Extracted from GBIF data (
Panels: Plants; Fungi; Animals; All taxa