This website demonstrates the results of building birdcall distribution maps with Bayesian modeling methods. I completed this project for the ISYE 6420: Bayesian Statistics course during the Fall 2022 semester of Georgia Tech’s OMSCS program. See the project report and source on GitHub for more details.

We use the geographic metadata from the BirdCLEF 2022 competition dataset to map the locations of birdcall recordings. We then fit a Poisson Generalized Linear Model (GLM) to the data to estimate covariate and random effects.
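For orientation, a Poisson GLM with a spatial random effect is often written as follows. This is a generic sketch, not necessarily the exact specification used in this project: the log link and the symbols $y_i$ (recording count in cell $i$), $x_i$ (covariates for cell $i$), and $\beta_0$ (intercept) are assumptions, while $\beta$, $\phi$, and $\mu$ follow the parameter names in the trace summary.

```latex
\begin{aligned}
y_i &\sim \text{Poisson}(\mu_i) \\
\log \mu_i &= \beta_0 + x_i^\top \beta + \phi_i
\end{aligned}
```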

Plots

We split each region into a grid (or regular lattice) and count the birdcall recordings that fall into each grid cell. We define the grid in degrees of latitude and longitude. These discrete cells help fit a Bayesian model to the data and allow us to incorporate external geographical information derived from Google Earth Engine. The cells are large enough to keep the model computationally tractable but small enough to capture the spatial variation in the data. See the Earth Engine Plots page for more information about the data we use from Google Earth Engine.
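The gridding step can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function names and the sample coordinates are hypothetical, and cells are identified simply by flooring each coordinate to a multiple of the resolution.

```python
import math
from collections import Counter


def grid_cell(lat, lon, resolution):
    """Return the (row, col) index of the cell containing a point.

    Cells are `resolution` degrees on each side. Flooring keeps negative
    coordinates in the correct cell (e.g. -0.1 falls in row -1, not row 0).
    """
    return (math.floor(lat / resolution), math.floor(lon / resolution))


def count_by_cell(points, resolution=5.0):
    """Count observations per grid cell for an iterable of (lat, lon) pairs."""
    return Counter(grid_cell(lat, lon, resolution) for lat, lon in points)


# Hypothetical recording locations (latitude, longitude), for illustration only.
recordings = [(33.7, -84.4), (34.9, -80.1), (40.7, -74.0), (-0.1, -78.5)]
counts = count_by_cell(recordings, resolution=5.0)
```

Here the first two points land in the same 5° cell, so that cell has a count of 2. These per-cell counts are the kind of response a Poisson model is fit to.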

The posterior predictive is the estimated point prediction for the number of observations in each grid cell derived from the posterior distribution of the model parameters.
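One way to obtain such a point prediction is to simulate a count for each posterior draw of a cell's rate and average the results. The sketch below assumes a Poisson likelihood and uses stand-in gamma-distributed draws in place of real model output. The function names are hypothetical, and the sampler is Knuth's simple method, which is adequate only for small rates.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate using Knuth's multiplication method.

    Suitable for the small rates used here; large rates would need a
    different sampler because exp(-rate) underflows.
    """
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def posterior_predictive_mean(rate_draws, rng):
    """Average one simulated count per posterior draw of the rate."""
    simulated = [sample_poisson(rate, rng) for rate in rate_draws]
    return sum(simulated) / len(simulated)


rng = random.Random(0)
# Stand-in posterior draws for a single cell's rate (not real model output).
rate_draws = [rng.gammavariate(4.0, 1.0) for _ in range(5000)]
prediction = posterior_predictive_mean(rate_draws, rng)
```

Because the expected value of a Poisson count equals its rate, the prediction should sit close to the average of the rate draws. The simulated counts are still useful when a predictive interval is needed as well.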

Options

Black-crowned Night-Heron (bcnher), americas, 5° resolution, 303 cells

Trace Summary

One key component of Bayesian analysis is the ability to quantify uncertainty in the model predictions. We do this by sampling from the posterior distribution of the model parameters. These samples form a trace. We can summarize a trace with a credible interval, which is a range that contains the parameter with a given posterior probability (95% in the tables below).

We take advantage of this uncertainty to classify whether a particular parameter in the model is significant. We say that a parameter in the model is significant if its credible interval does not include zero (analogous to rejecting the null hypothesis in a frequentist analysis).
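The interval and the significance rule can be sketched as follows. This is a minimal illustration that computes a highest-density interval (HDI) as the narrowest window covering 95% of the sorted samples. The function names and the simulated traces are hypothetical and stand in for real posterior samples.

```python
import math
import random


def hdi(samples, mass=0.95):
    """Return the narrowest interval that contains `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of that size over the sorted samples and keep the narrowest.
    start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[start], ordered[start + window - 1]


def is_significant(interval):
    """A parameter counts as significant when its interval excludes zero."""
    low, high = interval
    return low > 0 or high < 0


rng = random.Random(1)
# Simulated traces for illustration: one clear effect, one centered on zero.
effect_trace = [rng.gauss(2.0, 0.5) for _ in range(4000)]
null_trace = [rng.gauss(0.0, 0.5) for _ in range(4000)]

effect_is_significant = is_significant(hdi(effect_trace))
null_is_significant = is_significant(hdi(null_trace))
```

With these simulated traces, the first parameter is flagged as significant and the second is not.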

options

Black-crowned Night-Heron (bcnher), americas, 5° resolution, 303 cells

misc parameters

This table contains posterior estimates for hyper-parameters of the CAR distribution, such as $\alpha$ and $\tau$, as well as the intercept and slope parameters $\beta$ for the linear regression.

index	mean	sd	hdi_2.5%	hdi_97.5%
intercept	-12.859	2.875	-18.526	-7.842
betas[LST_Day_1km_p5]	8.218	4.002	0.464	16.194
betas[land_cover_08_woody_savannas]	-2.178	1.072	-4.322	-0.129
betas[land_cover_09_savannas]	5.372	1.815	2.058	9.124
betas[land_cover_10_grasslands]	-1.95	0.95	-3.912	-0.182
alpha	0.811	0.143	0.531	0.999
tau_phi	0.052	0.032	0.01	0.114

Note that the land cover classification features often change from species to species. We can infer habitat types or environmental preferences from the significant features in the model.
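As a small worked example, the significance rule from the Trace Summary section can be applied to the slope rows of the table above to read off the direction of each effect. The interval bounds are copied from the table. The helper name is hypothetical, and reading a positive coefficient as "more recordings" assumes an increasing link function such as the log link.

```python
# (feature, hdi_2.5%, hdi_97.5%) copied from the misc parameters table.
slope_intervals = [
    ("LST_Day_1km_p5", 0.464, 16.194),
    ("land_cover_08_woody_savannas", -4.322, -0.129),
    ("land_cover_09_savannas", 2.058, 9.124),
    ("land_cover_10_grasslands", -3.912, -0.182),
]


def effect_direction(low, high):
    """Classify an interval by whether it lies above, below, or across zero."""
    if low > 0:
        return "positive"
    if high < 0:
        return "negative"
    return "not significant"


effects = {
    name: effect_direction(low, high) for name, low, high in slope_intervals
}
```

For this species, the intervals for daytime land surface temperature and savannas lie above zero, while those for woody savannas and grasslands lie below zero.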

spatial random effect $\phi$

This measures random spatial variation across grid cells. The prior on $\phi$ is the CAR distribution, i.e.

$$\phi_i \sim \text{CAR}(\mu_i, \tau_i, \alpha, W)$$

index	mean	sd	hdi_2.5%	hdi_97.5%
phi[129.0]	2.549	1.321	0.091	5.275
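To make the role of $\alpha$, $\tau$, and $W$ concrete, one common proper CAR parameterization is shown below. It is an assumption that this matches the parameterization used here. In particular, it uses a single shared precision $\tau$, takes $W$ to be a symmetric adjacency matrix with entries $w_{ij}$, and writes the second argument of $\mathcal{N}$ as a covariance or variance.

```latex
\begin{aligned}
\phi &\sim \mathcal{N}\!\left(\mu,\ \left[\tau\,(D - \alpha W)\right]^{-1}\right),
  \qquad D = \operatorname{diag}(d_1, \dots, d_n), \quad d_i = \sum_j w_{ij} \\
\phi_i \mid \phi_{-i} &\sim \mathcal{N}\!\left(
  \mu_i + \frac{\alpha}{d_i} \sum_j w_{ij}\,(\phi_j - \mu_j),\ \frac{1}{\tau\, d_i}\right)
\end{aligned}
```

Under this form, each cell's effect is pulled toward the average of its neighbors with strength $\alpha$: $\alpha = 0$ gives independent cells, and values near 1 (such as the posterior mean of 0.811 in the misc parameters table) indicate strong spatial dependence.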

Observe how the random effects explain more of the variance in the simpler intercept_car model than in the more complex intercept_car_spatial model.

poisson parameter $\mu$

$\mu$ is the rate parameter, which controls the expected number of observations in each grid cell. Note that the notation is inconsistent here: $\mu$ is the same as the $\theta$ parameter in our model definitions, and is conventionally written as $\lambda$ for the Poisson distribution.

index	mean	sd	hdi_2.5%	hdi_97.5%
mu[0]	4.582	2.09	1.044	8.659
mu[2]	0.766	0.764	0.003	2.28
mu[14]	1.363	1.108	0.01	3.55
mu[27]	1.359	1.102	0.016	3.535
mu[42]	2.549	1.548	0.171	5.564
mu[53]	6.146	2.459	1.936	11.053
mu[64]	2.687	1.599	0.256	5.807
mu[125]	1.806	1.286	0.075	4.334
mu[129]	4.168	2.005	0.984	8.228
mu[130]	0.651	0.708	0.001	2.077
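To see what a rate means in terms of counts, the sketch below plugs the posterior mean of `mu[0]` from the table into the Poisson probability mass function. Using a single plug-in value is a simplification that ignores the posterior uncertainty in the rate, and the function name is hypothetical.

```python
import math


def poisson_pmf(k, rate):
    """Probability of observing exactly k events for a given Poisson rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


rate = 4.582  # posterior mean of mu[0] from the table above

# Probability that the cell has no recordings at all.
prob_zero = poisson_pmf(0, rate)
# Probability that the cell has at most 10 recordings.
prob_at_most_10 = sum(poisson_pmf(k, rate) for k in range(11))
# The mean of the distribution recovers the rate (truncated sum for illustration).
expected_count = sum(k * poisson_pmf(k, rate) for k in range(100))
```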


Data

Data for this project can be found in the gs://iyse6420-birdcall-distribution bucket. Here are the direct links to source data:

You can load this data directly into a Python session using pandas and pyarrow:
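A minimal sketch of such a load is shown below. The bucket name comes from the text above, but the object path is a placeholder rather than a real file in the bucket, the Parquet format is an assumption based on the mention of pyarrow, and reading `gs://` paths with pandas additionally requires the `gcsfs` package.

```python
BUCKET = "gs://iyse6420-birdcall-distribution"  # bucket named in the text


def dataset_uri(relative_path):
    """Build a full gs:// URI for an object inside the project bucket."""
    return f"{BUCKET}/{relative_path.lstrip('/')}"


def load_parquet(uri):
    """Read a Parquet file into a DataFrame.

    Assumes pandas and pyarrow are installed, plus gcsfs for gs:// paths.
    The import is local so the helper above works without those packages.
    """
    import pandas as pd

    return pd.read_parquet(uri, engine="pyarrow")


# "path/to/file.parquet" is a placeholder, not a real object in the bucket.
example_uri = dataset_uri("path/to/file.parquet")
# df = load_parquet(example_uri)  # uncomment after substituting a real path
```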

