The Cornell Lab of Ornithology and Oregon State University received a National Science Foundation award to fund BirdCast. The goal is to synthesize eBird data, night flight calls captured by acoustic monitoring stations, and clouds of migrating birds detected at night by WSR-88D weather radar stations. The proposed work will allow, for the first time, real-time predictions of bird migration: when birds migrate, where they migrate, and how far they fly. Accurate models of migration have broad application in basic research, allowing researchers to understand behavioral aspects of migration, how migration timing and pathways respond to variation in climatic conditions, and whether linkages exist between annual variation in migration timing and subsequent inter-annual changes in population size.
Collaborative Research: CDI-Type II: BirdCast: Novel Machine Learning Methods for Understanding Continent-Scale Bird Migration
PI: Steve Kelling, Cornell University, Lab of Ornithology
PI: Tom Dietterich, Oregon State University, Kelley Engineering Center
PI: Wesley Hochachka, Cornell University, Lab of Ornithology
PI: Andrew Farnsworth, Cornell University, Lab of Ornithology
Theme: From Data to Knowledge
A unique interdisciplinary team of computer scientists, statisticians, and ornithologists will develop novel computer science methods and apply them to the challenge of understanding the annual migration of birds across North America, one of the most complex and dynamic natural phenomena on the planet. While direct observation of migrating birds is limited to a handful of large birds capable of wearing tracking devices, other sources of data provide partial information about migration and, when combined, will provide insight into migration at a scale previously unimaginable. These sources include a continental-scale network of volunteer bird watchers, night flight calls captured by acoustic monitoring stations, and clouds of migrating birds detected at night by WSR-88D weather radar stations. We propose to develop two innovative machine learning techniques, Collective Graphical Models (CGMs) and Semi-Parametric Latent Process Models (SLPMs), that, when combined, will reconstruct and predict the behavior of the ~400 species of migrating birds across North America. The resulting model will be able to identify the complex conditions governing the dynamics of migration behavior, including the choice of migratory pathways, the factors that influence when birds migrate, and the speed and duration of each night’s movements. In addition, we will improve our machine learning methods for identifying bird species from their flight calls. We will develop infrastructure for collecting, and making interoperable, bird observation, flight call, and radar data, as well as covariate data from multiple sources including satellite imagery, weather, and human population data. By the end of the grant period, we will provide daily forecasts of bird migration (a daily BirdCast) and interactive tools for visualizing and understanding the fitted models. We will also provide general-purpose open-source implementations of CGMs and SLPMs.
Intellectual Merit: CGMs and SLPMs will greatly extend the scope of phenomena that can be captured with graphical models. A CGM is able to recover a model of the behavior of individuals using only collective observations. For BirdCast, it will construct a model of individual bird dynamics from observations of the counts of birds at a given location or flying over a given acoustic or radar station. We will develop algorithms for fitting CGMs to data and for predicting bird migration in real time. An SLPM is an extension of latent process models, such as the CGM for bird migration, in which the dynamics of a process are represented by latent variables that are observed only indirectly through noisy sensors. In an SLPM, the conditional probability distribution of each variable is modeled using flexible, non-parametric methods from machine learning, such as boosted regression trees. Boosted regression trees have been shown to make more accurate predictions and to require less modeling effort and data pre-processing than traditional parametric methods. However, introducing such flexible methods into latent variable models raises difficult challenges for model fitting and validation. We will develop novel information regularization and latent model cross-validation methods to prevent over-fitting and enforce latent variable semantics. Latent process models arise in domains as varied as economics, psychology, and engineering. Hence, the resulting algorithms will have broad application throughout science and engineering.
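The setting a CGM addresses, individual behavior that is visible only through collective counts, can be illustrated with a toy simulation. The following is a minimal sketch and not the project's fitting algorithm: the three-cell map, the transition matrix `P`, and the function names are all hypothetical, chosen only to show how individual trajectories are discarded and only per-cell counts remain.

```python
import random

# Hypothetical 3-cell world (south, middle, north) and an illustrative
# per-night transition matrix; these values are assumptions, not fitted.
P = [
    [0.6, 0.4, 0.0],  # from south: stay, or move to middle
    [0.0, 0.5, 0.5],  # from middle: stay, or move to north
    [0.0, 0.0, 1.0],  # north is the destination (absorbing)
]
N_CELLS = len(P)


def simulate_counts(n_birds, n_nights, rng):
    """Move each bird independently, but keep only per-cell counts."""
    positions = [0] * n_birds  # every bird starts in the southern cell
    history = [[positions.count(c) for c in range(N_CELLS)]]
    for _ in range(n_nights):
        positions = [
            rng.choices(range(N_CELLS), weights=P[pos])[0] for pos in positions
        ]
        # The individual identities are dropped here: this is the
        # "collective observation" that a count-based model would see.
        history.append([positions.count(c) for c in range(N_CELLS)])
    return history


def expected_counts(n_birds, n_nights):
    """Propagate expected counts night by night: n_{t+1} = n_t P."""
    expected = [[float(n_birds)] + [0.0] * (N_CELLS - 1)]
    for _ in range(n_nights):
        prev = expected[-1]
        expected.append(
            [sum(prev[i] * P[i][j] for i in range(N_CELLS)) for j in range(N_CELLS)]
        )
    return expected


observed = simulate_counts(1000, 5, random.Random(0))
expected = expected_counts(1000, 5)
for night, (obs, exp) in enumerate(zip(observed, expected)):
    print(night, obs, [round(x, 1) for x in exp])
```

The simulation runs in the forward direction only, from a known individual-level model to counts. The problem described above is the inverse one: inferring something like `P` when only the count history is available.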
Broader Impacts: The proposed work will allow, for the first time, real-time predictions of bird migration: when birds migrate, where they migrate, and how far they fly. Knowledge of migratory behavior will aid decisions about the placement of wind turbines and identify nights on which lighting of tall buildings could be reduced to prevent the deaths of millions of birds. Accurate models of migration also have broad application in basic research, allowing researchers to understand behavioral aspects of migration, how migration timing and pathways respond to variation in climatic conditions, and whether linkages exist between annual variation in migration timing and subsequent inter-annual changes in population size.
Novel web-based data visualizations for communicating migration predictions will also give BirdCast strong potential for public outreach, with applications in informal education about computer science, ecology, and conservation. The same visualization tools will provide an appealing avenue for school classes and the general public to “see” the dynamic process of migration in action, strengthening their connection to the natural world. Further, by integrating these applications into education and outreach activities already managed by the Cornell Lab of Ornithology, we can introduce vast new audiences to exciting advances in computer science.