Whale species produce a wide range of vocalizations, from very low to very high frequencies, and these calls vary by species and location. By analyzing whale vocalizations, researchers can estimate population sizes, track changes over time, and inform conservation strategies, including protected-area designation and mitigation measures. Effective monitoring is therefore essential, but the complexity of whale calls, especially from elusive species, and the sheer volume of underwater audio data make it difficult to develop models that automatically classify multiple whale species.
Current methods for identifying animal species through sound are more advanced for birds than for whales: models like Google's Perch can classify thousands of bird vocalizations. Comparable multi-species classification models for whales are harder to develop because of the diversity of whale vocalizations and the lack of comprehensive data for some species. Earlier efforts, including models developed by Google Research in partnership with NOAA and other organizations, focused on individual species such as humpback whales; those models helped classify humpback calls and identified new locations of whale activity.
To address the limitations of previous models, Google researchers developed a new whale bioacoustics model capable of classifying vocalizations from eight distinct species, including the mysterious “Biotwang” sound attributed to the Bryde’s whale. The new model expands on earlier efforts by classifying multiple species and vocalization types, and it is designed for large-scale application to long-term passive acoustic recordings.
The proposed whale bioacoustics model processes audio by converting each 5-second window of sound into a spectrogram image. The model’s front-end uses mel-scaled frequency axes and log amplitude compression. It then classifies these spectrograms into one of 12 classes, corresponding to eight whale species and several specific vocalization types. To ensure accurate classifications and minimize false positives, the model was trained not only on positive examples but also on negative and background-noise data. Measured by the area under the receiver operating characteristic curve (AUC), the model showed strong discriminative ability, particularly for species such as minke and Bryde’s whales.
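The article does not spell out the exact front-end parameters, but the overall pipeline it describes — 5-second windows turned into mel-scaled, log-compressed spectrograms that feed a 12-way classifier — can be sketched roughly as below. The sample rate, STFT settings, mel-bin count, and the small CNN head are illustrative assumptions, not the published architecture.

```python
import tensorflow as tf

SAMPLE_RATE = 24_000   # assumed; the article does not state the model's sample rate
WINDOW_SECONDS = 5     # the 5-second analysis window described in the article
NUM_MEL_BINS = 128     # assumed
FRAME_LENGTH = 1024    # assumed STFT frame length (samples)
FRAME_STEP = 256       # assumed STFT hop (samples)
NUM_CLASSES = 12       # eight species plus several call-type classes

def log_mel_spectrogram(waveform: tf.Tensor) -> tf.Tensor:
    """Convert a mono 5-second waveform into a log-compressed mel spectrogram."""
    stft = tf.signal.stft(waveform, frame_length=FRAME_LENGTH, frame_step=FRAME_STEP)
    power = tf.abs(stft) ** 2
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=NUM_MEL_BINS,
        num_spectrogram_bins=power.shape[-1],
        sample_rate=SAMPLE_RATE,
        lower_edge_hertz=10.0,            # assumed band edges
        upper_edge_hertz=SAMPLE_RATE / 2,
    )
    mel = tf.matmul(power, mel_matrix)
    return tf.math.log(mel + 1e-6)        # log amplitude compression

def build_classifier(input_shape) -> tf.keras.Model:
    """A placeholder CNN head mapping spectrogram 'images' to the 12 classes."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape + (1,)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

if __name__ == "__main__":
    # Five seconds of synthetic audio stands in for a real hydrophone recording.
    waveform = tf.random.normal([SAMPLE_RATE * WINDOW_SECONDS])
    spec = log_mel_spectrogram(waveform)
    model = build_classifier(tuple(spec.shape))
    probs = model(spec[tf.newaxis, ..., tf.newaxis])
    print(probs.shape)  # (1, 12) -- one score per species/call-type class
```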
Beyond the classification task itself, the model helped researchers uncover new insights into species’ movements, including differences between the central and western Pacific Bryde’s whale populations. By labeling over 200,000 hours of underwater recordings, it also revealed the seasonal migration patterns of some species. The model is now publicly available on Kaggle for further use in whale conservation and research.
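For readers who want to try the released model, a rough sketch of pulling it from Kaggle Models and scanning a long recording in 5-second hops might look like the following. The model handle, the expected sample rate, and the call convention on the loaded SavedModel are assumptions here; the model card on Kaggle is the authority on the actual interface.

```python
import numpy as np
import tensorflow as tf
import kagglehub

MODEL_HANDLE = "google/multispecies-whale/tensorFlow2/default"  # assumed handle
SAMPLE_RATE = 24_000                                            # assumed
WINDOW = 5 * SAMPLE_RATE                                        # 5-second windows

# Download the released SavedModel from Kaggle Models and load it.
model_dir = kagglehub.model_download(MODEL_HANDLE)
model = tf.saved_model.load(model_dir)

# Stand-in for hours of hydrophone audio; in practice this would be read from a
# WAV/FLAC file and resampled to the model's expected rate.
recording = np.random.randn(60 * SAMPLE_RATE).astype(np.float32)

for start in range(0, len(recording) - WINDOW + 1, WINDOW):
    window = recording[start : start + WINDOW]
    # Assumed: the SavedModel is directly callable on a (batch, samples) waveform
    # and returns per-class scores; check the Kaggle model card for the real signature.
    scores = model(tf.constant(window[np.newaxis, :]))
    print(start // SAMPLE_RATE, "s:", int(np.argmax(scores)))  # top-scoring class index
```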
In conclusion, Google’s new whale bioacoustics model is a significant advance in the field, addressing the challenge of multi-species classification: it not only recognizes eight species but also provides detailed insights into their ecology. By offering scalable, accurate classification of underwater audio data, it is a valuable tool for marine biology research, furthering our understanding of whale populations, especially elusive species like Bryde’s whales.
Check out the Paper and Blog. All credit for this research goes to the researchers of this project.
Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT) Kharagpur. She is a tech enthusiast and has a keen interest in the scope of software and data science applications. She is always reading about the developments in different fields of AI and ML.