Queensland scientists have examined automated audio detection methods and machine learning to better identify frog calls in the wild in the hopes that these tools can be used for the management and conservation of biodiversity throughout Australia.
In a paper published in Ecological Indicators¹, researchers from James Cook University and the Queensland University of Technology taught three different computer models to detect frog calls in lengthy recordings from eight separate frog breeding sites in northern Queensland.
The researchers discovered that their models could identify common species of frogs with greater than 80 per cent accuracy after learning by listening to thousands of minutes of recordings.
“Acoustic monitoring is becoming an important way to monitor whole communities in their natural habitats. Sound recordings are valuable for ecological studies of vocal animals and habitat disturbance, as well as for monitoring native and invasive species,” said lead author of the paper and PhD candidate at JCU’s College of Science & Engineering, Sheryn Brodie.
Tens of thousands of hours of audio
Brodie told Lab Down Under that this work was part of her broader PhD research, which uses sound recordings to study the chorusing patterns of frogs in north Queensland. Because sound recording technology allows frogs to be monitored continuously over long periods, Brodie had more than 30,000 hours of audio, recorded over 18 months at multiple sites.
“I’m attempting to describe the chorusing patterns of all the frog species present, so I need a way to automate some of the analysis,” she said.
To achieve this, Brodie and her team collaborated with the Ecoacoustics Research Group at QUT and accessed the tools they had available. In particular, the “false-colour” spectrograms developed by Dr Michael Towsey of QUT’s School of Computer Science were “most useful” for the detection of frogs, Brodie said.
“These spectrograms are based on three acoustic indices mapped to red, green and blue and combined to form a colour image. In most cases I am able to visually identify which species are active on a night from these images,” she told Lab Down Under.
“This led us to wanting to test if we could use the acoustic indices as predictors in a statistical classification model which might give us a method of automating species identification.”
What are acoustic indices?
When listening to a recording, we may hear the audio as coming across continuously in one smooth wave of sound. However, these recordings are actually a series of individual measurements of sound energy taken so closely together that they sound continuous. A typical recording may have 48,000 measurements per second, for instance.
An acoustic index is a “summary” of these measurements taken over a set length of time, ranging from a few seconds or minutes to the whole recording, Brodie said.
“In a sense it’s like averaging the signal over a chosen length with the aim of characterising the sound patterns rather than focusing on each individual noise.”
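As a rough illustration of the "summary" idea, the sketch below reduces a waveform to one number per time window. It uses root-mean-square (RMS) energy purely as a stand-in; RMS is not stated to be one of the indices used in the study, and the function name `rms_per_window` and the synthetic 440 Hz tone are assumptions made for this example.

```python
import math

def rms_per_window(samples, sample_rate, window_seconds):
    """Summarise a waveform as one RMS value per fixed-length window."""
    window = int(sample_rate * window_seconds)
    summaries = []
    # Step through the recording one whole window at a time.
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        summaries.append(math.sqrt(sum(x * x for x in chunk) / window))
    return summaries

# Two seconds of synthetic audio at 48,000 measurements per second:
# a quiet first second followed by a louder second (hypothetical data).
sample_rate = 48_000
quiet = [0.1 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
loud = [0.8 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

summary = rms_per_window(quiet + loud, sample_rate, window_seconds=1.0)
print(summary)  # two values: one per second of audio
```

Here 96,000 individual measurements collapse into two values, one per second, which is the kind of compression an acoustic index provides.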
Various types of acoustic indices have been developed for different purposes. One example is the acoustic complexity index which uses the variation of sound intensity to spot bird calls.
“The idea being that a recording of a forest with birds calling will have high acoustic complexity, whereas a recording of urban traffic, although loud, will have low acoustic complexity,” Brodie said.
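The contrast Brodie describes can be sketched with a simplified form of the acoustic complexity index: for each frequency band, add up the changes in intensity between neighbouring time steps and divide by the band's total intensity. This is a minimal sketch under that assumption, not the exact calculation used in the study, and the two small intensity grids are invented for illustration.

```python
def acoustic_complexity(intensity):
    """Simplified acoustic complexity.

    intensity[f][t] is the sound intensity in frequency band f at time step t.
    """
    total = 0.0
    for row in intensity:
        # Sum of absolute changes between neighbouring time steps.
        change = sum(abs(b - a) for a, b in zip(row, row[1:]))
        energy = sum(row)
        if energy > 0:
            total += change / energy
    return total

# Hypothetical data: rapidly varying intensities (bird-like) versus
# loud but steady intensities (traffic-like).
birdsong = [[0.1, 0.9, 0.1, 0.8], [0.7, 0.1, 0.9, 0.2]]
traffic = [[0.9, 0.9, 0.9, 0.9], [0.8, 0.8, 0.8, 0.8]]

aci_birdsong = acoustic_complexity(birdsong)
aci_traffic = acoustic_complexity(traffic)
print(aci_birdsong, aci_traffic)
```

The steady "traffic" grid scores zero despite being louder overall, because nothing changes from one time step to the next.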
QUT’s Ecoacoustics Research Group has found acoustic indices are most useful when combined together to highlight individual acoustic features. This led to the development of “false-colour” spectrograms that offer a visual picture of a recording’s sound content.
Image 1: Example false-colour spectrogram of a 10-hour recording used in the study. Horizontal dotted lines delineate 1000 Hz frequency intervals (0-11025 Hz). 1 pixel represents 1 minute of audio and approximately 43 Hz frequency range. Colours derive from 3 acoustic indices mapped to the 3 colour channels red, green and blue. Biotic noises featured in this image are: 5 different frog species calling from 1900 hrs (pink, yellow, blue, indigo; 500-5000 Hz); insects (blue/indigo >5000 Hz and red >3000 Hz). Picture by Sheryn Brodie. Used with permission.
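The colour mapping can be sketched as follows: three index values per minute are each rescaled to the 0-255 range and used as the red, green and blue components of one pixel. The min-max rescaling, the function names and the sample values are assumptions for illustration; the QUT tools may normalise the indices differently. The final lines check the caption's arithmetic, assuming the 0-11025 Hz range is split into 256 rows.

```python
def normalise(values):
    """Rescale a series to the range 0.0-1.0 (min-max scaling, an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def false_colour_row(index_r, index_g, index_b):
    """Combine three index series into one (R, G, B) pixel per minute."""
    channels = [normalise(series) for series in (index_r, index_g, index_b)]
    return [
        tuple(int(255 * channel[i]) for channel in channels)
        for i in range(len(index_r))
    ]

# Hypothetical index values for three consecutive minutes in one frequency band.
pixels = false_colour_row([0.0, 0.2, 1.0], [2.0, 2.0, 4.0], [10.0, 4.0, 0.0])
print(pixels)

# Caption arithmetic: 11025 Hz over an assumed 256 rows is about 43 Hz per pixel.
assumed_rows = 256
hz_per_pixel = 11025 / assumed_rows
print(round(hz_per_pixel, 2))
```

A minute where one index dominates shows up as a strongly tinted pixel, which is why different species and insects appear as differently coloured bands in the image.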
A lesson in frog identification
Almost 4,000 recordings were used in the study, taken between October 2012 and April 2014 near waterbodies in Townsville and Hervey Range. Eleven different acoustic indices were used to spot the frog calls of 12 different species.
Brodie and her team then used machine learning to train three different classification models to scan the spectrograms and improve the accuracy and predictive performance in detecting the calls of the 12 chosen species. This was done by first taking the models through training data to ‘teach’ them how to spot different frog calls, and then running the models on test data to determine their accuracy.
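The train-then-test routine can be sketched with a deliberately simple classifier. The article does not name the three models used, so the nearest-centroid classifier below is only a stand-in, and the two-value feature rows (imagined index values per minute) and labels are invented for the example.

```python
def centroid(rows):
    """Average each feature column across a list of feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def train(features, labels):
    """Learn one centroid for minutes with the species and one without."""
    present = [f for f, y in zip(features, labels) if y]
    absent = [f for f, y in zip(features, labels) if not y]
    return centroid(present), centroid(absent)

def predict(model, feature):
    """Predict 'species present' if the minute is closer to that centroid."""
    present_c, absent_c = model

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return dist(feature, present_c) < dist(feature, absent_c)

# Hypothetical training minutes: index values and whether the species called.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_y = [True, True, False, False]

# Held-out test minutes, never seen during training.
test_x = [[0.85, 0.75], [0.15, 0.25]]
test_y = [True, False]

model = train(train_x, train_y)
predictions = [predict(model, x) for x in test_x]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

Training one yes/no model per species, as sketched here, is one way to handle minutes in which several species call at once, since each model answers independently.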
The research found that acoustic indices could be used to assist the automated detection of frog calls from individual species, including when multiple species of frogs were calling at the same time. The results worked best for common species such as the cane toad, eastern dwarf tree frog and little red tree frog where precision and recall rates were greater than 80 per cent.
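For readers unfamiliar with the two measures: precision is the share of predicted detections that were correct, and recall is the share of real calling minutes that the model found. The sketch below computes both for one species from invented predictions; the numbers are illustrative and are not results from the paper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for lists of True/False values."""
    tp = sum(p and a for p, a in zip(predicted, actual))        # correct detections
    fp = sum(p and not a for p, a in zip(predicted, actual))    # false alarms
    fn = sum(a and not p for p, a in zip(predicted, actual))    # missed calls
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Ten hypothetical test minutes for one species.
predicted = [True, True, True, True, True, False, False, False, False, False]
actual = [True, True, True, True, False, True, False, False, False, False]

precision, recall = precision_recall(predicted, actual)
print(precision, recall)  # 0.8 0.8
```

In this example the model raises one false alarm and misses one real calling minute, giving 80 per cent on both measures.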
“For the less commonly heard species it is difficult to obtain enough cases for the training and testing process. So, while accuracy on the test cases was low for some species, we didn’t have many instances of these species in the test data, and accuracy might improve with more test cases,” Brodie said.
The study also showed that high prediction rates were easier to obtain when the test cases came from the same sound files as the training data.
“While our test cases were from the same recorders and same sites, they were from different nights and months. This highlights that the variability of sound in environmental recordings is not completely known or controllable.”
Accuracy from complexity
While frog calls can differ in a number of ways, such as frequency, duration and call rate, small-scale variations did not affect the acoustic indices, which in this study averaged the audio over one minute and across frequency bands roughly 43 Hertz wide.
“This is why it seems using acoustic indices to detect chorusing animals is effective because the indices are reflecting the chorus activity across a whole minute and a broad frequency range rather than detecting the individual calls at a precise time scale or frequency range,” Brodie said.
Variations in call rate did create an obstacle to high predictive performance, however. Some species of frog have slow call rates when calling alone but higher call rates when sounding out as a group. This meant that the colour in the spectrograms could differ for the same species, depending on whether it was calling alone or with other frogs of the same type.
Brodie and her team taught the models to detect both types by including calls from individuals and groups within the training data.
“That is, we chose minutes that contained only one species calling, or two or more species calling. Some training minutes contained up to seven frog species calling in a single minute. We also chose training minutes that contained other sources of noise, such as birds or vehicles, as well as silence,” she said.
The training data was thus analysed to build a model of which acoustic indices would best predict the selected species.
“I cannot say we completely got around the challenges of call variability and multiple species calling. The frogs in my study form large choruses which often saturate the audio, and this will always be challenging for any detection method.”
Making a good method better
Further improvements could be made to the detection method, Brodie said, including recording future audio in an uncompressed sound file format instead of MP3, which itself created inconsistencies in the recordings.
While Brodie set a high standard for the detection method, which included all cases of identified frog calls whether they were faint or brief, she said that having tighter controls may improve the usefulness of the model.
“For example, the classifier could be trained to identify a species only if it is calling in the foreground (i.e. louder than background noise) and for a minimum duration of time. This approach would disregard any species calling only faintly or briefly in that minute, but this may not be an issue in a long-term monitoring program when there is ample opportunity for detecting all the species calling. This trade-off would depend on the objectives of the study.”
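One way to picture the tighter rule Brodie describes is as a filter applied within each minute: count a species as present only if its frequency band stays louder than the background for some minimum number of consecutive seconds. The sketch below is hypothetical throughout; the function name, the per-second decibel levels, the 6 dB margin and the 5-second minimum are all assumptions, not values from the study.

```python
def foreground_detection(levels_db, background_db, margin_db, min_seconds):
    """Decide whether a species counts as calling in the foreground.

    levels_db holds one sound level per second for the species' frequency band.
    """
    longest = run = 0
    for level in levels_db:
        # Extend the current run while the band is clearly above background.
        run = run + 1 if level >= background_db + margin_db else 0
        longest = max(longest, run)
    return longest >= min_seconds

# Hypothetical minutes (shortened to 10 seconds each for readability).
sustained_chorus = [40, 52, 53, 55, 54, 53, 52, 41, 40, 40]
brief_faint_call = [40, 41, 52, 40, 40, 43, 40, 40, 41, 40]

strong = foreground_detection(sustained_chorus, background_db=40, margin_db=6, min_seconds=5)
faint = foreground_detection(brief_faint_call, background_db=40, margin_db=6, min_seconds=5)
print(strong, faint)  # True False
```

Raising the margin or the minimum duration would make the filter stricter, which is the trade-off between catching every call and labelling only clear ones.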
Using audio recorded from several standardised units could also remove some of the variability, Brodie said.
This method was best used on chorusing animals such as frogs and insects that had consistent, persistent calls that dominated the soundscape because of the way that acoustic indices averaged out the sound over an entire minute. There may also be some mammals this would work for, plus it could also be used to filter out “nuisance noise” such as wind and rain, Brodie told Lab Down Under.
“I would predict this method would be useful in any situation for a target noise where there is little competing noise. However, this is also true for the more common call detection methods that attempt to detect individual calls.”
Brodie is now working on research examining whether cane toad calling affects the chorusing behaviour of native frog species using the same set of recordings.
The acoustic monitoring study was funded by the Australian Research Council, which is also backing a broader research project on cane toads led by JCU Head of Zoology and Ecology Professor Lin Schwarzkopf in partnership with the Anindilyakwa Land Council on Groote Eylandt. This study will investigate whether cane toad calls have an effect on native frogs on Groote Eylandt where cane toads have yet to spread.
Professor Schwarzkopf, of JCU’s Vertebrate Ecology Lab, is involved in the Australian Acoustic Observatory (A2O), a continental scale acoustic recording project taking continuous recordings at around 400 sites across the country.
Brodie’s research was funded by the Australian Research Council and an Australian Government Research Training Program Scholarship. The Anindilyakwa Land Council were partners in the grant funding for this research.
1 Brodie S, Allen-Ankins S, Towsey M, Roe P, Schwarzkopf L. Automated species identification of frog choruses in environmental recordings using acoustic indices. Ecological Indicators, Volume 119, December 2020, 106852.
Featured image by heinnie101 from PxHere. Used under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication licence.