ecoVAD 🍀

Voice activity detection in eco-acoustic data enables privacy protection and is a proxy for human disturbance

Benjamin Cretois, Carolyn Rosten & Sarab Sethi

About me

🌳 Researcher at the Norwegian Institute for Nature Research (NINA)


💻 Currently mainly working with bioacoustics

... but also HPC, Deep learning

Background


🔉 Eco-acoustic monitoring is increasingly being used to map biodiversity across large scales


😕 Little thought is given to the privacy concerns and potential scientific value of inadvertently recorded human speech


💥 We developed an end-to-end VAD pipeline and show how VAD models can be used for anonymisation & human noise quantification!
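To make the anonymisation idea concrete, here is a minimal sketch that silences every interval a VAD model has flagged as speech. It is an illustration only, not the ecoVAD implementation: the function name `silence_segments`, the `(start_s, end_s)` detection format, and the use of plain Python lists for audio samples are all assumptions.

```python
def silence_segments(samples, sample_rate, detections):
    """Return a copy of `samples` with each detected speech interval zeroed.

    `detections` is a hypothetical list of (start_s, end_s) tuples in seconds,
    as a VAD model might produce; the real ecoVAD output format may differ.
    """
    out = list(samples)
    for start_s, end_s in detections:
        # Convert seconds to sample indices and clamp to the signal bounds.
        start = max(0, int(start_s * sample_rate))
        end = min(len(out), int(end_s * sample_rate))
        for i in range(start, end):
            out[i] = 0
    return out
```

In practice the samples would typically be a NumPy array and the silenced region could be replaced with noise rather than zeros; the sketch only shows the principle.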

Method


We developed an end-to-end pipeline (ecoVAD) and compared it with two state-of-the-art VAD models.


⭐ Note that our repo provides wrappers for using both models



Training 🌳 : We collected soundscape data that we mixed with human voices & other noises.

Testing 👨 👩 🧒 : playback experiments with speech samples from a man, a woman, and a child at 1, 5, 10 and 20 meters.

Using: we collected soundscape data from a recreational area to evaluate how well the models detect human speech.
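The training step above (mixing soundscapes with human voices) can be sketched as adding a speech clip to a background clip at a chosen signal-to-noise ratio. This is a hedged illustration rather than the ecoVAD training code: the helper names `rms` and `mix_at_snr`, the SNR parameter, and the use of plain lists of float samples are assumptions.

```python
import math


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def mix_at_snr(background, speech, snr_db):
    """Add `speech` to `background` so that speech sits `snr_db` dB
    above (or below, if negative) the background level."""
    if len(background) != len(speech):
        raise ValueError("signals must have the same length")
    speech_rms = rms(speech)
    if speech_rms == 0:
        # Nothing to mix in: return the background unchanged.
        return list(background)
    # Target speech level relative to the background, from the dB ratio.
    target_rms = rms(background) * 10 ** (snr_db / 20)
    gain = target_rms / speech_rms
    return [b + gain * s for b, s in zip(background, speech)]
```

Drawing a different SNR for each synthetic clip would be one way to expose the model to both loud and faint speech, which matters given the range of playback distances used for testing.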

Results


  • All models performed well on the playback data

  • Our custom model performed better


Discussion


⭐ The importance of training a model for a specific purpose


⭐ But ... there is a trade-off to make with generalisation (Pyannote performed better in a more "anthropogenic" area)

⭐ Speech detections as a direct measure of anthropogenic noise pollution and indirect proxy of human disturbance
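One simple way to turn speech detections into a disturbance indicator is to count them per hour of day. The sketch below assumes detections arrive as `(start_s, end_s)` offsets in seconds from a known recording start time; that format and the function name `detections_per_hour` are hypothetical and not taken from the ecoVAD repo.

```python
from collections import Counter
from datetime import datetime, timedelta


def detections_per_hour(recording_start, detections):
    """Count speech detections per hour of day.

    `detections` is a list of (start_s, end_s) offsets in seconds from
    `recording_start`; each detection is assigned to the hour it starts in.
    """
    counts = Counter()
    for start_s, _end_s in detections:
        when = recording_start + timedelta(seconds=start_s)
        counts[when.hour] += 1
    return dict(sorted(counts.items()))


# Example: a recording starting at 08:00 with four detections.
start = datetime(2022, 6, 1, 8, 0, 0)
hourly = detections_per_hour(start, [(10, 12), (3500, 3510), (3700, 3705), (7300, 7301)])
print(hourly)  # {8: 2, 9: 1, 10: 1}
```

Summing detection durations instead of counting events would give a measure closer to total speech time per hour.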



➡️ Have a look at the ecoVAD GitHub repo!

➡️ And very soon the paper in MEE!

➡️ Current work on snowscooter detections

➡️ Please reach out to us for any questions / collaboration!