Even behavior scientists who are particularly good at interpreting dog vocalizations are unable to understand all the nuances – but that could change thanks to machine learning. A team of researchers is currently developing artificial intelligence-based tools that could help us better understand what man’s best friend is trying to tell us.
This is not the first time researchers have explored this concept, but so far they have all run into the same problem: a lack of data. All language processing models must be trained on real-world examples; there is no way an algorithm can work out the meaning of a sequence of sounds from scratch, so it has to be supplied with reference material.
However, while this type of resource is very easy to obtain for humans, it is much more complicated for animals. “Logistically, animal vocalizations are much more difficult to solicit and record,” explains Artem Abzaliev, lead author of the study.
A model designed for humans
To overcome this obstacle, a team from the University of Michigan took a rather original approach: recycling a model initially developed for human speech. “Using language processing models initially trained on human speech, we opened a window into the nuances of dog barking,” explains co-author Rada Mihalcea.
Thanks to this approach, the team was able to build its project on an already solid foundation, since these systems have become remarkably sophisticated in recent years. There are already plenty of models capable of distinguishing nuances of timbre, intonation, or accent. Some can even recognize the emotions (frustration, gratitude, disgust, and so on) that come through in an audio recording. “These models are capable of learning to encode the incredibly complex patterns of human language, and we wanted to see if we could harness these capabilities to interpret dog barks,” explains Abzaliev.
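To get a concrete sense of what such a model produces, here is a minimal sketch using the Hugging Face transformers library and the facebook/wav2vec2-base checkpoint as illustrative assumptions (the paper does not specify this exact setup). A pretrained speech encoder turns raw audio into dense vectors that capture cues such as timbre and intonation, which a downstream classifier can then reuse.

```python
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2Model

# Assumed checkpoint; any Wav2Vec2-style speech encoder would work similarly.
extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

# Placeholder for one second of 16 kHz audio; a real pipeline would load
# an actual recording instead of random noise.
clip = torch.randn(16000).numpy()
inputs = extractor(clip, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    frames = encoder(**inputs).last_hidden_state  # shape: (1, frames, 768)

# Mean-pooling over time yields one fixed-size acoustic "fingerprint" per clip.
embedding = frames.mean(dim=1)
print(embedding.shape)  # torch.Size([1, 768])
```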
So his team started with Wav2Vec2, a model designed for humans, and fed it a dataset made up of recordings from 74 dogs. They came from animals of various breeds, ages, and sexes, and were collected in many different contexts (play, detection of a disruptive element such as a small animal, defense reflexes, social interactions, and so on). Using this data, Abzaliev was able to adjust the weights of the connections between the network’s artificial neurons, as well as the biases that govern them.
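In practice, this kind of fine-tuning can be sketched in a few lines. The snippet below is only an assumption about how such a pipeline might look: the checkpoint, label set, learning rate, and dummy audio clip are all hypothetical, not taken from the paper.

```python
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2ForSequenceClassification

# Hypothetical bark contexts, loosely inspired by the categories in the article.
LABELS = ["play", "disruption", "defense", "social", "pain", "attention"]

extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base", num_labels=len(LABELS)
)

# One (fake) training step: fine-tuning nudges the pretrained weights and
# biases toward the new task through ordinary backpropagation.
clip = torch.randn(16000).numpy()  # stand-in for a real bark recording
inputs = extractor(clip, sampling_rate=16000, return_tensors="pt")
target = torch.tensor([LABELS.index("play")])

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
optimizer.zero_grad()
loss = model(**inputs, labels=target).loss
loss.backward()
optimizer.step()
```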
Encouraging results
At the end of the process, the team was able to generate representations of the acoustic data collected from the dogs and interpret them. Analyzing the results, they found that the model had classified the recordings into the correct category (play, anxiety, attention seeking, pain, frustration, etc.) in 70% of cases.
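For reference, the 70% figure corresponds to plain classification accuracy: the share of clips assigned their true context. A tiny illustration with made-up labels (these are not the study's data):

```python
# Hypothetical ground-truth contexts and model predictions for ten clips.
y_true = ["play", "pain", "play", "anxiety", "attention",
          "play", "frustration", "pain", "anxiety", "play"]
y_pred = ["play", "pain", "play", "attention", "attention",
          "play", "frustration", "pain", "play", "play"]

correct = sum(t == p for t, p in zip(y_true, y_pred))
print(f"accuracy = {correct / len(y_true):.0%}")  # -> 80%
```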
The result is still fairly approximate, but far better than what models trained exclusively on animal recordings can achieve. “This is the first time that techniques optimized for human speech have been used to help decode animal communication,” Mihalcea enthused.
A resource for researchers?
Beyond the raw numbers, these results have a very interesting implication: they show that the sounds and patterns inherent in human language can serve as a foundation for analyzing the vocalizations of dogs, and perhaps even of other species. Consequently, once mature, this system could become a very useful tool, particularly for ethologists.
These researchers, who specialize in the study of animal behavior, often rely on vocalizations to study interactions within groups, behavioral particularities, and even the cognitive abilities of their species of interest. A tool like this could therefore help them pick up nuances that might otherwise go unnoticed, or simply save them time. For example, imagine a team of specialists studying primates in a difficult environment, such as a jungle. Instead of spending considerable time sifting through audio recordings to classify vocalizations and assign them to a certain type of behavior, they could hand this task over to an AI model and identify interesting relationships and trends much more quickly.
The researchers do not address this theme at all in their paper, but by extrapolating, we can also imagine that one day a generative AI system could synthesize sounds specifically calibrated to convey a very precise message to an animal. For now, this is still pure science fiction. But perhaps one day, artificial intelligence will finally allow us to “chat” with our faithful companions, to understand whale songs, or to learn more about why orcas attack boats, for example.
The text of the study is available here.
