How I addressed Kerchunking in SVXLink with VAD (Voice Activity Detection)

Introduction

In early 2024 I got into HAM radio. Tuning into local frequencies in January, I found a lively community of operators across Romania. I also ran into a common nuisance known as “kerchunking”: someone briefly keys up a repeater without identifying, either to test whether they can reach it or to deliberately disrupt the traffic and the people on it. Intrigued and somewhat troubled by this, I decided to dig deeper and tackle the problem head-on. Along the way I also became a certified HAM operator :).

The Challenge of Kerchunking

Kerchunking may look harmless, but it can noticeably degrade the quality and accessibility of HAM radio communications. Determined to find a solution, I started looking at the technical side of the problem. My research pointed me to SVXLink, a C++ application that many repeaters depend on and that has been around for over two decades. It became clear that a Voice Activity Detection (VAD) system could potentially mitigate kerchunking.

Technical Journey

With limited C++ expertise, I first asked the community for help by raising an issue on GitHub. The response was lukewarm, as many in the HAM community seemed resigned to living with kerchunking. Undeterred, I turned to libfvad, a C library built on the WebRTC VAD engine, and set out to integrate it into svxlink on my own.

[Figure: svxlink VAD processing flow]

The work was full of challenges, especially since I started with no audio processing knowledge. With perseverance and help from GPT, I got through numerous debugging sessions and learned to handle the audio stream at the SvxReflector level, focusing on Opus codec transcoding, frame accumulation, and VAD library integration.
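The frame accumulation step can be sketched roughly as follows. This is a minimal Python illustration, not the C++ code in the Reflector: the class name FrameAccumulator, the 320-sample frame size (20 ms at 16 kHz), and the 1536-sample window are assumptions chosen for the example. The idea is that decoded frames arrive in sizes the VAD model does not accept directly, so samples are queued until a full window is available.

```python
from typing import List


class FrameAccumulator:
    """Queues decoded PCM samples and hands out fixed-size windows.

    Hypothetical helper for illustration; not part of SVXLink.
    """

    def __init__(self, window_size: int) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self._pending: List[int] = []

    def push(self, samples: List[int]) -> List[List[int]]:
        """Add one decoded frame and return every complete window now available."""
        self._pending.extend(samples)
        windows: List[List[int]] = []
        while len(self._pending) >= self.window_size:
            windows.append(self._pending[: self.window_size])
            del self._pending[: self.window_size]
        return windows

    @property
    def pending(self) -> int:
        """Number of samples still waiting for a full window."""
        return len(self._pending)


# Assumed numbers: 320-sample decoded frames, 1536-sample VAD windows.
acc = FrameAccumulator(window_size=1536)
ready: List[List[int]] = []
for _ in range(5):
    ready.extend(acc.push([0] * 320))
```

With these assumed numbers, five 320-sample frames produce one full window and leave 64 samples queued for the next one.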

Although I integrated WebRTC VAD successfully, it turned out not to be the best fit: it could detect sound but struggled to tell voice from non-voice audio. This led me to Silero VAD, a more promising alternative that I first tested through a web implementation. After much effort, I managed to implement Silero VAD in the Reflector, including a buffering mechanism to improve sample accumulation and analysis.
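To make the buffering idea more concrete, here is a small Python sketch of scoring a buffer window by window and comparing each score to a threshold. The post does not say how per-window scores are combined, so the "any window at or above the threshold" rule is an assumption, as are the function names. The scorer passed in is a placeholder for the model; the stand_in_score function below is only an amplitude average used to make the example runnable and is not a voice detector.

```python
from typing import Callable, List, Sequence


def split_windows(samples: Sequence[float], window_size: int) -> List[Sequence[float]]:
    """Split a buffer into consecutive full windows; a trailing partial window is dropped."""
    full = len(samples) // window_size
    return [samples[i * window_size:(i + 1) * window_size] for i in range(full)]


def buffer_has_speech(
    samples: Sequence[float],
    window_size: int,
    threshold: float,
    score_window: Callable[[Sequence[float]], float],
) -> bool:
    """Assumed rule: the buffer counts as speech if any window scores >= threshold."""
    return any(score_window(w) >= threshold
               for w in split_windows(samples, window_size))


def stand_in_score(window: Sequence[float]) -> float:
    """Placeholder for the model's speech probability (NOT a real VAD):
    mean absolute amplitude, clipped to 1.0."""
    return min(1.0, sum(abs(x) for x in window) / len(window))


# Two windows of digital silence versus one silent and one loud window.
quiet = [0.0] * 3072
mixed = [0.0] * 1536 + [0.8] * 1536
quiet_result = buffer_has_speech(quiet, 1536, 0.3, stand_in_score)
mixed_result = buffer_has_speech(mixed, 1536, 0.3, stand_in_score)
```

In a real setup, score_window would wrap the Silero model's inference call, which is exactly what lets it tell voice apart from other sounds of similar loudness.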

Implementation and Configuration

Configuring the system to use Silero VAD is straightforward. You need the Silero ONNX model, Microsoft's ONNX Runtime library to run it, and a few additions to the svxreflector.conf file, as shown below:

[VAD_SETTINGS]
# Enable or disable the VAD feature
IS_VAD_ENABLED=true
# Path to the Silero VAD model file
SILERO_MODEL_PATH=/home/silviu/silero-vad/files/silero_vad.onnx
# Sample rate of the audio stream in Hz for the VAD model (Do not change this value unless you know what you are doing)
SAMPLE_RATE=16000
# Number of samples in the audio stream that the VAD model processes at once (should be the same as the model's input size)
WINDOW_SIZE_SAMPLES=1536
# Threshold for the VAD model to consider a frame as speech (0.0 - 1.0) - the higher the value, the more strict the VAD model is
THRESHOLD=0.3
# Number of samples sent to the VAD model at once (should be a multiple of WINDOW_SIZE_SAMPLES) - the higher the value, the more accurate the VAD model is
PROCESSED_SAMPLE_BUFFER_SIZE=7680
# The gate sample size is the number of samples that the VAD model uses to determine if the audio stream is speech or not (should be a multiple of PROCESSED_SAMPLE_BUFFER_SIZE)
VAD_GATE_SAMPLE_SIZE=46080
# List of callsigns for which the VAD is enabled (comma-separated)
VAD_ENABLED_CALLSIGNS=client1,client2,client3
# Number of first milliseconds in buffer that are replaced with silence before sending the audio stream to the VAD model to minimize the false positives
START_SILENCE_REPLACEMENT_BUFFER_MS=90
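As a quick sanity check of the values above, the sample counts can be converted to durations and the "multiple of" constraints from the comments verified. The following Python sketch does only that arithmetic; the helper names are made up for the example and are not part of SVXLink.

```python
from typing import Dict, List

# Values copied from the [VAD_SETTINGS] section above.
SETTINGS: Dict[str, int] = {
    "SAMPLE_RATE": 16000,
    "WINDOW_SIZE_SAMPLES": 1536,
    "PROCESSED_SAMPLE_BUFFER_SIZE": 7680,
    "VAD_GATE_SAMPLE_SIZE": 46080,
    "START_SILENCE_REPLACEMENT_BUFFER_MS": 90,
}


def samples_to_ms(samples: int, sample_rate: int) -> float:
    """Duration in milliseconds of a given number of samples."""
    return samples * 1000 / sample_rate


def ms_to_samples(ms: int, sample_rate: int) -> int:
    """Number of whole samples covered by a duration in milliseconds."""
    return ms * sample_rate // 1000


def check_multiples(s: Dict[str, int]) -> List[str]:
    """Return a list of violated 'should be a multiple of' constraints."""
    problems: List[str] = []
    if s["PROCESSED_SAMPLE_BUFFER_SIZE"] % s["WINDOW_SIZE_SAMPLES"]:
        problems.append("PROCESSED_SAMPLE_BUFFER_SIZE is not a multiple of WINDOW_SIZE_SAMPLES")
    if s["VAD_GATE_SAMPLE_SIZE"] % s["PROCESSED_SAMPLE_BUFFER_SIZE"]:
        problems.append("VAD_GATE_SAMPLE_SIZE is not a multiple of PROCESSED_SAMPLE_BUFFER_SIZE")
    return problems


rate = SETTINGS["SAMPLE_RATE"]
window_ms = samples_to_ms(SETTINGS["WINDOW_SIZE_SAMPLES"], rate)
buffer_ms = samples_to_ms(SETTINGS["PROCESSED_SAMPLE_BUFFER_SIZE"], rate)
gate_ms = samples_to_ms(SETTINGS["VAD_GATE_SAMPLE_SIZE"], rate)
silence_samples = ms_to_samples(SETTINGS["START_SILENCE_REPLACEMENT_BUFFER_MS"], rate)
problems = check_multiples(SETTINGS)
```

At 16 kHz, one model window is 96 ms, each processed buffer is 480 ms (5 windows), the gate spans 2880 ms (6 buffers), and the 90 ms of replaced silence corresponds to 1440 samples, slightly less than one window.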

Conclusion

Tackling kerchunking with a technical fix has been both challenging and rewarding. Integrating VAD into SVXLink is, I believe, a real step toward cleaner HAM radio communications. For fellow HAM reflector administrators and enthusiasts alike, I hope this solution proves a useful tool for making our conversations more enjoyable and interference-free.

For a detailed guide on installing and configuring SVXLink with VAD support, please visit my GitHub repository.

73 de YO6SAY!

Silviu Stroe
I'm Silviu and I run Brainic, a mobile-focused software agency. I'm also a member of Nokia and Yahoo wall of fame. My interests are in low-code/no-code development and bleeding-edge technologies.
