There is no need to download the model separately: it is loaded directly from memory, and the ONNX model is only 1.6 MB. No PyTorch, torchaudio, or similar dependencies are required. FSMN VAD was trained on ...