Abstract: In an age when the Internet is dominated by visual content, generating animated captions has become essential. It has long been a topic of interest for researchers in the Department ...
Abstract: With the emergence of audio-language models, constructing large-scale paired audio-language datasets has become essential yet challenging for model development, primarily due to the ...