I am trying to pass audio and video files through FFmpeg and determine whether their audio is entirely silent (I don't need to detect whether an audio stream is present, only whether it is silent), ideally returning a boolean, or a 0/1, at the end. I can output the silence information with:
ffmpeg -i FILE.mov -af silencedetect=noise=0.0001 -f null - 2>&1
I think I would need to check whether the last silence_duration value equals the file's duration.
FFmpeg seems to round the decimals differently depending on the value (duration = 15.67, silence_duration = 15.6667), so whatever precision is achievable under those circumstances is fine.
I'm not sure how to parse the output to do this, and any nudge in the right direction would be extremely helpful. Thanks!
Here are two example outputs that contain silence:
Entirely silent file
ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
built with Apple clang version 12.0.0 (clang-1200.0.32.29)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.4.1_3 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-avresample --enable-videotoolbox
libavutil 56. 70.100 / 56. 70.100
libavcodec 58.134.100 / 58.134.100
libavformat 58. 76.100 / 58. 76.100
libavdevice 58. 13.100 / 58. 13.100
libavfilter 7.110.100 / 7.110.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 9.100 / 5. 9.100
libswresample 3. 9.100 / 3. 9.100
libpostproc 55. 9.100 / 55. 9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/user/Desktop/SilenceAll.mov':
Metadata:
major_brand : qt
minor_version : 1
compatible_brands: qt
creation_time : 2021-11-23T16:08:58.000000Z
timecode : 01:00:00:09
Duration: 00:00:15.67, start: 0.000000, bitrate: 11795 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 8955 kb/s, 30 fps, 30 tbr, 3k tbn, 60 tbc (default)
Metadata:
handler_name : Video Handler
vendor_id : [0][0][0][0]
encoder : H.264
Stream #0:1(und): Audio: pcm_s24le (in24 / 0x34326E69), 48000 Hz, stereo, s32 (24 bit), 2304 kb/s (default)
Metadata:
handler_name : Sound Handler
vendor_id : [0][0][0][0]
Stream #0:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
Metadata:
handler_name : Timecode Handler
timecode : 01:00:00:09
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> wrapped_avframe (native))
Stream #0:1 -> #0:1 (pcm_s24le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
major_brand : qt
minor_version : 1
compatible_brands: qt
timecode : 01:00:00:09
encoder : Lavf58.76.100
Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
Metadata:
handler_name : Video Handler
vendor_id : [0][0][0][0]
encoder : Lavc58.134.100 wrapped_avframe
Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s (default)
Metadata:
handler_name : Sound Handler
vendor_id : [0][0][0][0]
encoder : Lavc58.134.100 pcm_s16le
[silencedetect @ 0x7f93de904280] silence_start: 0.49 bitrate=N/A speed=11.7x
frame= 470 fps=0.0 q=-0.0 Lsize=N/A time=00:00:15.66 bitrate=N/A speed=31.7x
video:246kB audio:2938kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 0x7f93de904280] silence_end: 15.6667 | silence_duration: 15.6667
Silence at the beginning and end of the file (two silent sections, but not silent throughout)
ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
built with Apple clang version 12.0.0 (clang-1200.0.32.29)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.4.1_3 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-avresample --enable-videotoolbox
libavutil 56. 70.100 / 56. 70.100
libavcodec 58.134.100 / 58.134.100
libavformat 58. 76.100 / 58. 76.100
libavdevice 58. 13.100 / 58. 13.100
libavfilter 7.110.100 / 7.110.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 9.100 / 5. 9.100
libswresample 3. 9.100 / 3. 9.100
libpostproc 55. 9.100 / 55. 9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/user/Desktop/SilenceTopTail.mov':
Metadata:
major_brand : qt
minor_version : 1
compatible_brands: qt
creation_time : 2021-11-23T16:09:32.000000Z
timecode : 01:00:00:09
Duration: 00:00:15.67, start: 0.000000, bitrate: 11795 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 8955 kb/s, 30 fps, 30 tbr, 3k tbn, 60 tbc (default)
Metadata:
handler_name : Video Handler
vendor_id : [0][0][0][0]
encoder : H.264
Stream #0:1(und): Audio: pcm_s24le (in24 / 0x34326E69), 48000 Hz, stereo, s32 (24 bit), 2304 kb/s (default)
Metadata:
handler_name : Sound Handler
vendor_id : [0][0][0][0]
Stream #0:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
Metadata:
handler_name : Timecode Handler
timecode : 01:00:00:09
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> wrapped_avframe (native))
Stream #0:1 -> #0:1 (pcm_s24le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
major_brand : qt
minor_version : 1
compatible_brands: qt
timecode : 01:00:00:09
encoder : Lavf58.76.100
Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
Metadata:
handler_name : Video Handler
vendor_id : [0][0][0][0]
encoder : Lavc58.134.100 wrapped_avframe
Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s (default)
Metadata:
handler_name : Sound Handler
vendor_id : [0][0][0][0]
encoder : Lavc58.134.100 pcm_s16le
[silencedetect @ 0x7fb140d0d100] silence_start: 0.49 bitrate=N/A speed=12.8x
[silencedetect @ 0x7fb140d0d100] silence_end: 5.16667 | silence_duration: 5.16667
[silencedetect @ 0x7fb140d0d100] silence_start: 10.1
frame= 470 fps=0.0 q=-0.0 Lsize=N/A time=00:00:15.66 bitrate=N/A speed=31.9x
video:246kB audio:2938kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 0x7fb140d0d100] silence_end: 15.6667 | silence_duration: 5.56667
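To make the idea concrete, here is a rough Python sketch of the comparison I have in mind, assuming the stderr text from the command above has already been captured into a string. The function name `is_entirely_silent`, the two regular expressions, and the 0.05 s tolerance are my own placeholders rather than anything FFmpeg provides, and the sample strings are just excerpts of the two outputs above.

```python
import re

# Matches the container duration line, e.g. "Duration: 00:00:15.67, start: ..."
DURATION_RE = re.compile(r"Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)")
# Matches each reported interval, e.g. "silence_duration: 15.6667"
SILENCE_DURATION_RE = re.compile(r"silence_duration:\s*(\d+(?:\.\d+)?)")


def is_entirely_silent(log: str, tolerance: float = 0.05) -> bool:
    """Return True if the last silence_duration is within `tolerance`
    seconds of the reported Duration (tolerance is an assumed value)."""
    duration_match = DURATION_RE.search(log)
    silence_durations = SILENCE_DURATION_RE.findall(log)
    if duration_match is None or not silence_durations:
        # No duration line or no completed silence interval: not silent.
        return False
    hours, minutes, seconds = duration_match.groups()
    total = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return abs(total - float(silence_durations[-1])) <= tolerance


# Excerpts of the two example outputs above.
SILENT_LOG = """
Duration: 00:00:15.67, start: 0.000000, bitrate: 11795 kb/s
[silencedetect @ 0x7f93de904280] silence_start: 0.49
[silencedetect @ 0x7f93de904280] silence_end: 15.6667 | silence_duration: 15.6667
"""

TOP_TAIL_LOG = """
Duration: 00:00:15.67, start: 0.000000, bitrate: 11795 kb/s
[silencedetect @ 0x7fb140d0d100] silence_start: 0.49
[silencedetect @ 0x7fb140d0d100] silence_end: 5.16667 | silence_duration: 5.16667
[silencedetect @ 0x7fb140d0d100] silence_start: 10.1
[silencedetect @ 0x7fb140d0d100] silence_end: 15.6667 | silence_duration: 5.56667
"""

print(is_entirely_silent(SILENT_LOG))    # True
print(is_entirely_silent(TOP_TAIL_LOG))  # False
```

The tolerance is only there to absorb the rounding difference between 15.67 and 15.6667; I have not checked whether the container duration and the audio stream length can differ by more than that in other files. The log text itself could presumably be captured with something like `subprocess.run([...], stderr=subprocess.PIPE, text=True).stderr`, since FFmpeg writes this output to stderr.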