5 votes

Detecting silent audio with FFmpeg (entire file)

I'm trying to pass audio and video files through FFmpeg and determine whether their audio is entirely silent (I don't need to detect whether an audio stream is present, only whether it is silent), ideally returning a boolean or a 0/1 at the end. I can output the silence information with:

ffmpeg -i FILE.mov -af silencedetect=noise=0.0001 -f null - 2>&1

I think I would need to check whether the last silence_duration value equals the file's duration.

FFmpeg's output seems to round decimals differently depending on the value (duration = 15.67, silence_duration = 15.6667), so whatever precision is achievable under these circumstances is fine.

I'm not sure how to parse the output to do this, and any nudge in the right direction would be extremely helpful. Thanks!

Here are two possible examples that contain silence:

Entirely silent file

ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with Apple clang version 12.0.0 (clang-1200.0.32.29)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.4.1_3 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-avresample --enable-videotoolbox
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/user/Desktop/SilenceAll.mov':
  Metadata:
    major_brand     : qt  
    minor_version   : 1
    compatible_brands: qt  
    creation_time   : 2021-11-23T16:08:58.000000Z
    timecode        : 01:00:00:09
  Duration: 00:00:15.67, start: 0.000000, bitrate: 11795 kb/s
  Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 8955 kb/s, 30 fps, 30 tbr, 3k tbn, 60 tbc (default)
    Metadata:
      handler_name    : Video Handler
      vendor_id       : [0][0][0][0]
      encoder         : H.264
  Stream #0:1(und): Audio: pcm_s24le (in24 / 0x34326E69), 48000 Hz, stereo, s32 (24 bit), 2304 kb/s (default)
    Metadata:
      handler_name    : Sound Handler
      vendor_id       : [0][0][0][0]
  Stream #0:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
    Metadata:
      handler_name    : Timecode Handler
      timecode        : 01:00:00:09
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> wrapped_avframe (native))
  Stream #0:1 -> #0:1 (pcm_s24le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    major_brand     : qt  
    minor_version   : 1
    compatible_brands: qt  
    timecode        : 01:00:00:09
    encoder         : Lavf58.76.100
  Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
    Metadata:
      handler_name    : Video Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc58.134.100 wrapped_avframe
  Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      handler_name    : Sound Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc58.134.100 pcm_s16le
[silencedetect @ 0x7f93de904280] silence_start: 0.49 bitrate=N/A speed=11.7x    
frame=  470 fps=0.0 q=-0.0 Lsize=N/A time=00:00:15.66 bitrate=N/A speed=31.7x    
video:246kB audio:2938kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 0x7f93de904280] silence_end: 15.6667 | silence_duration: 15.6667

Silence at the start and end of the file (two silent sections, but not silent throughout)

ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with Apple clang version 12.0.0 (clang-1200.0.32.29)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.4.1_3 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-avresample --enable-videotoolbox
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/user/Desktop/SilenceTopTail.mov':
  Metadata:
    major_brand     : qt  
    minor_version   : 1
    compatible_brands: qt  
    creation_time   : 2021-11-23T16:09:32.000000Z
    timecode        : 01:00:00:09
  Duration: 00:00:15.67, start: 0.000000, bitrate: 11795 kb/s
  Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 8955 kb/s, 30 fps, 30 tbr, 3k tbn, 60 tbc (default)
    Metadata:
      handler_name    : Video Handler
      vendor_id       : [0][0][0][0]
      encoder         : H.264
  Stream #0:1(und): Audio: pcm_s24le (in24 / 0x34326E69), 48000 Hz, stereo, s32 (24 bit), 2304 kb/s (default)
    Metadata:
      handler_name    : Sound Handler
      vendor_id       : [0][0][0][0]
  Stream #0:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
    Metadata:
      handler_name    : Timecode Handler
      timecode        : 01:00:00:09
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> wrapped_avframe (native))
  Stream #0:1 -> #0:1 (pcm_s24le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    major_brand     : qt  
    minor_version   : 1
    compatible_brands: qt  
    timecode        : 01:00:00:09
    encoder         : Lavf58.76.100
  Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
    Metadata:
      handler_name    : Video Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc58.134.100 wrapped_avframe
  Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      handler_name    : Sound Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc58.134.100 pcm_s16le
[silencedetect @ 0x7fb140d0d100] silence_start: 0.49 bitrate=N/A speed=12.8x    
[silencedetect @ 0x7fb140d0d100] silence_end: 5.16667 | silence_duration: 5.16667
[silencedetect @ 0x7fb140d0d100] silence_start: 10.1
frame=  470 fps=0.0 q=-0.0 Lsize=N/A time=00:00:15.66 bitrate=N/A speed=31.9x    
video:246kB audio:2938kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 0x7fb140d0d100] silence_end: 15.6667 | silence_duration: 5.56667

2 votes

You can pipe the output of ffmpeg through awk for further processing:

ffmpeg ... | awk '/silence_end/ && ($5 == $8) {print "silent"}'

I didn't bother checking silence_start because, for the entire audio to be silent, silence_end has to match silence_duration anyway.
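
One caveat worth noting: silence_end being equal to silence_duration only shows that a silent section started at 0. The second log in the question also contains such a line (silence_end: 5.16667 | silence_duration: 5.16667), so that check alone would likely report that file as silent too. A minimal sketch of a stricter check, assuming the log layout shown in the question, also compares silence_end against the Duration line within a tolerance. The function name is_fully_silent and the 0.05 s default tolerance are illustrative choices, not part of ffmpeg.

```shell
# Reads ffmpeg log text on stdin; exit status 0 = fully silent, 1 = not.
# Optional first argument: tolerance in seconds (hypothetical default 0.05).
# Assumes the "Duration:" and "silence_end: X | silence_duration: Y" layout
# seen in the logs above.
is_fully_silent() {
    awk -v tol="${1:-0.05}" '
        function abs(x) { return x < 0 ? -x : x }

        # "Duration: HH:MM:SS.xx," -> total length in seconds
        /Duration:/ {
            for (i = 1; i < NF; i++)
                if ($i == "Duration:") {
                    split($(i + 1), t, ":")
                    total = t[1] * 3600 + t[2] * 60 + t[3]
                }
        }

        # Look fields up by label instead of by fixed position
        /silence_end:/ {
            for (i = 1; i < NF; i++) {
                if ($i == "silence_end:")      end_t = $(i + 1)
                if ($i == "silence_duration:") dur = $(i + 1)
            }
            # Silence began at 0 and ran to the end of the file
            if (abs(end_t - dur) <= tol && abs(total - end_t) <= tol)
                found = 1
        }

        END { exit(found ? 0 : 1) }
    '
}

# Possible usage, printing 1 for silent and 0 otherwise:
# ffmpeg -i FILE.mov -af silencedetect=noise=0.0001 -f null - 2>&1 \
#     | is_fully_silent && echo 1 || echo 0
```

Looking fields up by their labels rather than by position is a deliberate choice here, since progress output can end up on the same line as a silencedetect message and shift the field numbers. The comparison uses the container duration, so a file whose audio stream is shorter than its video may need a larger tolerance.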
