What You’ll Learn

  • The basic command to overlay BGM (an .mp3) on a video (an .mp4)
  • How to keep vs. replace the original audio track
  • Lowering BGM to 30% so the voiceover stays audible
  • Handling BGM that is shorter or longer than the video (apad, aloop, shortest)
  • Adding fade-in and fade-out to the BGM
  • Chaining intro / main / outro tracks into a single BGM
  • Auto-ducking the BGM under narration
  • Stream-mapping pitfalls and how to avoid them
  • Batch-applying the same BGM to many videos

Tested with: FFmpeg 8.1
Platform: Windows / macOS / Linux


Basic Command: Overlaying BGM on a Video

The most useful pattern — original audio plus BGM — uses the amix filter. You cannot use -c:a copy here because the audio must be re-encoded after mixing.

ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[0:a][1:a]amix=inputs=2:duration=shortest:dropout_transition=0[aout]" \
  -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4

There are three things to notice. -filter_complex mixes the two audio streams, -map 0:v:0 keeps the video, and -map "[aout]" selects the mixed audio. The video is copied (-c:v copy), so the command finishes in seconds even on long files.

Key amix options

| Option | Default | Description |
| --- | --- | --- |
| inputs | 2 | Number of inputs to mix |
| duration | longest | longest, shortest, or first decides the output length |
| dropout_transition | 2 | Time (seconds) over which the volume is renormalised after an input ends |
| weights | 1 1 | Per-input weight; combines with the volume filter |
| normalize | 1 | Scales the inputs automatically so the sum does not clip |

amix vs amerge

These two filters are easily confused but solve different problems.

| Filter | Purpose | Output channels |
| --- | --- | --- |
| amix | Sums signals that share the same channel layout | Same as inputs |
| amerge | Concatenates channels (e.g., 2 mono → 1 stereo) | Sum of input channels |

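For “video + BGM” you want the signals summed, so always use amix. amerge keeps every input channel separate instead: two mono inputs become a stereo file with the original audio on the left and the BGM on the right, and two stereo inputs become a four-channel file. That is almost never what you want.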


Replacing Original Audio with BGM

If you want to throw away the original audio and keep only the BGM, skip the filter graph: leave the source audio unmapped and map the BGM stream directly.

ffmpeg -i input.mp4 -i bgm.mp3 \
  -map 0:v:0 -map 1:a:0 -c:v copy -c:a aac -b:a 192k -shortest output.mp4

-shortest ends the output when the shorter input ends, so a long BGM gets trimmed automatically. A BGM that is shorter than the video cuts the video short instead, so loop or pad it first. Re-encode to AAC (-c:a aac) rather than copying MP3 into MP4: many players choke on MP3-in-MP4 even though FFmpeg accepts it.


Adjusting Volume Balance

“BGM is too loud and drowns out the voice” is the single most common complaint. Insert a volume filter before amix to attenuate only the BGM.

ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[1:a]volume=0.3[bgm];[0:a][bgm]amix=inputs=2:duration=shortest:dropout_transition=0[aout]" \
  -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4

volume=0.3 is roughly -10 dB. Practical starting points:

| Use case | Original audio volume | BGM volume |
| --- | --- | --- |
| Tutorial / explainer | 1.0 | 0.15–0.25 |
| Vlog (keep ambience) | 0.7 | 0.30–0.40 |
| Silent footage with BGM | (muted) | 1.0 |
| Music-led content | 0.20 | 1.0 |

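You can also use dB syntax: volume=-12dB. Note that amix defaults to normalize=1, which scales the inputs down so the sum cannot clip (two equally weighted inputs are each roughly halved) and brings the remaining input back up when the other one ends. The ratio between voice and BGM is preserved, but the absolute levels no longer match your volume values; pass normalize=0 if you want the exact mix.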
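
To move between the linear gains in the table and the dB syntax, the conversion is dB = 20 × log10(gain). The sketch below does that arithmetic in the shell with awk. gain_to_db and db_to_gain are hypothetical helper names introduced here for illustration; they are not FFmpeg features.

```shell
# Convert between linear gain (volume=0.3) and decibels (volume=-10.46dB).
# Helper names are hypothetical. awk's log() is the natural logarithm,
# so dividing by log(10) gives log10.

gain_to_db() {
  awk -v g="$1" 'BEGIN { printf "%.2f\n", 20 * log(g) / log(10) }'
}

db_to_gain() {
  awk -v d="$1" 'BEGIN { printf "%.3f\n", 10 ^ (d / 20) }'
}

gain_to_db 0.3    # prints -10.46
db_to_gain -12    # prints 0.251
```

Read this way, volume=0.3 and volume=-10.46dB should describe about the same attenuation, and volume=-12dB corresponds to a linear gain of about 0.25.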

Verify with a single input

To rehearse the same chain against a single audio file, split the source with asplit and mix the copies.

ffmpeg -i input.mp3 -filter_complex "[0:a]asplit=2[a1][a2];[a1]volume=0.3[bgm];[a2]volume=1.0[voice];[bgm][voice]amix=inputs=2:duration=longest:dropout_transition=0[out]" -map "[out]" -c:a libmp3lame output.mp3

asplit=2 duplicates the input into two streams, one of which gets attenuated before being mixed back together. Replace the split with two real -i inputs in production.


When BGM Is Shorter or Longer Than the Video

When the BGM and the video differ in length (say, a 60-second video paired with a 30-second BGM, or the reverse), you have four options:

| Situation | Filter | Effect |
| --- | --- | --- |
| BGM short: pad with silence | apad | Appends silence to the end |
| BGM short: loop instead | aloop | Repeats the BGM |
| BGM long: trim to video | amix=duration=shortest or -shortest | Stops at the shorter input |
| BGM long: clean cut | atrim + afade | Cuts at a chosen second with a fade |

Loop the BGM until the video ends

ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[1:a]aloop=loop=-1:size=2e+09[loop];[0:a][loop]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
  -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4

aloop=loop=-1 repeats forever; size is the maximum number of samples held in the loop, and 2e+09 is large enough to take in the whole track. duration=first clips the output to the video length.
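
If you would rather use a finite loop count than -1, the arithmetic is a ceiling division of the two durations. The sketch below is a minimal helper under one assumption: that a loop count means repeats after the first play, which is how the loop option is usually described but is worth checking against your build. extra_loops is a hypothetical name.

```shell
# extra_loops VIDEO_SECONDS BGM_SECONDS
# Prints how many repeats beyond the first play are needed to cover the video.
# Assumption: the loop count excludes the first play.
extra_loops() {
  awk -v v="$1" -v b="$2" 'BEGIN {
    if (b <= 0) { print "error: BGM length must be positive" > "/dev/stderr"; exit 1 }
    plays = int(v / b)
    if (plays * b < v) plays++      # round up to whole plays
    if (plays < 1) plays = 1        # the BGM always plays at least once
    print plays - 1
  }'
}

extra_loops 60 30   # prints 1
extra_loops 95 30   # prints 3
extra_loops 20 30   # prints 0
```

The printed number could stand in for -1 in aloop=loop=; duration=first still trims whatever overshoots the video.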

Pad the BGM tail with silence

ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[1:a]apad[pad];[0:a][pad]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
  -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4

apad with no arguments appends infinite silence; duration=first makes amix stop at the video length. The result: BGM plays through, then silence fills the remainder.

Test apad alone (single input)

ffmpeg -i input.mp3 -af "apad=pad_dur=5,atrim=0:25" output.mp3

This adds 5 seconds of silence to a 20-second clip and trims to exactly 25 seconds — a quick way to confirm the length-adjustment logic before plugging it into the bigger graph.


BGM Fade-In / Fade-Out

A BGM that pops in at full volume sounds amateur. Add a 2-second fade-in and a 3-second fade-out via afade.

ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[1:a]volume=0.3,afade=t=in:st=0:d=2,afade=t=out:st=57:d=3[bgm];[0:a][bgm]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
  -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4

afade=t=out:st=57:d=3 means “start fading at second 57, take 3 seconds to reach silence”. For a 60-second video that lands the fade exactly at the end. In general, set st to the video duration minus d; a fade that starts at or after the end of the output is never heard.
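
Hard-coding st=57 only works for a 60-second video. A minimal sketch of computing it instead: fade_out_start and media_duration are hypothetical helper names, and media_duration assumes ffprobe is on the PATH (it is defined here but not called, so the block runs without any media file).

```shell
# fade_out_start DURATION FADE_LEN -> start second for afade=t=out
# Clamps at 0 when the fade is longer than the clip.
fade_out_start() {
  awk -v d="$1" -v f="$2" 'BEGIN { s = d - f; if (s < 0) s = 0; printf "%.3f\n", s }'
}

# media_duration FILE -> container duration in seconds (needs ffprobe)
media_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

fade_out_start 60 3      # prints 57.000
fade_out_start 42.5 3    # prints 39.500

# With a real file (not run here):
#   st=$(fade_out_start "$(media_duration input.mp4)" 3)
#   then use afade=t=out:st=$st:d=3 in the BGM chain
```

Keeping the subtraction in one place means the fade length can change without re-deriving the start time by hand.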

Verify the fade chain on a single file

ffmpeg -i input.mp3 -af "volume=0.3,afade=t=in:st=0:d=2,afade=t=out:st=15:d=5" output.mp3

This applies attenuation, a 2-second fade-in, and a 5-second fade-out (starting at second 15) to a 20-second clip in one chain. See Audio Fade-In / Fade-Out for more detail.


Chaining Multiple BGM Tracks (Intro / Main / Outro)

When you want intro, main, and outro played back-to-back as one BGM, use the audio variant of the concat filter.

ffmpeg -i intro.mp3 -i main.mp3 -i outro.mp3 \
  -filter_complex "[0:a][1:a][2:a]concat=n=3:v=0:a=1[bgm]" \
  -map "[bgm]" -c:a libmp3lame combined_bgm.mp3

n=3 is the input count and v=0:a=1 means “0 video streams, 1 audio stream per input”. The resulting combined_bgm.mp3 plugs into any of the previous mix commands as bgm.mp3.

Try concat with one input

ffmpeg -i input.mp3 -filter_complex "[0:a]asplit=3[a1][a2][a3];[a1]atrim=0:5,asetpts=PTS-STARTPTS[s1];[a2]atrim=0:10,asetpts=PTS-STARTPTS[s2];[a3]atrim=0:5,asetpts=PTS-STARTPTS[s3];[s1][s2][s3]concat=n=3:v=0:a=1[out]" -map "[out]" -c:a libmp3lame output.mp3

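This splits a single source into 5/10/5-second segments and concatenates them. The asetpts=PTS-STARTPTS reset on every segment is critical: concat expects each segment to start at timestamp 0, and a segment cut from later in the file (for example atrim=10:15) keeps its original timestamps, which can show up as gaps in the joined output.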
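
When the number of tracks varies, the concat graph can be generated rather than typed. The sketch below builds the same string as the three-track command for any input count. concat_filter is a hypothetical helper, and it assumes every input is an audio file whose first audio stream should be used.

```shell
# concat_filter N [LABEL] -> filtergraph joining the audio of inputs 0..N-1
# Variables are global because plain POSIX sh has no "local".
concat_filter() {
  n="$1"
  label="${2:-bgm}"
  i=0
  pads=""
  while [ "$i" -lt "$n" ]; do
    pads="${pads}[${i}:a]"          # one input pad per track, in order
    i=$((i + 1))
  done
  printf '%sconcat=n=%s:v=0:a=1[%s]\n' "$pads" "$n" "$label"
}

concat_filter 3   # prints [0:a][1:a][2:a]concat=n=3:v=0:a=1[bgm]
```

The output can be passed as -filter_complex "$(concat_filter 3)" together with -map "[bgm]".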


Auto-Duck BGM Under Original Audio

To make the narration cut through, automatically dip the BGM whenever the voice is present using a sidechain compressor.

# asplit duplicates the voice: one copy drives the compressor, the other goes into the mix
# (a labelled pad can feed only one filter)
ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[0:a]aformat=fltp:44100:stereo,asplit=2[voice][sc];[1:a]aformat=fltp:44100:stereo,volume=0.5[bgm];[bgm][sc]sidechaincompress=threshold=0.02:ratio=6:attack=200:release=1000[ducked];[ducked][voice]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
  -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4

The mechanism: voice arrives → compressor squashes the BGM → voice stops → BGM rises again. For full parameter explanations and natural attack/release values, see Auto-Duck BGM Under Narration with sidechaincompress.
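
Tuning the ducking usually means re-running with different threshold and ratio values, so it can help to generate the graph from parameters. The sketch below is one way to do that; duck_graph is a hypothetical helper. It splits the voice with asplit because, as far as I know, a filtergraph label can feed only one filter, and the voice is needed both as the sidechain and in the final mix.

```shell
# duck_graph [THRESHOLD] [RATIO] [ATTACK_MS] [RELEASE_MS]
# Prints a sidechain-ducking filtergraph; defaults mirror the values used in this article.
duck_graph() {
  th="${1:-0.02}"; ratio="${2:-6}"; att="${3:-200}"; rel="${4:-1000}"
  fmt="aformat=fltp:44100:stereo"
  # Voice: normalise the format, then duplicate it (mix copy + sidechain copy).
  printf '%s' "[0:a]${fmt},asplit=2[voice][sc];"
  # BGM: same format, pre-attenuated.
  printf '%s' "[1:a]${fmt},volume=0.5[bgm];"
  # The compressor acts on the BGM and is keyed by the voice copy.
  printf '%s' "[bgm][sc]sidechaincompress=threshold=${th}:ratio=${ratio}:attack=${att}:release=${rel}[ducked];"
  printf '%s\n' "[ducked][voice]amix=inputs=2:duration=first:dropout_transition=0[aout]"
}

duck_graph 0.05 8

# Usage (not run here):
#   ffmpeg -i input.mp4 -i bgm.mp3 -filter_complex "$(duck_graph 0.05 8)" \
#     -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4
```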


Stream-Mapping Pitfalls

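A labelled -filter_complex output such as [aout] is never added to the output file automatically; it only gets there through -map. And as soon as one -map is present, automatic stream selection is switched off for that output file, so the video has to be mapped explicitly too. Forgetting -map is the number-one cause of “the output has no audio” or “the BGM never made it in”.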

| Pattern | Result |
| --- | --- |
| -map 0:v:0 -map "[aout]" | Correct: video plus mixed audio |
| -map "[aout]" only | Wrong: audio-only file (no video) |
| -map 0 only | Wrong: [aout] stays unmapped, so the BGM never reaches the file |
| No -map at all | Wrong: [aout] stays unmapped, so the BGM never reaches the file |

# Wrong
ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[0:a][1:a]amix=inputs=2:duration=shortest[aout]" \
  -c:v copy -c:a aac output.mp4

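Above, no -map is set, so the labelled [aout] output is never connected to output.mp4 and the mix does not reach the file; FFmpeg typically stops with an error about an unconnected filter output. Rule of thumb: a labelled -filter_complex output always requires an explicit -map.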
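
In scripts that assemble FFmpeg arguments, a small guard can catch the missing -map before anything runs. This is only a sketch: has_filter_without_map is a hypothetical helper that inspects an argument list. It checks for the presence of the options, not whether the mapped label is the right one.

```shell
# has_filter_without_map ARGS...
# Exit status 0 when -filter_complex (or its alias -lavfi) is present but -map is not.
has_filter_without_map() {
  fc=0; mp=0
  for arg in "$@"; do
    case "$arg" in
      -filter_complex|-lavfi) fc=1 ;;
      -map) mp=1 ;;
    esac
  done
  [ "$fc" -eq 1 ] && [ "$mp" -eq 0 ]
}

if has_filter_without_map -i input.mp4 -i bgm.mp3 \
     -filter_complex "[0:a][1:a]amix=inputs=2[aout]" -c:v copy output.mp4; then
  echo "warning: -filter_complex without -map"
fi
```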


Batch: Apply Same BGM to Many Videos

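A Bash loop that applies the same BGM to every .mp4 in the current directory and writes the results to out/ (created first, because FFmpeg does not create missing output directories):

mkdir -p out
for f in *.mp4; do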
  ffmpeg -i "$f" -i bgm.mp3 \
    -filter_complex "[1:a]volume=0.3,afade=t=in:st=0:d=2[bgm];[0:a][bgm]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
    -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k "out/${f%.mp4}_bgm.mp4"
done

In Windows PowerShell, rewrite the loop as Get-ChildItem *.mp4 | ForEach-Object { ffmpeg ... }. Since -c:v copy skips video re-encoding, even hundreds of files finish in minutes.
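
For larger batches it can be worth previewing what will be written before any file is touched. The sketch below wraps the same mix in a function with a dry-run switch. batch_bgm and the DRY_RUN variable are hypothetical names. The -n flag tells FFmpeg not to overwrite existing outputs, so an interrupted batch can be re-run.

```shell
# batch_bgm BGM OUTDIR FILE...
# Set DRY_RUN to a non-empty value to print the plan instead of running FFmpeg.
batch_bgm() {
  bgm="$1"; outdir="$2"; shift 2
  graph="[1:a]volume=0.3,afade=t=in:st=0:d=2[bgm];[0:a][bgm]amix=inputs=2:duration=first:dropout_transition=0[aout]"
  [ -n "${DRY_RUN:-}" ] || mkdir -p "$outdir"
  for f in "$@"; do
    base=$(basename "$f" .mp4)
    out="$outdir/${base}_bgm.mp4"
    if [ -n "${DRY_RUN:-}" ]; then
      printf 'would mix: %s -> %s\n' "$f" "$out"
      continue
    fi
    # Report a failure on one file and keep going with the rest.
    ffmpeg -n -i "$f" -i "$bgm" -filter_complex "$graph" \
      -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k "$out" \
      || echo "failed: $f" >&2
  done
}

DRY_RUN=1
batch_bgm bgm.mp3 out "clip one.mp4" intro.mp4
# would mix: clip one.mp4 -> out/clip one_bgm.mp4
# would mix: intro.mp4 -> out/intro_bgm.mp4
```

For a real run, clear DRY_RUN and pass the files, for example batch_bgm bgm.mp3 out *.mp4.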


Troubleshooting

No audio in the output

Confirm you wrote -map "[aout]" after -filter_complex. Also check that -c:a copy is not present: filtered audio has to be re-encoded, and FFmpeg rejects stream copy on a stream that comes out of a filter graph.

BGM clips (distorts)

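Clipping means the summed signal exceeded full scale, typically because normalize=0 was combined with inputs that are too hot or because a gain above 1.0 was applied. The defensive pattern sets the levels explicitly and adds a limiter after the mix:

# amix feeds alimiter directly; only the final output carries a label
ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[0:a]volume=0.8[v];[1:a]volume=0.3[bgm];[v][bgm]amix=inputs=2:duration=first:dropout_transition=0:normalize=0,alimiter=limit=0.95[lim]" \
  -map 0:v:0 -map "[lim]" -c:v copy -c:a aac -b:a 192k output.mp4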

normalize=0 disables auto-rescaling, and alimiter caps the peak at 0.95 (≈ -0.5 dBFS). For full broadcast-grade loudness see Loudness Normalization (loudnorm/LUFS).

Audio drifts out of sync

Mismatched sample rates or channel layouts between the two inputs are a common suspect when the mix drifts. Force both inputs through aformat first so that amix receives identical formats:

ffmpeg -i input.mp4 -i bgm.mp3 \
  -filter_complex "[0:a]aformat=fltp:44100:stereo[v];[1:a]aformat=fltp:44100:stereo,volume=0.3[bgm];[v][bgm]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
  -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4

FAQ

BGM is too loud

Apply volume=0.2 to 0.3 to the BGM stream before amix. If the BGM still masks the voice, switch to auto-ducking with sidechaincompress.

BGM doesn’t match the video length

Short BGM: loop with aloop. Long BGM: clip with amix=duration=first or -shortest. The two can be combined.

How do I auto-duck?

Use the sidechain compressor pattern shown above; full parameter guidance lives at Auto-Duck BGM Under Narration.

Keep original audio vs replace it?

Keep with amix; replace by mapping only 1:a:0 and dropping 0:a. Vlogs benefit from keeping ambient sound; music videos usually replace it entirely.

BGM only plays in one channel

This happens when amerge is used by mistake or when a mono BGM is fed in. Force stereo with aformat=fltp:44100:stereo and stick to amix.

Where do I find royalty-free BGM?

YouTube Audio Library (free, via YouTube Studio), Pixabay Music, Free Music Archive, and Incompetech are reliable starting points. Always check the licence and any required attribution before publishing.



Tested with ffmpeg 8.1 / Windows 11
Primary source: ffmpeg.org/ffmpeg-filters.html#amix