What You’ll Learn
- The basic command to overlay BGM (an .mp3) on a video (an .mp4)
- How to keep vs. replace the original audio track
- Lowering BGM to 30% so the voiceover stays audible
- Handling BGM that is shorter or longer than the video (apad, aloop, shortest)
- Adding fade-in and fade-out to the BGM
- Chaining intro / main / outro tracks into a single BGM
- Auto-ducking the BGM under narration
- Stream-mapping pitfalls and how to avoid them
- Batch-applying the same BGM to many videos
Tested with: FFmpeg 8.1
Platform: Windows / macOS / Linux
Basic Command: Overlaying BGM on a Video
The most useful pattern — original audio plus BGM — uses the amix filter. You cannot use -c:a copy here because the audio must be re-encoded after mixing.
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[0:a][1:a]amix=inputs=2:duration=shortest:dropout_transition=0[aout]" \
-map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4
There are three things to notice. -filter_complex mixes the two audio streams, -map 0:v:0 keeps the video, and -map "[aout]" selects the mixed audio. The video is copied (-c:v copy), so the command finishes in seconds even on long files.
Key amix options
| Option | Default | Description |
|---|---|---|
| inputs | 2 | Number of inputs to mix |
| duration | longest | longest, shortest, or first decides output length |
| dropout_transition | 2 | Fade duration (seconds) when an input ends |
| weights | 1 1 | Per-input weight; combines with the volume filter |
| normalize | 1 | Auto-normalises the output to prevent clipping |
amix vs amerge
These two filters are easily confused but solve different problems.
| Filter | Purpose | Output channels |
|---|---|---|
| amix | Sums signals that share the same channel layout | Same as inputs |
| amerge | Concatenates channels (e.g., 2 mono → 1 stereo) | Sum of input channels |
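For “video + BGM” you want the signals summed, so always use amix. amerge keeps every input channel separate instead: two mono inputs become a stereo file with the original audio in the left channel and the BGM in the right, and two stereo inputs become a four-channel stream. Neither is what you want here.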
Replacing Original Audio with BGM
If you want to throw away the original audio and keep only the BGM, skip the filter graph: map the video from the first input and the audio from the BGM input. Because 0:a is never mapped, the source audio is dropped.
ffmpeg -i input.mp4 -i bgm.mp3 \
-map 0:v:0 -map 1:a:0 -c:v copy -c:a aac -b:a 192k -shortest output.mp4
-shortest cuts the output at the shorter input, so a long BGM gets trimmed automatically. Re-encode to AAC (-c:a aac) rather than copying MP3 into MP4 — many players choke on MP3-in-MP4 even though FFmpeg accepts it.
Adjusting Volume Balance
“BGM is too loud and drowns out the voice” is the single most common complaint. Insert a volume filter before amix to attenuate only the BGM.
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[1:a]volume=0.3[bgm];[0:a][bgm]amix=inputs=2:duration=shortest:dropout_transition=0[aout]" \
-map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4
volume=0.3 is roughly -10 dB. Practical starting points:
| Use case | Original audio | BGM |
|---|---|---|
| Tutorial / explainer | 1.0 | 0.15–0.25 |
| Vlog (keep ambience) | 0.7 | 0.30–0.40 |
| Silent footage with BGM | (muted) | 1.0 |
| Music-led content | 0.20 | 1.0 |
You can also use dB syntax: volume=-12dB. Note that amix defaults to normalize=1, which can rescale your carefully tuned ratios — pass normalize=0 if you want the exact mix.
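As a quick sanity check on the figures above, a linear volume factor converts to decibels as 20·log10(gain). The sketch below does the conversion both ways in shell with awk. The function names gain_to_db and db_to_gain are illustrative helpers introduced here, not FFmpeg features.

```shell
#!/bin/sh
# Hypothetical helpers (not FFmpeg commands): convert between the linear
# factor used by volume=0.3 and the dB form used by volume=-12dB.

# linear gain -> dB: 20 * log10(gain); awk's log() is the natural log
gain_to_db() {
    awk -v g="$1" 'BEGIN { printf "%.2f\n", 20 * log(g) / log(10) }'
}

# dB -> linear gain: 10^(dB / 20)
db_to_gain() {
    awk -v d="$1" 'BEGIN { printf "%.3f\n", exp(d / 20 * log(10)) }'
}

gain_to_db 0.3    # prints -10.46
db_to_gain -12    # prints 0.251
```

So volume=-12dB is slightly quieter than volume=0.3, which is worth knowing when switching between the two notations.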
Verify with a single input
To rehearse the same chain against a single audio file, split the source with asplit and mix the copies.
ffmpeg -i input.mp3 -filter_complex "[0:a]asplit=2[a1][a2];[a1]volume=0.3[bgm];[a2]volume=1.0[voice];[bgm][voice]amix=inputs=2:duration=longest:dropout_transition=0[out]" -map "[out]" -c:a libmp3lame output.mp3
asplit=2 duplicates the input into two streams, one of which gets attenuated before being mixed back together. Replace the split with two real -i inputs in production.
When BGM Is Shorter or Longer Than the Video
When the BGM and the video differ in length (say, a 60-second video with a 30-second or a 90-second BGM), you have four options:
| Situation | Filter | Effect |
|---|---|---|
| BGM short — pad with silence | apad | Appends silence to the end |
| BGM short — loop instead | aloop | Repeats the BGM |
| BGM long — trim to video | amix=duration=shortest or -shortest | Stops at the shorter input |
| BGM long — clean cut | atrim + afade | Cuts at a chosen second with a fade |
Loop the BGM until the video ends
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[1:a]aloop=loop=-1:size=2e+09[loop];[0:a][loop]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
-map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4
aloop=loop=-1 repeats forever. size is the maximum number of samples held in the loop, so 2e+09 is large enough to cover the whole track. duration=first clips the output to the video length.
Pad the BGM tail with silence
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[1:a]apad[pad];[0:a][pad]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
-map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4
apad with no arguments appends infinite silence; duration=first makes amix stop at the video length. The result: BGM plays through, then silence fills the remainder.
Test apad alone (single input)
ffmpeg -i input.mp3 -af "apad=pad_dur=5,atrim=0:25" output.mp3
This adds 5 seconds of silence to a 20-second clip and trims to exactly 25 seconds — a quick way to confirm the length-adjustment logic before plugging it into the bigger graph.
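The choice between looping and trimming can be made from the two durations alone. The sketch below is one possible way to automate it, assuming both durations are already known in seconds. bgm_length_filter is a hypothetical helper name, and the default 3-second fade is an assumption taken from the fade examples in this article.

```shell
#!/bin/sh
# Hypothetical helper: pick a pre-mix filter for the BGM from the two durations.
# Durations are in seconds; in practice they could come from ffprobe, e.g.
#   ffprobe -v error -show_entries format=duration -of csv=p=0 input.mp4

# Usage: bgm_length_filter VIDEO_SECONDS BGM_SECONDS [FADE_SECONDS]
bgm_length_filter() {
    awk -v v="$1" -v b="$2" -v f="${3:-3}" 'BEGIN {
        if (b < v) {
            # BGM too short: repeat it; amix duration=first ends the output
            print "aloop=loop=-1:size=2e+09"
        } else {
            # BGM long enough: cut at the video length and fade out before the cut
            printf "atrim=0:%g,afade=t=out:st=%g:d=%g\n", v, v - f, f
        }
    }'
}

bgm_length_filter 60 30   # prints aloop=loop=-1:size=2e+09
bgm_length_filter 60 90   # prints atrim=0:60,afade=t=out:st=57:d=3
```

The printed string is meant to sit on the [1:a] chain in front of amix, in the same position as aloop or apad in the commands above.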
BGM Fade-In / Fade-Out
A BGM that pops in at full volume sounds amateurish. Add a 2-second fade-in and a 3-second fade-out with afade.
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[1:a]volume=0.3,afade=t=in:st=0:d=2,afade=t=out:st=57:d=3[bgm];[0:a][bgm]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
-map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4
afade=t=out:st=57:d=3 means “start fading at second 57, take 3 seconds to reach silence”. For a 60-second video that lands the fade exactly at the end. In general, set st to the video duration minus the fade duration; a fade that starts after the output has ended is never heard.
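Hard-coding st=57 only works for a 60-second video. The sketch below derives the fade-out start from the duration instead and prints the BGM filter chain. bgm_fade_chain is a hypothetical helper name, and its defaults (volume 0.3, 2-second fade-in, 3-second fade-out) simply mirror the command above.

```shell
#!/bin/sh
# Hypothetical helper: build the BGM chain with the fade-out start derived
# from the video duration (st = duration - fade-out length).

# Usage: bgm_fade_chain DURATION [VOLUME] [FADE_IN] [FADE_OUT]
bgm_fade_chain() {
    awk -v dur="$1" -v vol="${2:-0.3}" -v fin="${3:-2}" -v fout="${4:-3}" 'BEGIN {
        st = dur - fout            # fade-out starts fout seconds before the end
        if (st < 0) st = 0         # very short clip: start fading immediately
        printf "volume=%s,afade=t=in:st=0:d=%g,afade=t=out:st=%g:d=%g\n", vol, fin, st, fout
    }'
}

bgm_fade_chain 60            # prints volume=0.3,afade=t=in:st=0:d=2,afade=t=out:st=57:d=3
bgm_fade_chain 20 0.3 2 5    # prints volume=0.3,afade=t=in:st=0:d=2,afade=t=out:st=15:d=5
```

The duration argument could be filled from ffprobe (for example `ffprobe -v error -show_entries format=duration -of csv=p=0 input.mp4`), which is left out here so the sketch runs without any media files.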
Verify the fade chain on a single file
ffmpeg -i input.mp3 -af "volume=0.3,afade=t=in:st=0:d=2,afade=t=out:st=15:d=5" output.mp3
This applies attenuation, a 2-second fade-in, and a 5-second fade-out (starting at second 15) to a 20-second clip in one chain. See Audio Fade-In / Fade-Out for more detail.
Chaining Multiple BGM Tracks (Intro / Main / Outro)
When you want intro, main, and outro played back-to-back as one BGM, use the audio variant of the concat filter.
ffmpeg -i intro.mp3 -i main.mp3 -i outro.mp3 \
-filter_complex "[0:a][1:a][2:a]concat=n=3:v=0:a=1[bgm]" \
-map "[bgm]" -c:a libmp3lame combined_bgm.mp3
n=3 is the number of segments to join, and v=0:a=1 means each segment contributes 0 video streams and 1 audio stream. The resulting combined_bgm.mp3 plugs into any of the previous mix commands as bgm.mp3.
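The graph grows predictably with the number of tracks: one [i:a] label per input, and n set to the same count. A minimal sketch of generating it follows, where concat_graph is a hypothetical helper name and the [bgm] output label is kept from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print the audio-only concat graph for N inputs.

# Usage: concat_graph N
concat_graph() {
    n=$1
    labels=""
    i=0
    while [ "$i" -lt "$n" ]; do
        labels="${labels}[${i}:a]"    # one audio stream per input file
        i=$((i + 1))
    done
    printf '%sconcat=n=%d:v=0:a=1[bgm]\n' "$labels" "$n"
}

concat_graph 3   # prints [0:a][1:a][2:a]concat=n=3:v=0:a=1[bgm]
```

The output can be passed straight to -filter_complex, for example `-filter_complex "$(concat_graph 4)"` with four -i inputs.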
Try concat with one input
ffmpeg -i input.mp3 -filter_complex "[0:a]asplit=3[a1][a2][a3];[a1]atrim=0:5,asetpts=PTS-STARTPTS[s1];[a2]atrim=0:10,asetpts=PTS-STARTPTS[s2];[a3]atrim=0:5,asetpts=PTS-STARTPTS[s3];[s1][s2][s3]concat=n=3:v=0:a=1[out]" -map "[out]" -c:a libmp3lame output.mp3
This splits a single source into 5/10/5-second segments and concatenates them. Resetting timestamps with asetpts=PTS-STARTPTS on every segment is the safe habit: a segment trimmed from a non-zero start keeps its original timestamps, which can leave gaps or timing glitches once the segments are joined.
Auto-Duck BGM Under the Original Audio
To make the narration cut through, automatically dip the BGM whenever the voice is present using a sidechain compressor.
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[0:a]aformat=fltp:44100:stereo,asplit=2[voice][key];[1:a]aformat=fltp:44100:stereo,volume=0.5[bgm];[bgm][key]sidechaincompress=threshold=0.02:ratio=6:attack=200:release=1000[ducked];[voice][ducked]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
-map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4
The mechanism: voice arrives → compressor squashes the BGM → voice stops → BGM rises again. asplit=2 is needed because a labelled stream can be consumed only once: one copy ([key]) drives the compressor and the other ([voice]) goes into the final mix. The voice is listed first in amix so that duration=first follows the video's audio. For full parameter explanations and natural attack/release values, see Auto-Duck BGM Under Narration with sidechaincompress.
Stream-Mapping Pitfalls
FFmpeg's documentation requires every labelled -filter_complex output to be mapped exactly once, and as soon as any -map is present, automatic stream selection is switched off for that output file. Forgetting or mis-writing -map is the number-one cause of “the output has no audio” or “the BGM never made it in”.
| Pattern | Result |
|---|---|
| -map 0:v:0 -map "[aout]" | Correct: video plus mixed audio |
| -map "[aout]" only | Wrong: audio-only file (no video) |
| -map 0 only | Wrong: [aout] is never mapped, so the BGM mix does not reach the output |
| No -map at all | Wrong: [aout] is left unconnected and the mix is never written |
# Wrong
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[0:a][1:a]amix=inputs=2:duration=shortest[aout]" \
-c:v copy -c:a aac output.mp4
Above, no -map is set, so nothing consumes [aout]; FFmpeg typically stops with an “unconnected output” error instead of writing the mixed track. Rule of thumb: a labelled -filter_complex output always needs an explicit -map.
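For scripted pipelines, a small pre-flight check can catch a missing -map before FFmpeg runs. The sketch below is a rough heuristic rather than a real filter-graph parser: it assumes labels are plain identifiers and treats any label that appears exactly once as a graph output. unmapped_labels is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not an FFmpeg feature): list the labelled
# outputs of a filter graph that no -map target consumes.

# Usage: unmapped_labels GRAPH MAP_TARGET...
unmapped_labels() {
    graph=$1
    shift
    maps=" $* "
    printf '%s\n' "$graph" |
        grep -o '\[[A-Za-z_][A-Za-z0-9_]*]' |   # link labels only; [0:a] contains a colon
        sort | uniq -u |                         # a label seen once is a graph output
        while IFS= read -r label; do
            case "$maps" in
                *" $label "*) ;;                 # consumed by a -map
                *) printf '%s\n' "$label" ;;     # would be left unconnected
            esac
        done
}

graph='[1:a]volume=0.3[bgm];[0:a][bgm]amix=inputs=2:duration=first[aout]'
unmapped_labels "$graph" 0:v:0            # prints [aout]
unmapped_labels "$graph" 0:v:0 '[aout]'   # prints nothing
```

An empty result means every labelled output has a matching -map target; intermediate labels such as [bgm] appear twice and are ignored.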
Batch: Apply Same BGM to Many Videos
A Bash loop that applies the same BGM to every .mp4 in the current directory and writes the results to out/ (created first, since FFmpeg will not create the directory):
mkdir -p out
for f in *.mp4; do
ffmpeg -i "$f" -i bgm.mp3 \
-filter_complex "[1:a]volume=0.3,afade=t=in:st=0:d=2[bgm];[0:a][bgm]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
-map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k "out/${f%.mp4}_bgm.mp4"
done
In Windows PowerShell, rewrite the loop with Get-ChildItem *.mp4 | ForEach-Object { ffmpeg ... }. Since -c:v copy skips video re-encoding, even hundreds of files finish in minutes.
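For repeated runs, a slightly more defensive variant of the Bash loop may be useful. The sketch below skips files that already have an output, copes with an empty directory, and reports failures without stopping. out_name and batch_bgm are hypothetical helper names, and the FFmpeg arguments are the same as in the loop above plus -nostdin, which stops FFmpeg from reading the terminal inside a loop.

```shell
#!/bin/sh
# Sketch of a more defensive batch loop. out_name and batch_bgm are
# hypothetical helper names, not part of FFmpeg.

# clip.mp4 -> out/clip_bgm.mp4
out_name() {
    printf 'out/%s_bgm.mp4\n' "${1%.mp4}"
}

batch_bgm() {
    mkdir -p out
    for f in *.mp4; do
        [ -e "$f" ] || continue              # no .mp4 files: the glob stays literal
        dst=$(out_name "$f")
        [ -e "$dst" ] && continue            # already processed on an earlier run
        ffmpeg -nostdin -i "$f" -i bgm.mp3 \
            -filter_complex "[1:a]volume=0.3,afade=t=in:st=0:d=2[bgm];[0:a][bgm]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
            -map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k "$dst" \
            || echo "failed: $f" >&2
    done
}

out_name clip.mp4   # prints out/clip_bgm.mp4
# batch_bgm         # uncomment to process the current directory
```

Because outputs land in out/, the *.mp4 glob never picks up files the loop has just written.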
Troubleshooting
No audio in the output
Confirm you wrote -map "[aout]" after -filter_complex. Also check that -c:a copy is not present: stream copy cannot be combined with filtering, so the mixed audio has to be re-encoded (for example with -c:a aac).
BGM clips (distorts)
Either amix over-normalised or the sum exceeded full scale. The defensive pattern:
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[0:a]volume=0.8[v];[1:a]volume=0.3[bgm];[v][bgm]amix=inputs=2:duration=first:dropout_transition=0:normalize=0,alimiter=limit=0.95[lim]" \
-map 0:v:0 -map "[lim]" -c:v copy -c:a aac -b:a 192k output.mp4
normalize=0 disables auto-rescaling, and alimiter caps the peak at 0.95 (≈ -0.5 dBFS). For full broadcast-grade loudness see Loudness Normalization (loudnorm/LUFS).
Audio drifts out of sync
Mismatched sample rates between the two inputs cause amix to drift. Force both inputs through aformat first:
ffmpeg -i input.mp4 -i bgm.mp3 \
-filter_complex "[0:a]aformat=fltp:44100:stereo[v];[1:a]aformat=fltp:44100:stereo,volume=0.3[bgm];[v][bgm]amix=inputs=2:duration=first:dropout_transition=0[aout]" \
-map 0:v:0 -map "[aout]" -c:v copy -c:a aac -b:a 192k output.mp4
FAQ
BGM is too loud
Apply volume=0.2–0.3 to the BGM stream before amix. If the BGM still masks the voice, switch to auto-ducking with sidechaincompress.
BGM doesn’t match the video length
Short BGM: loop with aloop. Long BGM: clip with amix=duration=first or -shortest. The two can be combined.
How do I auto-duck?
Use the sidechain compressor pattern shown above; full parameter guidance lives at Auto-Duck BGM Under Narration.
Keep original audio vs replace it?
Keep with amix; replace by mapping only 1:a:0 and dropping 0:a. Vlogs benefit from keeping ambient sound; music videos usually replace it entirely.
BGM only plays in one channel
This happens when amerge is used by mistake or when a mono BGM is fed in. Force stereo with aformat=fltp:44100:stereo and stick to amix.
Where do I find royalty-free BGM?
YouTube Audio Library (free, via YouTube Studio), Pixabay Music, Free Music Archive, and Incompetech are reliable starting points. Always check the licence and any required attribution before publishing.
Related Articles
- Auto-Duck BGM Under Narration with sidechaincompress
- Audio Fade-In / Fade-Out
- Loudness Normalization (loudnorm/LUFS)
- Extract Audio from Video
Tested with ffmpeg 8.1 / Windows 11
Primary source: ffmpeg.org/ffmpeg-filters.html#amix