Trim, Transcode, Concatenate: Your Guide to Media3 Editing Libraries
Media3 includes libraries for a variety of media use cases. In this blog post we’ll focus on APIs we’ve recently published for video creation, including: converting media from one format to another, applying effects, and producing compositions out of multiple input audio/video streams. We’ll start by revisiting some of the functionality we shared on the Android Developers Blog last year, then do a deep dive into building compositions.
We’re looking forward to sharing more information about composition previewing use cases later this year. If you’re interested in receiving updates as we expand media editing functionality, please follow this blog.
Media3 Editing Libraries
Media3 provides foundational video editing features, including:
- Transcoding: efficiently convert video files between different formats.
- Basic edits like trimming, cropping, scaling and rotating.
- Apply various effects using the media3-effect module, including BitmapOverlay, MatrixTransformation and RGB filters.
- Concatenate multiple assets into a composition.
Read on for more detail about the specific use cases we handle today.
This functionality uses various modules:
- media3-transformer is the main entry point to the Transformer API, which supports creating media files.
- media3-effect provides functionality for applying effects to video frames. It works in conjunction with Transformer and ExoPlayer.
- media3-muxer is used to write MP4 container files. Currently, the muxer supports writing MP4 files with H.264/AVC, H.265/HEVC and AV1 video, AAC audio, and various metadata types (including orientation hints, location, capture FPS, timestamp (creation and modification), XMP data and key/value metadata).
And in the future we plan to support previewing edits.
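To make the module list concrete, here is a minimal sketch of how these artifacts might be declared in a module-level build.gradle.kts file. The media3Version value is a placeholder we chose for illustration (1.2.0 matches the release mentioned later in this post); use whichever release you are targeting.

```kotlin
// build.gradle.kts (module level): a sketch, not an official setup snippet.
// media3Version is a placeholder; substitute the release you are targeting.
val media3Version = "1.2.0"

dependencies {
    // Entry point for exporting media files.
    implementation("androidx.media3:media3-transformer:$media3Version")
    // Standard video effects and building blocks for custom ones.
    implementation("androidx.media3:media3-effect:$media3Version")
    // Shared types such as MediaItem, MimeTypes and Effect.
    implementation("androidx.media3:media3-common:$media3Version")
}
```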
Getting started with Transformer
The easiest way to demonstrate how Transformer works is to transcode a single video asset. To begin with, we need to create an EditedMediaItem by passing a MediaItem and defining the transformations to apply to it. Next, we need to define transcoding settings and attach a listener to be notified when the export completes or an error occurs. Finally, we call the start method to begin processing and exporting the item.
Below is a code snippet that shows how to transcode a single video asset to H.264/AVC video with AAC audio:
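// Wrap the input video in an EditedMediaItem (no edits applied yet).
val editedMediaItem = EditedMediaItem.Builder(
    MediaItem.fromUri(videoUri))
    .build()
// Configure the output formats and register the export listener.
val transformer = Transformer.Builder(context)
    .setVideoMimeType(MimeTypes.VIDEO_H264)
    .setAudioMimeType(MimeTypes.AUDIO_AAC)
    .addListener(listener)
    .build()
// Start the export; the outcome is reported to the listener.
transformer.start(editedMediaItem, outputPath)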
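The snippet above passes in a listener without defining it. As a minimal sketch, assuming an Android project that depends on media3-transformer, the listener could look like the following. The TAG constant and the log messages are hypothetical and only there for illustration.

```kotlin
import android.util.Log
import androidx.media3.transformer.Composition
import androidx.media3.transformer.ExportException
import androidx.media3.transformer.ExportResult
import androidx.media3.transformer.Transformer

// Hypothetical log tag, used only for this example.
private const val TAG = "TranscodeExample"

val listener = object : Transformer.Listener {
    override fun onCompleted(composition: Composition, exportResult: ExportResult) {
        // The output file is complete and can be played or shared from here.
        Log.d(
            TAG,
            "Export finished: ${exportResult.durationMs} ms, " +
                "${exportResult.fileSizeBytes} bytes"
        )
    }

    override fun onError(
        composition: Composition,
        exportResult: ExportResult,
        exportException: ExportException
    ) {
        // The exception carries an error code describing what went wrong.
        Log.e(TAG, "Export failed: ${exportException.errorCodeName}", exportException)
    }
}
```

What you do in each callback depends on your app; a real editor would typically update its UI or clean up the partial output file rather than only logging.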
Basic editing operations
Media3 editing libraries provide a set of basic editing operations like matrix transformations, trimming, cropping and a range of visual and audio effects.
Let’s look at a trimming operation on a single video asset. To define the start and end positions of a clip, use setClippingConfiguration when building the MediaItem, as demonstrated in the snippet below:
// Keep only the part of the video between 1 s and 2 s.
val clippedMediaItem = MediaItem.Builder()
    .setUri(videoUri)
    .setClippingConfiguration(
        MediaItem.ClippingConfiguration.Builder()
            .setStartPositionMs(1_000)
            .setEndPositionMs(2_000)
            .build())
    .build()
The media3-effect module provides a set of standard effects and functionality to build custom effects. The set of implemented effects includes:
- Color filters like brightness, contrast, saturation
- Matrix transformations like rotation, scaling and crop
- Video effects like speed adjustment, frame drop and overlays
It is possible to apply multiple effects to one EditedMediaItem. The code snippet below shows how to create a list of video effects that we will later apply to an EditedMediaItem:
val videoEffects = mutableListOf<Effect>()
// Convert the frames to grayscale.
videoEffects.add(RgbFilter.createGrayscaleFilter())
// Scale the frames to 20% of their original width and height.
videoEffects.add(ScaleAndRotateTransformation.Builder()
    .setScale(.2f, .2f)
    .build())
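The list can hold any of the standard effects mentioned earlier. As a further sketch, assuming the same Android project setup, here is a second list that combines color filters with matrix transformations. The name moreVideoEffects and the specific values are ours, picked only for illustration.

```kotlin
import androidx.media3.common.Effect
import androidx.media3.effect.Brightness
import androidx.media3.effect.Contrast
import androidx.media3.effect.Crop
import androidx.media3.effect.ScaleAndRotateTransformation

// Effects are applied in list order, so the crop runs before the rotation.
val moreVideoEffects: List<Effect> = listOf(
    // Brightness and contrast both take values in [-1, 1]; 0 means no change.
    Brightness(0.1f),
    Contrast(0.2f),
    // Crop bounds (left, right, bottom, top) use normalized coordinates in [-1, 1],
    // so these values keep the central half of the frame in each dimension.
    Crop(-0.5f, 0.5f, -0.5f, 0.5f),
    // Rotate the cropped frame by 90 degrees.
    ScaleAndRotateTransformation.Builder()
        .setRotationDegrees(90f)
        .build()
)
```

Because order matters, swapping the crop and the rotation would crop a different region of the original frame.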
To set video or audio effects, create an EditedMediaItem by passing in the previously created clippedMediaItem and use the setEffects method to apply the video effects:
val editedMediaItem = EditedMediaItem.Builder(clippedMediaItem)
    .setEffects(Effects(
        /* audioProcessors= */ listOf(),
        /* videoEffects= */ videoEffects))
    .build()
We build Media3 libraries to be highly customizable and enable developers to provide their own implementations when needed, and custom effects are no exception. To create a custom effect, implement a GlShaderProgram wrapping your custom GLSL shader, and a GlEffect to act as a factory for the shader program. The utility base class BaseGlShaderProgram handles output texture allocation for you, and should be sufficient for many use cases that don’t require asynchronous processing.
To learn more about basic editing operations, check out the “Create a basic video editor using Media3 Transformer” guide.
Sequential multi-asset
Let’s explore how we can use Media3 to build a simple sequential composition with the APIs available since the 1.2 release.
Before we get into coding, we need to cover a few definitions:
- EditedMediaItemSequence is a series of EditedMediaItems that is arranged in such a way that items do not overlap in time and are stacked sequentially.
- Composition is a data structure consisting of one or multiple EditedMediaItemSequences.
- Sequential multi-asset composition is a representation of multiple input assets that are arranged in a sequence with no video streams overlapping in time.
For starters, let’s create a concatenation of two videos (video1 and video2).
To be able to combine two videos, we will need to follow several steps:
- Create EditedMediaItems containing Uris to video assets for video1 and video2
- Add both EditedMediaItems when creating EditedMediaItemSequence
- Pass the EditedMediaItemSequence to a Composition.Builder
- Build the composition
// One EditedMediaItem per input video.
val video1 = EditedMediaItem.Builder(
    MediaItem.fromUri(video1Uri))
    .build()
val video2 = EditedMediaItem.Builder(
    MediaItem.fromUri(video2Uri))
    .build()
// Items in a sequence play one after another, in the order given.
val videoSequence = EditedMediaItemSequence(
    video1, video2)
val composition = Composition.Builder(
    videoSequence)
    .build()
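Building the Composition does not export anything by itself. As a minimal sketch, assuming an Android project with media3-transformer and a listener like the one attached in the transcoding example, the composition can be passed to Transformer in the same way as a single EditedMediaItem. The exportComposition helper is a hypothetical name of ours, not part of the library.

```kotlin
import android.content.Context
import androidx.media3.transformer.Composition
import androidx.media3.transformer.Transformer

// Hypothetical helper: exports the given composition to outputPath and
// returns the Transformer so the caller can cancel the export if needed.
fun exportComposition(
    context: Context,
    composition: Composition,
    outputPath: String,
    listener: Transformer.Listener
): Transformer {
    val transformer = Transformer.Builder(context)
        .addListener(listener)
        .build()
    // Starts an asynchronous export; completion and errors go to the listener.
    transformer.start(composition, outputPath)
    return transformer
}
```

Returning the Transformer is a design choice for this sketch: it lets the caller stop a long export with transformer.cancel(), for example when the user leaves the screen.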
Add background audio track
Starting with release 1.2, Transformer supports mixing together multiple concurrent audio streams, allowing you to add a background audio track to your Composition.
Using this functionality is as simple as providing an additional EditedMediaItemSequence instance to the Composition.Builder, and the audio will be mixed together in the exported file!
The illustration below demonstrates how the Composition object will look when we add a second EditedMediaItemSequence that represents the audio track:
The following code demonstrates creating an EditedMediaItemSequence that contains a background audio track and including it in the Composition:
val backgroundAudio = EditedMediaItem.Builder(
    MediaItem.fromUri(audioUri))
    .build()
// A second sequence that holds only the looping background track.
val backgroundAudioSequence = EditedMediaItemSequence(
    ImmutableList.of(backgroundAudio),
    /* isLooping= */ true)
// Both sequences start at the same time and their audio is mixed.
val composition = Composition.Builder(
    videoSequence,
    backgroundAudioSequence)
    .build()
Set isLooping to true to loop the background audio track throughout the duration of the first EditedMediaItemSequence containing two video assets.
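To picture what looping means for the timeline, here is a small plain-Kotlin model. It is not Media3 code and does not call the library; it assumes, as the sentence above suggests, that the looping track repeats until the video sequence ends and that the last repetition is cut short. The function name and the durations are hypothetical.

```kotlin
// Plain-Kotlin model, not Media3 code: returns how long each repetition of a
// looping track would play in order to cover targetDurationMs.
fun loopSegmentsMs(targetDurationMs: Long, trackDurationMs: Long): List<Long> {
    require(targetDurationMs >= 0) { "targetDurationMs must not be negative" }
    require(trackDurationMs > 0) { "trackDurationMs must be positive" }
    val segments = mutableListOf<Long>()
    var remainingMs = targetDurationMs
    while (remainingMs > 0) {
        // Play the whole track, or only what is left of the target duration.
        val segmentMs = minOf(trackDurationMs, remainingMs)
        segments.add(segmentMs)
        remainingMs -= segmentMs
    }
    return segments
}

fun main() {
    // Hypothetical durations: two videos of 10 s and 15 s, and an 8 s audio track.
    val videoSequenceMs = 10_000L + 15_000L
    println(loopSegmentsMs(videoSequenceMs, trackDurationMs = 8_000L))
    // prints [8000, 8000, 8000, 1000]
}
```

In this example the 8-second track plays three full times and then one more second, which matches the 25-second length of the video sequence.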
The team is actively working on giving developers more control and flexibility for creating amazing editing apps on Android.
You can dive into the development branch for a sneak peek at what’s on the horizon or check out our demo app to learn more about our APIs.
Found a bug or have a great feature request? We encourage you to file it on our GitHub repository!