Video Encoding in the Cloud with ElasticTranscoder

I’ve had a “multimedia” section on DetroitHockey.Net since about the third day the site existed, but I haven’t always done a good job of keeping said multimedia in a usable format. For a while all of the videos were in QuickTime format, then I jumped over to Windows Media.  There were whole seasons of hockey where I didn’t bother adding any new highlights to the site: first I couldn’t figure out what the best format to use would be, then I didn’t want to take the time to burn through the backlog of highlights I needed to edit, encode, upload, etc.

Dumping all of the videos onto YouTube wasn’t an option for me because I try to own the entire end-to-end experience on DH.N.  I don’t want to dump people off on a third party to get something that I am supposedly providing.

About 18 months ago I finally sat down and took on the challenge of bringing the multimedia system up to date.  I pulled up the raw capture files for every video I had, including my backlog.  I re-edited everything, re-watermarked it, and re-encoded it all in HTML5-friendly formats.  I did it all by hand because I wanted to keep an eye on things as they went along but the entire time I was thinking, “I need to automate this going forward.”

After the updated multimedia section launched, the idea of setting up FFmpeg on my server and using it to do the encoding rolled around in my head for a while.  The idea would be that I’d upload the edited video and have automated processes that encoded it to the right formats, added the watermark, and copied everything to the right place.  I never got around to that before Amazon Web Services put out their ElasticTranscoder service.

Put simply, ETC does what I wanted to use FFmpeg for.  Here’s a bit on how I’m now using it.  All code shown is PHP (as should be evident) using the AWS SDK for PHP 2.

There are three important concepts to using ETC.  The “job” defines what file is to be encoded and what preset(s) to use for the encoding.  The “preset” is the set of encoding parameters, including codec options. The “pipeline” defines where your files come from and what gets notified upon various job-related events, including completion and error.

There’s one caveat to that: watermarking.  Watermarks are defined in the job but the preset must be set up to allow for watermarks.  I Tweeted that I think watermarks should either live only at the job level or be a fully-defined fourth concept (along with the pipeline, preset and job) that gets attached to a job, rather than having the definition split across the preset and the job.  That said, with the way I ended up implementing things, it doesn’t matter.

The pipeline is the one constant for every DH.N encoding job.  I dump the edited video file into a specified S3 bucket (all input files for ETC must be in an S3 bucket) and the job puts the completed files back in that bucket.  Upon completion or error, ETC sends a message to a topic in the AWS Simple Notification Service.  I have an HTTP endpoint subscribed to that topic, where code runs to shuffle the completed files into their final locations. I should probably be using an SQS subscription instead of an HTTP one to reduce the possibility of data loss but I’m not right now.
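The endpoint code later in this post works on a `$data` object without showing where it comes from. Here is a minimal sketch of how that object might be produced, assuming the endpoint receives the standard SNS HTTP POST body (a JSON envelope whose `Type` is `Notification` and whose `Message` field is itself a JSON string describing the job). The helper name `decode_sns_job_message` is hypothetical, and subscription confirmation and signature verification are deliberately left out.

```php
<?php
// Hypothetical helper (not from the original code): unpack the raw body of an
// SNS HTTP POST into the decoded job message, or return null when the body is
// not a usable notification (e.g. a SubscriptionConfirmation request).
function decode_sns_job_message($raw_body) {
    $envelope = json_decode($raw_body);

    // Anything that is not a JSON object with a Type field is ignored.
    if (!is_object($envelope) || !isset($envelope->Type)) {
        return null;
    }

    // Only actual notifications carry a job message.
    if ($envelope->Type !== 'Notification' || !isset($envelope->Message)) {
        return null;
    }

    // The Message field is a JSON string, so it needs a second decode.
    $data = json_decode($envelope->Message);

    return is_object($data) ? $data : null;
}

// In the endpoint this might be used as:
// $data = decode_sns_job_message(file_get_contents('php://input'));
```

A real endpoint would also need to answer the one-time `SubscriptionConfirmation` request by visiting the `SubscribeURL` it contains, and should verify the message signature before trusting the payload.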

So dump the file to S3 and kick off the job, right?  I’ve skipped setting up the preset and here’s why:

Presets in ETC can’t be edited.  Once you create one, that’s it.  So I could create a set of presets that work for what I need now and save all of their IDs for reference by my scripts but if I ever needed to change that preset I would have to create a whole new preset and then update my script with the new ID.  Not hard but it felt like an easy point to make a mistake.

Instead, since I’m programmatically firing off the job anyway via the AWS API, I fire off the command to create a new set of presets for each job first.  This limits me because you can only have 50 custom presets but I clean up after each job so it really just means I can only have a certain number of jobs active at a time.
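To put a rough number on “a certain number of jobs,” here is a small sketch of the arithmetic. It assumes the 50-preset limit mentioned above and two presets per job (one MP4, one WebM); the function name and its parameters are made up for illustration.

```php
<?php
// Hypothetical helper: how many jobs can be in flight at once when every job
// creates its own presets and the account allows a fixed number of them.
function max_concurrent_jobs($preset_limit, $presets_per_job, $presets_in_use = 0) {
    if ($presets_per_job < 1) {
        return 0;
    }

    // Presets still available, never less than zero.
    $available = max(0, $preset_limit - $presets_in_use);

    return (int) floor($available / $presets_per_job);
}

echo max_concurrent_jobs(50, 2) . "\n";    // 25
echo max_concurrent_jobs(50, 2, 7) . "\n"; // 21
```

The second call shows why cleanup matters: any presets left behind by earlier jobs (seven in this made-up case) eat directly into that headroom.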

Enough of my babbling, on to the code:

$preset = array();

// WATERMARKS AND THUMBNAILS ARE USED FOR ALL PRESETS, DEFINE THEM HERE
$watermark_args = array();
$watermark_args['Id'] = 'watermark';
$watermark_args['MaxWidth'] = '25%';
$watermark_args['MaxHeight'] = '25%';
$watermark_args['SizingPolicy'] = 'ShrinkToFit';
$watermark_args['HorizontalAlign'] = 'Right';
$watermark_args['HorizontalOffset'] = '5%';
$watermark_args['VerticalAlign'] = 'Bottom';
$watermark_args['VerticalOffset'] = '5%';
$watermark_args['Opacity'] = '35';
$watermark_args['Target'] = 'Content';

$thumbnail_args = array();
$thumbnail_args['Format'] = 'png';
$thumbnail_args['Interval'] = '60';
$thumbnail_args['MaxWidth'] = 'auto';
$thumbnail_args['MaxHeight'] = 'auto';
$thumbnail_args['SizingPolicy'] = 'Keep';
$thumbnail_args['PaddingPolicy'] = 'NoPad';

// CREATE MP4 PRESET
$args = array();
$args['Name'] = 'MP4 ' . date('n/j/Y - g:i A');
$args['Description'] = 'MP4 output for ' . $file_name;
$args['Container'] = 'mp4';
$args['Video']['Codec'] = 'H.264';
$args['Video']['CodecOptions']['MaxReferenceFrames'] = '3';
$args['Video']['CodecOptions']['Profile'] = 'baseline';
$args['Video']['CodecOptions']['Level'] = '4';
$args['Video']['KeyframesMaxDist'] = '90';
$args['Video']['FixedGOP'] = 'false';
$args['Video']['BitRate'] = ($use_hd_video) ? '2000' : '768';
$args['Video']['FrameRate'] = '29.97';
$args['Video']['MaxWidth'] = 'auto';
$args['Video']['MaxHeight'] = 'auto';
$args['Video']['SizingPolicy'] = 'Keep';
$args['Video']['PaddingPolicy'] = 'NoPad';
$args['Video']['DisplayAspectRatio'] = 'auto';

if ($use_watermark) {
  $args['Video']['Watermarks'][] = $watermark_args;
}

$args['Audio']['Codec'] = 'AAC';
$args['Audio']['SampleRate'] = '44100';
$args['Audio']['BitRate'] = '128';
$args['Audio']['Channels'] = '2';

$args['Thumbnails'] = $thumbnail_args;

$data = $etc->createPreset($args);
$preset['mp4'] = $data['Preset']['Id'];

As the comment says, the watermark and thumbnail-creation data is the same for all output file formats so I define that first.  You’ll notice that the actual watermark file isn’t defined there; that’s defined at the job level, which is one of the things I think is weird about how watermarks are handled.

Then I create my MP4 preset, defining all of the codec options and other variables.  The big thing here is that I’m using a higher bitrate for HD video than SD, so I define that on the fly.  I’m sure I could do more fine-tuning but I’ve forgotten more about video codecs than Brian Winn at MSU would like for me to admit.

With the preset definition built, I fire off the createPreset command, which spits back a ton of data including the newly-created preset ID.  I save that ID for later.

I also create a WebM preset but I’ll save space and not include that here since it looks almost the same.

With the presets defined, it’s time to fire off the job.

// CREATE JOB
$args = array();
$args['PipelineId'] = $pipeline_id;
$args['Input']['Key'] = $source_file;
$args['Input']['FrameRate'] = 'auto';
$args['Input']['Resolution'] = 'auto';
$args['Input']['AspectRatio'] = 'auto';
$args['Input']['Interlaced'] = 'auto';
$args['Input']['Container'] = 'auto';
$args['OutputKeyPrefix'] = $output_folder;

foreach ($preset AS $type => $preset_id) {
  $output_args = array('Key' => (substr($source_file, 0, strpos($source_file, '.')) . '.' . $type), 'PresetId' => $preset_id, 'ThumbnailPattern' => '', 'Rotate' => 'auto');

  if ($use_watermark) {
    $output_args['Watermarks'][] = array('InputKey' => '_assets/watermark.png', 'PresetWatermarkId' => 'watermark');
  }

  $args['Outputs'][] = $output_args;
}

$data = $etc->createJob($args);

The pipeline ID is saved off in a configuration file, since that’s used for every job.  The source file and the output folder are defined in code outside this.  We loop through each of the previously-defined presets to say how that preset will be used (for example, I don’t use thumbnails on any of my jobs, even though they were defined at the preset level).  If I’m using a watermark, the watermark file is added to the job definition.  With the job defined, the createJob command is fired.
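The `Key` expression inside that loop is dense, so here it is pulled out into a standalone sketch. The helper name `output_key` and the sample file name are hypothetical. It follows the same rule as the loop (keep everything before the first dot, then append the preset type), with one deliberate difference: it falls back to the whole name when the source file has no dot, where the inline expression would produce an empty base name.

```php
<?php
// Hypothetical helper mirroring the 'Key' expression in the job loop:
// strip the source file's extension and append the preset type instead.
function output_key($source_file, $type) {
    $dot = strpos($source_file, '.');

    // Guard added for this sketch: no dot means there is nothing to strip.
    $base = ($dot === false) ? $source_file : substr($source_file, 0, $dot);

    return $base . '.' . $type;
}

echo output_key('2013-04-20-goal.mov', 'mp4') . "\n";  // 2013-04-20-goal.mp4
echo output_key('2013-04-20-goal.mov', 'webm') . "\n"; // 2013-04-20-goal.webm
```

Because the split happens at the first dot, a source file with extra dots in its name would lose everything after the first one, so this naming scheme works best with dot-free base names.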

The job runs along in the background and I don’t need to poll its status, because my HTTP endpoint will be hit when it ends.  The endpoint looks like this:

$preset = array();
$path = array();

if ($data->state == 'COMPLETED') {
  if ($data->output) {
    $preset[] = $data->output->presetId;
    $path[substr($data->output->key, ((strpos(strrev($data->output->key), '.')) * -1))] = $data->output->key;
  } elseif (is_array($data->outputs)) {
    foreach ($data->outputs AS $output_data) {
      $preset[] = $output_data->presetId;
      $path[substr($output_data->key, ((strpos(strrev($output_data->key), '.')) * -1))] = $output_data->key;
    }
  }

  // SHUFFLE FILES AROUND
  $s3 = Aws\S3\S3Client::factory(array('key' => $config['access_key'], 'secret' => $config['secret_key'], 'region' => $config['region']));

  $s3->copyObject(array('Bucket' => $bucket, 'Key' => $file_location['mp4'], 'CopySource' => ($bucket . '/' . $output_folder . $path['mp4']), 'ACL' => 'public-read'));
  $s3->copyObject(array('Bucket' => $bucket, 'Key' => $file_location['webm'] . 'file.webm', 'CopySource' => ($bucket . '/' . $output_folder . $path['webm']), 'ACL' => 'public-read'));

  $s3->deleteObject(array('Bucket' => $bucket, 'Key' => ($output_folder . $path['mp4'])));
  $s3->deleteObject(array('Bucket' => $bucket, 'Key' => ($output_folder . $path['webm'])));
  $s3->deleteObject(array('Bucket' => $bucket, 'Key' => $source_file));

  unset($s3);
} else {
  if ($data->output) {
    $preset[] = $data->output->presetId;
  } elseif (is_array($data->outputs)) {
    foreach ($data->outputs AS $output_data) {
      $preset[] = $output_data->presetId;
    }
  }
}

// REMOVE JOB PRESETS
$preset = array_unique($preset);
if (count($preset)) {
  $etc = Aws\ElasticTranscoder\ElasticTranscoderClient::factory(array('key' => $config['access_key'], 'secret' => $config['secret_key'], 'region' => $config['region']));

  foreach ($preset AS $id) {
    $etc->deletePreset(array('Id' => $id));
  }

  unset($etc);
}

We start by determining whether or not the job is complete. Because we’re only notified upon completion or error, we know that anything that isn’t completion is a failure.

On completion we loop through the data provided about the completed job to determine what presets were used and what files were created.  We move the new files to their final locations (and make them publicly readable) and remove the outputted files and the original input file.  On error we just determine what presets were used (we don’t delete any files in case they can be re-used).  In both cases, we then remove the presets that were used so that we don’t hit that 50-preset limit.
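The bookkeeping described here (which presets were used, which file belongs to which format) can be separated from the S3 and ETC calls. The sketch below is one way to do that, not the code that runs on DH.N. The function name `summarize_outputs` is hypothetical, and it swaps the substr/strrev extension trick for PHP’s built-in pathinfo(). It accepts either a single `output` object or an `outputs` array, as the endpoint does.

```php
<?php
// Hypothetical helper: collect the preset IDs and an extension => key map
// from a decoded job message, whichever output shape it arrives in.
function summarize_outputs($data) {
    $outputs = array();

    if (isset($data->outputs) && is_array($data->outputs)) {
        $outputs = $data->outputs;
    } elseif (isset($data->output) && is_object($data->output)) {
        $outputs = array($data->output);
    }

    $presets = array();
    $paths = array();

    foreach ($outputs as $output) {
        $presets[] = $output->presetId;

        // A failed output may not have a usable key, so treat it as optional.
        if (isset($output->key)) {
            $paths[pathinfo($output->key, PATHINFO_EXTENSION)] = $output->key;
        }
    }

    return array(
        'presets' => array_values(array_unique($presets)),
        'paths' => $paths,
    );
}
```

With something like this in place, the completed and failed branches can share one code path: both read `presets` for cleanup, and only the completed branch goes on to use `paths` for the copy and delete calls.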

There are some other pieces that manage metadata and connections to other parts of DH.N, but these are the interactions with ElasticTranscoder.