Download YouTube videos with AWS Lambda and store them on S3

Michael Wittig – 17 May 2019

Recently, I was faced with the challenge of downloading videos from YouTube and storing them on S3.

Sounds easy? Remember that Lambda comes with a few limitations:

  1. 512 MB of disk space available at /tmp
  2. 3008 MB of memory
  3. 15 minutes maximum execution time

While working on a solution, I tried three approaches. The first two run into Lambda's limits:

  1. Download the video from YouTube to /tmp and then upload it to S3: Does not work with videos larger than 512 MB.
  2. Download the video from YouTube into memory and then upload it to S3: Does not work with videos larger than ~3 GB.
  3. Download the video from YouTube and stream it to S3 while downloading: Works for all videos that can be processed within 15 minutes. I have not found a video that took longer than a few minutes to process.

Let’s look at how I finally solved the problem with a streaming approach in Node.js. I use the youtube-dl library to get easy access to YouTube videos.

First, we create a PassThrough stream in Node.js. A pass-through stream is a duplex stream where you can write on one side and read on the other side.


const stream = require('stream');
const passtrough = new stream.PassThrough();
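
To see the two sides of a pass-through stream in isolation, here is a minimal sketch that uses only Node's built-in stream module. The roundTrip helper and the passThrough variable are hypothetical names for this illustration and are not part of the solution: the helper writes a few chunks into the writable side and then reads them back from the readable side.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper for illustration: write chunks on one side,
// read them back on the other side of the same PassThrough stream.
function roundTrip(chunks) {
  const passThrough = new PassThrough();
  for (const chunk of chunks) {
    passThrough.write(chunk); // writable side
  }
  passThrough.end(); // no more data will be written

  const received = [];
  let chunk;
  // read() returns null once the internal buffer is empty
  while ((chunk = passThrough.read()) !== null) { // readable side
    received.push(chunk);
  }
  return Buffer.concat(received).toString('utf8');
}

console.log(roundTrip(['first part, ', 'second part']));
```

In the real function nothing is read back by hand. The downloader writes into the stream and the S3 upload reads from it.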

Next, we need to write data to the stream. This is done by the youtube-dl library.

const youtubedl = require('youtube-dl');
const dl = youtubedl(event.videoUrl, ['--format=best[ext=mp4]'], {maxBuffer: Infinity});
dl.pipe(passtrough); // write video to the pass-through stream
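
One detail worth knowing: pipe() forwards data, but it does not forward 'error' events from the source to the destination. The following sketch shows one way to pass a failed download on to whatever reads from the pass-through stream. It is an assumption on my side and not part of the original code: fakeDownload stands in for the youtube-dl stream, and pipeWithErrors is a hypothetical helper.

```javascript
const { PassThrough, Readable } = require('stream');

// Hypothetical stand-in for the youtube-dl stream: emits the given chunks,
// then either ends normally or fails with the given error.
function fakeDownload(chunks, failWith) {
  const queue = chunks.slice();
  return new Readable({
    read() {
      if (queue.length > 0) {
        this.push(queue.shift());
      } else if (failWith) {
        this.destroy(failWith); // simulate a broken download
      } else {
        this.push(null); // end of stream
      }
    }
  });
}

// Hypothetical helper: pipe the source into a new PassThrough and
// destroy the PassThrough with the same error if the source fails.
function pipeWithErrors(source) {
  const passThrough = new PassThrough();
  source.on('error', (err) => passThrough.destroy(err));
  source.pipe(passThrough);
  return passThrough;
}

const target = pipeWithErrors(fakeDownload(['part 1'], new Error('download failed')));
target.on('error', (err) => console.log('reader side saw:', err.message));
```

With the real library, the same pattern would be a listener on dl that destroys the pass-through stream.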

And finally, we need to upload the stream to S3. We make use of the Multipart Upload feature of S3, which allows us to upload a big file in smaller chunks. This way, we only have to buffer a small chunk (64 MB in this case) in memory and not the whole file.

const AWS = require('aws-sdk');
const upload = new AWS.S3.ManagedUpload({
  params: {
    Bucket: process.env.BUCKET_NAME,
    Key: 'video.mp4',
    Body: passtrough
  },
  partSize: 1024 * 1024 * 64 // 64 MB in bytes
});
upload.send((err) => {
  if (err) {
    console.log('error', err);
  } else {
    console.log('done');
  }
});
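
What does a part size of 64 MB mean in practice? The following back-of-the-envelope sketch is not part of the original code. It assumes the documented S3 limits of 10,000 parts per multipart upload and a minimum part size of 5 MB, plus the ManagedUpload default queueSize of 4 parts uploaded in parallel. The multipartLimits helper is a hypothetical name, and the memory figure is only a rough estimate.

```javascript
const MIB = 1024 * 1024;
const MAX_PARTS = 10000;       // S3 limit: parts per multipart upload
const MIN_PART_SIZE = 5 * MIB; // S3 limit: minimum part size (except the last part)

// Hypothetical helper: rough limits implied by a given part size.
// queueSize defaults to 4, the ManagedUpload default for concurrent parts.
function multipartLimits(partSize, queueSize = 4) {
  if (partSize < MIN_PART_SIZE) {
    throw new RangeError('partSize must be at least 5 MB');
  }
  return {
    maxObjectBytes: partSize * MAX_PARTS,  // largest object that fits into 10,000 parts
    peakBufferBytes: partSize * queueSize  // rough estimate of memory held by in-flight parts
  };
}

const limits = multipartLimits(64 * MIB);
console.log(limits.maxObjectBytes / (1024 * MIB), 'GiB maximum object size'); // 625
console.log(limits.peakBufferBytes / MIB, 'MiB buffered at most (estimate)'); // 256
```

So with these settings the part size is unlikely to be the limiting factor. The 15 minutes of execution time will run out long before the object size limit is reached.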

That’s it. Now you can download YouTube videos with Lambda and upload them to S3, as long as the transfer finishes within the 15 minutes of maximum execution time. I recommend running the code in a “big” Lambda function with 3008 MB of memory for better network performance.

You can find the full source code on GitHub including a SAM template to provision the AWS resources. Have fun!


This is a shorter article. Do you prefer longer or shorter reads? Let me know! michael@widdix.de, LinkedIn, or @hellomichibye.
