It's a fascinating look into how they produce multiple encoded formats from source video, but something stood out to me... One feature of their ingestion system is meant to reject source inputs that would lead to a poor viewer experience, yet it doesn't seem to treat interlaced NTSC video as a "poor experience". I'm willing to bet that the majority of the screens on which Star Trek: Deep Space Nine is played are progressive scan, yet that show is interlaced at NTSC resolution.
In cases like this, where the source material is analog and, I would have to assume, not available in progressive scan, is there a technical reason why Netflix doesn't de-interlace the source before encoding? DS9 seems big enough of a catalog to be worth encoding in a less jarring scan rate.
Not complaining, mind you. The DVDs are interlaced too. Seems that the original recordings were NTSC only. Honestly curious if anyone has any insight.
TNG was like this (sourced from a video version, with a very nasty interlaced look) on Netflix up until a couple of months ago. Now they stream the magnificent remastered Blu-ray HD version, which is sourced from the original film with fresh CGI, since the original CGI was tied to the video.
It's not clear whether DS9 will get the same treatment.
YES! I'm insanely sensitive to this too. I have to interrupt group TV watching to play around with the TV's menus and turn off the "automatically simulate 60fps" setting that so many Smart TVs have on by default these days.
My mother can't tell the difference at all. I'm pretty sure this is just something that only a subset of the population notices enough to care about.
Simulated 60 FPS (or higher) is significantly worse than true 60 FPS source material. TVs attempt to interpolate between the frames automatically and it just screws everything up.
Yes! I didn't mind the HFR Hobbit movies at all, but I can't stand the interpolated HFR that comes out of smart TVs. Once you start to notice the artifacts (often distortion or weird juddering in the background), it becomes much less enjoyable to watch than standard uninterpolated 24fps content. I look forward to a future of real HFR content, not fake interpolated crap.
And of course, smart TV HFR butchers hand-drawn animation.
Personally, I find the higher framerates valuable enough that I'm willing to put up with the artifacts. They're easy to see if you're looking for them, but I think most of them are pretty ignorable, although there are some pathological cases (any time the scene pans across regular vertical bars e.g. blinds).
I do production and post-production for a living. I didn't like HFR in The Hobbit. The lighting was off, and so was the motion blur, which cheapened the effect. Maybe with further tweaking it will eventually look 'filmic'. I do like it in 3D, though. Also, high frame rate is AWESOME for (almost) any live TV, and I think that's the real future of it. For example: https://www.youtube.com/watch?v=XVXQlkpaC5k
I'm very much inclined to think this is just what people get used to. You're used to watching lower framerate, so your brain is yelling "something is different" all the time you're watching.
I got used to higher framerates (mostly via interpolation), and now 24fps stuff feels really jerky to me, almost like stop motion.
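If you want to see what that interpolation does without fighting a TV menu, ffmpeg's minterpolate filter does a rough software version of the same thing; a minimal sketch, with made-up filenames (and it is very slow):

    import subprocess

    # Rough software approximation of TV-style motion interpolation:
    # motion-compensated interpolation of a 24 fps source up to 60 fps.
    # "movie_24p.mkv" and the output name are made up.
    subprocess.run([
        "ffmpeg", "-i", "movie_24p.mkv",
        "-vf", "minterpolate=fps=60:mi_mode=mci",  # mci = motion-compensated interpolation
        "-c:v", "libx264", "movie_fake60.mkv",
    ], check=True)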
I've wondered if this could be emulated fairly easily with local playback filters for viewers who prefer to experience material at lower fidelity (CD vs. vinyl comes to mind, where rejecting the hi-fi version is about ceremony and romanticism rather than technical merit 95%+ of the time, "remastering" insanity notwithstanding).
If you look at video game emulation, CRT shaders are in vogue for that very reason. Would be quite interesting if someone were to apply the same filters to video playback.
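You can sort of do it already: mpv will load GLSL shaders at playback time, and some CRT shaders have been ported to its hook format. A sketch, assuming mpv is installed and using a hypothetical shader path:

    import os
    import subprocess

    # Hypothetical path to a CRT shader already ported to mpv's GLSL hook format.
    shader = os.path.expanduser("~/shaders/crt-lottes.hook")

    # Play the video with the shader applied at render time.
    subprocess.run(["mpv", f"--glsl-shaders={shader}", "episode.mkv"], check=True)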
CD vs. vinyl is not really a fidelity issue; what people call "warm" is a kind of compression that is part of analog. A good analog recording has just as much fidelity as digital, or more. An analog recording can actually reproduce an accurate sound wave, while digital can technically never do that. (Former sound engineer)
However, similar to how we perceive ultra-low frequencies more as physical vibration than as sound, these harmonics above 20 kHz are probably merely annoying and only subtly felt, and they affect one major aspect of enjoyability: listening fatigue. So maybe we don't want those frequencies from a musical perspective anyway.
Regardless, there's hardly any equipment in use even by "audiophiles" on full-blown analog setups that can faithfully reproduce sounds beyond 30 kHz, because of limitations in the electronics themselves rather than any analog-vs-digital distinction in the end-user recording format. In fact, a lot of vinyl historically had to be mastered with a low-pass filter cutting off much of the high frequencies: with enough energy in the high frequencies, the needle becomes harder to control and can fly right off the track (one explanation I read from an audio engineer; I'm really not sure about that logic, but there's definitely low AND high-pass filtering on vinyl that makes it lower fidelity in many respects than the master).
It bugs the crap out of me to see people claim vinyl is superior on technical merits rather than aesthetic ones (sound preference and taste are real). You'd think they were climate change deniers with their insistence and rhetoric. But this is what I meant about purists and the "original": higher fidelity and clarity are often not what people desire.
You're simply incorrect. Within a defined boundary (e.g. the absolute limits of human hearing) digital audio can reproduce an audio signal perfectly. Most analog recording systems are incapable of this.
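That's just the sampling theorem: any signal with no content above a bandwidth B is exactly determined by samples taken at a rate above 2B, so 44.1 kHz sampling covers everything up to roughly 22 kHz. The standard reconstruction formula (Whittaker-Shannon):

    x(t) = \sum_{n=-\infty}^{\infty} x(nT)\,\operatorname{sinc}\!\left(\frac{t - nT}{T}\right),
    \qquad T = \frac{1}{f_s},\quad f_s > 2B

In practice the limit is quantization noise, and 16 bits already gives roughly 96 dB of dynamic range, which is more than any vinyl playback chain manages.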
That isn't Netflix's job. A post house delivers the raw .mov files to Netflix. The post house gets the files or tapes from the studio or whoever manages the library for them (i.e. another post house). The post house has to comply with a set of standards set forth by Netflix or the delivery is automatically rejected. Most of the time things like "no interlacing" are in those requirements, but there can also be exceptions if no other source material is available. iTunes works the same way. So if the post house that delivers to Netflix only gets digital raw .movs and not the actual tapes, they may not be allowed to re-encode those, or may not have access to the tapes to do the correct conversion.
If you deinterlace an interlaced source before encoding, you either discard information or store twice as much. And people watching on an interlaced screen potentially lose data, since not all deinterlace/interlace pairings round-trip cleanly. Better to store it in the original format; then it can go through one round of deinterlacing on playback for the screens that need it, and none for those that don't.
(Also, deinterlacing approaches get better over time: if a particular episode entered their catalogue 10 years ago and was deinterlaced using the state-of-the-art approach of the time, it would look much worse than a modern deinterlace.)
One of the overriding constraints on these sorts of video pipeline tasks is touching the input pixels as few times as possible. Every lossy transformation you do (cropping, scaling, color correction, transcoding) potentially introduces defects; and that's assuming that your lossless transformations (repackaging container formats, for instance) are bug-free.
I'd love to see some figures from Netflix's QC team; I bet at their scale they see all kinds of insane edge-case problems.
Uhm, no. Due to the way modern video formats work, you don't store twice as much data; on the contrary, H.264 and similar modern formats are significantly more efficient at storing progressive (including deinterlaced) video than an equivalent interlaced stream.
Nope. Modern (and even ancient) video codecs can store interlaced data just as efficiently as progressive data; how could it be otherwise? But when you deinterlace, say, a 30 frames per second interlaced source, either you store the result as 60 frames per second (twice as much data), or you lossily downsample to 30 frames per second.
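ffmpeg's yadif deinterlacer exposes exactly that choice, if you want to see it; a minimal sketch with made-up filenames:

    import subprocess

    SRC = "ds9_episode.mkv"  # hypothetical 30 fps (60-field) interlaced source

    # One output frame per input frame (~30 fps): half the temporal information
    # carried by the fields is thrown away.
    subprocess.run(["ffmpeg", "-i", SRC, "-vf", "yadif=mode=send_frame",
                    "-c:v", "libx264", "out_30p.mkv"], check=True)

    # One output frame per field (~60 fps): all the motion is kept, but there
    # are twice as many frames to encode and store.
    subprocess.run(["ffmpeg", "-i", SRC, "-vf", "yadif=mode=send_field",
                    "-c:v", "libx264", "out_60p.mkv"], check=True)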
If size of catalog trumps video quality, you roll with NTSC if NTSC is the best you can get. Deinterlacing is extremely tricky, and if you can deliver interlaced video to playback devices, you're often better off punting.
I'm envious of these capabilities. I wrote the backend for a streaming service that streams about forty channels to a quarter of a million mobile users in Africa. We get source video from our content providers that has given me grey hair. The problem is that I simply can't spin off the encoding and source quality checks to the cloud because of bandwidth costs here. So I do the quality checking and compression on local servers and then upload the compressed output to the servers.
I'd kill for a 100 megabit line at a decent price.
I've done a bunch of playing with ffmpeg at home, and I imagine the tech stack is probably similar at Netflix, at least for the Source-->Chunk, Chunk-->Assemble, and Assembled-->Encode steps.
The validation done during all of these steps is interesting. Netflix's early years were probably exactly like what you're doing: single file in, transcode, single file out and deploy.
Chunking the pieces up is clever. Getting it right must have been challenging. How do you write an oracle for something that complex?
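A toy version of the chunked path is easy to sketch with ffmpeg, and even a crude oracle (compare frame counts between the source and the reassembled output) catches a lot of chunking mistakes. A sketch, assuming ffmpeg/ffprobe are on PATH and all filenames are made up:

    import glob
    import subprocess
    from concurrent.futures import ProcessPoolExecutor

    SRC = "source_mezzanine.mov"  # made-up source name

    def frame_count(path):
        # Count decoded video frames with ffprobe (slow but exact).
        out = subprocess.run(
            ["ffprobe", "-v", "error", "-count_frames", "-select_streams", "v:0",
             "-show_entries", "stream=nb_read_frames", "-of", "csv=p=0", path],
            capture_output=True, text=True, check=True)
        return int(out.stdout.strip())

    def encode(chunk):
        out = chunk.replace("chunk_", "enc_").replace(".mov", ".mp4")
        subprocess.run(["ffmpeg", "-i", chunk, "-c:v", "libx264", "-crf", "20", out],
                       check=True)
        return out

    # 1. Split into ~90 s chunks without re-encoding (cuts land on keyframes).
    subprocess.run(["ffmpeg", "-i", SRC, "-c", "copy", "-f", "segment",
                    "-segment_time", "90", "chunk_%04d.mov"], check=True)

    # 2. Encode the chunks in parallel.
    with ProcessPoolExecutor() as pool:
        encoded = list(pool.map(encode, sorted(glob.glob("chunk_*.mov"))))

    # 3. Reassemble with the concat demuxer.
    with open("concat.txt", "w") as f:
        f.writelines(f"file '{name}'\n" for name in encoded)
    subprocess.run(["ffmpeg", "-f", "concat", "-safe", "0", "-i", "concat.txt",
                    "-c", "copy", "assembled.mp4"], check=True)

    # 4. Crude oracle: the assembled output should have the same frame count.
    assert frame_count(SRC) == frame_count("assembled.mp4")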
Fascinating article. Forgive me for being ignorant about this, but how often do they need to encode video?
I would think that it's only done when new titles are added to streaming, and once the video has been encoded into all the required formats they would be done with it. Sure, there is a lot of video content out there to be encoded but it isn't unlimited. Is serving the content and providing the recommendation engine to users at scale not a greater challenge than encoding the video?
There are lots of reasons a reencode might be needed. For example, a new compression algorithm is developed, a new device is supported with a new codec, a new way of giving users a faster startup is developed, etc.
Basically any change to the way video is delivered over the internet could trigger a full or partial reencode of the entire library.
It's another one of their challenges. Time is an important factor: encoding high-quality video is time consuming, and multiplied by the many bitrates and codecs it could mean that they need to delay their content availability by days.
When I built a similar system (for a large consumer electronics company), we built parallel paths for high-priority content: we'd intelligently split the incoming mezzanine and distribute the 90-120s chunks to a farm of systems, while also completing the multi-pass encodes. When the latter finished, the system would swap them. Because of the business model, we never ran this in full production mode, but it was built and ready to go.
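The swap itself can be boring and safe if the player-facing name never changes and you just repoint it atomically; a sketch of the idea, with made-up paths and names:

    import os
    import shutil

    LIVE = "title_manifest.json"  # the name players actually fetch (made up)

    def publish(rendition_manifest):
        # Write to a temp file, then rename over the live name. os.replace is
        # atomic, so readers see either the old manifest or the new one, never
        # a half-written file.
        tmp = LIVE + ".tmp"
        shutil.copyfile(rendition_manifest, tmp)
        os.replace(tmp, LIVE)

    publish("fast_chunked/manifest.json")   # quick chunked encode, live in minutes
    # ... later, when the multi-pass farm finishes ...
    publish("multipass_hq/manifest.json")   # quietly replaces the fast version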
Generally it takes 2-3x the original duration of the video to encode a source into 1080p, so I'm not sure why they take a full day, unless they do each bitrate serially, which I think is not as hard to parallelize as it is to parallelize a single bitrate by chunking.
Yes, I believe serving is a lot harder, but serving is almost a solved problem since people have been dealing with it for a long time.
>I'm not sure why they take a full day, unless they do each bitrate serially, which I think is not as hard to parallelize as it is to parallelize a single bitrate by chunking.
I think the talk I know this from is https://www.youtube.com/watch?v=tQrsz3BrfwU; they chunk not only for encoding but also for QC (and QC validation on the resulting transcoded asset).
If memory serves, the talk also discussed the long transcode time: their transcoder (EyeIO at the time, and I haven't heard differently since) is optimised for efficient packing over performance.
For x264 that is true; HEVC, which is also mentioned, is much slower. For a 4K source, transcoding can take more than a second per frame. For a normal movie this can quickly result in encoding times of more than a day.
Another problem is that you have to encode the movie for each codec profile times the number of different bitrates per profile. The article mentions four profiles (VC1, H.264/AVC Baseline, H.264/AVC Main and HEVC) and bitrates ranging from 100 kbps to 16 Mbps. Assuming 20 different bitrates per profile, you already get 4 × 20 = 80 encoded copies per source. But of course this can be addressed with parallelism.
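Back-of-the-envelope, using only the rough figures from this thread (the bitrate count and per-frame cost are assumptions):

    # Size of the encode matrix and a serial worst case, using the numbers above.
    profiles = ["VC1", "H.264 Baseline", "H.264 Main", "HEVC"]
    bitrates_per_profile = 20                    # assumption from the comment above
    print(len(profiles) * bitrates_per_profile)  # 80 renditions per source

    # A 2-hour movie at 24 fps, at ~1 second per frame for a single 4K HEVC rendition:
    frames = 2 * 60 * 60 * 24
    print(frames / 3600, "hours")                # ~48 hours if done serially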
Are there any codecs that can output multiple versions of an input at the same time? It seems like a lot of the encoding process (like motion estimation) is the same every time, so why redo it for every output instead of reusing it?
That would be interesting to know. A lot of transcoders can make multiple passes over the source, so being able to reuse the meta data generated for subsequent passes at different output qualities might help speed up the process. I dunno, not my forte, just thinking out loud.
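At least with ffmpeg + x264 you can do a version of this today: run the analysis pass once and, as far as I know, feed the same stats log to several second passes at different bitrates. A sketch with made-up filenames and bitrates:

    import subprocess

    SRC, LOG = "source.mov", "x264_stats"

    # Analysis pass: no real output, just the stats log on disk.
    subprocess.run(["ffmpeg", "-y", "-i", SRC, "-c:v", "libx264", "-b:v", "3000k",
                    "-pass", "1", "-passlogfile", LOG, "-an", "-f", "null", "-"],
                   check=True)

    # Several rate-control passes sharing that analysis. The further the target
    # bitrates diverge, the less optimal this presumably gets.
    for bitrate in ["1500k", "3000k", "6000k"]:
        subprocess.run(["ffmpeg", "-y", "-i", SRC, "-c:v", "libx264", "-b:v", bitrate,
                        "-pass", "2", "-passlogfile", LOG, f"out_{bitrate}.mp4"],
                       check=True)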
Well, don't they get re-encoded once in a while? I'm pretty sure the x264 encoder now is significantly better than the one from 3-4 years ago.
Same goes for HEVC.
I doubt that they do. If a stream is so old that significantly better encoders have come to market in the meantime, then probably very few people are watching it anyway.
I would be shocked if they didn't roll their catalog. Maybe not the whole thing every time, but pulling their sources and re-encoding the complete suite when a new bitrate/codec combination comes on line seems like a sensible use of resources.
I was wondering the same thing. I find it highly unlikely that they are doing this processing with entirely in-house tools, although their limited set of input and output codecs shrinks this down from "practically impossible" to merely "improbable".
It's possible their AVC encoder is not x264, and their VC1 encoder might be from Microsoft, in which case it would be a pretty heavily modified ffmpeg. And some of the input formats might not be using it either.
But you can be sure it's involved somewhere, since there's no other ProRes decoder that works on Linux!
I would be surprised if Netflix couldn't pay the patent fees. The ffmpeg page mentions that commercial ffmpeg users end up paying groups like MPEG LA. In general, I'm guessing patent owners don't like saying "no, we don't want to accept your money".