The GIF format is terrible. The only reason it’s still used at all is that it’s widely supported, and it’s often given special treatment as the only image format that can be animated. In this blog post, I go over the problems with GIF, and why I think the newer AVIF format is the perfect successor to GIF in the modern world.
This is the main problem with the GIF format for me. GIF is from the 1980s, and animated GIFs are from the 90s, and its compression is just as archaic as one might expect for a format that old. GIFs are extremely inefficient when it comes to size. A video that is a couple of megabytes can blow up to 100x its size when converted to a GIF with minimal losses.
There are two main innovations in modern video compression that GIF doesn’t have.
Ignoring the GIF palette (we’ll touch on that later), GIF compression is fully lossless. This sounds like a good thing, right? Why would you want to lose data? Well, here’s the thing: the human eye and brain cannot discern every single bit of an image. If you’re willing to accept a tiny bit of image degradation, you can throw out 80% of the data while barely losing any noticeable quality. This is why JPEGs (a lossy format) are so much smaller than PNGs (a lossless format), even when they look nearly identical. This principle applies to images and video. Modern video compression is very lossy, but we can hardly tell the difference, especially at higher resolutions.
Lossy compression is an umbrella term for many different types of compression, different bits of data we can throw away while keeping the image visually indistinguishable. Here are a few:
These examples are very high-level overviews that are very oversimplified, but they get the general idea across.
However, this isn’t everything. Formats like WebP do have lossy compression for animations, which does save some data, but nowhere near as much as a real video codec. That’s because video codecs have the next innovation.
Inter-frame compression extends the previous idea of “Different parts of the image are often not completely random, but are related to the others” to video. Each frame is often highly related to the last, and storing just the difference or the movement between frames takes far less data than storing each frame individually. Technically, GIFs have a very primitive and very often buggy form of this, called partial frames, which I touch on later.
Your average animated GIF, WebP, or JPEG XL, even if each frame is only 1% different from the last, will store an entire copy of each frame. This, predictably, leads to poor compression and large file sizes.
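To make the idea concrete, here is a toy sketch of "store only what changed". It is nowhere near what a real video codec does (those use motion compensation and lossy residuals), and the function names `frame_delta` and `apply_delta` are made up for illustration. Frames are modeled as flat lists of pixel values.

```python
def frame_delta(prev, curr):
    """Return (index, new_value) pairs for every pixel that changed."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    out = list(prev)
    for i, value in delta:
        out[i] = value
    return out


# A 100-pixel frame where only one pixel changes (1% different).
frame_a = [10] * 100
frame_b = list(frame_a)
frame_b[42] = 200

delta = frame_delta(frame_a, frame_b)
print(len(frame_b), "values for a full frame vs", len(delta), "entry in the delta")
```

In this toy case, the second frame costs one entry instead of 100 values, and `apply_delta(frame_a, delta)` gives back `frame_b` exactly.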
The most distinctive visual trait of modern GIFs is the limited palette. By default, a GIF can only have 256 colors per frame (or 255 colors + one fully transparent color). Technically, there are techniques to get a wider palette. GIFs can have different palettes per frame, and an individual frame can be split into chunks with individual palettes. Tools like gifski can encode fairly high quality GIFs using these. However, these require extra encoding time, and are often lost when websites re-compress GIFs. For example, Discord image previews quantize the color palette to 5 bits per channel, and all palette info beyond one palette per frame is lost.
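As a rough illustration of what "5 bits per channel" costs, here is a minimal sketch that truncates each 8-bit channel to its top 5 bits. This is an assumption about how such a quantizer could work, not Discord's actual implementation, and `quantize_channel` and `quantize_rgb` are hypothetical names.

```python
def quantize_channel(value, bits=5):
    """Keep only the top `bits` bits of an 8-bit channel value (0-255)."""
    shift = 8 - bits
    return (value >> shift) << shift


def quantize_rgb(pixel, bits=5):
    """Quantize every channel of an (r, g, b) tuple."""
    return tuple(quantize_channel(c, bits) for c in pixel)


# How many distinct values survive per channel?
levels = {quantize_channel(v) for v in range(256)}
print(len(levels))                   # 32 levels per channel
print(quantize_rgb((255, 130, 7)))   # (248, 128, 0)
```

With only 32 levels per channel, a smooth 256-step gradient collapses into 32 visible bands unless dithering hides them.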
Properly utilizing the GIF format depends on the encoder to use the palette smartly. Lazy encoders will use a fixed default palette across all GIFs, and no dithering when downsampling, leading to abysmal color banding. Some improvements smarter encoders can use are generating palettes based on the content of the frame, using one palette per frame, or dithering to make better use of the limited palette.
In the real world, dithering has issues. When a dithered image is made smaller, especially with a limited palette, most of the dithering information is lost. This is very common on the web, where bandwidth is aggressively optimized. Also, better-looking dithering algorithms like error diffusion dithering look horrible for animations, as the dithering isn’t stable, so there’s a static-y flickering effect.
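The flickering is easy to reproduce in a toy setting. The sketch below is a one-dimensional simplification of error diffusion (real algorithms like Floyd-Steinberg spread the error in two dimensions), dithering a row of gray pixels down to pure black and white. `diffuse_1d` is a hypothetical helper written for this example.

```python
def diffuse_1d(row, levels=(0, 255)):
    """Quantize each pixel to the nearest level, pushing the error to the next pixel."""
    out = []
    error = 0.0
    for value in row:
        target = value + error
        nearest = min(levels, key=lambda level: abs(level - target))
        out.append(nearest)
        error = target - nearest
    return out


frame_1 = [128] * 8            # flat mid-gray row
frame_2 = [120] + [128] * 7    # same row, first pixel slightly darker

dither_1 = diffuse_1d(frame_1)
dither_2 = diffuse_1d(frame_2)
changed = sum(a != b for a, b in zip(dither_1, dither_2))
print(dither_1)
print(dither_2)
print(changed, "of", len(frame_1), "output pixels differ")
```

One input pixel changed by a small amount, yet the error carried forward flips the whole output pattern. In an animation, that is the static-y shimmer between frames.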
GIFs technically support transparency, but it’s limited and buggy. You can have one color in the palette which is fully transparent, and nothing else. This is good enough for most cases, but it is a limitation.
Another huge issue is partial frames. A GIF frame can choose not to replace the previous frame, but to be drawn on top of it, using transparency to preserve parts of the previous frame. This seems like a decent idea in theory, since it would improve compression, but most encoders and decoders don’t fully understand it, and they often have trouble mixing it with GIFs whose transparency is supposed to clear between frames. It also doesn’t really count as true inter-frame compression, because the pixels have to match 100% for it to work: there’s no compensation for slight shifts in color or movement in the image.
GIFs store frame rate in a strange way. Each frame stores a number, which is how many hundredths of a second the frame should be displayed for. This is an integer, meaning the frame rate of GIFs is “quantized”. This isn’t very noticeable at lower frame rates, but at smoother frame rates, there are limited options. For example, there are only 4 valid frame rates above 24 fps: 100 fps (1), 50 fps (2), 33.3 fps (3), 25 fps (4). Some encoders get around this by varying the frame time per frame so it averages out to the target FPS, which does work well, but it runs into another issue with frame rate.
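Here is a minimal sketch of the partial-frame idea and of where it clashes with real transparency. Frames are lists of palette indices, index `0` is assumed to be the transparent color, and `make_partial` and `composite` are illustrative names, not part of any GIF library. Real GIFs also have per-frame disposal methods, which this sketch ignores.

```python
TRANSPARENT = 0  # assumption: palette index 0 is the transparent color


def make_partial(prev, curr):
    """Encode `curr` relative to `prev`: unchanged pixels become transparent."""
    return [TRANSPARENT if p == c else c for p, c in zip(prev, curr)]


def composite(canvas, partial):
    """Draw a partial frame on top of the canvas; transparent pixels keep the canvas."""
    return [c if p == TRANSPARENT else p for c, p in zip(canvas, partial)]


prev = [1, 2, 3, 4]

# Case 1: one pixel changes color. The partial frame round-trips fine.
curr = [1, 2, 5, 4]
ok = composite(prev, make_partial(prev, curr)) == curr

# Case 2: one pixel is supposed to BECOME transparent. "Keep the old pixel"
# and "this pixel is now see-through" use the same index, so the old pixel
# wrongly shows through.
cleared = [1, 0, 3, 4]
broken = composite(prev, make_partial(prev, cleared)) != cleared
print(ok, broken)
```

The second case is the kind of ambiguity that trips up encoders and decoders: one index has to mean both "unchanged" and "transparent".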
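The arithmetic is simple enough to check in a few lines. The sketch below lists the frame rates reachable with a constant integer delay, then shows one possible way to vary delays so they average out to a target FPS. The rounding scheme in `varied_delays` is an assumption for illustration, not necessarily what any particular encoder does.

```python
def gif_fps_options(min_fps=24):
    """Constant-delay frame rates above `min_fps`, as (delay_cs, fps) pairs."""
    options = []
    delay = 1
    while 100 / delay > min_fps:
        options.append((delay, 100 / delay))
        delay += 1
    return options


def varied_delays(target_fps, frame_count):
    """Integer centisecond delays whose running total tracks the ideal timeline."""
    delays = []
    shown = 0
    for i in range(1, frame_count + 1):
        ideal_end = round(i * 100 / target_fps)  # where frame i should end, in cs
        delays.append(ideal_end - shown)
        shown = ideal_end
    return delays


for delay, fps in gif_fps_options():
    print(f"delay {delay} cs -> {fps:.1f} fps")

print(varied_delays(30, 3))  # [3, 4, 3]: 3 frames in 10 cs, i.e. 30 fps on average
print(varied_delays(60, 3))  # [2, 1, 2]: note the 1 cs frame
```

Note that hitting 60 fps this way already needs 1/100 s frames.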
The GIF format technically supports up to 100 fps, but many players or browsers (including Chrome!) refuse to play GIFs at that speed, making any 1/100 duration frame take longer. This is especially annoying with encoders, such as FFmpeg, that vary frame time to simulate higher FPS. Even with GIFs well below 50 fps, some encoders will still generate 1/100 s frames to get closer to the target frame rate, which won’t display properly.
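To see how badly this can throw off playback, here is a small sketch of a player that refuses very short frames. The exact numbers are assumptions for illustration: it treats any delay under 2 cs as 10 cs, which is a commonly reported browser behavior, but the thresholds in your player may differ.

```python
MIN_DELAY_CS = 2        # assumption: shortest delay the player will honor
FALLBACK_DELAY_CS = 10  # assumption: what shorter delays get replaced with


def effective_delays(delays):
    """Delays as the player would actually apply them."""
    return [d if d >= MIN_DELAY_CS else FALLBACK_DELAY_CS for d in delays]


def average_fps(delays):
    """Average frame rate for a list of centisecond delays."""
    return 100 * len(delays) / sum(delays)


encoded = [2, 1, 2]  # delays averaging out to 60 fps
played = effective_delays(encoded)

print(average_fps(encoded))          # 60.0 intended
print(played)                        # [2, 10, 2]
print(round(average_fps(played), 1)) # 21.4 actually displayed
```

Under these assumptions, a GIF encoded for 60 fps plays at roughly a third of that speed, and unevenly, since only some frames get stretched.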
Now that we’ve discussed the issues GIF has, let’s discuss the properties a good GIF replacement should have: improving on GIF’s weaknesses while keeping its strengths.
This is the main thing our format is trying to accomplish. However, this has little to do with the format and more to do with the player. Our format shouldn’t support audio as a first-class feature.
Our format should use existing, modern, and efficient inter-frame compression. Ideally one that’s widely supported with hardware acceleration, so many can be played at once on a single screen. An option for lossless encoding is nice to have, but good-looking lossy compression should be the ideal. We should also have a balance between good file size and good performance. Advanced codecs like VP9 can be very heavy, especially when encoding, and this should be taken into account.
Most video formats lack transparency capabilities, so it’s worth noting that our format should have it. Ideally a full alpha channel, not just a single transparent color.
A nice thing to have for any new format is backwards compatibility. My ideal for this is that a player that doesn’t know the format should at least recognize the first frame as an image and display that.
This is called progressive playback. This is something we take for granted with many formats: you don’t need the entire file to begin playback. Some of the existing replacements don’t support this, so it’s worth mentioning. Some formats also support progressive rendering, i.e. parts of the frame can be displayed without loading it in full, but this isn’t necessary, even if it is nice to have.
Store framerate normally, no partial frames, no restricted palettes.
Let’s make a table of the properties we want our format to have, and update it as we go along.
Treated as an image | Good Compression | Full Transparency | Backwards Compatibility | Playing as it loads | No weird quirks |
---|---|---|---|---|---|
Many websites with fine control over the player will just use normal videos in place of GIFs. This does work well for those websites, but these cannot be easily shared without losing the GIF property or converting to a GIF. These also often lack transparency capabilities.
Treated as an image | Good Compression | Full Transparency | Backwards Compatibility | Playing as it loads | No weird quirks |
---|---|---|---|---|---|
❌ | ✔️ | ❌ | ❌ | ✔️ | ✔️ |
APNG is fairly well known, and is an extension of the PNG format. It can store multiple PNG frames in sequence, and just displays as a regular PNG image in players that don’t support APNG. However, there is no inter-frame compression, just one PNG per frame, which isn’t very efficient in terms of file size. Each frame does support full PNG capabilities, so full lossless 8 bits per channel color + transparency.
Treated as an image | Good Compression | Full Transparency | Backwards Compatibility | Playing as it loads | No weird quirks |
---|---|---|---|---|---|
✔️ | ❌ | ✔️ | ✔️ | ✔️ | ✔️ |
WebP is very new and very controversial. Many people hate it due to inconsistent support, which isn’t really a fault of the format itself. WebP is a fairly good format, based on VP8 encoding, supporting lossy and lossless compression, transparency, and animation. However, its animation has the same problem as APNG. It’s not a true video codec; it’s just a sequence of individual WebP images, which doesn’t compress very well.
Treated as an image | Good Compression | Full Transparency | Backwards Compatibility | Playing as it loads | No weird quirks |
---|---|---|---|---|---|
✔️ | ❌ | ✔️ | ✔️ | ✔️ | ✔️ |
JPEG XL is made by the same people who made JPEG. It supports animation, transparency, and both lossy and lossless encoding. However, it has the same issue as APNG and WebP: it’s not a true video codec, just a sequence of images. It even inherits some weird quirks from GIF, like partial frames.
Treated as an image | Good Compression | Full Transparency | Backwards Compatibility | Playing as it loads | No weird quirks |
---|---|---|---|---|---|
✔️ | ❌ | ✔️ | ✔️ | ✔️ | ❌ |
AVIF is a serious contender for GIF’s place (and really all web images) given the requirements I’ve set out. It’s based on AV1, a super modern and amazing video codec. Animated AVIFs support a full AV1 video stream, with transparency, lossless or lossy, and many other bells and whistles like HDR.
Up until this point, this blog post was going to be about how I was going to design my own replacement for GIF, by using the APNG-style backwards compatibility on top of WebP. This is because I thought AVIF didn’t support progressive playback. My basic research suggested it wasn’t possible. Also, transparent AVIFs have 2 streams: one for color and one for alpha, and I assumed they were stored separately and so progressive playback wasn’t possible. Also, storing that separately is a little hacky.
But then… I actually tested it. I found an animated AVIF with transparency, limited my network speed (thanks Chrome devtools!), and loaded it, and it started playing nearly instantly after the data started loading, and behaved just like a GIF would, but taking up far less storage and looking far better. I was shocked. I was annoyed that I had to change the premise of my article, but delighted that my perfect GIF replacement format already existed! Although it requires the browser to support AVIF itself, support for animation isn’t required to display the first frame of an animated AVIF.
Turns out, the two streams are interleaved, one frame at a time. AVIF doesn’t support progressive rendering (rendering parts of an individual frame, or a lower quality version of a frame, before it’s finished downloading), but it does support progressive playback.
Treated as an image | Good Compression | Full Transparency | Backwards Compatibility | Playing as it loads | No weird quirks |
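Here is a toy model of why the layout matters for progressive playback. It assumes every color chunk and every alpha chunk is the same size and arrives in file order, which real files won't satisfy exactly, and both function names are made up for this sketch. It compares an interleaved layout (`C0 A0 C1 A1 ...`) with the layout I had wrongly assumed (`C0 C1 ... A0 A1 ...`).

```python
def playable_interleaved(chunks_received):
    """Layout C0 A0 C1 A1 ...: every 2 chunks complete one frame."""
    return chunks_received // 2


def playable_separate(chunks_received, frame_count):
    """Layout C0..Cn A0..An: no frame is complete until alpha chunks arrive."""
    return max(0, chunks_received - frame_count)


frame_count = 10
half_file = frame_count  # 10 of the 20 chunks have downloaded

print(playable_interleaved(half_file))            # 5 frames ready to show
print(playable_separate(half_file, frame_count))  # 0 frames ready to show
```

With half the file downloaded, the interleaved layout can already play half the animation, while the separate layout can't show a single complete transparent frame.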
---|---|---|---|---|---|
✔️ | ✔️ | ✔️ | ✔️ | ✔️‼️ | ✔️ |
Format | Treated as an image | Good Compression | Full Transparency | Backwards Compatibility | Playing as it loads | No weird quirks |
---|---|---|---|---|---|---|
Traditional Video Codecs | ❌ | ✔️ | ❌ | ❌ | ✔️ | ✔️ |
APNG | ✔️ | ❌ | ✔️ | ✔️ | ✔️ | ✔️ |
WebP | ✔️ | ❌ | ✔️ | ✔️ | ✔️ | ✔️ |
JPEG XL | ✔️ | ❌ | ✔️ | ✔️ | ✔️ | ❌ |
AVIF | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
AVIF has the properties I want, but it also has some downsides.
AVIF is fantastic. It’s amazingly space efficient and has every feature you could want. However, we run into “the WebP problem”, i.e., support.
Most modern browsers do support AVIF, but sometimes it’s limited.
As mentioned previously, AV1 is a very heavy codec, and is an absolute resource hog when run without hardware acceleration. This matters for GIFs, where it’s common to display many on the screen at once, such as in an online chat platform or a Twitch chat. Thankfully, hardware acceleration is getting there; it will just take some time. High quality AVIFs can struggle to display on devices without hardware acceleration.
Encoding an AV1 video is even harder and more intensive on the system. Hardware encoding acceleration is also on the rise, usually lagging a generation behind decoding acceleration. The RTX 4000+, RX 7000+, Intel Arc, Intel Core 12th gen+, and Ryzen 7000+ have AV1 encoding acceleration. It can be done, but compared to the dead simple GIF encoding, it’s a noticeable difference. This also matters for websites that re-encode user-submitted GIFs, like Tenor. YouTube already only encodes very popular videos in AV1, although full YouTube videos are obviously more intensive than a normal GIF.
AV1 technically supports lossless encoding, but it’s not very good. It’s almost always worse than lossless WebP and JPEG XL, and sometimes even worse than PNG. I suppose intentionally choosing lossless mode means you aren’t super concerned with file size, but it’s important to note. This can always be improved in the future.
I hope you all appreciated this overview of modern animated image formats, and why I like AVIF so much. With the rise of AV1 hardware acceleration and improvements to AV1 lossless encoding, hopefully AVIF will establish itself as the obvious successor to GIF and all other web images, and we will all benefit from its cool features and efficient compression.