This article proposes use cases for a new image codec and presents libraries for working with it on both the front end and the back end.
MSU Codec Comparison (April 4, 2019)
Well, it seems that AV1, the new video coding format developed by the Alliance for Open Media, is the most promising at the moment for getting the highest compression possible, especially at low bitrates. See e.g. the MSU benchmarks above. Since we would need to re-compress images anyway to get the benefits of a new format, why not choose the best?
Another advantage of AV1 is that it’s a royalty-free format, which means you don’t have to pay patent holders. For what it’s worth, software patents are unfortunately still a thing. Formats like JPEG XR didn’t achieve wide adoption mostly because of the patents involved. So AV1-based solutions are attractive from both technical and legal points of view.
If you’re interested in leveraging AV1 for video compression, take a look at my previous article dedicated to this subject.
Comparison of intra coding efficiency
This graph shows libaom as the clear winner: it has the best score on the VMAF metric, and its slowest encoding preset is actually faster than the competitors’, at least on my pre-AVX2 CPU. (libjpeg results are provided for reference.) That could be explained by the speed-over-quality trade-offs chosen in SVT-AV1 and rav1e. Such trade-offs aren’t bad in themselves, but a still image is represented as a single-frame video, and encoding one frame doesn’t take long even with the slowest compression settings. So libaom should be a good choice, and we can always make it faster with its speed controls if needed.
I’ve also compared libaom and SVT-AV1 encodes by eye, because objective metrics are not the single source of truth. Subjectively, the results pretty much matched the VMAF scores, though sometimes it was hard to pick the better of the two.
So the AV1 encoder is chosen, what’s next? The forum back-end where I’m going to use AVIF is written in Go, so I needed a library for that language. After some searching I found an existing library. It probably works fine and should allow writing Go bindings, but I decided to write my own to better understand the format. Since we won’t implement the encoder from scratch, the entire library boils down to a cgo wrapper around libaom and a pure Go ISOBMFF muxer.
libaom provides a typical encoder library C API. We need to prepare a frame, i.e. wrap the pixel data into the library’s structures, run the encode function on it and get the results back. Most encoders operate on Y’CbCr data rather than RGB, and 4:2:0 chroma subsampling is the most common layout. I’m using the image package from the Go standard library to get RGB pixel values from the image provided by the user. It supports decoding the popular JPEG and PNG formats out of the box. Pixels in .png are already stored as RGB, and for .jpg Go converts them to RGB automatically. We just need to convert RGB to limited-range Y’CbCr BT.709 4:2:0 and pass it to the encoder. If that sounds scary, don’t worry: the operation boils down to multiplying the R, G and B components of every pixel by some coefficients, plus a few additions.
Now we need to pass that data to libaom. I’m using a small C wrapper for easier interoperability between C and Go, and libaom’s API is pretty straightforward. With the resulting library, go-avif, encoding an image looks like this:
package main

import (
	"image"
	_ "image/jpeg" // registers the JPEG decoder with image.Decode
	"log"
	"os"

	"github.com/Kagami/go-avif"
)

func main() {
	if len(os.Args) != 3 {
		log.Fatalf("Usage: %s src.jpg dst.avif", os.Args[0])
	}

	// Open the source image.
	srcPath := os.Args[1]
	src, err := os.Open(srcPath)
	if err != nil {
		log.Fatalf("Can't open source file: %v", err)
	}
	defer src.Close()

	// Create the destination file.
	dstPath := os.Args[2]
	dst, err := os.Create(dstPath)
	if err != nil {
		log.Fatalf("Can't create destination file: %v", err)
	}
	defer dst.Close()

	// Decode the source into an image.Image.
	img, _, err := image.Decode(src)
	if err != nil {
		log.Fatalf("Can't decode source file: %v", err)
	}

	// Encode with default options (nil).
	err = avif.Encode(dst, img, nil)
	if err != nil {
		log.Fatalf("Can't encode source image: %v", err)
	}

	log.Printf("Encoded AVIF at %s", dstPath)
}
See the library documentation for further details. go-avif also provides a simple CLI utility for converting images to AVIF with a single command from the console. Binaries are available for Windows, Linux and macOS. Usage cheat sheet:
# Encode JPEG to AVIF with default settings
avif -e cat.jpg -o kitty.avif
# Encode PNG with slowest speed and quality 15
avif -e dog.png -o doggy.avif --best -q 15
# Fastest encoding
avif -e pig.png -o piggy.avif --fast
Service Worker fetch interceptor (MDN docs)
The first API, Service Workers, provides a way to intercept any fetch request made by the page and respond with a custom answer produced in JavaScript. Pretty impressive, isn’t it? Without that feature we would be limited to an imperative decode-me-that-file-and-paint-it-here style of library API, which is of course usable too, but ugly. We web developers are used to transparent polyfills that hide implementation details under the hood and provide clean, simple APIs. For an image format that would mean the ability to display it with an <IMG> tag, CSS properties and so on. As you might have guessed, the library I’m going to propose uses exactly that embedding mechanism.
What about the second API? Since the format is not yet supported natively, we also need to decode it somehow, i.e. transform bytes into actual pixels to display. Well, JavaScript is (obviously) a Turing-complete language, so it’s perfectly possible to write a decoder of any complexity in pure JS. That has actually been demonstrated in the past. Unfortunately, decoders tend to be very computationally expensive, especially for new formats such as AV1. Even native code written in the computer’s best-friend languages such as C and Assembly tends to be slow, despite full access to SIMD instructions, threads and whatnot. Recently we’ve got one useful instrument in JS land that helps with this type of task: WebAssembly, a new binary format designed to run code at close to native speed.
I’ve already written an article about using it on the web, so I won’t go into much detail here. All we need to know is that it lets us compile a library written in C/C++ into a form that can be executed inside a web page without any changes to the code. This is especially useful since a great AV1 decoder already exists. I’m talking about dav1d, the current state of the art.
One thing worth noting: we won’t get the full speed of native code with WebAssembly, for several reasons. It’s currently 32-bit, has no SIMD and no threads (or only behind a flag), and sandboxing also brings some overhead. But for still images, 100–200 ms of delay should be fine.
The only code we need to write is small wrappers in C and JavaScript to glue everything together. The implementation is split between a C file and a JavaScript file respectively, and the entire polyfill is publicly available.
WebAssembly is cool, but can we do better? The most keen-eyed readers will have noticed that AV1 in AVIF is exactly the same as in video, so we should be able to decode it using the AV1 codec already shipped for HTML5 video. It turns out we can! Well, at least in browsers that support AV1; it’s still bleeding-edge technology. The tricky part is that we can’t insert an AVIF file as is into a <video> tag, that simply won’t work. We need to write a parser (demuxer) for the ISOBMFF container format in order to extract the actual content of the frame (OBUs), and then also write a muxer to wrap that frame into a playable .mp4 video, which uses the ISOBMFF container too.
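To give a feel for what such a demuxer and muxer deal with, here is a minimal sketch of reading and writing ISOBMFF boxes. It is written in Go to match the earlier examples, whereas the polyfill itself is JavaScript, and the names Box, readBoxes and makeBox are invented for this illustration. It only handles one level of boxes: a real demuxer would recurse into container boxes and interpret their fields to locate the OBU data.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Box is one ISOBMFF box: a four-character type plus its raw payload.
type Box struct {
	Type    string
	Payload []byte
}

// readBoxes splits data into the sequence of boxes it contains.
// Each box starts with a 32-bit big-endian size and a 4-byte type.
func readBoxes(data []byte) ([]Box, error) {
	var boxes []Box
	for len(data) > 0 {
		if len(data) < 8 {
			return nil, errors.New("truncated box header")
		}
		size := uint64(binary.BigEndian.Uint32(data[0:4]))
		typ := string(data[4:8])
		header := uint64(8)
		switch size {
		case 0: // box extends to the end of the enclosing data
			size = uint64(len(data))
		case 1: // a 64-bit size follows the type
			if len(data) < 16 {
				return nil, errors.New("truncated 64-bit box size")
			}
			size = binary.BigEndian.Uint64(data[8:16])
			header = 16
		}
		if size < header || size > uint64(len(data)) {
			return nil, fmt.Errorf("box %q has invalid size %d", typ, size)
		}
		boxes = append(boxes, Box{Type: typ, Payload: data[header:size]})
		data = data[size:]
	}
	return boxes, nil
}

// makeBox is the muxing counterpart: it prepends size and type to a payload.
// typ must be exactly four bytes long.
func makeBox(typ string, payload []byte) []byte {
	buf := make([]byte, 8, 8+len(payload))
	binary.BigEndian.PutUint32(buf[0:4], uint32(8+len(payload)))
	copy(buf[4:8], typ)
	return append(buf, payload...)
}

func main() {
	// A toy file: a brand box followed by a media data box with fake content.
	file := append(makeBox("ftyp", []byte("avif\x00\x00\x00\x00avifmif1")),
		makeBox("mdat", []byte{0x0a, 0x0b, 0x0c})...)
	boxes, err := readBoxes(file)
	if err != nil {
		panic(err)
	}
	for _, b := range boxes {
		fmt.Printf("%s: %d payload bytes\n", b.Type, len(b.Payload))
	}
}
```

Because containers are just boxes whose payload is another box sequence, calling readBoxes again on such a payload walks one level deeper.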
AVIF file in ISOBMFF Box Structure Viewer
It turned out to be a pretty fun and exciting task; my implementation is in the library’s source. An MP4 file is like XML, i.e. nested tags with some attributes and content, but binary. The design is clean and simple, I really like it.
Once we have the .mp4 file in a typed array, we need to convert it to a blob and pass it to a standard video element. Believe it or not, this turned out to be a really hard task. That’s because inside a Service Worker you don’t have access to the DOM and can’t create new HTML elements with document.createElement. Fail.
After some thinking I came to a solution that made the architecture of the library really clumsy. Since we have to respond to the intercepted fetch event in the same Service Worker that received it, but can process it only in the main thread, we use message passing to hand over the decoding task and get the results back. It turned out to work pretty well, and it is also a bit faster than the WebAssembly version of dav1d because of access to SIMD and things like that.
There is one small thing left to do. Browsers won’t understand the uncompressed Y’CbCr frame returned by the decoder. We can only respond with image data supported by the standard <IMG> tag: JPEG, PNG, BMP and similar formats. The simplest solution would be to use the standard toDataURL("image/jpeg") method of the canvas element to get JPEG data as a string. But that route costs both quality and performance, so I’ve implemented a small .bmp muxer in pure JS instead. BMP can contain uncompressed RGB pixel data, so it’s only a matter of writing a header and reordering each RGB triplet to BGR.
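The BMP step is small enough to show in full. The sketch below is not the avif.js muxer (that one is JavaScript); it is the same idea written in Go to match the earlier examples, with encodeBMP being a name made up here. It assumes tightly packed 8-bit RGB input with the top row first, and writes a classic 24-bit bottom-up bitmap with a 14-byte file header and a 40-byte info header.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeBMP wraps packed RGB pixels (row-major, top row first) into an
// uncompressed 24-bit BMP file.
func encodeBMP(rgb []byte, width, height int) ([]byte, error) {
	if width <= 0 || height <= 0 || len(rgb) != width*height*3 {
		return nil, fmt.Errorf("bad dimensions %dx%d for %d bytes", width, height, len(rgb))
	}
	const headerSize = 14 + 40   // file header + BITMAPINFOHEADER
	stride := (width*3 + 3) &^ 3 // each row is padded to a multiple of 4 bytes
	pixelBytes := stride * height

	out := make([]byte, headerSize+pixelBytes)
	le := binary.LittleEndian

	// File header.
	copy(out[0:2], "BM")
	le.PutUint32(out[2:], uint32(len(out))) // total file size
	le.PutUint32(out[10:], headerSize)      // offset of the pixel data

	// Info header; fields left at zero: compression (BI_RGB),
	// resolution and palette sizes.
	le.PutUint32(out[14:], 40) // info header size
	le.PutUint32(out[18:], uint32(width))
	le.PutUint32(out[22:], uint32(height)) // positive height: rows are bottom-up
	le.PutUint16(out[26:], 1)              // color planes
	le.PutUint16(out[28:], 24)             // bits per pixel
	le.PutUint32(out[34:], uint32(pixelBytes))

	// Pixel data: bottom row first, each triplet reordered from RGB to BGR.
	for y := 0; y < height; y++ {
		dst := out[headerSize+(height-1-y)*stride:]
		src := rgb[y*width*3:]
		for x := 0; x < width; x++ {
			dst[x*3+0] = src[x*3+2] // B
			dst[x*3+1] = src[x*3+1] // G
			dst[x*3+2] = src[x*3+0] // R
		}
	}
	return out, nil
}

func main() {
	// 2x2 image: red, green / blue, white.
	pixels := []byte{255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255}
	bmp, err := encodeBMP(pixels, 2, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(bmp), "bytes")
}
```

Since nothing is compressed, the cost is one pass over the pixels, which is the main attraction compared with re-encoding to JPEG.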
avif.js demo in Chrome for Android
The entire library is published as the avif.js package on npm. Usage is dead simple:
// Install library
npm install avif.js
// Put this to reg.js and serve avif-sw.js from web root
// Both scripts should be transpiled (either manually with e.g.
// browserify or automatically by parcel)
require("avif.js").register("/avif-sw.js");
// HTML
<body>
  <!-- Register worker -->
  <script src="reg.js"></script>
  <!-- Can embed AVIF with IMG tag now -->
  <img src="image.avif">
  <!-- Or via CSS property -->
  <div style="background: url(image2.avif)">
    some content
  </div>
</body>
A live demo is also available.
There is only one thing I’m not satisfied with: Service Workers is a complex, fragile and error-prone API. The update mechanism of a service worker is a good example. You can easily mess up and break the entire site. Or not get the updated version. Or have fetch requests hang forever. Needless to say, all of these things happened to me while developing avif.js. Hopefully there won’t be issues like that anymore now that the code has stabilized. Let’s also hope the web standard authors will improve the situation in the next iterations of the Service Workers API.