Multimedia

Introduction

In the Internet context, multimedia refers to objects that are a mix of text, images, audio, video, animations, and other elements. This section is about time-based media: audio, movies, and interactive web-based applications, such as Flash. We discuss these elements individually, as well as their combination into multimedia objects. Text/HTML, still images, and PDFs are discussed separately at length in other chapters. You might also want to consult the general chapter on Downloads.

In writing these guidelines we are assuming a typical bandwidth of 20kbps, as discussed in the Introduction. While it is possible to deliver multimedia assets over low bandwidth, 20kbps is at the lower end of bandwidths through which this can be achieved, and the use of multimedia objects in designing websites for low bandwidth needs to be considered carefully. Nevertheless, there are many good reasons for including multimedia objects, and we discuss what compromises are available.

To give some sense of scale, the following table shows the time to download a 20 minute video, its audio versions, and the script over a connection that provides an effective bandwidth of 20kbps[1]. This assumes a continuous connection; in reality a lengthy download is unlikely to run to completion and may have to be resumed several times.

Content                 Format        Size     Download time at 20kbps
Video                   MP4           77MB     9 hours
Audio                   MP3           5MB      33 minutes
Audio (mobile phone)    AMR-NB        800kB    5 minutes
Script                  plain text    20kB     8 sec
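
The figures in the table can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming decimal units (1kB = 1000 bytes, 1kbps = 1000 bits per second) and a connection that never drops; the function names are our own and purely illustrative.

```python
def download_seconds(size_bytes: float, bandwidth_kbps: float) -> float:
    """Seconds needed to transfer size_bytes at bandwidth_kbps."""
    bits = size_bytes * 8                      # 1 byte = 8 bits
    return bits / (bandwidth_kbps * 1000)      # 1 kbps = 1000 bits per second


def human(seconds: float) -> str:
    """Round a duration to the coarse units used in the table."""
    if seconds < 60:
        return f"{seconds:.0f} sec"
    if seconds < 3600:
        return f"{seconds / 60:.0f} minutes"
    return f"{seconds / 3600:.0f} hours"


# Sizes taken from the table above.
FILES = [
    ("Video, MP4", 77_000_000),
    ("Audio, MP3", 5_000_000),
    ("Audio, AMR-NB", 800_000),
    ("Script, plain text", 20_000),
]

for name, size in FILES:
    print(f"{name}: {human(download_seconds(size, 20))}")
```

Running the sketch prints 9 hours, 33 minutes, 5 minutes, and 8 sec, matching the table.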

Time-based Media

Time-based media, such as linear audio and video, is designed to play back over time. Before we begin, it is worth clarifying a potential confusion in the terminology. Bitrate (for example in kbps) is often used as a measure of the quality of audio and video recordings. This is separate from the bandwidth available to a user, although both are measured in the same units.

In dealing with these media, compression is essential: uncompressed video is only used in a professional context, and most music played over the Internet is compressed in some way, even in a high bandwidth context. It is only the powerful video and audio compression techniques that have become available since the late 1990s that have opened up the Internet to video and audio content.

There is no one best format for a piece of video or audio, and so it is useful to provide more than one. However, rather than just providing alternative video formats, you should also provide an audio alternative to video. This dramatically cuts down the bandwidth needed, and may convey much of the information contained in the video. Providing an audio alternative is usually a small effort, particularly if you are using an encoding application that is capable of batch encoding.

Further, provide a text alternative to audio and video. For films, a script may be available already, and it is little extra effort to put the script online alongside the video. However, the value of video/audio is often in the immediacy with which an event can be relayed, and a script may not be available. If creating a full transcript is not feasible, you could consider summarising the main points as text. Speech to text conversion is becoming increasingly available, and it may be preferable to provide an imperfect transcript rather than no transcript at all.

Audio

Most audio encoding applications default to 128kbps, a setting aimed at music. For speech, you can choose much lower settings: with good compression, not much is gained above 64kbps (mono). In fact, MP3 at 32kbps is acceptable, and the AMR-NB (Adaptive Multi-Rate Narrowband) mobile phone encoding works at bitrates as low as 5kbps. RealAudio also compresses well to bitrates around 12-32kbps.

In summary:

  • Bitrates of 128kbps are intended for music kept in a local library, and should not be used for Internet audio.
  • Instead, MP3 at 32kbps is suitable for speech and general podcasting.
  • If you can, provide an additional format in AMR-NB at 5kbps, suitable for low bandwidth audio.
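
To see what these bitrates mean in practice, the size of an encoded file is simply bitrate multiplied by duration. The sketch below assumes a constant bitrate, decimal units, and the same 20 minute recording used in the Introduction; the function name is illustrative.

```python
def encoded_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate size of a constant-bitrate recording, ignoring container overhead."""
    return bitrate_kbps * 1000 * duration_s / 8   # bits -> bytes


DURATION_S = 20 * 60   # a 20 minute recording

for label, kbps in [("Music default", 128), ("MP3 speech", 32), ("AMR-NB", 5)]:
    size_mb = encoded_size_bytes(kbps, DURATION_S) / 1_000_000
    print(f"{label} at {kbps}kbps: {size_mb:.2f}MB")
```

This gives 19.20MB at 128kbps, 4.80MB at 32kbps, and 0.75MB at 5kbps, which is consistent with the 5MB and 800kB figures quoted earlier.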

Note that there is a trade-off between the advantages of specialised compression and the fact that fewer players may be able to handle the resulting format. The MP3 format has excellent market penetration, in the sense that most media players are able to play it, but AMR-NB has better compression. However, fewer players are able to play AMR-NB, so it is not advisable to provide it alone. Providing both AMR-NB and MP3 is a good strategy.

Production Tips: There are a few tips for producing audio that will help your audio quality remain high at low bitrates. Firstly, you should use mono, rather than stereo encoding. Note that many encoding applications will use stereo by default, but unless you absolutely need two channels, you should just use mono.

Record your audio as well as you can: not surprisingly, pristinely recorded audio will sound better when compressed than poorly recorded audio. So you should use the best quality microphones and audio recorders you have available. If possible, record in a quiet environment, avoiding environmental noises, such as road noise, computer fans, air conditioning, and so forth. It is also good practice to normalise the audio, as well as to perform audio range compression. This helps to use the available bits per sample most efficiently. It is outside the scope of these guidelines to provide further information on such techniques, but tutorials are available elsewhere.
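
As an illustration of the mono and low-bitrate advice above, the following sketch builds command lines for the ffmpeg command-line encoder. ffmpeg is not discussed in these guidelines and is our assumption here; the encoder names `libmp3lame` and `libopencore_amrnb` are only available if your ffmpeg build includes them, and the file names are hypothetical. The sketch only prints the commands; it does not run them.

```python
import shlex


def mp3_speech_command(source: str, target: str, bitrate_kbps: int = 32) -> list[str]:
    """ffmpeg arguments for a mono MP3 at a speech-friendly bitrate."""
    return [
        "ffmpeg", "-i", source,
        "-ac", "1",                    # one audio channel (mono)
        "-c:a", "libmp3lame",          # MP3 encoder (assumed present in the build)
        "-b:a", f"{bitrate_kbps}k",    # audio bitrate
        target,
    ]


def amr_nb_command(source: str, target: str) -> list[str]:
    """ffmpeg arguments for AMR-NB at its lowest mode (4.75kbps)."""
    return [
        "ffmpeg", "-i", source,
        "-ac", "1", "-ar", "8000",     # AMR-NB is mono, sampled at 8 kHz
        "-c:a", "libopencore_amrnb",   # AMR-NB encoder (assumed present in the build)
        "-b:a", "4.75k",
        target,
    ]


# Hypothetical file names, for illustration only.
for command in (mp3_speech_command("talk.wav", "talk.mp3"),
                amr_nb_command("talk.wav", "talk.amr")):
    print(shlex.join(command))
```

Calling both functions for every recording is one way to batch-produce the MP3 and AMR-NB versions side by side.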

Video

Dealing with video over very low bandwidth is more difficult than audio. For speech-based audio, 32kbps is quite sufficient, and there is likely to be little noticeable benefit above 64kbps. For video, there is no such ceiling. Moreover, there is an even larger range of formats and options to consider than for audio. Video is tricky, even in a high bandwidth context, due to this plurality of formats and players. When this section was initially written, the large range of formats available included these relatively recent video formats: QuickTime/H.264, RealVideo-10, WMV-9 / VC-1, Flash video with VP6, as well as mobile video formats such as 3GP (with MPEG or H.264 codecs). You want something that has a large base of users and both strong audio and video compression, but because there were so many formats, the choice was not simple. As of 2007, a Flash beta release is available that supports H.264 playback.

Flash video was already becoming ubiquitous, and now, with H.264 video compression, it is an excellent choice for delivering video online. Previously, Flash used MP3 compression for the audio track, which made it less suitable for video delivery over connections of less than 56kbps. However, the AAC codec is now supported for audio, making lower bitrates available. Moreover, Flash can play back the usual file formats related to H.264, such as mp4, mov, m4v, and 3gp. This means that a 3gp file with H.264 compression can be used to play back online (through a Flash video player, see below), as well as for download to play back on a (more recent) mobile phone. Mobile devices have limited bandwidth and processing power at their disposal. If your video can be downloaded via the Internet onto a mobile device and plays back well, it is likely that it will work over low bandwidths.

You might want to provide Flash video for standard bandwidth access, with an additional video format (e.g. one aimed at mobile devices). In any case, pick a format that has good video compression, typically a recent video standard, but consider that this has implications for which players can play your video.

However, it is probably most important to provide an audio alternative rather than several video alternatives, along with a text alternative if possible, as outlined above.

Production Tips: In terms of production, well produced video will look better when heavily compressed, so if you can use a reasonably modern video camera to produce it, or a (semi-)professional DV/HDV camera, this will help. The less movement there is in the video, the better it will compress. So you'll want to at least eliminate accidental motion, and use a tripod. If the nature of your production allows it, you might want to restrict intentional camera movements (zoom, tilt, pan, and tracking) too. Also, bear in mind that the video is likely to be watched on small screens, so get your subjects to fill the frame.

Encoding Tips: The following tips are a little technical in nature, and you might need to consult additional tutorial materials. Whatever format you use, if your encoder supports it, you should use two-pass encoding. It takes longer to encode, because the video is analysed in a first pass before it is encoded in a second. However, two-pass encoding results in better video for the same bitrate and file size. If your video is just for download (and not streaming), you could use a variable bitrate setting, to use the available bitrate most efficiently. For MP4 formats, make sure that you retain "forced key frames": in principle you might save on file size by not doing this, but if your key frames are too far apart, the video won't play back well on iPods, and won't seek well for streaming.
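
To make the two-pass idea concrete, here is a sketch that builds the two command lines for the ffmpeg encoder. ffmpeg, the `libx264` and `aac` encoders, the bitrates, and the key frame interval of 250 frames are all our assumptions for illustration, not settings recommended by these guidelines; consult your own encoder's documentation for suitable values.

```python
import os
import shlex


def two_pass_commands(source: str, target: str, video_kbps: int = 100,
                      audio_kbps: int = 16, keyframe_interval: int = 250):
    """Return (first_pass, second_pass) ffmpeg argument lists."""
    common = [
        "ffmpeg", "-y", "-i", source,
        "-c:v", "libx264",               # H.264 encoder (assumed present in the build)
        "-b:v", f"{video_kbps}k",        # target video bitrate
        "-g", str(keyframe_interval),    # maximum distance between key frames
    ]
    # Pass 1 only analyses the video: no audio, and the output is discarded.
    first = common + ["-pass", "1", "-an", "-f", "null", os.devnull]
    # Pass 2 uses the statistics from pass 1 and writes the real file.
    second = common + ["-pass", "2", "-c:a", "aac", "-ac", "1",
                       "-b:a", f"{audio_kbps}k", target]
    return first, second


# Hypothetical file names, for illustration only.
for command in two_pass_commands("talk.mov", "talk.mp4"):
    print(shlex.join(command))
```

The two commands must be run in order and from the same directory, since the second pass reads the statistics file that the first pass writes there.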

Delivery and Presentation of Time-based Media

Now that we've looked at encoding a single piece of audio or video into various formats, we need to consider delivery of these media on the web.

By 'delivery methods', we mean the choice between download, progressive download, and true streaming.

Download uses HTTP, and more information is available in the Downloads chapter. From a low bandwidth perspective, letting users download multimedia is good: the encoding bitrate of your content may well exceed the bandwidth that the viewer has available. This means that it will take significantly longer to download the media object than it will to watch it. If you try to stream this media, it will keep stopping during playback while the player buffers the next chunk. With a download, the viewer can wait until the content has been fully downloaded before trying to watch. Also, the content can be downloaded overnight or through a download manager.
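
Download managers can pick up an interrupted transfer because HTTP allows a client to ask for only part of a file with a Range header. The following sketch shows how such a request might be built; it assumes the server supports range requests, and the function name and URL are hypothetical. No network access takes place in the sketch itself.

```python
import os
import urllib.request


def resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a request for only the bytes not yet saved in partial_path."""
    request = urllib.request.Request(url)
    already = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    if already:
        # Ask the server to start at the first byte we do not have yet.
        request.add_header("Range", f"bytes={already}-")
    return request


# Hypothetical URL; with no partial file present, the whole file is requested.
request = resume_request("http://example.org/media/talk.mp3", "talk.mp3.part")
print(request.get_header("Range"))
```

A server that honours the header answers with status 206 and only the missing bytes, which the client then appends to the partial file.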

Progressive download is a download straight into a player (also by HTTP), such that the player can start to play content before it is fully downloaded. This has the advantage that you can start viewing the content quickly, and you can wait for the content to buffer sufficiently if your bandwidth isn't quite good enough. However, because the progressive download is into a player, the URL to the media itself can typically only be determined by looking at the HTML code of the surrounding page. Ideally you would provide visitors with the full link to the media content, so that they have the choice whether to watch the content as progressive download, or download the content directly.

Streaming has the advantage of using your bandwidth efficiently: you just consume the bandwidth for what you watch. However, if the user's available bandwidth is below that required to deliver the media, they won't be able to experience the content via streaming.

Generally speaking, you should offer a download, or a download in addition to other delivery methods.
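
The difference between these delivery methods can be quantified. If the media bitrate exceeds the available bandwidth, true streaming cannot keep up, while a progressive download can still play without interruption provided the viewer waits long enough before pressing play. The sketch below estimates that wait, assuming a constant media bitrate and a constant bandwidth; the function names are illustrative.

```python
def download_time_s(duration_s: float, media_kbps: float, bandwidth_kbps: float) -> float:
    """Time to fetch the whole clip."""
    return duration_s * media_kbps / bandwidth_kbps


def wait_before_smooth_playback_s(duration_s: float, media_kbps: float,
                                  bandwidth_kbps: float) -> float:
    """Head start needed so that playback never catches up with the download."""
    return max(0.0, download_time_s(duration_s, media_kbps, bandwidth_kbps) - duration_s)


CLIP_S = 20 * 60   # a 20 minute recording

for kbps in (32, 5):
    wait_min = wait_before_smooth_playback_s(CLIP_S, kbps, 20) / 60
    print(f"{kbps}kbps audio over 20kbps: wait {wait_min:.0f} minutes before playing")
```

At 20kbps, the 32kbps MP3 needs a 12 minute head start, whereas the 5kbps AMR-NB version can be played immediately.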

For presentation methods, we distinguish embedded content, linked content, and syndicated content.

Embedded content is content which is closely linked to the surrounding web page or application, and we'll discuss this further below.

Generally speaking, linking content is better than embedding, as you can warn viewers of the size before they attempt to obtain the content.

Delivering your content through RSS syndication (e.g. as a podcast) is an interesting delivery option. By default, episodes are retrieved in the background, and some podcast receivers, such as iTunes, support resuming downloads. This means that content can be retrieved quite robustly over longer periods of time.
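
A podcast is an RSS 2.0 feed in which each item carries an enclosure element giving the URL, the length in bytes, and the MIME type of the media file. The following sketch generates a minimal feed; the feed title, URLs, and episode data are hypothetical, and a real feed would normally carry further elements such as publication dates.

```python
import xml.etree.ElementTree as ET


def podcast_feed(title: str, link: str, description: str, episodes: list[dict]) -> str:
    """Return a minimal RSS 2.0 feed with one enclosure per episode."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode["title"]
        # The enclosure tells the podcast receiver what to download and how big it is.
        ET.SubElement(item, "enclosure", url=episode["url"],
                      length=str(episode["size_bytes"]), type=episode["mime"])
    return ET.tostring(rss, encoding="unicode")


# Hypothetical feed content, for illustration only.
feed_xml = podcast_feed(
    "Example lectures", "http://example.org/lectures/", "Low bandwidth lecture audio",
    [{"title": "Lecture 1", "url": "http://example.org/lectures/1.mp3",
      "size_bytes": 4_800_000, "mime": "audio/mpeg"}],
)
print(feed_xml)
```

Because the length is stated in the feed, a receiver can show the size of an episode before fetching it.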

The following table gives combinations of presentation and delivery methods, with a number of recommendations.

              Download     Progressive download    Streaming
Embedded      (N/A)        Don't auto-load         Don't auto-load
Linked        Give size    Give size + bitrate     Give bitrate
Syndicated    Good         (N/A)                   (N/A)
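
For linked content, the "give size" and "give bitrate" recommendations amount to putting a few facts into the link text. A small helper along the following lines could generate such labels consistently; the function names and the label layout are our own suggestion, not a fixed convention.

```python
def format_size(size_bytes: int) -> str:
    """Human-readable size in decimal units."""
    if size_bytes < 1_000_000:
        return f"{size_bytes / 1000:.0f}kB"
    return f"{size_bytes / 1_000_000:.1f}MB"


def link_label(title: str, media_format: str, size_bytes=None, bitrate_kbps=None) -> str:
    """Link text that tells the visitor what they are about to fetch."""
    details = [media_format]
    if size_bytes is not None:        # downloads and progressive downloads
        details.append(format_size(size_bytes))
    if bitrate_kbps is not None:      # progressive downloads and streams
        details.append(f"{bitrate_kbps}kbps")
    return f"{title} ({', '.join(details)})"


print(link_label("Lecture 1", "MP3", size_bytes=4_800_000, bitrate_kbps=32))
print(link_label("Lecture 1", "AMR-NB", size_bytes=750_000))
```

The first call yields "Lecture 1 (MP3, 4.8MB, 32kbps)" and the second "Lecture 1 (AMR-NB, 750kB)".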

Embedded Content

Embedded content poses particular problems, and we close this section by giving further recommendations. Embedded content can refer to progressive download or streaming content that is used through a player which itself is embedded in an HTML page through an object or embed tag (or both). In a slightly narrower sense, by embedding we mean content embedded into a page that also has other useful content on it, such as text and images. The problem with embedding is that it limits access to such other useful parts of the page, which on their own would have been perfectly accessible over low bandwidth.

Recommendation 1: If you wish to embed movie content, consider moving the embedded content to a separate HTML page that only holds the embedded content, and then linking to that page (warning the user of the size). This means that the other low bandwidth compatible content on the main page can still be accessed.

Recommendation 2: Don't auto-load. We recommend that you do not let embedded players auto-load content. Auto-loading means that all content required by the page is loaded when the page loads. For instance, when you load a web page, the images typically just load, unless you have asked your browser not to do so. For multimedia, there is no such "don't auto-load movies" setting. If your multimedia objects load automatically, your page will be impossible to load over low bandwidth. You should therefore avoid audio auto-playing in the background, or players simply starting to play content. If you embed multimedia content and don't let it auto-load, then only the player required for the content is loaded, and the user can decide which pieces of actual media to play.

You should bear in mind that auto-loading content poses problems even in high bandwidth settings: auto-loading content can cause the browser to become unresponsive, or to issue warning messages.

Flash Movies

The issue of auto-loading is also relevant for Flash video. Flash applications are discussed below, but for now we consider a Flash application that simply has a linear movie in it. Often, the movie is put into an SWF file, and in this case the whole file needs to be downloaded before the movie is available, just like an image. The movie might be tens if not hundreds of MB, which clearly makes your page unusable in a low bandwidth context, and potentially in any context. A much better alternative is to provide a player (a much smaller SWF) that then loads a Flash video file (FLV), typically as progressive download.

For instance, FlowPlayer is a 74kB SWF file that can play Flash video[2]. If you have configured FlowPlayer correctly (as shown in the FlowPlayer example below), the movie content starts to load only when your visitors click 'play'. When the page loads, only the player itself is loaded, i.e. the size of the usual HTML page is increased by just 74kB for the player. The player, as a download, can also be cached by the browser. Note that at a bandwidth of 20kbps even this player takes about 30 seconds to download (74kB is 592 kilobits), so to stay within the target of 10 second page load times, even the FlowPlayer should be on a separate page, with any link to it stating the size of the page.
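
The 10 second target can also be turned around into a size budget for a page. The following sketch assumes decimal units and the 20kbps bandwidth used throughout; the function name is illustrative.

```python
def max_page_bytes(target_seconds: float, bandwidth_kbps: float) -> float:
    """Largest page (in bytes) that loads within target_seconds."""
    return target_seconds * bandwidth_kbps * 1000 / 8   # bits -> bytes


budget = max_page_bytes(10, 20)
player_bytes = 74_000   # size of the player SWF quoted above

print(f"Budget for a 10 second load at 20kbps: {budget / 1000:.0f}kB")
print(f"Player fits the budget: {player_bytes <= budget}")
```

The budget comes to 25kB, so the 74kB player alone is roughly three times too large for a page that should load in 10 seconds.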

Example: Using a linked flash video FlowPlayer

<object type="application/x-shockwave-flash" data="FlowPlayer.swf" 
width="512" height="311" id="FlowPlayer">
...
<param name="flashvars" value="config={videoFile: 'mymovie.flv', 
autoPlay: false, 
autoBuffering: false}" />
</object>

Similar remarks hold with regard to playing Flash movies as part of Flash applications: you can use the same technique as above to load movies into your Flash application only when they are required. Also, always give visitors play controls, including a rewind button. Over low bandwidth the first playback may stutter while the content loads; with controls, once the user has "stuttered" their way through the content and the movie is fully loaded, they can rewind and watch it in one go.

Animation

A linear Flash animation is to a live action movie what a vector graphics image is to a compressed bitmap image. Creation of animation (in Flash or otherwise) is labour intensive, but can be an interesting way of delivering great looking moving images over lower bandwidths. As with all linear media, provide an audio-only alternative, do not embed, and provide a text alternative of the audio if possible.

Example: http://www.bbc.co.uk/cult/tamaraswift/dramatisation/

The Ghosts of Albion animated adventure is presented in several parts. Each part is presented in low and high version Flash animations, as well as audio only, as follows:

Animation   Low | High   (Flash)  
Sound only  Low | High   (RealPlayer)  

Multimedia Objects

Having discussed various elements of multimedia, we now consider how these media elements get combined into a piece of multimedia. There are at least three choices:

  • Elements provided individually through linking ("layered delivery" paradigm)
  • HTML with embedded multimedia (not auto loading)
  • All in one application (e.g. a Flash animation, or an executable binary)

If your multimedia elements do not need to be tightly integrated, but can be provided individually "as a set" e.g. on a web page, then links to individual elements are an appropriate way of delivering your materials. Your user can choose which elements (and which versions of these) to pick, and can "upgrade" to better quality versions for assets that are more interesting. You might be able to provide an RSS feed with enclosures (podcast), through which the user can use an application to manage the download of the assets.

If you wish to integrate your media more tightly, then the second option, embedded multimedia with no auto-load, means that the player and any plugins it needs have to be loaded, but the media itself is not loaded until required. This is also a good way of providing alternative versions. This can be done very elegantly if media settings are remembered, so that the user only chooses once, and is then presented with future media elements in the chosen format. The above comments on presenting individual video and audio elements apply.

The third option concerns whole applications. If you have a whole application (such as a Flash application), the user has no choice over which parts of the application to watch, and what to watch first, until the application has downloaded fully. Thus, large multimedia applications, such as Flash applications, pose particular problems:

  • They are embedded in web pages, so cannot be easily downloaded over time, and then viewed offline.
  • It is difficult to provide alternative versions, as the authoring process commonly only allows for providing multimedia assets in one particular setting.

In view of this:

  • Consider whether you need to bundle your assets into an application (e.g. Flash), or whether you can use alternative ways of bundling your objects together, such as a web page ("layered delivery").
  • Where you do need an application, make these applications available for download, so that users can download them in their own time.
  • Choose sensible bitrates for audio, and do provide an alternative version of the application if possible.
  • If your application naturally separates into different elements (such as different lessons in a course, or different stages in a game), provide those separately.
  • Where your pages or applications contain embedded video or audio content as progressive downloads, do not auto-load, and provide rewind buttons so that the user is in control of the delivery of the content.

Summary

General Recommendations for Time-based Media

  • Provide alternatives, and a range of formats
  • Use the 'video-audio-text cascade':
    • Provide audio alternatives to video
    • Provide a text alternative to video/audio
  • When using an encoding application to produce a file for download, check the settings for quality and/or compression (the default settings of some applications will produce large files)

Recommendations for Audio

  • Check your encoding application settings.
  • Provide MP3 at 32kbps
  • If you can, also provide AMR narrowband at 5kbps
  • Use a single audio channel (mono)
  • If you can, produce audio with little background noise, using the best equipment you have available.
  • Normalise and audio-range compress the audio prior to encoding

Recommendations for Video

  • Pick a format that has good video compression, typically a recent video standard
  • Use two-pass encoding
  • Consider video for mobile devices

Recommendations for Media Delivery

  • Let users download.
  • Where possible, link to objects, and give sizes and bitrates where applicable
  • If you embed media, do not auto-load
  • If possible, use RSS to deliver your content as a podcast

Recommendations for Multimedia Objects

  • Make multimedia applications available for download
  • Choose low bitrates for audio
  • Rather than one large application, consider providing several smaller ones

Footnotes

[1] File size is usually given in bytes (B) whereas the speed of an internet connection is generally quoted in terms of bits (b) per second. This can sometimes lead to confusion. A 20kB file will take 8 seconds to download at 20kbps because 1 byte is equivalent to 8 bits.

[2] http://flowplayer.org/