Steganography

Unlike cryptography, where the aim is to conceal the contents of a message, steganography is the practice of concealing the existence of a message. In the digital age, a message takes the form of a bit string, and is typically hidden within a file encoding an image, audio or video. This page focuses specifically on the task of hiding one image inside another.

This page is statically served with GitHub Pages. The corresponding repository is located here.

How it works

Information can be injected into image files in a number of ways. Some image file formats (such as PNG) store metadata; a simple approach is to hide information in these metadata headers. Alternatively, with the JPEG format, information can be hidden in the quantised coefficients produced during JPEG compression. The technique we use (and describe here) operates in the spatial domain, encoding information in the raw pixel colour values. For this reason, provided any compression is lossless, this technique is file format agnostic.

An image is a grid of coloured pixels. Each pixel is described by a red, a green and a blue component, each taking a value between 0 and 255 and combining additively, as light would. Lower values correspond to less of a colour, while higher values correspond to more of it. For example, green is encoded by 0 red, 255 green and 0 blue, and cyan by 0 red, 255 green and 255 blue. A pixel can therefore be encoded in three bytes: one byte per colour component.

Importantly, the Least Significant Bits (LSBs) of each of these bytes have little visual impact on the image. For example, if the last two bits of each byte were cleared, each resulting colour component would differ from the original by at most 3. It is in these LSBs that the payload (i.e., the image to hide) is concealed. Specifically, for some fixed n with 0 < n < 8, the n Most Significant Bits (MSBs) of each byte in the base (carrier) image are kept, and the remaining (8 - n) LSBs are replaced with the (8 - n) MSBs of the corresponding byte in the image to hide. The following diagram illustrates this encoding scheme for n = 5, where A shows the region allocated to the MSBs of the base image, and B shows the region allocated to the MSBs of the image to hide.

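The choice of n is therefore a tradeoff between the visual impact on the base image and the size of the payload: a smaller n leaves more bits per byte for the payload, so the hidden image is recovered more faithfully, but the base image is altered more visibly.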
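The per-byte scheme can be sketched in a few lines of Python. This is an illustrative sketch rather than the code this site runs; the function names `encode_byte` and `decode_byte` are hypothetical, and each argument is assumed to be a single colour component in the range 0 to 255.

```python
def encode_byte(base: int, hidden: int, n: int) -> int:
    """Keep the n MSBs of `base`; fill the other (8 - n) bits with the MSBs of `hidden`."""
    if not 0 < n < 8:
        raise ValueError("n must satisfy 0 < n < 8")
    keep_mask = (0xFF << (8 - n)) & 0xFF   # n ones followed by (8 - n) zeros
    return (base & keep_mask) | (hidden >> n)


def decode_byte(encoded: int, n: int) -> int:
    """Move the (8 - n) LSBs back to the MSB positions; the lost bits become zero."""
    return (encoded << n) & 0xFF


base, hidden, n = 0b10110110, 0b11100101, 5
encoded = encode_byte(base, hidden, n)    # 0b10110111
recovered = decode_byte(encoded, n)       # 0b11100000
```

Under this sketch, a base component changes by at most 2^(8 - n) - 1 and a recovered hidden component differs from the original by at most 2^n - 1, which is the tradeoff controlled by n.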

The table below demonstrates this tradeoff by hiding the above coastal scene (sea.png) in the above image of a twisty puzzle (cube.png) with different values of n.

`n`	`Encoded image`	`Decoded image`
1
2
3
4
5
6
7

Securing the hidden image

Though the payload is successfully hidden using the above technique, it is not encrypted. Therefore, an adversary who simply attempts to decode every image they come across will retrieve the payload in full. There are many methods of combatting this; this site implements a simple one which preserves the payload size. A password is used to seed a pseudorandom number generator (PRNG); this site uses a script by David Bau. The PRNG generates one key bit for each bit to hide. Then, before replacing the LSBs of the base image, the MSBs of the image to hide are first XORed (⊕) with the corresponding pseudorandom key bits. This is effectively a version of the One Time Pad (OTP) encryption scheme in which the pad is pseudorandom rather than truly random, and hence has the following problems:

The security of the scheme relies on true randomness, which is not provided by the PRNG. Any predictability in the PRNG may leak information about the payload.
The security of this scheme can be undermined by an insecure password exchange.
This scheme provides no authentication, and is therefore susceptible to alteration.
Passwords can only be used once; password reuse compromises the security of this scheme.
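The masking step described above can be sketched as follows. This is a hedged illustration only: Python's `random.Random` stands in for the PRNG script this site uses, it is not cryptographically secure, and the names `keystream` and `mask` are hypothetical. The payload values are assumed to be the (8 - n) MSBs already extracted from the image to hide.

```python
import random


def keystream(password: str, count: int, bits: int) -> list:
    """Generate `count` key values of `bits` bits each from a password-seeded PRNG."""
    rng = random.Random(password)   # stand-in PRNG (assumption), not secure
    return [rng.getrandbits(bits) for _ in range(count)]


def mask(payload_msbs: list, password: str, bits: int) -> list:
    """XOR each payload value with the matching key value.

    XOR is its own inverse, so the same call also removes the mask.
    """
    key = keystream(password, len(payload_msbs), bits)
    return [p ^ k for p, k in zip(payload_msbs, key)]


payload = [0b111, 0b010, 0b101, 0b000]   # (8 - n) = 3 MSBs per byte, i.e. n = 5
masked = mask(payload, "hello", bits=3)
unmasked = mask(masked, "hello", bits=3)
```

Because each masked value has the same number of bits as the value it replaces, the payload size is unchanged, as the text notes.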

To demonstrate the problem of password reuse in particular, consider two images, i₁ and i₂. Suppose we hide both of these images with the same password. When seeded with this password, the PRNG will generate the bit string k. Therefore, the concatenated LSBs of the encoded image hiding i₁ equal c₁ = i₁ ⊕ k, and the concatenated LSBs of the encoded image hiding i₂ equal c₂ = i₂ ⊕ k. On their own, c₁ and c₂ are reasonably secure, provided k is not known. However, if both c₁ and c₂ are known, the security is heavily undermined: c₁ ⊕ c₂ = ( i₁ ⊕ k ) ⊕ ( i₂ ⊕ k ) = i₁ ⊕ i₂, as (1) XOR is associative and commutative, so the two copies of k can be brought together; and (2) the XOR of a value with itself is zero. To illustrate this, we introduce a new third image, coffee.png:

The following two images, c1.png and c2.png, encode (within sea.png) cube.png and coffee.png respectively, both using the password 'hello'.

When either image is decoded without the password, the result is seemingly random noise:

However, as shown below, the XOR of these decoded images clearly leaks information. Indeed, as expected, the MSBs of this image match the MSBs of the XOR of cube.png and coffee.png.
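The same leak can be reproduced numerically. The sketch below assumes Python's `random.Random` as a stand-in for the site's PRNG and uses a hypothetical helper, `xor_with_keystream`; the short lists stand in for the 3-bit payload values of two different images.

```python
import random


def xor_with_keystream(values: list, password: str, bits: int = 3) -> list:
    """XOR each value with a password-seeded pseudorandom key value (stand-in PRNG)."""
    rng = random.Random(password)
    return [v ^ rng.getrandbits(bits) for v in values]


i1 = [7, 2, 5, 0]   # payload bits of the first hidden image
i2 = [1, 6, 5, 3]   # payload bits of the second hidden image

# The same password yields the same keystream k for both payloads.
c1 = xor_with_keystream(i1, "hello")
c2 = xor_with_keystream(i2, "hello")

# k cancels out, leaving the XOR of the two payloads.
leak = [a ^ b for a, b in zip(c1, c2)]
assert leak == [a ^ b for a, b in zip(i1, i2)]
```

No knowledge of the password is needed for the final step, which is why a password should not be reused across payloads in this scheme.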