Everything is 1, Except for the 0
First published on 21 June 2021
Last updated on 2 March 2022
Introduction
Since childhood I have been told that computers understand nothing but ones and zeros. If this is true, how is it possible that computers can deal with text, sound, and image data? This blog post addresses the question of how ones and zeros add up to something that we consider meaningful data. The post is divided into two parts: in a first step, I shine a light on numeral representations of input and output with the help of the binary and hexadecimal systems. Both numeral systems are prerequisites for a basic understanding of how traditional computers work. Subsequently, I examine the three cases of text, sound, and image data. In doing so, I provide illustrative examples for each case that admittedly reduce the complexity of modern applications but nevertheless help to clarify the fundamental operating principles that underlie digital representations of analogue data.
The Binary System
To conceive the structure of electronic data, understanding the binary system is indispensable because by means of it, we can electronically represent two simple but essential states: on and off. Imagine you were a bird that could only produce a single sound; how would you communicate your concerns and ambitions to your fellow birds? Exactly, stringing together these uniform sounds would offer you a basic way to encode your ideas. A single sound could mean that you are hungry, while a succession of two sounds could mean that you are talkative. In the same way, we can use ones and zeros to encode letters. Let us say, for instance, that 1 (or “on”) represents the letter “A” and 10 (or “on” and “off”) the letter “B”. We could then write 1 10 10 1 (read from left to right) to refer to the band (or anything else) called “ABBA”. To take a real-life example, we can extend this idea by saying that 01000001 stands for the letter “A” and 01000010 for the letter “B”. We now have a fixed length for each character (eight digits), which we attain by padding the seven-digit binary numbers 1000001 and 1000010 with a leading 0. (We will get back to this example in the second part of this post.)
Bit | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 |
---|---|---|---|---|---|---|---|---|
Value (2^x) | 2^7 | 2^6 | 2^5 | 2^4 | 2^3 | 2^2 | 2^1 | 2^0 |
Value (dec.) | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
Value (hex.) | 80 | 40 | 20 | 10 | 8 | 4 | 2 | 1 |
The eight-digit codes that we just made up to refer to the letters “A” and “B” can also represent numbers in the binary system, namely the decimal numbers 65 and 66. The table above illustrates how exactly we get there. The first “bit” of our series of numbers has the value 2^0, i.e. 1, in the decimal system, and the second “bit” has the value 2^1, which corresponds to 2 in the decimal system. In the same way, the following “bits” have the values 2^2, 2^3, 2^4 and so on. This is very similar to the decimal system, in which we have ten digits and the first “bit” has the value 10^0, the second the value 10^1 etc. Thus, we write 23, i.e. 3 times 10^0 plus 2 times 10^1, to refer to the decimal number twenty-three. Another numeral system is the hexadecimal system, which uses base 16 to represent numbers.
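To make this place-value arithmetic concrete, here is a minimal Python sketch that converts our made-up eight-digit code for “A” into its decimal value (the variable names are my own, not part of any standard):

```python
# A minimal sketch of the place-value logic described above:
# the bit at position i (counted from the right, starting at 0) contributes 2^i.
bits = "01000001"  # our made-up eight-digit code for the letter "A"

value = 0
for i, bit in enumerate(reversed(bits)):  # walk from the rightmost bit
    value += int(bit) * 2**i

print(value)         # 65
print(int(bits, 2))  # 65 -- Python's built-in conversion agrees
```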
The Hexadecimal System
Like Roman numerals, binary numbers tend to be long and confusing for the human eye, especially when it comes to real-life data. The advantage of the hexadecimal system is that two-digit numbers, with each digit ranging from 0–F (with A to F standing for the numbers 10 to 15), can display the same range of numbers as eight-digit binary figures. Hence, the smallest two-digit numeral in hexadecimal is 00, while the largest is FF (F or 15 times 16 plus F or 15 times 1, i.e. 255). If we want to examine the bit strings that build up a file, it is common to use hex editors, which display those basal binary number strings in hexadecimal notation. This not only reduces the length of the strings by a factor of four but also makes them easier to read, provided that you understand the concept of different numeral systems.
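A short Python sketch illustrates how eight binary digits collapse into two hexadecimal ones:

```python
# One byte (eight bits) always fits into exactly two hexadecimal digits.
byte = 0b01000001            # our code for "A"
print(format(byte, "02X"))   # 41

# The largest two-digit hexadecimal numeral: F*16 + F*1 = 255
print(0xFF)                  # 255
print(format(0xFF, "08b"))   # 11111111 -- eight binary digits, two hex digits
```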
Text
After my comments on the binary and hexadecimal systems, let us continue by looking at text data and how it can be digitally encoded in computers.
American Standard Code for Information Interchange (ASCII)
The American Standard Code for Information Interchange (ASCII or US-ASCII) is a basic character encoding standard based on seven-bit teleprinter code. First published in 1963, ASCII received its most recent update in 1986. Since then, it encodes 128 specified characters, ninety-five of which are printable, into seven-bit integers. The printable characters include the digits 0 to 9, the lowercase letters a to z, the uppercase letters A to Z, and punctuation symbols. Beyond that, ASCII includes 33 non-printing control codes, which originated with Teletype machines. The chart below shows how these characters are encoded. The percent sign ‘%’, for example, is represented by binary 0010 0101 or hexadecimal 25 (which is 37 in the decimal system). In the same way, binary 0010 1010 or hexadecimal 2A encodes the asterisk sign ‘*’. As mentioned above, our example string “ABBA” is represented by binary 0100 0001 for “A”, two times 0100 0010 for “B”, and again 0100 0001 for “A”: 01000001 01000010 01000010 01000001 (or hexadecimal 41 42 42 41).
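The same encoding can be reproduced in a few lines of Python:

```python
# The ASCII codes for our example string "ABBA" in the three numeral
# systems discussed above: decimal, binary, and hexadecimal.
for char in "ABBA":
    code = ord(char)  # the character's ASCII code point
    print(char, code, format(code, "08b"), format(code, "02X"))

# A 65 01000001 41
# B 66 01000010 42
# B 66 01000010 42
# A 65 01000001 41
```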
Unicode
Since ASCII cannot encode more than 128 characters, a far more comprehensive standard was developed in the late 1980s, which we refer to as Unicode. As a de facto information technology standard, Unicode allows for the consistent encoding of text data. It comprises most of the world’s writing systems and includes even emojis and other pictographic sets. The Unicode codespace consists of 17 so-called planes, numbered 0 to 16. Each plane is a continuous group of 65,536 (2^16) code points, ranging from 0000 to FFFF in hexadecimal, so that the 17 planes can accommodate 1,114,112 code points. With most code points being unallocated, Unicode defines 143,859 characters (as of version 13.0) covering 154 modern and historic scripts. The first and most used plane, plane 0, is the Basic Multilingual Plane (BMP); it contains characters for almost all modern languages and a large number of symbols. For instance, the letter “A” is represented by the code point U+0041. Unicode is implemented by different character encodings, which include two mapping methods: the Unicode Transformation Format (UTF) encodings and the Universal Coded Character Set (UCS) encodings. UTF-8 and UTF-16 are the most commonly used encodings, which map code points to unique sequences of bits (or bytes). For UTF encodings, the numbers in the names indicate the number of bits per code unit. Hence, UTF-8 uses one to four bytes for each code point, while UTF-16 uses one or two 16-bit code units per code point.
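In Python, code points can be inspected directly, which makes the correspondence between characters and numbers tangible:

```python
# A code point is just a number; U+0041 is hexadecimal 41, i.e. decimal 65.
print(ord("A"))      # 65
print("\u0041")      # A -- the escape sequence names the code point directly
print(chr(0x1F600))  # 😀 -- a code point outside the Basic Multilingual Plane
```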
UTF-8
UTF-8 can encode all 1,112,064 valid character code points in Unicode using one to four one-byte (8-bit) code units. Since 2009, UTF-8 has been the most common encoding for the World Wide Web. Beyond that, virtually all modern e-mail programs are able to display and create mail using UTF-8. The table below illustrates the basic structure of UTF-8 encoded text; a short code sketch after the table makes it concrete. UTF-8 is backward compatible with ASCII: the first 128 characters (numbered U+0000 to U+007F) use the same seven-bit encoding as ASCII, preceded by a leading 0. If two bytes are used to encode one character, the first eight-digit binary figure starts with two 1s followed by one 0; if three bytes are used to encode one character, the first eight-digit binary figure starts with three 1s followed by one 0, and so on. All following bytes start with 10, which marks them as non-leading bytes. Up to three bytes are needed for characters in the Basic Multilingual Plane, which contains virtually all characters in common use. Four bytes are needed for characters in the other 16 planes of Unicode.
Number of bytes | First code point | Last code point | Byte 1 | Byte 2 | Byte 3 | Byte 4 |
---|---|---|---|---|---|---|
1 | U+0000 | U+007F | 0******* | | | |
2 | U+0080 | U+07FF | 110***** | 10****** | | |
3 | U+0800 | U+FFFF | 1110**** | 10****** | 10****** | |
4 | U+10000 | U+10FFFF | 11110*** | 10****** | 10****** | 10****** |
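As an illustration (the characters chosen here are my own examples, not from the table), Python’s built-in encoder shows how the number of bytes grows with the code point:

```python
# The number of UTF-8 bytes grows with the code point, as in the table above.
for char in ("A", "é", "€", "😀"):
    encoded = char.encode("utf-8")
    print(f"U+{ord(char):04X}", len(encoded), "byte(s):",
          " ".join(format(b, "08b") for b in encoded))

# U+0041 1 byte(s): 01000001
# U+00E9 2 byte(s): 11000011 10101001
# U+20AC 3 byte(s): 11100010 10000010 10101100
# U+1F600 4 byte(s): 11110000 10011111 10011000 10000000
```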
Sound
To digitally represent sampled analogue (audio) signals, a method called pulse-code modulation (PCM) is used. Today, PCM is the standard form of digital audio in computers. In a PCM stream, the amplitude of the analogue signal is sampled regularly at uniform intervals, and each sample is assigned to the nearest value within a range of digital steps. Hence, the PCM stream’s fidelity to the original analogue signal is determined by two basic properties: the sampling rate and the bit depth. The sampling rate indicates the number of times per second that amplitude samples are taken, and the bit depth determines the number of possible digital values that can be used to represent each amplitude sample. The amplitude is typically stored as either an integer or a floating point number, encoded as a binary number with a fixed number of digits: the sample’s bit depth. The graphic below illustrates an example of 4-bit PCM (16 different binary-coded possibilities) showing quantization and sampling (blue) of a signal (red).
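The following Python sketch mimics this process for the 4-bit case; the sine signal and the parameter values are invented for illustration:

```python
import math

# A minimal sketch of 4-bit PCM: sample a 1 Hz sine wave and quantize
# each amplitude to one of 2^4 = 16 levels.
sampling_rate = 8        # samples per second (unrealistically low, for readability)
bit_depth = 4
levels = 2**bit_depth    # 16 possible digital values

for n in range(sampling_rate):
    t = n / sampling_rate
    amplitude = math.sin(2 * math.pi * t)               # analogue signal in [-1, 1]
    sample = round((amplitude + 1) / 2 * (levels - 1))  # nearest of the 16 steps
    print(f"t={t:.3f}s  amplitude={amplitude:+.3f}  sample={sample:2d}  ({sample:04b})")
```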
Bitmap Images
A bitmap image contains pixel data (as opposed to vector images), whereby each pixel of a bitmap image is defined by a single bit or a group of bits. Typically, these files have the .bmp extension. In contrast to images with the .png or .jpg extensions, bitmap images are usually uncompressed, which means that similar pixels are not grouped to decrease overall file size. Instead, every pixel has its own bit(s) in the file. As mentioned in earlier sections, every file in a computer is made of binary numbers, whether that is an image file or a text file. The first screenshot below shows the different sections that make up a bitmap image, containing information about the metadata, color palette, and actual pixel data. The second screenshot contains the same bytes; however, this time they are displayed without annotations in a hex viewer.
The first part of a bitmap image is the so-called BITMAPFILEHEADER, which contains information about the bitmap file. It has a size of 14 bytes (with each byte displayed as a two-digit hexadecimal number). The second part is the so-called BITMAPINFOHEADER, which contains another 40 bytes of metadata about the bitmap image. The third part of a bitmap image is the optional COLORTABLE, which is absent in our example. The following block contains the actual PIXELDATA, with three bytes representing one pixel. The first byte defines the individual blue value of the pixel, while the second and the third byte define the green and red values, respectively. Hence, we are dealing with a 24-bit image because three bytes (24 bits) define one pixel. Since the bitmap format follows bottom-up scanning, the first scan line in the file is the last row of the bitmap image. In our example, I reconstructed the logo of my university (Bielefeld University), which is composed of a black square on a white background with the upper right corner missing.
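To show how these sections fit together, here is a minimal Python sketch that writes such a 24-bit bitmap by hand; write_bmp is a hypothetical helper of my own, and the 4×4 “logo” is a toy stand-in for the real image:

```python
import struct

def write_bmp(path, pixels):
    """Write a 24-bit uncompressed BMP; pixels is a list of rows of (r, g, b)."""
    height = len(pixels)
    width = len(pixels[0])
    row_padding = (4 - (width * 3) % 4) % 4            # rows are padded to multiples of 4 bytes
    pixel_data_size = (width * 3 + row_padding) * height
    file_size = 14 + 40 + pixel_data_size

    with open(path, "wb") as f:
        # BITMAPFILEHEADER (14 bytes): signature, file size, reserved, pixel data offset
        f.write(struct.pack("<2sIHHI", b"BM", file_size, 0, 0, 54))
        # BITMAPINFOHEADER (40 bytes): header size, width, height, planes, bit depth,
        # compression (0 = none), image size, resolution, palette info
        f.write(struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                            0, pixel_data_size, 2835, 2835, 0, 0))
        # PIXELDATA: rows are stored bottom-up, each pixel as blue, green, red
        for row in reversed(pixels):
            for r, g, b in row:
                f.write(bytes((b, g, r)))
            f.write(b"\x00" * row_padding)

# A black square on a white background, upper right corner missing
B, W = (0, 0, 0), (255, 255, 255)
write_bmp("logo.bmp", [
    [B, B, B, W],
    [B, B, B, B],
    [B, B, B, B],
    [B, B, B, B],
])
```

Opening the resulting logo.bmp in a hex viewer reveals exactly the sections discussed above: 14 file header bytes, 40 info header bytes, and the bottom-up pixel rows in blue-green-red order.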
Conclusion
In this blog post, I addressed the question of how ones and zeros add up to something that we consider meaningful data. The post was divided into two parts: in a first step, I shone a light on numeral representations of input and output with the help of the binary and hexadecimal systems. Subsequently, I examined the three cases of text, sound, and (bitmap) image data. First, looking at text data, I introduced ASCII, Unicode, and UTF-8. Second, I outlined the standard form of digital audio in computers: PCM. Third, I created a bitmap image from scratch. I showed that every file in a computer is made of binary numbers, whether that is a text file, a sound file, or an image file. In a nutshell: everything is 1, except for the 0.