Computers must not only be able to carry out computations; they must be able to do them quickly and efficiently. There are several data representations, typically for integers, real numbers, characters, and logical values.
Number Representation in Various Numeral Systems
A numeral system is a collection of symbols used to represent small numbers, together with a system of rules for representing larger numbers. Each numeral system uses a set of digits. The number of various unique digits, including zero, that a numeral system uses to represent numbers is called base or radix.
Base – b numeral system
b basic symbols (or digits) corresponding to natural numbers between 0 and b − 1 are used in the representation of numbers.
To generate the rest of the numerals, the position of each symbol in the numeral is used. The symbol in the last (rightmost) position has its own value, and each move one position to the left multiplies its value by b.
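We write a number in the numeral system of base b by expressing it in the form
$N_{(b)} = a_n a_{n-1} \dots a_1 a_0 . a_{-1} a_{-2} \dots a_{-m}$
$N_{(b)}$, with n+1 digits for the integer part and m digits for the fractional part, represents the sum
$N = a_n b^n + a_{n-1} b^{n-1} + \dots + a_1 b^1 + a_0 b^0 + a_{-1} b^{-1} + \dots + a_{-m} b^{-m}$
in the decimal system. Note that $a_i$ is the digit in position $i$.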
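The positional sum can be sketched in a few lines of Python. This is only an illustration: the function name `positional_value` and the choice to pass digits as lists of integers are assumptions made for this example.

```python
def positional_value(int_digits, frac_digits, b):
    """Evaluate a numeral in base b.

    int_digits  -- digits of the integer part, most significant first
    frac_digits -- digits of the fractional part, left to right
    """
    value = 0
    # Integer part: each step shifts the running value one place left (times b).
    for d in int_digits:
        value = value * b + d
    # Fractional part: weights are b**-1, b**-2, ...
    weight = 1 / b
    for d in frac_digits:
        value += d * weight
        weight /= b
    return value


print(positional_value([5, 2, 4, 6], [], 10))  # decimal 5246
print(positional_value([1, 0, 1], [1], 2))     # binary 101.1
print(positional_value([10, 11], [], 16))      # hexadecimal AB
```

Digits above 9 are passed as their numeric values (A is 10, B is 11), which keeps the function independent of any particular digit symbols.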
Decimal, binary, octal, and hexadecimal are commonly used numeral systems. The decimal system has ten as its base. It is the most widely used numeral system, because humans have four fingers and a thumb on each hand, giving a total of ten digits over both hands.
Switches, mimicked by their electronic successors built of vacuum tubes, have only two possible states: “open” and “closed”. Substituting open=1 and closed=0 yields the entire set of binary digits. Modern computers use transistors that represent two states with either high or low voltages. Binary digits are arranged in groups to aid in processing, and to make the binary numbers shorter and more manageable for humans. Thus base 16 (hexadecimal) is commonly used as shorthand. Base 8 (octal) has also been used for this purpose.
Decimal notation is the writing of numbers in the base-ten numeral system, which uses various symbols (called digits) for no more than ten distinct values (0, 1, 2, 3, 4, 5, 6, 7, 8 and 9) to represent any number, no matter how large. These digits are often used with a decimal separator which indicates the start of a fractional part, and with one of the sign symbols + (positive) or − (negative) in front of the numerals to indicate sign.
The decimal system is a place-value system. This means that the place or location where you put a numeral determines its corresponding numerical value. A two in the one’s place means two times one, or two. A two in the one-thousand’s place means two times one thousand, or two thousand.
The place values increase from right to left. The first place just before the decimal point is the one’s place, the second place or next place to the left is the ten’s place, the third place is the hundred’s place, and so on.
The place-value of the place immediately to the left of the “decimal” point is one in all place-value number systems. The place-value of any place to the left of the one’s place is a whole number computed from a product (multiplication) in which the base of the number system is repeated as a factor one less number of times than the position of the place.
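For example, 5246 can be expressed as follows:
$5246 = 5 \times 10^3 + 2 \times 10^2 + 4 \times 10^1 + 6 \times 10^0$
$5246 = 5 \times 1000 + 2 \times 100 + 4 \times 10 + 6 \times 1$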
The place-value of any place to the right of the decimal point is a fraction computed from a product in which the reciprocal of the base (a fraction with one in the numerator and the base in the denominator) is repeated as a factor exactly as many times as the place is to the right of the decimal point. For example, $0.25 = 2 \times 10^{-1} + 5 \times 10^{-2}$.
The binary number system is base 2 and therefore requires only two digits, 0 and 1. The binary system is useful for computer programmers, because it can be used to represent the digital on/off method in which computer chips and memory work.
A binary number can be represented by any sequence of bits (binary digits), which in turn may be represented by any mechanism capable of being in two mutually exclusive states.
Counting in binary is similar to counting in any other number system. Beginning with a single digit, counting proceeds through each symbol, in increasing order. Decimal counting uses the symbols 0 through 9, while binary only uses the symbols 0 and 1.
When the symbols for the first digit are exhausted, the next-higher digit (to the left) is incremented, and counting starts over at 0. A single bit can represent one of two values, 0 or 1. Binary numbers are convertible to decimal numbers.
Here’s an example of a binary number,, and its representation in the decimal notation
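As an illustration, the expansion below uses the binary number 101101, which is chosen for this sketch only. Each bit is weighted by the power of two for its position:

```latex
101101_2 = 1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0
         = 32 + 0 + 8 + 4 + 0 + 1
         = 45_{10}
```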
The hexadecimal system is base 16. Therefore, it requires 16 digits. The digits 0 through 9 are used, along with the letters A through F, which represent the decimal values 10 through 15. Here is an example of a hexadecimal number and its decimal equivalent:
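A worked expansion may help here. The number 2AF is an assumption chosen for this sketch; A and F stand for the decimal values 10 and 15:

```latex
2AF_{16} = 2 \cdot 16^2 + 10 \cdot 16^1 + 15 \cdot 16^0
         = 512 + 160 + 15
         = 687_{10}
```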
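The hexadecimal system (often called the hex system) is useful in computer work because it is based on powers of 2. Each digit in the hex system is equivalent to a four-digit binary number. The table below shows the hex/decimal/binary equivalents.
|Hexadecimal Digit||Decimal Equivalent||Binary Equivalent|
|0||0||0000|
|1||1||0001|
|2||2||0010|
|3||3||0011|
|4||4||0100|
|5||5||0101|
|6||6||0110|
|7||7||0111|
|8||8||1000|
|9||9||1001|
|A||10||1010|
|B||11||1011|
|C||12||1100|
|D||13||1101|
|E||14||1110|
|F||15||1111|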
Binary is also easily converted to the octal numeral system, since octal uses a radix of 8, which is a power of two (namely $2^3$, so it takes exactly three binary digits to represent an octal digit). The correspondence between octal and binary numerals is the same as for the first eight digits of hexadecimal in the table above. Binary 000 is equivalent to the octal digit 0, binary 111 is equivalent to octal 7, and so forth.
Converting from octal to binary proceeds in the same fashion as it does for hexadecimal:
And from octal to decimal:
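The grouping idea can be sketched in Python: pad the bit string on the left, then translate each group of three bits (octal) or four bits (hexadecimal) into one digit. The helper name `regroup_binary` and the sample values are assumptions made for this illustration.

```python
def regroup_binary(bits, group):
    """Convert a binary string to base 2**group by grouping bits from the right.

    group=3 gives octal, group=4 gives hexadecimal.
    """
    digits = "0123456789ABCDEF"
    # Pad on the left so the length is a multiple of the group size.
    pad = (-len(bits)) % group
    bits = "0" * pad + bits
    out = ""
    for i in range(0, len(bits), group):
        out += digits[int(bits[i:i + group], 2)]
    return out


print(regroup_binary("110101", 3))    # groups 110 101 -> octal 65
print(regroup_binary("10101011", 4))  # groups 1010 1011 -> hex AB
print(int("65", 8))                   # octal 65 -> 6*8 + 5 = 53 in decimal
```

Going the other way (octal or hex to binary) is the same table lookup in reverse: each digit expands to its three-bit or four-bit pattern.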
Converting from decimal to base–b
To convert a decimal number with a fractional part to another base, say base b, split it into an integer part and a fractional part. Then divide the integer part by b repeatedly to get each digit as a remainder. Namely, with the value of the integer part equal to $a_n a_{n-1} \dots a_1 a_0$ in base b, first divide the value by b: the remainder is the least significant digit $a_0$. Divide the result by b: the remainder is $a_1$. Continue this process until the result is zero, giving the most significant digit, $a_n$. Let’s convert 43868.0390625 to hexadecimal, starting with the integer part 43868:
43868 / 16 = 2741, remainder 12 → C
2741 / 16 = 171, remainder 5 → 5
171 / 16 = 10, remainder 11 → B
10 / 16 = 0, remainder 10 → A
Reading the remainders from last to first gives AB5C.
After that, multiply the fractional part by b repeatedly to get each digit as an integer part. We continue this process until we get zero as the fractional part or until we recognize an infinitely repeating pattern.
Now convert the fractional part 0.0390625 to hexadecimal:
0.0390625 × 16 = 0.625 → integer part 0, digit 0
0.625 × 16 = 10.0 → integer part 10, digit A
The fractional part is now zero, so the process stops.
In summary, the result of converting 43868.0390625 to hexadecimal is AB5C.0A.
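The two procedures (repeated division for the integer part, repeated multiplication for the fractional part) can be sketched as one Python function. The name `to_base`, the `max_frac_digits` cut-off for non-terminating fractions, and the restriction to non-negative inputs and bases up to 16 are assumptions of this sketch.

```python
DIGITS = "0123456789ABCDEF"


def to_base(x, b, max_frac_digits=12):
    """Convert a non-negative number x to a string in base b (2 <= b <= 16)."""
    integer = int(x)
    frac = x - integer

    # Integer part: repeated division, remainders give digits from least
    # significant to most significant.
    int_part = ""
    while integer > 0:
        integer, remainder = divmod(integer, b)
        int_part = DIGITS[remainder] + int_part
    int_part = int_part or "0"

    # Fractional part: repeated multiplication, integer parts give digits
    # from left to right. Stop at zero or after max_frac_digits digits.
    frac_part = ""
    while frac > 0 and len(frac_part) < max_frac_digits:
        frac *= b
        digit = int(frac)
        frac_part += DIGITS[digit]
        frac -= digit

    return int_part + ("." + frac_part if frac_part else "")


print(to_base(43868.0390625, 16))  # AB5C.0A
print(to_base(0.625, 2))           # 0.101
```

Because Python floats are binary, inputs such as 0.1 are already approximations, so the digit cut-off matters for fractions that do not terminate in the target base.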
Data Representation in a Computer. Units of Information
Data Representation refers to the methods used internally to represent information stored in a computer. Computers store lots of different types of information:
- graphics of many varieties (stills, video, animation)
At least, these all seem different to us. However, all types of information stored in a computer are stored internally in the same simple format: a sequence of 0’s and 1’s. How can a sequence of 0’s and 1’s represent things as diverse as your photograph, your favorite song, a recent movie, and your term paper?
- Numbers must be expressed in binary form following some specific standard.
- Character data are assigned a sequence of binary digits
- Other types of data, such as sounds, videos or other physical signals are converted to digital following the schema below
[Figure: physical signal → sensor → continuous signal → analog-to-digital (AD) converter → computer]
Depending on the nature of its internal representation, data items are divided into:
- Basic types (simple types or primitive types): the standard predefined scalar types that one would expect to find ready for immediate use in any programming language.
- Structured types (higher level types): types that are made up from such basic types or from other existing higher level types.
Units of Information
The most basic unit of information in a digital computer is called a BIT, which is a contraction of Binary Digit. In the concrete sense, a bit is nothing more than a state of “on” or “off” (or “high” and “low”) within a computer circuit. In 1964, the designers of the IBM System/360 mainframe computer established a convention of using groups of 8 bits as the basic unit of addressable computer storage. They called this collection of 8 bits a byte.
Computer words consist of two or more adjacent bytes that are sometimes addressed and almost always are manipulated collectively. The word size represents the data size that is handled most efficiently by a particular architecture. Words can be 16 bits, 32 bits, 64 bits, or any other size that makes sense within the context of a computer’s organization.
Some other units of information are described in the following table :
Representation of Integers
An integer is a number with no fractional part; it can be positive, negative or zero. In ordinary usage, one uses a minus sign to designate a negative integer. However, a computer can only store information in bits, which can only have the values zero or one. We might expect, therefore, that the storage of negative integers in a computer might require some special technique – allocating one sign bit (often the most significant bit) to represent the sign: set that bit to 0 for a positive number, and set to 1 for a negative number.
Unsigned integers are represented by a fixed number of bits (typically 8, 16, 32, and/or 64).
- With 8 bits, 0…255 ($00_{16}$…$FF_{16}$) can be represented;
- With 16 bits, 0…65535 ($0000_{16}$…$FFFF_{16}$) can be represented;
- In general, an unsigned integer containing n bits can have a value between 0 and $2^n - 1$.
If an operation on bytes has a result outside this range (0…255 for a single byte), it causes an ‘overflow’.
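A small Python sketch of the range and of overflow detection follows. Python integers are unbounded, so the fixed width is simulated here; the function names and the wrap-around behaviour (keeping only the low n bits) are assumptions of this example, since real hardware may instead set a flag or raise an error.

```python
def unsigned_range(n):
    """Smallest and largest value of an n-bit unsigned integer."""
    return 0, 2**n - 1


def add_unsigned(a, b, n=8):
    """Add two n-bit unsigned values; return (stored result, overflow flag)."""
    total = a + b
    overflow = total > 2**n - 1
    # Assumed behaviour: only the low n bits are kept.
    return total % 2**n, overflow


print(unsigned_range(8))       # (0, 255)
print(unsigned_range(16))      # (0, 65535)
print(add_unsigned(26, 12))    # (38, False)
print(add_unsigned(200, 100))  # (44, True): 300 does not fit in 8 bits
```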
The binary representation discussed above is a standard code for storing unsigned integer numbers. However, most computer applications use signed integers as well; i.e. the integers that may be either positive or negative.
In binary we can use one bit within a representation (usually the most significant or leading bit) to indicate either positive (0) or negative (1), and store the unsigned binary representation of the magnitude in the remaining bits.
However, to simplify the design of circuits that do arithmetic on signed binary numbers (e.g. addition and subtraction), a different representation scheme called two’s complement is more commonly used. In this scheme, positive numbers are represented in binary, the same as for unsigned numbers. A negative number, on the other hand, is represented by taking the binary representation of its magnitude and then applying two steps:
- Complement the bits : Replace all the 1’s with 0’s, and all the 0’s with 1’s;
- Add one to the complemented number.
$+42_{10} = 00101010_2$ and so $-42_{10} = 11010110_2$
- Binary number with leading 0 is positive
- Binary number with leading 1 is negative
Performing two’s complement on the decimal 42 to get -42
Using an eight-bit representation:
  42 = 00101010    Convert to binary
       11010101    Complement the bits
       11010101    Add 1 to the complement
     + 00000001
       --------
       11010110    Result is -42 in two's complement
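The complement-and-add-one procedure can be sketched in Python. The function names `twos_complement` and `from_twos_complement` are assumptions for this illustration, and bit patterns are handled as strings to keep the steps visible.

```python
def twos_complement(value, bits=8):
    """Return the two's complement bit pattern of a signed integer."""
    if not -(2 ** (bits - 1)) <= value < 2 ** (bits - 1):
        raise ValueError("value does not fit in the given number of bits")
    if value >= 0:
        return format(value, f"0{bits}b")
    magnitude = format(-value, f"0{bits}b")
    # Step 1: complement the bits.
    flipped = "".join("1" if c == "0" else "0" for c in magnitude)
    # Step 2: add one to the complemented number.
    return format(int(flipped, 2) + 1, f"0{bits}b")


def from_twos_complement(pattern):
    """Decode a two's complement bit pattern back to a signed integer."""
    value = int(pattern, 2)
    if pattern[0] == "1":  # leading 1 means the number is negative
        value -= 2 ** len(pattern)
    return value


print(twos_complement(42))               # 00101010
print(twos_complement(-42))              # 11010110
print(from_twos_complement("11010110"))  # -42
```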
Arithmetic Operations on Integers
Addition and Subtraction of integers
Addition and subtraction of unsigned binary numbers
Binary addition is much like normal everyday (decimal) addition, except that a carry is generated when a column sum reaches 2 instead of 10.
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0, and carry 1 to the next more significant bit
00011010 + 00001100 = 00100110
      1 1             carries
  0 0 0 1 1 0 1 0   = 26 (base 10)
+ 0 0 0 0 1 1 0 0   = 12 (base 10)
  ---------------
  0 0 1 0 0 1 1 0   = 38 (base 10)
11010001 + 01001001 = 100011010
1 1           1       carries
  1 1 0 1 0 0 0 1   = 209 (base 10)
+ 0 1 0 0 1 0 0 1   = 73 (base 10)
  ---------------
1 0 0 0 1 1 0 1 0   = 282 (base 10)
The result exceeds the magnitude which can be represented with 8 bits. This is an overflow.
Subtraction is performed by adding the two’s complement of the number being subtracted.
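A minimal Python sketch of 8-bit addition with overflow detection, and of subtraction done by adding the two's complement, is shown below. The function names and the use of bit strings are assumptions made for this illustration.

```python
def add_8bit(a, b):
    """Add two 8-bit binary strings; return (8-bit result, overflow flag)."""
    total = int(a, 2) + int(b, 2)
    # Keep only the low 8 bits; a ninth bit means unsigned overflow.
    return format(total % 256, "08b"), total > 255


def subtract_8bit(a, b):
    """Compute a - b by adding the two's complement of b."""
    # Complement the bits of b (XOR with 11111111), then add one.
    neg_b = ((int(b, 2) ^ 0xFF) + 1) % 256
    # The carry out of the top bit is discarded in two's complement arithmetic.
    result, _ = add_8bit(a, format(neg_b, "08b"))
    return result


print(add_8bit("00011010", "00001100"))       # ('00100110', False)
print(add_8bit("11010001", "01001001"))       # ('00011010', True)
print(subtract_8bit("00011010", "00001100"))  # 00001110
```

In the second call the true sum needs nine bits, so only the low eight bits survive and the overflow flag is set.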
Addition and subtraction of signed binary numbers
Multiplication and Division of Integers
Multiplication in the binary system works the same way as in the decimal system:
0 x 0 = 0
0 x 1 = 0
1 x 0 = 0
1 x 1 = 1, and no carry or borrow bits
00101001 × 00000110 = 11110110
          0 0 1 0 1 0 0 1   = 41 (base 10)
        × 0 0 0 0 0 1 1 0   = 6 (base 10)
          ---------------
          0 0 0 0 0 0 0 0
            1 0 1 0 0 1
          1 0 1 0 0 1
      -------------------
      0 0 1 1 1 1 0 1 1 0   = 246 (base 10)
00010111 × 00000011 = 01000101
          0 0 0 1 0 1 1 1   = 23 (base 10)
        × 0 0 0 0 0 0 1 1   = 3 (base 10)
          ---------------
            1 1 1 1 1         carries
            0 0 1 0 1 1 1
          0 0 1 0 1 1 1
        -----------------
        0 0 1 0 0 0 1 0 1   = 69 (base 10)
Binary division follows the same rules as decimal division.
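The pencil-and-paper multiplication method (one shifted copy of the multiplicand for every 1 bit of the multiplier, then a sum) can be sketched in Python. The function name `multiply_binary` is an assumption for this illustration, and the result is returned without leading zeros.

```python
def multiply_binary(a, b):
    """Multiply two binary strings using shift-and-add."""
    multiplicand = int(a, 2)
    product = 0
    # Walk the multiplier from its least significant bit.
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            # Partial product: the multiplicand shifted left by the bit position.
            product += multiplicand << shift
    return format(product, "b")


print(multiply_binary("00101001", "00000110"))  # 11110110  (41 x 6 = 246)
print(multiply_binary("00010111", "00000011"))  # 1000101   (23 x 3 = 69)
```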
Logical operations on Binary Numbers
Logical Operation with one or two bits
NOT : Changes the value of a single bit. If it is a “1”, the result is “0”; if it is a “0”, the result is “1”.
AND: Compares 2 bits and if they are both “1”, then the result is “1”, otherwise, the result is “0”.
OR : Compares 2 bits and if either or both bits are “1”, then the result is “1”, otherwise, the result is “0”.
XOR : Compares 2 bits and if exactly one of them is “1” (i.e., if they are different values), then the result is “1”; otherwise (if the bits are the same), the result is “0”.
Logical operators between two bits have the following truth table:
|x||y||x AND y||x OR y||x XOR y|
|0||0||0||0||0|
|0||1||0||1||1|
|1||0||0||1||1|
|1||1||1||1||0|
Logical Operation with one or two binary numbers
A logical (bitwise) operation operates on one or two bit patterns or binary numerals at the level of their individual bits.
NOT 0111 = 1000
An AND operation takes two binary representations of equal length and performs the logical AND operation on each pair of corresponding bits. In each pair, the result is 1 if the first bit is 1 AND the second bit is 1. Otherwise, the result is 0.
0101 AND 0011 = 0001
An OR operation takes two bit patterns of equal length, and produces another one of the same length by matching up corresponding bits (the first of each; the second of each; and so on) and performing the logical OR operation on each pair of corresponding bits.
0101 OR 0011 = 0111
An exclusive or operation takes two bit patterns of equal length and performs the logical XOR operation on each pair of corresponding bits.
0101 XOR 0011 = 0110
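The four examples can be reproduced with Python's bitwise operators `&`, `|`, `^`, and `~`. Python integers have no fixed width, so the NOT result is masked to four bits here; that mask and the variable names are choices made for this sketch.

```python
a, b = 0b0101, 0b0011
MASK = 0b1111  # keep results to four bits

and_result = format(a & b, "04b")
or_result = format(a | b, "04b")
xor_result = format(a ^ b, "04b")
# ~x flips every bit of an unbounded integer, so mask it back to four bits.
not_result = format(~0b0111 & MASK, "04b")

print("0101 AND 0011 =", and_result)  # 0001
print("0101 OR  0011 =", or_result)   # 0111
print("0101 XOR 0011 =", xor_result)  # 0110
print("NOT 0111      =", not_result)  # 1000
```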
It is important to handle character data. Character data is not just alphabetic characters, but also numeric characters, punctuation, spaces, etc. They need to be represented in binary.
There aren’t mathematical properties for character data, so assigning binary codes for characters is somewhat arbitrary.
ASCII Code Table
ASCII stands for American Standard Code for Information Interchange. The ASCII standard, developed in 1963, permitted machines from different manufacturers to exchange data.
The ASCII code table consists of 128 binary values (0 to 127), each associated with a character or command. The non-printing characters are used to control peripherals such as printers.
The extended ASCII character set consists of a further 128 characters representing additional special, mathematical, graphic, and foreign characters.
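Python's built-in `ord` and `chr` functions expose the numeric code of a character, which makes the ASCII mapping easy to inspect. The helper name `ascii_codes` and the sample text are assumptions for this sketch.

```python
def ascii_codes(text):
    """Return the ASCII code of each character in text."""
    codes = [ord(ch) for ch in text]
    if any(code > 127 for code in codes):
        raise ValueError("text contains non-ASCII characters")
    return codes


for ch in "A", "a", "0", " ":
    # Show the character, its decimal code, and its 7-bit binary code.
    print(repr(ch), ord(ch), format(ord(ch), "07b"))

print(ascii_codes("Hi!"))  # [72, 105, 33]
print(chr(65))             # A
```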
Unicode Code Table
There are some problems with the ASCII code table. With the ASCII character set, string datatypes allocate one byte per character. But logographic languages such as Chinese, Japanese, and Korean need far more than 256 characters for reasonable representation. Even Vietnamese, a language that uses mostly Latin letters, needs 61 characters for representation. Where can we find numbers for all these characters? Is two bytes per character a solution?
Hundreds of different encoding systems were invented. But these encoding systems conflict with one another : two encodings can use the same number for two different characters, or use different numbers for the same character.
The Unicode standard was first published in 1991. With two bytes for each character, it can represent up to $2^{16}$ = 65,536 different characters.
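As a rough illustration, Python can show the code numbers of characters that lie outside the one-byte range and confirm that they fit in two bytes. The sample characters and the use of the UTF-16 big-endian encoding as a stand-in for "two bytes per character" are assumptions of this sketch.

```python
samples = {
    "Latin capital A": "A",
    "Vietnamese e with circumflex and acute": "\u1ebf",
    "Chinese character 'middle'": "\u4e2d",
}

for name, ch in samples.items():
    code_point = ord(ch)
    two_bytes = ch.encode("utf-16-be")  # assumed two-byte encoding
    print(f"{name}: code {code_point} (hex {code_point:04X}), "
          f"{len(two_bytes)} bytes: {two_bytes.hex()}")

# The last two characters have codes above 255, so one byte is not enough.
fits_in_one_byte = {name: ord(ch) < 256 for name, ch in samples.items()}
print(fits_in_one_byte)
```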
The Unicode standard has been adopted by such industry leaders as HP, IBM, Microsoft, Oracle, Sun, and many others. It is supported in many operating systems, all modern browsers, and many other products.
The obvious advantages of using Unicode are :
- To offer significant cost savings over the use of legacy character sets.
- To enable a single software product or a single website to be targeted across multiple platforms, languages and countries without re-engineering.
- To allow data to be transported through many different systems without corruption.
Representation of Real Numbers
No human system of numeration can give a unique representation to real numbers. If you give the first few decimal places of a real number, you are giving an approximation to it.
Mathematicians may think of one approach: a real number x can be approximated by any number in the range from x − epsilon to x + epsilon. This is the idea behind fixed-point representation. Fixed-point representations are unsatisfactory for most applications involving real numbers.
Scientists or engineers will probably use scientific notation: a number is expressed as the product of a mantissa and some power of ten.
A system of numeration for real numbers will typically store the same three data (a sign, a mantissa, and an exponent) in an allocated region of storage.
The analogues of scientific notation in a computer are described as floating-point representations.
In the decimal system, the decimal point indicates the start of negative powers of 10.
If we are using a system in base k (ie the radix is k), the ‘radix point’ serves the same function:
A floating point representation allows a large range of numbers to be represented in a relatively small number of digits by separating the digits used for precision from the digits used for range.
To avoid multiple representations of the same number floating point numbers are usually normalized so that there is only one nonzero digit to the left of the ‘radix’ point, called the leading digit.
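A normalized (non-zero) floating-point number will be represented using
$(-1)^s \times d_0.d_1 d_2 \dots d_{p-1} \times b^e$
where
- s is the sign,
- $d_0.d_1 d_2 \dots d_{p-1}$, termed the significand, has p significant digits, and each digit satisfies $0 \le d_i < b$,
- $e$, with $e_{min} \le e \le e_{max}$, is the exponent,
- b is the base (or radix).
If b = 10 (base 10) and p = 3, the number 0.1 is represented as $1.00 \times 10^{-1}$.
If b = 2 (base 2) and p = 24, the decimal number 0.1 cannot be represented exactly but is approximately $1.10011001100110011001101 \times 2^{-4}$.
In general, $\pm d_0.d_1 d_2 \dots d_{p-1} \times b^e$ represents the value $\pm (d_0 + d_1 b^{-1} + \dots + d_{p-1} b^{-(p-1)}) \times b^e$.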
In brief, a normalized representation of a real number is characterized by:
- The range of the number: the number of digits in the exponent (i.e. the limits $e_{min}$ and $e_{max}$) and the base b to which it is raised
- The precision: the number of digits p in the significand and its base b
IEEE 754/85 Standard
There are many ways to represent floating point numbers. In order to improve portability most computers use the IEEE 754 floating point standard.
There are two primary formats:
- 32 bit single precision.
- 64 bit double precision.
Single precision consists of:
- A single sign bit, 0 for positive and 1 for negative;
- An 8 bit base-2 (b = 2) excess-127 exponent, with $e_{min} = -126$ (stored as $00000001_2$) and $e_{max} = 127$ (stored as $11111110_2$).
- A 23 bit base-2 (b = 2) significand, with a hidden bit giving a precision of 24 bits (i.e. $1.d_1 d_2 \dots d_{23}$).
- Single precision has 24 bits of precision, equivalent to about 7.2 decimal digits.
- The largest representable non-infinite number is almost $2^{128} \approx 3.403 \times 10^{38}$.
- The smallest representable non-zero normalized number is $1 \times 2^{-126} \approx 1.175 \times 10^{-38}$.
- Denormalized numbers (e.g. $0.01_2 \times 2^{-126}$) can be represented.
- There are two zeros, $\pm 0$.
- There are two infinities, $\pm\infty$.
- A NaN (not a number) is used for results from undefined operations.
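The three fields of a single precision number can be inspected with Python's standard `struct` module, which packs a value into its 32-bit IEEE 754 form. The function name `decode_float32` is an assumption for this sketch.

```python
import struct


def decode_float32(x):
    """Split the 32-bit pattern of x into (sign, stored exponent, fraction)."""
    # Pack as a big-endian 32-bit float, then reinterpret the same 4 bytes
    # as an unsigned 32-bit integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    stored_exponent = (bits >> 23) & 0xFF  # true exponent + 127
    fraction = bits & 0x7FFFFF             # 23 bits after the hidden bit
    return sign, stored_exponent, fraction


for value in (1.0, -2.0, 0.1):
    sign, exp, frac = decode_float32(value)
    print(value, sign, format(exp, "08b"), format(frac, "023b"),
          "true exponent:", exp - 127)
```

For 0.1 the stored exponent is 123, i.e. a true exponent of -4, and the fraction bits are the rounded repeating pattern 1001 1001 ..., which is why 0.1 is only approximated.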
Double precision floating point standard requires a 64 bit word
- The first bit is the sign bit
- The next eleven bits are the exponent bits
- The final 52 bits are the fraction
Range of double numbers: from $\pm 2.225 \times 10^{-308}$ to $\pm 1.7977 \times 10^{308}$ in magnitude.
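On most platforms a Python `float` is an IEEE 754 double, so the layout and range can be checked directly. That platform assumption, and the function name `decode_float64`, are specific to this sketch.

```python
import struct
import sys


def decode_float64(x):
    """Split the 64-bit pattern of x into (sign, stored exponent, fraction)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # first bit
    stored_exponent = (bits >> 52) & 0x7FF  # next eleven bits
    fraction = bits & ((1 << 52) - 1)    # final 52 bits
    return sign, stored_exponent, fraction


print(decode_float64(1.0))      # stored exponent uses an excess of 1023
print(sys.float_info.max)       # largest finite double
print(sys.float_info.min)       # smallest positive normalized double
print(sys.float_info.mant_dig)  # significand bits, including the hidden bit
```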
How to Reuse & Attribute This Content
© Jul 29, 2009 Huong Nguyen. Textbook content produced by Huong Nguyen is licensed under a Creative Commons Attribution License 3.0 license.
If you use this textbook as a bibliographic reference, then you should cite it as follows:
Huong Nguyen, Introduction to Computer Science. OpenStax CNX. Jul 29, 2009 http://email@example.com.