How Computers Print Human-Readable Characters: ASCII and UNICODE

Tanya Mittal
4 min read · Feb 12, 2019


(ASCII: pronounced as-key)
ASCII, short for American Standard Code for Information Interchange, is a character encoding standard for electronic communication. It assigns letters, numbers, and other characters to the 256 slots available in an 8-bit code. Each ASCII decimal (Dec) number is derived from binary, which is the language of all computers.
For example, the lowercase "h" character (char) has a decimal value of 104, which is "01101000" in binary.
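This conversion is easy to reproduce in a few lines of Python (a minimal sketch, assuming a Python 3 interpreter):

```
# Look up the ASCII decimal value of a character and show its binary form.
ch = "h"
dec = ord(ch)              # 104 -- the ASCII/Unicode code point of 'h'
bits = format(dec, "08b")  # "01101000" -- padded to 8 binary digits

print(ch, dec, bits)       # h 104 01101000
```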
ASCII was developed by the X3 committee, a part of the ASA (American Standards Association), and first published in 1963 as ASA X3.4-1963, with ten revisions of the standard published between 1967 and 1986.
Technically, ASCII is a 7-bit code representing only 128 characters (0–127).

The range:

0–31: control characters
32–127: printable characters, including the letters A to Z, the numerals 0 to 9, and punctuation marks, as shown in the table below.

Fig 1: ASCII Table
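A quick way to inspect these two ranges is a short Python sketch (note that code 127, DEL, is also a control character):

```
# Control codes: 0-31 (plus DEL at 127); printable characters: 32-126.
control = list(range(32)) + [127]
printable = "".join(chr(i) for i in range(32, 127))

print(len(control))   # 33 control codes in total
print(printable)      # ' !"#$%&...0123456789...ABC...abc...~'
```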

ASCII on its own can only encode U.S. English. Some people assume the codes from 128 to 255 are also ASCII, but technically speaking they are not. However, it is common to store characters in 8 bits, and that extra bit allows for another 128 characters, known as extended ASCII, a superset of ASCII used by the DOS operating system. A more universal standard is the ISO Latin-1 character set, which is used by many operating systems as well as web browsers. Another set of codes, used on large IBM computers, is EBCDIC (Extended Binary Coded Decimal Interchange Code), an 8-bit character encoding.

Fig 2: ASCII Code
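A short Python sketch contrasting strict ASCII with the Latin-1 superset, using the interpreter's built-in codecs:

```
# 'é' (code point 233) exists in ISO Latin-1 but not in 7-bit ASCII.
text = "é"

print(text.encode("latin-1"))   # b'\xe9' -- one byte, value 233
try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    print("not ASCII:", err)    # the 'ascii' codec rejects code points above 127
```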

UNICODE:

Unicode is a universal character encoding standard. It defines the way individual characters are represented in text files, web pages, and other types of documents. While ASCII uses only one byte to represent each character, Unicode supports up to 4 bytes for each character.
Unicode is a superset of ASCII, and the numbers 0–127 have the same meaning in ASCII as they have in Unicode. Because Unicode characters don’t generally fit into one 8-bit byte, there are numerous ways of storing Unicode characters in byte sequences, such as UTF-32 and UTF-8.
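The difference between those byte sequences is easy to see in Python; a small sketch using the standard utf-8 and utf-32 codecs:

```
# The same character stored as different byte sequences.
ch = "€"  # EURO SIGN, code point U+20AC

print(ch.encode("utf-8"))   # b'\xe2\x82\xac' -- 3 bytes
print(ch.encode("utf-32"))  # 4 bytes per character plus a 4-byte byte-order mark
print("A".encode("utf-8"))  # b'A' -- ASCII characters stay 1 byte in UTF-8
```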
The terms “Version 11.0” and “Unicode 11.0” are abbreviations for the full version reference, Version 11.0.0. The latest published version of the Unicode Standard is cited as: The Unicode Consortium, The Unicode Standard.
As of version 11.0, Unicode contains a repertoire of over 137,000 characters covering 146 modern and historic scripts, as well as multiple symbol sets. Unicode allows for 17 planes, each of 65,536 possible characters (or “code points”). This gives a total of 1,114,112 possible code points. At present, only about 10% of this space has been allocated.
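The arithmetic behind those numbers, plus a character from outside the first plane, can be checked in a few lines of Python:

```
# 17 planes x 65,536 code points per plane = 1,114,112 possible code points.
print(17 * 65_536)        # 1114112
print(hex(0x10FFFF + 1))  # 0x110000 -- code points run from U+0000 to U+10FFFF

# A character outside the Basic Multilingual Plane:
print(chr(0x1F600))        # 😀 GRINNING FACE
print(0x1F600 // 65_536)   # 1 -- it lives in plane 1
```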

Difference Between Unicode and ASCII

The main difference between the two is in the way they encode characters and the number of bits they use for each. ASCII originally used seven bits to encode each character. This was later increased to eight with Extended ASCII to address the apparent inadequacy of the original. In contrast, Unicode uses a variable-width encoding scheme where you can choose between 32-, 16-, and 8-bit encodings. Using more bits lets you represent more characters at the expense of larger files, while fewer bits give you a more limited choice but save a lot of space. Using fewer bits (i.e. UTF-8 or ASCII) would probably be best if you are encoding a large document in English.
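A rough illustration of that trade-off for English text (a sketch; the exact byte counts include the utf-32 codec's 4-byte byte-order mark):

```
# English text: 1 byte per character in UTF-8/ASCII, 4 bytes in UTF-32.
sentence = "Unicode and ASCII" * 100   # 1,700 characters

print(len(sentence.encode("ascii")))   # 1700 bytes
print(len(sentence.encode("utf-8")))   # 1700 bytes -- identical for pure ASCII text
print(len(sentence.encode("utf-32")))  # 6804 bytes (4 per character + 4-byte BOM)
```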

OTHER DIFFERENCES

1. ASCII uses a fixed-width encoding (7 bits, or 8 bits with Extended ASCII), while Unicode uses a variable-width encoding.
2. Unicode is a single, consistent standard, while the extended ASCII range exists in many incompatible variants.
3. Unicode represents most of the written languages in the world, while ASCII does not.
4. Every ASCII character has its equivalent within Unicode.
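Point 4 is easy to verify: every ASCII code point keeps its numeric value and its single-byte form under Unicode/UTF-8. A minimal Python check:

```
# Every ASCII character keeps its value and single-byte form in Unicode/UTF-8.
for code in range(128):
    ch = chr(code)
    assert ord(ch) == code
    assert ch.encode("utf-8") == bytes([code])

print("all 128 ASCII code points match their Unicode/UTF-8 counterparts")
```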
Why UNICODE Is Used Instead of ASCII?
Unicode uses between 8 and 32 bits per character, so it can represent characters from languages all around the world. It is commonly used across the internet. Global companies like Facebook and Google would not use the ASCII character set alone, because their users communicate in many different languages.
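For example, text that mixes scripts cannot be expressed in ASCII at all, while UTF-8 handles it without trouble (a sketch with an arbitrary sample string):

```
# Multilingual text: representable in UTF-8, but not in ASCII.
greeting = "Hello, नमस्ते, こんにちは"

print(greeting.encode("utf-8"))  # works: a mix of 1- and 3-byte sequences
try:
    greeting.encode("ascii")
except UnicodeEncodeError:
    print("ASCII cannot encode this text")
```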

UNICODE VERSUS ASCII

Fig 3: Difference Between ASCII and UNICODE
