Character encoding

From Wikiversity
Educational level: this is a secondary education resource.
Type classification: this is a lesson resource.
Completion status: this resource has reached a high level of completion.

A character encoding represents a repertoire of characters by assigning each character a code under some encoding system. Depending on the abstraction level and context, corresponding code points and the resulting code space may be regarded as bit patterns, octets, natural numbers, electrical pulses, etc. A character encoding is used in computation, data storage, and transmission of textual data.[1]

Introduction

Data can take the form of numbers, text, sounds, images, animation, video, and so on. To write data down, people use numbers (0, ..., 9), letters of the alphabet (A, ..., Z), and symbols (@, [, \, ...), or a combination of them, for example:

  • 2012 (number)
  • Wikiversity (alphabet)
  • 3<5 (combination of numbers and a symbol)
  • X=(Y*Z)+W-U (combination of alphabet and symbols)
  • N123 (combination of alphabet and numbers)

Computers do not understand numbers (0, ..., 9), letters of the alphabet (A, ..., Z), or symbols (@, [, \, ...) directly; they store and process only binary values. To process text, each character must therefore be assigned a unique code, and that code is stored as a binary numeral.
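As a small illustration of this assignment, the following Python sketch looks up the code of each character in one of the examples above and shows it as a binary numeral. The helper name `char_codes` is hypothetical and chosen only for this example; `ord` and `format` are Python built-ins.

```python
def char_codes(text):
    """Return (character, decimal code, binary numeral) for each character."""
    return [(ch, ord(ch), format(ord(ch), "b")) for ch in text]


# Show the code assigned to every character of the example "N123".
for ch, dec, binary in char_codes("N123"):
    print(ch, dec, binary)
```

The codes reported here are the ones Python uses internally; for the characters in these examples they coincide with the ASCII values described in the next section.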

ASCII Code

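ASCII (pronounced "ask-ee") is a type of binary code that uses 7 bits for each character (numbers, alphabet, and symbols), giving a total of 128 codes (2^7 = 128). Of these, 33 are non-printing control codes and 95 are printable characters, including the space. For example: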

Number (0 - 9)

  • 0110000 (binary) or 48 (decimal) for character → 0
  • 0110001 (binary) or 49 (decimal) for character → 1
  • 0110010 (binary) or 50 (decimal) for character → 2
  • ...
  • 0111001 (binary) or 57 (decimal) for character → 9

Alphabet (A - Z)

  • 1000001 (binary) or 65 (decimal) for character → A
  • 1000010 (binary) or 66 (decimal) for character → B
  • 1000011 (binary) or 67 (decimal) for character → C
  • ...
  • 1011010 (binary) or 90 (decimal) for character → Z

Symbols (@ < ( & ^ % $ #...)

  • 1000000 (binary) or 64 (decimal) for character → @
  • 0111100 (binary) or 60 (decimal) for character → <
  • 0101000 (binary) or 40 (decimal) for character → (
  • 0100110 (binary) or 38 (decimal) for character → &

ASCII stands for American Standard Code for Information Interchange.
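The tables above can be reproduced with a short Python sketch. The function names `to_ascii_bits` and `from_ascii_bits` are hypothetical helpers written for this lesson, not part of any library; the sketch assumes that rejecting codes above 127 is the desired behaviour for non-ASCII input.

```python
def to_ascii_bits(ch):
    """Return the 7-bit ASCII pattern of a single character as a string."""
    code = ord(ch)
    if code > 127:
        # Assumption for this sketch: anything outside 0-127 is rejected.
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "07b")  # pad with leading zeros to 7 bits


def from_ascii_bits(bits):
    """Return the character whose ASCII code is the given bit string."""
    return chr(int(bits, 2))


# Print decimal and binary codes for a few characters from the tables.
for ch in "0AZ@<(&":
    print(ch, ord(ch), to_ascii_bits(ch))
```

Calling `from_ascii_bits("1011010")` goes the other way and yields the character Z, matching the last row of the alphabet table.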

View the ASCII character table at http://www.asciicodes.us

Extended ASCII code

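Extended ASCII uses 8 bits (1 byte) for each character (numbers, alphabet, and symbols), giving a total of 256 codes (2^8 = 256). Codes 0 to 127 are the same as in ASCII; codes 128 to 255 depend on which 8-bit encoding is used (for example ISO 8859-1 or Windows-1252), because "Extended ASCII" is not a single standard. For example: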

  • 00110000 (binary) or 48 (decimal) for character → 0
  • 01000001 (binary) or 65 (decimal) for character → A
  • 01000000 (binary) or 64 (decimal) for character → @
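Several different 8-bit encodings are commonly called "Extended ASCII". The Python sketch below compares two of them, ISO 8859-1 (`latin-1`) and IBM code page 437 (`cp437`), both of which ship with Python's standard codecs. The choice of these two encodings and of byte value 233 is an assumption made purely for illustration.

```python
# The three codes from the list above; all are below 128.
shared = bytes([48, 65, 64])

# Bytes 0-127 decode to the same characters in both encodings.
print(shared.decode("latin-1"), shared.decode("cp437"))

byte_value = 233  # above 127, so outside the 7-bit ASCII range

# The same byte decoded under two different 8-bit encodings.
as_latin1 = bytes([byte_value]).decode("latin-1")
as_cp437 = bytes([byte_value]).decode("cp437")

# Print the resulting code points to show that the characters differ.
print(f"latin-1: U+{ord(as_latin1):04X}")
print(f"cp437:   U+{ord(as_cp437):04X}")
```

This is why a file written in one 8-bit encoding can show wrong characters when read with another: only the lower 128 codes are guaranteed to agree.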

View an Extended ASCII character table at http://ascii-code.com

Unicode

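Unicode is a character set that assigns a unique number, called a code point, to each character (numbers, alphabet, and symbols) of the world's writing systems. Early versions of Unicode used 16 bits for each character, for a total of 65536 codes (2^16 = 65536); the current standard defines code points from 0 to 1114111 (hexadecimal 10FFFF), which are stored using encoding forms such as UTF-8, UTF-16, and UTF-32. For example, as 16-bit UTF-16 code units: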

  • 0000000000110000 (binary) or 48 (decimal) for character → 0
  • 0000000001000001 (binary) or 65 (decimal) for character → A
  • 0000000001000000 (binary) or 64 (decimal) for character → @
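Unicode assigns each character a code point, and encoding forms such as UTF-8, UTF-16, and UTF-32 turn that code point into bytes. The Python sketch below reports the code point of a character and how many bytes each encoding form needs for it. The helper name `describe` and the three sample characters are assumptions made for this example; the codec names are those of Python's standard library.

```python
def describe(ch):
    """Return the code point and the encoded byte lengths of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf-8": len(ch.encode("utf-8")),
        # The "-be" variants are used so that no byte order mark is added.
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


samples = [
    "A",           # Latin capital letter A
    "\u20ac",      # euro sign
    "\U0001F600",  # an emoji, with a code point above 65535
]

for ch in samples:
    print(describe(ch))
```

Characters with code points above 65535, such as the third sample, do not fit in a single 16-bit unit, so UTF-16 stores them as a pair of 16-bit units (4 bytes).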

View a Unicode character table at http://unicode-table.com/en/#control-character

See Also

References