For question 1, should we assume that ASCII takes 7 bits to encode? In reality, ASCII is usually encoded in 8 bits.
Date: 04 Jun 2014 12:47
Number of posts: 2
When restricting ourselves to "pure ASCII", there is no reason to use 8 bits. In the "real world", 8 bits are used for various reasons, such as maintaining prefix-freeness when ASCII is embedded in the wider context of Unicode (UTF-8). But for "pure ASCII", the 8th bit is unnecessary and would just bias our estimates.
We thus firmly stand by our 7 bits.
See also the passage below (from Wikipedia):
The term extended ASCII (or high ASCII) describes eight-bit or larger character encodings that include the standard seven-bit ASCII characters as well as others. The use of the term is sometimes criticized, because it can be mistakenly interpreted that the ASCII standard has been updated to include more than 128 characters or that the term unambiguously identifies a single encoding, both of which are untrue.
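As a quick sanity check (a sketch in Python, not part of the exercise itself), one can verify that every standard ASCII code point is below 2^7 = 128, so 7 bits per character suffice and the 8th bit carries no information:

```python
# All standard ASCII code points lie in the range 0..127.
text = "Hello, ASCII!"

# Confirm every character is within the 7-bit range.
assert all(ord(c) < 128 for c in text)

# The largest code point in standard ASCII is 127, which needs 7 bits.
print((127).bit_length())  # 7
```

This is why the problem's cost model charges 7 bits per ASCII character rather than 8.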