For HTML (not usually XHTML), the other method is for the document itself to include this information near its top, in a meta element inside the HEAD element.
XHTML documents have a third option: to express the character encoding in the XML preamble, for example: <?xml version="1.0" encoding="ISO-8859-1"?>
Each of these methods advises the receiver that the file being sent uses the character encoding specified. The character encoding is often referred to as the 'character set', and it does indeed limit the characters that can appear in the raw source text. However, the HTML standard states that the "charset" is to be treated as an encoding of Unicode characters, and it provides a way to specify characters that the "charset" does not cover. The term 'code page' is used similarly.
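The way HTML specifies characters that the "charset" does not cover is the numeric character reference (for example &#8364; for the euro sign). The following is a minimal Python sketch of that idea, not a description of any particular server or editor. The function name encode_with_char_refs is hypothetical; the "xmlcharrefreplace" error handler is part of Python's standard codec machinery.

```python
def encode_with_char_refs(text: str, charset: str) -> bytes:
    """Encode text in the given charset, writing any character the charset
    cannot represent as a numeric character reference (&#NNNN;)."""
    return text.encode(charset, errors="xmlcharrefreplace")


# ISO-8859-1 covers "ï" (byte 0xEF) but not the euro sign (U+20AC = 8364).
encoded = encode_with_char_refs("Price: 5 € (naïve)", "iso-8859-1")
print(encoded)  # b'Price: 5 &#8364; (na\xefve)'
```

The raw bytes stay within the declared encoding, while the user agent still resolves &#8364; to the Unicode character U+20AC.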
It is a bad idea to send incorrect information about the character encoding in use. For example, a server where multiple users may place files created on different machines cannot promise that all the files it sends will conform to a single encoding (some users may have machines with different character sets). For this reason, many servers simply do not send the information at all, to avoid making any false promises. This, however, may result in the equally bad situation of the user agent displaying the document wrongly because it does not know which character encoding to use.
The specification in the HTTP headers overrides a specification in a meta element in the document itself. This can be a problem if the headers are incorrect and the author does not have the access or the knowledge to change them.
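The precedence described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the algorithm any real browser uses: it ignores byte order marks, looks only at the first 1024 bytes, and uses loose regular expressions rather than a real parser. The names charset_from_content_type and declared_encoding are hypothetical.

```python
import re
from typing import Optional

# Loose patterns for illustration only; real user agents use a proper prescan.
XML_DECL = re.compile(rb"""^<\?xml[^>]*encoding\s*=\s*["']([A-Za-z0-9_.:-]+)["']""")
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def charset_from_content_type(value: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value, if any."""
    if not value:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', value, re.IGNORECASE)
    return match.group(1) if match else None


def declared_encoding(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Return the declared encoding, letting the HTTP header win over the document."""
    from_header = charset_from_content_type(content_type)
    if from_header:
        return from_header  # the header overrides anything inside the document
    head = body[:1024]
    match = XML_DECL.search(head) or META_CHARSET.search(head)
    return match.group(1).decode("ascii") if match else None


page = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head></html>'
print(declared_encoding("text/html; charset=ISO-8859-1", page))  # ISO-8859-1
print(declared_encoding("text/html", page))                      # utf-8
```

The first call shows the problem case: if the header is wrong, the correct declaration inside the document is never consulted.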
Browsers receiving a file with no character encoding information must make a blind assumption. For Western European languages, it is typical and fairly safe to assume windows-1252 (which is similar to ISO-8859-1 but has printable characters in place of some control codes that are forbidden in HTML anyway), but it is also common for browsers to assume the character set native to the machine on which they are running. The consequence of choosing incorrectly is that characters outside the printable ASCII range (32 to 126) usually appear incorrectly. This presents few problems for English-speaking users, but other languages require characters outside that range for everyday use. In CJK environments where there are several different multibyte encodings in use, autodetection is often employed.
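The effect of a wrong guess can be reproduced with Python's standard codecs. The sketch below assumes a page whose bytes are actually UTF-8 but which is decoded as windows-1252 (named "cp1252" in Python), and then shows the difference between windows-1252 and ISO-8859-1 for bytes in the 0x80 to 0x9F range. The sample strings are made up for illustration.

```python
# Bytes as sent by a server for a UTF-8 page with no declared encoding.
utf8_bytes = "café".encode("utf-8")

# A user agent that falls back to windows-1252 reads the two-byte UTF-8
# sequence for "é" as two separate characters; the ASCII part is unaffected.
misread = utf8_bytes.decode("cp1252")
print(misread)  # cafÃ©

# windows-1252 puts printable characters where ISO-8859-1 has control codes.
curly = b"\x93quoted\x94"
as_cp1252 = curly.decode("cp1252")      # curly quotation marks around "quoted"
as_latin1 = curly.decode("iso-8859-1")  # C1 control codes U+0093 and U+0094
print(as_cp1252)
print(repr(as_latin1))
```

Only the characters outside the ASCII range are damaged, which is why the problem is easy to miss on mostly English pages.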
It is increasingly common for multilingual websites to use one of the Unicode/ISO 10646 transformation formats, as this allows the same encoding to be used for all languages. Generally, UTF-8 is used rather than UTF-16 or UTF-32 because it is easier to handle in programming languages that assume a byte-oriented ASCII superset encoding, and because it is efficient for ASCII-heavy text (which HTML tends to be).
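The size argument is easy to check. The following sketch encodes a small, made-up fragment of markup in the three transformation formats and compares the byte counts; the little-endian variants are used so that no byte order mark is added to the totals.

```python
# A made-up fragment: mostly ASCII markup with two non-ASCII letters.
markup = '<p class="note">Grüße</p>'

sizes = {
    name: len(markup.encode(name))
    for name in ("utf-8", "utf-16-le", "utf-32-le")
}
print(sizes)

# UTF-8 is an ASCII superset: pure ASCII text encodes to identical bytes.
ascii_compatible = "<html>".encode("utf-8") == "<html>".encode("ascii")
print(ascii_compatible)  # True
```

For this fragment UTF-8 needs one byte per ASCII character and two for each of "ü" and "ß", while UTF-16 and UTF-32 need two and four bytes respectively for every character.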
Successful viewing of a page is not necessarily an indication that its encoding is specified correctly. If the creator of a page and the reader both assume the same machine-specific character encoding, and the server does not send any identifying information, then the reader will nonetheless see the page as the creator intended, but other readers whose machines use a different native character set will not.