Guide: UTF-8 to ANSI in Python
MS Notepad gives the user a choice of 4 encodings, expressed in clumsy, confusing terminology:
"Unicode" is UTF-16, written little-endian.
"Unicode big endian" is UTF-16, written big-endian. In both UTF-16 cases, the appropriate BOM is written.
"UTF-8" is UTF-8; Notepad explicitly writes a "UTF-8 BOM".
"ANSI" is a shocker. This is MS terminology for "whatever the default legacy encoding is on this computer".

Here is a list of Windows encodings that I know of and the languages/scripts that they are used for:
If the file has been created on the computer where it is being read, then you can obtain the "ANSI" encoding by ...

Be careful using ...

Putting it all together: a sample text file, saved with all 4 encoding choices, looks like this in Notepad:
Here is some demo code:
and here is the output when
run in a Windows "Command Prompt" window using the command
Things to be aware of: (1) "mbcs" is a file-system pseudo-encoding which has no relevance at all to decoding the contents of files. On a system where the default encoding is
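The four Notepad choices described above can be told apart by the BOM at the start of the file, with "ANSI" as the fallback when there is no BOM. What follows is only a minimal sketch of that idea, not the original demo code. The function names (`guess_notepad_encoding`, `decode_notepad_bytes`, `utf8_to_ansi`) are hypothetical. Two things are assumptions on my part: using `locale.getpreferredencoding()` as the source of the "ANSI" encoding, and `cp1252` as the default target.

```python
import codecs
import locale

# BOMs that Notepad writes, paired with a codec that consumes the BOM on decode.
# The generic "utf-16" codec reads the BOM to pick the byte order.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def guess_notepad_encoding(raw, ansi_encoding=None):
    """Return a codec name for bytes saved by Notepad (hypothetical helper)."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    # No BOM: treat the file as "ANSI". Assumption: the file was written on
    # this machine, so the locale's preferred encoding is the right guess.
    return ansi_encoding or locale.getpreferredencoding(False)


def decode_notepad_bytes(raw, ansi_encoding=None):
    """Decode the raw bytes of a Notepad file to str, dropping any BOM."""
    return raw.decode(guess_notepad_encoding(raw, ansi_encoding))


def utf8_to_ansi(raw_utf8, ansi_encoding="cp1252"):
    """Re-encode UTF-8 bytes (with or without BOM) in a legacy code page.

    Characters the code page cannot represent become "?".
    """
    text = raw_utf8.decode("utf-8-sig")
    return text.encode(ansi_encoding, errors="replace")
```

To use it, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to `decode_notepad_bytes`. Pass `ansi_encoding` explicitly when the file came from a different computer, since the local default may not match the machine that wrote it.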