File encoding - UTF-8 / ISO
15-03-2012 1:55 AM
Hi
I'm having trouble with a PHP site I'm helping to convert to a dynamic one. The site is UTF-8, but the PHP scripts are ANSI. The data has been scraped and stored in utf8_unicode fields in MySQL.
The RSS feed is not displaying; instead it puts out garbage characters at the beginning. I had this a few days ago, and it was down to the included files being in a different encoding. They're now all the same, but now that the files have been uploaded to the server the problem has returned. I suspect there is mixed encoding in the files.
Does anyone know of a program that will let me examine the encoding letter by letter to see if there is mixed encoding in the output?
I need a new signature... i'm bored of the old one!
2 REPLIES
Re: File encoding - UTF-8 / ISO
15-03-2012 12:31 PM
The garbled characters at the beginning are probably the Byte Order Mark (BOM), which identifies the encoding of the file/data.
If there is no BOM, the file is either plain text (ANSI/ISO) or UTF-8 saved without a BOM.
If the bytes are 0xEF, 0xBB, 0xBF (displayed as ï»¿ when read as ISO-8859-1), the data is UTF-8.
For UTF-16, it'll be 0xFE, 0xFF (þÿ) if big endian or 0xFF, 0xFE (ÿþ) if little endian.
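If you'd rather check this in a script than by eye, here is a rough Python sketch of the byte signatures listed above. The function name `detect_bom` is just made up for illustration, and it only knows the three BOMs mentioned here (it ignores UTF-32, whose little-endian BOM also starts with 0xFF, 0xFE).

```python
import codecs

# BOM signatures from the post above, using the constants in Python's codecs module.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none.

    Hypothetical helper: it only recognises the three signatures above.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

You would feed it the first few bytes of the file or of the feed output, read in binary mode, for example `detect_bom(open(path, "rb").read(4))`.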
A good editor for checking and changing the file encoding is Programmer's Notepad.
The only way to confirm the exact contents of the files is to use a hex editor; XVI32 and HxD are good choices.
It's not uncommon to have to write code to read or skip the BOM.
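As an idea of what "skip the BOM" code can look like, here is a minimal Python sketch. It only handles the UTF-8 BOM, and both function names are invented for this example; the same check would need porting to PHP for the site itself.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def read_text_skipping_bom(path: str) -> str:
    """Read a UTF-8 text file, dropping a leading BOM if there is one."""
    # The "utf-8-sig" codec decodes UTF-8 and discards a BOM at the start.
    with open(path, encoding="utf-8-sig") as fh:
        return fh.read()
```

The first function works on raw bytes (useful for captured feed output); the second is for reading a file straight into text.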
Re: File encoding - UTF-8 / ISO
16-03-2012 11:15 AM
I use Notepad++, which lets you set the encoding. However, I have content coming from included files, templates, and even the database, so it's a bit hard to know exactly what is what and where.
The hex editor tip is probably what I'm looking for. Hopefully it will point something out to me, although it has been years since I looked at hex, so maybe it will do me some good.
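For anyone with the same problem who would rather not read hex by hand, something along these lines might help narrow it down. It is only a sketch in Python, and `find_invalid_utf8` and `hex_dump` are names made up here: the first lists byte offsets where strict UTF-8 decoding fails (a likely sign of ANSI/ISO bytes mixed into UTF-8 output), the second prints one hex-editor-style line so the bytes around an offset can be inspected.

```python
def find_invalid_utf8(data: bytes):
    """Return byte offsets at which strict UTF-8 decoding fails."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid UTF-8
        except UnicodeDecodeError as err:
            bad = pos + err.start  # err.start is relative to the slice
            offsets.append(bad)
            pos = bad + 1  # resume just after the offending byte
    return offsets


def hex_dump(data: bytes, start: int = 0, length: int = 16) -> str:
    """Format `length` bytes from `start` as offset, hex values and printable ASCII."""
    chunk = data[start:start + length]
    hex_part = " ".join(f"{b:02X}" for b in chunk)
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{start:08X}  {hex_part}  {text_part}"
```

Save the feed output to a file, read it with `open(path, "rb").read()`, and pass the bytes to both functions. Note that a UTF-8 BOM is valid UTF-8, so it will not show up in `find_invalid_utf8`; it will be visible as EF BB BF at the start of the hex dump.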
I need a new signature... i'm bored of the old one!