UTF-8 without BOM encoding in Eclipse
After some headache I figured out that Eclipse, when the encoding is set to UTF-8 (with BOM), causes an error. The three BOM bytes at the start of each file are sent as output when you use an include, which shows up as extra whitespace and causes the head content of the web page to render within the body in Chrome.
i.e. in index.php, with no gap before or after the PHP tag, of course:
<?php include_once('header.php'); ?><body>test</body>
and header.php containing (again without gaps, of course):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>test title</title> </head>
Then the test title appears within the body (not in view source, but in the console in Chrome). This causes a gap at the top of the page.
Opening index.php and header.php in Notepad++ and changing the encoding to UTF-8 without BOM solves the problem. How can I fix this in Eclipse?! Switching to Notepad++ is not desirable; Eclipse has too many good features that are useful (better autocomplete, automatic versioning, etc.).
A mystery to me...
As far as I'm aware, as long as your workspace (or file-type) settings are set to UTF-8, Eclipse does not add BOMs to new files, but existing BOMs (created in other editors) will be preserved, or even displayed, depending on which editor opens the file (PHP editor vs. PHP+HTML editor).
If random BOMs are turning up in your files, it's probably because someone on your team is using a different editor with BOMs on by default (hello, Dreamweaver!). Whenever this person creates a new file, a BOM will be inserted and then preserved by both your editor and theirs (but once one of you removes it, it's going to be fine).
If you're bash-savvy, I'd suggest the following command to prune BOMs. It strips the BOM from the first line of one file, so run it once per file to work in bulk (the \x escapes need GNU sed):
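To see which files actually picked up a BOM, something like the following might help. It's only a rough sketch: the function names has_bom and list_bom_files are made up for illustration, it assumes a head that accepts -c (GNU and BSD both do), and it only looks at *.php files.

```shell
#!/bin/sh
# has_bom FILE (hypothetical helper): succeeds if FILE starts with the
# UTF-8 BOM bytes EF BB BF.
has_bom() {
    # Dump the first three bytes as hex and strip the spacing od adds.
    [ "$(head -c 3 < "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# list_bom_files DIR (hypothetical helper): print every *.php file under DIR
# that starts with a BOM. File names containing newlines are not handled.
list_bom_files() {
    find "$1" -type f -name '*.php' | while IFS= read -r f; do
        if has_bom "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running list_bom_files . from the project root should print the offending files, one per line.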
sed '1 s/\xEF\xBB\xBF//' < input > output
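If you'd rather not depend on GNU sed, here is a rough alternative that rewrites files in place using head, od and tail. Treat it as a sketch under assumptions: strip_bom and strip_bom_tree are names I made up, it only touches *.php files, and you should try it on a copy (or a clean version-control checkout) first.

```shell
#!/bin/sh
# strip_bom FILE (hypothetical helper): remove a leading UTF-8 BOM
# (bytes EF BB BF) from FILE; files without a BOM are left untouched.
strip_bom() {
    bom_first3=$(head -c 3 < "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$bom_first3" = "efbbbf" ]; then
        bom_tmp=$(mktemp) || return 1
        # tail -c +4 copies everything from the fourth byte onward;
        # writing back with cat keeps the original file's permissions.
        tail -c +4 < "$1" > "$bom_tmp" && cat "$bom_tmp" > "$1"
        rm -f "$bom_tmp"
    fi
}

# strip_bom_tree DIR (hypothetical helper): apply strip_bom to every
# *.php file under DIR. File names containing newlines are not handled.
strip_bom_tree() {
    find "$1" -type f -name '*.php' | while IFS= read -r f; do
        strip_bom "$f"
    done
}
```

Calling strip_bom_tree . from the project root would then clean every PHP file in one go.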
In Eclipse, just change the encoding of the file to ISO-8859-1 (right-click the file -> Properties), delete the first three characters (the BOM bytes, which show up as ï»¿), save the file and reopen it.