About the Author

Chris Shiflett

Hi, I’m Chris: entrepreneur, community leader, husband, and father. I live and work in Boulder, CO.


PHP Stripping Newlines

If you're picky about the format of your HTML like me, you've most likely noticed that PHP strips newlines that exist immediately after a closing PHP tag. Try the following code:

<table> 
    <tr>
        <td>
            <?php echo 'TEST'?>
        </td>
    </tr>
</table>

You might be surprised to see that this outputs the following:

<table> 
    <tr>
        <td>
            TEST        </td>
    </tr>
</table>

As PHP developers, most of us expect that anything not within a PHP block is left alone. After all, that's supposed to be the point of the opening and closing PHP tags. (This is the only exception of which I'm aware.) The expected output is as follows:

<table> 
    <tr>
        <td>
            TEST
        </td>
    </tr>
</table>

This has been mentioned on the NYPHP mailing list, but I don't remember seeing it discussed anywhere else, and no one could come up with a good reason for the behavior.

There is an explanation offered here in the manual:

Why does PHP do this? Because when formatting normal HTML, this usually makes your life easier because you don't want that newline, but you'd have to create extremely long lines or otherwise make the raw page source unreadable to achieve that effect.

If you're like me, you're probably a bit surprised to see that this annoying behavior is supposedly there to "make your life easier." I can't think of a situation where I would want this behavior. Am I missing something obvious?
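
The behavior can be reproduced without a web server by running a template string through eval() and capturing what it prints. This is only a minimal sketch: the render_template() helper is hypothetical (it is not part of PHP or of the example above), and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: runs a template string through the PHP parser,
// as if it were the contents of a .php file, and returns what it prints.
function render_template(string $template): string
{
    ob_start();
    // The leading closing tag switches eval() from PHP mode to template mode.
    eval('?>' . $template);
    return ob_get_clean();
}

// The table cell from the example, with explicit newlines.
$template = "<td>\n    <?php echo 'TEST'?>\n</td>\n";
$output = render_template($template);

// The newline right after the closing tag is gone, so </td> follows TEST.
var_dump($output === "<td>\n    TEST</td>\n"); // bool(true)
```

Only the newline that immediately follows the closing tag disappears; the newline after the closing td tag is still in the output.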

About this post

PHP Stripping Newlines was posted on Tue, 04 Oct 2005.

28 comments

1.Martel said:

I suppose it can help when you code in an object-oriented manner and you have a lot of classes, each in its own file. Each of those may have a space or line feed after the closing tag. Now imagine that you want to mess with headers in your main file and one of those included files has a space at the end, making the headers impossible to set since output has already started in your includes.

Tue, 04 Oct 2005 at 06:45:37 GMT Link


2.Örjan said:

I've been searching for an explanation for this as well. I just hate it, and it sure does NOT make life easier. I really like Python's print syntax (which automatically adds a line break).

print 'TEST %s' % string

Tue, 04 Oct 2005 at 08:25:03 GMT Link


3.Rob Allen said:

I always thought it was so that trailing newlines at the end of a PHP file that's included in another file wouldn't derail the use of the session_start() or header() functions

(if you see what I mean!)

Tue, 04 Oct 2005 at 09:00:38 GMT Link


4.Chris Shiflett said:

I can see how this might help with a trailing newline (not newlines) in an include, but that seems like a very narrow case that's easy to identify. If PHP were to only strip a newline that occurs immediately prior to the end of a file, I wouldn't mind so much. :-)

Tue, 04 Oct 2005 at 09:05:39 GMT Link
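
The distinction between a trailing newline and trailing newlines can be checked directly. In this sketch, render_template() is a hypothetical helper that evaluates a string as if it were the contents of an included file; PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper: evaluates a string as template code and
// returns whatever it would have sent to the browser.
function render_template(string $template): string
{
    ob_start();
    eval('?>' . $template);
    return ob_get_clean();
}

// A file that ends with a closing tag plus ONE newline prints nothing...
$one = render_template("<?php \$ready = true; ?>\n");

// ...but only that first newline is swallowed; a second one is output.
$two = render_template("<?php \$ready = true; ?>\n\n");

var_dump($one === '');   // bool(true)
var_dump($two === "\n"); // bool(true)
```

So an include that ends with a single newline stays silent, while one that ends with a blank line still produces output before any later header() or session_start() call.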


5.Stan said:

Actually it doesn't always strip the newline. The example you provide doesn't show any problem for me (on any of my PHP4 or PHP5 Windows or Linux based servers).

However, newlines are being stripped when I concatenate strings, such as in the following example:

<p>
<?php
echo 'TEST'.'2';
?>
</p>

So it doesn't happen every time... It's even worse!

Tue, 04 Oct 2005 at 09:10:11 GMT Link


6.Nico Edtinger said:

My PHP doesn't strip the new line. However I would think of something like:

<table>
<tr>
<td>
<?php for($i = 0; $i < 5; ++$i) { ?>
A large block of text.
<?php } ?>
</td>
</tr>
</table>

or other control structures. You don't want to write everything in one big line, but you also don't want newlines "out of nowhere".

The other problem is of course the newline at the end of a file. Vi even complains if the last char is not a newline, at least on Solaris. So it's better to strip that instead of outputting it.

b4n

Tue, 04 Oct 2005 at 10:03:30 GMT Link


7.Paul Gregg said:

The include file is a red herring here. The reason is simply the way we code. Consider the following common PHP structure:

<?php
// Let's do lots of stuff before we start our output
?>
<html><....

Most of us wouldn't expect there to be a newline before the <html>, and if PHP did indeed print it we would be complaining about having to do ?><html>.

It has been this way since PHP/FI 2.0 (IIRC) and I've been happy with it.

Another similar issue exists with HEREDOC - except the newline *before* the close of the heredoc is removed.

e.g.

print <<<END
Test
END
. ":";

Will result in: Test:

I like both ways, and I'd certainly object to any move to change the behaviour now.

Tue, 04 Oct 2005 at 11:06:21 GMT Link
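
The heredoc behavior mentioned above is easy to confirm. A minimal sketch, with variable names chosen only for illustration:

```php
<?php
// The newline between "Test" and the closing END marker is not
// part of the string, so nothing separates "Test" from the colon.
$text = <<<END
Test
END;

$result = $text . ":";

var_dump($result === "Test:"); // bool(true)
```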


8.Örjan said:

Paul, I'm sure everyone here agrees with you on those arguments. But there's no reason to strip out the newline if the <?php starts at a column greater than 0, e.g. (this is a poor example, but I guess it's easy to understand):

<?php
// Do lots of stuff.. no row break here
?>
<?php
// still no headers have been sent here! do more stuff!
?>
<html>
<?php print $tidy->head() ?>
<body>
..
</body>
</html>

Currently, if the output from $tidy->head() doesn't end with a newline, then the start of <body> would continue on the same line. That's not really readable code! I don't see any reason why that new line should be stripped out.

All this gives you other problems. Say you prefer single quotes because the parser doesn't evaluate expressions inside them. Instead of doing this:

<?php print 'some data' . "\n" ?>

this would be sufficient

<?php print 'some data' ?>

Obviously, this should be configurable since there will always be people who like the other way around.

Tue, 04 Oct 2005 at 11:39:33 GMT Link


9.jperkins said:

I completely agree that this feature needs to go away. How do we get that to happen? Campaign for its removal on the internals mailing list?

Tue, 04 Oct 2005 at 12:40:35 GMT Link


10.Paul Gregg said:

jperkins: I don't think that we can, at this point, change this behaviour. Any code which relies on this particular newline output/removal would break.

Take, for example, the ongoing (and repeated) breakage in highlight_file (and the related PHP highlighting functions), where the function removes newlines *after* a heredoc when they actually need to be there for the code to function, e.g.:

$foo = sprintf(<<<EOM
Hi there %s
EOM
, 'jperkins');

That newline after the EOM is crucial. The highlight functions occasionally (and presently in 5.1RC2-dev) remove it, resulting in:

EOM, 'jperkins');

being printed out.

So, if you cut and paste from a highlighted file/source output, you will get broken code.

Take this back into scripts that generate php code, or those producing e.g. CSV format files that are relying on newlines being removed after the ?> in order for the file format to be correct and you can see that you will introduce *huge* BC issues.

Hope this helps.

Paul.

Wed, 05 Oct 2005 at 00:35:02 GMT Link


11.JW said:

First of all, many files contain a newline at the end. When you use Notepad 2 (http://www.flos-freeware.ch/notepad2.html) as a text editor, you can see a newline at the end of each file. And http://pear.php.net/manual/en/standards.file.php (the PEAR Coding Standards) also says you should have a newline at the end of each file.

Besides that, Örjan wrote:

<?php
// Do lots of stuff.. no row break here
?>
<?php
// still no headers has been sent here! do more stuff!
?>
<html>
...

If the newline wasn't deleted, your HTML would contain 2 newlines before the <html> tag.

I think, although not always, this might be useful.

Tue, 11 Oct 2005 at 18:04:47 GMT Link


12.Paul F. De La Cruz said:

I've got a good example of where this lame-brained removal of newlines is bad. It makes me have to go out of my way to make my output work.

I'm sending off an e-mail to people after they've submitted a form. I use a template file with Savant2 along with contents like:

Item Type: <?php echo $this->item_type; ?>
Item Weight: <?php echo $this->item_weight; ?>
...

You'd figure your e-mail would look like:

Item Type: Widget
Item Weight: 2kg

But it doesn't. Because of the removal of the newlines, it looks like this:

Item Type: WidgetItem Weight: 2kg

You can imagine how bad that can get when you're e-mailing a confirmation letter with 20 odd data items in it.

The only way I can get around this due to PHP's lame idea of helping me out, is to either concatenate a newline on purpose in my echo statement or format my template so it deliberately has two newlines in it:

This is stupid:

Item Type: <?php echo $this->item_type . "\n"; ?>
Item Weight: <?php echo $this->item_weight . "\n"; ?>

Or even worse which makes me have to -remember- that it's going to look different (and so it's not really intuitive as someone else might think it would output with a bunch of blank lines):

Item Type: <?php echo $this->item_type; ?>

Item Weight: <?php echo $this->item_weight; ?>

...

So even if I wasn't picky about how my HTML looks when it's rendered by PHP, it still causes real problems with things like text e-mails where it actually matters. HTML doesn't care about the newlines but TEXT does.

Ah well. Just have to live with it I guess.

Wed, 12 Oct 2005 at 23:20:34 GMT Link
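
The e-mail problem and the doubled-newline workaround can be sketched as follows. Everything here is an assumption made for illustration: render_template() is a hypothetical stand-in for a template engine (it is not Savant2), and plain variables such as $item_type replace the $this->item_type properties used in the comment. PHP 7.3 or later is assumed.

```php
<?php
// Hypothetical helper (not Savant2): exposes $vars as local variables,
// runs the template through the PHP parser, and returns the output.
function render_template(string $template, array $vars): string
{
    extract($vars);
    ob_start();
    eval('?>' . $template);
    return ob_get_clean();
}

$vars = ['item_type' => 'Widget', 'item_weight' => '2kg'];

// One newline after the first closing tag: PHP swallows it.
$single = <<<'TPL'
Item Type: <?php echo $item_type; ?>
Item Weight: <?php echo $item_weight; ?>
TPL;

// Two newlines after the first closing tag: one is swallowed, one survives.
$doubled = <<<'TPL'
Item Type: <?php echo $item_type; ?>

Item Weight: <?php echo $item_weight; ?>
TPL;

$joined = render_template($single, $vars);
$separate = render_template($doubled, $vars);

var_dump($joined === "Item Type: WidgetItem Weight: 2kg");     // bool(true)
var_dump($separate === "Item Type: Widget\nItem Weight: 2kg"); // bool(true)
```

The doubled template produces the intended text, at the cost of a source file that looks as if it should print blank lines between the items.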


13.Mike Winger said:

I think that php did it this way so that you could easily put information after whatever you printed.. like.. I do this:

<? print '<li '; if(w/e){ print ' id="menuitem"';} ?> etc...

If print made a new line every time, the HTML for my list items wouldn't be formatted correctly.

Wed, 19 Oct 2005 at 12:57:03 GMT Link


14.JL said:

The reason it strips the newline is to avoid the "Cannot send session cookie - headers already sent" error in the following case:

<? /* some PHP code here */ ?>
<? session_start(); ?>
<html>

If PHP generates a newline prior to the session_start(), then you'll get an error (if you have output_buffering turned off).

This is particularly important issue for include files, since they almost always end with a newline. Those newlines really need to be stripped so that they don't screw up the subsequent session_start().

Sun, 23 Oct 2005 at 19:34:36 GMT Link


15.Peter Odding said:

"Those newlines really need to be stripped so that they don't screw up the subsequent session_start()."

Though I agree, it still doesn't explain why PHP doesn't just strip them at the first php opening tag and the last closing one. Right?

So... I'd say leave the current behavior but create an option that just strips newlines before the first and after the last php tag or doesn't strip any newlines.

Sat, 05 Nov 2005 at 05:12:38 GMT Link


16.Joel Davis said:

I know this isn't exactly germane to the topic, but looking at you guys' code, I thought this would be a good point to tell you that:

<?="Hello World"?>

is the same as if you had written:

<?php echo "Hello World" ?>

it only works with printing out one line of text, works with variables too (<?=$MyVar?>) or any combination acceptable to the "echo" statement

just thought I'd share a lesser-known php shortcut...

PS. It also makes the code more readable since there's less PHP code in the middle of where you're outputting stuff to the user.

Mon, 23 Jan 2006 at 19:47:36 GMT Link
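
The equivalence of the two forms can be checked with a short sketch. The render_template() helper is hypothetical and only exists so the output of each template can be captured; PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper: evaluates a template string and returns its output.
function render_template(string $template): string
{
    ob_start();
    eval('?>' . $template);
    return ob_get_clean();
}

// Short echo tag versus the full echo statement.
$short = render_template('<?="Hello World"?>');
$long  = render_template('<?php echo "Hello World" ?>');

var_dump($short === 'Hello World'); // bool(true)
var_dump($short === $long);         // bool(true)
```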


17.Geert De Deckere said:

This unexpected PHP behaviour is really annoying indeed. It reminds me of MS Word, trying to make your life easier by doing all sorts of things *automatically*.

I had exactly the same problem with output for e-mail as Paul F. De La Cruz.

I worked around it by adding a space after "?>" at the end of each line which is at least a bit more intuitive than echoing "\n" in my opinion.

However, that workaround forced me to switch the "remove trailing white spaces before saving" feature off in my text editor, which is less convenient.

Tue, 21 Mar 2006 at 18:56:32 GMT Link
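
The trailing-space workaround relies on the newline no longer being directly adjacent to the closing tag. A minimal sketch, again using a hypothetical render_template() helper and PHP 7 or later:

```php
<?php
// Hypothetical helper: evaluates a template string and returns its output.
function render_template(string $template): string
{
    ob_start();
    eval('?>' . $template);
    return ob_get_clean();
}

// Note the space between each closing tag and the newline.
$template = "Item Type: <?php echo 'Widget' ?> \nItem Weight: <?php echo '2kg' ?> \n";
$output = render_template($template);

// Both newlines survive; the price is a trailing space on every line.
var_dump($output === "Item Type: Widget \nItem Weight: 2kg \n"); // bool(true)
```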


18.Marco said:

hehe, nice discussion here... I came here via Google :)

Sun, 16 Apr 2006 at 00:51:08 GMT Link


19.Dominik said:

Well I have a suggestion.

We could implement a new PHP tag or extend the current one.

<?php ... ?> - no newline

<?php ... ?-> - inserts newline

That should not be hard to do. And all existing code should work fine.

Sat, 29 Jul 2006 at 14:12:20 GMT Link


20.Vin said:

For some reason, this is pretty high on the Google return for php print newline, but I've noticed that it works perfect as is. Newlines should be specified using \n so you can do something like:

<td><?php print "$thisstuff"; ?></td>
<td><?php print "$thatstuff"; ?></td>
<td><?php print "$morestuff"; ?></td>
<td><?php print "$evenmorestuff"; ?></td>

or any other manner of whatnot without breaking it up all funky.

Sat, 27 Oct 2007 at 05:07:04 GMT Link


21.Beniji said:

Apart from the trailing newline on a file or a here document which is practically never intended to be part of templated output and should indeed be stripped off, I would strongly argue that a general purpose template language should NEVER modify anything outside code delimiters. Thus IMHO PHP's behaviour is incorrect and unnecessary. If people want a newline in the template static content, then it should be left alone.

Its origins presumably lie in the fact PHP was primarily used to generate HTML in which whitespace is supposed to not be significant (although even in HTML it is actually significant, such as within "pre" tags). As other people point out, PHP is often used to generate whitespace sensitive content like emails.

FYI Tomcat (a JSP server) has a stripspaces option in its configuration so people can choose the behaviour they want. For JSP development I always switch it off.

I wish I could switch it off in PHP too.

Tue, 30 Oct 2007 at 19:35:37 GMT Link


22.Buke Beyond said:

I agree it is ridiculous that php is doing this. I am using php for generating commands for other languages, and this is definitely an issue for me.

There needs to be a new switch to shut off this behavior. It also makes sense that it should have been limited to the header sections to correctly solve the problem it was designed to solve.

Fri, 11 Jul 2008 at 12:07:44 GMT Link


23.Sam Souder said:

Just got bit by this at work while writing view templates for e-mails. I spent almost an hour testing various things to try to narrow it down, thinking that it couldn't possibly be PHP mucking with new lines!

I vote for a switch à la Ruby to tell PHP whether or not to preserve the newlines at the end of a block.

Thu, 04 Sep 2008 at 21:13:44 GMT Link


24.Jens Roland said:

I spent my afternoon Googling for solutions for this, and most solutions are either too huge (HTML Tidy, HTML Purifier) or ridiculously lacking. I found one regex-powered snippet that did the job quite decently, and after some heavy nitpicking and rewriting, it works like a charm.

It is *not* foolproof, but it does some pretty advanced HTML parsing & indenting, as well as rudimentary (but far from complete) Javascript indenting.

Optional settings:

* $indent variable to define what type of tabbing to use (tabs/spaces/etc.)

* $no_indent array of tags that you don't want to indent

* $no_linebreak array of tags that you don't want to linebreak, i.e. 'inline' tags

Also, it automatically handles self-closing XHTML tags, stand-alone HTML tags, trailing whitespace and a bunch of obscure special cases.

Anyway, here it is, take it for a spin if you like ;)

<?php
/**
 * Indents and removes blank lines in HTML code
 * Created by Jens Roland, 2009
 * (adapted from a snippet by JonHoo @ http://snippets.dzone.com/)
 *
 * The code is not 100% foolproof, but close enough, and surprisingly fast
 *
 * Known gotchas:
 * - Two closing Javascript brackets on the same line will only count as
 *   one, if there are any non-whitespace characters between them
 * - If a Javascript line containing an opening bracket happens
 *   to have a bracketed expression later in the same line, the next line
 *   will not be indented
 * - If a line contains two opening tags with non-whitespace content
 *   between them, and their corresponding closing tags are on separate
 *   lines below, or don't have non-whitespace content between them,
 *   the two opening tags will only increase the indent by one, but the
 *   two closing tags will decrease the indent by two. The inverse can
 *   also happen
 *
 * A lot more could be done to make this even better, but I'd rather not
 * sacrifice more performance for a detail only source-snoopers will see.
 *
 */
function clean_html_code($uncleanhtml)
{
    // Set wanted indentation
    $indent = "\t";

    // Set tags that should not indent
    $no_indent = array ('html', 'head', 'body', 'script');

    // Set tags that should not linebreak
    $no_linebreak = array ('a', 'b', 'em', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'i', 'span', 'strong', 'title');

    /* STRIP SUPERFLUOUS WHITESPACE */

    // Remove all indentation
    $uncleanhtml = preg_replace("/[\r\n]+[\s\t]+/", "\n", $uncleanhtml);

    // Remove all trailing space
    $uncleanhtml = preg_replace("/[\s\t]+[\r\n]+/", "\n", $uncleanhtml);

    // Remove all blank lines
    $uncleanhtml = preg_replace("/[\r\n]+/", "\n", $uncleanhtml);

    /* INSERT LINE SEPARATORS */

    // Separate 'whitespace-adjacent' tags with newlines, unless they are a pair
    $fixed_uncleanhtml = preg_replace("/>[\s\t]*</", ">\n<", $uncleanhtml);
    $fixed_uncleanhtml = preg_replace("/((<[a-zA-Z]>)|(<[^\/][^>]*[^\/>]>))\n(<\/)/U", "\${1}\${4}", $fixed_uncleanhtml);

    // Separate closing Javascript brackets with newlines
    $fixed_uncleanhtml = preg_replace("/\}[\s\t]*\}/", "}\n}", $fixed_uncleanhtml);

    /* FIX 'HANGING' TAGS */

    // Insert newlines before 'hanging' closing tags (ie. <p>\nSome text</p>\n)
    $fixed_uncleanhtml = preg_replace("/(\n[^<\n]*[^<\n\s\t])[\s\t]*(<\/[^>\n]+>[^\n]*\n)/U", "\${1}\n\${2}", $fixed_uncleanhtml);
    // Insert newlines after 'hanging' opening tags (ie. <p>Some text\n</p>)
    $fixed_uncleanhtml = preg_replace("/((<[a-zA-Z]>)|(<[^\/][^>]*[^\/]>))[\s\t]*([^\s\t(<\/)\n][^(<\/)\n]*\n)/", "\${1}\n\${4}", $fixed_uncleanhtml);

    /* HANDLE THE NO_LINEBREAK LIST */

    // Remove newlines after opening tags from our no_linebreak list (unless they are self-closing)
    $fixed_uncleanhtml = preg_replace("/(<(" . implode('|', $no_linebreak) . ")((\s*>)|(\s[^>]*[^\/]>)))\n/U", "\${1}", $fixed_uncleanhtml);
    // Remove newlines before closing tags from our no_linebreak list
    $fixed_uncleanhtml = preg_replace("/\n(<\/(" . implode('|', $no_linebreak) . ")[\s\t]*>)/U", "\${1}", $fixed_uncleanhtml);

    /* OK, READY TO INDENT */

    $uncleanhtml_array = explode("\n", $fixed_uncleanhtml);

    // Sets no indentation
    $indentlevel = 0;
    foreach ($uncleanhtml_array as $uncleanhtml_key=>$currentuncleanhtml)
    {
        $replaceindent = "";

        // Sets the indentation from current indentlevel
        for ($o = 0; $o < $indentlevel; $o++)
        {
            $replaceindent .= $indent;
        }

        // If self-closing tag, simply apply indent
        if (preg_match("/<(.+)\/>/", $currentuncleanhtml))
        {
            $cleanhtml_array[$uncleanhtml_key] = $replaceindent.$currentuncleanhtml;
        }
        // If doctype declaration, simply apply indent
        else if (preg_match("/<!(.*)>/", $currentuncleanhtml))
        {
            $cleanhtml_array[$uncleanhtml_key] = $replaceindent.$currentuncleanhtml;
        }
        // If opening AND closing tag on same line, simply apply indent
        else if (preg_match("/<[^\/](.*)>/", $currentuncleanhtml) && preg_match("/<\/(.*)>/", $currentuncleanhtml))
        {
            $cleanhtml_array[$uncleanhtml_key] = $replaceindent.$currentuncleanhtml;
        }
        // If closing HTML tag AND not a tag from the no_indent list, or a closing JavaScript bracket (with no opening bracket on the same line), decrease indentation and then apply the new level
        else if ((preg_match("/<\/(.*)>/", $currentuncleanhtml) && !preg_match("/<\/(".implode('|', $no_indent).")((>)|(\s.*>))/", $currentuncleanhtml)) || preg_match("/^\}{1}[^\{]*$/", $currentuncleanhtml))
        {
            $indentlevel--;
            $replaceindent = "";
            for ($o = 0; $o < $indentlevel; $o++)
            {
                $replaceindent .= $indent;
            }

            $cleanhtml_array[$uncleanhtml_key] = $replaceindent.$currentuncleanhtml;
        }
        // If opening HTML tag AND not a stand-alone tag AND not a tag from the no_indent list, or opening JavaScript bracket (with no closing bracket first), increase indentation and then apply new level
        else if ((preg_match("/<[^\/](.*)>/", $currentuncleanhtml) && !preg_match("/<(link|meta|base|br|img|hr)(.*)>/", $currentuncleanhtml) && !preg_match("/<(" . implode('|', $no_indent) . ")((>)|(\s.*>))/", $currentuncleanhtml)) || preg_match("/^[^\{\}]*\{[^\}]*$/", $currentuncleanhtml))
        {
            $cleanhtml_array[$uncleanhtml_key] = $replaceindent.$currentuncleanhtml;

            $indentlevel++;
            $replaceindent = "";
            for ($o = 0; $o < $indentlevel; $o++)
            {
                $replaceindent .= $indent;
            }
        }
        // If both a closing and an opening JavaScript bracket (like in a condensed else clause), decrease indentation on this line only
        else if (preg_match("/^[^\{\}]*\}[^\{\}]*\{[^\{\}]*$/", $currentuncleanhtml))
        {
            $indentlevel--;
            $replaceindent = "";
            for ($o = 0; $o < $indentlevel; $o++)
            {
                $replaceindent .= $indent;
            }

            $cleanhtml_array[$uncleanhtml_key] = $replaceindent.$currentuncleanhtml;

            // Reset indent to previous level
            $indentlevel++;
            $replaceindent .= $indent;
        }
        else
        // Else, only apply indentation
        {
            $cleanhtml_array[$uncleanhtml_key] = $replaceindent.$currentuncleanhtml;
        }
    }

    // Return single string separated by newline
    return implode("\n", $cleanhtml_array);
}
?>

Tue, 13 Jan 2009 at 19:41:14 GMT Link


25.Mal said:

Having used smarty for many years, this has never been a problem for me, but after building a website in a framework that uses PHP as the template/view, I got caught by this feature.

PHP badly needs a flag to stop it messing with newlines, which could be switched on for template directories etc. In my case, I was creating some templates for iCal files, so PHP corrupted the files by incorrectly wrapping the lines.

My solution was to simply echo "\n" at the end of each line, but it looks like a nasty hack.

Mon, 23 Aug 2010 at 21:37:49 GMT Link


26.Chris Rocco said:

PHP does not strip out any newlines in your example. Your problem is that you are expecting:

echo 'TEST';

to output:

TEST<newline>

but it does not. You have to tell PHP that you want a newline there:

echo "TEST\n";

Since you aren't telling PHP to put a newline at the end of the text you told it to output ("TEST"), it appends the next bit of text directly after the text you output via PHP... and makes you "feel" like PHP took a newline away - but it didn't.

Tue, 27 Sep 2011 at 01:58:49 GMT Link


27.Chris Shiflett said:

Sorry, Chris, but you're wrong. I'm not expecting PHP to output a newline. I'm just not expecting PHP to remove a newline that comes after the closing ?>.

Let me know if you still don't understand.

Tue, 27 Sep 2011 at 14:10:00 GMT Link

