Issue with new line to <p> function - Knowledgebase


Issue with new line to <p> function

Posted by Aeron, 01-06-2010, 02:12 PM
Hi all, I've been working on deploying a new-line-to-paragraph function, as I want better formatting of my paragraphs than a simple nl2br() function would provide. The code I'm using is as follows: While that works, it also seems to add an extra set of <p> tags in between the paragraphs it creates. So it ends up looking something like this: As far as the source text goes, it only has two \n breaks in it between paragraphs. I guess my question is whether there is a way to edit the function so that it only cuts paragraphs on a double \n\n, rather than on a single \n. Thanks in advance for your help.
Posted by mattle, 01-06-2010, 02:30 PM
This may not be matching the way you think it is... Can you copy in an example haystack, and the results of this: As an aside, depending on your application, you may want to convert \r\n to \n and then \r to \n to accommodate Windows and Mac users. If that were the problem here, however, I would expect that you would not be getting any matches. I checked your regex here: http://www.regextester.com/, against a very simple string and it seems to work as expected...
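The line-ending conversion mentioned above could look roughly like this. It is only a sketch: `normalize_newlines()` is a made-up helper name, not part of the function from the first post.

```php
<?php
// Hypothetical helper (name invented for this example): convert Windows (\r\n)
// and old Mac (\r) line endings to a plain \n.
function normalize_newlines(string $text): string
{
    // str_replace() applies the searches in order, so \r\n is handled first
    // and never turns into two separate newlines.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

$haystack = "one\r\n\r\ntwo\rthree\nfour";
var_dump(normalize_newlines($haystack) === "one\n\ntwo\nthree\nfour"); // bool(true)
```

Running the text through something like this before the regex means the pattern only ever has to deal with \n.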
Posted by Aeron, 01-06-2010, 06:02 PM
Hi Mattle, I'm not familiar with regex at all, and I wasn't able to see what you were trying to do with your response, as your code varied so greatly from the original example I posted. You were using the preg_match() function, while the script I'm using uses the preg_replace() function. Is there a big difference between the two? Would it be possible to show me how your changes would work within the original function I posted? The website this is being used on is at goldenshine.com/blog/. If you view the source, you can see that extraneous paragraph problem. Once again, the original text only has as many line breaks between paragraphs as each of the paragraphs in this comment, so it should be able to parse it without creating that extra <p> tag. So far the only way I've been able to get it to work is to resort to using single line breaks in the raw text I enter into the database, but that's really not ideal.
Posted by mattle, 01-06-2010, 06:19 PM
You should definitely check out the docs here. Basically, preg_replace will find a pattern and make a replacement; preg_match just tests the string for matching occurrences of the pattern and dumps them into the $matches array. I'm not giving you production code here, but rather some debugging code that might help you discover why the regular expression isn't working as desired. The first step to debugging a replace is to find out what it's matching. If it's performing the replacement twice, it stands to reason it's matching something twice... I'm just trying to get a little more perspective on what's going on. Here's a basic example you can use to get an idea of what I'm talking about: This is telling me two things... I have one matching condition (the whole regular expression /a/) and that it matched twice in the string "abcda". By contrast, Tells me that I have two matching expressions (/^ab/ and (b)), and that I matched the first with "ab" and the second with "b". Also, I realized that I erred in my previous post: you should use preg_match_all(). preg_match() will stop after the first successful match, thus negating our ability to debug a regex that is matching more times than we would like!
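A small sketch of the two cases described above, using preg_match_all(). The second pattern, /^a(b)/, is a guess at what was meant by "/^ab/ and (b)"; the variable names are just for illustration.

```php
<?php
// Case 1: one pattern, no capture groups.
// $matches[0] lists every place the whole pattern matched.
preg_match_all('/a/', 'abcda', $matches);
print_r($matches); // $matches[0] contains 'a' twice

// Case 2: a pattern with one capture group (assumed to be /^a(b)/).
// $groups[0] holds the full match, $groups[1] holds what (b) captured.
preg_match_all('/^a(b)/', 'abcda', $groups);
print_r($groups); // $groups[0][0] is 'ab', $groups[1][0] is 'b'
```

The number of entries inside `$matches[0]` is the number of times a preg_replace() with the same pattern would make a replacement.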
Posted by Aeron, 01-06-2010, 06:20 PM
So you need me to run that code on my server and tell you the outcome?
Posted by Aeron, 01-07-2010, 11:51 AM
Sorry, I just don't know enough about this to be able to run with what you posted. If it helps at all, the function parses the text fine if the paragraphs are separated by only one line break. Like so:
This is paragraph 1, consectetur adipiscing elit. Duis in leo metus. Maecenas imperdiet purus sed enim hendrerit sit amet auctor libero dignissim. Donec mollis pellentesque auctor. Vestibulum laoreet ante viverra ipsum tempus gravida interdum erat dictum. Duis gravida mollis tortor.
This is paragraph 2, vitae vestibulum lectus facilisis ac. Nam porttitor sagittis ante, eget tempus nulla faucibus a. Suspendisse non elit elit.
My problem is that if you use the standard double line break to separate paragraphs, it inserts an extra set of <p> tags in there. So I'm guessing the logic needs to execute the paragraph wrap on double line breaks and not only single. If someone could look at the original function and give me a working solution I'd really appreciate it.
Posted by foobic, 01-07-2010, 06:40 PM
The original code does include provision for converting multiple line breaks to a single paragraph separator, so someone (preferably you) is going to need to debug exactly why it doesn't work with the information you're feeding it. I'd suspect some kind of whitespace character between the line breaks, including Windows \r\n line breaks as mattle mentioned earlier. Put the code from post #2 into your function temporarily and see what it spits out. (Edit: $haystack is $string in your function, of course) Last edited by foobic; 01-07-2010 at 06:43 PM.
Posted by Aeron, 01-08-2010, 07:05 PM
Sorry, I just don't think I understand this language well enough to know what to do with that code. I posted it as-is, and the whole top line is commented out because of the hash sign. Can you put the debug code into the code I originally posted, and then I'll run it and give you guys the output? Thanks for your help, and sorry for the lack of comprehension, I know nothing about regex.
Posted by foobic, 01-08-2010, 09:40 PM
The print_r output may appear somewhere unexpected, probably at or near the top of the page, but you should see it in the page source. Try ([\n]{1,}) also, and ([\r\n]{3,}).
Posted by Aeron, 01-08-2010, 10:26 PM
Ok, I've run the script with the different regex combinations, and I've included the results below.
----
([\r\n]{3,}) gives me the following output: Array ( [0] => Array ( [0] => ) [1] => Array ( [0] => ) )
----
([\n]{1,}) gives me the following output: Array ( [0] => Array ( [0] => [1] => ) [1] => Array ( [0] => [1] => ) )
----
([\n]{2,}) gives me the following output: Array ( [0] => Array ( ) [1] => Array ( ) )
----
Does that help tell you anything?
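For comparison, the same three patterns can be run against a short test string. This is only a sketch: the sample text is made up and assumes Windows-style \r\n\r\n between the paragraphs, which would produce the same pattern of results.

```php
<?php
// Made-up sample: two paragraphs separated by a Windows-style double line break.
$string = "Paragraph 1.\r\n\r\nParagraph 2.";

$patterns = ['([\r\n]{3,})', '([\n]{1,})', '([\n]{2,})'];
$counts = [];
foreach ($patterns as $p) {
    // preg_match_all() returns how many times the full pattern matched.
    $counts[$p] = preg_match_all('/' . $p . '/', $string, $m);
    printf("%-14s matches %d time(s)\n", $p, $counts[$p]);
}
```

With this input the first pattern matches once, the second twice, and the third not at all, because each \n is separated from the next by a \r.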
Posted by foobic, 01-09-2010, 09:08 PM
So:
([\n]{2,}) doesn't match at all (no match to \n\n)
([\n]{1,}) matches twice (two matches to \n, but not together)
([\r\n]{3,}) matches once (one match of a mixture of \r and \n, all together)
It looks like \r is your problem. You could probably just replace ([\n]{2,}) with ([\r\n]{3,}) in the original routine, but that would fail for non-Windows systems, so I guess what you really need is both, ie. something like:
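The snippet that followed is not reproduced here. As a rough sketch of one way to cover both Windows and non-Windows input, the function below normalizes line endings first (mattle's earlier suggestion) and then splits on two or more newlines. It is not the code from the original post: the name `nl2p()` and its body are assumptions, and only the `$string` and line_breaks parameters come from the thread.

```php
<?php
// Hypothetical nl2p(): a sketch, not the function from the first post.
function nl2p(string $string, bool $line_breaks = true): string
{
    // Normalize \r\n and lone \r to \n so one pattern covers every platform.
    $string = str_replace(["\r\n", "\r"], "\n", trim($string));

    // Split where two or more newlines occur, allowing stray spaces or tabs
    // on the otherwise blank lines between them.
    $paragraphs = preg_split('/\n(?:[ \t]*\n)+/', $string, -1, PREG_SPLIT_NO_EMPTY);

    $html = '';
    foreach ($paragraphs as $p) {
        $p = trim($p);
        if ($line_breaks) {
            // Remaining single newlines become <br /> inside the paragraph.
            $p = str_replace("\n", "<br />\n", $p);
        }
        $html .= '<p>' . $p . "</p>\n";
    }
    return $html;
}

echo nl2p("First paragraph.\r\n\r\nSecond paragraph,\r\nsecond line.");
```

The sketch does no HTML escaping or tag stripping, so text that is not trusted would still need to go through something like htmlspecialchars() first.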
Posted by tim2718281, 01-10-2010, 01:17 AM
I'm not sure I understand the specification. If the OP were to state the requirement, it would not take long to produce code to do it ... far easier than wrestling with someone else's code that doesn't work. I am guessing the following:
1) Any HTML sequences for paragraph, paragraph end, break, and break end in the input are to be removed
2) If the parameter line_breaks is true, then single newline sequences in the input are to be replaced by <br>
3) The output string is to be divided into HTML paragraphs: <p>part1</p> <p>part2</p> ... etc, where the divisions between parts are represented by one or more newline sequences in the input. The newline sequences are to be discarded.
So now we need to know all the possible newline sequences. Also, is the program required to correctly parse paragraph tags, etc, or specifically the sequence
That is, what should the program do with
and so on
Last edited by tim2718281; 01-10-2010 at 01:30 AM.
Posted by Aeron, 01-10-2010, 03:28 PM
foobic, that last snippet of code worked perfectly, at least as far as I can tell. Thanks for your help, it's much appreciated.
Posted by mattle, 01-11-2010, 12:54 PM
You may want to test foobic's code on a Mac. IIRC, Safari (and all Aqua apps?) are still using just '\r' for line breaks (even though '\n' is used in the BSD-subsystem files). If that's the case, text entered on a Mac may not match the regex given.
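One way to check this without a Mac is to feed the splitter strings with each style of line ending. The sketch below uses PCRE's \R escape, which treats \r\n, \n and \r each as a single line break; the sample strings and labels are made up for the test.

```php
<?php
// Made-up samples, one per line-ending style.
$inputs = [
    'unix'                 => "one\n\ntwo",
    'windows'              => "one\r\n\r\ntwo",
    'old mac'              => "one\r\rtwo",
    'single windows break' => "one\r\ntwo",
];

$parts = [];
foreach ($inputs as $label => $text) {
    // \R{2,} means two or more line breaks of any style.
    // \R consumes \r\n as one unit, so a single Windows break is not counted twice.
    $parts[$label] = preg_split('/\R{2,}/', $text);
    printf("%s: %d part(s)\n", $label, count($parts[$label]));
}
```

The first three inputs split into two parts each, and the last stays in one piece. A plain alternation such as (\r\n|\r|\n){2,} would not behave the same way, since it can backtrack and count the \r and \n of a single Windows break separately.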
