Anderson

Detect paragraphs and add <p> tags for HTML output - php


<?php
function addParagraphsNew($text)
{
// local variables
$returntext = '';       // modified string to return back to caller
$sections   = array();  // array of text sections returned by preg_split()
$pattern1   = '%        # match: <tag attrib="xyz">contents</tag>
 ^                       # tag must start on the beginning of a line
 (                       # capture whole thing in group 1
   <                     # opening tag starts with left angle bracket
   (\w++)                # capture tag name into group 2
   [^>]*+                # allow any attributes in opening tag
   >                     # opening tag ends with right angle bracket
   .*?                   # lazily grab everything up to closing tag
   </\2>                 # closing tag for one we just opened
 )                       # end capture group 1
 $                       # tag must end on the end of a line
 %smx';                  // s-dot matches newline, m-multiline, x-free-spacing
  
$pattern2   = '%        # match: \n--untagged paragraph--\n
 (?:                     # non-capture group for first alternation. Match either...
   \s*\n\s*+             # a newline and all surrounding whitespace (and discard)
 |                       # or...
   ^                     # the beginning of the string
 )                       # end of first alternation group
 (.+?)                   # capture all text between newlines (or string ends)
 (?:\s+$)?               # clear out any whitespace at end of string
 (?=                     # end of paragraph is position followed by either...
   \s*\n\s*              # a newline with optional surrounding whitespace
 |                       # or...
   $                     # the end of the string
 )                       # end of second alternation group
 %x';                    // x-free-spacing
  
// first split text into tagged portions and untagged portions
// Note that the array returned by preg_split with PREG_SPLIT_DELIM_CAPTURE flag will get one
// extra member for each set of capturing parentheses. In this case, we have two sets; 1 - to
// capture the whole HTML tagged section, and 2 - to capture the tag name (which is needed to
// match the closing tag).
$sections = preg_split($pattern1, $text, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE );
  
// now put it back together, processing only the untagged sections
for ($i = 0; $i < count($sections); $i++) {
     if (preg_match($pattern1, $sections[$i]))
     { // this is a tagged paragraph, don't modify it, just add it (and increment array ptr)
         $returntext .= "\n" . $sections[$i] . "\n";
         $i++; // need to skip over the extra array element for capture group 2
     } else
     { // this is an untagged section. Add paragraph tags around bare paragraphs
         $returntext .= preg_replace($pattern2, "\n<p>$1</p>\n", $sections[$i]);
     }
}
$returntext = preg_replace('/^\s+/', '', $returntext); // clean leading whitespace
$returntext = preg_replace('/\s+$/', '', $returntext); // clean trailing whitespace
return $returntext;
}
  
// Read html file to be processed into $data variable
$data = file_get_contents('test.txt');
echo addParagraphsNew($data);
?>
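
The comment in the function above notes that `preg_split()` with `PREG_SPLIT_DELIM_CAPTURE` returns one extra array member per capture group. A minimal sketch of that behaviour, using a compact form of `$pattern1` and a made-up sample string (both are assumptions for illustration, not part of the original answer):

```php
<?php
// Hypothetical sample input: one tagged block between two untagged lines.
$text = "intro\n<div>kept</div>\noutro";

// Compact form of $pattern1 (same logic, without the x-mode comments).
$pattern = '%^(<(\w++)[^>]*+>.*?</\2>)$%sm';

$parts = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
// $parts is now:
//   [0] "intro\n"            untagged text before the block
//   [1] "<div>kept</div>"    capture group 1 (the whole tagged block)
//   [2] "div"                capture group 2 (the tag name)
//   [3] "\noutro"            untagged text after the block

// Classify each piece the same way the loop in addParagraphsNew() does.
$kinds = array();
for ($i = 0; $i < count($parts); $i++) {
    if (preg_match($pattern, $parts[$i])) {
        $kinds[] = 'tagged';
        $i++; // skip the tag-name element that follows every tagged block
    } else {
        $kinds[] = 'untagged';
    }
}
print_r($kinds);
```

Without the extra `$i++`, the bare tag name (`div` here) would be treated as an untagged section and wrapped in its own `<p>` element.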

Anderson
/**
* Replaces double line-breaks with paragraph elements.
*
* A group of regex replaces used to identify text formatted with newlines and
* replace double line-breaks with HTML paragraph tags. The remaining
* line-breaks after conversion become <br /> tags, unless $br is set to '0'
* or 'false'.
*
* @since 0.71
*
* @param string $pee The text which has to be formatted.
* @param bool $br Optional. If set, this will convert all remaining line-breaks after paragraphing. Default true.
* @return string Text which has been converted into correct paragraph tags.
*/
function wpautop($pee, $br = true) {
    $pre_tags = array();
    if ( trim($pee) === '' )
        return '';
    $pee = $pee . "\n"; // just to make things a little easier, pad the end
    if ( strpos($pee, '<pre') !== false ) {
        $pee_parts = explode( '</pre>', $pee );
        $last_pee = array_pop($pee_parts);
        $pee = '';
        $i = 0;
        foreach ( $pee_parts as $pee_part ) {
            $start = strpos($pee_part, '<pre');
            // Malformed html?
            if ( $start === false ) {
                $pee .= $pee_part;
                continue;
            }
            $name = "<pre wp-pre-tag-$i></pre>";
            $pre_tags[$name] = substr( $pee_part, $start ) . '</pre>';
            $pee .= substr( $pee_part, 0, $start ) . $name;
            $i++;
        }
        $pee .= $last_pee;
    }
    $pee = preg_replace('|<br />\s*<br />|', "\n\n", $pee);
    // Space things out a little
    $allblocks = '(?:table|thead|tfoot|caption|col|colgroup|tbody|tr|td|th|div|dl|dd|dt|ul|ol|li|pre|select|option|form|map|area|blockquote|address|math|style|p|h[1-6]|hr|fieldset|legend|section|article|aside|hgroup|header|footer|nav|figure|figcaption|details|menu|summary)';
    $pee = preg_replace('!(<' . $allblocks . '[^>]*>)!', "\n$1", $pee);
    $pee = preg_replace('!(</' . $allblocks . '>)!', "$1\n\n", $pee);
    $pee = str_replace(array("\r\n", "\r"), "\n", $pee); // cross-platform newlines
    if ( strpos($pee, '<object') !== false ) {
        $pee = preg_replace('|\s*<param([^>]*)>\s*|', "<param$1>", $pee); // no pee inside object/embed
        $pee = preg_replace('|\s*</embed>\s*|', '</embed>', $pee);
    }
    $pee = preg_replace("/\n\n+/", "\n\n", $pee); // take care of duplicates
    // make paragraphs, including one at the end
    $pees = preg_split('/\n\s*\n/', $pee, -1, PREG_SPLIT_NO_EMPTY);
    $pee = '';
    foreach ( $pees as $tinkle )
        $pee .= '<p>' . trim($tinkle, "\n") . "</p>\n";
    $pee = preg_replace('|<p>\s*</p>|', '', $pee); // under certain strange conditions it could create a P of entirely whitespace
    $pee = preg_replace('!<p>([^<]+)</(div|address|form)>!', "<p>$1</p></$2>", $pee);
    $pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee); // don't pee all over a tag
    $pee = preg_replace("|<p>(<li.+?)</p>|", "$1", $pee); // problem with nested lists
    $pee = preg_replace('|<p><blockquote([^>]*)>|i', "<blockquote$1><p>", $pee);
    $pee = str_replace('</blockquote></p>', '</p></blockquote>', $pee);
    $pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)!', "$1", $pee);
    $pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee);
    if ( $br ) {
        $pee = preg_replace_callback('/<(script|style).*?<\/\\1>/s', '_autop_newline_preservation_helper', $pee);
        $pee = preg_replace('|(?<!<br />)\s*\n|', "<br />\n", $pee); // optionally make line breaks
        $pee = str_replace('<WPPreserveNewline />', "\n", $pee);
    }
    $pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*<br />!', "$1", $pee);
    $pee = preg_replace('!<br />(\s*</?(?:p|li|div|dl|dd|dt|th|pre|td|ul|ol)[^>]*>)!', '$1', $pee);
    $pee = preg_replace( "|\n</p>$|", '</p>', $pee );
    if ( !empty($pre_tags) )
        $pee = str_replace(array_keys($pre_tags), array_values($pre_tags), $pee);
    return $pee;
}
/**
 * Newline preservation help function for wpautop
 *
 * @since 3.1.0
 * @access private
 * @param array $matches preg_replace_callback matches array
 * @return string
 */
function _autop_newline_preservation_helper( $matches ) {
    return str_replace("\n", "<WPPreserveNewline />", $matches[0]);
}
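
`wpautop()` above protects `<pre>` blocks by swapping them for placeholders before paragraphing and putting them back at the end. The following is a minimal sketch of that idea only, using `preg_replace_callback()` instead of the `explode()` loop in the original. The helper names `stash_pre_blocks()` and `restore_pre_blocks()` and the `data-stash-N` placeholder are hypothetical and not part of WordPress.

```php
<?php
// Replace every <pre>...</pre> block with a unique placeholder and remember
// the original markup in $stash (placeholder => original block).
function stash_pre_blocks(string $html, array &$stash): string
{
    return preg_replace_callback(
        '~<pre\b[^>]*>.*?</pre>~is',
        function (array $m) use (&$stash): string {
            $key = '<pre data-stash-' . count($stash) . '></pre>';
            $stash[$key] = $m[0];
            return $key;
        },
        $html
    );
}

// Put the original <pre> blocks back in place of their placeholders.
function restore_pre_blocks(string $html, array $stash): string
{
    return str_replace(array_keys($stash), array_values($stash), $html);
}

$stash = array();
$input = "one\n\n<pre>keep\n\nthis</pre>\n\ntwo";
$safe  = stash_pre_blocks($input, $stash);

// Simplified paragraphing step: split on blank lines and wrap each chunk,
// leaving placeholder chunks unwrapped.
$html = '';
foreach (preg_split('/\n\s*\n/', $safe, -1, PREG_SPLIT_NO_EMPTY) as $chunk) {
    $chunk = trim($chunk);
    $html .= isset($stash[$chunk]) ? $chunk . "\n" : '<p>' . $chunk . "</p>\n";
}
$html = restore_pre_blocks($html, $stash);
echo $html;
```

The blank line inside the `<pre>` block survives because the splitter only ever sees the placeholder, never the block's contents.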
Anderson
function nl2p($string, $line_breaks = true, $xml = true) {
    // Strip existing paragraph and line-break tags so they are not doubled
    $string = str_replace(array('<p>', '</p>', '<br>', '<br />'), '', $string);
    // It is conceivable that people might still want single line-breaks
    // without breaking into a new paragraph.
    if ($line_breaks == true)
        // blank lines start a new paragraph; single newlines become <br> tags
        return '<p>'.preg_replace(array("/([\n]{2,})/i", "/([^>])\n([^<])/i"), array("</p>\n<p>", '$1<br'.($xml == true ? ' /' : '').'>$2'), trim($string)).'</p>';
    else
        // every newline, single or repeated, starts a new paragraph
        return '<p>'.preg_replace(
        array("/([\n]{2,})/i", "/([\r\n]{3,})/i", "/([^>])\n([^<])/i"),
        array("</p>\n<p>", "</p>\n<p>", "$1</p>\n<p>$2"),
        trim($string)).'</p>';
}
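
The two patterns in `nl2p()` above are applied in order, and the order matters. A short sketch of the `$line_breaks = true` path on a made-up sample string (the input is an assumption for illustration):

```php
<?php
$text = "line one\nline two\n\nnext paragraph";

$html = '<p>' . preg_replace(
    array(
        "/\n{2,}/",          // step 1: a run of two or more newlines ends a paragraph
        "/([^>])\n([^<])/",  // step 2: a remaining single newline becomes <br />
    ),
    array("</p>\n<p>", '$1<br />$2'),
    trim($text)
) . '</p>';

echo $html;
// <p>line one<br />line two</p>
// <p>next paragraph</p>
```

Step 2 leaves the newline that step 1 inserted between `</p>` and `<p>` alone, because that newline is preceded by `>` and followed by `<`.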
Anderson
function nl2p($str) {
    $arr=explode("\n",$str);
    $out='';
    for($i=0;$i<count($arr);$i++) {
        if(strlen(trim($arr[$i]))>0)
            $out.='<p>'.trim($arr[$i]).'</p>';
    }
    return $out;
}
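
The `explode("\n", ...)` version above turns every non-empty line into its own paragraph and does not touch `\r\n` line endings or HTML special characters. As a possible variant (a sketch, not taken from any of the answers), the hypothetical function `text_to_paragraphs()` below normalizes line endings, splits on blank lines, escapes the text, and uses PHP's built-in `nl2br()` for single newlines:

```php
<?php
// Hypothetical helper: converts plain text to <p> blocks.
function text_to_paragraphs(string $text): string
{
    // Normalize Windows and old Mac line endings to "\n".
    $text = str_replace(array("\r\n", "\r"), "\n", trim($text));
    if ($text === '') {
        return '';
    }

    $out = array();
    // A blank line (optionally containing whitespace) separates paragraphs.
    foreach (preg_split('/\n\s*\n/', $text, -1, PREG_SPLIT_NO_EMPTY) as $block) {
        $safe  = htmlspecialchars($block, ENT_QUOTES, 'UTF-8'); // escape &, <, >, quotes
        $out[] = '<p>' . nl2br($safe) . '</p>';                 // single newlines become <br />
    }
    return implode("\n", $out);
}

echo text_to_paragraphs("First line\nsecond line\r\n\r\nA & B");
```

Escaping assumes the input is plain text; if the input may already contain HTML that should be preserved, the `htmlspecialchars()` call would have to be dropped.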