PHP substr() function that lets you set start and stop points while preserving HTML formatting?
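With the regular substr() function in PHP, you can decide where to "start" cutting the string, as well as set a length. The length is probably used most often, but in this case I need to cut off about 120 characters from the beginning. The problem is that I need to keep the HTML in the string intact and only cut the actual text inside the tags.
I have found a few custom functions for this, but not a single one that lets you set the starting point, i.e. where you want to start cutting the string.
Here is one that I found: Using PHP substr() and strip_tags() while retaining formatting and without breaking HTML
So I basically need a substr() function that works exactly like the original, except that it keeps the formatting.
Any suggestions?
Example content to be modified: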
<p>Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going <a href="#">through the cites</a> of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus</p> <p>Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the <strong>Renaissance</strong>. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.</p>
After cutting 5 characters off the beginning:
<p>ary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going <a href="#">through the cites</a> of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus</p> <p>Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the <strong>Renaissance</strong>. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.</p>
And with 5 characters cut off both the beginning and the end:
<p>ary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going <a href="#">through the cites</a> of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus</p> <p>Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the <strong>Renaissance</strong>. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.1</p>
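You catch my drift?
If the cut lands in the middle of a word, I would prefer that the whole word be removed, but that is not very important.
**Edit:** Fixed quotes.
Solution:
What you are asking involves a lot of complexity (basically, generating a valid HTML subset given a string offset). It would really be better to reformulate the problem in terms of the number of text characters you want to keep, rather than cutting an arbitrary string that happens to contain HTML. Once you do that, the problem becomes easier, because you can use a real HTML parser. You then don't have to worry about: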
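- Accidentally cutting an element in half.
- Accidentally cutting an entity in half.
- Not counting the text inside elements.
- Making sure a character entity counts as a single character.
- Making sure all elements are properly closed.
- Making sure you don't mangle the string because you used substr() on a utf-8 string.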
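To make a few of these pitfalls concrete, here is a small demonstration. The sample string and variable names are made up for illustration, and the mbstring extension is assumed to be available.

```php
<?php
// Illustrative sample only; not taken from the question.
$html = '<p>Caf&eacute; <a href="#">menu</a></p>';

// A byte-based cut can land in the middle of a tag...
$cutTag = substr($html, 0, 20);      // '<p>Caf&eacute; <a hr'

// ...or in the middle of a character entity.
$cutEntity = substr($html, 0, 9);    // '<p>Caf&ea'

// substr() counts bytes, so it can split a multi-byte UTF-8 character.
$broken = substr('née', 0, 2);               // 'n' plus half of 'é'
$intact = mb_substr('née', 0, 2, 'UTF-8');   // 'né'

// The length the reader actually sees: tags stripped, entities decoded.
$visible = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
$visibleLength = mb_strlen($visible, 'UTF-8');
```

The last two lines show why offsets are easier to reason about as counts of visible text characters: the markup and the entity encoding drop out of the arithmetic.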
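It is possible to do this with regular expressions (using the u flag), mb_substr() and a tag stack (I have done it before), but there are many edge cases and you will generally have a hard time.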
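For a sense of what that approach involves, here is a rough sketch. It is not the answer author's implementation: `substr_html_stack()` and its void-element list are hypothetical, and it assumes well-formed, properly nested UTF-8 markup with no `>` inside attribute values, which is exactly the kind of edge case mentioned above.

```php
<?php
// Hypothetical sketch of the "regex + mb_substr() + tag stack" idea.
// Assumes well-formed, properly nested UTF-8 markup.
function substr_html_stack(string $html, int $start, int $length): string
{
    if ($length <= 0) {
        return '';
    }
    $entity = '&(?:#\d+|#x[0-9a-f]+|[a-z][a-z0-9]*);';
    // Split into markup tokens, entity tokens and plain text runs.
    $tokens = preg_split('/(<[^>]+>|' . $entity . ')/iu', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    if ($tokens === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    $void = ['br', 'hr', 'img', 'input', 'link', 'meta', 'wbr'];
    $end = $start + $length;
    $pos = 0;          // text characters consumed so far
    $stack = [];       // open elements as [name, original opening tag]
    $out = '';
    $started = false;  // true once the first kept character is emitted

    foreach ($tokens as $tok) {
        if (preg_match('/^<[^>]+>$/', $tok)) {
            // Markup is only copied once we are inside the kept range.
            if ($started) {
                $out .= $tok;
            }
            if (preg_match('~^<(/?)([a-z][a-z0-9]*)[^>]*?(/?)>$~i', $tok, $m)) {
                $name = strtolower($m[2]);
                if ($m[1] === '/') {
                    array_pop($stack);            // closing tag
                } elseif ($m[3] !== '/' && !in_array($name, $void, true)) {
                    $stack[] = [$name, $tok];     // opening tag
                }
            }
            continue;
        }
        // An entity counts as one character; text is measured in characters.
        $isEntity = (bool) preg_match('/^' . $entity . '$/i', $tok);
        $len = $isEntity ? 1 : mb_strlen($tok, 'UTF-8');
        $from = max($start, $pos);
        $to = min($end, $pos + $len);
        if ($to > $from) {
            if (!$started) {
                // Re-open every element that was already open at $start.
                foreach ($stack as $open) {
                    $out .= $open[1];
                }
                $started = true;
            }
            $out .= $isEntity ? $tok : mb_substr($tok, $from - $pos, $to - $from, 'UTF-8');
        }
        $pos += $len;
        if ($pos >= $end) {
            break;
        }
    }
    if ($started) {
        // Close whatever is still open, innermost first.
        foreach (array_reverse($stack) as $open) {
            $out .= '</' . $open[0] . '>';
        }
    }
    return $out;
}
```

Even this short version has to special-case void elements, self-closing tags and entities, and it still breaks on malformed nesting, comments containing `>`, or script and style content.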
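The DOM solution, however, is fairly simple: walk all the text nodes, count their string lengths, and remove or substring their text content as needed. The code below does this: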
$html = <<<'EOT'
<p>Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going <a href="#">through the cites</a> of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus</p> <p>Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the <strong>Renaissance</strong>. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.</p>
EOT;
function substr_html($html, $start, $length=null, $removeemptyelements=true) {
    if (is_int($length)) {
        if ($length===0) return '';
        $end = $start + $length;
    } else {
        $end = null; // no length given: keep everything from $start onward
    }
    $d = new DOMDocument();
    // wrap the fragment so that libxml parses it as utf-8
    $d->loadHTML('<html><head><meta http-equiv="content-type" content="text/html;charset=utf-8"><title></title></head><body>'.$html.'</body></html>');
    $body = $d->getElementsByTagName('body')->item(0);
    $dxp = new DOMXPath($d);
    $t_start = 0; // text node's start pos relative to all text
    $t_end = null; // text node's end pos relative to all text
    // copy because we may modify result of $textnodes
    // ('.//text()' is relative to $body; a leading '/' would ignore the context node)
    $textnodes = iterator_to_array($dxp->query('.//text()', $body));
    // PHP 5.2 doesn't seem to implement Traversable on DOMNodeList,
    // so `iterator_to_array()` won't work. Use this instead:
    // $textnodelist = $dxp->query('.//text()', $body);
    // $textnodes = array();
    // for ($i = 0; $i < $textnodelist->length; $i++) {
    //     $textnodes[] = $textnodelist->item($i);
    // }
    // unset($textnodelist);
    foreach ($textnodes as $text) {
        $t_end = $t_start + $text->length;
        $parent = $text->parentNode;
        if ($start >= $t_end || ($end!==null && $end <= $t_start)) {
            // this node lies entirely outside the requested range
            $parent->removeChild($text);
        } else {
            $n_offset = max($start - $t_start, 0);
            // substringData() takes a count relative to $n_offset, so subtract it
            $n_length = ($end===null) ? $text->length - $n_offset : $end - $t_start - $n_offset;
            if (!($n_offset===0 && $n_length >= $text->length)) {
                $substr = $text->substringData($n_offset, $n_length);
                if (strlen($substr)) {
                    $text->deleteData(0, $text->length);
                    $text->appendData($substr);
                } else {
                    $parent->removeChild($text);
                }
            }
        }
        // if removing this text emptied the parent of nodes, remove the node!
        // (never remove <body> itself, it is needed for saveHTML() below)
        if ($removeemptyelements && !$parent->isSameNode($body) && !$parent->hasChildNodes()) {
            $parent->parentNode->removeChild($parent);
        }
        $t_start = $t_end;
    }
    unset($textnodes);
    $newstr = $d->saveHTML($body);
    // mb_substr() is to remove <body></body> tags
    return mb_substr($newstr, 6, -7, 'utf-8');
}
echo substr_html($html, 480, 30);
This will output:
<p> of "de Finibus</p> <p>Bonorum et Mal</p>
Note that it is not confused by the fact that your "substring" spans multiple p elements.
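The question also asks for trimming a fixed number of characters from both ends, which needs the total text length up front. One possible way to get it is sketched below. `html_text_length()` is a hypothetical helper, not part of the answer, and it assumes the same `<body>` wrapper trick together with the dom and mbstring extensions.

```php
<?php
// Hypothetical helper: counts the text characters of an HTML fragment
// by letting the DOM parser strip the markup and decode the entities.
function html_text_length(string $html): int
{
    // Collect parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $d = new DOMDocument();
    $d->loadHTML('<html><head><meta http-equiv="content-type" content="text/html;charset=utf-8"><title></title></head><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $d->getElementsByTagName('body')->item(0);
    // textContent concatenates every descendant text node of <body>.
    return mb_strlen($body->textContent, 'UTF-8');
}
```

With `substr_html()` from above in scope, cutting 5 characters off each end could then be written as `substr_html($html, 5, html_text_length($html) - 10)`.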