PHP word indexing, performance, and reasonable results

I'm currently developing an indexer for a search feature. The indexer will process data from "fields".
The fields look like this:

  Field_id   Field_type   Field_name   Field_Data
- 101        text         Name         Intel i7
- 102        integer      Cores        4 physical, 4 virtual
- 103        select       Vendor       Intel
- 104        multitext    Description  The i7 is intel's next gen range of cpus.

The indexer would generate the following results/index:

  Keyword    Occurrences
- intel      101, 103, 104
- i7         101, 104
- physical   102
- virtual    102
- next       104
- gen        104
- range      104
- cpus       104   (*)
- cpu        104   (*)
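
The mapping above can be sketched in plain PHP. This is only an illustration: the function name `buildIndex`, the array layout, and the tokenising rules (lower-casing, dropping a possessive "'s", skipping purely numeric tokens) are assumptions, not the asker's actual code. Stop-word removal and singular/plural handling are deliberately left out.

```php
<?php
// Hypothetical sketch: build a keyword => field IDs index from field data.

/**
 * @param array<int, string> $fields field_id => field data
 * @return array<string, int[]> keyword => list of field IDs containing it
 */
function buildIndex(array $fields): array
{
    $index = [];
    foreach ($fields as $fieldId => $data) {
        // Lower-case and drop a possessive "'s" so "intel's" becomes "intel".
        $text = preg_replace("/'s\b/", '', strtolower($data));
        // Split on anything that is not a letter or a digit.
        $tokens = preg_split('/[^a-z0-9]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
        foreach (array_unique($tokens) as $token) {
            if (ctype_digit($token)) {
                continue; // skip purely numeric tokens such as "4"
            }
            $index[$token][] = $fieldId;
        }
    }
    return $index;
}

$fields = [
    101 => 'Intel i7',
    102 => '4 physical, 4 virtual',
    103 => 'Intel',
    104 => "The i7 is intel's next gen range of cpus.",
];

$index = buildIndex($fields);
print_r($index['intel']); // field IDs 101, 103 and 104
```

Without a stop-word step, words such as "the" and "of" still end up in this index.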

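So it looks fairly good so far, but there are some issues I'd like to sort out:

> Filtering out common words (as you may have noticed, "the", "is", "of", and "intel's" are missing from the list)
> Regarding "cpus" (plural vs. singular): is it best to store one specific form (singular or plural), both, or the exact word (i.e. treat "cpus" as different from "cpu")?
> Continuing from the previous item, how can I determine the plural (the different flavours: test => tests, fish => fish, and leaf => leaves)?
> I'm currently using MySQL and I'm very concerned about performance; we have 500 categories and we haven't even launched the site
> Suppose I want to use the search term "vendor:intel", where vendor specifies the field name (field_name); do you think that would have a big impact on the SQL server?
> Search throttling; I don't like this at all, but it is a possibility. If you know of any workaround, let's hear it!
> There are some other issues I have probably forgotten; if you spot any, you're welcome to yell at me ;-)
> I don't need the search engine to crawl links; in fact, I specifically want it not to crawl links.

(By the way, I'm not biased towards Intel; it just so happens that I own an i7-based computer ;-))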

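Solution:

This is a response to your original question, as well as to your later answer/question.

I've used the Sphinx search engine before (a long time ago, so I'm a bit rusty) and found it very good, even if the documentation is sometimes a bit lacking.

I'm sure there are other ways to do this, whether with your own custom code or with another search engine; Sphinx just happens to be the one I've used. I'm not saying it will do everything you want exactly the way you want it, but I'm reasonably confident it will do most of it quite easily, and much faster than anything written in PHP/MySQL alone.

I'd recommend reading Build a custom search engine with PHP before diving into the Sphinx documentation. If you decide after reading it that Sphinx isn't a good fit, fair enough.

In answer to your specific questions, I've put together some links from the documentation along with some relevant quotes:

Filtering out common words (as you may have noticed, "the", "is", "of", and "intel's" are missing from the list)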

11.2.8. stopwords

Stopwords are the words that will not
be indexed. Typically you’d put most
frequent words in the stopwords list
because they do not add much value to
search results but consume a lot of
resources to process.
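
Sphinx applies its stop-word list itself at indexing time. To illustrate the idea in plain PHP, here is a minimal sketch; the function name `removeStopwords` and the three-word list are assumptions chosen to match the example data.

```php
<?php
// Hypothetical sketch: drop stop words from a token list before indexing.

/**
 * @param string[] $tokens    lower-cased tokens
 * @param string[] $stopwords lower-cased words to exclude
 * @return string[] tokens that are not stop words, re-indexed from 0
 */
function removeStopwords(array $tokens, array $stopwords): array
{
    // Flip to a hash so each lookup is a key check rather than a list scan.
    $lookup = array_flip($stopwords);
    return array_values(array_filter(
        $tokens,
        fn (string $token): bool => !isset($lookup[$token])
    ));
}

$stopwords = ['the', 'is', 'of'];
$tokens = ['the', 'i7', 'is', 'intel', 'next', 'gen', 'range', 'of', 'cpus'];

$kept = removeStopwords($tokens, $stopwords);
echo implode(' ', $kept), PHP_EOL; // i7 intel next gen range cpus
```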

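Regarding "cpus" (plural vs. singular): is it best to store one specific form (singular or plural), both, or the exact word (i.e. treat "cpus" as different from "cpu")?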

11.2.9. wordforms

Word forms are applied after
tokenizing the incoming text by
charset_table rules. They essentially
let you replace one word with another.
Normally, that would be used to bring
different word forms to a single
normal form (eg. to normalize all the
variants such as “walks”, “walked”,
“walking” to the normal form “walk”).
It can also be used to implement
stemming exceptions, because stemming
is not applied to words found in the
forms list.
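
The replacement the quote describes amounts to a dictionary lookup. The sketch below shows that idea in plain PHP; `applyWordForms` and the sample map are hypothetical, and this is not how Sphinx implements it internally.

```php
<?php
// Hypothetical sketch of the word-forms idea: map variants to one normal form.

/**
 * @param string[]              $tokens lower-cased tokens
 * @param array<string, string> $forms  variant => normal form
 * @return string[] tokens with known variants replaced
 */
function applyWordForms(array $tokens, array $forms): array
{
    return array_map(
        // Fall back to the token itself when it has no entry in the map.
        fn (string $token): string => $forms[$token] ?? $token,
        $tokens
    );
}

$forms = [
    'walks'   => 'walk',
    'walked'  => 'walk',
    'walking' => 'walk',
    'cpus'    => 'cpu',
    'leaves'  => 'leaf',
];

$normalised = applyWordForms(['walking', 'cpus', 'leaves', 'fish'], $forms);
echo implode(' ', $normalised), PHP_EOL; // walk cpu leaf fish
```

An explicit map like this is one way to cover irregular plurals such as leaf/leaves, which simple suffix stripping does not unify.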

Continuing from the previous item, how can I determine the plural (the different flavours: test => tests, fish => fish, and leaf => leaves)?

Sphinx supports the Porter Stemming Algorithm

The Porter stemming algorithm (or
‘Porter stemmer’) is a process for
removing the commoner morphological
and inflexional endings from words in
English. Its main use is as part of a
term normalisation process that is
usually done when setting up
Information Retrieval systems.
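
To give a feel for what the stemmer does with plurals, here is a sketch of step 1a of the Porter algorithm only (the rules SSES -> SS, IES -> I, SS -> SS, S -> nothing). The later steps are omitted, so this is not a complete stemmer, and the function name `porterStep1a` is made up for the example.

```php
<?php
// Sketch of Porter step 1a only (plural endings); not a full stemmer.

function porterStep1a(string $word): string
{
    if (substr($word, -4) === 'sses') {
        return substr($word, 0, -2); // caresses -> caress
    }
    if (substr($word, -3) === 'ies') {
        return substr($word, 0, -2); // ponies -> poni
    }
    if (substr($word, -2) === 'ss') {
        return $word;                // caress -> caress
    }
    if (substr($word, -1) === 's') {
        return substr($word, 0, -1); // cats -> cat
    }
    return $word;
}

foreach (['cpus', 'tests', 'fish', 'leaves'] as $word) {
    echo $word, ' => ', porterStep1a($word), PHP_EOL;
}
```

Note that this step turns "leaves" into "leave", not "leaf", so an irregular plural like that still needs an explicit word-forms entry.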

Suppose I want to use the search term "vendor:intel", where vendor specifies the field name (field_name); do you think that would have a big impact on the SQL server?

3.2. Attributes

A good example for attributes would be
a forum posts table. Assume that only
title and content fields need to be
full-text searchable – but that
sometimes it is also required to limit
search to a certain author or a
sub-forum (ie. search only those rows
that have some specific values of
author_id or forum_id columns in the
SQL table); or to sort matches by
post_date column; or to group matching
posts by month of the post_date and
calculate per-group match counts.

This can be achieved by specifying all
the mentioned columns (excluding title
and content, that are full-text
fields) as attributes, indexing them,
and then using API calls to setup
filtering, sorting, and grouping.

You can also use the 5.3. Extended query syntax to search specific fields (rather than filtering results by attribute):

field search operator:
@vendor intel
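
A user-facing "vendor:intel" term could be rewritten into that operator before the query is sent. The sketch below is one possible approach; `toExtendedQuery` and the field whitelist are assumptions. Plain terms are placed first because a field operator applies to the keywords that follow it. Escaping of other query-syntax characters in user input is not handled here.

```php
<?php
// Hypothetical sketch: rewrite "field:term" tokens into "@field term" form.

/**
 * @param string   $input         raw user query, e.g. "vendor:intel i7"
 * @param string[] $allowedFields field names that may be searched
 */
function toExtendedQuery(string $input, array $allowedFields): string
{
    $plain = [];
    $fielded = [];
    foreach (preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY) as $token) {
        if (preg_match('/^(\w+):(\w+)$/', $token, $m)
            && in_array($m[1], $allowedFields, true)) {
            $fielded[] = "@{$m[1]} {$m[2]}";
        } else {
            $plain[] = $token; // unknown prefixes are passed through untouched
        }
    }
    // Unrestricted terms go first so no field operator precedes them.
    return implode(' ', array_merge($plain, $fielded));
}

echo toExtendedQuery('vendor:intel', ['vendor', 'name']), PHP_EOL; // @vendor intel
```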

How would a search engine index a set of fields and tie the phrases/keywords/etc. it finds to a specific field ID?

8.6.1. Query

On success, Query() returns a result set that contains some of the found matches (as requested by SetLimits()) and additional general per-query statistics.

The result set is a hash (PHP specific; other languages might utilize other structures instead of hash) with the following keys and values:

“matches”:
Hash which maps found document IDs to another small hash containing document weight and attribute values (or an array of the similar small hashes if SetArrayResult() was enabled).

“total”:
Total amount of matches retrieved on server (ie. to the server side result set) by this query. You can retrieve up to this amount of matches from server for this query text with current query settings.

“total_found”:
Total amount of matching documents in index (that were found and processed on server).

“words”:
Hash which maps query keywords (case-folded, stemmed, and otherwise processed) to a small hash with per-keyword statistics (“docs”, “hits”).

“error”:
Query error message reported by searchd (string, human readable). Empty if there were no errors.

“warning”:
Query warning message reported by searchd (string, human readable). Empty if there were no warnings.
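
If each field row is indexed as one document whose ID is the field ID, the keys of "matches" are the field IDs you are after. The sketch below reads a result shaped like the description above. The sample array is made up, the "weight" and "attrs" keys inside each match are assumptions about the small per-document hash, and treating a `false` return as failure is likewise an assumption.

```php
<?php
// Hypothetical sketch: pull document (field) IDs out of a Query()-style result.

/**
 * @param array|false $result value returned by Query()
 * @return int[] matching document IDs in the order the server returned them
 */
function matchedIds($result): array
{
    // Treat a failed call, a reported error, or no matches as "nothing found".
    if ($result === false || !empty($result['error']) || empty($result['matches'])) {
        return [];
    }
    return array_keys($result['matches']);
}

// Made-up sample in the shape described above (SetArrayResult() not enabled).
$sample = [
    'error'       => '',
    'warning'     => '',
    'total'       => 2,
    'total_found' => 2,
    'matches'     => [
        101 => ['weight' => 2, 'attrs' => []],
        104 => ['weight' => 3, 'attrs' => []],
    ],
    'words'       => ['intel' => ['docs' => 2, 'hits' => 2]],
];

print_r(matchedIds($sample)); // document IDs 101 and 104
```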

See also Listing 11 and Listing 13.
