Apache Mahout Source Code Reading Notes - DataModel: UserBaseRecommender
    @Test
    public void testHowMany() throws Exception {
      // getDataModel is a test helper (not shown): user IDs plus one row of preference values per user
      DataModel dataModel = getDataModel(
          new long[] {1, 2, 3, 4, 5},
          new Double[][] {
              {0.1, 0.2},
              {0.2, 0.3, 0.3, 0.6},
              {0.4, 0.4, 0.5, 0.9},
              {0.1, 0.4, 0.5, 0.8, 0.9, 1.0},
              {0.2, 0.3, 0.6, 0.7, 0.1, 0.2},
          });

      // Used to find the most similar users, i.e. the neighbourhood users
      UserSimilarity similarity = new PearsonCorrelationSimilarity(dataModel);
      UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, dataModel);
      Recommender recommender = new GenericUserBasedRecommender(dataModel, neighborhood, similarity);

      List<RecommendedItem> fewRecommended = recommender.recommend(1, 2);
      List<RecommendedItem> moreRecommended = recommender.recommend(1, 4);
      for (int i = 0; i < fewRecommended.size(); i++) {
        assertEquals(fewRecommended.get(i).getItemID(), moreRecommended.get(i).getItemID());
      }

      recommender.refresh(null);
      for (int i = 0; i < fewRecommended.size(); i++) {
        assertEquals(fewRecommended.get(i).getItemID(), moreRecommended.get(i).getItemID());
      }
    }
For the similarity computation, see the previous post on PearsonCorrelationSimilarity.
NearestNUserNeighborhood returns the N nearest (most similar) users. How is that implemented?
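As a rough illustration of the "N nearest users" idea, here is a minimal plain-Java sketch. It is not Mahout's code: the class NearestNSketch, the method nearestN, and the map of precomputed similarities are all hypothetical names introduced for this example, and it simply sorts the scores where a real implementation would more likely keep a bounded top-N structure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Mahout.
class NearestNSketch {

    // Returns up to n user IDs, most similar first.
    // similarityToTarget maps each other user's ID to its similarity with the target user.
    static long[] nearestN(int n, Map<Long, Double> similarityToTarget) {
        List<Map.Entry<Long, Double>> scored = new ArrayList<>();
        for (Map.Entry<Long, Double> e : similarityToTarget.entrySet()) {
            // An undefined similarity (NaN) cannot be ranked, so skip it.
            if (!Double.isNaN(e.getValue())) {
                scored.add(e);
            }
        }
        // Sort by similarity, highest first.
        scored.sort(Map.Entry.<Long, Double>comparingByValue().reversed());

        int size = Math.min(n, scored.size());
        long[] result = new long[size];
        for (int i = 0; i < size; i++) {
            result[i] = scored.get(i).getKey();
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, Double> sims = new TreeMap<>();
        sims.put(2L, 0.9);
        sims.put(3L, Double.NaN);
        sims.put(4L, 0.4);
        sims.put(5L, 0.7);
        System.out.println(Arrays.toString(nearestN(2, sims))); // prints [2, 5]
    }
}
```

In the test above, N is 2, so each user's neighbourhood holds at most two other users.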
~/mahout-core/src/main/java/org/apache/mahout/cf/taste/impl/recommender/GenericUserBasedRecommender.java
    @Override
    public List<RecommendedItem> recommend(long userID, int howMany, IDRescorer rescorer) throws TasteException {
      Preconditions.checkArgument(howMany >= 1, "howMany must be at least 1");

      log.debug("Recommending items for user ID '{}'", userID);

      // Use the similarity model to find the N users most similar to this user
      long[] theNeighborhood = neighborhood.getUserNeighborhood(userID);

      if (theNeighborhood.length == 0) {
        return Collections.emptyList();
      }

      // Collect the items that neighbourhood users have rated but the current user has not;
      // this is the candidate pool for recommendations
      FastIDSet allItemIDs = getAllOtherItems(theNeighborhood, userID);

      // From the pool, recommend the top howMany items by the current user's estimated preference
      TopItems.Estimator<Long> estimator = new Estimator(userID, theNeighborhood);

      List<RecommendedItem> topItems = TopItems
          .getTopItems(howMany, allItemIDs.iterator(), rescorer, estimator);

      log.debug("Recommendations are: {}", topItems);
      return topItems;
    }
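The body of getAllOtherItems is not shown in these notes. Based only on how the comment above describes its result, the candidate pool could be built roughly as follows. This is a sketch with ordinary Java collections; CandidatePoolSketch, candidateItems, and the itemsByUser map are hypothetical names, not Mahout types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Mahout.
class CandidatePoolSketch {

    // Union of the neighbours' rated items, minus the items the target user already rated.
    static Set<Long> candidateItems(long[] neighborhood, long userID, Map<Long, Set<Long>> itemsByUser) {
        Set<Long> candidates = new TreeSet<>();
        for (long neighbor : neighborhood) {
            candidates.addAll(itemsByUser.getOrDefault(neighbor, Collections.<Long>emptySet()));
        }
        candidates.removeAll(itemsByUser.getOrDefault(userID, Collections.<Long>emptySet()));
        return candidates;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> itemsByUser = new HashMap<>();
        itemsByUser.put(1L, new TreeSet<>(Arrays.asList(0L, 1L)));
        itemsByUser.put(2L, new TreeSet<>(Arrays.asList(0L, 1L, 2L)));
        itemsByUser.put(3L, new TreeSet<>(Arrays.asList(1L, 3L)));
        // User 1 has rated items 0 and 1; neighbours 2 and 3 add items 2 and 3.
        System.out.println(candidateItems(new long[] {2L, 3L}, 1L, itemsByUser)); // prints [2, 3]
    }
}
```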
The Estimator is implemented as follows:
    private final class Estimator implements TopItems.Estimator<Long> {

      private final long theUserID;
      private final long[] theNeighborhood;

      Estimator(long theUserID, long[] theNeighborhood) {
        this.theUserID = theUserID;
        this.theNeighborhood = theNeighborhood;
      }

      // Delegates to the recommender's doEstimatePreference for each candidate item
      @Override
      public double estimate(Long itemID) throws TasteException {
        return doEstimatePreference(theUserID, theNeighborhood, itemID);
      }
    }
    protected float doEstimatePreference(long theUserID, long[] theNeighborhood, long itemID) throws TasteException {
      // Sum the similar users' preferences for this item, weighted by similarity, then divide by the
      // total similarity; the weighted average is used as the current user's preference for the item
      if (theNeighborhood.length == 0) {
        return Float.NaN;
      }
      DataModel dataModel = getDataModel();
      double preference = 0.0;
      double totalSimilarity = 0.0;
      int count = 0;
      for (long userID : theNeighborhood) {
        if (userID != theUserID) {
          // See GenericItemBasedRecommender.doEstimatePreference() too
          Float pref = dataModel.getPreferenceValue(userID, itemID);
          if (pref != null) {
            double theSimilarity = similarity.userSimilarity(theUserID, userID);
            if (!Double.isNaN(theSimilarity)) {
              preference += theSimilarity * pref;
              totalSimilarity += theSimilarity;
              count++;
            }
          }
        }
      }
      // Throw out the estimate if it was based on no data points, of course, but also if based on
      // just one. This is a bit of a band-aid on the 'stock' item-based algorithm for the moment.
      // The reason is that in this case the estimate is, simply, the user's rating for one item
      // that happened to have a defined similarity. The similarity score doesn't matter, and that
      // seems like a bad situation.
      if (count <= 1) {
        return Float.NaN;
      }
      float estimate = (float) (preference / totalSimilarity);
      if (capper != null) {
        estimate = capper.capEstimate(estimate);
      }
      return estimate;
    }
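To make the arithmetic concrete, the sketch below repeats the same weighted-average logic on plain arrays, without the DataModel lookup or the capper. WeightedEstimateSketch and its parameters are hypothetical names for this example, and the numbers are made up. With similarities 0.8 and 0.2 and neighbour ratings 4.0 and 2.0, the estimate is (0.8 * 4.0 + 0.2 * 2.0) / (0.8 + 0.2), which is about 3.6.

```java
// Hypothetical illustration only; mirrors the arithmetic of doEstimatePreference on plain arrays.
class WeightedEstimateSketch {

    // similarities[i] is the target user's similarity to neighbour i;
    // neighborPrefs[i] is that neighbour's rating of the item, or null if unrated.
    static float estimate(double[] similarities, Float[] neighborPrefs) {
        double preference = 0.0;
        double totalSimilarity = 0.0;
        int count = 0;
        for (int i = 0; i < similarities.length; i++) {
            Float pref = neighborPrefs[i];
            if (pref != null && !Double.isNaN(similarities[i])) {
                preference += similarities[i] * pref;
                totalSimilarity += similarities[i];
                count++;
            }
        }
        // As in the method above, an estimate backed by zero or one data point is discarded.
        if (count <= 1) {
            return Float.NaN;
        }
        return (float) (preference / totalSimilarity);
    }

    public static void main(String[] args) {
        float twoNeighbours = estimate(new double[] {0.8, 0.2}, new Float[] {4.0f, 2.0f});
        float oneNeighbour = estimate(new double[] {0.8}, new Float[] {4.0f});
        System.out.println(twoNeighbours);             // approximately 3.6
        System.out.println(Float.isNaN(oneNeighbour)); // prints true
    }
}
```

Because the single-neighbour case returns NaN, an item rated by only one neighbour never receives an estimate in this scheme.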
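Summary:
1) Find the N users most similar to the current user.
2) From those N users, collect the items the current user has not yet rated.
3) Estimate the current user's preference for each of those items as a similarity-weighted average of the neighbours' ratings.
4) Recommend the howMany items with the highest estimated preference.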
Original post: http://www.cnblogs.com/zhangqingping/p/4118840.html