NBC Naive Bayes Classifier: Machine Learning in Action, Python Code
# -*- coding: utf-8 -*-
"""
Created on Mon Aug 07 23:40:13 2017

@author: mdz
"""
import numpy as np


def loadData():
    vocabList = [['my', 'dog', 'has', 'flea', 'problems', 'help', 'please'],
                 ['maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid'],
                 ['my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him'],
                 ['stop', 'posting', 'stupid', 'worthless', 'garbage'],
                 ['mr', 'licks', 'ate', 'my', 'steak', 'how', 'to', 'stop', 'him'],
                 ['quit', 'buying', 'worthless', 'dog', 'food', 'stupid']]
    classList = [0, 1, 0, 1, 0, 1]  # 1 = abusive text, 0 = normal speech
    return vocabList, classList


# Collect the unique words from the already-tokenized sentences in vocabList
# and return them as a list. The length of that list is the number of features.
def filterVocabList(vocabList):
    vocabSet = set()
    for document in vocabList:
        vocabSet = vocabSet | set(document)
    return list(vocabSet)


# Convert a list of words into a 0/1 vector over the vocabulary
def zero_one(vocabList, inputWords):
    returnVec = [0] * len(vocabList)
    for word in inputWords:
        if word in vocabList:
            returnVec[vocabList.index(word)] = 1
        else:
            print("the word: %s is not in my Vocabulary!" % word)
    return returnVec


def trainNbc(trainSamples, trainCategory):
    numTrainSamp = len(trainSamples)
    numWords = len(trainSamples[0])
    pAbusive = sum(trainCategory) / float(numTrainSamp)
    # Per-word counts for y=0 and y=1, starting at 1 (add-one smoothing)
    p0Num = np.ones(numWords)
    p1Num = np.ones(numWords)
    # Total word counts for y=0 and y=1, starting at the vocabulary size
    p0NumTotal = numWords
    p1NumTotal = numWords
    for i in range(numTrainSamp):
        # The original accumulated class 1 into p0Num and class 0 into p1Num,
        # then compensated by swapping the arguments at the call site.
        if trainCategory[i] == 1:
            p1Num += trainSamples[i]
            p1NumTotal += sum(trainSamples[i])
        else:
            p0Num += trainSamples[i]
            p0NumTotal += sum(trainSamples[i])
    p1Vec = np.log(p1Num / p1NumTotal)
    p0Vec = np.log(p0Num / p0NumTotal)
    return p1Vec, p0Vec, pAbusive


def classifyOfNbc(testSamples, p1Vec, p0Vec, pAbusive):
    # Compare log posteriors (up to a shared constant) for the two classes
    p1 = sum(testSamples * p1Vec) + np.log(pAbusive)
    p0 = sum(testSamples * p0Vec) + np.log(1 - pAbusive)
    if p1 > p0:
        return 1
    else:
        return 0


def testingNbc():
    vocabList, classList = loadData()
    vocabSet = filterVocabList(vocabList)
    trainList = []
    for term in vocabList:
        trainList.append(zero_one(vocabSet, term))
    p1Vec, p0Vec, pAbusive = trainNbc(np.array(trainList), np.array(classList))
    testEntry = ['love', 'my', 'daughter']
    testSamples = np.array(zero_one(vocabSet, testEntry))
    print(testEntry, 'classified as :', classifyOfNbc(testSamples, p1Vec, p0Vec, pAbusive))
    testEntry = ['stupid', 'garbage']
    testSamples = np.array(zero_one(vocabSet, testEntry))
    print(testEntry, 'classified as :', classifyOfNbc(testSamples, p1Vec, p0Vec, pAbusive))


if __name__ == '__main__':
    testingNbc()
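The training code above starts every word count at 1 and compares sums of logarithms instead of products of probabilities. The sketch below illustrates a likely motivation for both choices. The counts and the probability value 1e-3 are made up for illustration and are not taken from the training data above.

```python
import numpy as np

# Hypothetical word counts for one class over a 4-word vocabulary;
# the third word never appears in this class.
counts = np.array([3.0, 1.0, 0.0, 2.0])
vocab_size = len(counts)

# Without smoothing, the unseen word gets probability 0, and log(0) is -inf.
raw = counts / counts.sum()

# Add-one smoothing: add 1 to every count and the vocabulary size to the
# denominator, mirroring the np.ones(...) and numWords initial values.
smoothed = (counts + 1) / (counts.sum() + vocab_size)

# Multiplying many small probabilities underflows to 0.0 in float64,
# while the equivalent sum of logs stays finite and comparable.
tiny = np.full(500, 1e-3)
product = np.prod(tiny)
log_sum = np.sum(np.log(tiny))

print("raw:", raw)
print("smoothed:", smoothed)
print("product:", product)
print("log_sum:", log_sum)
```

Because the logarithm is monotonic, comparing the two log sums picks the same class as comparing the two products would, without the underflow.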
Original: http://www.cnblogs.com/mdz-great-world/p/7308210.html