# algorithm - Bayes Explained - A Simple Explanation of Naive Bayes Classification

## Bayes classifier answers (4)

Ram Narasimhan explains the concept very well below; here is an alternative explanation, via a walk-through of working Naive Bayes code.

``````
Age,Income,Student,Creadit_Rating,Buys_Computer
<=30,high,no,fair,no
<=30,high,no,excellent,no
31-40,high,no,fair,yes
>40,medium,no,fair,yes
>40,low,yes,fair,yes
>40,low,yes,excellent,no
31-40,low,yes,excellent,yes
<=30,medium,no,fair,no
<=30,low,yes,fair,yes
>40,medium,yes,fair,yes
<=30,medium,yes,excellent,yes
31-40,medium,no,excellent,yes
31-40,high,yes,fair,yes
>40,medium,no,excellent,no
``````

``````
import pandas as pd
import pprint
from functools import reduce  # reduce is no longer a builtin in Python 3

class Classifier():
    data = None
    class_attr = None
    priori = {}
    cp = {}
    hypothesis = None

    def __init__(self, filename=None, class_attr=None):
        # filename is the CSV shown above, saved locally
        self.data = pd.read_csv(filename, sep=',', header=0)
        self.class_attr = class_attr

    '''
    probability(class) =   how many times it appears in the column
                           __________________________________________
                             count of all class attribute values
    '''
    def calculate_priori(self):
        class_values = list(set(self.data[self.class_attr]))
        class_data = list(self.data[self.class_attr])
        for i in class_values:
            self.priori[i] = class_data.count(i) / float(len(class_data))
        print("Priori Values: ", self.priori)

    '''
    Here we calculate the individual probabilities:
    P(outcome|evidence) =   P(likelihood of evidence) x prior prob of outcome
                            _________________________________________________
                                              P(evidence)
    '''
    def get_cp(self, attr, attr_type, class_value):
        data_attr = list(self.data[attr])
        class_data = list(self.data[self.class_attr])
        total = 1  # start at 1 (add-one smoothing) so no probability is ever 0
        for i in range(0, len(data_attr)):
            if class_data[i] == class_value and data_attr[i] == attr_type:
                total += 1
        return total / float(class_data.count(class_value))

    '''
    Here we calculate the likelihood of the evidence and multiply all the
    individual probabilities with the priori:
    P(Outcome|Multiple Evidence) = P(Evidence1|Outcome) x P(Evidence2|Outcome) x ... x P(EvidenceN|Outcome) x P(Outcome)
                                   scaled by P(Multiple Evidence)
    '''
    def calculate_conditional_probabilities(self, hypothesis):
        for i in self.priori:
            self.cp[i] = {}
            for j in hypothesis:
                self.cp[i].update({hypothesis[j]: self.get_cp(j, hypothesis[j], i)})
        print("\nCalculated Conditional Probabilities: \n")
        pprint.pprint(self.cp)

    def classify(self):
        print("Result: ")
        for i in self.cp:
            print(i, " ==> ", reduce(lambda x, y: x * y, self.cp[i].values()) * self.priori[i])

if __name__ == "__main__":
    c = Classifier(filename="new_dataset.csv", class_attr="Buys_Computer")
    c.calculate_priori()
    c.hypothesis = {"Age": '<=30', "Income": "medium", "Student": 'yes', "Creadit_Rating": 'fair'}
    c.calculate_conditional_probabilities(c.hypothesis)
    c.classify()
``````

``````
Priori Values:  {'yes': 0.6428571428571429, 'no': 0.35714285714285715}

Calculated Conditional Probabilities:

{'no': {'<=30': 0.8,
        'fair': 0.6,
        'medium': 0.6,
        'yes': 0.4},
 'yes': {'<=30': 0.3333333333333333,
         'fair': 0.7777777777777778,
         'medium': 0.5555555555555556,
         'yes': 0.7777777777777778}}

Result:
yes  ==>  0.0720164609053
no  ==>  0.0411428571429
``````
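The priori values in the output above can be cross-checked with a few lines of standard-library Python; the inline list below stands in for the `Buys_Computer` column of the CSV:

```python
from collections import Counter

# The Buys_Computer column from the 14-row dataset above: 9 "yes", 5 "no".
labels = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]

# Prior probability of each class = its relative frequency in the column.
priori = {cls: n / len(labels) for cls, n in Counter(labels).items()}
print(priori)  # {'no': 0.35714285714285715, 'yes': 0.6428571428571429}
```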

``````
training set---
round-red
round-orange
oblong-yellow
round-red

dataset----
round-red
round-orange
round-red
round-orange
oblong-yellow
round-red
round-orange
oblong-yellow
oblong-yellow
round-red
``````

``````
Problem: Find out the probability that the player plays when it is Rainy.

P(Yes|Rainy) = P(Rainy|Yes) * P(Yes) / P(Rainy)

P(Rainy|Yes) = 2/9 = 0.222
P(Yes) = 9/14 = 0.64
P(Rainy) = 5/14 = 0.36

Now, P(Yes|Rainy) = 0.222*0.64/0.36 = 0.39, which is a low probability, so the chances that the match is played are low.
``````
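Worked exactly, the fraction above simplifies nicely (the 0.39 in the block comes from rounding the intermediate values first):

```python
# P(Yes|Rainy) = P(Rainy|Yes) * P(Yes) / P(Rainy)
p_rainy_given_yes = 2 / 9   # 2 of the 9 "played" days were rainy
p_yes = 9 / 14              # the match was played on 9 of 14 days
p_rainy = 5 / 14            # 5 of the 14 days were rainy

# Algebraically (2/9 * 9/14) / (5/14) = 2/5 = 0.4 exactly.
p_yes_given_rainy = p_rainy_given_yes * p_yes / p_rainy
print(round(p_yes_given_rainy, 2))  # 0.4
```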

k-NN and Naive Bayes are both classification algorithms. Conceptually, k-NN uses the idea of "nearness" to classify new entities; in k-NN, "nearness" is modeled with notions like Euclidean distance or cosine distance. By contrast, Naive Bayes uses the concept of "probability" to classify new entities.

### First, Conditional Probability and Bayes' Rule

The probability of (Democrat and Female senator) = Prob(the senator is a Democrat) multiplied by the conditional probability of being Female given that they are a Democrat:

``````
  P(Democrat & Female) = P(Democrat) * P(Female | Democrat)
``````

We could compute the exact same thing the other way around:

``````
  P(Democrat & Female) = P(Female) * P(Democrat | Female)
``````
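Both factorizations give the same joint probability, which is easy to check numerically. The senator counts below are made up purely for illustration:

```python
# Hypothetical chamber: 100 senators, 48 Democrats, 30 women,
# and 24 senators who are both Democrat and female.
total, democrats, females, dem_females = 100, 48, 30, 24

p_democrat = democrats / total
p_female = females / total
p_female_given_democrat = dem_females / democrats
p_democrat_given_female = dem_females / females

# P(Democrat & Female) computed both ways, plus directly from the counts.
via_democrat = p_democrat * p_female_given_democrat
via_female = p_female * p_democrat_given_female
assert abs(via_democrat - via_female) < 1e-12
assert abs(via_democrat - dem_females / total) < 1e-12
print(via_democrat)  # 0.24
```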

### Understanding Bayes' Rule

P(Outcome given some Evidence that we know) = P(that Evidence given the Outcome) times Prob(Outcome), scaled by P(Evidence):

``````
Probability of Disease D given Test-positive =

            P(Test is positive|Disease) * P(Disease)
  _______________________________________________________________
  (scaled by) Prob(Testing Positive, with or without the disease)
``````
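A worked instance of that formula, with made-up numbers (a 1% base rate, 90% sensitivity and a 5% false-positive rate are assumptions for illustration only):

```python
p_disease = 0.01            # prior: 1% of people have the disease (assumed)
p_pos_given_disease = 0.90  # test sensitivity (assumed)
p_pos_given_healthy = 0.05  # false-positive rate (assumed)

# P(Testing positive, with or without the disease), by total probability.
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

# Bayes' rule: even with a positive test, the disease is still unlikely,
# because the base rate is so low.
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive
print(round(p_disease_given_pos, 3))  # 0.154
```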

## Getting to Naive Bayes

``````
P(Outcome|Multiple Evidence) =
P(Evidence1|Outcome) * P(Evidence2|outcome) * ... * P(EvidenceN|outcome) * P(Outcome)
scaled by P(Multiple Evidence)
``````

``````
                        P(Likelihood of Evidence) * Prior prob of outcome
P(outcome|evidence) =   _________________________________________________
                                         P(Evidence)
``````

• If Prob(Evidence|Outcome) is 1, then we are just multiplying by 1.
• If Prob(some particular Evidence|Outcome) is 0, then the whole probability becomes 0: if you see contradicting evidence, you can rule that outcome out.
• Since we divide everything by P(Evidence), we can even get away without calculating it.
• The intuition behind multiplying by the prior is to give high probability to more common outcomes, and low probability to unlikely outcomes. These are also called `base rates`, and they are a way of scaling the predicted probabilities.
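The bullet points above boil down to a tiny scoring routine: multiply the prior by each per-feature likelihood and skip P(Evidence) altogether, since it scales every outcome equally. A minimal sketch with hypothetical probabilities:

```python
def naive_bayes_score(prior, likelihoods):
    """Unnormalized posterior: P(Outcome) * product of P(Evidence_i|Outcome).

    P(Evidence) is deliberately left out; it divides every outcome's score
    by the same amount, so it never changes which outcome wins.
    """
    score = prior
    for p in likelihoods:
        score *= p
    return score

# A single contradicting piece of evidence (likelihood 0) rules an outcome out.
assert naive_bayes_score(0.5, [0.9, 0.0, 0.7]) == 0.0

# Likelihoods of 1 leave the score equal to the prior (multiplying by 1).
assert naive_bayes_score(0.5, [1.0, 1.0]) == 0.5
```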

### The Fruit Example

Say we have data on 1000 pieces of fruit (Banana, Orange or some Other Fruit), and for each fruit we know three things:

1. Whether it is Long
2. Whether it is Sweet
3. Whether its color is Yellow.

``````
Type           Long | Not Long || Sweet | Not Sweet || Yellow |Not Yellow|Total
___________________________________________________________________
Banana      |  400  |    100   || 350   |    150    ||  450   |  50      |  500
Orange      |    0  |    300   || 150   |    150    ||  300   |   0      |  300
Other Fruit |  100  |    100   || 150   |     50    ||   50   | 150      |  200
____________________________________________________________________
Total       |  500  |    500   || 650   |    350    ||  800   | 200      | 1000
___________________________________________________________________
``````

``````
P(Banana)      = 0.5 (500/1000)
P(Orange)      = 0.3
P(Other Fruit) = 0.2
``````

Probability of "Evidence"

``````
P(Long)   = 0.5
P(Sweet)  = 0.65
P(Yellow) = 0.8
``````

Probability of "Likelihood"

``````
P(Long|Banana) = 0.8
P(Long|Orange) = 0  [Oranges are never long in all the fruit we have seen.]
....

P(Yellow|Other Fruit)     =  50/200 = 0.25
P(Not Yellow|Other Fruit) = 0.75
``````
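All of these likelihoods fall straight out of the count table; a short sketch that reads them off (the dictionary just transcribes the table above):

```python
# Counts from the fruit table: per fruit, how many were Long/Sweet/Yellow.
counts = {
    "Banana":      {"Long": 400, "Sweet": 350, "Yellow": 450, "Total": 500},
    "Orange":      {"Long":   0, "Sweet": 150, "Yellow": 300, "Total": 300},
    "Other Fruit": {"Long": 100, "Sweet": 150, "Yellow":  50, "Total": 200},
}

def likelihood(feature, fruit):
    """P(feature|fruit): rows of this fruit with the feature / rows of this fruit."""
    return counts[fruit][feature] / counts[fruit]["Total"]

print(likelihood("Long", "Banana"))         # 0.8
print(likelihood("Long", "Orange"))         # 0.0
print(likelihood("Yellow", "Other Fruit"))  # 0.25
```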

### Given a fruit, how do we classify it?

``````
P(Banana|Long, Sweet and Yellow)
      P(Long|Banana) * P(Sweet|Banana) * P(Yellow|Banana) * P(Banana)
    = _______________________________________________________________
                      P(Long) * P(Sweet) * P(Yellow)

    = 0.8 * 0.7 * 0.9 * 0.5 / P(evidence)

    = 0.252 / P(evidence)

P(Orange|Long, Sweet and Yellow) = 0

P(Other Fruit|Long, Sweet and Yellow)
      P(Long|Other fruit) * P(Sweet|Other fruit) * P(Yellow|Other fruit) * P(Other Fruit)
    = ____________________________________________________________________________________
                                        P(evidence)

    = (100/200 * 150/200 * 50/200 * 200/1000) / P(evidence)

    = 0.01875 / P(evidence)
``````

By an overwhelming margin (0.252 >> 0.01875), we classify this Sweet/Long/Yellow fruit as likely to be a Banana.
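The same arithmetic, end to end, with the shared P(evidence) denominator dropped:

```python
priors = {"Banana": 0.5, "Orange": 0.3, "Other Fruit": 0.2}

# P(feature|fruit), read off the count table above.
likelihoods = {
    "Banana":      {"Long": 400/500, "Sweet": 350/500, "Yellow": 450/500},
    "Orange":      {"Long":   0/300, "Sweet": 150/300, "Yellow": 300/300},
    "Other Fruit": {"Long": 100/200, "Sweet": 150/200, "Yellow":  50/200},
}

# Unnormalized posterior for each fruit, given Long, Sweet and Yellow.
scores = {
    fruit: priors[fruit] * lk["Long"] * lk["Sweet"] * lk["Yellow"]
    for fruit, lk in likelihoods.items()
}

# Banana wins by a wide margin: roughly 0.252 vs 0 vs 0.01875.
print(max(scores, key=scores.get))  # Banana
```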

### Why is the Bayes classifier so popular?

`Let z = 1 / P(evidence).` Now we quickly compute the following three quantities:

``````
P(Banana|evidence) = z * Prob(Banana) * Prob(Evidence1|Banana) * Prob(Evidence2|Banana) ...
P(Orange|Evidence) = z * Prob(Orange) * Prob(Evidence1|Orange) * Prob(Evidence2|Orange) ...
P(Other|Evidence)  = z * Prob(Other)  * Prob(Evidence1|Other)  * Prob(Evidence2|Other)  ...
``````

Tag the evidence with the label of whichever is the largest number, and you are done.

Prior probability of `GREEN`: `number of GREEN objects / total number of objects`

Prior probability of `RED`: `number of RED objects / total number of objects`

Prior probability of `GREEN`: `40 / 60`

Prior probability of `RED`: `20 / 60`
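Those base rates translate directly into code (assuming, as the `20 / 60` implies, 60 objects of which 20 are RED and the remaining 40 are GREEN):

```python
total_objects = 60
red_objects = 20
green_objects = total_objects - red_objects  # 40

# Prior probability of each class = class count / total count.
prior_green = green_objects / total_objects  # 40/60
prior_red = red_objects / total_objects      # 20/60

print(round(prior_green, 3), round(prior_red, 3))  # 0.667 0.333
```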