這篇文章主要介紹了python怎么實現決策樹的相關知識,內容詳細易懂,操作簡單快捷,具有一定借鑒價值,相信大家閱讀完這篇python怎么實現決策樹文章都會有所收獲,下面我們一起來看看吧。
背景介紹
這是我最喜歡的算法之一,我經常使用它。它是一種監督學習算法,主要用于分類問題。令人驚訝的是,它適用于分類和連續因變量。在該算法中,我們將總體分成兩個或更多個同類集。這是基于最重要的屬性/獨立變量來完成的,以盡可能地作為不同的組。
在上圖中,您可以看到人口根據多個屬性分為四個不同的組,以識別“他們是否會玩”。為了將人口分成不同的異構群體,它使用各種技術,如基尼,信息增益,卡方,熵。
理解決策樹如何工作的最好方法是玩Jezzball--一款來自微軟的經典游戲(如下圖所示)?;旧?,你有一個移動墻壁的房間,你需要創建墻壁,以便最大限度的區域被球清除。
所以,每次你用墻隔開房間時,你都試圖在同一個房間里創造2個不同的人口。決策樹以非常類似的方式工作,通過將人口分成盡可能不同的群體。
接下來看使用Python Scikit-learn的決策樹案例:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
# read the train and test dataset
train_data = pd.read_csv('train-data.csv')
test_data = pd.read_csv('test-data.csv')
# shape of the dataset
print('Shape of training data :',train_data.shape)
print('Shape of testing data :',test_data.shape)
train_x = train_data.drop(columns=['Survived'],axis=1)
train_y = train_data['Survived']
test_x = test_data.drop(columns=['Survived'],axis=1)
test_y = test_data['Survived']
model = DecisionTreeClassifier()
model.fit(train_x,train_y)
# depth of the decision tree
print('Depth of the Decision Tree :', model.get_depth())
# predict the target on the train dataset
predict_train = model.predict(train_x)
print('Target on train data',predict_train)
# Accuray Score on train dataset
accuracy_train = accuracy_score(train_y,predict_train)
print('accuracy_score on train dataset : ', accuracy_train)
# predict the target on the test dataset
predict_test = model.predict(test_x)
print('Target on test data',predict_test)
# Accuracy Score on test dataset
accuracy_test = accuracy_score(test_y,predict_test)
print('accuracy_score on test dataset : ', accuracy_test)
上面代碼運行結果:
Shape of training data : (712, 25)Shape of testing data : (179, 25)Depth of the Decision Tree : 19Target on train data [0 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 1 1 1 0 1 0 01 0 0 0 0 0 0 1 1 0 0 1 0 0 1 1 1 0 0 0 1 0 1 0 0 1 0 0 0 1 1 0 0 1 0 1 11 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 00 1 0 1 1 0 0 0 0 1 1 0 0 1 0 0 1 0 1 1 0 1 1 0 1 1 0 0 0 0 0 0 0 1 0 0 00 0 1 0 0 1 0 1 1 1 1 0 0 1 0 1 0 0 1 1 1 1 0 0 0 1 1 1 1 0 0 0 0 1 0 0 00 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 1 1 01 0 0 0 1 0 0 1 1 0 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 0 0 1 0 0 0 1 1 0 0 1 10 0 0 0 0 0 0 0 1 1 0 1 1 0 1 0 1 1 1 0 0 0 1 0 1 1 0 0 0 0 1 0 1 1 0 0 00 0 0 0 0 1 1 0 0 1 1 0 1 0 0 0 1 0 1 0 0 0 1 0 1 0 0 1 0 1 0 0 0 0 0 1 00 0 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 0 0 0 1 0 0 1 1 1 1 01 1 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 0 0 1 0 0 1 0 1 0 1 1 11 0 0 1 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 1 1 10 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 0 1 0 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 00 0 0 1 0 1 1 0 0 0 0 1 0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 1 1 1 0 0 1 1 1 01 0 1 0 0 1 0 0 0 1 1 0 0 1 0 0 1 0 1 0 0 1 0 0 0 1 0 0 1 1 0 1 0 0 0 0 11 0 1 1 1 0 1 0 1 0 1 1 0 1 0 1 0 0 1 0 0 1 0 1 1 0 1 0 0 0 1 0 1 0 0 0 00 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 1 0 0 1 0 1 0 0 1 0 0 1 1 1 1 0 1 0 0 01 0 1 0 1 0 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 1 0 1 0 1 0 1 1 1 0 0 1 0]accuracy_score on train dataset : 0.9859550561797753Target on test data [0 0 0 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 01 1 1 1 0 0 1 0 1 1 0 1 1 1 1 0 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 0 1 1 1 0 1 1 1 0 1 1 1 0 0 0 00 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 1 1 11 0 1 1 0 1 0 1 0 0 0 0 1 1 1 1 0 1 1 1 1 1 0 0 1 1 0 0 1 1 0 0 0 1 0 1 01 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 0]accuracy_score on test dataset : 0.770949720670391
關于“python怎么實現決策樹”這篇文章的內容就介紹到這里,感謝各位的閱讀!相信大家對“python怎么實現決策樹”知識都有一定的了解,大家如果還想學習更多知識,歡迎關注億速云行業資訊頻道。
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。