Data Science LAB


๐Ÿ›  Machine Learning/์ฐจ์› ์ถ•์†Œ

[Python] LDA(Linear Discriminant Analysis)

ใ…… ใ…œ ใ…” ใ…‡ 2022. 3. 7. 21:45

LDA ๊ฐœ์š”

LDA๋Š” ์„ ํ˜• ํŒ๋ณ„ ๋ถ„์„๋ฒ•์œผ๋กœ, PCA์™€ ๋งค์šฐ ์œ ์‚ฌํ•˜๊ฒŒ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์…‹์„ ์ €์ฐจ์› ๊ณต๊ฐ„์— ํˆฌ์—ฌํ•ด ์ฐจ์›์„ ์ถ•์†Œํ•˜๋Š” ๊ธฐ๋ฒ•์ด๋‹ค. PCA์™€์˜ ์ฐจ์ด๋Š” LDA๋Š” ์ง€๋„ํ•™์Šต์˜ ๋ถ„๋ฅ˜์—์„œ ์‚ฌ์šฉํ•˜๊ธฐ ์‰ฝ๋„๋ก ๊ฐœ๋ณ„ ํด๋ž˜์Šค๋ฅผ ๋ถ„๋ณ„ํ•  ์ˆ˜ ์žˆ๋Š” ๊ธฐ์ค€์„ ์ตœ๋Œ€ํ•œ ์œ ์ง€ํ•˜๋ฉด์„œ ์ฐจ์›์„ ์ถ•์†Œํ•œ๋‹ค. ๋ฐ˜๋ฉด PCA๋Š” ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ ๋ณ€๋™์„ฑ์˜ ๊ฐ€์žฅ ํฐ ์ถ•์„ ์ฐพ์•˜์ง€๋งŒ, LDA๋Š” ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ ๊ฒฐ์ • ๊ฐ’ ํด๋ž˜์Šค๋ฅผ ์ตœ๋Œ€ํ•œ์œผ๋กœ ๋ถ„๋ฆฌํ•  ์ˆ˜ ์žˆ๋Š” ์ถ•์„ ์ฐพ๋Š”๋‹ค.

 

 

#์ฐธ๊ณ 

2022.03.05 - [Python] PCA(Principal Component Analysis)

 

[Python] PCA(Principal Component Analysis)

PCA ๊ฐœ์š” PCA(Principal Component Analysis)๋Š” ๊ฐ€์žฅ ๋Œ€ํ‘œ์ ์ธ ์ฐจ์› ์ถ•์†Œ ๊ธฐ๋ฒ•์œผ๋กœ ์—ฌ๋Ÿฌ ๋ณ€์ˆ˜ ๊ฐ„์— ์กด์žฌํ•˜๋Š” ์ƒ๊ด€๊ด€๊ณ„๋ฅผ ์ด์šฉํ•ด ์ด๋ฅผ ๋Œ€ํ‘œํ•˜๋Š” ์ฃผ์„ฑ๋ถ„(Principal Component)๋ฅผ ์ถ”์ถœํ•ด ์ฐจ์›์„ ์ถ•์†Œํ•˜๋Š” ๊ธฐ๋ฒ•์ด๋‹ค.

suhye.tistory.com

 

LDA๋Š” ํŠน์ • ๊ณต๊ฐ„์ƒ์—์„œ ํด๋ž˜์Šค ๋ถ„๋ฆฌ๋ฅผ ์ตœ๋Œ€ํ™”ํ•˜๋Š” ์ถ•์„ ์ฐพ๊ธฐ ์œ„ํ•ด ํด๋ž˜์Šค ๊ฐ„ ๋ถ„์‚ฐ๊ณผ ํด๋ž˜์Šค ๋‚ด๋ถ€ ๋ถ„์‚ฐ์˜ ๋น„์œจ์„ ์ตœ๋Œ€ํ™”ํ•˜๋Š” ๋ฐฉ์‹์œผ๋กœ ์ฐจ์›์„ ์ถ•์†Œํ•œ๋‹ค. ์ฆ‰, ํด๋ž˜์Šค ๊ฐ„ ๋ถ„์‚ฐ์€ ์ตœ๋Œ€ํ•œ ํฌ๊ฒŒ, ํด๋ž˜์Šค ๋‚ด๋ถ€ ๋ถ„์‚ฐ์€ ์ตœ๋Œ€ํ•œ ์ž‘๊ฒŒ ๊ฐ€์ ธ๊ฐ€๋Š” ๋ฐฉ์‹์ด๋‹ค.

 

1. ํด๋ž˜์Šค ๋‚ด๋ถ€์™€ ํด๋ž˜์Šค ๊ฐ„ ๋ถ„์‚ฐ ํ–‰๋ ฌ์„ ๊ตฌํ•œ๋‹ค. ์ด ๋‘ ํ–‰๋ ฌ์€ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ ๊ฒฐ์ • ๊ฐ’ ํด๋ž˜์Šค๋ณ„๋กœ ๊ฐœ๋ณ„ ํ”ผ์ฒ˜์˜ ํ‰๊ท  ๋ฒกํ„ฐ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ตฌํ•œ๋‹ค. 

2. ํด๋ž˜์Šค ๋‚ด๋ถ€ ๋ถ„์‚ฐ ํ–‰๋ ฌ์„ ์•„๋ž˜ ๊ทธ๋ฆผ๊ณผ ๊ฐ™์ด ๊ณ ์œ ๋ฒกํ„ฐ๋กœ ๋ถ„ํ•ดํ•œ๋‹ค. 


3. ๊ณ ์œ ๊ฐ’์ด ํฐ ์ˆœ์„œ๋Œ€๋กœ K(LDA ๋ณ€ํ™˜ ์ฐจ์ˆ˜)๊ฐœ๋งŒํผ ์ถ”์ถœํ•œ๋‹ค. 

4. ๊ณ ์œ ๊ฐ’์ด ๊ฐ€์žฅ ํฐ ์ˆœ์œผ๋กœ ์ถ”์ถœ๋œ ๊ณ ์œ ๋ฒกํ„ฐ๋ฅผ ์ด์šฉํ•˜์—ฌ ์ƒˆ๋กญ๊ฒŒ ์ž…๋ ฅ๋ฐ์ดํ„ฐ๋ฅผ ๋ณ€ํ™˜ํ•œ๋‹ค. 


iris ๋ฐ์ดํ„ฐ์…‹์— ์ ์šฉ

๋ฐ์ดํ„ฐ์…‹ ๋กœ๋“œ ํ›„ ์Šค์ผ€์ผ ์ ์šฉ

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris

iris = load_iris()
iris_scaled = StandardScaler().fit_transform(iris.data)

LDA๋Š” PCA์™€๋Š” ๋‹ค๋ฅด๊ฒŒ ์ง€๋„ํ•™์Šต์ด๋‹ค. 


LDA ์ ์šฉ

lda = LinearDiscriminantAnalysis(n_components=2)
lda.fit(iris_scaled, iris.target)
iris_lda = lda.transform(iris_scaled)
print(iris_lda.shape)  # (150, 2)

n_components = 2๋กœ ์„ค์ •ํ•˜์—ฌ LDA ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•˜์˜€๋‹ค.


์‹œ๊ฐํ™”

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

lda_columns = ['lda_component_1', 'lda_component_2']
iris_df_lda = pd.DataFrame(iris_lda, columns=lda_columns)
iris_df_lda['target'] = iris.target

# One scatter series per class, each with its own marker.
markers = ['^', 's', 'o']
for i, marker in enumerate(markers):
    x_axis_data = iris_df_lda[iris_df_lda['target'] == i]['lda_component_1']
    y_axis_data = iris_df_lda[iris_df_lda['target'] == i]['lda_component_2']
    plt.scatter(x_axis_data, y_axis_data, marker=marker, label=iris.target_names[i])

plt.legend(loc='upper right')
plt.xlabel('lda_component_1')
plt.ylabel('lda_component_2')
plt.show()
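As an extension beyond the original post, one way to confirm that the two LDA components preserve class separability is to cross-validate a simple classifier on them; a sketch (the logistic-regression choice is arbitrary, and scaling plus LDA are refit inside each fold via a pipeline to avoid label leakage):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Scaling and LDA are refit inside each CV fold, so the test folds
# never leak their labels into the dimensionality reduction.
pipe = make_pipeline(StandardScaler(),
                     LinearDiscriminantAnalysis(n_components=2),
                     LogisticRegression())
scores = cross_val_score(pipe, iris.data, iris.target, cv=5)
print(scores.mean())
```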



'๐Ÿ›  Machine Learning > ์ฐจ์› ์ถ•์†Œ' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

[Python]NMF  (0) 2022.03.08
[Python] SVD(Singular Value Decomposition)  (0) 2022.03.07
[Python] PCA ์˜ˆ์ œ  (0) 2022.03.06
[Python] PCA(Principal Component Analysis)  (0) 2022.03.05
Comments