
Data Science LAB

[Python] Independent / Paired Samples t-Test

ㅅ ㅜ ㅔ ㅇ 2022. 3. 15. 14:32
๋Œ€์‘ ํ‘œ๋ณธ vs ๋…๋ฆฝ ํ‘œ๋ณธ

 

  • ๋Œ€์‘ ํ‘œ๋ณธ : ๋ถ€๋ถ€ 100์Œ์„ ๋ฝ‘์•„ ๋‚จํŽธ 100๋ช…๊ณผ ์•„๋‚ด 100๋ช…์œผ๋กœ ์ง‘๋‹จ ๋น„๊ต
  • ๋…๋ฆฝ ํ‘œ๋ณธ : ๋ฌด์ž‘์œ„๋กœ ๋‚จ์ž 100๋ช…, ์—ฌ์ž 100๋ช…์„ ๋ฝ‘์•„ ๋น„๊ต

๋‘ ์ง‘๋‹จ์ด ๋…๋ฆฝ์ ์ด์–ด์•ผ ๋…๋ฆฝ ํ‘œ๋ณธ์ด๋ผ๊ณ  ํ•  ์ˆ˜ ์žˆ๋‹ค. 

 

Equal-variance test

Run an equal-variance test before the independent-samples t-test; its result determines how the t-test is configured.

Null hypothesis (H0): the two groups have equal variances

Alternative hypothesis (H1): the two groups do not have equal variances

If the p-value is below 0.05, reject the null hypothesis => the two groups do not have equal variances

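import numpy as np
from scipy.stats import levene

# Two samples of 100 values each, drawn from the same normal distribution (mean 10, sd 1)
a = np.random.normal(10, 1, 100)
b = np.random.normal(10, 1, 100)

# Variance of each group (np.var uses ddof=0 by default)
print("variance of a: {0:.4f}, variance of b: {1:.4f}".format(np.var(a), np.var(b)))

# Levene's test for equal variances; the result holds the statistic and the p-value
print(levene(a, b))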

 

 

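In this run the p-value is 0.61, which is greater than 0.05, so the null hypothesis is not rejected. The exact value changes from run to run because the data are random.

There is therefore no evidence that the variances of a and b differ, and the analysis proceeds on the assumption that they are equal.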

 

 

Normality test

Null hypothesis: the data follow a normal distribution

Alternative hypothesis: the data do not follow a normal distribution

If the p-value is below 0.05, reject the null hypothesis => the data do not follow a normal distribution

from scipy import stats

# normaltest is SciPy's omnibus normality test, based on skewness and kurtosis
print('normality of a: ', stats.normaltest(a))
print('normality of b: ', stats.normaltest(b))

๋‘ ์ง‘๋‹จ์˜ ์ •๊ทœ์„ฑ ๊ฒ€์ • ๊ฒฐ๊ณผ, a,b ๋ชจ๋‘ p-value๊ฐ’์ด 0.05 ์ด์ƒ์ด๋ฏ€๋กœ ์ •๊ทœ ๋ถ„ํฌ๋ฅผ ๋”ฐ๋ฅธ๋‹ค๊ณ  ํŒ๋‹จ

 

 

Independent-samples t-test
  • A method for statistically comparing the means of two independent samples
  • A sample of size n is drawn from each of two populations, and the test uses the observed values in those samples
  • Null hypothesis: the two groups have the same population mean
  • Normality and equal variances must both hold
scipy.stats.ttest_ind(a,b,axis=0,equal_var = True, nan_policy = 'propagate', permutations=None, random_state=None, alternative='two-sided', trim=0)

 

equal_var : whether the two groups are assumed to have equal variances (True: equal variances / False: unequal variances, in which case Welch's t-test is performed)

 

The equal-variance test did not reject equal variances for the two groups, so set equal_var = True

stats.ttest_ind(a,b,equal_var=True)
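The two steps (Levene's test, then the t-test) can be chained so that equal_var follows from the data instead of being typed by hand. This is only a sketch: `compare_means` is a hypothetical helper, and the seed and the shifted mean of `group_b` are assumptions made so that the example shows a clear difference.

```python
import numpy as np
from scipy import stats


def compare_means(x, y, alpha=0.05):
    """Hypothetical helper: Levene's test picks equal_var, then ttest_ind runs."""
    # Equal variances are assumed unless Levene's test rejects them
    equal_var = bool(stats.levene(x, y).pvalue >= alpha)
    result = stats.ttest_ind(x, y, equal_var=equal_var)
    return equal_var, result.statistic, result.pvalue


# Fixed seed (an assumption) so the run is repeatable
rng = np.random.default_rng(2)
group_a = rng.normal(10, 1, 100)
group_b = rng.normal(12, 1, 100)  # population mean shifted by 2

equal_var, t_stat, p_value = compare_means(group_a, group_b)

print("equal_var used:", equal_var)
print("t = {0:.4f}, p-value = {1:.4g}".format(t_stat, p_value))
```

A p-value below 0.05 here leads to rejecting the null hypothesis that the two population means are equal.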

 

 

 

๋Œ€์‘ ํ‘œ๋ณธ t-test
  • a์™€ b์˜ shape์ด ์ผ์น˜ํ•ด์•ผ ํ•จ
scipy.stats.ttest_rel(a,b,axis=0, nan_policy = 'propagate', alternative='two-sided')

 

a = np.random.normal(10,1,100)
b = a + np.random.normal(0,1,100)

stats.ttest_rel(a,b)
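The shape requirement follows from how the test works: a paired t-test is a one-sample t-test on the element-wise differences, so every value in one sample needs a partner in the other. The sketch below checks that equivalence numerically; the seed and the names `before`, `after`, and `same` are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Fixed seed (an assumption) so the run is repeatable
rng = np.random.default_rng(3)

# Paired data: each "after" value is built from its own "before" value
before = rng.normal(10, 1, 100)
after = before + rng.normal(0, 1, 100)

paired = stats.ttest_rel(before, after)

# One-sample t-test on the differences, against a mean difference of 0
one_sample = stats.ttest_1samp(before - after, 0.0)

# Both approaches should give the same statistic and p-value
same = bool(
    np.isclose(paired.statistic, one_sample.statistic)
    and np.isclose(paired.pvalue, one_sample.pvalue)
)

print("paired:     t = {0:.4f}, p = {1:.4f}".format(paired.statistic, paired.pvalue))
print("one-sample: t = {0:.4f}, p = {1:.4f}".format(one_sample.statistic, one_sample.pvalue))
print("identical results:", same)
```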

 
