List: 🐍 Python/Pandas (7)

Data Science LAB

[Python] Handling Missing Data

Pandas offers several ways to handle missing values (NA). A missing value is a NULL entry, that is, a column with no value. Missing values cause problems when a dataset is fed to a machine learning model, so they have to be replaced with another value or removed.
Checking for missing data
import pandas as pd
import numpy as np
data = pd.read_csv("titanic_train.csv")
data.head()
First, load the Titanic dataset, one of the best-known classification datasets.
data.isna()
isna() returns True/False for each value, depending on whether it is missing. True -> missing, False -> not missing
data.isna().sum()
Combining isna with sum(..
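The check described in this excerpt can be sketched roughly as follows. Since titanic_train.csv is not included here, the small DataFrame below is a hypothetical stand-in with made-up values, and the column names are assumptions chosen for illustration. The fillna() and dropna() calls are one possible way to do the "replace or remove" step mentioned above, not necessarily the approach the full post takes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for titanic_train.csv (values are made up for illustration).
data = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Cabin": [None, "C85", None, None],
    "Fare": [7.25, 71.28, 7.92, 8.05],
})

na_mask = data.isna()          # boolean DataFrame: True where a value is missing
na_counts = data.isna().sum()  # number of missing values in each column

# Two common ways to deal with the gaps: replace them or drop the rows.
filled = data.fillna({"Age": data["Age"].mean(), "Cabin": "Unknown"})
dropped = data.dropna(subset=["Age"])  # keep only rows where Age is present

print(na_counts)
```

Summing a boolean mask works because True counts as 1 and False as 0, so sum() gives the per-column count of missing entries.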

🐍 Python/Pandas 2022. 3. 11. 14:05
[Python] Differences Between loc and iloc

When selecting data with Pandas, I use iloc and loc all the time, but I still mix them up now and then, so this post covers both operators!
Position-based indexing
Position-based indexing selects data by row and column position, using zero-based coordinates along the horizontal and vertical axes. Row and column values are therefore given as integers, and the iloc[] operator falls under position-based indexing. iloc[] takes an integer, an integer slice, or a fancy-index list for its row and column values. First, create a DataFrame to practice with.
import pandas as pd
data = {'Name' : ['Red','Blue','Yellow','Green'], 'Year' : [2020,2021,2022,2023],..
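As a minimal sketch of the three input forms iloc[] accepts (integer, integer slice, fancy list): the excerpt's dictionary is cut off, so only the 'Name' and 'Year' columns shown above are used. The string index labels are hypothetical, added so that positions and labels differ, and the loc line is included only for comparison.

```python
import pandas as pd

# Only 'Name' and 'Year' appear in the excerpt; the index labels are
# hypothetical, chosen so that positions (0, 1, ...) differ from labels.
df = pd.DataFrame(
    {"Name": ["Red", "Blue", "Yellow", "Green"],
     "Year": [2020, 2021, 2022, 2023]},
    index=["a", "b", "c", "d"],
)

first_name = df.iloc[0, 0]     # single integer positions: row 0, column 0
first_two = df.iloc[0:2, 0:2]  # integer slices; the end position is excluded
picked = df.iloc[[0, 2], [1]]  # fancy (list) indexing: rows 0 and 2, column 1

# Label-based counterpart for comparison; label slices include the end label.
by_label = df.loc["a":"b", "Name"]

print(first_two)
print(picked)
```

Note the asymmetry: df.iloc[0:2] stops before position 2, while df.loc["a":"b"] includes the row labeled "b".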

🐍 Python/Pandas 2022. 3. 10. 21:02
[Python] Converting Lists, Dictionaries, and Arrays to a DataFrame

Hello! Following on from the last post, today we'll use the Pandas library to convert list, dictionary, and array data into a DataFrame. (ง •_•)ง First, create a simple list and array.
import pandas as pd
import numpy as np
col_name1 = ['col1']
list1 = [1,2,3]
array1 = np.array(list1)
print('array1 shape: ',array1.shape)
The array is one-dimensional, and we can see that it was created from the list with 3 elements.
List -> DataFrame
Use pd.DataFrame(list_name, columns=column_names) to convert it to a DataFrame. The column names have to be passed with the columns keyword, because the second positional argument of pd.DataFrame is the index..
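The conversions mentioned in this excerpt might look something like the sketch below. The names col_name1, list1, and array1 come from the excerpt. The dictionary dict1 and the df_* variable names are hypothetical, added only to illustrate the dictionary case that the excerpt's text does not reach.

```python
import numpy as np
import pandas as pd

col_name1 = ["col1"]
list1 = [1, 2, 3]
array1 = np.array(list1)
print("array1 shape:", array1.shape)  # one-dimensional array with 3 elements

# list -> DataFrame: each element becomes one row of the single column.
df_list1 = pd.DataFrame(list1, columns=col_name1)

# 1-D array -> DataFrame: same call, same resulting layout.
df_array1 = pd.DataFrame(array1, columns=col_name1)

# dict -> DataFrame: keys become column names (dict1 is a made-up example).
dict1 = {"col1": [1, 2, 3], "col2": [4, 5, 6]}
df_dict1 = pd.DataFrame(dict1)

print(df_list1)
print(df_dict1)
```

Passing the names with columns= matters here: given positionally, col_name1 would be read as the index, and a one-item index does not match three rows.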

🐍 Python/Pandas 2022. 2. 16. 12:58