[python] Jeju population distribution by gender with a pie chart - using Matplotlib

hyonie 2024. 5. 12. 16:41

 

 

 


Visualizing Jeju's population by gender as a pie chart

This code visualizes the male and female population of '제주특별자치도' (Jeju Special Self-Governing Province), ages 0 through 100, as a pie chart.

 

First, the CSV file is read and the row for '제주특별자치도' is located. The male and female values in that row are then stored in the male list and the female list, respectively. The male values are converted to negative numbers before being stored, while the female values are stored as positive numbers. The break statement ends the loop once the '제주특별자치도' row has been found, so no further rows are read.

 

After that, plt.pie([10,20]) draws a pie chart with the male and female totals arbitrarily set to 10 and 20. These are placeholder values that do not reflect the actual male and female data, so the chart simply shows two slices sized 10 and 20 (one third and two thirds of the circle).

In practice, the male and female data should be passed as the argument to plt.pie() so that each group's share is displayed.

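import csv
import matplotlib.pyplot as plt

male = []
female = []

# gender.csv is encoded in cp949; the first row is the header.
with open('gender.csv', 'r', encoding='cp949') as f:
  data = csv.reader(f, delimiter=',')
  header = next(data)
  for row in data:
    if '제주특별자치도' in row[0]:
      for i in row[3:104]:      # male counts, ages 0 to 100
        male.append(-int(i))    # stored as negative values
      for i in row[106:]:       # female counts
        female.append(int(i))
      break                     # stop after the first matching row

print(male)
print(female)

plt.pie([10, 20])  # placeholder values, not the real data
plt.show()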
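
As noted above, the real data rather than [10, 20] should go into plt.pie(). The following is a minimal sketch of how the two lists could be reduced to two totals. The helper name signed_totals and the sample numbers are made up for illustration; they are not from the CSV file. Because the male values are stored as negatives, the sketch takes the absolute value of their sum.

```python
# Stand-in values for illustration only; in the post these lists
# are filled from the '제주특별자치도' row of gender.csv.
male = [-120, -95, -80]   # male counts, stored as negative numbers
female = [110, 100, 90]   # female counts, stored as positive numbers


def signed_totals(male, female):
    """Return [male_total, female_total] as non-negative numbers."""
    # The male list holds negative values, so flip the sign of its sum.
    return [abs(sum(male)), sum(female)]


totals = signed_totals(male, female)
print(totals)  # [295, 300]
```

The resulting list could then be passed as plt.pie(totals) in place of plt.pie([10, 20]). The sign flip matters because plt.pie() rejects negative wedge sizes.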

 

 

 


 

Example: computing the total male and female population of Jeju Special Self-Governing Province

 

import csv
import matplotlib.pyplot as plt

male = 0
female = 0

total = []

with open('gender.csv', 'r', encoding='cp949') as f:
  data = csv.reader(f, delimiter=',')
  header = next(data)  # skip the header row
  for row in data:
    if '제주특별자치도' in row[0]:
      for i in range(101):            # ages 0 to 100
        male += int(row[i+3])         # male columns start at index 3
        female += int(row[i+106])     # female columns start at index 106
      break

total.append(male)
total.append(female)
print(total) # [335813, 331524]

plt.title('jeju')

plt.show()

 

for row in data: iterates over the data one row at a time.

  • if '제주특별자치도' in row[0]:: checks whether the first column of the row contains '제주특별자치도'.
    • for i in range(101):: loops over each age from 0 to 100.
      • male += int(row[i+3]): accumulates the male population; the male values start at the 4th column of the data.
      • female += int(row[i+106]): accumulates the female population; the female values start at the 107th column of the data.
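
The column offsets described above can be tried out without the real file. The following is a rough sketch that builds a one-row CSV in memory and sums the same two column ranges with slices instead of an index loop. The names MALE_START, FEMALE_START, AGES and gender_totals are assumptions introduced here, and the cell values (2 for every male cell, 3 for every female cell) are synthetic, not real population figures.

```python
import csv
import io

MALE_START = 3      # 4th column: male, age 0
FEMALE_START = 106  # 107th column: female, age 0
AGES = 101          # ages 0 through 100


def gender_totals(row):
    """Return [male_total, female_total] for one CSV row."""
    male = sum(int(v) for v in row[MALE_START:MALE_START + AGES])
    female = sum(int(v) for v in row[FEMALE_START:FEMALE_START + AGES])
    return [male, female]


# Synthetic row: the filler cells at indexes 1-2 and 104-105 are
# placeholders for columns this sketch does not use.
row = ['제주특별자치도', '0', '0'] + ['2'] * AGES + ['0', '0'] + ['3'] * AGES

# Round-trip through the csv module so the values arrive as strings,
# the same way csv.reader delivers them from a file.
buf = io.StringIO()
csv.writer(buf).writerow(row)
buf.seek(0)
parsed = next(csv.reader(buf))

demo_total = gender_totals(parsed)
print(demo_total)  # [202, 303]
```

Slicing row[3:104] covers the same 101 cells as range(101) with an offset of 3, so both approaches give the same totals.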

 

Population distribution by gender as a pie chart

color = ['crimson', 'darkcyan']
plt.title('population distribution by gender')
plt.pie(total, labels=['male', 'female'], colors=color, autopct='%.1f%%', startangle=90)
plt.legend()
plt.show()

 

  • plt.pie(total, labels=['male', 'female'], colors=color, autopct='%.1f%%', startangle=90): creates the pie chart.
  • total holds the total male and female populations as a list. labels sets the label for each slice, and colors sets the color of each slice. autopct formats the percentage shown on each slice: '%.1f%%' prints the value to one decimal place followed by a percent sign. startangle=90 sets the starting angle to 90 degrees, which rotates the chart.
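
To see what the autopct format string would print for the totals shown earlier, the percentages can be worked out by hand. This is only a sketch of the arithmetic, not Matplotlib's internal code; the helper name pie_labels is made up for illustration.

```python
total = [335813, 331524]  # [male, female], as printed earlier in the post


def pie_labels(values, fmt='%.1f%%'):
    """Format each value's share of the whole as a percentage string."""
    whole = sum(values)
    # '%.1f' keeps one decimal place; '%%' is a literal percent sign.
    return [fmt % (100 * v / whole) for v in values]


labels = pie_labels(total)
print(labels)  # ['50.3%', '49.7%']
```

The two slices come out nearly equal, since the male and female totals differ by only a few thousand people.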

 

Visualization of the male-to-female population ratio in Jeju

 

 


 

Visualizing data for each blood type as a pie chart

An example that visualizes data for each blood type (A, B, AB, O) as a pie chart.

import matplotlib.pyplot as plt

label = ['A', 'B', 'AB', 'O']
color = ['darkmagenta', 'deeppink', 'red', 'green']
plt.pie([10, 20, 30, 40], labels=label, colors=color)
plt.legend()
plt.show()
  • plt.pie([10,20,30,40], labels=label, colors=color): creates the pie chart with plt.pie(). The data for the blood types is set to [10, 20, 30, 40]. plt.pie() sizes each slice by its value divided by the sum of all values, and because these values add up to 100, they can be read directly as percentages: type 'A' takes 10%, type 'B' 20%, type 'AB' 30%, and type 'O' 40%.
  • plt.legend(): displays the legend. Calling it shows a legend entry for each label used in the pie chart.
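
The blood type values happen to add up to 100, but only the proportions between them matter. The following sketch, with a hypothetical helper named shares, illustrates that a list in the same proportions would produce the same slices.

```python
def shares(values):
    """Return each value's percentage of the total, to one decimal place."""
    whole = sum(values)
    return [round(100 * v / whole, 1) for v in values]


label = ['A', 'B', 'AB', 'O']

by_type = dict(zip(label, shares([10, 20, 30, 40])))
print(by_type)  # {'A': 10.0, 'B': 20.0, 'AB': 30.0, 'O': 40.0}

# Same proportions, different scale: the shares are unchanged.
same = shares([1, 2, 3, 4]) == shares([10, 20, 30, 40])
print(same)  # True
```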