Generating Normal Distribution Data and a Chart with Python

by Jaeseok_Shim 2020. 4. 28.

Let's generate normal distribution data and plot it as a chart with Python.

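First, use NumPy's linspace function to generate evenly spaced values over a given range. Note that these values are not random: linspace(-3, 3, 200) returns 200 points from -3 to 3, with both endpoints included.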

import numpy as np

x = np.linspace(-3, 3, 200)
print(x)

 

[-3.         -2.96984925 -2.93969849 -2.90954774 -2.87939698 -2.84924623
 -2.81909548 -2.78894472 -2.75879397 -2.72864322 -2.69849246 -2.66834171
 -2.63819095 -2.6080402  -2.57788945 -2.54773869 -2.51758794 -2.48743719
 -2.45728643 -2.42713568 -2.39698492 -2.36683417 -2.33668342 -2.30653266
 -2.27638191 -2.24623116 -2.2160804  -2.18592965 -2.15577889 -2.12562814
 -2.09547739 -2.06532663 -2.03517588 -2.00502513 -1.97487437 -1.94472362
 -1.91457286 -1.88442211 -1.85427136 -1.8241206  -1.79396985 -1.7638191
 -1.73366834 -1.70351759 -1.67336683 -1.64321608 -1.61306533 -1.58291457
 -1.55276382 -1.52261307 -1.49246231 -1.46231156 -1.4321608  -1.40201005
 -1.3718593  -1.34170854 -1.31155779 -1.28140704 -1.25125628 -1.22110553
 -1.19095477 -1.16080402 -1.13065327 -1.10050251 -1.07035176 -1.04020101
 -1.01005025 -0.9798995  -0.94974874 -0.91959799 -0.88944724 -0.85929648
 -0.82914573 -0.79899497 -0.76884422 -0.73869347 -0.70854271 -0.67839196
 -0.64824121 -0.61809045 -0.5879397  -0.55778894 -0.52763819 -0.49748744
 -0.46733668 -0.43718593 -0.40703518 -0.37688442 -0.34673367 -0.31658291
 -0.28643216 -0.25628141 -0.22613065 -0.1959799  -0.16582915 -0.13567839
 -0.10552764 -0.07537688 -0.04522613 -0.01507538  0.01507538  0.04522613
  0.07537688  0.10552764  0.13567839  0.16582915  0.1959799   0.22613065
  0.25628141  0.28643216  0.31658291  0.34673367  0.37688442  0.40703518
  0.43718593  0.46733668  0.49748744  0.52763819  0.55778894  0.5879397
  0.61809045  0.64824121  0.67839196  0.70854271  0.73869347  0.76884422
  0.79899497  0.82914573  0.85929648  0.88944724  0.91959799  0.94974874
  0.9798995   1.01005025  1.04020101  1.07035176  1.10050251  1.13065327
  1.16080402  1.19095477  1.22110553  1.25125628  1.28140704  1.31155779
  1.34170854  1.3718593   1.40201005  1.4321608   1.46231156  1.49246231
  1.52261307  1.55276382  1.58291457  1.61306533  1.64321608  1.67336683
  1.70351759  1.73366834  1.7638191   1.79396985  1.8241206   1.85427136
  1.88442211  1.91457286  1.94472362  1.97487437  2.00502513  2.03517588
  2.06532663  2.09547739  2.12562814  2.15577889  2.18592965  2.2160804
  2.24623116  2.27638191  2.30653266  2.33668342  2.36683417  2.39698492
  2.42713568  2.45728643  2.48743719  2.51758794  2.54773869  2.57788945
  2.6080402   2.63819095  2.66834171  2.69849246  2.72864322  2.75879397
  2.78894472  2.81909548  2.84924623  2.87939698  2.90954774  2.93969849
  2.96984925  3.        ]
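To confirm that the values above are evenly spaced rather than random, the gap between neighbouring points can be checked directly. This is only a small sketch; the variable names step and reported_step are chosen here for illustration. With 200 points covering a range of width 6, the gap should be 6 / 199.

```python
import numpy as np

x = np.linspace(-3, 3, 200)

# Gap between the first two points; for 200 points over [-3, 3] this is 6 / 199
step = x[1] - x[0]
print(step)

# Every neighbouring pair should have the same gap (up to floating-point error)
print(np.allclose(np.diff(x), step))

# linspace can also report the step itself when retstep=True is passed
x2, reported_step = np.linspace(-3, 3, 200, retstep=True)
print(reported_step)
```

If random values are what you need instead, linspace is not the right tool; it only lays out a regular grid to evaluate a function on.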

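The SciPy package can compute the probability density function of a normal distribution. norm() represents the normal distribution (here norm(0, 1), with mean 0 and standard deviation 1), and pdf() evaluates its probability density function (PDF) at the given points.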

 

import scipy.stats as stats

y = stats.norm(0, 1).pdf(x)  
print(y)

 

[0.00443185 0.0048492  0.00530104 0.00578971 0.00631769 0.00688754
 0.00750198 0.00816381 0.00887594 0.00964143 0.01046343 0.01134518
 0.01229006 0.01330154 0.01438318 0.01553865 0.01677169 0.01808612
 0.01948585 0.02097483 0.02255707 0.02423662 0.02601757 0.02790401
 0.02990003 0.03200972 0.03423712 0.03658625 0.03906104 0.04166533
 0.04440287 0.04727727 0.05029201 0.05345039 0.05675549 0.0602102
 0.06381716 0.06757874 0.07149701 0.07557373 0.07981032 0.08420782
 0.0887669  0.09348777 0.09837025 0.10341367 0.10861688 0.11397823
 0.11949554 0.12516608 0.13098658 0.13695319 0.14306148 0.14930641
 0.15568236 0.16218308 0.16880173 0.17553084 0.18236234 0.18928757
 0.19629725 0.20338155 0.21053004 0.21773176 0.22497523 0.23224844
 0.23953894 0.2468338  0.2541197  0.26138293 0.26860947 0.27578499
 0.28289489 0.28992442 0.29685863 0.30368249 0.31038093 0.31693887
 0.3233413  0.32957332 0.33562021 0.3414675  0.34710097 0.35250679
 0.3576715  0.36258212 0.36722617 0.37159175 0.37566757 0.379443
 0.38290812 0.38605377 0.3888716  0.39135406 0.3934945  0.39528714
 0.39672713 0.39781056 0.39853449 0.39889695 0.39889695 0.39853449
 0.39781056 0.39672713 0.39528714 0.3934945  0.39135406 0.3888716
 0.38605377 0.38290812 0.379443   0.37566757 0.37159175 0.36722617
 0.36258212 0.3576715  0.35250679 0.34710097 0.3414675  0.33562021
 0.32957332 0.3233413  0.31693887 0.31038093 0.30368249 0.29685863
 0.28992442 0.28289489 0.27578499 0.26860947 0.26138293 0.2541197
 0.2468338  0.23953894 0.23224844 0.22497523 0.21773176 0.21053004
 0.20338155 0.19629725 0.18928757 0.18236234 0.17553084 0.16880173
 0.16218308 0.15568236 0.14930641 0.14306148 0.13695319 0.13098658
 0.12516608 0.11949554 0.11397823 0.10861688 0.10341367 0.09837025
 0.09348777 0.0887669  0.08420782 0.07981032 0.07557373 0.07149701
 0.06757874 0.06381716 0.0602102  0.05675549 0.05345039 0.05029201
 0.04727727 0.04440287 0.04166533 0.03906104 0.03658625 0.03423712
 0.03200972 0.02990003 0.02790401 0.02601757 0.02423662 0.02255707
 0.02097483 0.01948585 0.01808612 0.01677169 0.01553865 0.01438318
 0.01330154 0.01229006 0.01134518 0.01046343 0.00964143 0.00887594
 0.00816381 0.00750198 0.00688754 0.00631769 0.00578971 0.00530104
 0.0048492  0.00443185]
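As a sanity check, the same numbers can be reproduced from the textbook density formula f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)). The sketch below assumes mu = 0 and sigma = 1 to match stats.norm(0, 1); the names mu, sigma, y_scipy and y_manual are illustrative only.

```python
import numpy as np
import scipy.stats as stats

mu, sigma = 0.0, 1.0  # same parameters as stats.norm(0, 1)
x = np.linspace(-3, 3, 200)

# PDF values from SciPy
y_scipy = stats.norm(mu, sigma).pdf(x)

# PDF values computed by hand from the normal density formula
y_manual = np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# The two arrays should agree up to floating-point error
print(np.allclose(y_scipy, y_manual))
```

The two results should match, which shows that pdf() returns density values along a curve, not random draws.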

The complete code looks like this.

import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt

# Generate 200 evenly spaced values from -3 to 3
x = np.linspace(-3, 3, 200)

# Evaluate the PDF of a normal distribution with mean 0 and standard deviation 1 at each x
y = stats.norm(0, 1).pdf(x)        

# Draw the chart
plt.plot(x, y)
plt.xlabel("x")
plt.ylabel("y")
plt.show()
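The code above draws the theoretical curve. If you also want actual random data that follows the normal distribution, one option is the rvs() method of the same SciPy distribution object. The following is a rough sketch, not part of the original code: the sample size of 10000, the seed 0, and the bin count of 50 are arbitrary choices made for illustration.

```python
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

dist = stats.norm(0, 1)  # mean 0, standard deviation 1

# Draw random samples from the distribution (seed fixed so the run is repeatable)
samples = dist.rvs(size=10000, random_state=0)

# Theoretical PDF on an evenly spaced grid, as in the code above
x = np.linspace(-3, 3, 200)
y = dist.pdf(x)

# Histogram of the samples, normalised so its area is 1, with the PDF drawn on top
plt.hist(samples, bins=50, density=True, alpha=0.5, label="random samples")
plt.plot(x, y, label="PDF")
plt.xlabel("x")
plt.ylabel("density")
plt.legend()
plt.show()
```

With enough samples the histogram should roughly follow the curve, which makes the difference between the grid x and the random data easy to see.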
