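NumPy - Numerical Python

1. What is NumPy?

- A Python module for scientific computing that provides fast vector and matrix operations written in C.

- It is the de facto standard for numerical computing in Python.

Start the python3 interpreter in interactive mode and import numpy.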

>>> import numpy as np

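2. NumPy array

A NumPy array can hold only one data type.

(This is the biggest difference from a list: dynamic typing is not supported.)

Each time a dimension is added, the existing entries of shape, including the row count, are pushed one position to the right.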
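The way shape grows with each added dimension can be sketched as follows. The array names and values are made up for illustration and do not come from the original lecture.

```python
import numpy as np

vector = np.array([1, 2, 3, 4])                # 1-D: 4 elements
matrix = np.array([[1, 2, 3, 4]] * 3)          # 2-D: 3 rows of 4 columns
tensor = np.array([[[1, 2, 3, 4]] * 3] * 2)    # 3-D: 2 blocks of 3 x 4

print(vector.shape)  # (4,)
print(matrix.shape)  # (3, 4)
print(tensor.shape)  # (2, 3, 4)
```

The length 4 starts in the first slot of shape and moves one slot to the right each time a new leading axis is added.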

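- Creating a simple array

>>> a = np.array([0, 1, 2, 3])
>>> a
array([0, 1, 2, 3])

- Checking the type

type(data) is a function that returns the data type of the input data.

>>> type(a)
<class 'numpy.ndarray'>

- Numeric type of the elements

>>> a.dtype
dtype('int32')
# The default integer dtype depends on the platform and NumPy version; many systems report int64 here.

- Number of dimensions

>>> a.ndim
1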

- Simple array math

>>> a = np.array([1, 2, 3, 4])
>>> b = np.array([2, 3, 4, 5])
>>> a + b
array([3, 5, 7, 9])
>>> a * b
array([ 2,  6, 12, 20])
>>> a ** b
array([   1,    8,   81, 1024])
# = array([1**2, 2**3, 3**4, 4**5])

# Multiply the whole array by a scalar value
>>> 0.1 * a
array([0.1, 0.2, 0.3, 0.4])

# In-place operation
>>> a *= 2
>>> a
array([2, 4, 6, 8])

# Apply a function to an array
>>> x = 0.1 * a
>>> x
array([0.2, 0.4, 0.6, 0.8])
>>> y = np.sin(x)
>>> y
array([0.19866933, 0.38941834, 0.56464247, 0.71735609])

------------------------------------------------------------------------------------------------------------------

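- Array indexing

>>> a = np.array([0, 1, 2, 3])
>>> a[0]
0
# NumPy 2 displays this scalar as np.int64(0) or similar.
>>> a[0] = 10
>>> a
array([10,  1,  2,  3])

Unlike a list, a 2-D array supports notation such as [0, 0].

For a matrix, the first index is the row and the second is the column.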


a = np.array([[1, 2, 3], [4.5, 4, 6]], int)  # 4.5 is truncated to 4 by the int dtype
print(a)
# [[1 2 3]
#  [4 4 6]]

print(a[0, 0])
# 1

print(a[0][0])
# 1

a[0, 0] = 12  # assign 12 to matrix position (0, 0)
print(a)
# [[12  2  3]
#  [ 4  4  6]]

a[0][0] = 5  # assign 5 to matrix position (0, 0)
print(a)
# [[5 2 3]
#  [4 4 6]]

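- Beware of type coercion

# Back to the 1-D integer array (the reported dtype is platform-dependent and may be int64).
>>> a = np.array([10, 1, 2, 3])
>>> a.dtype
dtype('int32')

# In NumPy dtype names, the number after the type (8, 16, 32, 64, 128, ...) is the number of bits needed to store a single value in memory.

# Assigning a float to an int32 array drops the fractional part.

>>> a[0] = 10.6
>>> a
array([10,  1,  2,  3])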
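A minimal sketch of the two points above: the bit count in the dtype name, and truncation on assignment. The dtype is set explicitly to int32 here, which is an assumption made so that the result does not depend on the platform default.

```python
import numpy as np

a = np.array([10, 1, 2, 3], dtype=np.int32)  # dtype fixed explicitly for portability
print(a.itemsize)   # 4  (bytes per element, i.e. 32 bits)
print(a.nbytes)     # 16 (bytes in total)

a[0] = 10.6         # the fractional part is dropped, not rounded
print(a)            # [10  1  2  3]

b = a.astype(np.float64)  # convert first if fractions must be kept
b[0] = 10.6
print(b[0])         # 10.6
```

Converting with astype before assigning is one way to avoid losing the fraction silently.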

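-> Differences between a NumPy array and a Python list

NumPy array: all elements have the same type and size.

Python list: elements can have different types and sizes.

NumPy arrays are faster for numerical computation, because the data sits in one homogeneous block and the loops run in C.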
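The contrast can be illustrated with a small sketch. The variable names and sample values are hypothetical, and no timing is measured here.

```python
import numpy as np

mixed_list = [1, 2.5, "three"]   # a list keeps each element's own type
print([type(x).__name__ for x in mixed_list])  # ['int', 'float', 'str']

numbers = np.array([1, 2.5, 3])  # an array upcasts everything to one dtype
print(numbers.dtype)             # float64

# Element-wise arithmetic needs no explicit Python loop on an array.
doubled_list = [x * 2 for x in [1, 2, 3]]
doubled_array = np.array([1, 2, 3]) * 2
print(doubled_list)              # [2, 4, 6]
print(doubled_array)             # [2 4 6]
```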

------------------------------------------------------------------------------------------------------

- Multi-dimensional arrays

>>> a = np.array([[0, 1, 2, 3], [10, 11, 12, 13]])
>>> a
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13]])

- shape = (rows, columns)

>>> a.shape
(2, 4)
# shape returns a tuple listing the length of the array along each dimension.

- Element count (number of data items)

>>> a.size
8

- Number of dimensions

>>> a.ndim
2

- Get / set elements

>>> a[1, 3]
13
>>> a[1, 3] = -1
>>> a
array([[ 0,  1,  2,  3],
       [10, 11, 12, -1]])

-----------------------------------------------------------------------------------------------------

- Slicing

Form: var[lower:upper:step]

Extracts part of a sequence by specifying lower and upper bounds.

The lower bound element is included, but the upper bound element is not. Mathematically: [lower, upper)

The step value specifies the stride between elements.

- Slicing arrays

>>> a = np.array([10, 11, 12, 13, 14])
# indices:   0   1   2   3   4
#           -5  -4  -3  -2  -1

>>> a[:]
array([10, 11, 12, 13, 14])
>>> a[1:3]
array([11, 12])

# Negative indices also work.
>>> a[1:-2]
array([11, 12])
>>> a[-4:3]
array([11, 12])

- Omitting indices

: An omitted bound is assumed to be the start (or the end) of the sequence.

# the first 3 elements
>>> a[:3]
array([10, 11, 12])

# the last 2 elements
>>> a[-2:]
array([13, 14])

# every second element
>>> a[::2]
array([10, 12, 14])

- Array slicing

>>> a = np.array([[i + 10 * j for i in range(6)] for j in range(6)])

- Slicing works much like standard Python slicing.

>>> a[0, 3:5]
array([3, 4])
# row 0, columns 3:5

>>> a[4:, 4:]
array([[44, 45],
       [54, 55]])

>>> a[:, 2]
array([ 2, 12, 22, 32, 42, 52])

- A stride can also be specified.

>>> a[2::2, ::2]
array([[20, 22, 24],
       [40, 42, 44]])

-----------------------------------------------------------------------------------------------------------------------------------

- Array constructor examples

- Floating-point arrays

>>> a = np.array([0, 1.0, 2, 3])
>>> a.dtype
dtype('float64')
# float, 64 bits
# Elements cannot have different types, so every value is unified to float, the type among those given that needs the most information.

>>> a.nbytes
32
# 1 byte == 8 bits, so 64 bits == 8 bytes.
# 8 * 4 = 32

- Reducing precision

>>> a = np.array([0, 1., 2, 3], dtype='float32')
>>> a.dtype
dtype('float32')
>>> a.nbytes
16
# 4 * 4 = 16

-----------------------------------------------------------------------------------------------------------------------------------

- Array Creation Functions

- Identity

: An n x n matrix whose main diagonal is 1 and whose other entries are 0

>>> a = np.identity(4)
>>> a
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
>>> a.dtype
dtype('float64')
# The default dtype is float64.

# The dtype can be changed to int.
>>> np.identity(4, dtype=int)
array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]])

- ONES, ZEROS

ones(shape, dtype='float64')

zeros(shape, dtype='float64')

shape is a number or a sequence that specifies the dimensions of the array.

If dtype is not specified, it defaults to float64.

>>> np.ones((2, 3), dtype='float32')
array([[1., 1., 1.],
       [1., 1., 1.]], dtype=float32)
>>> np.zeros(3)
array([0., 0., 0.])
# np.zeros(3) is the same as np.zeros((3,)).

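- Linspace

: Generates n evenly spaced elements between a start value and an end value. Both the start and end values are included.

# start, end, count
>>> np.linspace(0, 1, 5)
array([0.  , 0.25, 0.5 , 0.75, 1.  ])

- something_like

: ones_like, zeros_like, and empty_like return an array of ones, zeros, or uninitialized values with the same shape as an existing ndarray.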
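As a rough illustration of the three *_like functions, assuming a small hypothetical template array:

```python
import numpy as np

template = np.arange(6).reshape(2, 3)  # any existing ndarray can serve as the template

ones = np.ones_like(template)      # same shape and dtype, filled with 1
zeros = np.zeros_like(template)    # same shape and dtype, filled with 0
empty = np.empty_like(template)    # same shape and dtype, contents uninitialized

print(ones.shape, zeros.shape, empty.shape)  # (2, 3) (2, 3) (2, 3)
```

The values in the empty_like result are arbitrary, so they should be overwritten before use.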

- eye

: A matrix with 1s on a diagonal; the k value changes the index at which the diagonal starts.

- diag

: Extracts the values on the diagonal of a matrix.
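The eye and diag descriptions above might look like this in practice. The matrix sizes and the k values are chosen only for illustration.

```python
import numpy as np

print(np.eye(3, dtype=int))
# [[1 0 0]
#  [0 1 0]
#  [0 0 1]]

shifted = np.eye(3, 4, k=1, dtype=int)  # 3 x 4 matrix, diagonal starts at column 1
print(shifted)
# [[0 1 0 0]
#  [0 0 1 0]
#  [0 0 0 1]]

matrix = np.arange(9).reshape(3, 3)
print(np.diag(matrix))        # [0 4 8]  main diagonal
print(np.diag(matrix, k=1))   # [1 5]    diagonal one step above the main one
```

Unlike identity, eye accepts a non-square shape and an offset k.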

- Random sampling

: Creates an array by sampling from a data distribution.

np.random.uniform -> uniform distribution

np.random.normal -> normal distribution
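A short sketch of both sampling functions. The seed, the distribution parameters, and the 2 x 5 shape are assumptions picked for the example.

```python
import numpy as np

np.random.seed(0)  # assumption: a fixed seed, only to make the run repeatable

uniform_samples = np.random.uniform(0, 1, 10).reshape(2, 5)  # uniform on [0, 1)
normal_samples = np.random.normal(0, 1, 10).reshape(2, 5)    # mean 0, std 1

print(uniform_samples.shape)  # (2, 5)
print(normal_samples.shape)   # (2, 5)
```

The third argument is the number of samples; the reshape call only arranges them into a matrix.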

- Arange

arange([start,] stop[, step], dtype=None)

: Almost the same as Python's range().

: The start value is included, but the end value is not.

# from 0 up to (but not including) 4, step 1
>>> np.arange(4)
array([0, 1, 2, 3])

Unlike range(), floating-point values can also be used.

# start, end, step
>>> np.arange(1.5, 2.1, 0.3)
array([1.5, 1.8, 2.1])
# With a non-integer step, finite machine precision can make the result inconsistent; here the end value 2.1 is included even though it should be excluded.
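When the number of points matters more than the exact step, linspace is a common alternative. The sketch below reuses the same bounds; treat it as one possible workaround, not the only one.

```python
import numpy as np

by_step = np.arange(1.5, 2.1, 0.3)    # length depends on rounding of (stop - start) / step
by_count = np.linspace(1.5, 2.1, 3)   # length is requested explicitly

print(len(by_step))    # may vary with floating-point rounding
print(len(by_count))   # 3
```

With linspace the count is fixed up front, so rounding cannot add or drop an element.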

-----------------------------------------------------------------------------------------------------------------------------------

- Transpose

>>> a = np.array([[0, 1, 2],
...               [3, 4, 5]])
>>> a.shape
(2, 3)
>>> a.T
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> a.T.shape
(3, 2)
# Transpose reverses the order of the axes.

- reshape

Changes the shape of an array. (The number of elements stays the same.)

>>> a = np.array([[0, 1, 2],
...               [3, 4, 5]])

# Returns a new array with a different shape.
>>> a.reshape(3, 2)
array([[0, 1],
       [2, 3],
       [4, 5]])

# Reshape cannot change the number of elements in the array.
>>> a.reshape(4, 2)
ValueError: cannot reshape array of size 6 into shape (4,2)


np.array(test_matrix).reshape(-1, 2).shape
# -1: the number of rows is chosen based on size
# Use this when the number of rows is not known exactly but there must be 2 columns.
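A runnable version of the -1 idea. Here test_matrix is hypothetical sample data, since the original does not show its contents.

```python
import numpy as np

test_matrix = [[1, 2, 3, 4], [5, 6, 7, 8]]  # hypothetical sample data: 8 elements

reshaped = np.array(test_matrix).reshape(-1, 2)  # -1 lets NumPy infer the row count
print(reshaped.shape)                            # (4, 2)

flat = np.array(test_matrix).reshape(-1)         # a single -1 flattens to 1-D
print(flat.shape)                                # (8,)
```

Only one dimension may be given as -1, because NumPy infers it from the total size and the other dimensions.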


- Broadcasting

: A feature that supports operations between arrays of different shapes.
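A minimal broadcasting sketch with a scalar, a row, and a column. All values are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,)
column = np.array([[100], [200]])     # shape (2, 1)

print(matrix + 1)        # the scalar is stretched over every element
print(matrix + row)      # the row is repeated for each of the 2 rows
print(matrix + column)   # the column is repeated across the 3 columns
```

Shapes are compared from the trailing dimension backwards, and a dimension of length 1 (or a missing one) is stretched to match the other operand.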

-----------------------------------------------------------------------------------------------

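- Operation Function

- axis: the dimension axis that serves as the reference when an operation function runs

(axis=0 -> the row axis / axis=1 -> the column axis; an operation along axis=0 collapses the rows and gives one result per column)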

- sum
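A short sketch of sum with and without axis, using a hypothetical 2 x 3 matrix:

```python
import numpy as np

m = np.arange(1, 7).reshape(2, 3)
# [[1 2 3]
#  [4 5 6]]

print(m.sum())         # 21       every element
print(m.sum(axis=0))   # [5 7 9]  collapses the rows: one total per column
print(m.sum(axis=1))   # [ 6 15]  collapses the columns: one total per row
```

The axis named in the call is the one that disappears from the result's shape.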

- concatenate

: A function that joins NumPy arrays

np.vstack((a, b)) -> joins vertically; for 2-D arrays == np.concatenate((a, b), axis=0)

np.hstack((a, b)) -> joins horizontally; for 2-D arrays == np.concatenate((a, b), axis=1)
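The equivalence can be checked on two small 2-D arrays. The values are made up, and the claim is only demonstrated for 2-D inputs.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

stacked_rows = np.vstack((a, b))   # shape (4, 2)
stacked_cols = np.hstack((a, b))   # shape (2, 4)

# For 2-D inputs these match concatenate along axis 0 and axis 1.
print(np.array_equal(stacked_rows, np.concatenate((a, b), axis=0)))  # True
print(np.array_equal(stacked_cols, np.concatenate((a, b), axis=1)))  # True
```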

3. Vector & Matrix with NumPy

- Vectors are just 1-D arrays.

>>> v = np.arange(3)
>>> v
array([0, 1, 2])

- Matrices are just 2-D arrays.

>>> M = np.arange(9).reshape(3, 3)
>>> M
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

- Matrix & Vector Multiplication

- * is the element-wise multiplication operator.

>>> v * v
array([0, 1, 4])
>>> M * M
array([[ 0,  1,  4],
       [ 9, 16, 25],
       [36, 49, 64]])

- It is not used much in computer graphics.

- Matrix multiplication requires the "dot product", the inner product in Euclidean space.

- @ is the matrix multiplication operator; for 1-D and 2-D arrays it is equivalent to np.dot().

>>> v @ v
5
>>> M @ M
array([[ 15,  18,  21],
       [ 42,  54,  66],
       [ 69,  90, 111]])
>>> M @ v
array([ 5, 14, 23])
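The difference between the two operators can be checked side by side. This sketch rebuilds v and M as defined above and compares @ with np.dot for the 2-D case only.

```python
import numpy as np

v = np.arange(3)
M = np.arange(9).reshape(3, 3)

print(v * v)    # [0 1 4]     element-wise product
print(v @ v)    # 5           inner (dot) product of two 1-D arrays
print(M @ v)    # [ 5 14 23]  matrix-vector product

# For 1-D and 2-D arrays, @ and np.dot give the same result.
print(np.array_equal(M @ M, np.dot(M, M)))  # True
```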

Reference: www.boostcourse.org/ai222/lecture/24071/
