Section 1. Machine learning concepts and terminology

What is ML?

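Limits of explicit programming
- Spam filter: deciding whether an email is spam takes too many rules to write by hand
- Automatic driving: likewise, there are too many rules
To overcome these limits, machine learning lets the program learn from data instead of following hand-written rules.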

Supervised/Unsupervised learning

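Supervised learning: learning from data whose answers (labels) are already given
- e.g. training on photos while telling the model each photo's label
Unsupervised learning: learning from data that has no labels
- the model looks at the data and finds structure on its own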

Supervised learning

Image labeling: learning from tagged images
Email spam filter: learning from labeled emails

Training data set

Machine learning needs a training data set to learn from
- e.g. AlphaGo learned from records of past Go games

Types of supervised learning

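regression: predicting a final exam score from the hours spent studying
binary classification: predicting pass/non-pass from the hours spent studying
multi-label classification (more commonly called multi-class classification): assigning a letter grade from the hours spent studying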

Basic TensorFlow operations

Define the graph
Run the graph in a session, passing data in through feed_dict
As the graph runs, it updates variables or returns output values

Tensor

A tensor is a multi-dimensional array
- Tensors are distinguished by their number of dimensions (rank)
  - Scalar: rank 0
  - Vector: rank 1
  - Matrix: rank 2
  - 3-Tensor: rank 3
  - n-Tensor: rank n
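
As a rough illustration of the rank list above, the sketch below counts how deeply a nested Python list is nested. The helper name `rank` and the sample values are made up for this example; TensorFlow has its own tensor type, so this is only an analogy in plain Python.

```python
def rank(tensor):
    """Return the nesting depth of a list-based tensor (0 for a plain number)."""
    depth = 0
    while isinstance(tensor, list):
        depth += 1
        tensor = tensor[0]  # assumes non-empty, rectangular nesting
    return depth


scalar = 3.0                                  # rank 0
vector = [1.0, 2.0, 3.0]                      # rank 1
matrix = [[1.0, 2.0], [3.0, 4.0]]             # rank 2
tensor3 = [[[1.0], [2.0]], [[3.0], [4.0]]]    # rank 3

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("3-tensor", tensor3)]:
    print(name, "rank =", rank(value))
```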

Section 2

Predicting exam score: regression

X (hours)	Y (score)
10	90
9	80
3	50
2	30

Regression predicts a score from the hours studied, based on the data it has already learned from.

X	Y
1	1
2	2
3	3

For this data, the relationship can be written in the linear form Y = X.

Hypothesis

H(x) = Wx + b

Cost

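The cost measures how far the hypothesis's predictions are from the actual values.
The lower the cost, the better the model's predictions.
The cost is computed as the mean squared error: cost(W, b) = (1/m) Σ (H(x_i) - y_i)^2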
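
The mean squared error can be sketched in a few lines of plain Python. This assumes the linear hypothesis H(x) = Wx + b and reuses the small X = Y = 1, 2, 3 data from this section; the function names are chosen only for this example.

```python
def hypothesis(x, w, b):
    """Linear hypothesis H(x) = W * x + b."""
    return w * x + b


def cost(xs, ys, w, b):
    """Mean squared error between predictions and actual values."""
    m = len(xs)
    return sum((hypothesis(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / m


xs = [1, 2, 3]
ys = [1, 2, 3]

print(cost(xs, ys, w=1.0, b=0.0))  # perfect fit, so the cost is 0
print(cost(xs, ys, w=0.0, b=0.0))  # poor fit, so the cost is larger
```

With W = 1 and b = 0 every prediction matches its target, which is why the first cost is zero; moving W away from 1 makes the cost grow.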

Section 3

Gradient descent algorithm

Used to minimize the cost in the cost function.
It also applies to many other minimization problems.
Given cost(W, b), it finds the W and b that minimize the cost.

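How it works

Start from any point
Keep changing W and b a little at a time so that the cost decreases

At each step, move W against the slope (gradient) of the cost; the search settles at the W where the slope is close to 0, which is the minimum.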
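
The update loop described here can be sketched without TensorFlow. The example assumes a hypothesis H(x) = Wx with no bias, the X = Y = 1, 2, 3 data, a deliberately wrong starting weight of 5.0, and a learning rate of 0.01; these values are picked for illustration.

```python
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]


def gradient(w):
    """Slope of mean((W*x - y)^2) with respect to W: 2 * mean((W*x - y) * x)."""
    return 2 * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 5.0               # start from a deliberately wrong weight
learning_rate = 0.01

for step in range(101):
    w -= learning_rate * gradient(w)   # step against the slope
    if step % 20 == 0:
        print(step, w)
```

Each step shrinks the distance between W and 1 by a constant factor, so W approaches 1, the value at which the slope is 0.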

Lab03

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]

# Set wrong model weights
W = tf.Variable(5.)

# Linear model
hypothesis = X * W

# Manual gradient
gradient = tf.reduce_mean((W * X - Y) * X) * 2

# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))

# Minimize: Gradient Descent Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

# Get gradients
gvs = optimizer.compute_gradients(cost)

# Optional: modify gradient if necessary
# gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]

# Apply gradients
apply_gradients = optimizer.apply_gradients(gvs)

# Launch the graph in a session.
with tf.Session() as sess:
    # Initializes global variables in the graph.
    sess.run(tf.global_variables_initializer())

    for step in range(101):
        gradient_val, gvs_val, _ = sess.run([gradient, gvs, apply_gradients])
        print(step, gradient_val, gvs_val)

Section 4

Predicting a result from multiple inputs

Hypothesis

H(x1, x2, x3) = w1x1 + w2x2 + w3x3 + b

H(X) = XW

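Simplifying with a matrix

When there are many instances, the matrix form computes all of them in one product instead of one instance at a time.
A matrix product satisfies [a, b] x [b, c] = [a, c], so the shapes of X and H(X) determine the shape W must have.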
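
The shape rule can be checked with a small hand-written matrix product. The data below is made up: two instances with three features each, so X is [2, 3], H(X) should be [2, 1], and W therefore has to be [3, 1]. The `matmul` helper is written for this sketch and is not a library function.

```python
def matmul(a, b):
    """Multiply an [n, k] matrix by a [k, m] matrix given as nested lists."""
    assert len(a[0]) == len(b), "inner dimensions must match"
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Made-up data: 2 instances, 3 features each -> shape [2, 3]
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

# H(X) should be [2, 1], so W must be [3, 1]
W = [[0.5],
     [1.0],
     [2.0]]

H = matmul(X, W)   # one product covers every instance
print(H)
print(len(H), len(H[0]))
```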

Loading data from file

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

import matplotlib.pyplot as plt
import numpy as np

xy = np.loadtxt('data-01-test-score.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]

print(x_data.shape, x_data, len(x_data))
print(y_data.shape, y_data)

numpy's loadtxt reads the data file into an array; slicing then separates the feature columns (x_data) from the last column (y_data).
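
For comparison, the same column split can be sketched with only the standard library. This is not what the lab does (it uses numpy); the CSV text below is made up and stands in for a file with three feature columns followed by one label column.

```python
import csv
import io

# Made-up rows standing in for a CSV file: three features, then the label
csv_text = "1,2,3,10\n4,5,6,20\n"

rows = [[float(v) for v in row] for row in csv.reader(io.StringIO(csv_text))]

x_data = [row[:-1] for row in rows]   # all columns but the last, like xy[:, 0:-1]
y_data = [row[-1:] for row in rows]   # last column kept as a column, like xy[:, [-1]]

print(x_data)
print(y_data)
```

Keeping each label as a one-element list mirrors the `[-1]` index in the numpy version, which preserves the (n, 1) shape instead of flattening to (n,).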

Queue Runners

Put the filenames into a Filename Queue
Connect a Reader to it, read the data, then decode it
Store the decoded values in an Example Queue for use during training

Build the list of files:

filename_queue = tf.train.string_input_producer(
    ['data-01-test-score.csv'], shuffle=False, name='filename_queue')

Create a Reader for the files and read key, value pairs from it:

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)

Decode the values that were read:

record_defaults = [[0.], [0.], [0.], [0.]]
xy = tf.decode_csv(value, record_defaults=record_defaults)

tf.train.batch

Read the data in batches with tf.train.batch:

train_x_batch, train_y_batch = \
    tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10)

tf.train.shuffle_batch returns the examples in shuffled order instead.
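
The read, decode, batch flow can be imitated with plain Python generators. This is only a conceptual analogue: it does not use TensorFlow queues, it stops when the data runs out, and every name and value in it is made up for the sketch.

```python
import random


def read_lines(sources):
    """Yield one text line at a time from each source (stand-in for the Reader)."""
    for lines in sources:
        for line in lines:
            yield line


def decode(line):
    """Split a CSV line into floats (stand-in for the decode step)."""
    return [float(v) for v in line.split(",")]


def batches(records, batch_size, shuffle=False, seed=0):
    """Group decoded records into (x_batch, y_batch) pairs."""
    records = list(records)
    if shuffle:
        random.Random(seed).shuffle(records)
    for i in range(0, len(records), batch_size):
        chunk = records[i:i + batch_size]
        yield [r[:-1] for r in chunk], [r[-1:] for r in chunk]


# Made-up in-memory "files", each a list of CSV lines
sources = [["1,2,3,10", "4,5,6,20"], ["7,8,9,30"]]

decoded = (decode(line) for line in read_lines(sources))
result = list(batches(decoded, batch_size=2))
print(result)
```

Passing `shuffle=True` reorders the records before they are grouped, which is the role the shuffled variant plays in the pipeline.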

Section 5

Logistic (regression) classification

Classification

Spam Detection
Facebook feed: learns from earlier timelines and shows a suitable feed
Credit Card Fraudulent Transaction: detects unusual usage, for example when a card has been stolen

0, 1 Encoding

Classification decides between 0 and 1

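Difference from linear regression

Linear regression can output values other than 0 and 1, including values below 0 or above 1
Classification, by contrast, restricts the output to a value between 0 and 1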

sigmoid

The sigmoid function is used because its output always lies between 0 and 1
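
A minimal sketch of the sigmoid, assuming the standard definition sigmoid(z) = 1 / (1 + e^(-z)); the input values are arbitrary.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


for z in [-10, -1, 0, 1, 10]:
    print(z, sigmoid(z))
```

Large negative inputs give values near 0, large positive inputs give values near 1, and an input of 0 gives exactly 0.5.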

Logistic Hypothesis

Because of this property, the hypothesis for logistic regression passes the linear model through the sigmoid: H(X) = 1 / (1 + e^(-(XW + b)))

Cost function

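If the mean squared error is used as the cost function, as in linear regression, the sigmoid makes the resulting cost curve non-convex.
With such a cost, gradient descent is very likely to get stuck in a wrong minimum (a local minimum) instead of reaching the minimum we are looking for (the global minimum).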

New cost function for logistic

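For that reason a new cost function is used.
The sigmoid returns a value between 0 and 1. The error should therefore be large when the actual value is 0 and the prediction approaches 1, and also large when the actual value is 1 and the prediction approaches 0.
The log function has exactly this property: -log(H(x)) grows without bound as H(x) approaches 0, and -log(1 - H(x)) grows without bound as H(x) approaches 1.

Written as a formula: cost(W) = -(1/m) Σ [ y log(H(x)) + (1 - y) log(1 - H(x)) ]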
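
The behavior of the log-based cost can be checked numerically. The sketch assumes the usual logistic cost, -mean(y log(h) + (1 - y) log(1 - h)), and uses made-up predictions for two samples whose labels are 0 and 1.

```python
import math


def logistic_cost(predictions, labels):
    """Average of -[y*log(h) + (1-y)*log(1-h)] over all samples."""
    total = 0.0
    for h, y in zip(predictions, labels):
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / len(labels)


labels = [0, 1]

good = logistic_cost([0.1, 0.9], labels)   # predictions close to the labels
bad = logistic_cost([0.9, 0.1], labels)    # predictions far from the labels

print(good, bad)
```

Predictions near the true labels give a small cost, while confident wrong predictions are penalized much more heavily.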

Lab05-1

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

import matplotlib.pyplot as plt
import numpy as np

x_data = [[1, 2],
          [2, 3],
          [3, 1],
          [4, 3],
          [5, 3],
          [6, 2]] #[x1,x2]
y_data = [[0],
          [0],
          [0],
          [1],
          [1],
          [1]]  # 0 = fail, 1 = pass

# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])

W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

# Build the hypothesis by applying the sigmoid function
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) *
                       tf.log(1 - hypothesis))

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))

# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())

    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, cost_val)

    # Accuracy report
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nPredicted (Y): ", c, "\nAccuracy: ", a)

With the sigmoid applied to the hypothesis and the new cost function used for the cost, the lab predicts y from x1 and x2.

Section 6

Multinomial classification

Classification with more than two possible results
- not 0 or 1, but A or B or C
Implemented with three binary classifiers (hypotheses H) that decide A or not, B or not, and C or not

Softmax

Softmax converts the scores into probabilities that sum to 1: S(y_i) = e^(y_i) / Σ_j e^(y_j)
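
A small sketch of softmax in plain Python, assuming the standard definition exp(score_i) / sum of exp(score_j). The three scores are illustrative, and subtracting the maximum before exponentiating is a common numerical-stability step that does not change the result.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]   # avoids overflow in exp
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

The largest score receives the largest probability, and the outputs add up to 1.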

Cross-entropy cost function

y' is the predicted value for y, and the loss is -Σ_i y_i log(y'_i). A correct prediction gives a loss of 0; a wrong prediction drives the loss toward infinity.
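
The cross-entropy for a single sample can be sketched as follows, assuming a one-hot label and the form -sum(label_i * log(prediction_i)). The probabilities are made up, and terms whose label is 0 are skipped so that log is never evaluated where it does not contribute.

```python
import math


def cross_entropy(label_one_hot, predicted_probs):
    """-sum(label_i * log(pred_i)); only the true class contributes."""
    return -sum(l * math.log(p)
                for l, p in zip(label_one_hot, predicted_probs) if l > 0)


label = [0, 1, 0]   # the true class is index 1

right = cross_entropy(label, [0.1, 0.8, 0.1])   # most probability on the true class
wrong = cross_entropy(label, [0.8, 0.1, 0.1])   # most probability on a wrong class

print(right, wrong)
```

The loss shrinks toward 0 as the probability assigned to the true class approaches 1, and it grows without bound as that probability approaches 0.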

Lab06-1 softmax

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

import matplotlib.pyplot as plt
import numpy as np

x_data = [[1, 2, 1, 1],
          [2, 1, 3, 2],
          [3, 1, 3, 4],
          [4, 1, 5, 5],
          [1, 7, 5, 5],
          [1, 2, 5, 6],
          [1, 6, 6, 6],
          [1, 7, 7, 7]]
# y is expressed with one-hot encoding
y_data = [[0, 0, 1],
          [0, 0, 1],
          [0, 0, 1],
          [0, 1, 0],
          [0, 1, 0],
          [0, 1, 0],
          [1, 0, 0],
          [1, 0, 0]]

X = tf.placeholder("float", [None, 4])
Y = tf.placeholder("float", [None, 3])
# number of classes
nb_classes = 3

W = tf.Variable(tf.random_normal([4, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')

# Apply softmax: exp(logits) / reduce_sum(exp(logits), dim)
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)

# Cross entropy cost/loss function
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for step in range(2001):
        _, cost_val = sess.run([optimizer, cost], feed_dict={X: x_data, Y: y_data})

        if step % 200 == 0:
            print(step, cost_val)

    print('--------------')
    # Testing & One-hot encoding
    a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9]]})
    print(a, sess.run(tf.argmax(a, 1)))

    print('--------------')
    b = sess.run(hypothesis, feed_dict={X: [[1, 3, 4, 3]]})
    print(b, sess.run(tf.argmax(b, 1)))

    print('--------------')
    c = sess.run(hypothesis, feed_dict={X: [[1, 1, 0, 1]]})
    print(c, sess.run(tf.argmax(c, 1)))

    print('--------------')
    # Named all_probs so the built-in all() is not shadowed
    all_probs = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9], [1, 3, 4, 3], [1, 1, 0, 1]]})
    print(all_probs, sess.run(tf.argmax(all_probs, 1)))

argmax takes the tensor to search and the axis along which to search, and returns the index of the largest value along that axis.
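
What the axis argument changes can be shown with nested lists. The two helpers below are written for this sketch and only mimic the idea for a 2-D input; the probabilities are made up.

```python
def argmax_axis1(rows):
    """Index of the largest value within each row."""
    return [row.index(max(row)) for row in rows]


def argmax_axis0(rows):
    """Index of the largest value within each column."""
    columns = list(zip(*rows))
    return [col.index(max(col)) for col in columns]


probs = [[0.1, 0.7, 0.2],
         [0.8, 0.1, 0.1]]

print(argmax_axis1(probs))   # one class index per sample
print(argmax_axis0(probs))   # one row index per class column
```

In the lab, axis 1 is the one used, because each row holds the class probabilities of one sample and the index of the largest entry is the predicted class.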

Lab06-2 Fancy softmax

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

import matplotlib.pyplot as plt
import numpy as np

xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]

print(x_data.shape, y_data.shape)

'''
(101, 16) (101, 1)
'''

nb_classes = 7  # 0 ~ 6

X = tf.placeholder(tf.float32, [None, 16])
Y = tf.placeholder(tf.int32, [None, 1])  # 0 ~ 6

Y_one_hot = tf.one_hot(Y, nb_classes)  # convert labels 0~6 to one-hot vectors of length nb_classes; the shape becomes (n, 1, 7)
print("one_hot:", Y_one_hot)
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])  # reshape the one-hot result from (n, 1, 7) back to (n, 7)
print("reshape one_hot:", Y_one_hot)

'''
one_hot: Tensor("one_hot:0", shape=(?, 1, 7), dtype=float32)
reshape one_hot: Tensor("Reshape:0", shape=(?, 7), dtype=float32)
'''

W = tf.Variable(tf.random_normal([16, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')

# tf.nn.softmax computes softmax activations
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)

# Cross entropy cost/loss
# Pass the one-hot labels and the raw logits; the loss applies softmax internally
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits,
                                                                 labels=tf.stop_gradient(Y_one_hot)))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

prediction = tf.argmax(hypothesis, 1)
correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for step in range(2001):
        _, cost_val, acc_val = sess.run([optimizer, cost, accuracy], feed_dict={X: x_data, Y: y_data})
                                        
        if step % 100 == 0:
            print("Step: {:5}\tCost: {:.3f}\tAcc: {:.2%}".format(step, cost_val, acc_val))

    # Let's see if we can predict
    pred = sess.run(prediction, feed_dict={X: x_data})
    # y_data: (N,1) = flatten => (N, ) matches pred.shape
    for p, y in zip(pred, y_data.flatten()):
        print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y)))
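
The shape change caused by one-hot encoding in this lab, (n, 1) to (n, 1, 7) and then back to (n, 7), can be reproduced with nested lists. The `one_hot` helper and the three labels are made up for this sketch; they only imitate what the TensorFlow calls do.

```python
def one_hot(index, depth):
    """Vector of length depth with 1.0 at position index and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


nb_classes = 7
y_data = [[0], [3], [6]]   # made-up labels with shape (3, 1)

# Encoding each label inside its own row adds a dimension: shape (3, 1, 7)
nested = [[one_hot(label, nb_classes) for label in row] for row in y_data]

# Flattening the middle dimension gives shape (3, 7), like the reshape step
flat = [vec for row in nested for vec in row]

print(len(nested), len(nested[0]), len(nested[0][0]))
print(len(flat), len(flat[0]))
```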

License: Attribution-ShareAlike

[AI Study] 이정욱, Week 1: 모두를 위한 딥러닝 (Deep Learning for Everyone), basic machine learning and deep learning course, Section 0 to Section 6