• ํŠธ๋ ˆ์ด๋‹ ์„ธํŠธ์™€ ๊ฒ€์ฆ์„ธํŠธ ๋‚˜๋ˆ„๊ธฐ
from tensorflow import keras 
from sklearn.model_selection import train_test_split

(train_input, train_target), (test_input, test_target) = keras.datasets.fashion_mnist.load_data()
train_scaled = train_input / 255.0
train_scaled, val_scaled, train_target, val_target = train_test_split(
    train_scaled, train_target, test_size=0.2, random_state=42
)
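As a sanity check, the 80/20 split arithmetic can be reproduced on dummy arrays of the Fashion MNIST training-set shape. This is a NumPy-only sketch; `split_train_val` is a hypothetical helper mimicking what `train_test_split` does above:

```python
import numpy as np

def split_train_val(x, y, val_fraction=0.2, seed=42):
    """Hypothetical helper mimicking train_test_split: shuffle the
    indices, then hold out val_fraction of the samples for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return x[train_idx], x[val_idx], y[train_idx], y[val_idx]

# Dummy arrays with the Fashion MNIST training-set shape
x = np.zeros((60000, 28, 28), dtype=np.float32)
y = np.zeros(60000, dtype=np.int64)
x_tr, x_val, y_tr, y_val = split_train_val(x, y)
print(x_tr.shape, x_val.shape)  # (48000, 28, 28) (12000, 28, 28)
```

So the 60,000 training images end up as 48,000 training and 12,000 validation samples.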
  • ๋ชจ๋ธ์„ ๋งŒ๋“œ๋Š” ํ•จ์ˆ˜ ์ƒ์„ฑ
def model_fn(a_layer=None):
    model = keras.Sequential()
    model.add(keras.layers.Flatten(input_shape=(28,28)))
    model.add(keras.layers.Dense(100, activation="relu"))
    if a_layer:
        model.add(a_layer)
    model.add(keras.layers.Dense(10, activation="softmax"))
    return model
model = model_fn()
model.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_1 (Flatten)         (None, 784)               0         
                                                                 
 dense_1 (Dense)             (None, 100)               78500     
                                                                 
 dense_2 (Dense)             (None, 10)                1010      
                                                                 
=================================================================
Total params: 79,510
Trainable params: 79,510
Non-trainable params: 0
_________________________________________________________________
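The parameter counts in the summary follow from dense-layer arithmetic (inputs × units, plus one bias per unit):

```python
# Dense layer parameters = inputs * units + units (one bias per unit)
flatten_out = 28 * 28              # Flatten produces 784 values, 0 parameters
dense_1 = flatten_out * 100 + 100  # 78,500
dense_2 = 100 * 10 + 10            # 1,010
print(dense_1, dense_2, dense_1 + dense_2)  # 78500 1010 79510
```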
model.compile(loss="sparse_categorical_crossentropy", metrics="accuracy")
history = model.fit(train_scaled, train_target, epochs=5, verbose=0)
history.history
{'loss': [0.5299330353736877,
  0.3862016499042511,
  0.35116592049598694,
  0.3301049470901489,
  0.3153148293495178],
 'accuracy': [0.8133958578109741,
  0.8612083196640015,
  0.8732500076293945,
  0.8815833330154419,
  0.8871250152587891]}

Loss curves

import matplotlib.pyplot as plt
  • ๊ฐ ์—ํฌํฌ๋งˆ๋‹ค ์ธก์ •๋œ loss ๊ฐ’
plt.plot(history.history["loss"])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.show()

[figure: training loss vs. epoch]

  • ๊ฐ ์—ํฌํฌ๋งˆ๋‹ค ์ธก์ •๋œ ์ •ํ™•๋„
plt.plot(history.history["accuracy"])
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.show()

[figure: training accuracy vs. epoch]

  • ์—ํฌํฌ๊ฐ€ ๋Š˜์–ด๋‚  ์ˆ˜๋ก loss ์ค„๊ณ  accuracy ๋Š˜์–ด๋‚จ
  • epoch๋ฅผ 20์œผ๋กœ ์˜ฌ๋ ธ๋”๋‹ˆ ์†์‹ค ์ž˜ ๊ฐ์†Œ
  • ๊ณผ์—ฐ ๋” ๋‚˜์€ ๋ชจ๋ธ์„ ํ›ˆ๋ จํ•œ ๊ฒƒ์ผ๊นŒ?
model = model_fn()
model.compile(loss="sparse_categorical_crossentropy", metrics="accuracy")
history = model.fit(train_scaled, train_target, epochs=20, verbose=0)
plt.plot(history.history['loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.show()

[figure: training loss vs. epoch, 20 epochs]

๊ฒ€์ฆ ์†์‹ค

  • ์—ํฌํฌ์— ๋Œ€ํ•œ ๊ณผ๋Œ€์ ํ•ฉ๊ณผ ๊ณผ์†Œ์ ํ•ฉ ํŒŒ์•…ํ•˜๋ ค๋ฉด ํ›ˆ๋ จ์„ธํŠธ์— ๋Œ€ํ•œ ์ ์ˆ˜ ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ๊ฒ€์ฆ ์„ธํŠธ์— ๋Œ€ํ•œ ์ ์ˆ˜๋„ ํ•„์š”
  • fit ๋ฉ”์„œ๋“œ์— validation_data ๋งค๊ฐœ๋ณ€์ˆ˜ ์ถ”๊ฐ€ํ•˜๋ฉด ๊ฒ€์ฆ์„ธํŠธ์— ๋Œ€ํ•œ loss์™€ accuracy๋„ ์•Œ ์ˆ˜ ์žˆ์Œ
model = model_fn()
model.compile(loss="sparse_categorical_crossentropy", metrics="accuracy")
history = model.fit(train_scaled, train_target, epochs=20, verbose=0, validation_data=(val_scaled, val_target))

history.history.keys()
dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
  • The validation loss decreases at first, then starts rising again
  • Since the training loss keeps falling, this is a textbook overfit model
plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train','val'])
plt.show()

[figure: training vs. validation loss, default optimizer]
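The turning point of a loss curve can also be located programmatically. A sketch using a hypothetical `val_loss` trace that falls and then rises, as in the plot:

```python
import numpy as np

# Hypothetical val_loss trace that falls, then rises (overfitting)
val_loss = [0.42, 0.38, 0.36, 0.35, 0.37, 0.40, 0.44]
best_epoch = int(np.argmin(val_loss))
print(best_epoch + 1)  # lowest validation loss at epoch 4
```

With a real run, `history.history['val_loss']` would take the place of the hand-written list.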

  • Can tuning the optimizer hyperparameters reduce overfitting?
    • Adam uses adaptive learning rates, so the effective step size adjusts as the epochs progress
  • Overfitting is much reduced
    • Some fluctuation still remains
    • This suggests the Adam optimizer is a good fit for this dataset
model = model_fn()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics="accuracy")
history = model.fit(train_scaled, train_target, epochs=20, verbose=0, validation_data=(val_scaled, val_target))

plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train','val'])
plt.show()

[figure: training vs. validation loss with Adam]

Dropout

  • Dropout: randomly turns off some of a layer's neurons during training to prevent overfitting
  • Provided as the Dropout class in the keras.layers package
    • Placed after a layer, it randomly zeroes that layer's outputs
    • It is used like a layer but has no trainable model parameters
    • It only zeroes some neurons' outputs; the shape of the output array is unchanged
model = model_fn(keras.layers.Dropout(0.3))
model.summary()
Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_6 (Flatten)         (None, 784)               0         
                                                                 
 dense_11 (Dense)            (None, 100)               78500     
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense_12 (Dense)            (None, 10)                1010      
                                                                 
=================================================================
Total params: 79,510
Trainable params: 79,510
Non-trainable params: 0
_________________________________________________________________
  • Overfitting is clearly reduced
  • Around the tenth epoch the validation loss stops decreasing, but it does not rise sharply and roughly holds its level
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics="accuracy")
history = model.fit(train_scaled, train_target, epochs=20, verbose=0, validation_data=(val_scaled, val_target))

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train','val'])
plt.show()

[figure: training vs. validation loss with dropout]

๋ชจ๋ธ ์ €์žฅ๊ณผ ๋ณต์›

  • ์œ„์˜ ์‹คํ—˜ ๊ฒฐ๊ณผ 10๋ฒˆ์ด ์ตœ์ 
  • ์œ„์˜ ๋ชจ๋ธ์„ ์—ํฌํฌ ํšŸ์ˆ˜๋ฅผ 10์œผ๋กœ ์ง€์ •ํ•˜๊ณ  ๋‹ค์‹œ ๋ชจ๋ธ ํ›ˆ๋ จํ•˜๊ธฐ
model = model_fn(keras.layers.Dropout(0.3))
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics="accuracy")
history = model.fit(train_scaled, train_target, epochs=10, verbose=0, validation_data=(val_scaled, val_target))

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train','val'])
plt.show()

[figure: training vs. validation loss, 10 epochs]

  • save_weights: saves the trained model's parameters
model.save_weights("model-weights.h5")
  • save: saves the model architecture and parameters together
model.save('model-whole.h5')
  • ํ™•์ธ

[figure: listing of the saved .h5 files]
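The saved files can also be checked from Python. A small sketch with a hypothetical `list_saved_models` helper that reports file names and sizes in the working directory:

```python
from pathlib import Path

def list_saved_models(directory=".", pattern="*.h5"):
    """Return (filename, size-in-bytes) pairs for saved model files."""
    return sorted((p.name, p.stat().st_size)
                  for p in Path(directory).glob(pattern))

for name, size in list_saved_models():
    print(f"{name}: {size} bytes")
```

After the two save calls above, this should list both model-weights.h5 and model-whole.h5.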

Two experiments

  1. Build a fresh, untrained model and load the trained parameters from model-weights.h5
  2. Recreate the whole model directly from the model-whole.h5 file and use it as-is

First experiment

  • load_weights() -> method that loads model parameters
  • Check this model's validation accuracy
    • Keras's predict method, unlike scikit-learn's, returns probabilities for all 10 classes
    • The validation set has 12,000 samples, so predict returns an array of shape (12000, 10)
model = model_fn(keras.layers.Dropout(0.3))
model.load_weights('model-weights.h5')
import numpy as np
val_labels = np.argmax(model.predict(val_scaled), axis=-1)
np.mean(val_labels==val_target)
0.8781666666666667

Second experiment

model = keras.models.load_model('model-whole.h5')
model.evaluate(val_scaled, val_target)
375/375 [==============================] - 0s 833us/step - loss: 0.3325 - accuracy: 0.8782

[0.33252522349357605, 0.878166675567627]

Conclusion

  • Because we saved and reloaded the same model, we get the same accuracy
  • The inconvenient part:
    • We first train for 20 epochs to find the point where the validation score starts to worsen
    • Then retrain for just the number of epochs before overfitting sets in
    • Is there a way to avoid training twice?

Callbacks

  • Callback: an object that lets you run some task at points during training
  • ModelCheckpoint callback: with save_best_only=True, it saves the model with the best validation score
    • But training still runs for the full 20 epochs
    • Once the validation loss starts rising, overfitting only grows, so there is no point in continuing
    • Stopping training before overfitting sets in is called early stopping
model = model_fn(keras.layers.Dropout(0.3))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics='accuracy')
checkpoint_cb = keras.callbacks.ModelCheckpoint('best-model.h5', save_best_only=True)
model.fit(
    train_scaled, train_target, epochs=20, verbose=0, 
    validation_data=(val_scaled, val_target), callbacks=[checkpoint_cb]
)
model = keras.models.load_model('best-model.h5')
model.evaluate(val_scaled, val_target)
375/375 [==============================] - 0s 792us/step - loss: 0.3189 - accuracy: 0.8856

[0.3189184069633484, 0.8855833411216736]

Early Stopping

  • Early stopping: halting training before overfitting grows
  • patience: training stops when the validation score fails to improve for this many consecutive epochs
model = model_fn(keras.layers.Dropout(0.3))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics='accuracy')
checkpoint_cb = keras.callbacks.ModelCheckpoint('best-model.h5', save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=2, restore_best_weights=True)
history = model.fit(
    train_scaled, train_target, epochs=20, verbose=0, 
    validation_data=(val_scaled, val_target), callbacks=[checkpoint_cb, early_stopping_cb]
)

Training stopped early: stopped_epoch is 11 (a zero-based index), i.e. during the twelfth epoch

early_stopping_cb.stopped_epoch
11
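With patience=2, the best weights correspond to two epochs before the stop, so the best epoch can be recovered by simple arithmetic (assuming the zero-based index reported above):

```python
# stopped_epoch is a zero-based index; with patience=2 the best
# validation loss occurred two epochs before training stopped.
stopped_epoch = 11
patience = 2
best_epoch = stopped_epoch - patience
print(best_epoch + 1)  # the 10th epoch had the best validation loss
```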
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train','val'])
plt.show()

[figure: training vs. validation loss with early stopping]

model.evaluate(val_scaled, val_target)
375/375 [==============================] - 0s 791us/step - loss: 0.3259 - accuracy: 0.8845

[0.3258848488330841, 0.8845000267028809]