
Deep Learning

Deep Learning : Unzipping files with Python / Turning image files like JPG or PNG into training data (Tensorflow's ImageDataGenerator)

* Unzipping files with Python

 

In :

import zipfile
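The actual extraction calls appear in the next section. As a self-contained sketch (using a throwaway archive built in a temp directory, not the horse-or-human data), `zipfile` can also be used with a `with` block, which closes the archive automatically so no explicit `close()` is needed:

```python
import os
import tempfile
import zipfile

# Build a tiny zip in a temp directory so the sketch is self-contained.
tmp = tempfile.mkdtemp()
sample = os.path.join(tmp, 'hello.txt')
with open(sample, 'w') as f:
    f.write('hi')

archive = os.path.join(tmp, 'sample.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.write(sample, arcname='hello.txt')

# Extract with a context manager; the archive is closed automatically.
out_dir = os.path.join(tmp, 'extracted')
with zipfile.ZipFile(archive) as zf:
    zf.extractall(out_dir)

print(os.listdir(out_dir))  # ['hello.txt']
```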

 

 


Turning image files like JPG or PNG into training data (Tensorflow's ImageDataGenerator)

- Extract the archive and build the folder paths where the photos are stored

In :

# Extract the archive into a training folder and a validation folder.
zip_ref = zipfile.ZipFile('/tmp/horse-or-human.zip')
zip_ref.extractall('/tmp/horse-or-human')
zip_ref.extractall('/tmp/validation-horse-or-human')
zip_ref.close()

# Training directories (one subfolder per class).
train_horse_dir = '/tmp/horse-or-human/horses'
train_human_dir = '/tmp/horse-or-human/humans'

# Validation directories.
validation_horse_dir = '/tmp/validation-horse-or-human/horses'
validation_human_dir = '/tmp/validation-horse-or-human/humans'

- List the photo file names stored in each folder

In :

import os

train_horse_names = os.listdir(train_horse_dir)
train_human_names = os.listdir(train_human_dir)

 

- Check the number of files in each directory

In :

# In a notebook only the last expression is echoed,
# so wrap each count in print() to see all of them.
print(len(train_horse_names))
print(len(train_human_names))

validation_horse_names = os.listdir(validation_horse_dir)
validation_human_names = os.listdir(validation_human_dir)

print(len(validation_horse_names))
print(len(validation_human_names))
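The per-class counts can also be gathered in one pass. A small sketch using only the standard library (the directory layout below is a stand-in for the `/tmp/...` folders above, with one subfolder per class):

```python
import os
import tempfile

# Create a stand-in dataset layout: one subfolder per class.
root = tempfile.mkdtemp()
for cls, n in [('horses', 3), ('humans', 2)]:
    d = os.path.join(root, cls)
    os.makedirs(d)
    for i in range(n):
        open(os.path.join(d, f'{cls}{i}.png'), 'w').close()

# Count files per class folder, exactly as len(os.listdir(...)) does above.
counts = {cls: len(os.listdir(os.path.join(root, cls)))
          for cls in sorted(os.listdir(root))}
print(counts)  # {'horses': 3, 'humans': 2}
```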

 

- Have a look at the photos with a quick visualization

 

In : 

%matplotlib inline

import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# Parameters for our graph; we'll output images in a 4x4 configuration
nrows = 4
ncols = 4

# Index for iterating over images
pic_index = 0

 

- Display 8 horse images and 8 human images

In :

# Set up matplotlib fig, and size it to fit 4x4 pics
fig = plt.gcf()
fig.set_size_inches(ncols * 4, nrows * 4)

pic_index += 8
next_horse_pix = [os.path.join(train_horse_dir, fname) 
                for fname in train_horse_names[pic_index-8:pic_index]]
next_human_pix = [os.path.join(train_human_dir, fname) 
                for fname in train_human_names[pic_index-8:pic_index]]

for i, img_path in enumerate(next_horse_pix+next_human_pix):
  # Set up subplot; subplot indices start at 1
  sp = plt.subplot(nrows, ncols, i + 1)
  sp.axis('off') # Don't show axes (or gridlines)

  img = mpimg.imread(img_path)
  plt.imshow(img)

plt.show()

 

 

Out :

(The images are displayed.)

 


- Compile the model using the RMSprop optimization algorithm

In :

from tensorflow.keras.optimizers import RMSprop

model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(learning_rate=0.001),
              metrics=['accuracy'])
# The training data must be NumPy arrays, but right now we only have png files.
# Tensorflow therefore provides a utility that converts image files into NumPy arrays.
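Note that `model` itself is never defined in this post; it was built in an earlier cell that is not shown. As an assumed example only (not necessarily the exact architecture used here), a small CNN whose input matches the 300×300 `target_size` and whose single sigmoid output matches `binary_crossentropy` could look like this:

```python
import tensorflow as tf

# Assumed architecture: a small CNN ending in one sigmoid unit,
# matching the 300x300 images and the binary classification task.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu',
                           input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')  # 1 unit: horse vs. human
])
```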
 
 
 
In :
 
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1/255.0)

train_datagen

validation_datagen = ImageDataGenerator(rescale=1/255.0)
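`rescale=1/255.0` simply multiplies every pixel value by 1/255, mapping the 0–255 integer range into 0–1. A quick NumPy sketch of the same transformation:

```python
import numpy as np

# A fake 2x2 grayscale "image" with 0-255 pixel values.
pixels = np.array([[0, 51], [204, 255]], dtype=np.float32)

# What rescale=1/255.0 does to every batch the generator yields.
scaled = pixels * (1 / 255.0)
print(scaled.min(), scaled.max())
```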
 
# Once the generator variables exist, the next step is to tell them
# the directory holding the images, the image size, and how many classes to predict.
# target_size and the model's input_shape must use the same width and height.
# train_generator below holds both the NumPy arrays and the labels for those images:
# in other words, it carries X_train and y_train together!
 
 
 
In :
 
train_generator = train_datagen.flow_from_directory('/tmp/horse-or-human', target_size=(300,300), class_mode='binary' ) # two classes -> 'binary'; three or more -> 'categorical'
 
 
 
 
Out : 
 
Found 1027 images belonging to 2 classes.
 
 
In :
 
validation_generator = \
 validation_datagen.flow_from_directory('/tmp/validation-horse-or-human', target_size=(300,300), class_mode='binary')

 

Out : Found 1027 images belonging to 2 classes.
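`flow_from_directory` assigns class indices by sorting the subfolder names alphabetically, so `horses` becomes 0 and `humans` becomes 1; the prediction check near the end of this post (`classes[0] > 0.5` means human) relies on that ordering. The mapping is easy to reproduce:

```python
# flow_from_directory sorts class subfolder names and numbers them in order.
class_names = sorted(['humans', 'horses'])
class_indices = {name: i for i, name in enumerate(class_names)}
print(class_indices)  # {'horses': 0, 'humans': 1}
```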

 


In : 

epoch_history = model.fit(train_generator, epochs=20, validation_data=validation_generator)

 

Out :

Epoch 1/20
33/33 [==============================] - 28s 453ms/step - loss: 2.7159 - accuracy: 0.6981 - val_loss: 0.1488 - val_accuracy: 0.9708
Epoch 2/20
33/33 [==============================] - 15s 455ms/step - loss: 0.2621 - accuracy: 0.9250 - val_loss: 0.0345 - val_accuracy: 0.9912
Epoch 3/20
33/33 [==============================] - 15s 445ms/step - loss: 0.2743 - accuracy: 0.9133 - val_loss: 0.0381 - val_accuracy: 0.9912
Epoch 4/20
33/33 [==============================] - 15s 446ms/step - loss: 0.1066 - accuracy: 0.9649 - val_loss: 0.0632 - val_accuracy: 0.9796
Epoch 5/20
33/33 [==============================] - 14s 440ms/step - loss: 0.5389 - accuracy: 0.9552 - val_loss: 0.0089 - val_accuracy: 1.0000
Epoch 6/20
33/33 [==============================] - 14s 441ms/step - loss: 0.2148 - accuracy: 0.9727 - val_loss: 0.0501 - val_accuracy: 0.9873
Epoch 7/20
33/33 [==============================] - 14s 442ms/step - loss: 0.5549 - accuracy: 0.9708 - val_loss: 1.1960 - val_accuracy: 0.7176
Epoch 8/20
33/33 [==============================] - 14s 442ms/step - loss: 0.0320 - accuracy: 0.9922 - val_loss: 5.7768e-04 - val_accuracy: 1.0000
Epoch 9/20
33/33 [==============================] - 14s 441ms/step - loss: 0.1861 - accuracy: 0.9815 - val_loss: 0.0309 - val_accuracy: 0.9893
Epoch 10/20
33/33 [==============================] - 14s 437ms/step - loss: 0.0232 - accuracy: 0.9912 - val_loss: 0.0016 - val_accuracy: 0.9990
Epoch 11/20
33/33 [==============================] - 14s 440ms/step - loss: 3.2687e-04 - accuracy: 1.0000 - val_loss: 4.6751e-05 - val_accuracy: 1.0000
Epoch 12/20
33/33 [==============================] - 14s 443ms/step - loss: 2.9328e-05 - accuracy: 1.0000 - val_loss: 9.4220e-06 - val_accuracy: 1.0000
Epoch 13/20
33/33 [==============================] - 16s 478ms/step - loss: 4.6249e-06 - accuracy: 1.0000 - val_loss: 1.9635e-06 - val_accuracy: 1.0000
Epoch 14/20
33/33 [==============================] - 14s 443ms/step - loss: 1.2522e-06 - accuracy: 1.0000 - val_loss: 4.0686e-07 - val_accuracy: 1.0000
Epoch 15/20
33/33 [==============================] - 15s 451ms/step - loss: 7.1399e-05 - accuracy: 1.0000 - val_loss: 74.0208 - val_accuracy: 0.4995
Epoch 16/20
33/33 [==============================] - 16s 498ms/step - loss: 2.6282 - accuracy: 0.9669 - val_loss: 8.2268e-04 - val_accuracy: 1.0000
Epoch 17/20
33/33 [==============================] - 18s 541ms/step - loss: 0.2303 - accuracy: 0.9737 - val_loss: 0.0078 - val_accuracy: 0.9971
Epoch 18/20
33/33 [==============================] - 15s 451ms/step - loss: 0.0023 - accuracy: 0.9990 - val_loss: 7.8922e-05 - val_accuracy: 1.0000
Epoch 19/20
33/33 [==============================] - 17s 518ms/step - loss: 5.1120e-05 - accuracy: 1.0000 - val_loss: 2.1565e-05 - val_accuracy: 1.0000
Epoch 20/20
33/33 [==============================] - 16s 493ms/step - loss: 1.3909e-05 - accuracy: 1.0000 - val_loss: 5.0468e-06 - val_accuracy: 1.0000
 

 


 
# Model evaluation
# Plot the training accuracy and the validation accuracy as a chart.
 
In :
 
model.evaluate(validation_generator)
 

Out : 

33/33 [==============================] - 9s 269ms/step - loss: 5.0468e-06 - accuracy: 1.0000
[5.046816113463137e-06, 1.0]

 

- How to draw the chart

In :

plt.plot(epoch_history.history['accuracy'])
plt.plot(epoch_history.history['val_accuracy'])
plt.legend(['train', 'validation'])
plt.show()

 


- Create a file-picker button and upload images (jpg, png)

In :

import numpy as np
from google.colab import files
from tensorflow.keras.preprocessing import image

uploaded = files.upload()

for fn in uploaded.keys() :
  path = '/content/' + fn
  img = image.load_img(path, target_size=(300,300))
  x = image.img_to_array(img)

  print(x.shape)

  # Rescale to 0-1, matching the rescale=1/255.0 used during training.
  x = x / 255.0
  x = np.expand_dims(x, axis = 0)

  print(x.shape)

  images = np.vstack( [x] )
  classes = model.predict( images, batch_size = 10 )

  print(classes)

  # Sigmoid output: above 0.5 means class 1 (humans), otherwise class 0 (horses).
  if classes[0] > 0.5 :
    print(fn + " is a human")
  else :
    print(fn + " is a horse")
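In the loop above, `np.expand_dims` adds the batch dimension the model expects, and `np.vstack` would stack several such arrays into one batch. A sketch of the shape changes (300×300×3 matches the `target_size` used throughout):

```python
import numpy as np

x = np.zeros((300, 300, 3))        # one image, as img_to_array would return it
x = np.expand_dims(x, axis=0)      # add a batch axis -> (1, 300, 300, 3)
batch = np.vstack([x, x])          # stack two images -> (2, 300, 300, 3)
print(x.shape, batch.shape)
```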

 

 

Out :