Hands-on Impressions of Amazon SageMaker Studio Lab

Published in eCloudture · 16 min read · Jan 4, 2022

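Introduction

Amazon SageMaker Studio Lab is a new service launched at re:Invent at the end of last year (2021). It lets students and ML developers use Jupyter notebooks backed by GPU or CPU compute without going through tedious setup and installation first.

In addition, you can sign up for Amazon SageMaker Studio Lab without an AWS account, which makes it even more convenient to use!

The corresponding service from Google is Colab, and another comparable option is the ML learning platform Kaggle.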

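How to apply for Amazon SageMaker Studio Lab?

Just two steps, and then you are ready to go:

  1. Fill out the account request form
  2. Wait 2 to 3 days to receive your account
  3. Start playing ~
Tip: for the question "Why are you interested in Amazon SageMaker Studio Lab?", choosing Hackathon seems to get the request approved faster.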

How to use Amazon SageMaker Studio Lab?

First, select the compute type.

The CPU runtime lasts 12 hours, while the GPU runtime lasts only 4 hours.

Click Start runtime.

Then click Open project to see the familiar Jupyter notebook interface.

Getting Started with Amazon SageMaker Studio Lab

Installing Python packages

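I strongly recommend installing Python packages with the conda install command here. Because of permission restrictions, a package that needs additional dependencies can otherwise fail to install and end up unusable.

ref: htop installing issue; opencv installing issue in Amazon SageMaker Studio Lab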
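
Before installing anything, it can help to see which modules the current kernel is actually missing. The following is a minimal sketch and not part of the original workflow: the helper name `missing_packages` and the example module list are hypothetical, it only handles top-level module names, and it assumes the import name matches the conda package name (which is not always true, for example OpenCV is imported as `cv2`).

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be imported in this kernel."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Hypothetical list of modules a project might need.
wanted = ["json", "numpy", "tensorflow"]
missing = missing_packages(wanted)

if missing:
    # Only a suggestion; run the command yourself in a Terminal.
    print("conda install " + " ".join(missing))
else:
    print("All requested modules are importable.")
```

Running this in a notebook cell prints a suggested `conda install` line only for the modules that are not yet available in the active environment.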

Create env in Amazon SageMaker Studio Lab

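In Amazon SageMaker Studio Lab, users can also create several different Python runtimes to keep the environments required by different projects separate.

Click File -> New -> Terminal to open a Terminal window.

  • In the Terminal window, enter the following command to create a Python 3.9 environment named my_environment
$ conda create --name my_environment python=3.9
# This creates a Python 3.9 environment, but it is not yet listed in the Kernel list
  • Next, enter the following command in the Terminal window to switch the Python environment to my_environment
$ conda activate my_environment
  • Finally, enter the following in the Terminal window and wait 1 to 2 minutes; my_environment can then be selected from the Kernel list
$ conda install ipykernel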
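
Once the new kernel is selected, one way to confirm that a notebook really runs inside my_environment is to print the interpreter details from a cell. This is a minimal sketch, assuming the usual conda layout in which the environment directory is named after the environment; the helper name `describe_runtime` is hypothetical.

```python
import sys


def describe_runtime():
    """Collect basic facts about the interpreter running this kernel."""
    return {
        # Path of the Python binary; for a conda env it normally contains the env name.
        "executable": sys.executable,
        # Root directory of the environment the interpreter belongs to.
        "prefix": sys.prefix,
        # Major.minor version, e.g. "3.9" for the environment created above.
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
    }


info = describe_runtime()
for key, value in info.items():
    print(f"{key}: {value}")

# Assumption: the environment directory is named my_environment.
print("inside my_environment:", "my_environment" in info["prefix"])
```

If the last line prints False, the notebook is probably still attached to the default kernel rather than the newly created one.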

Is Amazon SageMaker Studio Lab any good?

The discussion here focuses on the GPU runtime.
  • First, the hardware: the GPU is a Tesla T4, while the Colab free tier uses a K80 and Kaggle uses a P100
[Screenshots: Amazon SageMaker Studio Lab, Colab, Kaggle]
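
To check which GPU a given runtime actually provides, the model name can be queried from a notebook cell. The sketch below assumes that the `nvidia-smi` tool is on the PATH of GPU runtimes (an assumption, not something stated in the post); the helper name `gpu_names` is hypothetical, and it simply returns an empty list when the tool is missing, as on a CPU runtime.

```python
import shutil
import subprocess


def gpu_names():
    """Return the GPU model names reported by nvidia-smi, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA driver tools found (e.g. a CPU runtime)
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []  # the tool exists but could not talk to a GPU
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(gpu_names() or "no GPU detected")
```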
  • Next, a real workload (MNIST CNN)
import time

from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Start timing; this includes downloading and loading the MNIST dataset
start = time.time()
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Add a channel dimension and scale pixel values to [0, 1]
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

# One-hot encode the labels
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Three convolution layers followed by a small dense classifier
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()

# Train for 5 epochs, then evaluate on the test set
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)

# Report total wall-clock time
done = time.time()
elapsed = done - start
print(f"time spend {elapsed} s")
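
Note that the script above starts its timer before `mnist.load_data()`, so the reported total also covers downloading and loading the dataset. To time the phases separately, something like the following sketch could be used. The `timed` helper and the phase labels are hypothetical, and the list and `sum` calls merely stand in for the real loading and training steps.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


durations = {}

with timed("load", durations):
    data = list(range(100_000))   # stand-in for mnist.load_data()

with timed("train", durations):
    total = sum(data)             # stand-in for model.fit(...)

for label, seconds in durations.items():
    print(f"{label}: {seconds:.4f} s")
```

Separating the phases this way would make the GPU comparison less sensitive to network speed, since only the training phase depends on the accelerator.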
  • Result of Amazon SageMaker Studio Lab @ T4
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928
_________________________________________________________________
flatten (Flatten)            (None, 576)               0
_________________________________________________________________
dense_2 (Dense)              (None, 64)                36928
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
938/938 [==============================] - 7s 3ms/step - loss: 0.1776 - accuracy: 0.9439
Epoch 2/5
938/938 [==============================] - 2s 2ms/step - loss: 0.0468 - accuracy: 0.9857
Epoch 3/5
938/938 [==============================] - 2s 3ms/step - loss: 0.0323 - accuracy: 0.9899
Epoch 4/5
938/938 [==============================] - 2s 3ms/step - loss: 0.0242 - accuracy: 0.9926
Epoch 5/5
938/938 [==============================] - 2s 2ms/step - loss: 0.0188 - accuracy: 0.9944
313/313 [==============================] - 1s 1ms/step - loss: 0.0317 - accuracy: 0.9909
0.9908999800682068
time spend 17.8512921333313 s
  • Result of Colab @ K80
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 conv2d_6 (Conv2D)           (None, 26, 26, 32)        320
 max_pooling2d_4 (MaxPooling  (None, 13, 13, 32)       0
 2D)
 conv2d_7 (Conv2D)           (None, 11, 11, 64)        18496
 max_pooling2d_5 (MaxPooling  (None, 5, 5, 64)         0
 2D)
 conv2d_8 (Conv2D)           (None, 3, 3, 64)          36928
 flatten_2 (Flatten)         (None, 576)               0
 dense_4 (Dense)             (None, 64)                36928
 dense_5 (Dense)             (None, 10)                650
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
938/938 [==============================] - 14s 13ms/step - loss: 0.1738 - accuracy: 0.9450
Epoch 2/5
938/938 [==============================] - 9s 10ms/step - loss: 0.0486 - accuracy: 0.9853
Epoch 3/5
938/938 [==============================] - 9s 9ms/step - loss: 0.0327 - accuracy: 0.9901
Epoch 4/5
938/938 [==============================] - 9s 9ms/step - loss: 0.0253 - accuracy: 0.9922
Epoch 5/5
938/938 [==============================] - 9s 9ms/step - loss: 0.0203 - accuracy: 0.9937
313/313 [==============================] - 1s 4ms/step - loss: 0.0258 - accuracy: 0.9926
0.9926000237464905
time spend 53.64968514442444 s
  • Result of Kaggle @ P100
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928
_________________________________________________________________
flatten (Flatten)            (None, 576)               0
_________________________________________________________________
dense (Dense)                (None, 64)                36928
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
938/938 [==============================] - 12s 5ms/step - loss: 0.1769 - accuracy: 0.9445
Epoch 2/5
938/938 [==============================] - 5s 5ms/step - loss: 0.0478 - accuracy: 0.9849
Epoch 3/5
938/938 [==============================] - 5s 5ms/step - loss: 0.0317 - accuracy: 0.9898
Epoch 4/5
938/938 [==============================] - 5s 6ms/step - loss: 0.0248 - accuracy: 0.9924
Epoch 5/5
938/938 [==============================] - 5s 5ms/step - loss: 0.0197 - accuracy: 0.9936
313/313 [==============================] - 1s 3ms/step - loss: 0.0368 - accuracy: 0.9904
0.9904000163078308
time spend 35.18884539604187 s
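
To put the three totals side by side, the small sketch below computes each run's time relative to the fastest one, using only the "time spend" values printed above. Keep in mind that each figure comes from a single run and includes dataset loading, so the ratios are a rough indication rather than a rigorous benchmark.

```python
# Total wall-clock times (seconds) copied from the three runs above.
elapsed = {
    "SageMaker Studio Lab (T4)": 17.8512921333313,
    "Colab (K80)": 53.64968514442444,
    "Kaggle (P100)": 35.18884539604187,
}

fastest = min(elapsed, key=elapsed.get)

# List the platforms from fastest to slowest with their relative time.
for platform, seconds in sorted(elapsed.items(), key=lambda item: item[1]):
    ratio = seconds / elapsed[fastest]
    print(f"{platform}: {seconds:.1f} s ({ratio:.2f}x the fastest run)")
```

In this test the Colab run took about three times as long as the Studio Lab run, and the Kaggle run about twice as long.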

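Thoughts on using Amazon SageMaker Studio Lab

Unlike Colab and Kaggle, Amazon SageMaker Studio Lab does not come with a ready-made environment: users have to install the Python packages they need and set up their Python environments themselves. Compared with the other platforms this brings more freedom, but it also means that some packages are hard to install or require a specific version.

Other points of comparison are shown in the table below. Overall, the arrival of Amazon SageMaker Studio Lab adds a new option among compute services of this kind. With its more native Jupyter notebook environment, it may well be a better fit for some users.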
