How to Run YOLOv5 on AWS Lambda

This article explains how to run YOLOv5 on AWS Lambda. AWS Lambda is an AWS service that runs code only when it is invoked or when an event occurs, and it incurs no charge the rest of the time. Here we build a service that runs YOLOv5 on AWS Lambda.

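AWS Lambda (hereafter Lambda) supports two packaging methods: deploying a Zip file that is unpacked and executed, and running a Docker container. With the Zip method, the function is stored as a Zip file in AWS S3, then downloaded and executed. With the container method, a Docker image is loaded from AWS ECR and executed. Cold starts (starting Lambda with nothing cached) are reportedly faster with the Zip method than with the container method because the package is smaller; see this post for details. On the other hand, the container method allows images up to 10 GB, whereas the Zip method limits the unpacked size to 250 MB. If the unpacked size exceeds 250 MB, choose the container method; otherwise the Zip method is a good choice.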
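
The rule of thumb above can be written down as a small sketch. The helper name `choose_package_type` is hypothetical, the return values mirror the `--package-type` values of the SAM CLI, and treating 10 GB as 10 × 1024 MB is an assumption made for illustration.

```python
# Limits quoted in the paragraph above.
ZIP_UNZIPPED_LIMIT_MB = 250
CONTAINER_IMAGE_LIMIT_MB = 10 * 1024  # assumption: 10 GB expressed in MB


def choose_package_type(unzipped_size_mb: float) -> str:
    """Return "Zip" or "Image" for a deployment of the given unpacked size."""
    if unzipped_size_mb <= ZIP_UNZIPPED_LIMIT_MB:
        return "Zip"
    if unzipped_size_mb <= CONTAINER_IMAGE_LIMIT_MB:
        return "Image"
    raise ValueError("too large for either Lambda packaging method")


print(choose_package_type(180))  # Zip
print(choose_package_type(900))  # Image
```

This only encodes the size limits; cold start time and tooling preferences may still tip the decision the other way.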

Application to Build
Prerequisites
Initialization
hello_world/app.py
hello_world/requirements.txt
template.yaml
Build
Deploy
Test
References

Application to Build

The application receives an image sent from a client to the server, runs YOLOv5 object detection on it on the server, and sends the annotated image back to the client.

Prerequisites

To run the samples in this article you need the following:

An AWS account
AWS SAM CLI installed

Initialization

Run the following command to set up the Lambda project. It uses the python3.8 runtime and the Zip packaging method.

sam init --runtime python3.8 --package-type Zip --app-template hello-world --name yolov5-aws-lambda

After the command finishes, a yolov5-aws-lambda directory is created containing the hello_world directory, template.yaml, and other files.

.
├── events
│   └── event.json
├── hello_world
│   ├── app.py
│   ├── __init__.py
│   └── requirements.txt
├── __init__.py
├── README.md
├── template.yaml
└── tests
    ├── __init__.py
    ...

The three files that need to be modified are:

hello_world/app.py
hello_world/requirements.txt
template.yaml

hello_world/app.py

This reuses the application created in this earlier post, which means YOLOv5 is run through OpenCV. Because Lambda has no GPU, YOLOv5 runs on the CPU only.

import json
import cv2
import base64
import time
import numpy as np


def build_model():
    net = cv2.dnn.readNet("yolov5s.onnx")
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    return net


INPUT_WIDTH = 640
INPUT_HEIGHT = 640
SCORE_THRESHOLD = 0.2
NMS_THRESHOLD = 0.4
CONFIDENCE_THRESHOLD = 0.4


def detect(image, net):
    blob = cv2.dnn.blobFromImage(
        image, 1 / 255.0, (INPUT_WIDTH, INPUT_HEIGHT), swapRB=True, crop=False
    )
    net.setInput(blob)
    preds = net.forward()
    return preds


def load_classes():
    class_list = []
    with open("classes.txt", "r") as f:
        class_list = [cname.strip() for cname in f.readlines()]
    return class_list


def wrap_detection(input_image, output_data):
    class_ids = []
    confidences = []
    boxes = []

    rows = output_data.shape[0]

    # shape is (height, width, channels); the image is square after format_yolov5
    image_height, image_width, _ = input_image.shape

    x_factor = image_width / INPUT_WIDTH
    y_factor = image_height / INPUT_HEIGHT

    for r in range(rows):
        row = output_data[r]
        confidence = row[4]
        if confidence >= 0.4:

            classes_scores = row[5:]
            _, _, _, max_indx = cv2.minMaxLoc(classes_scores)
            class_id = max_indx[1]
            if classes_scores[class_id] > 0.25:

                confidences.append(confidence)

                class_ids.append(class_id)

                x, y, w, h = row[0].item(), row[1].item(), row[2].item(), row[3].item()
                left = int((x - 0.5 * w) * x_factor)
                top = int((y - 0.5 * h) * y_factor)
                width = int(w * x_factor)
                height = int(h * y_factor)
                box = np.array([left, top, width, height])
                boxes.append(box)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.25, 0.45)

    result_class_ids = []
    result_confidences = []
    result_boxes = []

    for i in indexes:
        result_confidences.append(confidences[i])
        result_class_ids.append(class_ids[i])
        result_boxes.append(boxes[i])

    return result_class_ids, result_confidences, result_boxes


def format_yolov5(frame):
    row, col, _ = frame.shape
    _max = max(col, row)
    result = np.zeros((_max, _max, 3), np.uint8)
    result[0:row, 0:col] = frame
    return result


def yolov5(image):
    colors = [(255, 255, 0), (0, 255, 0), (0, 255, 255), (255, 0, 0)]
    class_list = load_classes()

    net = build_model()

    inputImage = format_yolov5(image)
    outs = detect(inputImage, net)

    class_ids, confidences, boxes = wrap_detection(inputImage, outs[0])

    for (classid, confidence, box) in zip(class_ids, confidences, boxes):
        color = colors[int(classid) % len(colors)]
        cv2.rectangle(image, box, color, 2)
        cv2.rectangle(
            image, (box[0], box[1] - 20), (box[0] + box[2], box[1]), color, -1
        )
        cv2.putText(
            image,
            class_list[classid],
            (box[0], box[1] - 10),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.5,
            (0, 0, 0),
        )
    return image


def base64_to_cv2(image_base64):
    # base64 image to cv2
    image_bytes = base64.b64decode(image_base64)
    np_array = np.frombuffer(image_bytes, np.uint8)
    image_cv2 = cv2.imdecode(np_array, cv2.IMREAD_COLOR)
    return image_cv2


def cv2_to_base64(image_cv2):
    # cv2 image to base64
    image_bytes = cv2.imencode(".jpg", image_cv2)[1].tobytes()
    image_base64 = base64.b64encode(image_bytes).decode()
    return image_base64


def lambda_handler(event, context):
    body = json.loads(event["body"])
    image = body["image"]
    image = yolov5(base64_to_cv2(image))

    return {
        "statusCode": 200,
        "body": json.dumps(
            {
                "image": cv2_to_base64(image),
            }
        ),
    }
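
The coordinate conversion inside wrap_detection can be easier to follow in isolation. The following is a minimal sketch with no OpenCV dependency; the helper names `scale_factor` and `to_pixel_box` are hypothetical and not part of app.py, and the 1280 x 720 image size is only an example.

```python
INPUT_SIZE = 640  # the network input is 640 x 640, as in app.py


def scale_factor(image_height: int, image_width: int) -> float:
    """Ratio between the side of the square-padded image and the network input side."""
    return max(image_height, image_width) / INPUT_SIZE


def to_pixel_box(cx, cy, w, h, factor):
    """Convert a centre-format box in network coordinates to [left, top, width, height] in pixels."""
    left = int((cx - 0.5 * w) * factor)
    top = int((cy - 0.5 * h) * factor)
    return [left, top, int(w * factor), int(h * factor)]


factor = scale_factor(720, 1280)  # hypothetical 1280 x 720 input image
print(factor)  # 2.0
print(to_pixel_box(320, 320, 100, 50, factor))  # [540, 590, 200, 100]
```

Because format_yolov5 pads the image at the bottom and right only, the top-left origin is unchanged and a single factor is enough for both axes.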

Because template.yaml sets app.lambda_handler as the entry point, the lambda_handler function above is called whenever the Lambda function is invoked.

The client sends the image base64-encoded, so the function base64-decodes it before use.
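
The handler above assumes a well-formed request. As a rough sketch of how the decoding step might be hardened, the helper below (the name `extract_image_bytes` is hypothetical and not part of app.py) turns the various failure modes into a single ValueError that a handler could map to a 400 response.

```python
import base64
import binascii
import json


def extract_image_bytes(event):
    """Return the decoded image bytes from the event, or raise ValueError."""
    try:
        body = json.loads(event.get("body") or "")
        # validate=True rejects characters outside the base64 alphabet
        return base64.b64decode(body["image"], validate=True)
    except (json.JSONDecodeError, KeyError, TypeError, binascii.Error) as exc:
        raise ValueError(f"invalid request: {exc}") from exc


# Stand-in event with the same body layout the handler expects.
event = {"body": json.dumps({"image": base64.b64encode(b"fake-bytes").decode()})}
print(extract_image_bytes(event))  # b'fake-bytes'
```

Note that validate=True also rejects newline-wrapped base64, so a client would need to send the string on one line, as `base64 -w0` does.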

Copy yolov5s.onnx and classes.txt into the hello_world directory from this repository, which was also used in this earlier article.
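
The yolov5 function in app.py loads yolov5s.onnx on every invocation. Lambda reuses an execution environment across warm invocations, so one possible variation is to cache the loaded model. The sketch below shows only the caching pattern: `load_network` is a stand-in for `cv2.dnn.readNet`, and `get_model` is a hypothetical helper, not code from the repository.

```python
from functools import lru_cache

load_count = 0  # only here to make the caching visible


def load_network(path):
    """Stand-in for cv2.dnn.readNet(path); returns a dummy object."""
    global load_count
    load_count += 1
    return {"model_path": path}


@lru_cache(maxsize=1)
def get_model(path="yolov5s.onnx"):
    # Runs once per execution environment; later calls return the cached object.
    return load_network(path)


def lambda_handler(event, context):
    net = get_model()
    return {"statusCode": 200, "body": net["model_path"]}


lambda_handler({}, None)
lambda_handler({}, None)
print(load_count)  # 1
```

A cold start still pays the full load time; the saving applies only to later invocations served by the same environment.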

hello_world/requirements.txt

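hello_world/requirements.txt is shown below. The regular opencv-python package is large enough to push the deployment over the 250 MB size limit, but specifying opencv-python-headless reduces the size. The headless package omits GUI-related libraries, which is fine because they are not used on Lambda.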

opencv-python-headless==4.6.0.66
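
To check how close a build is to the 250 MB limit, the unpacked size of the build output can be summed. This is a minimal sketch: the path `.aws-sam/build/HelloWorldFunction` is assumed to be where `sam build` places the function for this template, and interpreting 250 MB as 250 × 1024 × 1024 bytes is also an assumption.

```python
import os

# Assumption: the 250 MB limit interpreted as 250 * 1024 * 1024 bytes.
ZIP_UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_zip_limit(root):
    """True if the unpacked contents of root stay within the Zip limit."""
    return directory_size_bytes(root) <= ZIP_UNZIPPED_LIMIT_BYTES


if __name__ == "__main__":
    build_dir = ".aws-sam/build/HelloWorldFunction"  # assumed sam build output path
    if os.path.isdir(build_dir):
        size_mb = directory_size_bytes(build_dir) / (1024 * 1024)
        print(f"{size_mb:.1f} MB, fits Zip limit: {fits_zip_limit(build_dir)}")
```

Running it once with opencv-python and once with opencv-python-headless in requirements.txt would show the difference the headless package makes.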

template.yaml

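The following settings need to change.

Globals.Function.Timeout
- Change to 15 seconds
  - Execution does not finish within the default 3 seconds
Globals.Function.MemorySize
- Change to 5312 MB
  - Lambda pricing is determined by execution time and MemorySize. A smaller MemorySize lowers the price, but it also reduces the virtual CPU time available to the function, so execution can take longer. This article covers the details.
Resources.HelloWorldFunction.Properties.Events.HelloWorld.Properties.Method
- Change to post
  - The client sends the image in the request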

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  yolov5-aws-lambda

  Sample SAM Template for yolov5-aws-lambda

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 15
    MemorySize: 5312

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      CodeUri: hello_world/
      Handler: app.lambda_handler
      Runtime: python3.8
      Architectures:
        - x86_64
      Events:
        HelloWorld:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /hello
            Method: post
...
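
The MemorySize trade-off described above can be made concrete with a rough estimate of the compute charge per invocation (allocated GB × seconds × unit price). Everything numeric here is an assumption for illustration: the unit price should be checked against the current AWS Lambda pricing page for your region, the durations are made up, and request charges and API Gateway costs are ignored.

```python
def estimate_invocation_cost(memory_mb, duration_s, price_per_gb_second):
    """Compute-only cost of one invocation: GB allocated x seconds x unit price."""
    return (memory_mb / 1024) * duration_s * price_per_gb_second


# Assumed unit price in USD per GB-second; verify before relying on it.
PRICE = 0.0000166667

# Hypothetical durations: more memory is assumed to shorten the run.
for memory_mb, duration_s in [(1024, 8.0), (5312, 2.0)]:
    cost = estimate_invocation_cost(memory_mb, duration_s, PRICE)
    print(f"{memory_mb} MB x {duration_s} s -> ${cost:.6f}")
```

Measuring the real duration at a few MemorySize values and feeding the numbers into a function like this is one way to pick a setting.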

Build

Build the application. A rebuild is required whenever the source code, template, or other files change.

sam build

Deploy

Passing --guided starts an interactive guide. Answer the prompts as in the sample below.

$ sam deploy --guided

Configuring SAM deploy
======================

	Looking for config file [samconfig.toml] :  Not found

	Setting default arguments for 'sam deploy'
	=========================================
	Stack Name [sam-app]: ==> any name can be used
	AWS Region [us-east-1]: ==> any region can be used
	#Shows you resources changes to be deployed and require a 'Y' to initiate deploy
	Confirm changes before deploy [y/N]: ==> blank is OK
	#SAM needs permission to be able to create roles to connect to the resources in your template
	Allow SAM CLI IAM role creation [Y/n]: ==> blank is OK
	#Preserves the state of previously provisioned resources when an operation fails
	Disable rollback [y/N]: --> blank is OK
	HelloWorldFunction may not have authorization defined, Is this okay? [y/N]: Y
	Save arguments to configuration file [Y/n]: Y
	SAM configuration file [samconfig.toml]: ==> blank is OK
	SAM configuration environment [default]: ==> blank is OK

	Looking for resources needed for deployment:
	 Managed S3 bucket: aws-sam-cli-managed-default-samclisourcebucket-ntq68vn38lmx
	 A different default S3 bucket can be set in samconfig.toml

	Saved arguments to config file
	Running 'sam deploy' for future deployments will use the parameters saved above.
	The above parameters can be changed by modifying samconfig.toml
	Learn more about samconfig.toml syntax at 
	https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-config.html
...
Outputs
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
Key                 HelloWorldApi
Description         API Gateway endpoint URL for Prod stage for Hello World function
Value               https://xxxxxxxxxx.execute-api.xxxxxxxx.amazonaws.com/Prod/hello/

A samconfig.toml file containing these settings is created. Later deployments read samconfig.toml, so the --guided option is no longer needed.

The API Gateway endpoint URL shown at the end of the deployment output is the endpoint.

Test

Send an image to the endpoint shown at the end of the deployment, and an annotated image should come back.

wget https://raw.githubusercontent.com/ultralytics/yolov5/master/data/images/zidane.jpg
image=$(base64 -w0 zidane.jpg)
echo { \"image\": \"${image}\" } | \
  curl -X POST -H "Content-Type: application/json" -d @-  https://xxxxxxxxxx.execute-api.xxxxxxxx.amazonaws.com/Prod/hello/ | \
  jq -r .image | \
  base64 -d > predicted.jpg
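
The same request can also be sent from Python using only the standard library. This is a minimal sketch, not code from the repository: the function names are hypothetical, and the URL must be replaced with the endpoint printed by your own deployment.

```python
import base64
import json
import urllib.request


def build_request(url, image_bytes):
    """Build the POST request with the JSON body that lambda_handler expects."""
    payload = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_response(body):
    """Extract the annotated image bytes from the JSON response body."""
    return base64.b64decode(json.loads(body)["image"])


def predict(url, in_path="zidane.jpg", out_path="predicted.jpg"):
    """Send in_path to the endpoint and write the returned image to out_path."""
    with open(in_path, "rb") as f:
        request = build_request(url, f.read())
    with urllib.request.urlopen(request, timeout=30) as response:
        annotated = decode_response(response.read())
    with open(out_path, "wb") as f:
        f.write(annotated)
```

Calling `predict("https://xxxxxxxxxx.execute-api.xxxxxxxx.amazonaws.com/Prod/hello/")` with the placeholder replaced by the real endpoint should produce predicted.jpg, mirroring the curl pipeline above.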

All of the code is available at https://github.com/otamajakusi/yolov5-aws-lambda.

That's all.