A Complete Guide to Serving Machine Learning Models with FastAPI: Production Deployment with Docker

후스파 2025. 7. 5. 08:28

FastAPI is a good fit for exposing machine learning (ML) models as an API.
This post shows how to serve an ML model with FastAPI and walks through a simple Docker-based example.

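Serving ML Models with FastAPI

Serving an ML model with FastAPI comes down to three steps:

Load the model: load the trained ML model into memory
Define API endpoints: define the endpoints that clients send requests to
Handle requests: pass the data received from the client to the model and return the prediction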

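Why FastAPI Suits ML Model Serving

High performance: asynchronous request handling gives high throughput for I/O-bound work
Automatic documentation: API docs are generated automatically and served through Swagger UI
Type hints: request data is validated by Pydantic
Easy deployment: runs well in Docker containers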
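One caveat on the high-performance point: FastAPI runs `async def` endpoints on the event loop, so a long synchronous call such as `model.predict` holds up other requests while it runs. The sketch below shows one way around this, assuming Python 3.9+ for `asyncio.to_thread`. The names `blocking_predict` and `predict_offloaded` are hypothetical, and the "model" is a stand-in that just sleeps and sums its inputs.

```python
import asyncio
import time


def blocking_predict(features):
    """Hypothetical stand-in for model.predict: synchronous, blocking work."""
    time.sleep(0.05)  # simulates inference time
    return sum(features)


async def predict_offloaded(features):
    # asyncio.to_thread runs the blocking call in a worker thread,
    # so the event loop stays free to accept other requests meanwhile.
    return await asyncio.to_thread(blocking_predict, features)


async def demo():
    start = time.perf_counter()
    # Four "requests" handled concurrently instead of one after another
    results = await asyncio.gather(
        *(predict_offloaded([1.5, 2.0, -0.5]) for _ in range(4))
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(demo())
    print(results, f"{elapsed:.2f}s")
```

Declaring an endpoint with plain `def` instead of `async def` has a similar effect, because FastAPI runs such endpoints in a thread pool. Threads help most when the blocking call releases the GIL, as NumPy-backed code often does.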

Implementing a Basic FastAPI API

As an example, we will serve a simple linear regression model with FastAPI.

Installing the Required Libraries

First, install the required libraries.

pip install fastapi uvicorn scikit-learn joblib pandas numpy

Training and Saving the Model

The following code trains a simple linear regression model and saves it to disk.

# train_model.py
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import joblib

# Generate synthetic data: a linear target plus small Gaussian noise
np.random.seed(42)
X = np.random.randn(1000, 3)  # 3 features
y = 2*X[:, 0] + 3*X[:, 1] - X[:, 2] + np.random.randn(1000) * 0.1

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate the model
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"MSE: {mse:.4f}")
print(f"R2 Score: {r2:.4f}")

# Save the model
joblib.dump(model, 'linear_regression_model.pkl')
print("Model saved.")

Implementing the FastAPI Application

Next, implement the FastAPI application that serves the model.

# main.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
import joblib
import numpy as np
import logging
from typing import List
import uvicorn

# Logging configuration
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Initialize the FastAPI app
app = FastAPI(
    title="ML Model Serving API",
    description="FastAPI application that serves a linear regression model",
    version="1.0.0"
)

# Load the model once at import time; endpoints return an error if this fails
try:
    model = joblib.load('linear_regression_model.pkl')
    logger.info("Model loaded successfully.")
except Exception as e:
    logger.error(f"Failed to load model: {e}")
    model = None

# Request and response schemas
class PredictionRequest(BaseModel):
    feature1: float = Field(..., description="Value of the first feature")
    feature2: float = Field(..., description="Value of the second feature")
    feature3: float = Field(..., description="Value of the third feature")

class BatchPredictionRequest(BaseModel):
    features: List[List[float]] = Field(..., description="List of feature rows for batch prediction")

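# Pydantic v2 (as pinned in requirements.txt) reserves the "model_" prefix and
# warns about fields such as model_version; clearing protected_namespaces
# silences that warning.
class PredictionResponse(BaseModel):
    model_config = {"protected_namespaces": ()}

    prediction: float
    model_version: str = "1.0.0"

class BatchPredictionResponse(BaseModel):
    model_config = {"protected_namespaces": ()}

    predictions: List[float]
    model_version: str = "1.0.0"

class HealthResponse(BaseModel):
    model_config = {"protected_namespaces": ()}

    status: str
    model_loaded: bool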

# Health check endpoint
@app.get("/health", response_model=HealthResponse)
async def health_check():
    """Report API status"""
    return HealthResponse(
        status="healthy" if model is not None else "unhealthy",
        model_loaded=model is not None
    )

# Single prediction endpoint
@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    """Predict for a single data point"""
    if model is None:
        raise HTTPException(status_code=500, detail="Model is not loaded.")

    try:
        # Prepare the input as a 1 x 3 array
        data = np.array([[request.feature1, request.feature2, request.feature3]])

        # Run the prediction
        prediction = model.predict(data)

        logger.info(f"Prediction complete: {prediction[0]}")

        return PredictionResponse(
            prediction=float(prediction[0])
        )

    except Exception as e:
        logger.error(f"Error during prediction: {e}")
        raise HTTPException(status_code=500, detail=f"An error occurred during prediction: {str(e)}")

# Batch prediction endpoint
@app.post("/predict/batch", response_model=BatchPredictionResponse)
async def predict_batch(request: BatchPredictionRequest):
    """Batch prediction for multiple data points"""
    if model is None:
        raise HTTPException(status_code=500, detail="Model is not loaded.")

    try:
        # Validate the input
        if not request.features:
            raise HTTPException(status_code=400, detail="Feature data is empty.")

        # Check that every row has exactly 3 features
        for i, features in enumerate(request.features):
            if len(features) != 3:
                raise HTTPException(
                    status_code=400, 
                    detail=f"Row {i} must have 3 features, got {len(features)}"
                )

        # Prepare the input as an n x 3 array
        data = np.array(request.features)

        # Run the batch prediction
        predictions = model.predict(data)

        logger.info(f"Batch prediction complete: {len(predictions)} samples")

        return BatchPredictionResponse(
            predictions=[float(pred) for pred in predictions]
        )

    except HTTPException:
        # Re-raise the 400 errors above instead of converting them to 500
        raise
    except Exception as e:
        logger.error(f"Error during batch prediction: {e}")
        raise HTTPException(status_code=500, detail=f"An error occurred during batch prediction: {str(e)}")

# Model info endpoint
@app.get("/model/info")
async def model_info():
    """Return model metadata"""
    if model is None:
        raise HTTPException(status_code=500, detail="Model is not loaded.")

    return {
        "model_type": "LinearRegression",
        "features": ["feature1", "feature2", "feature3"],
        "model_version": "1.0.0",
        "coefficients": model.coef_.tolist(),
        "intercept": float(model.intercept_)
    }

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Serving the Model with Docker

Containerizing the FastAPI application with Docker makes it easier to deploy and manage.

Writing the Dockerfile

Create a Dockerfile like the one below.

# Dockerfile
FROM python:3.9-slim

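# Set the working directory
WORKDIR /app

# Install system packages (curl is needed by the HEALTHCHECK below;
# the slim base image does not include it)
RUN apt-get update && apt-get install -y \
    gcc \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements.txt and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code (the build context must contain
# linear_regression_model.pkl, so run train_model.py first)
COPY . .

# Expose the port
EXPOSE 8000

# Add a health check
HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Run the application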
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
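If you would rather not depend on curl inside the image, a small Python script can perform the same check with only the standard library. The sketch below is hypothetical: the `is_healthy` function is an assumption, not part of the original project. It also reads the JSON body, because `/health` as defined in main.py answers HTTP 200 even when the model failed to load, so a status-code-only check cannot detect that case.

```python
import json
import urllib.error
import urllib.request

# Health checks target the local container, so bypass any configured HTTP proxy
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only if the endpoint answers 200 and reports model_loaded=true."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            if resp.status != 200:
                return False
            body = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or invalid JSON
        return False
    return isinstance(body, dict) and bool(body.get("model_loaded"))
```

To use it from a Dockerfile, one option is to save it as a script (say, healthcheck.py) that ends with `sys.exit(0 if is_healthy("http://localhost:8000/health") else 1)` and to point the `HEALTHCHECK` instruction at `python healthcheck.py`.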

Writing requirements.txt

List the required libraries in requirements.txt.

fastapi==0.104.1
uvicorn[standard]==0.24.0
scikit-learn==1.3.2
joblib==1.3.2
pandas==2.1.4
numpy==1.24.4
pydantic==2.5.0

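Docker Compose Configuration

The Compose file below runs the API together with an nginx reverse proxy. It mounts an nginx.conf file from the project directory, which is not shown in this post.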

# docker-compose.yml
version: '3.8'

services:
  ml-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - PYTHONPATH=/app
    volumes:
      - ./logs:/app/logs
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - ml-api
    restart: unless-stopped

Building and Running the Docker Image

Build the Docker image and run it.

# Build the image
docker build -t fastapi-ml-model .

# Run the container
docker run -d --name ml-model -p 8000:8000 fastapi-ml-model

# Or use Docker Compose
docker-compose up -d

Testing the API

Once the container is running, the API can be tested in several ways.

Testing with cURL

# Health check
curl -X GET "http://localhost:8000/health"

# Single prediction
curl -X POST "http://localhost:8000/predict" \
     -H "Content-Type: application/json" \
     -d '{"feature1": 1.5, "feature2": 2.0, "feature3": -0.5}'

# Batch prediction
curl -X POST "http://localhost:8000/predict/batch" \
     -H "Content-Type: application/json" \
     -d '{"features": [[1.5, 2.0, -0.5], [0.5, 1.0, 0.2]]}'

# Model info
curl -X GET "http://localhost:8000/model/info"

Python Client Example

# client.py
import requests

# Base URL of the API
BASE_URL = "http://localhost:8000"

def test_health():
    response = requests.get(f"{BASE_URL}/health")
    print("Health Check:", response.json())

def test_single_prediction():
    data = {
        "feature1": 1.5,
        "feature2": 2.0,
        "feature3": -0.5
    }
    response = requests.post(f"{BASE_URL}/predict", json=data)
    print("Single Prediction:", response.json())

def test_batch_prediction():
    data = {
        "features": [
            [1.5, 2.0, -0.5],
            [0.5, 1.0, 0.2],
            [-1.0, 0.5, 1.5]
        ]
    }
    response = requests.post(f"{BASE_URL}/predict/batch", json=data)
    print("Batch Prediction:", response.json())

def test_model_info():
    response = requests.get(f"{BASE_URL}/model/info")
    print("Model Info:", response.json())

if __name__ == "__main__":
    test_health()
    test_single_prediction()
    test_batch_prediction()
    test_model_info()

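Example response for the single prediction request above. The exact digits depend on the fitted coefficients, but the value should be close to 9.5, because the training data follows y = 2*x1 + 3*x2 - x3:

{
  "prediction": 9.5,
  "model_version": "1.0.0"
}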
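As a sanity check on the responses, recall that train_model.py generates its target as y = 2*x1 + 3*x2 - x3 plus noise with standard deviation 0.1. A well-fit model should therefore return values close to the noise-free targets computed below. This is a sketch, not part of the original project: `TRUE_COEFFICIENTS` and `expected_prediction` are hypothetical names, and the rows are the sample inputs from client.py.

```python
# Coefficients used to generate the synthetic data in train_model.py:
# y = 2*x1 + 3*x2 - x3 + noise
TRUE_COEFFICIENTS = (2.0, 3.0, -1.0)


def expected_prediction(features):
    """Noise-free target for one feature row under the generating formula."""
    return sum(c * x for c, x in zip(TRUE_COEFFICIENTS, features))


sample_rows = [
    [1.5, 2.0, -0.5],
    [0.5, 1.0, 0.2],
    [-1.0, 0.5, 1.5],
]

for row in sample_rows:
    # Prints 9.5, 3.8 and -2.0 for the three rows
    print(row, "->", round(expected_prediction(row), 2))
```

If the API returns values far from these, the served model file is probably not the one produced by train_model.py.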

Advanced Features

Model Version Management

# model_manager.py
import joblib
from typing import Optional
from datetime import datetime

class ModelManager:
    def __init__(self):
        self.models = {}           # version -> {'model', 'loaded_at', 'path'}
        self.current_model = None  # version used when none is requested

    def load_model(self, model_path: str, version: str):
        """Load a model and register it under the given version"""
        try:
            model = joblib.load(model_path)
            self.models[version] = {
                'model': model,
                'loaded_at': datetime.now(),
                'path': model_path
            }
            # The most recently loaded version becomes the default
            self.current_model = version
            return True
        except Exception as e:
            print(f"Failed to load model: {e}")
            return False

    def get_model(self, version: Optional[str] = None):
        """Return the model for a version (defaults to the current one), or None"""
        if version is None:
            version = self.current_model
        return self.models.get(version, {}).get('model')

    def list_models(self):
        """Return metadata for all loaded models"""
        return {
            version: {
                'loaded_at': info['loaded_at'].isoformat(),
                'path': info['path']
            }
            for version, info in self.models.items()
        }

# Usage in main.py
model_manager = ModelManager()
model_manager.load_model('linear_regression_model.pkl', 'v1.0.0')

@app.get("/models")
async def list_models():
    return model_manager.list_models()

# Register this route after /predict/batch; routes are matched in order,
# so otherwise "batch" would be captured as a version
@app.post("/predict/{version}")
async def predict_with_version(version: str, request: PredictionRequest):
    model = model_manager.get_model(version)
    if model is None:
        raise HTTPException(status_code=404, detail=f"Model version {version} not found.")

    # Prediction logic...

Logging and Monitoring

# monitoring.py
# Requires psutil (pip install psutil), which is not listed in the requirements.txt above
import time
import psutil
from functools import wraps
import logging

logger = logging.getLogger(__name__)

# Performance-monitoring decorator for async endpoint functions
def monitor_performance(func):
    @wraps(func)
    async def wrapper(*args, **kwargs):
        start_time = time.time()
        # System-wide memory in use, so the delta also reflects other processes
        start_memory = psutil.virtual_memory().used

        try:
            result = await func(*args, **kwargs)
            status = "success"
        except Exception:
            status = "error"
            raise
        finally:
            end_time = time.time()
            end_memory = psutil.virtual_memory().used

            # Log the metrics
            logger.info(f"Function: {func.__name__}, "
                       f"Duration: {end_time - start_time:.3f}s, "
                       f"Memory: {(end_memory - start_memory) / 1024 / 1024:.2f}MB, "
                       f"Status: {status}")

        return result
    return wrapper

# Usage example: place the decorator below the route decorator
# on the existing /predict handler in main.py
@app.post("/predict")
@monitor_performance
async def predict(request: PredictionRequest):
    # Prediction logic...
    pass
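If only latency matters, the same idea works without psutil. The following is a minimal standard-library sketch, not the original author's implementation: `timed` and `fake_predict` are hypothetical names. It uses `time.perf_counter`, a clock intended for measuring intervals, and sets the status before the call so that failures are logged as errors.

```python
import asyncio
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("monitoring_sketch")


def timed(func):
    """Log the wall-clock duration and outcome of an async function."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"  # assume failure until the call returns normally
        try:
            result = await func(*args, **kwargs)
            status = "success"
            return result
        finally:
            duration = time.perf_counter() - start
            logger.info(
                "function=%s duration=%.3fs status=%s",
                func.__name__, duration, status,
            )
    return wrapper


@timed
async def fake_predict(x: float) -> float:
    """Hypothetical stand-in for an endpoint handler."""
    await asyncio.sleep(0.01)
    return x * 2


if __name__ == "__main__":
    print(asyncio.run(fake_predict(2.0)))
```

Because `functools.wraps` preserves the wrapped function's name and signature, a decorator written this way can sit between the route decorator and the handler in the same position as `monitor_performance`.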

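Conclusion

FastAPI makes it straightforward to serve a machine learning model as an API, and Docker simplifies deployment and management. The same approach can turn a variety of ML models into services that client applications can call easily.

Key points:

FastAPI's automatic documentation makes it easy to share how the API is used
Pydantic models validate input data and provide type safety
Docker containers provide a consistent deployment environment
Health checks and monitoring help keep the service stable in production
Batch prediction handles many inputs in a single request, which is more efficient for large volumes of data

With this structure, a machine learning model can be turned into a stable, scalable web service that a variety of client applications can use.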
