YOLOv8 Hard Hat Detection Deployment in Practice: From a Trained .pt File to a Web Visualization App - 深圳市維司達科技有限公司

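In industrial settings such as construction sites and power-line inspection, checking that workers wear hard hats is an important part of keeping them safe. Once you have trained a reasonably accurate YOLOv8 model (say, a best.pt file with an mAP@0.5 of 0.897), the key question becomes: how do you turn that model into a service people can actually use? This article walks through the full workflow, from model export to web application deployment.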

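1. Model Format Conversion and Optimization

With the trained best.pt file in hand, running the PyTorch model directly in production is not the best choice. Inference speed, hardware compatibility, and ease of deployment all need to be considered.

1.1 Exporting to ONNX

ONNX (Open Neural Network Exchange) is a cross-platform model representation that makes it easy to convert and optimize models across frameworks. The export function built into YOLOv8 makes this straightforward: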

from ultralytics import YOLO

# Load the trained model
model = YOLO('path/to/best.pt')

# Export to ONNX format
model.export(format='onnx', dynamic=True, simplify=True)

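Key parameters:

dynamic=True: allows the input size to vary at inference time
simplify=True: simplifies the exported graph

The export produces a .onnx file with the same base name. You can open it in Netron to check visually that the model structure is correct.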
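With dynamic=True the exported model accepts variable input sizes, but each side generally still has to be a multiple of the network's largest stride, which is 32 for the standard YOLOv8 detection models. The sketch below shows one way to pick a valid input shape for an arbitrary photo. The function name `dynamic_input_shape`, the 640-pixel cap, and the stride default are illustrative assumptions, not part of the Ultralytics API.

```python
import math


def dynamic_input_shape(height, width, max_side=640, stride=32):
    """Return a (height, width) pair that keeps the aspect ratio, caps the
    longer side at max_side, and rounds each side up to a multiple of stride."""
    # Only shrink large images; small ones keep their size (scale capped at 1.0)
    scale = min(max_side / height, max_side / width, 1.0)
    new_h = max(1, round(height * scale))
    new_w = max(1, round(width * scale))
    # Round up so the padded canvas is divisible by the stride
    pad_h = math.ceil(new_h / stride) * stride
    pad_w = math.ceil(new_w / stride) * stride
    return pad_h, pad_w


if __name__ == "__main__":
    print(dynamic_input_shape(1080, 1920))  # a Full HD site photo
    print(dynamic_input_shape(650, 650))
```

The image would then be resized and padded to the returned shape before being passed to the ONNX model.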

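1.2 Converting to a TensorRT Engine (Optional)

If you have an NVIDIA GPU and want the best possible performance, you can go one step further and convert the model to a TensorRT engine:

trtexec --onnx=best.onnx --saveEngine=best.engine \
  --fp16 --workspace=4096 --verbose

TensorRT 8.4 and later deprecate --workspace in favor of --memPoolSize=workspace:4096. Because the ONNX file was exported with dynamic=True, trtexec also needs explicit shape ranges via --minShapes, --optShapes, and --maxShapes; check the input tensor name in Netron before setting them.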

Performance comparison:

Format	Device	Latency (ms)	Memory (MB)
PyTorch	RTX 3090	15.2	1243
ONNX	RTX 3090	9.8	857
TensorRT	RTX 3090	4.3	512

Note: a TensorRT conversion must match the installed CUDA and cuDNN versions, so a Docker environment is recommended to keep them consistent.
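Numbers like those in the table depend on hardware, input size, and batch size, so it is worth re-measuring on your own setup. Below is a minimal timing harness, assuming `fn` is any zero-argument callable that runs one inference (for example `lambda: model(img)`). The function names are hypothetical.

```python
import statistics
import time


def benchmark(fn, warmup=3, runs=20):
    """Time fn() and return latency statistics in milliseconds."""
    # Warm-up calls are excluded from the statistics: the first inferences
    # often include one-off costs such as CUDA context creation
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    # Latencies taken from the comparison table above
    print(f"ONNX vs PyTorch:     {speedup(15.2, 9.8):.2f}x")
    print(f"TensorRT vs PyTorch: {speedup(15.2, 4.3):.2f}x")
```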

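2. Building the Backend API Service

We need a web service that loads the model and handles detection requests. FastAPI is used here; it offers native async support and generally higher throughput than Flask.

2.1 Basic API Implementation

First, install the dependencies:

pip install fastapi uvicorn python-multipart ultralytics opencv-python numpy

Create main.py: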

from fastapi import FastAPI, File, HTTPException, UploadFile
from fastapi.responses import JSONResponse
import cv2
import numpy as np
from ultralytics import YOLO

app = FastAPI()

# Load the model once at startup ('best.onnx' also works)
model = YOLO('best.engine', task='detect')

@app.post("/detect")
async def detect(file: UploadFile = File(...)):
    # Read the uploaded bytes and decode them into a BGR image array
    contents = await file.read()
    nparr = np.frombuffer(contents, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    if img is None:
        # imdecode returns None when the bytes are not a valid image
        raise HTTPException(status_code=400, detail="Invalid image file")

    # Run detection
    results = model(img)

    # Convert each box into a JSON-serializable dict
    detections = []
    for result in results:
        for box in result.boxes:
            detections.append({
                "class": result.names[int(box.cls)],
                "confidence": float(box.conf),
                "bbox": box.xyxy[0].tolist()
            })
    return JSONResponse(content={"detections": detections})

Start the service (the --reload flag is meant for development and should be dropped in production):

uvicorn main:app --reload --host 0.0.0.0 --port 8000
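The endpoint returns a JSON object with a `detections` list, where each entry has `class`, `confidence`, and `bbox` (`[x1, y1, x2, y2]` in pixels). As a sketch of how a client might act on that response, the helper below picks out likely violations. The class name `no_helmet`, the 0.5 threshold, and the sample payload are assumptions for illustration; use the labels your own model was trained with.

```python
def find_violations(response_json, violation_classes=("no_helmet",), min_confidence=0.5):
    """Return detections whose class marks a missing hard hat.

    violation_classes is a hypothetical label set; it depends on the dataset.
    """
    return [
        det
        for det in response_json.get("detections", [])
        if det["class"] in violation_classes and det["confidence"] >= min_confidence
    ]


if __name__ == "__main__":
    # Made-up payload in the shape returned by POST /detect
    sample = {
        "detections": [
            {"class": "helmet", "confidence": 0.93, "bbox": [12.0, 30.0, 88.0, 120.0]},
            {"class": "no_helmet", "confidence": 0.81, "bbox": [140.0, 25.0, 210.0, 118.0]},
            {"class": "no_helmet", "confidence": 0.32, "bbox": [300.0, 40.0, 350.0, 100.0]},
        ]
    }
    for det in find_violations(sample):
        print(det["class"], det["confidence"], det["bbox"])
```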

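2.2 Performance Optimization Tips

Model warm-up: run one inference at service startup so the first request does not pay the initialization cost
Batch processing: extend the API to handle several images per request
Result caching: return the cached result when the same image is submitted again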

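import hashlib
from collections import OrderedDict

CACHE_SIZE = 100
_cache = OrderedDict()  # MD5 of the image bytes -> detections

@app.post("/detect")
async def detect(file: UploadFile = File(...)):
    contents = await file.read()
    # Key the cache on a hash of the raw bytes (a NumPy array is unhashable,
    # so it cannot be used as an lru_cache argument)
    img_hash = hashlib.md5(contents).hexdigest()
    if img_hash in _cache:
        _cache.move_to_end(img_hash)  # mark as most recently used
        return {"detections": _cache[img_hash]}

    # ...decode the image and run the detection logic as in 2.1...

    _cache[img_hash] = detections
    if len(_cache) > CACHE_SIZE:
        _cache.popitem(last=False)  # evict the least recently used entry
    return {"detections": detections}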
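The first two tips, warm-up and batching, can be sketched without touching the model code. The helpers below are hypothetical: `predict` stands for any callable that runs inference, such as the `model` object from 2.1, which accepts either a single image or a list of images.

```python
def warm_up(predict, dummy_input, runs=1):
    """Run a few throwaway inferences so the first real request is not
    slowed down by lazy initialization."""
    for _ in range(runs):
        predict(dummy_input)


def chunked(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_in_batches(predict, images, batch_size=8):
    """Run predict on each chunk and collect the per-image results in order."""
    results = []
    for batch in chunked(images, batch_size):
        results.extend(predict(batch))
    return results


if __name__ == "__main__":
    # Stand-in predictor that returns one "result" per input image
    def fake_predict(batch):
        return [f"result for {name}" for name in batch]

    print(predict_in_batches(fake_predict, ["a.jpg", "b.jpg", "c.jpg"], batch_size=2))
```

In the FastAPI app, `warm_up` could be called once right after the model is loaded, for example with a blank `np.zeros((640, 640, 3), dtype=np.uint8)` image.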

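3. Frontend Visualization

To make the system easy for non-technical staff to use, we need an intuitive frontend. Two options are presented here:

3.1 Option 1: Rapid Prototyping with Streamlit

Streamlit is well suited to building data science apps quickly: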

import streamlit as st
import requests
from PIL import Image, ImageDraw

st.title("Hard Hat Detection System")

uploaded_file = st.file_uploader("Upload a site photo", type=['jpg', 'jpeg', 'png'])

if uploaded_file is not None:
    # Show the original image (converted to RGB so colored boxes draw correctly)
    image = Image.open(uploaded_file).convert("RGB")
    st.image(image, caption='Original image', use_column_width=True)

    # Call the detection API
    files = {"file": (uploaded_file.name, uploaded_file.getvalue(), uploaded_file.type)}
    response = requests.post("http://localhost:8000/detect", files=files)
    response.raise_for_status()

    # Draw the detection boxes
    draw = ImageDraw.Draw(image)
    for det in response.json()["detections"]:
        bbox = det["bbox"]
        draw.rectangle(bbox, outline="red", width=3)
        draw.text((bbox[0], bbox[1] - 20),
                  f"{det['class']} {det['confidence']:.2f}", fill="red")

    st.image(image, caption='Detection result', use_column_width=True)

Save the script as app.py and start the frontend:

streamlit run app.py
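The script draws every box in red and places the label 20 pixels above the box, which pushes the label off the canvas for detections near the top edge. A small helper can pick a color per class and keep the label visible. This is only a sketch; treating `helmet` as the compliant class name is an assumption about the model's labels.

```python
def annotation_for(det, compliant_classes=("helmet",), label_height=20):
    """Return the color, label text, and label position for one detection.

    compliant_classes is a hypothetical label set; adjust it to your model.
    """
    x1, y1, _x2, _y2 = det["bbox"]
    color = "green" if det["class"] in compliant_classes else "red"
    text = f"{det['class']} {det['confidence']:.2f}"
    # Put the label above the box when there is room, otherwise just inside it
    text_y = y1 - label_height if y1 >= label_height else y1 + 2
    return {"color": color, "text": text, "text_xy": (x1, text_y)}


if __name__ == "__main__":
    det = {"class": "helmet", "confidence": 0.934, "bbox": [40.0, 8.0, 120.0, 90.0]}
    print(annotation_for(det))
```

Inside the drawing loop this would be used as `style = annotation_for(det)`, followed by `draw.rectangle(bbox, outline=style["color"], width=3)` and `draw.text(style["text_xy"], style["text"], fill=style["color"])`.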

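3.2 Option 2: A Custom HTML + JavaScript Frontend

If you need a more polished interface, you can build it with HTML and JavaScript. When the page is served from a different origin than the API (another host or port), the browser blocks the request unless the FastAPI app enables CORS for that origin.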

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Hard Hat Detection System</title>
  <style>
    #preview { max-width: 100%; }
    .result-container { margin-top: 20px; }
    .bbox { position: absolute; border: 2px solid red; box-sizing: border-box; }
    .label { position: absolute; color: red; font-weight: bold; }
  </style>
</head>
<body>
  <h1>Hard Hat Detection System</h1>
  <input type="file" id="upload" accept="image/*">
  <div class="result-container">
    <!-- inline-block makes the overlay container exactly as large as the image -->
    <div style="position: relative; display: inline-block;">
      <img id="preview" style="display: none;">
      <div id="results"></div>
    </div>
  </div>
  <script>
    document.getElementById('upload').addEventListener('change', function(e) {
      const file = e.target.files[0];
      if (!file) return;

      const reader = new FileReader();
      reader.onload = function(event) {
        const img = document.getElementById('preview');
        img.src = event.target.result;
        img.style.display = 'block';
        // Call the detection API
        detectImage(file);
      };
      reader.readAsDataURL(file);
    });

    async function detectImage(file) {
      const formData = new FormData();
      formData.append('file', file);
      try {
        const response = await fetch('http://localhost:8000/detect', {
          method: 'POST',
          body: formData
        });
        const data = await response.json();
        drawResults(data.detections);
      } catch (error) {
        console.error('Detection failed:', error);
      }
    }

    function drawResults(detections) {
      const resultsDiv = document.getElementById('results');
      resultsDiv.innerHTML = '';
      const img = document.getElementById('preview');
      // Boxes are in original-image pixels, so normalize by the natural size
      // rather than the displayed size
      const imgWidth = img.naturalWidth;
      const imgHeight = img.naturalHeight;

      detections.forEach(det => {
        const [x1, y1, x2, y2] = det.bbox;

        const bbox = document.createElement('div');
        bbox.className = 'bbox';
        bbox.style.left = `${x1/imgWidth*100}%`;
        bbox.style.top = `${y1/imgHeight*100}%`;
        bbox.style.width = `${(x2-x1)/imgWidth*100}%`;
        bbox.style.height = `${(y2-y1)/imgHeight*100}%`;

        const label = document.createElement('div');
        label.className = 'label';
        label.style.left = `${x1/imgWidth*100}%`;
        label.style.top = `${(y1/imgHeight*100)-5}%`;
        label.textContent = `${det.class} ${(det.confidence*100).toFixed(1)}%`;

        resultsDiv.appendChild(bbox);
        resultsDiv.appendChild(label);
      });
    }
  </script>
</body>
</html>

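4. Deployment and Performance Tuning

4.1 Containerized Deployment with Docker

To keep the environment consistent, deploy with Docker. The Dockerfile below expects a requirements.txt that lists the Python packages installed in 2.1: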

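FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04

WORKDIR /app

# System packages: Python plus the shared libraries OpenCV needs
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip libgl1 libglib2.0-0 && \
    rm -rf /var/lib/apt/lists/*

# Install Python dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]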

Build and run (--gpus all requires the NVIDIA Container Toolkit on the host):

docker build -t helmet-detection .
docker run -p 8000:8000 --gpus all helmet-detection

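4.2 Monitoring and Scaling

In production, consider the following:

Monitoring metrics:
- API response time
- GPU utilization
- Concurrent request capacity
Horizontal scaling:
- Nginx as a load balancer
- Deployment on a Kubernetes cluster
Autoscaling:
- Add or remove instances based on GPU utilization
- Use the Kubernetes HPA or your cloud provider's autoscaling features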
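For the autoscaling items, the Kubernetes HPA documents its core rule as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below applies that rule to GPU utilization. The function name and the replica bounds are illustrative, the HPA's tolerance band is ignored, and feeding GPU utilization to the HPA requires a custom metrics pipeline that is not shown here.

```python
import math


def desired_replicas(current_replicas, current_util, target_util,
                     min_replicas=1, max_replicas=10):
    """Apply the HPA scaling rule and clamp the result to the allowed range.

    current_util and target_util are GPU utilization percentages averaged
    across the running instances.
    """
    if target_util <= 0:
        raise ValueError("target_util must be positive")
    raw = math.ceil(current_replicas * current_util / target_util)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Two instances at 90% utilization with a 60% target
    print(desired_replicas(2, 90, 60))
    # Four instances at 30% utilization with a 60% target
    print(desired_replicas(4, 30, 60))
```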

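# Example: sample GPU utilization once per second with nvidia-smi
# (for Prometheus, an exporter such as NVIDIA's DCGM exporter can publish the same metric)
nvidia-smi --query-gpu=utilization.gpu --format=csv -l 1 > gpu_usage.csv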
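The resulting gpu_usage.csv contains a header line followed by one reading such as `87 %` per sample (and per GPU on multi-GPU hosts). A minimal sketch for summarizing the log, with hypothetical function names and made-up sample readings:

```python
def parse_gpu_utilization(csv_text):
    """Extract integer utilization percentages from nvidia-smi CSV output,
    skipping the header and any line that is not a plain reading."""
    values = []
    for line in csv_text.splitlines():
        token = line.strip().rstrip("% ").strip()
        if token.isdigit():
            values.append(int(token))
    return values


def summarize(values):
    """Return the sample count, mean, and peak utilization."""
    if not values:
        return {"samples": 0, "mean": None, "peak": None}
    return {
        "samples": len(values),
        "mean": sum(values) / len(values),
        "peak": max(values),
    }


if __name__ == "__main__":
    # Made-up readings in the format nvidia-smi writes with --format=csv
    sample = "utilization.gpu [%]\n12 %\n87 %\n45 %\n"
    print(summarize(parse_gpu_utilization(sample)))
```

To summarize a real log, read the file first, for example `parse_gpu_utilization(open("gpu_usage.csv").read())`.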

4.3 Security Measures

API rate limiting:

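from fastapi import FastAPI, File, Request, UploadFile
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

# Rate-limit by client IP address
limiter = Limiter(key_func=get_remote_address)

app = FastAPI()
app.state.limiter = limiter
# Return HTTP 429 when a client exceeds its limit
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/detect")
@limiter.limit("5/minute")
async def detect(request: Request, file: UploadFile = File(...)):
    ...  # detection logic as in 2.1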
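To make the `5/minute` rule concrete, here is a minimal fixed-window counter in plain Python. It only illustrates the idea and is not how slowapi is implemented; the class name and the injectable clock are assumptions made for the example.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` calls per key within each fixed time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows = {}  # key -> (window index, calls seen in that window)

    def allow(self, key):
        """Return True and count the call if the key is under its limit."""
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._windows.get(key, (window, 0))
        if seen_window != window:
            # A new window has started, so the counter resets
            seen_window, count = window, 0
        if count >= self.limit:
            self._windows[key] = (seen_window, count)
            return False
        self._windows[key] = (seen_window, count + 1)
        return True


if __name__ == "__main__":
    limiter = FixedWindowLimiter(limit=5, window_seconds=60)
    # Seven rapid calls from the same (example) client address
    print([limiter.allow("203.0.113.7") for _ in range(7)])
```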