2025/10/13
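Configuration used:
yolo11s.yaml

    from ultralytics import YOLO


    def main():
        # 1. Do not load a pretrained model (such as YOLO11n, yolo11s, or best.pt);
        #    instead, let the YAML file define the network structure.
        model = YOLO(model=r"ultralytics/cfg/models/11/yolo11s.yaml")
        # model = YOLO(model=r"runs/detect/yolo11-tea-new4/weights/last.pt")
        # model = YOLO(model=r"yolo11s.pt")

        # 2. Start training
        model.train(
            data=r"datasets/teaDiseases/data.yaml",  # dataset config file
            epochs=300,  # number of epochs (100+ recommended)
            imgsz=640,  # input image size
            batch=32,  # batch size
            workers=8,
            device=0,  # GPU index; use "cpu" for CPU
            name="yolo11-tea-yolo11s",  # clearer run name
            lr0=0.001,  # slightly raised initial learning rate (original note: the m model can tolerate it)
            lrf=0.01,
            cos_lr=True,
            cache='ram',  # avoid I/O bottlenecks
            patience=50,
            optimizer='AdamW',  # smoother and more stable (especially on small datasets)
            weight_decay=0.0005,  # standard regularization term
            hsv_h=0.015,
            hsv_s=0.7,
            hsv_v=0.4,
            degrees=5,
            translate=0.1,
            scale=0.5,
            shear=0.1,
            flipud=0.0,
            fliplr=0.5,
            mosaic=1.0,
            mixup=0.2,  # augmentation strategy, suited to leaf-lesion tasks
            val=True,
            plots=True,
        )

        # 3. Validate the model after training
        # Note: in Ultralytics, iou is the NMS IoU threshold; 0.5 was chosen to align with paper settings.
        metrics = model.val(plots=True, iou=0.5)
        print("Validation results:", metrics)

        # 4. Optional: export to ONNX/TorchScript or another format
        # model.export(format="onnx")


    if __name__ == "__main__":
        main()
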
The dataset is the augmented version:
train_aug_10_13

    train: ../train_aug_10_13/images
    val: ../val_10_13/images
    test: ../test/images

    nc: 3
    names: ['algal leaf spot', 'brown blight', 'grey blight']

    roboflow:
      workspace: bryan-setyawan-zsjss
      project: tea-leaves-diseases
      version: 39
      license: CC BY 4.0
      url: https://universe.roboflow.com/bryan-setyawan-zsjss/tea-leaves-diseases/dataset/39

Results:

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    163/300 9.79G 3475 1.501 1.319 82 640: 100%|██████████| 200/200 [00:37<00:00, 5.31it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.39it/s]
    all 181 368 0.618 0.581 0.621 0.433

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    164/300 9.79G 3540 1.496 1.315 85 640: 100%|██████████| 200/200 [00:37<00:00, 5.31it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.20it/s]
    all 181 368 0.616 0.58 0.62 0.431

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    165/300 9.79G 3517 1.506 1.32 110 640: 100%|██████████| 200/200 [00:38<00:00, 5.20it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.35it/s]
    all 181 368 0.615 0.582 0.62 0.43

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    166/300 9.79G 3472 1.501 1.314 108 640: 100%|██████████| 200/200 [00:37<00:00, 5.34it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.08it/s]
    all 181 368 0.619 0.582 0.619 0.428

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    167/300 9.79G 3462 1.501 1.32 96 640: 100%|██████████| 200/200 [00:39<00:00, 5.12it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.15it/s]
    all 181 368 0.619 0.58 0.619 0.43

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    168/300 9.79G 3508 1.496 1.324 77 640: 100%|██████████| 200/200 [00:37<00:00, 5.28it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.31it/s]
    all 181 368 0.628 0.572 0.62 0.431

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    169/300 9.79G 3474 1.486 1.318 89 640: 100%|██████████| 200/200 [00:38<00:00, 5.25it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.29it/s]
    all 181 368 0.632 0.571 0.62 0.432

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    170/300 9.79G 3429 1.482 1.312 77 640: 100%|██████████| 200/200 [00:38<00:00, 5.17it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.41it/s]
    all 181 368 0.635 0.572 0.621 0.432

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    171/300 9.79G 3490 1.48 1.31 90 640: 100%|██████████| 200/200 [00:38<00:00, 5.17it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.19it/s]
    all 181 368 0.631 0.573 0.622 0.433

    Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
    172/300 9.79G 3417 1.489 1.315 109 640: 100%|██████████| 200/200 [00:36<00:00, 5.49it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.39it/s]
    all 181 368 0.638 0.573 0.621 0.432


    EarlyStopping: Training stopped early as no improvement observed in last 50 epochs. Best results observed at epoch 122, best model saved as best.pt.
    To update EarlyStopping(patience=50) pass a new patience value, i.e. `patience=300` or use `patience=0` to disable EarlyStopping.

    172 epochs completed in 3.213 hours.
    Optimizer stripped from runs/detect/yolo11-tea-yolo11s24/weights/last.pt, 19.2MB
    Optimizer stripped from runs/detect/yolo11-tea-yolo11s24/weights/best.pt, 19.2MB

    Validating runs/detect/yolo11-tea-yolo11s24/weights/best.pt...
    Ultralytics 8.3.182 🚀 Python-3.10.12 torch-2.4.0a0+07cecf4168.nv24.05 CUDA:0 (NVIDIA A100-SXM4-80GB, 32768MiB)
    YOLO11s summary (fused): 100 layers, 9,413,961 parameters, 0 gradients
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 3/3 [00:01<00:00, 2.61it/s]
    all 181 368 0.659 0.551 0.622 0.445
    algal leaf spot 69 203 0.74 0.562 0.717 0.526
    brown blight 77 89 0.7 0.525 0.608 0.422
    grey blight 66 76 0.538 0.566 0.542 0.386
    Speed: 0.1ms preprocess, 1.1ms inference, 0.0ms loss, 1.6ms postprocess per image
    Results saved to runs/detect/yolo11-tea-yolo11s24


    Ultralytics 8.3.182 🚀 Python-3.10.12 torch-2.4.0a0+07cecf4168.nv24.05 CUDA:0 (NVIDIA A100-SXM4-80GB, 32768MiB)
    YOLO11s summary (fused): 100 layers, 9,413,961 parameters, 0 gradients
    val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2.9±2.6 MB/s, size: 47.8 KB)
    val: Scanning /home/share/priv/yolo_new/ultralytics-main/ultralytics-main/datasets/teaDiseases/val_10_13/labels.cache... 181 images, 0 backgrounds, 0
    WARNING ⚠️ cache='ram' may produce non-deterministic training results. Consider cache='disk' as a deterministic alternative if your disk space allows.
    val: Caching images (0.2GB RAM): 100%|██████████| 181/181 [00:02<00:00, 83.24it/s]
    Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 6/6 [00:01<00:00, 3.37it/s]
    all 181 368 0.664 0.548 0.623 0.44
    algal leaf spot 69 203 0.739 0.558 0.716 0.523
    brown blight 77 89 0.71 0.522 0.607 0.416
    grey blight 66 76 0.543 0.566 0.545 0.382
    Speed: 1.4ms preprocess, 2.4ms inference, 0.0ms loss, 0.9ms postprocess per image

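Summary:

Metric | Meaning | Value |
---|---|---|
P (Precision) | Fraction of predicted boxes that are correct | 0.664 |
R (Recall) | Fraction of ground-truth objects that are detected | 0.548 |
mAP@0.5 | Mean average precision at IoU = 0.5 | 0.623 |
mAP@0.5:0.95 | Mean average precision averaged over multiple IoU thresholds (stricter) | 0.44 |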
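
As a small worked example of how precision and recall relate, the sketch below computes both (plus F1) from true-positive, false-positive, and false-negative counts. The counts are hypothetical and chosen only for illustration; the last line combines the P and R reported above into an F1 score, which the validation log itself does not print.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical counts: 6 correct boxes, 2 spurious boxes, 4 missed objects.
example = precision_recall_f1(tp=6, fp=2, fn=4)

# F1 implied by the P and R values in the summary table.
reported_f1 = f1_score(0.664, 0.548)
```

With the reported values the implied F1 is roughly 0.60, which reflects that recall lags behind precision in this run.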
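
Metric | Meaning | Observation |
---|---|---|
box_loss ≈ 3400~3500 | Bounding-box regression loss | Stable over epochs 163 to 172, so training has plateaued. |
cls_loss ≈ 1.48~1.5 | Classification loss | Still high; the classes remain hard to separate, grey blight in particular (lowest per-class mAP50). |
dfl_loss ≈ 1.31 | Distribution focal loss | Stable, suggesting reasonable box quality. |
Early stopping (EarlyStopping) | Best results at epoch 122, stopped at epoch 172 | No validation improvement for 50 epochs (patience=50); the model stopped improving late in training, which may indicate overfitting. |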
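
The early-stopping numbers in the log (best at epoch 122, stop at epoch 172 with patience=50) follow from a simple patience rule. The sketch below is a simplified illustration of that rule, not Ultralytics' actual implementation, and the fitness curve is synthetic.

```python
def early_stop_epoch(fitness_by_epoch, patience):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    stop_epoch is None if training runs to the end. A patience of 0
    disables early stopping, matching the hint in the training log.
    """
    best_fitness = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_by_epoch, start=1):
        if fitness > best_fitness:
            best_fitness, best_epoch = fitness, epoch
        if patience and epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, None


# Synthetic fitness: improves every epoch up to 122, then stays slightly lower.
fitness = [0.3 + 0.001 * e for e in range(1, 123)] + [0.40] * 178

with_patience = early_stop_epoch(fitness, patience=50)
without_patience = early_stop_epoch(fitness, patience=0)
```

Here `with_patience` is `(122, 172)`, the same pair the log reports, while `without_patience` runs all 300 epochs.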
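Suggestions for improvement:
✅ 1. Improve the dataset
- Curate the validation set strictly; do not include augmented images.
- Validation set: draw it from the original, unaugmented data (roughly 200 to 400 images).
- Test set: optionally hold out 100 real images separately.
- Make sure the three classes are reasonably balanced in the validation set.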
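
One way to check the balance point above is to count instances per class directly from the YOLO label files. The sketch below assumes the standard YOLO label layout (one `.txt` per image, class id as the first field of each line); the demo writes two tiny label files to a temporary directory so it runs without the real dataset.

```python
import tempfile
from collections import Counter
from pathlib import Path

# Class names as listed in data.yaml.
NAMES = ["algal leaf spot", "brown blight", "grey blight"]


def count_instances(label_dir, names):
    """Count labelled boxes per class across all YOLO .txt files in label_dir."""
    counts = Counter({name: 0 for name in names})
    for label_file in sorted(Path(label_dir).glob("*.txt")):
        for line in label_file.read_text().splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                counts[names[int(fields[0])]] += 1
    return dict(counts)


# Demo on synthetic labels (class x_center y_center width height).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("0 0.5 0.5 0.2 0.2\n2 0.3 0.3 0.1 0.1\n")
    Path(tmp, "b.txt").write_text("0 0.4 0.6 0.2 0.3\n")
    demo_counts = count_instances(tmp, NAMES)
```

For the real data, the same function could be pointed at the validation label folder (for example `datasets/teaDiseases/val_10_13/labels`); the validation log above already shows 203, 89, and 76 instances for the three classes, so algal leaf spot is over-represented.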
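✅ 2. Switch to a stronger model
- Use yolo11m.pt or yolo11l.pt: YOLO11m has about 20M parameters and YOLO11l about 25M (YOLO11s in this run has about 9.4M), which should give the model more feature-learning capacity.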
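✅ 3. Strengthen the training strategy
- Increase the learning rate (`lr0`);
- Extend training to 400 epochs;
- or disable early stopping (`patience=0`, as the training log suggests);
- Use cosine annealing (`cos_lr=True`, which this run already enables).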
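
To make the effect of these settings concrete, the sketch below collects them as training overrides and evaluates a generic cosine decay from `lr0` down to `lr0 * lrf`. The value `lr0=0.002` is an assumption for illustration only (the notes do not state a target value), and the schedule ignores warmup, so Ultralytics' internal schedule may differ in detail.

```python
import math


def cosine_lr(epoch, epochs, lr0, lrf):
    """Learning rate at `epoch` for a cosine decay from lr0 to lr0 * lrf."""
    factor = (1 - math.cos(math.pi * epoch / epochs)) / 2 * (lrf - 1) + 1
    return lr0 * factor


# Hypothetical overrides: longer training, no early stopping, cosine schedule.
# lr0=0.002 is an assumed value, not one taken from the notes.
overrides = dict(epochs=400, patience=0, cos_lr=True, lr0=0.002, lrf=0.01)

# Learning rate at the start, midpoint, and end of training.
schedule = [
    cosine_lr(e, overrides["epochs"], overrides["lr0"], overrides["lrf"])
    for e in (0, 200, 400)
]
```

These overrides could then be passed to the training call alongside the existing arguments, for example `model.train(data=..., **overrides)`. Under this schedule the rate falls from 0.002 to about 0.001 at the midpoint and to 0.00002 at the end.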
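✅ 4. Balance the classes
Add weights in data.yaml or use class-weight balancing (class weights), or apply extra augmentation to the under-represented classes. Note that a stock Ultralytics data.yaml may not read a class-weight field, in which case weighting would need a custom loss or sampler.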
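
A common starting point for class weights is inverse frequency. The sketch below computes such weights from per-class instance counts; it uses the validation-set counts from the log purely as an illustration, since the training-set counts are not given in these notes. How the weights are then applied (custom loss, oversampling) is left open.

```python
def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * class_count).

    A perfectly balanced dataset gives every class a weight of 1.0.
    """
    if any(c <= 0 for c in counts.values()):
        raise ValueError("every class needs at least one instance")
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * c) for name, c in counts.items()}


# Instance counts taken from the validation log (illustrative only).
val_counts = {"algal leaf spot": 203, "brown blight": 89, "grey blight": 76}
weights = inverse_frequency_weights(val_counts)
```

With these counts, grey blight receives the largest weight (about 1.61) and algal leaf spot the smallest (about 0.60).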
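✅ 5. Run a high-quality validation
After training, run validation with plots enabled (`model.val(plots=True)`, as in the training script) to generate the confusion matrix and PR curves, then check which classes are confused most often.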
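
Once a confusion matrix is available, the most-confused class pairs can be ranked programmatically. The sketch below works on a plain nested list and assumes rows are predicted classes and columns are true classes, with no background row or column; the matrix values are hypothetical, not results from this run.

```python
def most_confused_pairs(matrix, names, top_k=3):
    """Return the top_k off-diagonal cells as (count, predicted, actual)."""
    pairs = []
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if i != j and count > 0:
                pairs.append((count, names[i], names[j]))
    # Largest counts first; names break ties deterministically.
    pairs.sort(key=lambda t: (-t[0], t[1], t[2]))
    return pairs[:top_k]


names = ["algal leaf spot", "brown blight", "grey blight"]

# Hypothetical confusion matrix (rows: predicted, columns: actual).
hypothetical_matrix = [
    [110, 6, 9],
    [5, 45, 14],
    [8, 12, 40],
]

top = most_confused_pairs(hypothetical_matrix, names, top_k=2)
```

In this made-up example the two largest confusions are both between brown blight and grey blight, which is the kind of pattern worth looking for given the low grey blight scores above.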
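🌿 In one sentence:
Your model has converged but does not yet generalize.
The main bottlenecks right now are an unrepresentative validation set, excessive data augmentation, and limited model capacity.
Fix the validation set and the model size first, then consider fine-tuning hyperparameters.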