YOLOv5: Validating on Specified Classes

  • Foreword
  • Prerequisites
  • Background
  • Experiment environment
  • YOLOv5: Validating on Specified Classes
    • Code implementation
    • Running validation
    • Results without class filtering
    • Results with class filtering


Foreword

  • The author's knowledge is limited, so errors and omissions are inevitable; corrections are welcome.
  • For more content, see the Python daily tips column, the OpenCV-Python applications column, the YOLO series column, the natural language processing column, or my personal homepage:
  • Face-forgery detection based on DETR
  • Training YOLOv7 on a custom dataset (mask detection)
  • Training YOLOv8 on a custom dataset (football detection)
  • YOLOv5: accelerating model inference with TensorRT
  • YOLOv5: IoU, GIoU, DIoU, CIoU, EIoU
  • Playing with Jetson Nano (5): accelerating YOLOv5 object detection with TensorRT
  • YOLOv5: adding SE, CBAM, CoordAtt, and ECA attention mechanisms
  • YOLOv5: reading the yolov5s.yaml config file, adding a small-object detection layer
  • Converting a COCO-format instance-segmentation dataset to YOLO format with Python
  • YOLOv5: training a custom instance-segmentation model with version 7.0 (vehicles, pedestrians, road signs, lane lines, etc.)
  • Trying the open-source Stable Diffusion project for free on Kaggle GPUs

Prerequisites

  • Familiarity with Python

Background

  • Python is a cross-platform, high-level programming language that combines interpreted, compiled, interactive, and object-oriented features. Originally designed for writing automation scripts, it is increasingly used for standalone, large-scale projects as the language has gained new features over successive versions.
  • PyTorch is a deep-learning framework that packages many networks and deep-learning tools so we can call them instead of writing each one from scratch. It ships in CPU and GPU versions; other frameworks include TensorFlow and Caffe. PyTorch was released by Facebook AI Research (FAIR) and builds on Torch. It is a Python-based scientific-computing package offering two high-level features: (1) tensor computation with strong GPU acceleration (like NumPy); (2) automatic differentiation for building deep neural networks.
  • YOLOv5 is a single-stage object-detection algorithm that adds several improvements on top of YOLOv4, substantially boosting both speed and accuracy. It is a family of detection architectures and models pretrained on the COCO dataset, representing Ultralytics' open-source research into future vision-AI methods and incorporating lessons learned and best practices from thousands of hours of research and development.

Experiment environment

  • Python 3.x (an object-oriented, high-level language)

YOLOv5: Validating on Specified Classes

  • Background: in some scenarios you only care about how the model performs on specific classes, so you can restrict evaluation/validation to just those classes.
  • Example directory structure

Code implementation

  • The main change is to the include_class variable at line 552 of the official utils/datasets.py (the exact line number may shift between YOLOv5 versions).


  • Only the "Update labels" block inside LoadImagesAndLabels.__init__ changes; everything else in utils/datasets.py (the dataloader wrappers, LoadImages/LoadWebcam/LoadStreams, label caching, and the mosaic-augmentation code) stays exactly as in the official repository.

```python
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# utils/datasets.py -- excerpt from LoadImagesAndLabels.__init__ (around line 552)

# Update labels
# include_class = []  # filter labels to include only these classes (optional)
include_class = [13, 14]  # filter labels to include only these classes (optional)
include_class_array = np.array(include_class).reshape(1, -1)
for i, (label, segment) in enumerate(zip(self.labels, self.segments)):
    if include_class:
        # keep only the ground-truth rows whose class index (column 0) is in include_class
        j = (label[:, 0:1] == include_class_array).any(1)
        self.labels[i] = label[j]
        if segment:
            self.segments[i] = segment[j]
    if single_cls:  # single-class training, merge all classes into 0
        self.labels[i][:, 0] = 0
        if segment:
            self.segments[i][:, 0] = 0
```

Leaving include_class empty ([], the default) disables the filter; setting it to [13, 14] keeps only the ground-truth boxes of classes 13 and 14, so validation metrics are computed for those classes alone.
0.5的概率将四张图片拼接到一张大图上训练, 0.5概率直接将某张图片上采样两倍训练"""img, label, path, shapes = zip(*batch)  # transposedn = len(shapes) // 4img4, label4, path4, shapes4 = [], [], path[:n], shapes[:n]ho = torch.tensor([[0.0, 0, 0, 1, 0, 0]])wo = torch.tensor([[0.0, 0, 1, 0, 0, 0]])s = torch.tensor([[1, 1, 0.5, 0.5, 0.5, 0.5]])  # scalefor i in range(n):  # zidane torch.zeros(16,3,720,1280)  # BCHWi *= 4if random.random() < 0.5:im = F.interpolate(img[i].unsqueeze(0).float(), scale_factor=2.0, mode='bilinear', align_corners=False)[0].type(img[i].type())lb = label[i]else:im = torch.cat((torch.cat((img[i], img[i + 1]), 1), torch.cat((img[i + 2], img[i + 3]), 1)), 2)lb = torch.cat((label[i], label[i + 1] + ho, label[i + 2] + wo, label[i + 3] + ho + wo), 0) * simg4.append(im)label4.append(lb)for i, lb in enumerate(label4):lb[:, 0] = i  # add target image index for build_targets()return torch.stack(img4, 0), torch.cat(label4, 0), path4, shapes4# Ancillary functions --------------------------------------------------------------------------------------------------
def create_folder(path='./new'):# Create folderif os.path.exists(path):shutil.rmtree(path)  # delete output folderos.makedirs(path)  # make new output folderdef flatten_recursive(path=DATASETS_DIR / 'coco128'):# Flatten a recursive directory by bringing all files to top levelnew_path = Path(str(path) + '_flat')create_folder(new_path)for file in tqdm(glob.glob(str(Path(path)) + '/**/*.*', recursive=True)):shutil.copyfile(file, new_path / Path(file).name)def extract_boxes(path=DATASETS_DIR / 'coco128'):  # from utils.datasets import *; extract_boxes()# Convert detection dataset into classification dataset, with one directory per classpath = Path(path)  # images dirshutil.rmtree(path / 'classifier') if (path / 'classifier').is_dir() else None  # remove existingfiles = list(path.rglob('*.*'))n = len(files)  # number of filesfor im_file in tqdm(files, total=n):if im_file.suffix[1:] in IMG_FORMATS:# imageim = cv2.imread(str(im_file))[..., ::-1]  # BGR to RGBh, w = im.shape[:2]# labelslb_file = Path(img2label_paths([str(im_file)])[0])if Path(lb_file).exists():with open(lb_file) as f:lb = np.array([x.split() for x in f.read().strip().splitlines()], dtype=np.float32)  # labelsfor j, x in enumerate(lb):c = int(x[0])  # classf = (path / 'classifier') / f'{c}' / f'{path.stem}_{im_file.stem}_{j}.jpg'  # new filenameif not f.parent.is_dir():f.parent.mkdir(parents=True)b = x[1:] * [w, h, w, h]  # box# b[2:] = b[2:].max()  # rectangle to squareb[2:] = b[2:] * 1.2 + 3  # padb = xywh2xyxy(b.reshape(-1, 4)).ravel().astype(np.int)b[[0, 2]] = np.clip(b[[0, 2]], 0, w)  # clip boxes outside of imageb[[1, 3]] = np.clip(b[[1, 3]], 0, h)assert cv2.imwrite(str(f), im[b[1]:b[3], b[0]:b[2]]), f'box failure in {f}'def autosplit(path=DATASETS_DIR / 'coco128/images', weights=(0.9, 0.1, 0.0), annotated_only=False):""" Autosplit a dataset into train/val/test splits and save path/autosplit_*.txt filesUsage: from utils.datasets import *; autosplit()Argumentspath:            Path to images 
directoryweights:         Train, val, test weights (list, tuple)annotated_only:  Only use images with an annotated txt file"""path = Path(path)  # images dirfiles = sorted(x for x in path.rglob('*.*') if x.suffix[1:].lower() in IMG_FORMATS)  # image files onlyn = len(files)  # number of filesrandom.seed(0)  # for reproducibilityindices = random.choices([0, 1, 2], weights=weights, k=n)  # assign each image to a splittxt = ['autosplit_train.txt', 'autosplit_val.txt', 'autosplit_test.txt']  # 3 txt files[(path.parent / x).unlink(missing_ok=True) for x in txt]  # remove existingprint(f'Autosplitting images from {path}' + ', using *.txt labeled images only' * annotated_only)for i, img in tqdm(zip(indices, files), total=n):if not annotated_only or Path(img2label_paths([str(img)])[0]).exists():  # check labelwith open(path.parent / txt[i], 'a') as f:f.write('./' + img.relative_to(path.parent).as_posix() + '\n')  # add image to txt filedef verify_image_label(args):# Verify one image-label pairim_file, lb_file, prefix = argsnm, nf, ne, nc, msg, segments = 0, 0, 0, 0, '', []  # number (missing, found, empty, corrupt), message, segmentstry:# verify imagesim = Image.open(im_file)im.verify()  # PIL verifyshape = exif_size(im)  # image sizeassert (shape[0] > 9) & (shape[1] > 9), f'image size {shape} <10 pixels'assert im.format.lower() in IMG_FORMATS, f'invalid image format {im.format}'if im.format.lower() in ('jpg', 'jpeg'):with open(im_file, 'rb') as f:f.seek(-2, 2)if f.read() != b'\xff\xd9':  # corrupt JPEGImageOps.exif_transpose(Image.open(im_file)).save(im_file, 'JPEG', subsampling=0, quality=100)msg = f'{prefix}WARNING: {im_file}: corrupt JPEG restored and saved'# verify labelsif os.path.isfile(lb_file):nf = 1  # label foundwith open(lb_file) as f:lb = [x.split() for x in f.read().strip().splitlines() if len(x)]if any([len(x) > 8 for x in lb]):  # is segmentclasses = np.array([x[0] for x in lb], dtype=np.float32)segments = [np.array(x[1:], dtype=np.float32).reshape(-1, 2) 
for x in lb]  # (cls, xy1...)lb = np.concatenate((classes.reshape(-1, 1), segments2boxes(segments)), 1)  # (cls, xywh)lb = np.array(lb, dtype=np.float32)nl = len(lb)if nl:assert lb.shape[1] == 5, f'labels require 5 columns, {lb.shape[1]} columns detected'assert (lb >= 0).all(), f'negative label values {lb[lb < 0]}'assert (lb[:, 1:] <= 1).all(), f'non-normalized or out of bounds coordinates {lb[:, 1:][lb[:, 1:] > 1]}'_, i = np.unique(lb, axis=0, return_index=True)if len(i) < nl:  # duplicate row checklb = lb[i]  # remove duplicatesif segments:segments = segments[i]msg = f'{prefix}WARNING: {im_file}: {nl - len(i)} duplicate labels removed'else:ne = 1  # label emptylb = np.zeros((0, 5), dtype=np.float32)else:nm = 1  # label missinglb = np.zeros((0, 5), dtype=np.float32)return im_file, lb, shape, segments, nm, nf, ne, nc, msgexcept Exception as e:nc = 1msg = f'{prefix}WARNING: {im_file}: ignoring corrupt image/label: {e}'return [None, None, None, None, nm, nf, ne, nc, msg]def dataset_stats(path='coco128.yaml', autodownload=False, verbose=False, profile=False, hub=False):""" Return dataset statistics dictionary with images and instances counts per split per classTo run in parent directory: export PYTHONPATH="$PWD/yolov5"Usage1: from utils.datasets import *; dataset_stats('coco128.yaml', autodownload=True)Usage2: from utils.datasets import *; dataset_stats('path/to/coco128_with_yaml.zip')Argumentspath:           Path to data.yaml or data.zip (with data.yaml inside data.zip)autodownload:   Attempt to download dataset if not found locallyverbose:        Print stats dictionary"""def round_labels(labels):# Update labels to integer class and 6 decimal place floatsreturn [[int(c), *(round(x, 4) for x in points)] for c, *points in labels]def unzip(path):# Unzip data.zip TODO: CONSTRAINT: path/to/abc.zip MUST unzip to 'path/to/abc/'if str(path).endswith('.zip'):  # path is data.zipassert Path(path).is_file(), f'Error unzipping {path}, file not 
found'ZipFile(path).extractall(path=path.parent)  # unzipdir = path.with_suffix('')  # dataset directory == zip namereturn True, str(dir), next(dir.rglob('*.yaml'))  # zipped, data_dir, yaml_pathelse:  # path is data.yamlreturn False, None, pathdef hub_ops(f, max_dim=1920):# HUB ops for 1 image 'f': resize and save at reduced quality in /dataset-hub for web/app viewingf_new = im_dir / Path(f).name  # dataset-hub image filenametry:  # use PILim = Image.open(f)r = max_dim / max(im.height, im.width)  # ratioif r < 1.0:  # image too largeim = im.resize((int(im.width * r), int(im.height * r)))im.save(f_new, 'JPEG', quality=75, optimize=True)  # saveexcept Exception as e:  # use OpenCVprint(f'WARNING: HUB ops PIL failure {f}: {e}')im = cv2.imread(f)im_height, im_width = im.shape[:2]r = max_dim / max(im_height, im_width)  # ratioif r < 1.0:  # image too largeim = cv2.resize(im, (int(im_width * r), int(im_height * r)), interpolation=cv2.INTER_AREA)cv2.imwrite(str(f_new), im)zipped, data_dir, yaml_path = unzip(Path(path))with open(check_yaml(yaml_path), errors='ignore') as f:data = yaml.safe_load(f)  # data dictif zipped:data['path'] = data_dir  # TODO: should this be dir.resolve()?check_dataset(data, autodownload)  # download dataset if missinghub_dir = Path(data['path'] + ('-hub' if hub else ''))stats = {'nc': data['nc'], 'names': data['names']}  # statistics dictionaryfor split in 'train', 'val', 'test':if data.get(split) is None:stats[split] = None  # i.e. 
no test setcontinuex = []dataset = LoadImagesAndLabels(data[split])  # load datasetfor label in tqdm(dataset.labels, total=dataset.n, desc='Statistics'):x.append(np.bincount(label[:, 0].astype(int), minlength=data['nc']))x = np.array(x)  # shape(128x80)stats[split] = {'instance_stats': {'total': int(x.sum()), 'per_class': x.sum(0).tolist()},'image_stats': {'total': dataset.n, 'unlabelled': int(np.all(x == 0, 1).sum()),'per_class': (x > 0).sum(0).tolist()},'labels': [{str(Path(k).name): round_labels(v.tolist())} for k, v inzip(dataset.img_files, dataset.labels)]}if hub:im_dir = hub_dir / 'images'im_dir.mkdir(parents=True, exist_ok=True)for _ in tqdm(ThreadPool(NUM_THREADS).imap(hub_ops, dataset.img_files), total=dataset.n, desc='HUB Ops'):pass# Profilestats_path = hub_dir / 'stats.json'if profile:for _ in range(1):file = stats_path.with_suffix('.npy')t1 = time.time()np.save(file, stats)t2 = time.time()x = np.load(file, allow_pickle=True)print(f'stats.npy times: {time.time() - t2:.3f}s read, {t2 - t1:.3f}s write')file = stats_path.with_suffix('.json')t1 = time.time()with open(file, 'w') as f:json.dump(stats, f)  # save stats *.jsont2 = time.time()with open(file) as f:x = json.load(f)  # load hyps dictprint(f'stats.json times: {time.time() - t2:.3f}s read, {t2 - t1:.3f}s write')# Save, print and returnif hub:print(f'Saving {stats_path.resolve()}...')with open(stats_path, 'w') as f:json.dump(stats, f)  # save stats.jsonif verbose:print(json.dumps(stats, indent=2, sort_keys=False))return stats
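The docstring of `collate_fn` above invites you to debug what the batch actually looks like. The snippet below is a hypothetical pure-NumPy rendition of just the label-packing step (`torch.stack`/`torch.cat` swapped for their NumPy equivalents), showing how the batch index is written into column 0 of each image's labels before they are concatenated:

```python
import numpy as np

def collate_labels(label_list):
    # Mirrors the loop in collate_fn: column 0 of every label row is set to the
    # index of the image it belongs to inside the batch (used by build_targets()).
    out = []
    for i, lb in enumerate(label_list):
        lb = lb.copy()
        lb[:, 0] = i  # target image index
        out.append(lb)
    return np.concatenate(out, 0)  # (num_targets, 6)

labels = [
    np.array([[0, 6, 0.5, 0.5, 0.26, 0.35],
              [0, 2, 0.4, 0.4, 0.10, 0.20]]),  # 2 targets in image 0
    np.array([[0, 6, 0.5, 0.5, 0.26, 0.35]]),  # 1 target in image 1
]
batch_labels = collate_labels(labels)
print(batch_labels[:, 0])  # [0. 0. 1.]
```

The first two rows belong to the first image, the third to the second, exactly as the docstring describes.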

Run validation

python val.py --img 640 --weights yolov5s.pt

Results without specifying classes

include_class = []  # filter labels to include only these classes (optional)
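To see what this setting does, here is a minimal standalone sketch (an assumption based on the `include_class` filtering logic in `LoadImagesAndLabels`; the made-up label rows use the YOLO layout `[class, x, y, w, h]`, normalized). An empty list is falsy, so no label is filtered out:

```python
import numpy as np

include_class = []  # empty list: keep every class
include_class_array = np.array(include_class).reshape(1, -1)

labels = np.array([[13, 0.5, 0.5, 0.2, 0.2],   # class 13
                   [14, 0.3, 0.3, 0.1, 0.1],   # class 14
                   [0,  0.7, 0.7, 0.1, 0.1]])  # class 0

if include_class:  # False for an empty list, so this branch is skipped
    j = (labels[:, 0:1] == include_class_array).any(1)  # row-wise class match
    labels = labels[j]

print(len(labels))  # 3: all rows survive
```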

(screenshots of the validation metrics omitted)

Results with specified classes

# keep only classes 13 and 14
include_class = [13, 14]  # filter labels to include only these classes (optional)
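The numbers in `include_class` are the indices of the `names` list in your dataset's `data.yaml`. If you know the class names rather than the indices, you can look them up; the list below is an assumption (the first 16 entries of the standard COCO-80 order used by `coco128.yaml`), so replace it with the `names` from your own `data.yaml`:

```python
# Hypothetical names list: first 16 entries of the standard COCO-80 order.
names = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train',
         'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign',
         'parking meter', 'bench', 'bird', 'cat']

# Resolve class names to the numeric indices expected by include_class
include_class = [names.index('bench'), names.index('bird')]
print(include_class)  # [13, 14]
```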

(screenshots of the class-filtered validation metrics omitted)
