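From Annotation to Deployment: A Hands-On Guide to Annotating Data with Labelme, Converting It to COCO Format, and Feeding It to SOLOv2 for Instance Segmentation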

Building an Instance Segmentation Dataset from Scratch: A Hands-On Walkthrough of Labelme Annotation and COCO Format Conversion

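In computer vision, high-quality annotation is the foundation of a successful model. Unlike ordinary object detection, instance segmentation requires pixel-accurate annotations, which raises the bar for data preparation. This article walks through the full pipeline, from annotating raw images to producing a format the model can read. It is aimed at researchers and engineers who work with custom datasets.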

1. Choosing an Annotation Tool and Labelme Basics

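Labelme is an open-source image annotation tool written in Python and inspired by MIT's LabelMe project. It is lightweight and supports polygon annotation, which makes it a popular choice for preparing instance segmentation data. Unlike bounding-box tools, Labelme lets you trace an object's outline precisely by placing points along it.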

Installation and basic configuration:

pip install labelme
# Launch the GUI
labelme

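For first-time use, the following adjustments are recommended (menu and option names can differ between Labelme versions):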

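Set the default save location (in recent versions, File > Change Output Dir)
Turn on automatic saving (File > Save Automatically) so annotations are not lost by accident
Set the vertex size (the point_size option in the Labelme config file) to 3-5 pixels for more comfortable annotation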

Tips for efficient annotation:

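Use Ctrl + mouse wheel to zoom quickly
Press Enter to close the polygon you are drawing; Esc cancels it
In edit mode, drag a vertex to adjust it; right-clicking opens a context menu with further edit actions
Use the duplicate-polygon feature for similar objects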

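Tip: for objects with complex edges, outline the rough contour with sparse points first and then add detail vertices step by step. This is usually faster than placing dense points in a single pass.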

After annotation, Labelme writes one JSON file per image. Its key fields are:

{
  "version": "5.1.1",
  "flags": {},
  "shapes": [
    {
      "label": "parking_space",
      "points": [[x1,y1], [x2,y2], ...],
      "shape_type": "polygon"
    }
  ],
  "imagePath": "image_001.jpg",
  "imageHeight": 1080,
  "imageWidth": 1920
}

2. Parsing Labelme Annotations and Checking Quality

With the raw annotations in hand, the next step is to check their quality systematically. Common problems include:

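Inconsistent polygon vertex order (clockwise and counterclockwise mixed)
Overlaps or gaps between the annotations of neighboring objects
Incomplete annotations for partially occluded objects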

A batch check in Python:

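import json
from pathlib import Path


def validate_labelme_json(json_path):
    """Return a list of problems found in one Labelme JSON file."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)

    issues = []
    for shape in data['shapes']:
        # A polygon needs at least three vertices
        # (other shape types, such as rectangles, legitimately use fewer)
        is_polygon = shape.get('shape_type', 'polygon') == 'polygon'
        if is_polygon and len(shape['points']) < 3:
            issues.append(f"Too few vertices: {shape['label']}")

        # Report the shape once if any vertex lies outside the image
        for x, y in shape['points']:
            if not (0 <= x <= data['imageWidth'] and 0 <= y <= data['imageHeight']):
                issues.append(f"Coordinate out of bounds: {shape['label']}")
                break
    return issues


# Check every annotation file in the directory
label_dir = Path("labels")
for json_file in sorted(label_dir.glob("*.json")):
    problems = validate_labelme_json(json_file)
    if problems:
        print(f"{json_file.name} has problems:")
        print("\n".join(problems))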

Common fixes:

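Problem type	Detection method	Fix
Inconsistent vertex order	Sign of the polygon's signed area	Compute the signed area with `cv2.contourArea(contour, oriented=True)` and reverse the vertex list where the sign differs
Overlapping neighboring objects	IoU between the annotated regions	Adjust vertices by hand, or drop near-duplicate annotations with an NMS-style filter
Incomplete annotation	Visual inspection	Complete the annotation or exclude the sample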
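The vertex-order fix in the table can also be sketched without OpenCV. The following is a minimal illustration rather than a reference implementation: signed_area and make_clockwise are hypothetical helper names, and the code assumes image coordinates with the origin at the top-left and y growing downward, where a positive shoelace area corresponds to a clockwise traversal on screen.

```python
def signed_area(points):
    """Signed polygon area via the shoelace formula.

    points is a sequence of (x, y) pairs, as in a Labelme "points" list.
    In image coordinates (y grows downward) a positive result means the
    vertices run clockwise as seen on screen.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return total / 2.0


def make_clockwise(points):
    """Return the vertices in clockwise order (hypothetical helper)."""
    if signed_area(points) < 0:
        return list(reversed(points))
    return list(points)
```

Applying make_clockwise to every polygon before export gives all shapes the same orientation; the vertex set itself is unchanged, only the traversal direction.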

3. The COCO Data Format in Depth

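COCO is the de facto standard format for instance segmentation, and its data structure is worth understanding in detail. Whereas Labelme stores one file per image, COCO stores everything centrally, gathering all annotations into a single JSON file.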

Core structure of a COCO dataset:

{
  "info": {...},                  # dataset metadata
  "licenses": [...],              # license information
  "images": [                     # list of images
    {
      "id": int,                  # unique image ID
      "width": int,               # image width
      "height": int,              # image height
      "file_name": str,           # file name
      "license": int,             # license ID
      "coco_url": str             # optional download URL
    }
  ],
  "annotations": [                # list of annotations
    {
      "id": int,                  # unique annotation ID
      "image_id": int,            # ID of the image it belongs to
      "category_id": int,         # category ID
      "segmentation": [           # segmentation polygons
        [x1,y1,x2,y2,...]         # vertex coordinates of one polygon
      ],
      "area": float,              # region area
      "bbox": [x,y,w,h],          # bounding box
      "iscrowd": 0/1              # whether this is a crowd annotation
    }
  ],
  "categories": [                 # list of categories
    {
      "id": int,                  # category ID
      "name": str,                # category name
      "supercategory": str        # parent category
    }
  ]
}

Notes on key fields:

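annotation.id must be unique across the whole file. One common scheme is image_id*1000 + annotation_index, which only stays unique while every image has fewer than 1000 annotations; a single running counter avoids that limit
segmentation accepts either polygon or RLE format. Polygons are the usual choice for individual instances (iscrowd=0), while RLE is used for crowd regions
area should be computed from the actual polygon, not simply taken as the bbox area
iscrowd affects how evaluation metrics are computed; always set it to 0 for a single object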
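The field rules above can be checked mechanically before training. The sketch below is a minimal illustration under stated assumptions: check_coco_consistency is a hypothetical helper (it is not part of pycocotools), and it expects a COCO-style dict already loaded with json.load in which every annotation has id, image_id, category_id, bbox, and area.

```python
from collections import Counter


def check_coco_consistency(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    annotations = coco.get("annotations", [])
    image_ids = {img["id"] for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}

    # Annotation IDs must be unique across the whole file
    id_counts = Counter(ann["id"] for ann in annotations)
    for ann_id, count in id_counts.items():
        if count > 1:
            problems.append(f"duplicate annotation id {ann_id} ({count} times)")

    for ann in annotations:
        # Every annotation must point at an existing image and category
        if ann["image_id"] not in image_ids:
            problems.append(
                f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(
                f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        # A polygon can never cover more area than its bounding box
        _, _, w, h = ann["bbox"]
        if ann["area"] > w * h + 1e-6:
            problems.append(f"annotation {ann['id']}: area exceeds bbox area")
    return problems
```

An empty return value means none of these particular checks failed; it does not prove the file is valid for every downstream tool.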

4. Converting Labelme to COCO in Practice

Below is a conversion script that handles several edge cases:

import json
import numpy as np
from datetime import datetime
from pathlib import Path
from tqdm import tqdm


class Labelme2COCO:
    def __init__(self, class_mapping):
        """
        :param class_mapping: dict mapping Labelme labels to COCO category IDs
        """
        self.class_mapping = class_mapping
        self.coco = {
            "info": {
                "description": "Custom Dataset",
                "url": "",
                "version": "1.0",
                "year": datetime.now().year,
                "contributor": "",
                "date_created": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            },
            "licenses": [{"id": 1, "name": "Academic Use"}],
            "images": [],
            "annotations": [],
            "categories": []
        }
        # Initialize the category entries
        for name, cid in class_mapping.items():
            self.coco["categories"].append({
                "id": cid,
                "name": name,
                "supercategory": "object"
            })

    def _calculate_area(self, segmentation):
        """Compute the polygon area with the shoelace formula."""
        poly = np.array(segmentation).reshape(-1, 2)
        area = 0.5 * np.abs(np.dot(poly[:, 0], np.roll(poly[:, 1], 1)) -
                            np.dot(poly[:, 1], np.roll(poly[:, 0], 1)))
        # Return a plain float so json.dump can always serialize it
        return float(area)

    def _get_bbox(self, segmentation):
        """Compute the bounding box [x, y, w, h] from the polygon coordinates."""
        poly = np.array(segmentation).reshape(-1, 2)
        x_min, y_min = np.min(poly, axis=0)
        x_max, y_max = np.max(poly, axis=0)
        return [float(x_min), float(y_min),
                float(x_max - x_min), float(y_max - y_min)]

    def convert(self, labelme_dir, output_path):
        """Run the conversion."""
        # Sort the files so image IDs are reproducible across runs
        labelme_files = sorted(Path(labelme_dir).glob("*.json"))
        annotation_id = 1

        for image_id, json_file in enumerate(tqdm(labelme_files)):
            with open(json_file, encoding="utf-8") as f:
                labelme_data = json.load(f)

            # Add the image entry
            img_name = labelme_data["imagePath"]
            self.coco["images"].append({
                "id": image_id,
                "file_name": img_name,
                "width": labelme_data["imageWidth"],
                "height": labelme_data["imageHeight"],
                "license": 1,
                "date_captured": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            })

            # Process each shape
            for shape in labelme_data["shapes"]:
                # Only polygons with at least three vertices are converted
                if shape.get("shape_type", "polygon") != "polygon":
                    continue
                if len(shape["points"]) < 3:
                    continue
                # Fail with a clear message when a label has no mapping
                if shape["label"] not in self.class_mapping:
                    raise KeyError(
                        f"Label '{shape['label']}' in {json_file.name} "
                        "is missing from class_mapping")

                # Flatten [[x1, y1], [x2, y2], ...] into [x1, y1, x2, y2, ...]
                segmentation = []
                for point in shape["points"]:
                    segmentation.extend(point)

                # Add the COCO annotation
                self.coco["annotations"].append({
                    "id": annotation_id,
                    "image_id": image_id,
                    "category_id": self.class_mapping[shape["label"]],
                    "segmentation": [segmentation],
                    "area": self._calculate_area(segmentation),
                    "bbox": self._get_bbox(segmentation),
                    "iscrowd": 0
                })
                annotation_id += 1

        # Save the result, creating the output directory if needed
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
        with open(output_path, "w", encoding="utf-8") as f:
            json.dump(self.coco, f, indent=2)


# Usage example
converter = Labelme2COCO({"parking_space": 1, "vehicle": 2})
converter.convert("path/to/labelme_jsons", "output/coco_format.json")

Typical pitfalls during conversion:

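ID conflicts: make sure the way image_id and annotation_id are generated cannot produce duplicates
Coordinate system: Labelme and COCO both place the origin at the image's top-left corner, so the points carry over unchanged; still confirm that your model's data pipeline assumes the same convention
Missing class mappings: every Labelme label must have an entry in class_mapping
Vertex order: some pipelines may be sensitive to polygon vertex order, so it is safest to normalize everything to clockwise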
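The class-mapping pitfall is easy to catch before running a conversion. The following is a minimal sketch, assuming a directory of Labelme JSON files; find_unmapped_labels is a hypothetical helper name, not part of Labelme or pycocotools.

```python
import json
from collections import defaultdict
from pathlib import Path


def find_unmapped_labels(labelme_dir, class_mapping):
    """Map each label missing from class_mapping to the files that use it."""
    missing = defaultdict(list)
    for json_file in sorted(Path(labelme_dir).glob("*.json")):
        with open(json_file, encoding="utf-8") as f:
            data = json.load(f)
        for shape in data.get("shapes", []):
            label = shape.get("label")
            # Record each file at most once per missing label
            if label not in class_mapping and json_file.name not in missing[label]:
                missing[label].append(json_file.name)
    return dict(missing)
```

An empty dict means every label found in the directory has a category ID. Otherwise the result lists which files need relabeling or which entries the mapping still lacks.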

5. Validating the Data and Adapting It for SOLOv2

After conversion, verify that SOLOv2 can load the data correctly. The following checklist is recommended:

Data integrity check:

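import matplotlib.pyplot as plt
from pycocotools.coco import COCO

coco = COCO("path/to/coco_format.json")
print(f"The dataset contains {len(coco.imgs)} images")
print(f"{len(coco.anns)} annotated instances in total")

# Class distribution: number of instances per category
for cat in coco.loadCats(coco.getCatIds()):
    count = len(coco.getAnnIds(catIds=[cat['id']]))
    print(f"  {cat['name']}: {count}")

# Visual check on the first three images
img_ids = coco.getImgIds()[:3]
for img_id in img_ids:
    img_info = coco.loadImgs(img_id)[0]
    ann_ids = coco.getAnnIds(imgIds=img_id)
    annotations = coco.loadAnns(ann_ids)

    # Show the image with its annotations overlaid
    img = plt.imread(f"images/{img_info['file_name']}")
    plt.imshow(img)
    coco.showAnns(annotations)
    plt.show()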

SOLOv2-specific adaptation points: