Evaluation in Object Detection, Localization Accuracy: voc_eval_loc.m

angry_snail_flying · Published 2020-12-19 11:13:55

1. voc_eval__.m
2. voc_eval_loc__.m
3. VOCevalloc.m
  - 3.1 Assigning detections to ground-truth objects
  - 3.2 Implementing the key part of VOCevalloc

Running voc_eval__.m:

path = '/home/user3/CODE/fast-rcnn-loc/data/OPTdevkit2017';
comp_id = 'comp4-17304';
test_set = 'val';
output_dir = '/home/user3/CODE/fast-rcnn-loc/output/loc/opt_2017_val';
rm_res = 0;

res = voc_eval__(path, comp_id, test_set, output_dir, rm_res);

1. voc_eval__.m

First, load VOCopts so that the later steps can use it:

function res = voc_eval__(path, comp_id, test_set, output_dir, rm_res)

VOCopts = get_voc_opts(path);
VOCopts.testset = test_set;
VOCopts.detrespath=[VOCopts.resdir 'Main/%s_det_' VOCopts.testset '_%s.txt'];
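The detection-file path is built in two stages: the test set name is concatenated in immediately, while two %s slots stay open and are filled later by sprintf(VOCopts.detrespath, id, cls). A minimal Python sketch of the same idea follows. The value of resdir and the class name "car" are placeholders for illustration only; the real values come from get_voc_opts and VOCopts.classes.

```python
# Hypothetical results directory; the real value comes from get_voc_opts.
resdir = "results/"
testset = "val"

# Same shape as VOCopts.detrespath: two %s slots are left open for later.
detrespath = resdir + "Main/%s_det_" + testset + "_%s.txt"


def det_file(comp_id, cls):
    """Fill the two open slots, as sprintf(VOCopts.detrespath, id, cls) does."""
    return detrespath % (comp_id, cls)


print(det_file("comp4-17304", "car"))
```

Under these placeholder values, each class gets its own results file whose name carries the competition id, the test set, and the class.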

Then evaluate each class separately:

for i = 1:length(VOCopts.classes)
  cls = VOCopts.classes{i};
  cls = lower(cls);
  res(i) = voc_eval_loc__(cls, VOCopts, comp_id, output_dir, rm_res);
end

2. voc_eval_loc__.m

Preparation:

function res = voc_eval_loc__(cls, VOCopts, comp_id, output_dir, rm_res)

addpath(fullfile(VOCopts.datadir, 'VOCcode'));
res = VOCevalloc(VOCopts, comp_id, cls);

3. VOCevalloc.m

Load the ground truth; the test set contains N images:

function res = VOCevalloc(VOCopts,id,cls)

% load test set, gtids: Nx1 cell
[gtids,~]=textread(sprintf(VOCopts.imgsetpath,VOCopts.testset),'%s %d');

% load ground truth objects
tic;
npos=0;
gt(length(gtids))=struct('BB',[],'diff',[],'det',[]);
for i=1:length(gtids)
    % display progress
    if toc>1
        fprintf('%s: pr: load: %d/%d\n',cls,i,length(gtids));
        drawnow;
        tic;
    end
    
    % read annotation
    rec=PASreadrecord(sprintf(VOCopts.annopath,gtids{i}));
    
    % extract objects of class
    clsinds=strmatch(cls,{rec.objects(:).class},'exact');
    gt(i).BB=cat(1,rec.objects(clsinds).bbox)';
    gt(i).diff=[rec.objects(clsinds).difficult];
    gt(i).det=false(length(clsinds),1);
    npos=npos+sum(~gt(i).diff);
end

Load the detection results:

% load results
[ids,confidence,b1,b2,b3,b4]=textread(sprintf(VOCopts.detrespath,id,cls),'%s %f %f %f %f %f');
BB=[b1 b2 b3 b4]';

Sort the detections:

% sort detections by decreasing confidence
[sc,si]=sort(-confidence);
ids=ids(si);
BB=BB(:,si);
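To make the loading and sorting steps concrete, here is a small Python sketch that parses lines in the same six-column layout the textread call expects (image id, confidence, then four box coordinates) and orders them by decreasing confidence. The sample lines are invented for illustration, and load_detections is a hypothetical helper, not part of the VOC devkit.

```python
def load_detections(lines):
    """Parse 'image_id confidence x1 y1 x2 y2' lines, sorted by decreasing confidence."""
    dets = []
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        image_id = parts[0]
        confidence = float(parts[1])
        box = tuple(float(v) for v in parts[2:])
        dets.append((image_id, confidence, box))
    # Sorting on the negated confidence mirrors sort(-confidence) in the MATLAB code.
    dets.sort(key=lambda d: -d[1])
    return dets


# Invented sample data: two detections in image 000001, one in image 000002.
sample = [
    "000001 0.30 10 10 50 50",
    "000002 0.90 20 20 60 60",
    "000001 0.75 12 11 48 52",
]
dets = load_detections(sample)
for image_id, confidence, box in dets:
    print(image_id, confidence, box)
```

After sorting, the image ids and boxes travel together with their scores, which is what ids=ids(si) and BB=BB(:,si) achieve with the shared index vector si.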

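First, the data structures:
ids: the image id of every detected bndbox; there are nd entries, far more than the number of images.
BB: 4 rows by nd columns, one column per bndbox.
Note that in this setup nd is the same for every class, i.e. each bndbox has one such record for every class.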

3.1 Assigning detections to ground-truth objects

First, the original code (VOCevaldet.m):

% assign detections to ground truth objects
% nd: total number of detected bndboxes
% tp: true-positive flag for each detection
% fp: false-positive flag for each detection
nd=length(confidence);
tp=zeros(nd,1);
fp=zeros(nd,1);
tic;

% process each detected bndbox once
for d=1:nd
    % display progress
    if toc>1
        fprintf('%s: pr: compute: %d/%d\n',cls,d,nd);
        drawnow;
        tic;
    end
    
    % find ground truth image
    % i is the index in gtids of the image this detection belongs to; it must exist and be unique
    i=strmatch(ids{d},gtids,'exact');
    if isempty(i)
        error('unrecognized image "%s"',ids{d});
    elseif length(i)>1
        error('multiple image "%s"',ids{d});
    end

    % assign detection to ground truth object if any
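    % match the detection against the ground truth of the same image
    % bb is the detection and ids{d} its image id; gt(i).BB holds that image's ground-truth boxes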
    bb=BB(:,d);
    ovmax=-inf;
    for j=1:size(gt(i).BB,2)
        bbgt=gt(i).BB(:,j);
        bi=[max(bb(1),bbgt(1)) ; max(bb(2),bbgt(2)) ; min(bb(3),bbgt(3)) ; min(bb(4),bbgt(4))];
        iw=bi(3)-bi(1)+1;
        ih=bi(4)-bi(2)+1;
        if iw>0 & ih>0                
            % compute overlap as area of intersection / area of union
            % ovmax: largest IoU with any ground-truth box so far
            % jmax: index of the ground-truth box that gives ovmax
            ua=(bb(3)-bb(1)+1)*(bb(4)-bb(2)+1)+...
               (bbgt(3)-bbgt(1)+1)*(bbgt(4)-bbgt(2)+1)-...
               iw*ih;
            ov=iw*ih/ua;
            if ov>ovmax
                ovmax=ov;
                jmax=j;
            end
        end
    end

    % assign detection as true positive/don't care/false positive
    % overlap of at least VOCopts.minoverlap (0.5 here)
    if ovmax>=VOCopts.minoverlap
        if ~gt(i).diff(jmax)
            if ~gt(i).det(jmax)
                tp(d)=1;            % true positive
                gt(i).det(jmax)=true;
            else
                fp(d)=1;            % false positive (multiple detection)
            end
        end
    else
        fp(d)=1;                    % false positive
    end
end

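In short:
Each detection bb is compared against every ground-truth box (gtbb) in the same image, and the gtbb with the largest IoU is kept.
If that IoU is at least 0.5 and the gtbb has not already been claimed by another bb, tp(d) is set to 1.
If the gtbb has already been claimed, a higher-scoring bb already matches it (detections are processed in decreasing score order), so fp(d) is set to 1.
If the IoU is below 0.5, fp(d) is set to 1 directly.
If the matched gtbb is marked difficult, the detection counts as neither tp nor fp.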
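The matching rule can be sketched in Python for a single image. This is an illustrative re-implementation, not the devkit code: iou and assign are hypothetical names, the boxes are invented, and the sketch assumes the detections passed in are already sorted by decreasing confidence. It keeps the VOC convention of adding 1 to widths and heights because coordinates are inclusive pixel indices.

```python
def iou(bb, bbgt):
    """IoU of two [x1, y1, x2, y2] boxes with inclusive pixel coordinates (+1)."""
    iw = min(bb[2], bbgt[2]) - max(bb[0], bbgt[0]) + 1
    ih = min(bb[3], bbgt[3]) - max(bb[1], bbgt[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0  # no overlap
    union = ((bb[2] - bb[0] + 1) * (bb[3] - bb[1] + 1)
             + (bbgt[2] - bbgt[0] + 1) * (bbgt[3] - bbgt[1] + 1)
             - iw * ih)
    return iw * ih / union


def assign(dets, gt_boxes, difficult, min_overlap=0.5):
    """Greedy tp/fp assignment for one image; dets sorted by decreasing confidence."""
    taken = [False] * len(gt_boxes)
    tp, fp = [], []
    for bb in dets:
        ovmax, jmax = float("-inf"), -1
        for j, bbgt in enumerate(gt_boxes):
            ov = iou(bb, bbgt)
            if ov > ovmax:
                ovmax, jmax = ov, j
        is_tp = is_fp = 0
        if ovmax >= min_overlap:
            if not difficult[jmax]:
                if not taken[jmax]:
                    is_tp, taken[jmax] = 1, True  # first match of this gt box
                else:
                    is_fp = 1                     # duplicate detection
            # a match to a difficult box is ignored: neither tp nor fp
        else:
            is_fp = 1                             # too little overlap
        tp.append(is_tp)
        fp.append(is_fp)
    return tp, fp


gt_boxes = [[10, 10, 50, 50], [100, 100, 150, 150]]
difficult = [False, True]
dets = [[12, 12, 50, 50], [10, 10, 48, 48], [100, 100, 150, 150], [200, 200, 220, 220]]
tp, fp = assign(dets, gt_boxes, difficult)
print(tp, fp)
```

In this toy image the first detection claims the first ground-truth box, the second is a duplicate of it, the third matches a difficult box and is ignored, and the fourth overlaps nothing.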

The code that follows:

% compute precision/recall
% cumsum gives running totals, from which the precision/recall curve follows
fp=cumsum(fp);
tp=cumsum(tp);
rec=tp/npos;
prec=tp./(fp+tp);

% compute average precision
% 11-point interpolated AP for this class (mAP is the mean of AP over all classes)
ap=0;
for t=0:0.1:1
    p=max(prec(rec>=t));
    if isempty(p)
        p=0;
    end
    ap=ap+p/11;
end
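As a worked check of the loop above, the following Python sketch computes the same 11-point interpolated AP from per-detection tp/fp flags. The function name eleven_point_ap and the toy flags are made up for illustration. One simplification is flagged in the code: where MATLAB would produce NaN for 0/0, the sketch uses 0.

```python
def eleven_point_ap(tp, fp, npos):
    """11-point interpolated AP from per-detection tp/fp flags (score order)."""
    ctp = cfp = 0
    rec, prec = [], []
    for t, f in zip(tp, fp):
        ctp += t
        cfp += f
        rec.append(ctp / npos)
        # Simplification: 0/0 is treated as 0 here (MATLAB would give NaN).
        prec.append(ctp / (ctp + cfp) if ctp + cfp else 0.0)
    ap = 0.0
    for k in range(11):
        t = k / 10  # recall thresholds 0, 0.1, ..., 1
        candidates = [p for r, p in zip(rec, prec) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11
    return ap


# Toy example: three detections, two positives in the ground truth.
ap = eleven_point_ap(tp=[1, 0, 1], fp=[0, 1, 0], npos=2)
print(round(ap, 4))
```

Here recall is 0.5, 0.5, 1.0 and precision is 1, 1/2, 2/3. The six thresholds up to 0.5 see a best precision of 1 and the five thresholds above it see 2/3, so AP = (6 + 10/3) / 11 = 28/33, about 0.848.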

3.2 Implementing the key part of VOCevalloc

Requirements:
• Correct: correct class and IOU > .5
• Localization: correct class, .1 < IOU < .5
• Similar: class is similar, IOU > .1
• Other: class is wrong, IOU > .1
• Background: IOU < .1 for any object
Separating all five categories is awkward: the code above effectively only considers bndboxes of the correct class, so Similar and Other are hard to handle.

Simplified requirements (thresholds written as the code below applies them):
• A. Correct: correct class and IOU >= .5
• B. Localization: correct class, .1 <= IOU < .5
• C. Other: everything else

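Design:
1. Compute a localization-error percentage for each class.
2. Also compute a percentage of correct detections for each class.
3. Count from the point of view of the ground-truth boxes (gtbb).
4. For each gtbb, only the first bb that matches it is counted.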

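Because the code as a whole only considers boxes of the correct class, the pseudocode is:

if IOU >= 0.5
    if first    % avoid counting a Correct bbgt twice; gt.det tracks this
        Correct = Correct + 1
    end
elseif IOU >= 0.1
    if first    % avoid counting a Localization bbgt twice; gt.det tracks this too
        Localization = Localization + 1
    end
end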

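Why gt.det is used in both branches: we are counting ground-truth boxes by category, so each bbgt can be counted only once. Once a bbgt has been assigned to A or B, any later bb that would also fall into A or B for that bbgt is ignored. Because detections are visited in decreasing score order, the category of a bbgt is decided by the highest-scoring bb whose best match is that bbgt with an IOU of at least 0.1.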

Modified code:

% assign detections to ground truth objects
nd=length(confidence);
tic;

res.crct = 0;
res.lclz = 0;
res.ozrs = 0;
for d=1:nd
    % display progress
    if toc>1
        fprintf('%s: loc: compute: %d/%d bndboxes\n',cls,d,nd);
        drawnow;
        tic;
    end
    
    % find ground truth image
    i=strmatch(ids{d},gtids,'exact');
    if isempty(i)
        error('unrecognized image "%s"',ids{d});
    elseif length(i)>1
        error('multiple image "%s"',ids{d});
    end

    % assign detection to ground truth object if any
    bb=BB(:,d);
    ovmax=-inf;
    for j=1:size(gt(i).BB,2)
        bbgt=gt(i).BB(:,j);
        bi=[max(bb(1),bbgt(1)) ; max(bb(2),bbgt(2)) ; min(bb(3),bbgt(3)) ; min(bb(4),bbgt(4))];
        iw=bi(3)-bi(1)+1;
        ih=bi(4)-bi(2)+1;
        if iw>0 & ih>0                
            % compute overlap as area of intersection / area of union
            ua=(bb(3)-bb(1)+1)*(bb(4)-bb(2)+1)+...
               (bbgt(3)-bbgt(1)+1)*(bbgt(4)-bbgt(2)+1)-...
               iw*ih;
            ov=iw*ih/ua;
            if ov>ovmax
                ovmax=ov;
                jmax=j;
            end
        end
    end
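    % classify the best-matching ground-truth box as Correct or Localization
    % each box is counted once (gt.det); difficult boxes are skipped because
    % npos excludes them, which keeps res.ozrs from being undercounted
    if ovmax >= 0.1 && ~gt(i).diff(jmax) && ~gt(i).det(jmax)
        if ovmax >= 0.5
            res.crct = res.crct + 1;
        else
            res.lclz = res.lclz + 1;
        end
        gt(i).det(jmax) = true;
    end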
end

% compute results
res.ozrs = npos - res.crct - res.lclz;
res.npos = npos;
res.ratio = [res.crct/npos, res.lclz/npos, res.ozrs/npos];
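The counting rule can be tried on toy data with the Python sketch below. It is a simplified illustration, not a port of VOCevalloc: localization_breakdown is a hypothetical helper, it takes the best match (jmax, ovmax) of each detection as given instead of recomputing overlaps, and it indexes ground-truth boxes in one flat list instead of per image. It also assumes that difficult boxes are skipped, so that the three counts always sum to npos.

```python
def localization_breakdown(best_matches, difficult):
    """Count Correct / Localization / Other ground-truth boxes.

    best_matches: (jmax, ovmax) per detection, in decreasing-confidence order;
                  jmax is None when the detection overlaps nothing.
    difficult:    one flag per ground-truth box.
    """
    npos = sum(1 for d in difficult if not d)
    counted = [False] * len(difficult)
    crct = lclz = 0
    for jmax, ovmax in best_matches:
        # Assumption of this sketch: difficult boxes are never counted.
        if ovmax < 0.1 or difficult[jmax] or counted[jmax]:
            continue
        if ovmax >= 0.5:
            crct += 1
        else:
            lclz += 1
        counted[jmax] = True  # each ground-truth box is counted once
    ozrs = npos - crct - lclz
    return {"crct": crct, "lclz": lclz, "ozrs": ozrs, "npos": npos,
            "ratio": [crct / npos, lclz / npos, ozrs / npos]}


difficult = [False, False, False, True]
best_matches = [(0, 0.8), (0, 0.6), (1, 0.3), (1, 0.7), (3, 0.9), (None, float("-inf"))]
res = localization_breakdown(best_matches, difficult)
print(res)
```

In this toy input, box 0 is Correct, box 1 is Localization even though a lower-scoring detection later overlaps it with 0.7 (only the first match counts), box 2 is never matched and falls into Other, and box 3 is difficult and excluded.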