AprilTags图像识别定位实例

引言AprilTags图像识别与定位AprilTags是密歇根大学Edwin Olson教授及其实验室团队在2011年率先提出的一种标记识别算法，目前程序包已经开源，并且广泛运用到机器人和无人机的视觉导航当中。那么本文主要是以AprilTags程序包为基础，为大家做一个关于无人机实时位姿测定的实例演示。准备工具为了能够更方便地在自己的电脑上（以Win10为例）运行AprilTags实例，建...

文章共7,270字 · 阅读需要大约25分钟

一键AI生成摘要，助你高效阅读

问答

tju科学狂人

20643人浏览 · 2019-04-05 22:16:58

tju科学狂人 · 2019-04-05 22:16:58 发布

引言

AprilTags图像识别与定位
AprilTags是密歇根大学Edwin Olson教授及其实验室团队在2011年率先提出的一种标记识别算法，目前程序包已经开源，并且广泛运用到机器人和无人机的视觉导航当中。那么本文主要是以AprilTags程序包为基础，为大家做一个关于无人机实时位姿测定的实例演示。

准备工具

为了能够更方便地在自己的电脑上（以Win10为例）运行AprilTags实例，建议大家使用如下的操作环境和开发工具：

VMware Station 15 这是运行Linux系统的虚拟机环境，可自行在网页上搜索下载，安装完成将下面的ubuntu16.04系统写入虚拟机即可
Ubuntu16.04操作系统使用镜像源下载速度更快，中科大镜像源：http://mirrors.ustc.edu.cn/ubuntu-releases/16.04/
Apriltags程序包下载：https://github.com/AprilRobotics/apriltags
打开网页后直接下载所列出的全部文件和文档然后放入一个文件夹即可
在ubuntu系统中安装ROS操作系统，并且这里还需要下载两个ROS应用包：usb_cam作为图像视窗，usb_calibration作为相机标定
当然还要有opencv函数库（我用的版本很老了，opencv2.4.9或者以上版本应该都可以）

步骤1：摄像头标定

利用usb_calibration进行摄像头标定，具体操作可参考： https://blog.csdn.net/heyijia0327/article/details/43538695
这一步是准确定位的关键，实际上在视觉导航的位姿测定中，我们往往选取地面上的Tag标识作为世界坐标原点（即其坐标已知且固定)，而摄像头所测得的图像经过处理电路传输到计算机时会自动生成一个以计算机为标准的图像坐标数据，这个图像坐标系实际上正是由于无人机把地面的Tag进行了一系列姿态旋转和位置平移后所形成的，并且图像坐标和世界坐标都是已知的，因此只要我们找到联系这两个坐标系的变换矩阵T，自然就确定了无人机旋转和平移的信息（如下图所示）
在这里插入图片描述
看到这里先不要慌~，只要明白上文所讲的内容就可以理解这个变换关系。图中等式左边[u,v,1]代表计算机中的图像坐标，等式最右侧的[Xw，Yw，Zw]代表世界坐标，现在请看右数第二个这个4x4变换矩阵T，其中包含了一个3x3的旋转矩阵R（对应三个欧拉角）和一个3x1的平移矩阵P（对应三维坐标位置），这是我们要求出位姿的数据。注意：
在这里插入图片描述
将其转换为欧拉角形式为：

看到这大家就会很清楚的发现，我们还缺的就是dx、dy、v0、u0和焦距f这五个参数（Zc是无人机位置高度，可由超声波传感器单独测得），因此这就是我们做摄像头标定的目的，关于这5个参数的具体解释可参见：https://blog.csdn.net/zb1165048017/article/details/71104241
也可以自行查阅张正友标定法的原理。总之，只有获得这5个参数（实际在Apriltags中是被简化成了一个四元数矩阵）我们才可正确获得位姿数据。

重要的提示：如果用ROS时打不开摄像头，请先看操作2中的两个说明。

操作2：程序编译

首先在ubuntu系统下打开终端，切换至AprilTags目录下，输入下列指令：

$ make
$ sudo make install

这样就完成了初步的编译，但是现在只有tag识别功能，为了验证是否成功可以执行以下指令：

$ ./opencv_demo

不出意外，将会跳出摄像头图像，将tags标识移动进视野里后，会有高亮的彩色框标记出tags轮廓，并显示tags的识别数量。
在这里插入图片描述

注意如果你使用的是笔记本电脑内置摄像头，要注意这里的两个坑：
1.首先要把摄像头从主机切换到虚拟机下才可以运行
2.如果运行程序后发现笔记本摄像头指示灯亮起但是无法显示图像，这可能是因为USB兼容性不一致，进而导致虚拟机操作系统无法带动底层的硬件负载。此时要打开虚拟机设置中的USB控制器设置，如果原来是USB2.0，要改成USB3.0
如果原来是USB3.0，改成USB2.0即可。

操作3：程序改进

之前的操作只是完成了标识，接下来我们要返回无人机的位姿矩阵。打开Apriltags文件中example下的opencv_demo.cc文件，下面是运行的主程序。

#include <iostream>
#include "opencv2/opencv.hpp"
#include <fcntl.h>
#include <errno.h> 
#include <cmath>
#include <cstring>
#include <vector>
#include <list>
#include <eigen3/Eigen/Dense>

extern "C" {
#include "apriltag_pose.h"
#include "apriltag.h"
#include "tag36h11.h"
#include "tag25h9.h"
#include "tag16h5.h"
#include "tagCircle21h7.h"
#include "tagCircle49h12.h"
#include "tagCustom48h12.h"
#include "tagStandard41h12.h"
#include "tagStandard52h13.h"
#include "common/getopt.h"
#include "common/homography.h"
}
using namespace std;
using namespace cv;

#ifndef PI
const double PI = 3.14159265358979323846;
#endif
const double TWOPI = 2.0*PI;
/**
 * 定义了角度归一化函数，角度范围统一输出范围是[-pi,pi].
 **/
inline double standardRad(double t) {
  if (t >= 0.) {
    t = fmod(t+PI, TWOPI) - PI;
  } else {
    t = fmod(t-PI, -TWOPI) + PI;
  }
  return t;
}


    int main(int argc, char *argv[])
{
    getopt_t *getopt = getopt_create();

    getopt_add_bool(getopt, 'h', "help", 0, "Show this help");
    getopt_add_bool(getopt, 'd', "debug", 1, "Enable debugging output (slow)");
    getopt_add_bool(getopt, 'q', "quiet", 0, "Reduce output");
    getopt_add_string(getopt, 'f', "family", "tag36h11", "Tag family to use");
    getopt_add_int(getopt, 't', "threads", "1", "Use this many CPU threads");
    getopt_add_double(getopt, 'x', "decimate", "2.0", "Decimate input image by this factor");
    getopt_add_double(getopt, 'b', "blur", "0.0", "Apply low-pass blur to input");
    getopt_add_bool(getopt, '0', "refine-edges", 1, "Spend more time trying to align edges of tags");

    if (!getopt_parse(getopt, argc, argv, 1) ||
            getopt_get_bool(getopt, "help")) {
        printf("Usage: %s [options]\n", argv[0]);
        getopt_do_usage(getopt);
        exit(0);
    }
    // Initialize camera
    VideoCapture cap(0);
    if (!cap.isOpened()) {
        cerr << "Couldn't open video capture device" << endl;
        return -1;
    }
    // Initialize tag detector with options
    apriltag_family_t *tf = NULL;
    const char *famname = getopt_get_string(getopt, "family");
    if (!strcmp(famname, "tag36h11")) {
        tf = tag36h11_create();
    } else if (!strcmp(famname, "tag25h9")) {
        tf = tag25h9_create();
    } else if (!strcmp(famname, "tag16h5")) {
        tf = tag16h5_create();
    } else if (!strcmp(famname, "tagCircle21h7")) {
        tf = tagCircle21h7_create();
    } else if (!strcmp(famname, "tagCircle49h12")) {
        tf = tagCircle49h12_create();
    } else if (!strcmp(famname, "tagStandard41h12")) {
        tf = tagStandard41h12_create();
    } else if (!strcmp(famname, "tagStandard52h13")) {
        tf = tagStandard52h13_create();
    } else if (!strcmp(famname, "tagCustom48h12")) {
        tf = tagCustom48h12_create();
    } else {
        printf("Unrecognized tag family name. Use e.g. \"tag36h11\".\n");   //user does not set the family type of Tags 
        exit(-1);                                                           //default type:  tf = tag36h11_create()  
    }                   


    apriltag_detector_t *td = apriltag_detector_create();              //set some parameters
    apriltag_detector_add_family(td, tf);                              //set parttern recongnize of the specific tag family
    td->quad_decimate = getopt_get_double(getopt, "decimate");       
    td->quad_sigma = getopt_get_double(getopt, "blur");
    td->nthreads = getopt_get_int(getopt, "threads");
    td->debug = getopt_get_bool(getopt, "debug");
    td->refine_edges = getopt_get_bool(getopt, "refine-edges");
    
    Mat frame, gray;
    
    /***********输入标定的相机参数*************/
    apriltag_detection_info_t info;     // parameters of the camera calibrations 在这里把标定得到的四元参数输入到程序里
     info.tagsize = 0.056; //标识的实际尺寸
     info.fx = 652.894;
     info.fy = 651.487;
     info.cx = 301.857;
     info.cy = 237.548;

    while (true) {
        cap >> frame;
        cvtColor(frame, gray, COLOR_BGR2GRAY);

        // Make an image_u8_t header for the Mat data , and turn the gray image into the candidate of the tags waiting to be recongnized(&im)
        image_u8_t im = { .width = gray.cols,        
            .height = gray.rows,
            .stride = gray.cols,
            .buf = gray.data
        };                          

        zarray_t *detections = apriltag_detector_detect(td, &im);    //parttern recongnize to start
        cout << zarray_size(detections) << " tags detected" << endl;

        // Draw detection outlines
        for (int i = 0; i < zarray_size(detections); i++) {
            apriltag_detection_t *det;                       //
            zarray_get(detections, i, &det);
            line(frame, Point(det->p[0][0], det->p[0][1]),
                     Point(det->p[1][0], det->p[1][1]),
                     Scalar(0, 0xff, 0), 2);
            line(frame, Point(det->p[0][0], det->p[0][1]),
                     Point(det->p[3][0], det->p[3][1]),
                     Scalar(0, 0, 0xff), 2);
            line(frame, Point(det->p[1][0], det->p[1][1]),
                     Point(det->p[2][0], det->p[2][1]),
                     Scalar(0xff, 0, 0), 2);
            line(frame, Point(det->p[2][0], det->p[2][1]),
                     Point(det->p[3][0], det->p[3][1]),
                     Scalar(0xff, 0, 0), 2);
            
            stringstream ss;
            ss << det->id;
            String text = ss.str();
            
         
           /*********加入位姿估计函数**********/
            info.det = det;  //det-->info
			apriltag_pose_t pose;
            estimate_pose_for_tag_homography(&info, &pose);
			//double err = estimate_tag_pose(&info, &pose);
         /**********将欧拉角转换成度为坐标，并归一化到[-pi,pi]范围内 ********/     
      double yaw = 180*standardRad(atan2(pose.R->data[3], pose.R->data[0]));
      double pitch=180*standardRad(sin(pose.R->data[6]));
      double roll=180*standardRad(atan2(pose.R->data[7],pose.R->data[8]));
             yaw=yaw/PI;
             pitch=pitch/PI;
             roll=roll/PI;    
			cout<<"THE 3D POSE:"<<"x= "<<pose.t->data[0]<<"  "<<"y= "<<pose.t->data[1]<<" "<<"z= "<<pose.t->data[2]<<endl;  //t output
            /************输出三维位置坐标信息***************/ 
                   cout<<"yaw: "<<yaw<<"'"<<endl;
                   cout<<"pitch: "<<pitch<<"'"<<endl;
                   cout<<"roll: "<<roll<<"'"<<endl;    
             /*************输出3个欧拉角数据****************/
        
		
		  int fontface = FONT_HERSHEY_SCRIPT_SIMPLEX;
          double fontscale = 1.0;
          int baseline;
          Size textsize = getTextSize(text, fontface, fontscale, 2,
                                            &baseline);
            putText(frame, text, Point(det->c[0]-textsize.width/2,
                                       det->c[1]+textsize.height/2),
                    fontface, fontscale, Scalar(0xff, 0x99, 0), 2);
        }
        zarray_destroy(detections);
        imshow("Tag Detections", frame);
        if (waitKey(25) >= 0)
            break;}
    apriltag_detector_destroy(td);

    if (!strcmp(famname, "tag36h11")) {
        tag36h11_destroy(tf);
    } else if (!strcmp(famname, "tag25h9")) {
        tag25h9_destroy(tf);
    } else if (!strcmp(famname, "tag16h5")) {
        tag16h5_destroy(tf);
    } else if (!strcmp(famname, "tagCircle21h7")) {
        tagCircle21h7_destroy(tf);
    } else if (!strcmp(famname, "tagCircle49h12")) {
        tagCircle49h12_destroy(tf);
    } else if (!strcmp(famname, "tagStandard41h12")) {
        tagStandard41h12_destroy(tf);
    } else if (!strcmp(famname, "tagStandard52h13")) {
        tagStandard52h13_destroy(tf);
    } else if (!strcmp(famname, "tagCustom48h12")) {
        tagCustom48h12_destroy(tf);
    }
    getopt_destroy(getopt);
    return 0;
}

实际上只需要在原来程序中调用已有的位姿估计函数即可，改动的地方是图中汉字注释标明的地方。如果想要深入了解这一算法的程序代码，可打开源文件apriltag_pose.c和homography.c。这里还要说明一点的是，上文程序使用的是homography位姿算法，Apriltags中还给出了另一种算法orthogonal iteration。但是在运行此函数时由于要返回一个double类型的参数err，可能是指针调用问题会导致抛出核心转储（core dumped）异常，由于时间原因笔者还没有来得及解决这一问题。如果大家有解决的方法，欢迎及时指出。

我们得到的结果如下：
在这里插入图片描述

其中x，y，z的位置坐标单位为米，3个欧拉角单位均为度。本例中世界坐标的参照原点是笔记本电脑的内置摄像头。

图像识别部分

图像检测部分涉及到了大量计算机视觉和机器学习相关的内容，由于篇幅限制无法展开说明了，其大致分为以下几个步骤：
1.线段聚类（clustering）
在这里插入图片描述
2.图像分割（Segementation）

在这里插入图片描述
3.深度优先搜索（深度为四的DFS确定四边形）

4关键角点阈值筛选（Thresh）

最终可确定是否为Tags标识。

总结补充

以上操作是在笔记本电脑的虚拟机上完成的，我们完全可以将其移植到树莓派上，利用树莓派作为图像处理的主机，并与无人机飞控（Pixhawk）通过串口通信连接，把位姿数据实时传入PID控制程序后由此产生控制信号传输到飞控中，最终完成无人机的自主导航飞行控制。另外若要搭建地面站，则较为复杂一些，还需要无线蓝牙模块以及图传和数传技术的支持。

无人机视觉导航目前仍然是一个热点和难点问题，实际上SLAM技术在这一方面应用的更为成熟广泛，大家有兴趣可以继续学习。而这里用到的Apriltags虽然只是一个程序包，但其中包含了大量计算机视觉和移动机器人技术的综合应用，本文程序框架源自于密歇根大学 April Lab实验室的研究团队，同时也参考了麻省理工学院CSAIL实验室给出的Tags Application工程实例，因此自己也可以说是有幸站在巨人的肩膀上来欣赏这些计算机视觉领域的杰作了。

最后还是要说，本人的学术能力还尚有限，文章中如有不妥之处欢迎各位读者批评与纠正。