A Survey of Camera Models (Pinhole + Fisheye)

[TOC]

Overview

Lens Projections

  • Perspective and fisheye imaging process

Optics: Terminology

  • Dioptric: All elements are refractive (lenses)
  • Catoptric: All elements are reflective (mirrors)
  • Catadioptric: Elements are refractive and reflective (mirrors + lenses)

Camera Models

Papers

  • Straight Lines Have to be Straight: Automatic Calibration and Removal of Distortion from Scenes of Structured Environments
  • A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses
  • Single View Point Omnidirectional Camera Calibration from Planar Grids
  • The Double Sphere Camera Model
  • Supported models in Kalibr

  • Distortion Models in ROS (distortion_models.h)
    • plumb_bob: a 5-parameter polynomial approximation of radial and tangential distortion
    • rational_polynomial: an 8-parameter rational polynomial distortion model
    • equidistant (lunar)
  • hengli/camodocal
    • Pinhole camera model
    • Unified projection model
    • Equidistant fish-eye model
  • uzh-rpg/rpg_vikit/vikit_common: supports pinhole, ATAN, and omni camera models

  • ethz-asl/aslam_cv2: supports pinhole and unified projection models, and radtan, fisheye, and equidistant distortion

  • cggos/okvis_cg: supports the pinhole projection model, and radtan and equidistant distortion

  • ptam_cg/src/ATANCamera.cc: supports the ATAN camera model

Pinhole camera

  • pinhole model (rectilinear projection model) + radial-tangential distortion

The Pinhole camera model is the most common camera model for consumer cameras. In this model, the image is mapped onto a plane through perspective projection. The projection is defined by the camera intrinsic parameters such as focal length, principal point, aspect ratio, and skew.

Fisheye camera

OpenCV fisheye camera model

  • pinhole model (rectilinear projection model) + fisheye distortion

The Fisheye camera model is a camera model used for wide field of view cameras. This camera model is necessary because the pinhole perspective camera model cannot model image projections as the field of view approaches 180 degrees.

Given a point $ X=[x_c \quad y_c \quad z_c] $ on the camera's $z_c=1$ plane in camera coordinates, the pinhole projection is:

\[\begin{cases} r = \sqrt{x_c^2 + y_c^2} \\ \theta = \operatorname{atan2}(r, |z_c|) = \operatorname{atan2}(r, 1) = \arctan(r) \end{cases}\]

Equivalently, in terms of the focal length,

\[r' = f \cdot \tan(\theta) \quad \text{where} \quad r' = \sqrt{u^2 + v^2}\]

The fisheye distortion is:

\[\theta_d = \theta (1 + k_1 \theta^2 + k_2 \theta^4 + k_3 \theta^6 + k_4 \theta^8)\]

The distorted point coordinates are

\[\begin{cases} x_d = \frac{\theta_d \cdot x_c} {r} \\ y_d = \frac{\theta_d \cdot y_c} {r} \end{cases}\]

Converting to pixel coordinates, the final pixel coordinates are

\[\begin{cases} u = f_x (x_d + \alpha y_d) + c_x \\ v = f_y \cdot y_d + c_y \end{cases}\]

or, in matrix form,

\[\begin{aligned} \left[\begin{array}{c}u\\v\\1\end{array}\right] = \left[\begin{array}{ccc} f_x & \alpha & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{array}\right] \left[\begin{array}{c}x_d\\y_d\\1\end{array}\right] \end{aligned}\]
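The projection chain above can be sketched in a few lines of NumPy. This is a minimal illustration of the equations here, not the OpenCV implementation; the intrinsics and distortion coefficients in the example are placeholder values.

```python
import numpy as np

def fisheye_project(P, k, fx, fy, cx, cy, alpha=0.0):
    """Project a 3D camera-frame point with the fisheye model above."""
    x, y, z = P
    xc, yc = x / z, y / z                    # normalize onto the z = 1 plane
    r = np.hypot(xc, yc)
    theta = np.arctan(r)                     # angle to the optical axis
    k1, k2, k3, k4 = k
    theta_d = theta * (1 + k1*theta**2 + k2*theta**4 + k3*theta**6 + k4*theta**8)
    scale = theta_d / r if r > 1e-12 else 1.0
    xd, yd = scale * xc, scale * yc          # distorted normalized coordinates
    u = fx * (xd + alpha * yd) + cx          # apply intrinsics (alpha = skew)
    v = fy * yd + cy
    return u, v

# with zero distortion, a point on the optical axis maps to the principal point
u, v = fisheye_project((0.0, 0.0, 1.0), (0, 0, 0, 0), fx=400, fy=400, cx=320, cy=240)
```

Note how `theta_d / r` rescales the normalized coordinates, exactly as in the $x_d, y_d$ equations above.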

ATAN model

This is an alternative representation for camera models with large radial distortion (such as fisheye cameras), where the distance between an image point and the principal point is roughly proportional to the angle between the 3D point and the optical axis. This camera model was first proposed in Straight Lines Have to be Straight: Automatic Calibration and Removal of Distortion from Scenes of Structured Environments.

Given a point $ X=[x_c \quad y_c \quad z_c] $ on the camera's $z_c=1$ plane in camera coordinates, the pinhole projection is:

\[r = \sqrt{\frac{x_c^2 + y_c^2}{z_c^2}} = \sqrt{x_c^2 + y_c^2}\]

FOV distortion:

\[r_d = \frac{1}{\omega}\arctan\left( 2 \cdot r \cdot \tan\left(\frac{\omega}{2}\right) \right)\]

where $\omega$ is the FOV distortion coefficient.

The distorted point coordinates are

\[\begin{cases} x_d = \frac{r_d}{r} \cdot x_c \\ y_d = \frac{r_d}{r} \cdot y_c \end{cases}\]

Converting to pixel coordinates, the final pixel coordinates are

\[\begin{aligned} \left[\begin{array}{c}u\\v\\1\end{array}\right] = \left[\begin{array}{ccc} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{array}\right] \left[\begin{array}{c}x_d\\y_d\\1\end{array}\right] \end{aligned}\]
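The FOV distortion above and its inverse can be sketched as follows (a minimal illustration; the value of $\omega$ is arbitrary). The inverse comes from solving the $r_d$ equation for $r$.

```python
import numpy as np

def fov_distort(r, omega):
    """FOV (ATAN) model: distorted radius r_d for undistorted radius r."""
    return np.arctan(2.0 * r * np.tan(omega / 2.0)) / omega

def fov_undistort(rd, omega):
    """Inverse mapping: recover the undistorted radius from r_d."""
    return np.tan(rd * omega) / (2.0 * np.tan(omega / 2.0))

omega = 0.9          # illustrative FOV coefficient
r = 0.8
rd = fov_distort(r, omega)
```

A round trip through `fov_distort` and `fov_undistort` recovers the original radius, which is a quick sanity check on the algebra.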

Equidistant fish-eye model

Omnidirectional Camera

Read More

Installing and Using PL-SLAM (Stereo) on Ubuntu 16.04

[TOC]

Overview

The code in rubengooj/pl-slam implements stereo visual SLAM using both point and line segment features.

  • Related Publication

    @article{gomez2017pl,
      title   = {PL-SLAM: a Stereo SLAM System through the Combination of Points and Line Segments},
      author  = {Gomez-Ojeda, Ruben and Zuñiga-Noël, David and Moreno, Francisco-Angel and Scaramuzza, Davide and Gonzalez-Jimenez, Javier},
      journal = {arXiv preprint arXiv:1705.09479},
      year    = {2017}
    }

Prerequisites and Dependencies

  • Basics

    sudo apt install build-essential pkg-config libboost-dev \
    libsuitesparse-dev libeigen3-dev libyaml-cpp-dev
    
  • OpenCV 3.x.x
    • I installed OpenCV 3.3.1 along with ros-kinetic
  • G2O
    • recommend version: commit id ff647bd (ff647bd7537860a2b53b3b774ea821a3170feb13)
  • MRPT/mrpt: The Mobile Robot Programming Toolkit
    • recommend version: commit id 0c3d605 (0c3d605c3cbf5f2ffb8137089e43ebdae5a55de3)
    git clone https://github.com/MRPT/mrpt.git
    cd mrpt
    git branch cg_0c3d605 0c3d605c3cbf5f2ffb8137089e43ebdae5a55de3
    git checkout cg_0c3d605
    
    # install dependencies
    sudo apt install libdc1394-22-dev libjpeg-dev libftdi-dev freeglut3-dev \
    libwxgtk3.0-dev zlib1g-dev libusb-1.0-0-dev libudev-dev libfreenect-dev \
    libavformat-dev libswscale-dev libassimp-dev libgtest-dev libpcap-dev
    
    # build & install
    mkdir build && cd build
    cmake .. && make -j4
    sudo make install
    
  • rubengooj/stvo-pl: Stereo Visual Odometry by combining point and line segment features

    git clone https://github.com/rubengooj/stvo-pl.git
    cd stvo-pl
    chmod +x build.sh
    ./build.sh
    

Note: it's best to keep mrpt, stvo-pl, and pl-slam in the same directory

Build

Build pl-slam

git clone https://github.com/rubengooj/pl-slam.git
chmod +x build.sh
./build.sh

Errors

  • Q: /usr/bin/ld: cannot find -lg2o_ext_csparse
    A: sudo ln -sv libg2o_csparse_extension.so libg2o_ext_csparse.so (run in the directory containing the g2o libraries, e.g. /usr/local/lib)

Run

Dataset

Kitti data_odometry_gray

  • edit ~/.bashrc, and
    add export DATASETS_DIR=<path-to-data_odometry_gray>/sequences
  • copy pl-slam/config/dataset_params/kitti00-02.yaml
    to <path-to-data_odometry_gray>/sequences/00/,
    rename the yaml file to dataset_params.yaml and change it if necessary
  • source ~/.bashrc
  • edit pl-slam/config/config/config_kitti.yaml, change the value of vocabulary_p and vocabulary_l
  • run
    ./plslam_dataset 00 -c ../config/config/config_kitti.yaml -o 100 -s 1 -n 1000
    or
    ./plslam_dataset 00 -c ../config/config/config_kitti.yaml -o 100 -s 1

Result

EuRoC MH_01_easy

  • edit ~/.bashrc, and add export DATASETS_DIR=<path-to-MH_01_easy>
  • copy pl-slam/config/dataset_params/euroc_params.yaml to <path-to-MH_01_easy>/mav0/,
    rename the yaml file to dataset_params.yaml and change it if necessary
  • source ~/.bashrc
  • edit pl-slam/config/config/config_euroc.yaml, change the value of vocabulary_p and vocabulary_l
  • run ./plslam_dataset mav0 -c ../config/config/config_euroc.yaml -o 100 -s 1

Run Errors

  • the app crashed with the following error when restarted after being closed with Ctrl+C

    DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument Assuming 131072kB available aperture size. May lead to reduced performance or incorrect rendering. get chip id failed: -1 [22] param: 4, val: 0 Segmentation fault (core dumped)

    this was fixed by reinstalling the NVIDIA driver

  • the app crashed with the error Segmentation fault (core dumped) after processing Frame #1600 of the KITTI data_odometry_gray dataset; this remains unsolved

Read More

Image Analysis: Image Features

[TOC]

Overview

Types of Image Feature:

  • Edges
  • Corners (also known as interest points)
  • Blobs (also known as regions of interest)

Image Corners/Keypoints

Keypoints Structure (from OpenCV):

  • pt: x & y coordinates of the keypoint
  • size: keypoint diameter
  • angle: keypoint orientation
  • response: keypoint detector response on the keypoint (that is, strength of the keypoint)
  • octave: pyramid octave in which the keypoint has been detected
  • class_id: object id

feature detector + feature descriptor

Harris

Define the auto-correlation (SSD) surface, i.e. the weighted sum of squared differences:

\[\begin{aligned} E_{AC}(\Delta \mathbf{u}) &= \sum_{i} \omega(\mathbf{x}_i) [ \mathbf{I}_0 ( \mathbf{x}_i + \Delta \mathbf{u} ) - \mathbf{I}_0 (\mathbf{x}_i) ]^2 \\ &\approx \sum_{i} \omega(\mathbf{x}_i) [ \mathbf{I}_0(\mathbf{x}_i) + \nabla \mathbf{I}_0(\mathbf{x}_i) \cdot \Delta\mathbf{u} - \mathbf{I}_0(\mathbf{x}_i) ]^2 \\ &= \sum_{i} \omega(\mathbf{x}_i) [ \nabla \mathbf{I}_0(\mathbf{x}_i) \cdot \Delta\mathbf{u} ]^2 \\ &= \Delta\mathbf{u}^T \cdot \mathbf{M} \cdot \Delta\mathbf{u} \end{aligned}\]

where

\[\nabla \mathbf{I}_0(\mathbf{x}_i) = ( \frac{\partial{\mathbf{I}_0}}{\partial{x}}, \frac{\partial{\mathbf{I}_0}}{\partial{y}} ) (\mathbf{x}_i)\]

written simply as the gradient

\[\nabla \mathbf{I} = [\mathbf{I}_x, \mathbf{I}_y]\]

and the auto-correlation matrix with the weighting kernel $\omega$

\[\mathbf{M} = \omega * \begin{bmatrix} \mathbf{I}_x^2 & \mathbf{I}_x\mathbf{I}_y \\ \mathbf{I}_x\mathbf{I}_y & \mathbf{I}_y^2 \end{bmatrix}\]

Then a score is defined, which determines whether a window contains a corner:

\[R = det(\mathbf{M}) - k (trace(\mathbf{M}))^2\]

where

\[det(\mathbf{M}) = \lambda_1 \lambda_2\] \[trace(\mathbf{M}) = \lambda_1 + \lambda_2\]

where $\lambda_1$ and $\lambda_2$ are the eigenvalues of $\mathbf{M}$, which can be computed from

\[det(\lambda E - M) = 0\]

The magnitudes of these eigenvalues decide whether a region is a corner, an edge, or flat:

  • When $ |R| $ is small, which happens when $\lambda_1$ and $\lambda_2$ are small, the region is flat.
  • When $R<0$, which happens when $\lambda_1 \gg \lambda_2$ or vice versa, the region is an edge.
  • When $R$ is large, which happens when $\lambda_1$ and $\lambda_2$ are large and $\lambda_1 \sim \lambda_2$, the region is a corner.
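The response $R = \det(\mathbf{M}) - k\,(\operatorname{trace}(\mathbf{M}))^2$ can be computed over a whole image with a few lines of NumPy. This is a minimal sketch using a uniform window as the weight $\omega$; OpenCV's `cornerHarris` uses Sobel gradients and a configurable block size instead.

```python
import numpy as np

def box_sum(a, win):
    """Sum of a over each win x win neighborhood (valid region only)."""
    h, w = a.shape
    out = np.zeros((h - win + 1, w - win + 1))
    for dy in range(win):
        for dx in range(win):
            out += a[dy:dy + h - win + 1, dx:dx + w - win + 1]
    return out

def harris_response(img, k=0.04, win=3):
    """R = det(M) - k * trace(M)^2 with a uniform weighting window."""
    Iy, Ix = np.gradient(img.astype(np.float64))
    Sxx = box_sum(Ix * Ix, win)          # windowed sums of the gradient products
    Syy = box_sum(Iy * Iy, win)
    Sxy = box_sum(Ix * Iy, win)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# a white square on black: corners give R > 0, edge midpoints give R < 0
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
```

The sign pattern of `R` on this toy image matches the three cases listed above.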

Shi-Tomasi

  • cv::goodFeaturesToTrack

The Shi-Tomasi corner detector is based entirely on the Harris corner detector. However, one slight variation in a “selection criteria” made this detector much better than the original. It works quite well where even the Harris corner detector fails.

Later in 1994, J. Shi and C. Tomasi made a small modification to it in their paper Good Features to Track which shows better results compared to Harris Corner Detector.

The scoring function in Harris Corner Detector was given by (Harris corner strength):

\[\mathbf{R} = \lambda_1 \lambda_2 - k(\lambda_1 + \lambda_2)^2\]

Instead of this, Shi-Tomasi proposed (take the minimum eigenvalue):

\[R=min(\lambda_1,\lambda_2)\]

If $R$ exceeds a predefined threshold, the point is marked as a corner.
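For a symmetric $2 \times 2$ matrix $\mathbf{M}$, the minimum eigenvalue used by this score has a closed form, so no eigen-decomposition is needed. A minimal sketch (the numeric inputs are illustrative window sums, not taken from a real image):

```python
import numpy as np

def shi_tomasi_score(Sxx, Syy, Sxy):
    """Minimum eigenvalue of M = [[Sxx, Sxy], [Sxy, Syy]] in closed form."""
    half_trace = 0.5 * (Sxx + Syy)
    return half_trace - np.sqrt((0.5 * (Sxx - Syy)) ** 2 + Sxy ** 2)

corner = shi_tomasi_score(10.0, 10.0, 0.0)   # eigenvalues 10 and 10 -> strong
edge = shi_tomasi_score(10.0, 0.0, 0.0)      # eigenvalues 10 and 0  -> weak
```

This is the criterion behind `cv::goodFeaturesToTrack`: both eigenvalues must be large for the score to be large.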

FAST

FAST (Features from Accelerated Segment Test) algorithm was proposed by Edward Rosten and Tom Drummond in their paper “Machine learning for high-speed corner detection” in 2006 (Later revised it in 2010).

Feature Detection

Detect locations in the image where the local pixel intensity changes markedly.

  • Select a pixel $p$ in the image and denote its intensity by $I_p$;
  • Set a threshold $T$;
  • Centered on $p$, take the 16 pixels lying on a Bresenham circle of radius 3;
  • If the circle contains $N$ contiguous pixels whose intensities are all greater than $I_p+T$ or all less than $I_p-T$, then $p$ can be considered a keypoint ($N$ is usually 12, giving FAST-12; other common values are 9 and 11, called FAST-9 and FAST-11);
  • Repeat the four steps above for every pixel.
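The segment test described above can be sketched in pure NumPy (a minimal version; real implementations add the high-speed pretest and a machine-learned pixel ordering):

```python
import numpy as np

# offsets of the 16 pixels on a radius-3 Bresenham circle, clockwise from the top
CIRCLE = [(0,-3),(1,-3),(2,-2),(3,-1),(3,0),(3,1),(2,2),(1,3),
          (0,3),(-1,3),(-2,2),(-3,1),(-3,0),(-3,-1),(-2,-2),(-1,-3)]

def is_fast_corner(img, x, y, T, N=12):
    """FAST-N test: N contiguous circle pixels all brighter than Ip+T
    or all darker than Ip-T (contiguity checked on the wrapped circle)."""
    Ip = float(img[y, x])
    vals = np.array([float(img[y + dy, x + dx]) for dx, dy in CIRCLE])
    for mask in (vals > Ip + T, vals < Ip - T):
        flags = np.concatenate([mask, mask])   # duplicate to handle wrap-around
        run, best = 0, 0
        for f in flags:
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= N:
            return True
    return False

# a bright dot on a dark background is trivially a corner
img = np.zeros((9, 9))
img[4, 4] = 255.0
```

`is_fast_corner(img, 4, 4, T=50)` returns True here, since all 16 circle pixels are darker than $I_p - T$.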

Non-maximal Suppression

FAST corners often come in clusters. Non-maximal suppression keeps only the corner with the maximal response within each local region, avoiding clusters of corners.

SIFT

In 2004, D. Lowe, University of British Columbia, came up with a new algorithm, Scale Invariant Feature Transform (SIFT), in his paper Distinctive Image Features from Scale-Invariant Keypoints, which extracts keypoints and computes their descriptors.

The algorithm offers a degree of affine, viewpoint, rotation, and illumination invariance, so it has become the most widely used method for image feature extraction.

SURF

In 2006, three people, Bay, H., Tuytelaars, T. and Van Gool, L., published another paper, "SURF: Speeded Up Robust Features", which introduced a new algorithm called SURF. As the name suggests, it is a speeded-up version of SIFT.

In 2006, Bay, Ess, et al., building on the ideas of SIFT, proposed Speeded-Up Robust Features (SURF). The algorithm targets SIFT's main drawbacks, slow speed and heavy computation, by extracting keypoints with approximate Haar wavelets, a blob detection method based on the determinant of the Hessian (DoH). Using integral images, the approximate Haar wavelet responses can be computed efficiently at different scales, which simplifies the construction of the second-order derivative templates and speeds up feature detection across scale space.

SURF uses box filters on the integral image to approximate the second-order derivative templates when building the entries of the Hessian matrix, shortening feature extraction time and improving efficiency.

BRIEF Descriptor

  • BRIEF: Binary Robust Independent Elementary Features

Select a number of pixel pairs in the neighborhood of a keypoint, compare the intensities of each pair, and concatenate the comparison results into a binary string that describes the keypoint. Descriptors are then matched using the Hamming distance.

BRISK

For keypoint detection, BRISK does not use FAST but the more stable AGAST algorithm. For the descriptor, like BRIEF, BRISK compares simple pixel intensity values to obtain a cascaded binary bit string describing each keypoint. BRISK uses a neighborhood sampling pattern: discretized Bresenham concentric circles of different radii centered on the keypoint, with N equally spaced sample points on each circle.

ORB

As an OpenCV enthusiast, the most important thing about ORB (Oriented FAST and Rotated BRIEF) is that it came from "OpenCV Labs". The algorithm was introduced by Ethan Rublee, Vincent Rabaud, Kurt Konolige and Gary R. Bradski in their paper ORB: An efficient alternative to SIFT or SURF in 2011. As the title says, it is a good alternative to SIFT and SURF in computation cost, matching performance and, mainly, patents: SIFT and SURF are patented and you are supposed to pay to use them, but ORB is not.

ORB is basically a fusion of FAST keypoint detector and BRIEF descriptor with many modifications to enhance the performance.

oriented FAST

  • use FAST to find keypoints, then apply the Harris corner measure to keep the top N points among them

  • multiscale
    • use a pyramid to produce multiscale features
  • rotation invariance (orientation): compute the intensity-weighted centroid of the patch with the detected corner at its center; the direction of the vector from the corner to the centroid gives the orientation. To improve rotation invariance, the moments are computed with x and y inside a circular region of radius r, where r is the size of the patch. The rotation is computed as follows:

    • In a small image patch B, define the first-order moments of the patch as: \(M_{pq} = \sum_{x,y \in B} x^p y^q I(x,y), \quad p,q \in \{ 0,1\}\)

    • The centroid of the patch is found from the moments: \(C = \bigg( \frac{M_{10}}{M_{00}}, \frac{M_{01}}{M_{00}} \bigg)\)

    • Connecting the geometric center $O$ to the centroid $C$ gives the direction vector $\vec{OC}$, so the keypoint orientation is defined as: \(\theta = \arctan\left( \frac{M_{01}}{M_{10}} \right)\)
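The intensity-centroid orientation above fits in a few lines of NumPy. A minimal sketch (patch size and contents are illustrative; a real implementation would restrict the moments to a circular region):

```python
import numpy as np

def patch_orientation(patch):
    """Orientation of a patch from its intensity centroid (oriented FAST)."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    xs -= (w - 1) / 2.0                     # coordinates relative to the
    ys -= (h - 1) / 2.0                     # geometric center O
    m10 = np.sum(xs * patch)                # first-order moments
    m01 = np.sum(ys * patch)
    return np.arctan2(m01, m10)             # theta = atan2(M01, M10)

# all the mass on the +x side of the center -> orientation ~ 0
patch = np.zeros((7, 7))
patch[3, 6] = 1.0
theta = patch_orientation(patch)
```

Using `atan2` rather than plain `arctan` keeps the full $[-\pi, \pi]$ range of orientations.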

rotated BRIEF

  • Binary Robust Independent Elementary Features

FREAK

  • Fast Retina KeyPoint

根据视网膜原理进行点对采样,中间密集一些,离中心越远越稀疏。并且由粗到精构建描述子,穷举贪婪搜索找相关性小的。42个感受野,一千对点的组合,找前512个即可。这512个分成4组,前128对相关性更小,可以代表粗的信息,后面越来越精。匹配的时候可以先看前16bytes,即代表精信息的部分,如果距离小于某个阈值,再继续,否则就不用往下看了。

SubPixel Corners

  • Subpixel Corners: Increasing accuracy
  • Use the OpenCV function cornerSubPix to find more exact corner positions (more exact than integer pixels)

The subpixel corner $\mathbf{q}$ is solved from the condition that the two vectors are perpendicular, so their dot product is zero:

\[<\nabla \mathbf{I}(\mathbf{p}_i), \mathbf{q} - \mathbf{p}_i> = 0\]
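Stacking the orthogonality condition over all neighborhood points $\mathbf{p}_i$ gives the linear system $\sum_i \nabla \mathbf{I}_i \nabla \mathbf{I}_i^T \, \mathbf{q} = \sum_i \nabla \mathbf{I}_i \nabla \mathbf{I}_i^T \, \mathbf{p}_i$, which is what `cornerSubPix` iterates on. A minimal sketch with hand-picked gradients (all values illustrative):

```python
import numpy as np

def refine_corner(grads, points):
    """Solve sum_i (g_i g_i^T) q = sum_i (g_i g_i^T) p_i for the corner q,
    stacking the conditions <grad I(p_i), q - p_i> = 0."""
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for g, p in zip(grads, points):
        G = np.outer(g, g)          # g_i g_i^T
        A += G
        b += G @ p
    return np.linalg.solve(A, b)

# two edge points: one constrains q_x = 1, the other q_y = 1
grads = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
points = [np.array([1.0, 5.0]), np.array([7.0, 1.0])]
q = refine_corner(grads, points)
```

Each edge point only constrains $\mathbf{q}$ along its gradient direction; two independent directions pin the corner down.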

Image Pyramid

Use a pyramid to produce multiscale features.

  • Mean pyramid: 2*2 neighborhood mean filtering
  • Gaussian pyramid: downsample the image (4 levels) with a 5*5 Gaussian kernel
  • Laplacian pyramid
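The mean pyramid in the first bullet can be sketched directly with array slicing (a minimal version; OpenCV's `pyrDown` uses a 5x5 Gaussian kernel instead of a plain 2x2 average):

```python
import numpy as np

def mean_pyramid(img, levels):
    """Mean pyramid: each level is the 2x2-block average of the previous one."""
    pyr = [img.astype(np.float64)]
    for _ in range(levels - 1):
        a = pyr[-1]
        h, w = a.shape[0] // 2 * 2, a.shape[1] // 2 * 2
        a = a[:h, :w]                        # crop to even dimensions
        down = 0.25 * (a[0::2, 0::2] + a[1::2, 0::2] +
                       a[0::2, 1::2] + a[1::2, 1::2])
        pyr.append(down)
    return pyr

pyr = mean_pyramid(np.ones((8, 8)), 3)       # shapes: 8x8, 4x4, 2x2
```

A constant image stays constant at every level, which is a quick correctness check for the averaging.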

Image Edges

Line

  • Orthonormal representation
  • Plücker coordinate representation

Image Blobs

Planar

Read More

Image Analysis: Image Feature Matching

[TOC]

Similarity Measures

  • SSD (Sum of Squared Differences)
\[{D(I_A,I_B)}_{SSD} = \sum_{x,y}[{I_A}(x,y)-{I_B}(x,y)]^2\]
  • SAD (Sum of Absolute Difference)
\[{D(I_A,I_B)}_{SAD} = \sum_{x,y} | {I_A}(x,y)-{I_B}(x,y) |\]
  • NCC (Normalized Cross Correlation)
\[{D(I_A,I_B)}_{NCC} = \frac { \sum_{x,y} {I_A}(x,y) {I_B}(x,y) } { \sqrt { \sum_{x,y} {I_A}(x,y)^2 \sum_{x,y} {I_B}(x,y)^2 } }\]
  • Zero-mean versions of the above

  • Hamming distance
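The measures above, plus a zero-mean NCC variant, in a few lines of NumPy (a minimal sketch; the patches are toy values):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical patches."""
    return np.sum((a - b) ** 2)

def sad(a, b):
    """Sum of absolute differences: 0 for identical patches."""
    return np.sum(np.abs(a - b))

def ncc(a, b):
    """Normalized cross correlation: 1 for identical patches."""
    return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))

def zncc(a, b):
    """Zero-mean NCC: subtract patch means first, giving some
    invariance to additive brightness changes."""
    return ncc(a - a.mean(), b - b.mean())

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = a + 10.0          # same pattern, uniformly brighter
```

Note that `ncc(a, b)` drops below 1 for the brightened patch while `zncc(a, b)` stays at 1, which is the point of the zero-mean versions.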

Block Matching

Epipolar Search

In epipolar geometry, the epipolar search narrows the search range to the epipolar line, reducing computation.

Affine Transformation

Basic Steps

  • Given images I1 and I2, with corners p1i and p2j respectively, find the corner in I2 corresponding to each corner in I1;
  • Centered at corner p1i, extract a 9*9 pixel block from I1 as the template image T1i;
  • In I2, find all corners p2jm (m<=n, where n is the number of corners in the range) within a 20*20 pixel window centered at p1i;
  • For each candidate p2jm, extract a 9*9 pixel block from I2 centered at p2jm and compute the SSD between that block and T1i;
  • The corner p2jm with the smallest SSD is the match in I2 for corner p1i in I1;
  • Repeat the five steps above to find all matching corners in I2 for I1.
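The five steps above can be sketched as follows (function and variable names are illustrative; the shifted random image stands in for a real stereo pair):

```python
import numpy as np

def match_corner(img1, img2, p1, corners2, half=4, radius=10):
    """For corner p1 in img1, find the corner in corners2 within `radius`
    of p1 whose surrounding block minimizes the SSD to p1's template."""
    x1, y1 = p1
    tmpl = img1[y1 - half:y1 + half + 1, x1 - half:x1 + half + 1].astype(np.float64)
    best, best_ssd = None, np.inf
    for (x2, y2) in corners2:
        if abs(x2 - x1) > radius or abs(y2 - y1) > radius:
            continue                      # outside the search window
        blk = img2[y2 - half:y2 + half + 1, x2 - half:x2 + half + 1].astype(np.float64)
        if blk.shape != tmpl.shape:
            continue                      # block runs off the image border
        s = np.sum((tmpl - blk) ** 2)     # SSD between block and template
        if s < best_ssd:
            best, best_ssd = (x2, y2), s
    return best

rng = np.random.default_rng(0)
img1 = rng.random((40, 40))
img2 = np.roll(img1, (2, 3), axis=(0, 1))    # img1 shifted by dy=2, dx=3
best = match_corner(img1, img2, (15, 15), [(18, 17), (25, 25), (10, 12)])
```

With the image shifted by (dx=3, dy=2), the corner at (15, 15) should match (18, 17), where the SSD is exactly zero.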

描述子匹配

Brute-force matching and FLANN matching are the two common ways of matching 2D feature points in OpenCV, corresponding to BFMatcher (cv::BFMatcher) and FlannBasedMatcher (cv::FlannBasedMatcher) respectively.
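A minimal NumPy stand-in for brute-force matching of binary descriptors with the Hamming distance (what cv::BFMatcher does with NORM_HAMMING; the two-byte descriptors here are toy examples):

```python
import numpy as np

def hamming(d1, d2):
    """Hamming distance between two binary descriptors (uint8 arrays)."""
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

def bf_match(desc1, desc2):
    """Brute force: for each descriptor in desc1, take the nearest in desc2.
    Returns (query index, train index, distance) triples."""
    matches = []
    for i, d1 in enumerate(desc1):
        dists = [hamming(d1, d2) for d2 in desc2]
        j = int(np.argmin(dists))
        matches.append((i, j, dists[j]))
    return matches

desc1 = np.array([[0b11110000, 0b00001111]], dtype=np.uint8)
desc2 = np.array([[0b11110000, 0b00001110],    # differs in 1 bit
                  [0b00000000, 0b00000000]],   # differs in 8 bits
                 dtype=np.uint8)
matches = bf_match(desc1, desc2)
```

FLANN trades this exhaustive scan for approximate nearest-neighbor index structures, which is faster on large descriptor sets.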

Brute-Force

FLANN

Read More

Computing the Jacobian of the Error Function in Visual SLAM Pose Optimization

[TOC]

Overview

SLAM stands for Simultaneous Localization and Mapping. The localization part of visual SLAM estimates the camera pose, i.e. the rotation and translation $[\mathbf{R} \quad \mathbf{t}]$; in SLAM we usually represent the rotation and translation with a Lie algebra element $\boldsymbol{\xi}$.

Notation:

  • Let $\mathbf{K}$ be the camera intrinsic matrix and $\mathbf{T} = [\mathbf{R} \quad \mathbf{t}]$ (or $\boldsymbol{\xi}$) the camera pose
  • Let $\mathbf{p}(u,v)$ be a pixel in the image coordinate frame of $I_1$, with corresponding 3D point $\mathbf{P}(X,Y,Z)$ in the camera frame $O_1$
  • Let $\mathbf{p'}(u',v')$ be a pixel in the image coordinate frame of $I_2$, with corresponding 3D point $\mathbf{P'}(X',Y',Z')$ in the camera frame $O_2$ and normalized coordinates $\mathbf{p'}_{norm}$

\[\mathbf{P'} = \mathbf{R} \cdot \mathbf{P} + \mathbf{t}\]

\[\tilde{\mathbf{p'}}_{norm} = ({X'}_{norm},{Y'}_{norm},1) = \frac{\mathbf{P'}}{Z'} = (\frac{X'}{Z'}, \frac{Y'}{Z'}, 1)\]

\[\tilde{\mathbf{p'}} = \mathbf{K} \cdot \tilde{\mathbf{p'}}_{norm}\]

To optimize the pose, we construct an error function of the pose change; when this error function is minimized, the estimated pose is considered optimal. Visual SLAM divides into the direct method and the feature-based method, but in both cases the iterative pose optimization solves a least-squares problem:

\[\min_{\boldsymbol{\xi}} \frac{1}{2} \left\| r(\boldsymbol{\xi}) \right\|^2\]

  • The direct method minimizes the photometric error, i.e. the intensity difference of corresponding pixels between consecutive frames:

\[\begin{aligned} r(\boldsymbol{\xi}) &= \mathbf{I}_2(\mathbf{p}') - \mathbf{I}_1(\mathbf{p}) \\ &= \mathbf{I}_2(u',v') - \mathbf{I}_1(u,v) \end{aligned}\]

  • The feature-based method minimizes the reprojection error, i.e. the coordinate difference between the projection of a map point into the current image and its matched point:

\[\begin{aligned} r(\boldsymbol{\xi}) &= \mathbf{p}' - \mathbf{p} \\ &= (u',v') - (u,v) \end{aligned}\]

The Jacobian matrix of the error function with respect to the pose determines the direction of the pose increment in the next optimal iterative estimate:

\[\mathbf{J}(\boldsymbol{\xi}) = \frac{\partial{r(\boldsymbol{\xi})}}{\partial \boldsymbol{\xi}}\]

Following the pose-transformation chain above, we can expand $\mathbf{J}$ by the chain rule:

\[\begin{aligned} \mathbf{J}(\boldsymbol{\xi}) &= \frac{\partial{r(\boldsymbol{\xi})}}{\partial \boldsymbol{\xi}} \\ &= \frac{\partial{r(\boldsymbol{\xi})}}{\partial \mathbf{p}'} \cdot \frac{\partial \mathbf{p}'}{\partial \mathbf{p}_{norm}'} \cdot \frac{\partial \mathbf{p}_{norm}'}{\partial \mathbf{P'}} \cdot \frac{\partial \mathbf{P'}}{\partial \boldsymbol{\xi}} \\ &= \mathbf{J}_0 \cdot \mathbf{J}_1 \cdot \mathbf{J}_2 \cdot \mathbf{J}_3 \end{aligned}\]

Hence the Jacobians of the direct and feature-based methods differ only in $\mathbf{J}_0$. This article presents the derivation of the Jacobian of the error function with respect to the pose in SLAM pose optimization.

Jacobian Derivation

$J_0$

\[\mathbf{J}_0 = \frac{\partial{r(\boldsymbol{\xi})}}{\partial \mathbf{p}'}\]

Direct method

In the direct method, the per-pixel error function is a function of pixel intensities, i.e. the photometric error

\[r(\boldsymbol{\xi}) = \mathbf{I}_2(u',v') - \mathbf{I}_1(u,v)\]

For a given pixel, $\mathbf{I}_1(\mathbf{p})$ is constant with respect to $\boldsymbol{\xi}$, so

\[\begin{aligned} \mathbf{J}_0 &= \frac{\partial \mathbf{I}_2(\mathbf{p}')}{\partial \mathbf{p}'} \\ &= \bigg[ \frac{\mathbf{I}_2(u'+1,v')-\mathbf{I}_2(u'-1,v')}{2}, \frac{\mathbf{I}_2(u',v'+1)-\mathbf{I}_2(u',v'-1)}{2} \bigg] \end{aligned}\]

which is the image gradient of $\mathbf{I}_2$ at $\mathbf{p}'$.

Feature-based method

In the feature-based method, the per-point error function is a function of pixel coordinates

\[r(\boldsymbol{\xi}) = \mathbf{p}' - \mathbf{p}\]

For a given point, $\mathbf{p}$ is constant with respect to $\boldsymbol{\xi}$, so

\[\mathbf{J}_0 = \frac{\partial \mathbf{p}'}{\partial \mathbf{p}'} = \mathbf{I} \in \mathbb{R}^{2 \times 2}\]

$J_1$

\[\mathbf{J}_1 = \frac{\partial \mathbf{p}'}{\partial \mathbf{p}_{norm}'} = \frac{\partial (u',v')}{\partial (X_{norm}',Y_{norm}')}\]

Since

\[\tilde{\mathbf{p'}} = \mathbf{K} \cdot \tilde{\mathbf{p'}}_{norm}\]

the computation of $\mathbf{J}_1$ depends on the camera projection model. Here we use the pinhole camera model (without distortion), whose intrinsic matrix is

\[\mathbf{K} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}\]

so

\[\mathbf{J}_1 = \begin{bmatrix} f_x & 0 \\ 0 & f_y \end{bmatrix} \in \mathbb{R}^{2 \times 2}\]

$J_2$

\[\begin{aligned} \mathbf{J}_2 &= \frac{\partial \mathbf{p}_{norm}'}{\partial \mathbf{P'}} \\ &= \frac{\partial (X_{norm}',Y_{norm}')}{\partial (X',Y',Z')} \end{aligned}\]

From

\[\tilde{\mathbf{p'}}_{norm} = \frac {\mathbf{P'}} {Z'}\]

we obtain

\[\begin{aligned} \mathbf{J}_2 &= \begin{bmatrix} \frac{1}{Z'} & 0 & -\frac{X'}{Z'^2} \\ 0 & \frac{1}{Z'} & -\frac{Y'}{Z'^2} \end{bmatrix} \\ &= \begin{bmatrix} 1 & 0 & -\frac{X'}{Z'} \\ 0 & 1 & -\frac{Y'}{Z'} \end{bmatrix} \cdot \frac{1}{Z'} \in \mathbb{R}^{2 \times 3} \end{aligned}\]

$J_3$

Using the Lie algebra

\[\begin{aligned} \mathbf{J}_3 &= \frac{\partial \mathbf{P'}}{\partial \boldsymbol{\xi}} \\ &= \frac{\partial (\mathbf{T} \cdot \mathbf{P})}{\partial \boldsymbol{\xi}} \\ &= \frac{\partial ( \exp(\boldsymbol{\xi}^{\wedge}) \mathbf{P}) }{\partial \boldsymbol{\xi}} \end{aligned}\]

where

\[\boldsymbol{\xi} = \begin{bmatrix} \boldsymbol{\rho} \\ \boldsymbol{\phi} \end{bmatrix} \in \mathbb{R}^6\]

Analogous to the derivative of $f(x)$ in calculus,

\[\frac{df}{dx} = \lim_{\Delta x \rightarrow 0} \frac{f(x+\Delta x) - f(x)}{\Delta x}\]

we could differentiate with respect to an additive increment of the Lie algebra. In practice, it is more common to left- or right-multiply the Lie group element by a small perturbation and differentiate with respect to the Lie algebra of that perturbation. Using the perturbation model $\delta \boldsymbol{\xi} = [\delta \boldsymbol{\rho} \quad \delta \boldsymbol{\phi}]$, we compute:

\[\begin{aligned} \frac{\partial \tilde{\mathbf{P'}}}{\partial \boldsymbol{\xi}} &= \frac{\partial (\mathbf{T} \cdot \tilde{\mathbf{P}})}{\partial \boldsymbol{\xi}} \\ &= \frac{\partial ( \exp(\boldsymbol{\xi}^{\wedge}) \tilde{\mathbf{P}} ) }{\partial \delta \boldsymbol{\xi}} \quad \text{(left perturbation model)} \\ &= \lim_{\delta \boldsymbol{\xi} \rightarrow 0} \frac { \exp(\delta \boldsymbol{\xi}^{\wedge}) \exp(\boldsymbol{\xi}^{\wedge}) \tilde{\mathbf{P}} - \exp(\boldsymbol{\xi}^{\wedge}) \tilde{\mathbf{P}} } { \delta \boldsymbol{\xi} } \\ &\approx \lim_{\delta \boldsymbol{\xi} \rightarrow 0} \frac { (\mathbf{I}+ \delta \boldsymbol{\xi}^{\wedge}) \exp(\boldsymbol{\xi}^{\wedge}) \tilde{\mathbf{P}} - \exp(\boldsymbol{\xi}^{\wedge}) \tilde{\mathbf{P}} } { \delta \boldsymbol{\xi} } \\ &= \lim_{\delta \boldsymbol{\xi} \rightarrow 0} \frac { \delta \boldsymbol{\xi}^{\wedge} \exp(\boldsymbol{\xi}^{\wedge}) \tilde{\mathbf{P}} } { \delta \boldsymbol{\xi} } \\ &= \lim_{\delta \boldsymbol{\xi} \rightarrow 0} \frac { \begin{bmatrix} \delta \boldsymbol{\phi}^{\wedge} & \delta \boldsymbol{\rho} \\ \mathbf{0}^{T} & 0 \end{bmatrix} \begin{bmatrix} \mathbf{R} \cdot \mathbf{P} + \mathbf{t} \\ 1 \end{bmatrix} } { \delta \boldsymbol{\xi} } \\ &= \lim_{\delta \boldsymbol{\xi} \rightarrow 0} \frac { \begin{bmatrix} \delta \boldsymbol{\phi}^{\wedge} (\mathbf{R} \cdot \mathbf{P} + \mathbf{t}) + \delta \boldsymbol{\rho} \\ 0 \end{bmatrix} } { \delta \boldsymbol{\xi} } \\ &= \begin{bmatrix} \mathbf{I} & -(\mathbf{R} \cdot \mathbf{P} + \mathbf{t})^{\wedge} \\ \mathbf{0}^{T} & \mathbf{0}^{T} \end{bmatrix} \\ &= \begin{bmatrix} \mathbf{I} & -\mathbf{P}'^{\wedge} \\ \mathbf{0}^{T} & \mathbf{0}^{T} \end{bmatrix} \end{aligned}\]

Therefore

\[\begin{aligned} \mathbf{J}_3 &= \begin{bmatrix} \mathbf{I} & -\mathbf{P}'^{\wedge} \end{bmatrix} \\ &= \begin{bmatrix} 1 & 0 & 0 & 0 & Z' & -Y' \\ 0 & 1 & 0 & -Z' & 0 & X' \\ 0 & 0 & 1 & Y' & -X' & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 6} \end{aligned}\]

Notes:

  • In the $\boldsymbol{\xi}$ above, the translation $\boldsymbol{\rho}$ comes first and the rotation $\boldsymbol{\phi}$ second; if rotation comes first and translation second, the first three and last three columns of $\mathbf{J}_3$ must be swapped.
  • [ch4 Why can the left perturbation model be used for differentiation?](https://github.com/gaoxiang12/slambook/issues/183) [gaoxiang12/slambook Issues #183]

    By definition, left-multiply a small perturbation, let the perturbation go to zero, and take the rate of change of the objective function with respect to the perturbation as the derivative. During optimization, the increment computed with this derivative is applied as a left multiplication onto the current estimate, so the estimate always stays on SO(3) or SE(3). This technique is called "optimization on a manifold".
  • [Quaternion matrices and so(3) left/right Jacobians](https://fzheng.me/2018/05/22/quaternion-matrix-so3-jacobians/)

Using quaternions

If the rotation is represented by a quaternion, the update increment is $\begin{bmatrix} \delta \mathbf{t} \\ \delta \boldsymbol{\theta} \end{bmatrix}$, and

\[\begin{aligned} \mathbf{J}_3 &= \frac{\partial \mathbf{P'}} {\partial \begin{bmatrix} \delta \mathbf{t} \\ \delta \boldsymbol{\theta} \end{bmatrix}} = \frac{\partial (\mathbf{R} \cdot \mathbf{P} + \mathbf{t})} {\partial \begin{bmatrix} \delta \mathbf{t} \\ \delta \boldsymbol{\theta} \end{bmatrix}} = \left[ \frac{\partial \mathbf{t}}{\partial \delta \mathbf{t}} \quad \frac{\partial (\mathbf{R} \cdot \mathbf{P})}{\partial \delta \boldsymbol{\theta}} \right] \end{aligned}\]

with

\[\begin{aligned} \frac{\partial (\mathbf{R} \cdot \mathbf{P})}{\partial \delta \boldsymbol{\theta}} &= \lim_{\delta \boldsymbol{\theta} \rightarrow 0} \frac {\exp({\delta \boldsymbol{\theta}}^\wedge) \mathbf{R} \mathbf{P} - \mathbf{R} \mathbf{P}} {\delta \boldsymbol{\theta}} \\ &= \lim_{\delta \boldsymbol{\theta} \rightarrow 0} \frac {(\mathbf{I} + {\delta \boldsymbol{\theta}}^\wedge) \mathbf{R} \mathbf{P} - \mathbf{R} \mathbf{P}} {\delta \boldsymbol{\theta}} \\ &= \lim_{\delta \boldsymbol{\theta} \rightarrow 0} \frac {{\delta \boldsymbol{\theta}}^\wedge \mathbf{R} \mathbf{P}} {\delta \boldsymbol{\theta}} \\ &= -{(\mathbf{R} \cdot \mathbf{P})}^\wedge \end{aligned}\]

where the last step uses $\mathbf{a}^\wedge \mathbf{b} = -\mathbf{b}^\wedge \mathbf{a}$. So

\[\mathbf{J}_3 = \left[ \mathbf{I}_{3 \times 3} \quad -{(\mathbf{R} \cdot \mathbf{P})}^\wedge \right]\]

Summary

Direct method

\[\begin{aligned} \mathbf{J}(\boldsymbol{\xi}) &= \mathbf{J}_0 \cdot \mathbf{J}_1 \cdot \mathbf{J}_2 \cdot \mathbf{J}_3 \\ &= \frac{\partial \mathbf{I}_2(\mathbf{p}')}{\partial \mathbf{p}'} \cdot \begin{bmatrix} f_x & 0 \\ 0 & f_y \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & -\frac{X'}{Z'} \\ 0 & 1 & -\frac{Y'}{Z'} \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 & 0 & Z' & -Y' \\ 0 & 1 & 0 & -Z' & 0 & X' \\ 0 & 0 & 1 & Y' & -X' & 0 \end{bmatrix} \cdot \frac{1}{Z'} \end{aligned}\]

Feature-based method

  • Using the Lie algebra

\[\begin{aligned} \mathbf{J}(\boldsymbol{\xi}) &= \mathbf{J}_0 \cdot \mathbf{J}_1 \cdot \mathbf{J}_2 \cdot \mathbf{J}_3 \\ &= \mathbf{I} \cdot \begin{bmatrix} f_x & 0 \\ 0 & f_y \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & -\frac{X'}{Z'} \\ 0 & 1 & -\frac{Y'}{Z'} \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 & 0 & Z' & -Y' \\ 0 & 1 & 0 & -Z' & 0 & X' \\ 0 & 0 & 1 & Y' & -X' & 0 \end{bmatrix} \cdot \frac{1}{Z'} \\ &= \begin{bmatrix} \frac{f_x}{Z'} & 0 & -\frac{X'f_x}{Z'^2} \\ 0 & \frac{f_y}{Z'} & -\frac{Y'f_y}{Z'^2} \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 & 0 & Z' & -Y' \\ 0 & 1 & 0 & -Z' & 0 & X' \\ 0 & 0 & 1 & Y' & -X' & 0 \end{bmatrix} \\ &= \begin{bmatrix} \frac{f_x}{Z'} & 0 & -\frac{X'f_x}{Z'^2} & -\frac{X'Y'f_x}{Z'^2} & f_x+\frac{X'^2f_x}{Z'^2} & -\frac{Y'f_x}{Z'} \\ 0 & \frac{f_y}{Z'} & -\frac{Y'f_y}{Z'^2} & -f_y-\frac{Y'^2f_y}{Z'^2} & \frac{X'Y'f_y}{Z'^2} & \frac{X'f_y}{Z'} \end{bmatrix} \end{aligned}\]

  • Using quaternions

\[\mathbf{J} = \begin{bmatrix} \frac{f_x}{Z'} & 0 & -\frac{X'f_x}{Z'^2} \\ 0 & \frac{f_y}{Z'} & -\frac{Y'f_y}{Z'^2} \end{bmatrix} \cdot \left[ \mathbf{I}_{3 \times 3} \quad -{(\mathbf{R} \cdot \mathbf{P})}^\wedge \right]\]

Notes

  • $\mathbf{J}_1$ is computed for the pinhole camera model (without distortion)
  • In this article, the translation $\boldsymbol{\rho}$ comes first in $\boldsymbol{\xi}$ and the rotation $\boldsymbol{\phi}$ second; if rotation comes first and translation second, the first three and last three columns of $\mathbf{J}_3$ must be swapped
  • The error function $r(\boldsymbol{\xi})$ in this article is defined as predicted minus observed; if it is defined as observed minus predicted, the computed $\mathbf{J}$ must be negated

References

  • [SLAM优化位姿时,误差函数的雅可比矩阵的推导](https://blog.csdn.net/zhubaohua_bupt/article/details/74011005)
  • 《视觉SLAM十四讲》 (14 Lectures on Visual SLAM)
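As a numerical check on the feature-based result, the 2x6 Jacobian can be assembled in a few lines of NumPy (a minimal sketch for the pinhole model without distortion, translation-first convention; the point and intrinsics are illustrative values):

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix v^ such that (v^) w = v x w."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def reprojection_jacobian(P_prime, fx, fy):
    """2x6 Jacobian of the reprojection error w.r.t. xi = [rho, phi]
    (translation first), pinhole model without distortion."""
    X, Y, Z = P_prime
    # J0 * J1 * J2 (J0 = I for the feature-based method)
    J12 = np.array([[fx / Z, 0.0, -fx * X / Z**2],
                    [0.0, fy / Z, -fy * Y / Z**2]])
    J3 = np.hstack([np.eye(3), -skew(P_prime)])   # [I, -P'^]
    return J12 @ J3

J = reprojection_jacobian(np.array([1.0, 2.0, 4.0]), fx=400, fy=400)
```

Multiplying the factors numerically reproduces the closed-form 2x6 matrix entry by entry, which is a quick way to catch sign errors in a hand derivation.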

Read More
