VINS-Mono 论文公式推导与代码解析

[TOC]

概述

Monocular visual-inertial odometry with relocalization achieved via nonlinear graph optimization-based, tightly-coupled, sliding window, visual-inertial bundle adjustment.

1. 测量预处理

1.1 前端视觉处理

  • Simple feature processing pipeline
    • 自适应直方图均衡化( cv::CLAHE
    • 掩模处理,特征点均匀分布(setMask
    • 提取图像Harris角点(cv::goodFeaturesToTrack
    • KLT金字塔光流跟踪(cv::calcOpticalFlowPyrLK
    • 连续帧跟踪
    • 本质矩阵(RANSAC)去除外点(rejectWithF
    • 发布feature_points(id_of_point, un_pts, cur_pts, pts_velocity)
  • Keyframe selection
    • Case 1: Rotation-compensated average feature parallax is larger than a threshold
    • Case 2: Number of tracked features in the current frame is less than a threshold
    • All frames are used for optimization, but non-keyframes are removed first
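
上述两个关键帧判据的一个简化示意如下(阈值与函数名均为示例,实际实现见 FeatureManager::addFeatureCheckParallax):

// 示意:关键帧判据——旋转补偿后的平均视差超过阈值,或跟踪到的特征数太少
bool is_keyframe(double parallax_sum, int parallax_num, int tracked_num,
                 double min_parallax, int min_tracked)
{
    if (tracked_num < min_tracked)   // Case 2:跟踪特征数少于阈值
        return true;
    if (parallax_num == 0)           // 没有可用来计算视差的共视特征
        return true;
    // Case 1:旋转补偿后的平均视差大于阈值
    return parallax_sum / parallax_num >= min_parallax;
}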

1.2 IMU 预积分

IMU 测量方程

忽略地球旋转,IMU 测量方程为

\[\begin{aligned} \hat{a}_t &= a_t + b_{a_t} + R_w^t g^w + n_a \\ \hat{\omega}_t &= \omega_t + b_{\omega_t} + n_{\omega} \end{aligned}\]

预积分方程

(1)IMU integration in world frame

由上面的IMU测量方程积分就可以计算出下一时刻的p、v和q:
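
(以下公式按原论文的连续时间形式补充,符号与上文测量方程一致,$\Omega(\omega)$ 为角速度对应的四元数微分矩阵)

\[\begin{aligned} p^{w}_{b_{k+1}} &= p^{w}_{b_{k}} + v^{w}_{b_{k}}\Delta t_{k} + \iint_{t\in[t_{k},t_{k+1}]} \left( R^{w}_{t}(\hat{a}_{t}-b_{a_{t}}-n_{a}) - g^{w} \right) dt^{2} \\ v^{w}_{b_{k+1}} &= v^{w}_{b_{k}} + \int_{t\in[t_{k},t_{k+1}]} \left( R^{w}_{t}(\hat{a}_{t}-b_{a_{t}}-n_{a}) - g^{w} \right) dt \\ q^{w}_{b_{k+1}} &= q^{w}_{b_{k}} \otimes \int_{t\in[t_{k},t_{k+1}]} \frac{1}{2}\Omega(\hat{\omega}_{t}-b_{\omega_{t}}-n_{\omega})\, q^{b_{k}}_{t}\, dt \end{aligned}\]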

(2)IMU integration in the body frame of first pose of interests

为避免重新传播IMU观测值,选用IMU预积分模型,从世界坐标系转为本体坐标系

则 预积分IMU测量模型(估计值)为

\[\begin{bmatrix} \hat{\alpha}^{b_{k}}_{b_{k+1}}\\ \hat{\gamma}^{b_{k}}_{b_{k+1}}\\ \hat{\beta }^{b_{k}}_{b_{k+1}}\\ 0\\ 0 \end{bmatrix} = \begin{bmatrix} R^{b_{k}}_{w} (p^{w}_{b_{k+1}}-p_{b_{k}}^{w}+\frac{1}{2}g^{w}\Delta t^{2}-v_{b_{k}}^{w}\Delta t) \\ q_{b_{k}}^{w^{-1}}\otimes q^{w}_{b_{k+1}}\\ R^{b_{k}}_{w}(v^{w}_{b_{k+1}}+g^{w}\Delta t-v_{b_{k}}^{w})\\ b_{ab_{k+1}}-b_{ab_{k}}\\ b_{wb_{k+1}}-b_{wb_{k}} \end{bmatrix}\]

离散状态下采用 中值法积分 的预积分方程(预积分 测量值)为

\[\begin{aligned} \delta q_{i+1} &= \delta q_{i} \otimes \begin{bmatrix} 1 \\ \frac{1}{2} w_{i}' \delta t \end{bmatrix} \\ \delta\alpha_{i+1} &= \delta\alpha_{i}+\delta\beta_{i}t+0.5a_{i}'\delta t^{2} \\ \delta\beta_{i+1}&=\delta\beta_{i}+a_{i}' \delta t \\ {b_a}_{i+1}&= {b_a}_i \\ {b_g}_{i+1}&= {b_g}_i \end{aligned}\]

其中

\[\begin{aligned} w_{i}' &= \frac{w_{i+1}+w_{i}}{2}-b_{g_{i}} \\ a_{i}' &= \frac{ \delta q_{i}(a_{i}+n_{a0}-b_{a_{i}})+ \delta q_{i+1}(a_{i+1}+n_{a1}-b_{a_{i}})}{2} \end{aligned}\]

midPointIntegration 中的相关代码:

Vector3d un_gyr = 0.5 * (_gyr_0 + _gyr_1) - linearized_bg;
result_delta_q  = delta_q * Quaterniond(1, un_gyr(0) * _dt / 2, un_gyr(1) * _dt / 2, un_gyr(2) * _dt / 2);

Vector3d un_acc_0 =        delta_q * (_acc_0 - linearized_ba);
Vector3d un_acc_1 = result_delta_q * (_acc_1 - linearized_ba);
Vector3d un_acc   = 0.5 * (un_acc_0 + un_acc_1);

result_delta_p = delta_p + delta_v * _dt + 0.5 * un_acc * _dt * _dt;
result_delta_v = delta_v + un_acc * _dt;

// 预积分的过程中Bias没有发生改变
result_linearized_ba = linearized_ba;
result_linearized_bg = linearized_bg;

误差状态方程

IMU误差状态向量

\[\delta X = [\delta P \quad \delta v \quad \delta \theta \quad \delta b_a \quad \delta b_g]^T \in \mathbb{R}^{15 \times 1}\]

根据ESKF中 5.3.3 The error-state kinematics 小节公式

对于 中值积分 下的 误差状态方程

\[\dot{\delta X_k} = \begin{cases} \dot{\delta \theta_{k}} =& -[\frac{w_{k+1}+w_{k}}{2}-b_{g_{k}}]_{\times} \delta \theta_{k}-\delta b_{g_{k}}+\frac{n_{w0}+n_{w1}}{2} \\ \dot{\delta\beta_{k}} =& -\frac{1}{2}q_{k}[a_{k}-b_{a_{k}}]_{\times}\delta \theta \\ &-\frac{1}{2}q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}((I-[\frac{w_{k+1}+w_{k}}{2}-b_{g_{k}}]_{\times }\delta t) \delta \theta_{k} -\delta b_{g_{k}}\delta t+\frac{n_{w0}+n_{w1}}{2}\delta t) \\ &-\frac{1}{2}q_{k}\delta b_{a_{k}}-\frac{1}{2}q_{k+1}\delta b_{a_{k}}-\frac{1}{2}q_{k}n_{a0}-\frac{1}{2}q_{k}n_{a1} \\ \dot{\delta\alpha_{k}} =& -\frac{1}{4}q_{k}[a_{k}-b_{a_{k}}]_{\times}\delta \theta\delta t \\ &-\frac{1}{4}q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}((I-[\frac{w_{k+1}+w_{k}}{2}-b_{g_{k}}]_{\times }\delta t) \delta \theta _{k} -\delta b_{g_{k}}\delta t+\frac{n_{w0}+n_{w1}}{2}\delta t)\delta t \\ &-\frac{1}{4}q_{k}\delta b_{a_{k}}\delta t-\frac{1}{4}q_{k+1}\delta b_{a_{k}}\delta t-\frac{1}{4}q_{k}n_{a0}\delta t-\frac{1}{4}q_{k}n_{a1}\delta t \\ \dot{\delta b_{a_k}} =& n_{b_a} \\ \dot{\delta b_{g_k}} =& n_{b_g} \end{cases}\]

简写为

\[\dot{\delta X_k} = F \delta X_k + Gn\]

所以

\[\begin{aligned} \delta X_{k+1} &= \delta X_k + \dot{\delta X_k} \delta t \\ &= \delta X_k + (F \delta X_k + Gn) \delta t \\ &= (I + F \delta t) \delta X_k + (G \delta t) n \end{aligned}\]

展开得

\[\begin{aligned} \begin{bmatrix} \delta \alpha_{k+1}\\ \delta \theta_{k+1}\\ \delta \beta_{k+1} \\ \delta b_{a{}{k+1}} \\ \delta b_{g{}{k+1}} \end{bmatrix}&=\begin{bmatrix} I & f_{01} &\delta t & -\frac{1}{4}(q_{k}+q_{k+1})\delta t^{2} & f_{04}\\ 0 & I-[\frac{w_{k+1}+w_{k}}{2}-b_{wk}]_{\times } \delta t & 0 & 0 & -\delta t \\ 0 & f_{21}&I & -\frac{1}{2}(q_{k}+q_{k+1})\delta t & f_{24}\\ 0 & 0& 0&I &0 \\ 0& 0 & 0 & 0 & I \end{bmatrix} \begin{bmatrix} \delta \alpha_{k}\\ \delta \theta_{k}\\ \delta \beta_{k} \\ \delta b_{a{}{k}} \\ \delta b_{g{}{k}} \end{bmatrix} \\ &+ \begin{bmatrix} \frac{1}{4}q_{k}\delta t^{2}& v_{01}& \frac{1}{4}q_{k+1}\delta t^{2} & v_{03} & 0 & 0\\ 0& \frac{1}{2}\delta t & 0 & \frac{1}{2}\delta t &0 & 0\\ \frac{1}{2}q_{k}\delta t& v_{21}& \frac{1}{2}q_{k+1}\delta t & v_{23} & 0 & 0 \\ 0 & 0 & 0 & 0 &\delta t &0 \\ 0& 0 &0 & 0 &0 & \delta t \end{bmatrix} \begin{bmatrix} n_{a0}\\ n_{w0}\\ n_{a1}\\ n_{w1}\\ n_{ba}\\ n_{bg} \end{bmatrix} \end{aligned}\]

其中

\[\begin{aligned} f_{01}&=-\frac{1}{4}q_{k}[a_{k}-b_{a_{k}}]_{\times}\delta t^{2}-\frac{1}{4}q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}(I-[\frac{w_{k+1}+w_{k}}{2}-b_{g_{k}}]_{\times }\delta t)\delta t^{2} \\ f_{21}&=-\frac{1}{2}q_{k}[a_{k}-b_{a_{k}}]_{\times}\delta t-\frac{1}{2}q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}(I-[\frac{w_{k+1}+w_{k}}{2}-b_{g_{k}}]_{\times }\delta t)\delta t \\ f_{04}&=\frac{1}{4}(-q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}\delta t^{2})(-\delta t) \\ f_{24}&=\frac{1}{2}(-q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}\delta t)(-\delta t) \\ v_{01}&=\frac{1}{4}(-q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}\delta t^{2})\frac{1}{2}\delta t \\ v_{03}&=\frac{1}{4}(-q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}\delta t^{2})\frac{1}{2}\delta t \\ v_{21}&=\frac{1}{2}(-q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}\delta t)\frac{1}{2}\delta t \\ v_{23}&=\frac{1}{2}(-q_{k+1}[a_{k+1}-b_{a_{k}}]_{\times}\delta t)\frac{1}{2}\delta t \end{aligned}\]

\[\begin{aligned} F' &= I + F \delta t & \in \mathbb{R}^{15 \times 15} \\ V &= G \delta t & \in \mathbb{R}^{15 \times 18} \end{aligned}\]

则简写为

\[\delta X_{k+1} = F' \delta X_k + V n\]

此处 $F'$ 即代码中的 F,相关代码见 midPointIntegration

最后得到 IMU预积分测量关于IMU Bias雅克比矩阵 $J_{k+1}$ 、IMU预积分测量的 协方差矩阵 $P_{k+1}$ 和 噪声的 协方差矩阵 $Q$,初始状态下的雅克比矩阵和协方差矩阵分别为 单位阵 和 零矩阵

\[\begin{aligned} J_{b_k} &= I \\ P_{b_{k}}^{b_k} &= 0 \\ J_{t+\delta t} &= F' J_t = (I + F_t \delta t) J_t, \quad t \in [k, k+1] \\ P_{t+\delta t}^{b_k} &= F' P_t^{b_k} F'^T + V Q V^T \\ &= (I + F_t \delta t) P_{t}^{b_k} (I + F_t \delta t)^T + (G_t \delta t) Q (G_t \delta t)^T \\ Q &= \text{diag}( \sigma_{a_0}^2 \quad \sigma_{\omega_0}^2 \quad \sigma_{a_1}^2 \quad \sigma_{\omega_1}^2 \quad \sigma_{b_a}^2 \quad \sigma_{b_g}^2) \in \mathbb{R}^{18 \times 18} \end{aligned}\]
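
与上式对应的一步传播可以写成如下简化示意(假设 $F_t$、$G_t$ 已按前面的中值积分公式填好,函数名与变量名均为示意):

#include <Eigen/Dense>

// 示意:预积分雅克比与协方差的一步传播
void propagate_jacobian_covariance(const Eigen::Matrix<double, 15, 15> &F_t,
                                   const Eigen::Matrix<double, 15, 18> &G_t,
                                   const Eigen::Matrix<double, 18, 18> &Q,    // 噪声协方差
                                   double dt,
                                   Eigen::Matrix<double, 15, 15> &jacobian,   // 初值为单位阵
                                   Eigen::Matrix<double, 15, 15> &covariance) // 初值为零矩阵
{
    Eigen::Matrix<double, 15, 15> F = Eigen::Matrix<double, 15, 15>::Identity() + F_t * dt; // F' = I + F*dt
    Eigen::Matrix<double, 15, 18> V = G_t * dt;                                             // V  = G*dt
    jacobian   = F * jacobian;                                                // J <- F' * J
    covariance = F * covariance * F.transpose() + V * Q * V.transpose();      // P <- F' P F'^T + V Q V^T
}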

当bias估计轻微改变时,我们可以使用如下的一阶近似 对中值积分得到的预积分测量值进行矫正,而不重传播,从而得到 更加精确的预积分测量值(bias修正的线性模型)

\[\begin{aligned} {\alpha}^{b_{k}}_{b_{k+1}} &\approx \hat{\alpha}^{b_{k}}_{b_{k+1}} + J^{\alpha}_{b_a} \delta {b_a}_k + J^{\alpha}_{b_{\omega}} \delta {b_{\omega}}_k \\ {\beta}^{b_{k}}_{b_{k+1}} &\approx \hat{\beta}^{b_{k}}_{b_{k+1}} + J^{\beta}_{b_a} \delta {b_a}_k + J^{\beta}_{b_{\omega}} \delta {b_{\omega}}_k \\ {\gamma}^{b_{k}}_{b_{k+1}} &\approx \hat{\gamma}^{b_{k}}_{b_{k+1}} \otimes \begin{bmatrix} 1 \\ \frac{1}{2} J^{\gamma}_{b_{\omega}} \delta {b_{\omega}}_k \end{bmatrix} \end{aligned}\]

此时,可以与 卡尔曼滤波 对比一下:

2. 初始化(松耦合)

在提取的图像的Features和做完IMU的预积分之后,进入了系统的初始化环节,主要的目的有以下两个:

  • 系统使用单目相机,如果没有一个良好的尺度估计,就无法对两个传感器做进一步的融合,这个时候需要恢复出尺度;
  • 要对IMU进行初始化,IMU会受到bias的影响,所以要得到IMU的bias。

所以我们要从初始化中恢复出尺度、重力、速度以及IMU的bias,因为视觉(SFM)在初始化的过程中有着较好的表现,所以在初始化的过程中主要以SFM为主,然后将IMU的预积分结果与其对齐,即可得到较好的初始化结果。

2.1 相机与IMU之间的相对旋转

相机与IMU之间的旋转标定非常重要,偏差 1~2° 系统的精度就会变得极低

设相机利用对极关系得到的旋转矩阵为 $R_{c_{k+1}}^{c_k}$ ,IMU经过预积分得到的旋转矩阵为 $R_{b_{k+1}}^{b_{k}}$,相机与IMU之间的相对旋转为 $R_{c}^{b}$,则对于任一帧满足,

\[R^{b_{k}}_{b_{k+1}}R^{b}_{c}=R^{b}_{c}R^{c_{k}}_{c_{k+1}}\]

将旋转矩阵写为四元数,则上式可以写为

\[q^{b_{k}}_{b_{k+1}} \otimes q^{b}_{c}=q^{b}_{c}\otimes q^{c_{k}}_{c_{k+1}}\]

将其写为左乘和右乘的形式

\[({[q^{b_{k}}_{b_{k+1}}]}_L - {[q^{c_{k}}_{c_{k+1}}]}_R) q^b_c = Q^k_{k+1} q^b_c = 0\]

$[q]_L$ 与 $[q]_R$ 分别表示 四元数左乘矩阵四元数右乘矩阵,其定义为(四元数实部在后)

\[\begin{aligned} [q]_L &= \begin{bmatrix} q_{w}I_{3}+[q_{xyz }]_{\times} & q_{xyz}\\ -q_{xyz} & q_{w} \end{bmatrix} \\ [q]_R &= \begin{bmatrix} q_{w}I_{3}-[q_{xyz }]_{\times} & q_{xyz}\\ -q_{xyz} & q_{w} \end{bmatrix} \end{aligned}\]

那么对于 $N$ 对测量值,则有

\[\begin{bmatrix} w^{0}_{1}Q^{0}_{1}\\ w^{1}_{2}Q^{1}_{2}\\ \vdots \\ w^{N-1}_{N}Q^{N-1}_{N} \end{bmatrix}q^{b}_{c}=Q_{N}q^{b}_{c}=0\]

其中 $w^{N-1}_{N}$ 为外点剔除权重,其与相对旋转求得的角度残差有关,$N$为计算相对旋转需要的测量对数,其由最终的终止条件决定。角度残差可以写为,

\[{\theta}^{k}_{k+1}= arccos\bigg( \frac{tr(\hat{R}^{b^{-1}}_{c}R^{b_{k}^{-1}}_{b_{k+1}}\hat{R}^{b}_{c}R^{c_{k}}_{c_{k+1}} )-1}{2}\bigg)\]

从而权重为

\[w^{k}_{k+1}= \left\{\begin{matrix} 1, & {\theta}^{k}_{k+1}<threshold (\text{一般5°}) \\ \frac{threshold}{\theta^{k}_{k+1}}, & otherwise \end{matrix}\right.\]

至此,就可以通过求解方程 $Q_N q_c^b=0$ 得到相对旋转,解为 $Q_N$ 最小奇异值对应的右奇异向量。

但是,在这里还要注意 求解的终止条件(校准完成的终止条件) 。在足够多的旋转运动中,我们可以很好的估计出相对旋转 $R_{c}^{b}$,这时 $Q_{N}$ 对应一个准确解,且其零空间的秩为1。但是在校准的过程中,某些轴向上可能存在退化运动(如匀速运动),这时 $Q_{N}$ 的零空间的秩会大于1。判断条件就是 $Q_N$ 的第二小的奇异值是否大于某个阈值,若大于则其零空间的秩为1,反之秩大于1,相对旋转 $R_{c}^{b}$ 的精度不够,校准不成功。
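
这一求解与判断过程的简化示意如下(假设 $Q_N$ 已按上式用带权重的左右乘矩阵拼好,四元数实部/虚部的存放顺序需与构造 $Q_N$ 时一致,阈值仅作示例):

#include <Eigen/Dense>

// 示意:SVD 求解 Q_N * q_c^b = 0,并用第二小的奇异值判断旋转激励是否充分
bool solve_extrinsic_rotation(const Eigen::MatrixXd &Q_N, Eigen::Quaterniond &q_bc)
{
    Eigen::JacobiSVD<Eigen::MatrixXd> svd(Q_N, Eigen::ComputeFullV);

    Eigen::Vector4d x = svd.matrixV().col(3);              // 最小奇异值对应的右奇异向量
    q_bc = Eigen::Quaterniond(x(3), x(0), x(1), x(2));     // 实部在后的存放顺序
    q_bc.normalize();

    double second_smallest = svd.singularValues()(2);      // 4 个奇异值中第二小的一个
    return second_smallest > 0.25;                         // 大于阈值才认为零空间秩为 1,标定成功
}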

对应代码在 InitialEXRotation::CalibrationExRotation 中。

// 相机与IMU之间的相对旋转
if(ESTIMATE_EXTRINSIC == 2)
{
    ROS_INFO("calibrating extrinsic param, rotation movement is needed");
    if (frame_count != 0)
    {
        // 选取两帧之间共有的Features
        vector<pair<Vector3d, Vector3d>> corres = f_manager.getCorresponding(frame_count - 1, frame_count);

        // 校准相机与IMU之间的旋转
        Matrix3d calib_ric;
        if (initial_ex_rotation.CalibrationExRotation(corres, pre_integrations[frame_count]->delta_q, calib_ric))
        {
            ROS_WARN("initial extrinsic rotation calib success");
            ROS_WARN_STREAM("initial extrinsic rotation: " << endl << calib_ric);
            ric[0] = calib_ric;
            RIC[0] = calib_ric;
            ESTIMATE_EXTRINSIC = 1;
        }
    }
}

2.2 检测IMU可观性

// 计算均值
map<double, ImageFrame>::iterator frame_it;
Vector3d sum_g = Vector3d::Zero(); // Eigen 向量默认不初始化,这里需要显式清零
for (frame_it = all_image_frame.begin(), frame_it++; frame_it != all_image_frame.end(); frame_it++)
{
    double sum_dt  = frame_it->second.pre_integration->sum_dt;
    Vector3d tmp_g = frame_it->second.pre_integration->delta_v / sum_dt;
    sum_g += tmp_g;
}
Vector3d aver_g = sum_g * 1.0 / ((int)all_image_frame.size() - 1);

// 计算方差
double var = 0;
for (frame_it = all_image_frame.begin(), frame_it++; frame_it != all_image_frame.end(); frame_it++)
{
    double sum_dt  = frame_it->second.pre_integration->sum_dt;
    Vector3d tmp_g = frame_it->second.pre_integration->delta_v / sum_dt;
    var += (tmp_g - aver_g).transpose() * (tmp_g - aver_g);
}

// 计算标准差
var = sqrt(var / ((int)all_image_frame.size() - 1));
//ROS_WARN("IMU variation %f!", var);
if(var < 0.25) //! 以标准差判断可观性
{
    ROS_INFO("IMU excitation not enouth!");
    //return false;
}

2.3 相机初始化(Vision-Only SFM)

  • 求取本质矩阵求解位姿(relativePose
  • 三角化特征点(sfm.construct
  • PnP求解位姿(cv::solvePnP
  • 转换到IMU坐标系下
  • $c_0$ 坐标系作为参考系
  • 不断重复直到恢复出滑窗内的Features和相机位姿
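
其中第一步由本质矩阵求相对位姿,可以用 OpenCV 写出一个简化示意(输入假设为归一化平面坐标,阈值仅作示例):

#include <vector>
#include <opencv2/opencv.hpp>

// 示意:由两帧间的匹配点(归一化平面坐标)求相对位姿 R、t(平移只有方向,尺度不可观)
bool solve_relative_RT(const std::vector<cv::Point2f> &pts1,
                       const std::vector<cv::Point2f> &pts2,
                       cv::Mat &R, cv::Mat &t)
{
    if (pts1.size() < 15)
        return false;                                       // 匹配太少时放弃

    cv::Mat mask;
    // 在归一化坐标下求得的"基础矩阵"即本质矩阵,RANSAC 剔除外点
    cv::Mat E = cv::findFundamentalMat(pts1, pts2, cv::FM_RANSAC, 0.3 / 460, 0.99, mask);

    cv::Mat K = cv::Mat::eye(3, 3, CV_64F);                 // 输入已是归一化坐标,内参取单位阵
    int inliers = cv::recoverPose(E, pts1, pts2, K, R, t, mask);
    return inliers > 12;                                    // 内点数阈值仅作示例
}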

2.4 视觉与IMU对齐

  • Gyroscope Bias Calibration
  • Velocity, Gravity Vector and Metric Scale Initialization
  • Gravity Refinement
  • Completing Initialization

对应代码:VisualIMUAlignment

陀螺仪Bias标定

标定陀螺仪Bias使用如下代价函数

\[\underset{\delta b_{w}}{min}\sum_{k\in B}^{ }\left \| q^{c_{0}^{-1}}_{b_{k+1}}\otimes q^{c_{0}}_{b_{k}}\otimes\gamma _{b_{k+1}}^{b_{k}} \right \|^{2}\]

因为上述代价取最小值时,括号内的误差四元数应为单位四元数 $[1,0_{v}]^{T}$,所以令

\[q^{c_{0}^{-1}}_{b_{k+1}}\otimes q^{c_{0}}_{b_{k}}\otimes\gamma _{b_{k+1}}^{b_{k}} = \begin{bmatrix} 1\\ 0 \end{bmatrix}\]

其中

\[\gamma _{b_{k+1}}^{b_{k}}\approx \hat{\gamma}_{b_{k+1}}^{b_{k}}\otimes \begin{bmatrix} 1\\ \frac{1}{2}J^{\gamma }_{b_{w}}\delta b_{w} \end{bmatrix}\]

所以

\[\hat{\gamma}_{b_{k+1}}^{b_{k}} \otimes \begin{bmatrix} 1\\ \frac{1}{2}J^{\gamma }_{b_{w}}\delta b_{w} \end{bmatrix} = q^{c_{0}^{-1}}_{b_{k}}\otimes q^{c_{0}}_{b_{k+1}}\] \[\begin{bmatrix} 1\\ \frac{1}{2}J^{\gamma }_{b_{w}}\delta b_{w} \end{bmatrix}=\hat{\gamma}_{b_{k+1}}^{b_{k}^{-1}}\otimes q^{c_{0}^{-1}}_{b_{k}}\otimes q^{c_{0}}_{b_{k+1}}\]

只取上式虚部,再进行最小二乘求解

\[J^{\gamma^{T}}_{b_{w}}J^{\gamma }_{b_{w}}\delta b_{w}= 2 \cdot J^{\gamma^{T}}_{b_{w}}(\hat{\gamma}_{b_{k+1}}^{b_{k}^{-1}}\otimes q^{c_{0}^{-1}}_{b_{k}}\otimes q^{c_{0}}_{b_{k+1}})_{vec}\]

求解上式的最小二乘解,即可得到 $\delta b_{w}$,注意这个地方得到的只是Bias的变化量,需要在滑窗内累加得到Bias的准确值。

对应代码:solveGyroscopeBias

void solveGyroscopeBias(map<double, ImageFrame> &all_image_frame, Vector3d* Bgs) {
    Matrix3d A;
    Vector3d b;
    A.setZero();
    b.setZero();

    map<double, ImageFrame>::iterator frame_i;
    map<double, ImageFrame>::iterator frame_j;
    for (frame_i = all_image_frame.begin(); next(frame_i) != all_image_frame.end(); frame_i++) {
        frame_j = next(frame_i);

        MatrixXd tmp_A(3, 3);
        VectorXd tmp_b(3);
        tmp_A.setZero();
        tmp_b.setZero();

        Eigen::Quaterniond q_ij(frame_i->second.R.transpose() * frame_j->second.R);
        tmp_A = frame_j->second.pre_integration->jacobian.template block<3, 3>(O_R, O_BG);
        tmp_b = 2 * (frame_j->second.pre_integration->delta_q.inverse() * q_ij).vec();

        A += tmp_A.transpose() * tmp_A;
        b += tmp_A.transpose() * tmp_b;
    }

    Vector3d delta_bg = A.ldlt().solve(b);
    ROS_WARN_STREAM("gyroscope bias initial calibration " << delta_bg.transpose());

    // 因为求解出的Bias是变化量,所以要累加
    for (int i = 0; i <= WINDOW_SIZE; i++)
        Bgs[i] += delta_bg;

    // 利用新的Bias重新repropagate
    for (frame_i = all_image_frame.begin(); next(frame_i) != all_image_frame.end(); frame_i++) {
        frame_j = next(frame_i);
        frame_j->second.pre_integration->repropagate(Vector3d::Zero(), Bgs[0]);
    }
}

初始化速度、重力向量和尺度因子

要估计的状态量

\[X_{I}= [v^{b_{0}}_{b_{0}}, v^{b_{1}}_{b_{1}}, \cdots, v^{b_{n}}_{b_{n}}, g^{c_{0}}, s] \in \mathbb{R}^{3(n+1)+3+1}\]

其中,$g^{c_{0}}$ 为在第 0 帧 Camera 相机坐标系下的重力向量。

根据IMU测量模型可知

\[\begin{aligned} \alpha^{b_{k}}_{b_{k+1}} &= R^{b_{k}}_{c_{0}}(s(\bar{p}^{c_{0}}_{b_{k+1}}-\bar{p}^{c_{0}}_{b_{k}}) + \frac{1}{2}g^{c_{0}}\Delta t_{k}^{2} - R^{c_0}_{b_k} v^{b_k}_{b_{k}} \Delta t_{k}) \\ \beta ^{b_{k}}_{b_{k+1}} &= R^{b_{k}}_{c_{0}}(R^{c_0}_{b_{k+1}} v^{b_{k+1}}_{b_{k+1}} + g^{c_{0}}\Delta t_{k} - R^{c_0}_{b_k} v^{b_k}_{b_{k}}) \end{aligned}\]

我们已经得到了相机到IMU之间的相对旋转 $q_{c}^{b}$,再假设相机到IMU的平移量 $p_{c}^{b}$ 已知(外参),那么可以很容易地将相机坐标系下的位姿转换到IMU坐标系下

\[\begin{aligned} q_{b_{k}}^{c_{0}} &= q^{c_{0}}_{c_{k}}\otimes (q_{c}^{b})^{-1} \\ s\bar{p}^{c_{0}}_{b_{k}} &= s\bar{p}^{c_{0}}_{c_{k}} - R^{c_{0}}_{b_{k}}p_{c}^{b} \end{aligned}\]

所以,定义相邻两帧之间的IMU预积分得到的增量($\hat{\alpha}^{b_{k}}_{b_{k+1}}$、$\hat{\beta}^{b_{k}}_{b_{k+1}}$)与预测值之间的残差,即

\[\begin{aligned} r(\hat{z}^{b_{k}}_{b_{k+1}}, X_I) &= \begin{bmatrix} \delta \alpha^{b_{k}}_{b_{k+1}} \\ \delta \beta ^{b_{k}}_{b_{k+1}} \end{bmatrix} \\ &= \begin{bmatrix} \hat{\alpha}^{b_{k}}_{b_{k+1}} -& R^{b_{k}}_{c_{0}}(s(\bar{p}^{c_{0}}_{b_{k+1}}-\bar{p}^{c_{0}}_{b_{k}}) + \frac{1}{2}g^{c_{0}}\Delta t_{k}^{2} - R^{c_0}_{b_k} v^{b_k}_{b_{k}} \Delta t_{k})\\ \hat{\beta}^{b_{k}}_{b_{k+1}} -& R^{b_{k}}_{c_{0}}(R^{c_0}_{b_{k+1}} v^{b_{k+1}}_{b_{k+1}} + g^{c_{0}}\Delta t_{k} - R^{c_0}_{b_k} v^{b_k}_{b_{k}}) \end{bmatrix} \end{aligned}\]

令 $r(\hat{z}^{b_{k}}_{b_{k+1}}, X_I)=\mathbf{0}$,转换成 $Hx=b$ 的形式

\[\begin{bmatrix} -I\Delta t_{k} & 0 & \frac{1}{2}R^{b_{k}}_{c_{0}} \Delta t_{k}^{2} & R^{b_{k}}_{c_{0}}(\bar{p}^{c_{0}}_{c_{k+1}}-\bar{p}^{c_{0}}_{c_{k}}) \\ -I & R^{b_{k}}_{c_{0}} R^{c_0}_{b_{k+1}} & R^{b_{k}}_{c_{0}}\Delta t_{k} & 0 \end{bmatrix} \begin{bmatrix} v^{b_{k}}_{b_{k}}\\ v^{b_{k+1}}_{b_{k+1}}\\ g^{c_{0}}\\ s \end{bmatrix} = \begin{bmatrix} \alpha^{b_{k}}_{b_{k+1}} - p_c^b + R^{b_{k}}_{c_{0}} R^{c_0}_{b_{k+1}} p_c^b \\ \beta ^{b_{k}}_{b_{k+1}} \end{bmatrix}\]

通过 Cholesky 分解求解 $X_I$

\[H^T H X_I = H^T b\]

对应代码:LinearAlignment
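
把每对相邻帧按上式得到的 $H_k$、$b_k$ 叠加成正规方程后,用 Cholesky 分解求解的简化示意如下(矩阵向全局维度的拼装细节略去,函数名与变量名均为示意):

#include <vector>
#include <Eigen/Dense>

// 示意:叠加各帧的 H_k、b_k,求解 H^T H x = H^T b
// n_state = 3*(n+1) + 3 + 1,依次为各帧速度、重力向量和尺度
Eigen::VectorXd solve_linear_alignment(const std::vector<Eigen::MatrixXd> &H_blocks, // 已散布到全局维度
                                       const std::vector<Eigen::VectorXd> &b_blocks,
                                       int n_state)
{
    Eigen::MatrixXd A = Eigen::MatrixXd::Zero(n_state, n_state);
    Eigen::VectorXd b = Eigen::VectorXd::Zero(n_state);
    for (size_t k = 0; k < H_blocks.size(); ++k)
    {
        A += H_blocks[k].transpose() * H_blocks[k];
        b += H_blocks[k].transpose() * b_blocks[k];
    }
    return A.ldlt().solve(b);   // Cholesky(LDLT)求解正规方程
}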

优化重力

重力矢量的模长固定(约 9.8),只有 2 个自由度,因此在其切空间上对其参数化

\[\begin{aligned} \hat{g} &= \|g\| \cdot \bar{\hat{g}} + \omega_1 \vec{b_1} + \omega_2 \vec{b_2} \\ &= \|g\| \cdot \bar{\hat{g}} + B \vec{\omega} \end{aligned} , \quad B \in \mathbb{R}^{3 \times 2}, \vec{\omega} \in \mathbb{R}^{2 \times 1}\]

令 $\hat{g} = g^{c_{0}}$,将其代入上一小节公式得

\[\begin{bmatrix} -I\Delta t_{k} & 0 & \frac{1}{2}R^{b_{k}}_{c_{0}} \Delta t_{k}^{2} B & R^{b_{k}}_{c_{0}}(\bar{p}^{c_{0}}_{c_{k+1}}-\bar{p}^{c_{0}}_{c_{k}}) \\ -I & R^{b_{k}}_{c_{0}} R^{c_0}_{b_{k+1}} & R^{b_{k}}_{c_{0}}\Delta t_{k} B & 0 \end{bmatrix} \begin{bmatrix} v^{b_{k}}_{b_{k}}\\ v^{b_{k+1}}_{b_{k+1}}\\ \vec{\omega}\\ s \end{bmatrix} \\ = \begin{bmatrix} \alpha^{b_{k}}_{b_{k+1}} - p_c^b + R^{b_{k}}_{c_{0}} R^{c_0}_{b_{k+1}} p_c^b - \frac{1}{2}R^{b_{k}}_{c_{0}} \Delta t_{k}^{2} \|g\| \cdot \bar{\hat{g}}\\ \beta ^{b_{k}}_{b_{k+1}} - R^{b_{k}}_{c_{0}}\Delta t_{k} \|g\| \cdot \bar{\hat{g}} \end{bmatrix}\]

同样,通过 Cholesky 分解求得 $g^{c_{0}}$,即相机 $c_0$ 系下的重力向量。

最后,通过将 $g^{c_{0}}$ 旋转至惯性坐标系(世界系)中的 z 轴方向[0,0,1],可以计算第一帧相机系到惯性系的旋转矩阵 $q_{c_0}^w$,这样就可以将所有变量调整至惯性世界系(水平坐标系,z轴与重力方向对齐)中。

对应代码:RefineGravity
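
其中切空间基 $B=[\vec{b_1}\;\vec{b_2}]$ 的一种构造方式如下(与代码中 TangentBasis 的思路一致,此处为示意):

#include <Eigen/Dense>

// 示意:给定当前重力估计 g0,构造其切空间的一组正交基 b1、b2(即 3x2 矩阵 B 的两列)
Eigen::Matrix<double, 3, 2> tangent_basis(const Eigen::Vector3d &g0)
{
    Eigen::Vector3d a = g0.normalized();
    Eigen::Vector3d tmp(0, 0, 1);
    if (a == tmp)
        tmp << 1, 0, 0;                                       // 避免与 g0 平行

    Eigen::Vector3d b1 = (tmp - a * a.dot(tmp)).normalized(); // 去掉 tmp 中沿 a 的分量
    Eigen::Vector3d b2 = a.cross(b1);

    Eigen::Matrix<double, 3, 2> B;
    B.col(0) = b1;
    B.col(1) = b2;
    return B;
}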

3. 后端优化(紧耦合)

VIO 紧耦合方案的主要思路就是通过将基于视觉构造的残差项和基于IMU构造的残差项放在一起构造成一个联合优化的问题,整个优化问题的最优解即可认为是比较准确的状态估计。

为了限制优化变量的数目,VINS-Mono 采用了滑动窗口的形式,滑动窗口 中的 全状态量

\[\begin{aligned} X &= [x_{0},x_{1},\cdots ,x_{n},x^{b}_{c},{\lambda}_{0},{\lambda}_{1}, \cdots ,{\lambda}_{m}] \\ x_{k} &= [p^{w}_{b_{k}},v^{w}_{b_{k}},q^{w}_{b_{k}},b_{a},b_{g}],\quad k\in[0,n] \\ x^{b}_{c} &= [p^{b}_{c},q^{b}_{c}] \end{aligned}\]
  • 滑动窗口内 n+1 个所有相机的状态(包括位置、朝向、速度、加速度计 bias 和陀螺仪 bias)
  • Camera 到 IMU 的外参
  • m+1 个 3D 点的逆深度

优化过程中的 误差状态量

\[\begin{aligned} \delta X&=[\delta x_{0},\delta x_{1},\cdots ,\delta x_{n},\delta x^{b}_{c},\delta \lambda_{0},\delta \lambda _{1}, \cdots , \delta \lambda_{m}] \\ \delta x_{k}&=[\delta p^{w}_{b_{k}},\delta v^{w}_{b_{k}},\delta \theta ^{w}_{b_{k}},\delta b_{a},\delta b_{g}],\quad k\in[0,n] \\ \delta x^{b}_{c}&= [\delta p^{b}_{c},\delta q^{b}_{c}] \end{aligned}\]

进而得到系统优化的代价函数(Minimize residuals from all sensors)

\[\underset{X}{min} \begin{Bmatrix} \left \| r_{p}-H_{p}X \right \|^{2} + \sum_{k\in B}^{ } \left \| r_{B}(\hat{z}^{b_{k}}_{b_{k+1}},X) \right \|^{2}_{P^{b_{k}}_{b_{k+1}}} + \sum_{(i,j)\in C}^{ } \left \| r_{C}(\hat{z}^{c_{j}}_{l},X) \right \|^{2}_{P^{c_{j}}_{l}} \end{Bmatrix}\]

其中三个残差项依次是

  • 边缘化的先验信息
  • IMU测量残差
  • 视觉的观测残差

三种残差都是用 马氏距离(与量纲无关) 来表示的。
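
对应到 ceres 中,三类残差依次以 residual block 的形式加入优化问题。下面是与 Estimator::optimization 思路一致的简化示意(省略了参数块的添加和各种判断,变量名沿用代码中的命名,仅作示例):

// 示意:向 ceres 优化问题中加入边缘化先验、IMU 残差和视觉重投影残差
ceres::Problem problem;
ceres::LossFunction *loss_function = new ceres::CauchyLoss(1.0);

// 1. 边缘化先验(上一次边缘化得到的先验因子,假设已构造好)
problem.AddResidualBlock(marginalization_factor, NULL, last_marginalization_parameter_blocks);

// 2. IMU 预积分残差:约束相邻两帧的位姿、速度和零偏
for (int i = 0; i < WINDOW_SIZE; i++)
{
    int j = i + 1;
    IMUFactor *imu_factor = new IMUFactor(pre_integrations[j]);
    problem.AddResidualBlock(imu_factor, NULL,
                             para_Pose[i], para_SpeedBias[i],
                             para_Pose[j], para_SpeedBias[j]);
}

// 3. 视觉重投影残差:约束共视两帧的位姿、外参和特征逆深度
ProjectionFactor *f = new ProjectionFactor(pts_i, pts_j);
problem.AddResidualBlock(f, loss_function,
                         para_Pose[imu_i], para_Pose[imu_j],
                         para_Ex_Pose[0], para_Feature[feature_index]);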

Motion-only visual-inertial bundle adjustment: Optimize position, velocity, rotation in a smaller windows, assuming all other quantities are fixed

3.1 IMU 测量残差

(1)IMU 测量残差

由上面的IMU预积分,用状态估计值与预积分测量值之差(估计值 - 测量值)构造IMU测量残差

\[\begin{aligned} r_{B}(\hat{z}^{b_{k}}_{b_{k+1}},X)= \begin{bmatrix} \delta \alpha ^{b_{k}}_{b_{k+1}}\\ \delta \theta ^{b_{k}}_{b_{k+1}}\\ \delta \beta ^{b_{k}}_{b_{k+1}}\\ \delta b_{a}\\ \delta b_{g} \end{bmatrix} &=\begin{bmatrix} R^{b_{k}}_{w}(p^{w}_{b_{k+1}}-p_{b_{k}}^{w}+\frac{1}{2}g^{w}\Delta t^{2}-v_{b_{k}}^{w}\Delta t)-\hat{\alpha }^{b_{k}}_{b_{k+1}}\\ 2{[{q_{b_{k}}^{w}}^{-1} \otimes q^{w}_{b_{k+1}} \otimes (\hat{\gamma }^{b_{k}}_{b_{k+1}})^{-1}]}_{xyz} \\ R^{b_{k}}_{w}(v^{w}_{b_{k+1}}+g^{w}\Delta t-v_{b_{k}}^{w})-\hat{\beta }^{b_{k}}_{b_{k+1}} \\ b_{ab_{k+1}}-b_{ab_{k}}\\ b_{gb_{k+1}}-b_{gb_{k}} \end{bmatrix} \end{aligned}\]

其中 $[\hat{\alpha}^{b_{k}}_{b_{k+1}}, \hat{\gamma}^{b_{k}}_{b_{k+1}}, \hat{\beta}^{b_{k}}_{b_{k+1}}]$ 为经 Bias 修正后的 IMU 预积分测量值

/**
 * [evaluate 计算IMU测量模型的残差]
 * @param Pi,Qi,Vi,Bai,Bgi  [第 i 帧的位置、姿态、速度和零偏状态]
 * @param Pj,Qj,Vj,Baj,Bgj  [第 j 帧的位置、姿态、速度和零偏状态]
 */
Eigen::Matrix<double, 15, 1> evaluate(
        const Eigen::Vector3d &Pi, const Eigen::Quaterniond &Qi, const Eigen::Vector3d &Vi, const Eigen::Vector3d &Bai, const Eigen::Vector3d &Bgi,
        const Eigen::Vector3d &Pj, const Eigen::Quaterniond &Qj, const Eigen::Vector3d &Vj, const Eigen::Vector3d &Baj, const Eigen::Vector3d &Bgj)
{
    Eigen::Matrix<double, 15, 1> residuals;

    Eigen::Matrix3d dp_dba = jacobian.block<3, 3>(O_P, O_BA);
    Eigen::Matrix3d dp_dbg = jacobian.block<3, 3>(O_P, O_BG);

    Eigen::Matrix3d dq_dbg = jacobian.block<3, 3>(O_R, O_BG);

    Eigen::Matrix3d dv_dba = jacobian.block<3, 3>(O_V, O_BA);
    Eigen::Matrix3d dv_dbg = jacobian.block<3, 3>(O_V, O_BG);

    Eigen::Vector3d dba = Bai - linearized_ba;
    Eigen::Vector3d dbg = Bgi - linearized_bg;

    // IMU预积分的结果,消除掉acc bias和gyro bias的影响, 对应IMU model中的\hat{\alpha},\hat{\beta},\hat{\gamma}
    Eigen::Quaterniond corrected_delta_q = delta_q * Utility::deltaQ(dq_dbg * dbg);
    Eigen::Vector3d    corrected_delta_v = delta_v + dv_dba * dba + dv_dbg * dbg;
    Eigen::Vector3d    corrected_delta_p = delta_p + dp_dba * dba + dp_dbg * dbg;

    // IMU项residual计算,输入参数是状态的估计值, 上面correct_delta_*是预积分值, 二者求'diff'得到residual
    residuals.block<3, 1>(O_P, 0)  = Qi.inverse() * (0.5 * G * sum_dt * sum_dt + Pj - Pi - Vi * sum_dt) - corrected_delta_p;
    residuals.block<3, 1>(O_R, 0)  = 2 * (corrected_delta_q.inverse() * (Qi.inverse() * Qj)).vec();
    residuals.block<3, 1>(O_V, 0)  = Qi.inverse() * (G * sum_dt + Vj - Vi) - corrected_delta_v;
    residuals.block<3, 1>(O_BA, 0) = Baj - Bai;
    residuals.block<3, 1>(O_BG, 0) = Bgj - Bgi;

    return residuals;
}

(2)协方差矩阵

此处用到的协方差矩阵为前面IMU预积分计算出的协方差矩阵。

残差的后处理对应代码:

// 在优化迭代的过程中, 预积分值是不变的, 输入的状态值会被不断的更新, 然后不断的调用evaluate()计算更新后的IMU残差
Eigen::Map<Eigen::Matrix<double, 15, 1>> residual(residuals);
residual = pre_integration->evaluate(Pi, Qi, Vi, Bai, Bgi,
                                     Pj, Qj, Vj, Baj, Bgj);

Eigen::Matrix<double, 15, 15> sqrt_info =
        Eigen::LLT<Eigen::Matrix<double, 15, 15>>(pre_integration->covariance.inverse()).matrixL().transpose();
//sqrt_info.setIdentity();

residual = sqrt_info * residual; // 为了保证 IMU 和 视觉残差项在尺度上保持一致,一般会采用与量纲无关的马氏距离

这里残差 residual 乘以 sqrt_info,这是因为真正的优化项其实是 Mahalanobis 距离: $d = r^T P^{-1} r$,其中 $P$ 是协方差。Mahalanobis距离 其实相当于一个残差加权,协方差大的加权小,协方差小的加权大,着重优化那些比较确定的残差。

而 ceres只接受最小二乘优化,也就是 $\min e^T e$,所以把 $P^{-1}$ 做 LLT分解,即 $LL^T=P^{−1}$,则 $d = r^T (L L^T) r = (L^T r)^T (L^T r)$,令 $r’ = (L^T r)$,作为新的优化误差,所以 sqrt_info 等于 $L^T$。

(3)雅克比矩阵

高斯牛顿迭代优化过程中会用到IMU测量残差对状态量的雅克比矩阵,但此处我们是 对误差状态量求偏导,下面对四部分误差状态量求取雅克比矩阵。

对 $[\delta p^{w}_{b_{k}},\delta \theta ^{w}_{b_{k}}]$ 求偏导得

\[J[0]=\begin{bmatrix} -q^{b_{k}}_{w} & R^{b_{k}}_{w}[(p^{w}_{b_{k+1}}-p_{b_{k}}^{w}+\frac{1}{2}g^{w}\Delta t^{2}-v_{b_{k}}^{w}\Delta t)]_{\times }\\ 0 & [q_{b_{k+1}}^{w^{-1}}q^{w}_{b_{k}}]_{L}[\hat{\gamma }^{b_{k}}_{b_{k+1}}]_{R}J^{\gamma}_{b_{w}}\\ 0 & R^{b_{k}}_{w}[(v^{w}_{b_{k+1}}+g^{w}\Delta t-v_{b_{k}}^{w})]_{\times } \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{15 \times 7}\]

对 $[\delta v^{w}_{b_{k}},\delta b_{ab_{k}},\delta b_{wb_{k}}]$ 求偏导得

\[J[1]= \begin{bmatrix} -q^{b_{k}}_{w}\Delta t & -J^{\alpha }_{b_{a}} & -J^{\alpha }_{b_{w}}\\ 0 & 0 & -[q_{b_{k+1}}^{w^{-1}}\otimes q^{w}_{b_{k}}\otimes \hat{\gamma }^{b_{k}}_{b_{k+1}}]_{L}J^{\gamma}_{b_{w}}\\ -q^{b_{k}}_{w} & -J^{\beta }_{b_{a}} & -J^{\beta }_{b_{w}}\\ 0& -I &0 \\ 0 &0 &-I \end{bmatrix} \in \mathbb{R}^{15 \times 9}\]

对 $[\delta p^{w}_{b_{k+1}},\delta \theta ^{w}_{b_{k+1}}]$ 求偏导得

\[J[2]= \begin{bmatrix} -q^{b_{k}}_{w} &0\\ 0 & [\hat{\gamma }^{b_{k}^{-1}}_{b_{k+1}}\otimes q_{w}^{b_{k}}\otimes q_{b_{k+1}}^{w}]_{L} \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{15 \times 7}\]

对 $[\delta v^{w}_{b_{k+1}},\delta b_{ab_{k+1}},\delta b_{wb_{k+1}}]$ 求偏导得

\[J[3]= \begin{bmatrix} 0 &0 & 0\\ 0 & 0 &0 \\ q^{b_{k}}_{w} & 0 & 0\\ 0& I &0 \\ 0 &0 &I \end{bmatrix} \in \mathbb{R}^{15 \times 9}\]

雅克比矩阵计算的对应代码在 class IMUFactor : public ceres::SizedCostFunction<15, 7, 9, 7, 9> 中的 Evaluate() 函数中。

3.2 视觉(td) 测量残差

视觉测量残差 即 特征点的重投影误差,视觉残差和雅克比矩阵计算的对应代码在 ProjectionFactor::Evaluate 函数中。

(1)切平面重投影误差(Spherical camera model)

\[r_{C}(\hat{z}_{l}^{c_{j}},X)=[b_{1},b_{2}]^{T}\cdot (\bar{P}_{l}^{c_{j}}-\frac{P_{l}^{c_{j}}}{\left \| P_{l}^{c_{j}} \right \|})\]

其中,

\[P_{l}^{c_{j}}=q_{b}^{c}(q_{w}^{b_{j}}(q_{b_{i}}^{w}(q_{c}^{b} \frac{\bar{P}_{l}^{c_{i}}}{\lambda _{l}}+p_{c}^{b})+p_{b_{i}}^{w}-p_{b_{j}}^{w})-p_{c}^{b})\]
// 将第i frame下的3D点转到第j frame坐标系下
Eigen::Vector3d pts_camera_i = pts_i / inv_dep_i;                 // pt in ith camera frame, 归一化平面
Eigen::Vector3d pts_imu_i    = qic * pts_camera_i + tic;          // pt in ith body frame
Eigen::Vector3d pts_w        = Qi * pts_imu_i + Pi;               // pt in world frame
Eigen::Vector3d pts_imu_j    = Qj.inverse() * (pts_w - Pj);       // pt in jth body frame
Eigen::Vector3d pts_camera_j = qic.inverse() * (pts_imu_j - tic); // pt in jth camera frame

(2)像素重投影误差(Pinhole camera model)

\[r_{C}(\hat{z}_{l}^{c_{j}},X) = ( \frac{f}{1.5} \cdot I_{2 \times 2} ) \cdot \left(\frac{\bar{P}_{l}^{c_{j}}}{\bar{Z}_{j}}-\frac{P_{l}^{c_{j}}}{Z_{j}}\right)_{xy}\]
Eigen::Map<Eigen::Vector2d> residual(residuals);
#ifdef UNIT_SPHERE_ERROR
// 把归一化平面上的重投影误差投影到Unit sphere上的好处就是可以支持所有类型的相机 why
// 求取切平面上的误差
residual =  tangent_base * (pts_camera_j.normalized() - pts_j.normalized());
#else
// 求取归一化平面上的误差
double dep_j = pts_camera_j.z();
residual = (pts_camera_j / dep_j).head<2>() - pts_j.head<2>();
#endif
residual = sqrt_info * residual; // 转成 与量纲无关的马氏距离

(3)协方差矩阵

固定的协方差矩阵,归一化平面的标准差为 $\frac{1.5}{f}$,即像素标准差为 $1.5$

ProjectionFactor::sqrt_info   = FOCAL_LENGTH / 1.5 * Matrix2d::Identity();

(4)雅克比矩阵

下面将相机测量残差对各误差状态量求偏导,得到高斯牛顿迭代优化过程中的雅克比矩阵。

对 $[\delta p^{w}_{b_{i}},\delta \theta ^{w}_{b_{i}}]$ 求偏导

\[J[0]=\begin{bmatrix} q_{b}^{c}q_{w}^{b_{j}} & -q_{b}^{c}q_{w}^{b_{j}}q_{b_{i}}^{w}[q_{c}^{b} \frac{\bar{P}_{l}^{c_{i}}}{\lambda_{l}}+p_{c}^{b}]_{\times } \end{bmatrix} \in \mathbb{R}^{3 \times 6}\]

对 $[\delta p^{w}_{b_{j}},\delta \theta ^{w}_{b_{j}}]$ 求偏导

\[J[1]=\begin{bmatrix} -q_{b}^{c}q_{w}^{b_{j}} & q_{b}^{c}q_{w}^{b_{j}}[q_{b_{i}}^{w}(q_{c}^{b} \frac{\bar{P}_{l}^{c_{i}}}{\lambda _{l}}+p_{c}^{b})+p_{b_{i}}^{w}-p_{b_{j}}^{w}]_{\times } \end{bmatrix} \in \mathbb{R}^{3 \times 6}\]

对 $[\delta p^{b}_{c},\delta \theta ^{b}_{c}]$ 求偏导

\[J[2]= \begin{bmatrix} q_{b}^{c}(q_{w}^{b_{j}}q_{bi}^{w}-I_{3*3}) & -q_{b}^{c}q_{w}^{b_{j}}q_{b_{i}}^{w}q_{c}^{b}[\frac{\bar{P}_{l}^{c_{i}}}{\lambda_{l}}]_{\times }+[q_{b}^{c}(q_{w}^{b_{j}}(q_{b_{i}}^{w}p_{c}^{b}+p_{b_{i}}^{w}-p_{b_{j}}^{w})-p_{c}^{b})] \end{bmatrix} \in \mathbb{R}^{3 \times 6}\]

对 $\delta \lambda_{l}$ 求偏导

\[J[3]=-q_{b}^{c}q_{w}^{b_{j}}q_{b_{i}}^{w}q_{c}^{b} \frac{\bar{P}_{l}^{c_{i}}}{\lambda_{l}^{2}} \in \mathbb{R}^{3 \times 1}\]

(5)Vision measurement residual for temporal calibration

视觉残差和雅克比矩阵计算的对应代码在 ProjectionTdFactor::Evaluate 函数中。

// TR / ROW * row_i 是相机 rolling 到这一行时所用的时间
Eigen::Vector3d pts_i_td, pts_j_td;
pts_i_td = pts_i - (td - td_i + TR / ROW * row_i) * velocity_i;
pts_j_td = pts_j - (td - td_j + TR / ROW * row_j) * velocity_j;

Eigen::Vector3d pts_camera_i = pts_i_td / inv_dep_i;
Eigen::Vector3d pts_imu_i    = qic * pts_camera_i + tic;
Eigen::Vector3d pts_w        = Qi * pts_imu_i + Pi;
Eigen::Vector3d pts_imu_j    = Qj.inverse() * (pts_w - Pj);
Eigen::Vector3d pts_camera_j = qic.inverse() * (pts_imu_j - tic);

Eigen::Map<Eigen::Vector2d> residual(residuals);
#ifdef UNIT_SPHERE_ERROR
residual =  tangent_base * (pts_camera_j.normalized() - pts_j_td.normalized());
#else
double dep_j = pts_camera_j.z();
residual = (pts_camera_j / dep_j).head<2>() - pts_j_td.head<2>();
#endif
residual = sqrt_info * residual;
  • 添加对 imu-camera 时间戳不完全同步和 Rolling Shutter 相机的支持:通过前端光流计算得到每个角点在归一化平面上的速度,根据 imu-camera 时间戳的同步误差和 Rolling Shutter 相机做一次 rolling 的时间,对角点的归一化坐标进行调整

3.3 Temporal Calibration

Timestamps

Time Synchronization

Temporal Calibration

  • calibrate the fixed latency $t_d$ occurred during time stamping
  • change the IMU pre-integration interval to the interval between two image timestamps
    • linear incorporation of IMU measurements to obtain the IMU reading at image time stamping
    • estimates states(position, orientation, etc.) at image time stamping

3.4 边缘化(Marginalization)

SLAM is tracking a normal distribution through a large state space

滑窗(Sliding Window) 限制了关键帧的数量,防止 pose 和 feature 的个数随时间不断增加,使得优化问题始终保持在有限的复杂度内,不会随时间不断增长。

Marginalization

然而,将 pose 移出滑窗时,有些约束会被丢弃掉,这样势必会导致求解精度下降,而且当MAV进行一些退化运动(如:匀速运动)时,没有历史信息做约束的话是无法求解的。所以,在移出位姿或特征的时候,需要将相关联的约束转变成一个约束项作为prior放到优化问题中,这就是marginalization要做的事情。

边缘化的过程就是将滑窗内的某些较旧或者不满足要求的视觉帧剔除的过程,所以边缘化也被描述为 将联合概率分布分解为边缘概率分布和条件概率分布的过程(就是利用shur补减少优化参数的过程)。

直接进行边缘化而不加入先验条件的后果:

  • 直接移除这些pose和feature会丢弃帧间约束,降低优化器的精度,所以在移除pose和feature的时候需要将相关联的约束转变为一个先验的约束条件作为prior放到优化问题中

  • 在边缘化的过程中,不加先验的边缘化会导致系统尺度的缺失(参考[6]),尤其是系统在进行退化运动时(如无人机的悬停和恒速运动)。一般来说 只有两个轴向的加速度不为0的时候,才能保证尺度可观,而退化运动对于无人机或者机器人来说是不可避免的。所以在系统处于退化运动的时候,要加入先验信息保证尺度的可观性

VINS-Mono中为了处理一些悬停的case,引入了一个two-way marginalization:

  • MARGIN_OLD:如果次新帧是关键帧,则丢弃滑动窗口内最老的图像帧,同时对与该图像帧关联的约束项进行边缘化处理。这里需要注意的是,如果该关键帧是观察到某个地图点的第一帧,则需要把该地图点的深度转移到后面的图像帧中去。

  • MARGIN_NEW:如果次新帧不是关键帧,则丢弃当前帧的前一帧。因为判定当前帧不是关键帧的条件就是当前帧与前一帧视差很小,也就是说当前帧和前一帧很相似,这种情况下直接丢弃前一帧,然后用当前帧代替前一帧。为什么这里可以不对前一帧进行边缘化,而是直接丢弃,原因就是当前帧和前一帧很相似,因此当前帧与地图点之间的约束和前一帧与地图点之间的约束是很接近的,直接丢弃并不会造成整个约束关系丢失信息。这里需要注意的是,要把当前帧和前一帧之间的 IMU 预积分转换为当前帧和前二帧之间的 IMU 预积分。

在悬停等运动较小的情况下,会频繁的MARGIN_NEW,这样也就保留了那些比较旧但是视差比较大的pose。这种情况如果一直MARGIN_OLD的话,视觉约束不够强,状态估计会受IMU积分误差影响,具有较大的累积误差。

Schur Complement

  • Marginalization via Schur complement on information matrix
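
以信息矩阵的分块形式表示,设 $x_m$ 为待边缘化的状态、$x_r$ 为保留的状态,利用 Schur 补消去 $x_m$ 后即得到作用在 $x_r$ 上的先验(标准的 Schur 补形式):

\[\begin{bmatrix} H_{mm} & H_{mr} \\ H_{rm} & H_{rr} \end{bmatrix} \begin{bmatrix} \delta x_{m} \\ \delta x_{r} \end{bmatrix} = \begin{bmatrix} b_{m} \\ b_{r} \end{bmatrix} \;\Longrightarrow\; \left( H_{rr} - H_{rm} H_{mm}^{-1} H_{mr} \right) \delta x_{r} = b_{r} - H_{rm} H_{mm}^{-1} b_{m}\]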

First Estimate Jacobian

4. 重定位

4.1 Loop Detection

VINS-Mono利用 词袋 DBoW2 做Keyframe Database的构建和查询。在建立闭环检测的数据库时,关键帧的Features包括两部分:VIO部分的200个强角点 和 500个Fast角点,然后描述子使用 BRIEF(因为旋转可观,匹配过程中对旋转有一定的适应性,所以不需要使用ORB)。

  • Describe features by BRIEF
    • Features that we use in the VIO (200, not enough for loop detection)
    • Extract new FAST features (500, only use for loop detection)
  • Query Bag-of-Word (DBoW2)
    • Return loop candidates

4.2 Feature Retrieval

在闭环检测成功之后,会得到回环候选帧,所以要在已知位姿的回环候选帧和滑窗内的匹配帧通过 BRIEF描述子匹配,然后把回环帧加入到滑窗的优化当中,这时整个滑窗的状态量的维度是不发生变化的,因为回环帧的位姿是固定的。

  • Try to retrieve matches for features (200) that are used in the VIO
  • BRIEF descriptor match
  • Geometric check
    • 2D-2D: fundamental matrix test with RANSAC
    • 3D-3D: PnP test with RANSAC
    • At least 30 inliers

4.3 Tightly-Coupled Relocalization

5. 全局位姿图优化

因为之前做的非线性优化本质只是在一个滑窗之内求解出了相机的位姿,而且在回环检测部分,利用固定位姿的回环帧只是纠正了滑窗内的相机位姿,并没有修正其他位姿(或者说没有将回环发现的误差分配到整个相机的轨迹上),缺少全局的一致性,所以要做一次全局的Pose Graph。全局的Pose Graph较之滑窗有一定的迟滞性,只有相机的Pose滑出滑窗的时候,Pose才会被加到全局的Pose Graph当中。

(1) Adding Keyframes into the Pose Graph

  • Sequential edges from VIO
    • Connected with 4 previous keyframes
  • Loop closure edges
    • Only added when a keyframe is marginalized out from the sliding window VIO
    • Multi-constraint relocalization helps eliminate false loop closures
    • Huber norm for rejection of wrong loops

(2) 4-DOF Pose Graph Optimization

  • Roll and pitch are observable from VIO
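
由于 roll、pitch 可观,位姿图中每条边只约束相对平移和相对 yaw,论文中 4-DOF 边的残差大致为(其中 $\hat{\phi}_i$、$\hat{\theta}_i$ 为 VIO 估计并固定的 roll、pitch):

\[r_{i,j} = \begin{bmatrix} R(\hat{\phi}_{i}, \hat{\theta}_{i}, \psi_{i})^{-1} (p^{w}_{j} - p^{w}_{i}) - \hat{p}^{i}_{ij} \\ \psi_{j} - \psi_{i} - \hat{\psi}_{ij} \end{bmatrix}\]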

(3) Pose Graph Management

(4) Map Reuse

  • Save map at any time
  • Load map and re-localize with respect to it
  • Pose graph merging

6. Remarks on Monocular Visual-Inertial SLAM

  • Important factors
    • Access to raw camera data (especially for rolling shutter cameras)
    • Sensor synchronization and timestamps
    • Camera-IMU rotation
    • Estimator initialization
  • Not-so-important factors
    • Camera-IMU translation
    • Types of features (we use the simplest corner+KLT)
    • Quality of feature tracking (outlier is acceptable)
  • Failures – need more engineering treatment
    • Long range scenes (aerial vehicles)
    • Constant velocity (ground vehicle)
    • Pure rotation (augmented reality)
  • Be aware of computational power requirement

参考文献

  • [1] VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator
  • [2] Shaojie Shen, Monocular Visual-Inertial SLAM slides, 2018
  • [3] Quaternion kinematics for the error-state Kalman filter
  • [4] Xiaobuyi, VINS-Mono代码分析总结

Ubuntu 16.04 下 VINS-Mono 的安装和使用(RealSense ZR300)

[TOC]

Overview

本文介绍在 Ubuntu 16.04(ROS Kinetic)的PC平台上使用 RealSense ZR300 的 fisheye camera (FOV: 100x133) + IMU 运行 我对VINS-Mono的改版 cggos/vins_mono_cg

安装&运行 ZR300驱动

RealSense ZR300 的加速度计和陀螺仪的时间戳不是完全相同的(加速度计的频率为250Hz而陀螺仪的频率为200Hz),但是由于RealSense ZR300的时间戳是在硬件上打的,而不是操作系统接收到图像和IMU数据时打的时间戳,所以可以通过 插值 的方式使它们的时间戳对齐。

这里使用 maplab 推荐的驱动:ethz-asl/maplab_realsense,其对IMU的陀螺仪、加速度计、图像的时间戳做了对齐处理。

  • 安装 librealsense

    sudo apt install ros-${ROS_DISTRO}-librealsense
    
  • 安装 maplab_realsense

    mkdir -p ws_realsense/src
    cd ws_realsense/src
    git clone https://github.com/ethz-asl/maplab_realsense.git
    wstool init
    wstool merge maplab_realsense/dependencies.rosinstall
    wstool update -j4
    cd ../
    catkin build
    
  • 运行 maplab_realsense

    source devel/setup.bash
    roslaunch maplab_realsense maplab_realsense.launch
    

    正常运行界面如下:

    注意:若运行出错,可尝试多运行几次,或者尝试更换 USB 口。

    其 Topic list:

    /rosout
    /rosout_agg
    /tf_static
    /zr300_node/color/camera_info
    /zr300_node/color/image_raw
    /zr300_node/color/image_raw/compressed
    /zr300_node/color/image_raw/compressed/parameter_descriptions
    /zr300_node/color/image_raw/compressed/parameter_updates
    /zr300_node/color/image_raw/compressedDepth
    /zr300_node/color/image_raw/compressedDepth/parameter_descriptions
    /zr300_node/color/image_raw/compressedDepth/parameter_updates
    /zr300_node/color/image_raw/theora
    /zr300_node/color/image_raw/theora/parameter_descriptions
    /zr300_node/color/image_raw/theora/parameter_updates
    /zr300_node/depth/camera_info
    /zr300_node/device_time
    /zr300_node/device_time/parameter_descriptions
    /zr300_node/device_time/parameter_updates
    /zr300_node/fisheye/camera_info
    /zr300_node/fisheye/image_raw
    /zr300_node/fisheye/image_raw/compressed
    /zr300_node/fisheye/image_raw/compressed/parameter_descriptions
    /zr300_node/fisheye/image_raw/compressed/parameter_updates
    /zr300_node/fisheye/image_raw/compressedDepth
    /zr300_node/fisheye/image_raw/compressedDepth/parameter_descriptions
    /zr300_node/fisheye/image_raw/compressedDepth/parameter_updates
    /zr300_node/fisheye/image_raw/theora
    /zr300_node/fisheye/image_raw/theora/parameter_descriptions
    /zr300_node/fisheye/image_raw/theora/parameter_updates
    /zr300_node/imu
    /zr300_node/ir_1/camera_info
    /zr300_node/ir_1/image_raw
    /zr300_node/ir_1/image_raw/compressed
    /zr300_node/ir_1/image_raw/compressed/parameter_descriptions
    /zr300_node/ir_1/image_raw/compressed/parameter_updates
    /zr300_node/ir_1/image_raw/compressedDepth
    /zr300_node/ir_1/image_raw/compressedDepth/parameter_descriptions
    /zr300_node/ir_1/image_raw/compressedDepth/parameter_updates
    /zr300_node/ir_1/image_raw/theora
    /zr300_node/ir_1/image_raw/theora/parameter_descriptions
    /zr300_node/ir_1/image_raw/theora/parameter_updates
    /zr300_node/ir_2/camera_info
    /zr300_node/ir_2/image_raw
    /zr300_node/ir_2/image_raw/compressed
    /zr300_node/ir_2/image_raw/compressed/parameter_descriptions
    /zr300_node/ir_2/image_raw/compressed/parameter_updates
    /zr300_node/ir_2/image_raw/compressedDepth
    /zr300_node/ir_2/image_raw/compressedDepth/parameter_descriptions
    /zr300_node/ir_2/image_raw/compressedDepth/parameter_updates
    /zr300_node/ir_2/image_raw/theora
    /zr300_node/ir_2/image_raw/theora/parameter_descriptions
    /zr300_node/ir_2/image_raw/theora/parameter_updates
    /zr300_node/pointcloud
    

标定 ZR300

使用 kalibr 对 ZR300 进行标定,标定方法可参考 Kalibr 之 Camera-IMU 标定 (总结)

安装&运行 VINS-Mono

这里使用 我对VINS-Mono的改版 cggos/vins_mono_cg,配置文件或者标定参数均使用VINS-Mono默认文件。

  • 安装 Dependencies
    • eigen3
    • ceres-solver
  • 安装 vins_mono_cg

    mkdir -p ws_vins/src
    cd ws_vins/src
    git clone https://github.com/cggos/vins_mono_cg.git
    cd ../
    catkin_make -j3
    
  • 修改 config/realsense/realsense_fisheye_config.yaml

    imu_topic:   "/zr300_node/imu"
    image_topic: "/zr300_node/fisheye/image_raw"
    output_path: "./output/"
    
  • 运行 vins_mono_cg

    roslaunch vins_estimator realsense_fisheye.launch
    


计算机视觉对极几何之Triangulate

[TOC]

极线搜索

If we are using only the left camera, we can’t find the 3D point corresponding to the point $x$ in image because every point on the line $OX$ projects to the same point on the image plane. But consider the right image also. Now different points on the line $OX$ projects to different points ( $x^{\prime}$) in right plane. So with these two images, we can triangulate the correct 3D point.

The projection of the different points on $OX$ form a line on right plane (line $l^{\prime}$). We call it epiline corresponding to the point $x$. It means, to find the point $x$ on the right image, search along this epiline. It should be somewhere on this line (Think of it this way, to find the matching point in other image, you need not search the whole image, just search along the epiline. So it provides better performance and accuracy). This is called Epipolar Constraint. Similarly all points will have its corresponding epilines in the other image.

  • epiline: $l$, $l^{\prime}$
  • epipole: $e$, $e^{\prime}$
  • epipolar plane: $XOO^{\prime}$

求解空间点坐标

Triangulate in ORB-SLAM2

已知,两个视图下匹配点的 图像坐标 $\boldsymbol{p}_1$ 和 $\boldsymbol{p}_2$,两个相机的相对位姿 $\boldsymbol{T}$ ,相机内参矩阵 $\boldsymbol{K}$,求 对应的三维点坐标 $\boldsymbol{P}$,其齐次坐标为 $\tilde{\boldsymbol{P}}$。

两个视图的 投影矩阵 分别为

\[\boldsymbol{P}_1 = \boldsymbol{K} \cdot [\boldsymbol{I}_{3 \times 3} \quad \mathbf{0}_{3 \times 1}], \quad \boldsymbol{P}_1 \in \mathbb{R}^{3 \times 4} \\ \boldsymbol{P}_2 = \boldsymbol{K} \cdot [ \boldsymbol{R}_{3 \times 3} \quad \boldsymbol{t}_{3 \times 1} ], \quad \boldsymbol{P}_2 \in \mathbb{R}^{3 \times 4}\] \[\boldsymbol{T} = \boldsymbol{T}_{21} = [ \boldsymbol{R} \quad \boldsymbol{t} ] \in \mathbb{R}^{3 \times 4}\]

由于是齐次坐标的表示形式,使用叉乘消去齐次因子

\[\tilde{\boldsymbol{p}_1} \times (\boldsymbol{P}_1 \tilde{\boldsymbol{P}}) = \mathbf{0} \\ \tilde{\boldsymbol{p}_2} \times (\boldsymbol{P}_2 \tilde{\boldsymbol{P}}) = \mathbf{0}\]

把 $\boldsymbol{P}_1$ 和 $\boldsymbol{P}_2$ 按行展开(上标代表行索引)代入,对于第一视图有

\[\begin{bmatrix} 0 & -1 & v_1 \\ 1 & 0 & -u_1 \\ -v_1 & u_1 & 0 \end{bmatrix} \cdot \begin{bmatrix} \boldsymbol{P}_1^1 \cdot \tilde{\boldsymbol{P}} \\ \boldsymbol{P}_1^2 \cdot \tilde{\boldsymbol{P}} \\ \boldsymbol{P}_1^3 \cdot \tilde{\boldsymbol{P}} \end{bmatrix} = \mathbf{0}\]

\[u_1 (\boldsymbol{P}_1^3 \cdot \tilde{\boldsymbol{P}}) - (\boldsymbol{P}_1^1 \cdot \tilde{\boldsymbol{P}}) = 0 \\ v_1 (\boldsymbol{P}_1^3 \cdot \tilde{\boldsymbol{P}}) - (\boldsymbol{P}_1^2 \cdot \tilde{\boldsymbol{P}}) = 0 \\ u_1 (\boldsymbol{P}_1^2 \cdot \tilde{\boldsymbol{P}}) - v_1 (\boldsymbol{P}_1^1 \cdot \tilde{\boldsymbol{P}}) = 0 \\\]

可见第三个式子可以由上两个式子线性表示,所以只需要取前两个式子即可

\[\begin{bmatrix} u_1 \boldsymbol{P}_1^3 - \boldsymbol{P}_1^1 \\ v_1 \boldsymbol{P}_1^3 - \boldsymbol{P}_1^2 \end{bmatrix} \cdot \tilde{\boldsymbol{P}} = \mathbf{0}\]

同样的,对于第二视图

\[\begin{bmatrix} u_2 \boldsymbol{P}_2^3 - \boldsymbol{P}_2^1 \\ v_2 \boldsymbol{P}_2^3 - \boldsymbol{P}_2^2 \end{bmatrix} \cdot \tilde{\boldsymbol{P}} = \mathbf{0}\]

组合起来

\[\boldsymbol{A_{4 \times 4}} \cdot \tilde{\boldsymbol{P}} = \begin{bmatrix} u_1 \boldsymbol{P}_1^3 - \boldsymbol{P}_1^1 \\ v_1 \boldsymbol{P}_1^3 - \boldsymbol{P}_1^2 \\ u_2 \boldsymbol{P}_2^3 - \boldsymbol{P}_2^1 \\ v_2 \boldsymbol{P}_2^3 - \boldsymbol{P}_2^2 \end{bmatrix} \cdot \tilde{\boldsymbol{P}} = \mathbf{0}\]

求解 $\boldsymbol{P}$ 相当于解一个 线性最小二乘问题

SVD分解 $\boldsymbol{A}$

\[\text{SVD}(\boldsymbol{A}) = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V}^T\]

方程的解为 $\boldsymbol{A}$ 的 最小奇异值 对应的 奇异矢量,即 齐次坐标

\[\tilde{\boldsymbol{P}} = (X,Y,Z,W) = \boldsymbol{V}_3\]

最终,$\boldsymbol{P}$(第一视图坐标系下三维坐标)为

\[\boldsymbol{P} = (\frac{X}{W}, \frac{Y}{W}, \frac{Z}{W})\]

orbslam2_cg中示例代码

void Initializer::Triangulate(
  const cv::KeyPoint &kp1,
  const cv::KeyPoint &kp2,
  const cv::Mat &P1,
  const cv::Mat &P2,
  cv::Mat &x3D)
{
    cv::Mat A(4,4,CV_32F);

    A.row(0) = kp1.pt.x*P1.row(2)-P1.row(0);
    A.row(1) = kp1.pt.y*P1.row(2)-P1.row(1);
    A.row(2) = kp2.pt.x*P2.row(2)-P2.row(0);
    A.row(3) = kp2.pt.y*P2.row(2)-P2.row(1);

    cv::Mat u,w,vt;
    cv::SVD::compute(A,w,u,vt,cv::SVD::MODIFY_A| cv::SVD::FULL_UV);
    x3D = vt.row(3).t();
    x3D = x3D.rowRange(0,3)/x3D.at<float>(3);
}

Triangulate in PTAM

已知,两个视图下匹配点的 归一化平面(z=1)齐次坐标 $\boldsymbol{p}_1$ 和 $\boldsymbol{p}_2$,两个相机的相对位姿 $\boldsymbol{T}$,求 对应的三维点坐标 $\boldsymbol{P}$(第一视图坐标系下三维坐标),其齐次坐标为 $\tilde{\boldsymbol{P}}$。

方程

\[\boldsymbol{p}_1 \times (\boldsymbol{I}_{3 \times 4} \cdot \tilde{\boldsymbol{P}}) = \mathbf{0} \\ \boldsymbol{p}_2 \times (\boldsymbol{T}_{21} \cdot \tilde{\boldsymbol{P}}) = \mathbf{0}\] \[\boldsymbol{T} = \boldsymbol{T}_{21} = [ \boldsymbol{R} \quad \boldsymbol{t} ] \in \mathbb{R}^{3 \times 4}\]

矩阵形式

\[\boldsymbol{A_{6 \times 4}} \cdot \tilde{\boldsymbol{P}} = \begin{bmatrix} \boldsymbol{p}_1 \times \boldsymbol{I}_{3 \times 4} \\ \boldsymbol{p}_2 \times \boldsymbol{T}_{21} \end{bmatrix} \cdot \tilde{\boldsymbol{P}} = \mathbf{0}\]

求解 $\boldsymbol{P}$ 相当于解一个 线性最小二乘问题

SVD分解 $\boldsymbol{A}$

\[\text{SVD}(\boldsymbol{A}) = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V}^T\]

方程的解为 $\boldsymbol{A}$ 的 最小奇异值 对应的 奇异矢量,即 齐次坐标

\[\tilde{\boldsymbol{P}} = (X,Y,Z,W) = \boldsymbol{V}_3\]

最终,$\boldsymbol{P}$(第一视图坐标系下三维坐标)为

\[\boldsymbol{P} = (\frac{X}{W}, \frac{Y}{W}, \frac{Z}{W})\]

ptam_cg中示例代码Triangulate

Vector<3> MapMaker::Triangulate(
  SE3<> se3AfromB,
  const Vector<2> &v2A,
  const Vector<2> &v2B)
{
    Matrix<3,4> PDash;
    PDash.slice<0,0,3,3>() = se3AfromB.get_rotation().get_matrix();
    PDash.slice<0,3,3,1>() = se3AfromB.get_translation().as_col();

    Matrix<4> A;
    A[0][0] = -1.0; A[0][1] =  0.0; A[0][2] = v2B[0]; A[0][3] = 0.0;
    A[1][0] =  0.0; A[1][1] = -1.0; A[1][2] = v2B[1]; A[1][3] = 0.0;
    A[2] = v2A[0] * PDash[2] - PDash[0];
    A[3] = v2A[1] * PDash[2] - PDash[1];

    SVD<4,4> svd(A);
    Vector<4> v4Smallest = svd.get_VT()[3];
    if(v4Smallest[3] == 0.0)
        v4Smallest[3] = 0.00001;
    return project(v4Smallest);
}

ptam_cg中示例代码TriangulateNew

Vector<3> MapMaker::TriangulateNew(
  SE3<> se3AfromB,
  const Vector<2> &v2A,
  const Vector<2> &v2B)
{
    Vector<3> v3A = unproject(v2A);
    Vector<3> v3B = unproject(v2B);

    Matrix<3> m3A = TooN::Zeros;
    m3A[0][1] = -v3A[2];
    m3A[0][2] =  v3A[1];
    m3A[1][2] = -v3A[0];
    m3A[1][0] = -m3A[0][1];
    m3A[2][0] = -m3A[0][2];
    m3A[2][1] = -m3A[1][2];
    Matrix<3> m3B = TooN::Zeros;
    m3B[0][1] = -v3B[2];
    m3B[0][2] =  v3B[1];
    m3B[1][2] = -v3B[0];
    m3B[1][0] = -m3B[0][1];
    m3B[2][0] = -m3B[0][2];
    m3B[2][1] = -m3B[1][2];

    Matrix<3,4> m34AB;
    m34AB.slice<0,0,3,3>() = se3AfromB.get_rotation().get_matrix();
    m34AB.slice<0,3,3,1>() = se3AfromB.get_translation().as_col();

    SE3<> se3I;
    Matrix<3,4> m34I;
    m34I.slice<0,0,3,3>() = se3I.get_rotation().get_matrix();
    m34I.slice<0,3,3,1>() = se3I.get_translation().as_col();

    Matrix<3,4> PDashA = m3A * m34AB;
    Matrix<3,4> PDashB = m3B * m34I;

    Matrix<6,4> A;
    A.slice<0,0,3,4>() = PDashA;
    A.slice<3,0,3,4>() = PDashB;

    SVD<6,4> svd(A);
    Vector<4> v4Smallest = svd.get_VT()[3];
    if(v4Smallest[3] == 0.0)
        v4Smallest[3] = 0.00001;

    return project(v4Smallest);
}

求解空间点深度

已知,两个视图下匹配点的 归一化平面(z=1)齐次坐标 $\boldsymbol{p}_1$ 和 $\boldsymbol{p}_2$,两个相机的相对位姿 $\boldsymbol{T}$,求解空间点深度 $Z_1$ 和 $Z_2$

\[\boldsymbol{T} = \boldsymbol{T}_{21} = [ \boldsymbol{R} \quad \boldsymbol{t} ] \in \mathbb{R}^{3 \times 4}\] \[Z_2 \cdot \boldsymbol{p}_2 = \boldsymbol{T}_{21} \cdot ( Z_1 \cdot \boldsymbol{p}_1 ) = Z_1 \cdot \boldsymbol{R} \boldsymbol{p}_1 + \boldsymbol{t}\]

矩阵形式

\[\begin{bmatrix} \boldsymbol{p}_2 & -\boldsymbol{R} \boldsymbol{p}_1 \end{bmatrix} \cdot \begin{bmatrix} Z_2 \\ Z_1 \end{bmatrix} = \boldsymbol{t}\]

\[\begin{bmatrix} \boldsymbol{R} \boldsymbol{p}_1 & -\boldsymbol{p}_2 \end{bmatrix} \cdot \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} = - \boldsymbol{t}\]

Triangulate in SVO

上式即 $Ax=-t$ 的形式,解该方程可以用 正规方程

\[A^T A x = - A^T t\]

解得

\[x = - (A^TA)^{-1} A^T t\]

svo_cg中示例代码

bool depthFromTriangulation(
    const SE3& T_search_ref,
    const Vector3d& f_ref,
    const Vector3d& f_cur,
    double& depth)
{
  Matrix<double,3,2> A; A << T_search_ref.rotation_matrix() * f_ref, f_cur;
  const Matrix2d AtA = A.transpose() * A;
  if(AtA.determinant() < 0.000001)
    return false;
  const Vector2d depth2 =
    - AtA.inverse()* A.transpose() * T_search_ref.translation();
  depth = fabs(depth2[0]);
  return true;
}

Triangulate in REMODE

由于解向量是二维的,对上式采用 克莱默法则 求解:

\[\begin{bmatrix} \boldsymbol{p}_2^{T} \\ (\boldsymbol{R} \boldsymbol{p}_1)^{T} \end{bmatrix} \begin{bmatrix} \boldsymbol{p}_2 & -\boldsymbol{R} \boldsymbol{p}_1 \end{bmatrix} \begin{bmatrix} Z_2 \\ Z_1 \end{bmatrix} = \begin{bmatrix} \boldsymbol{p}_2^{T} \boldsymbol{p}_2 & -\boldsymbol{p}_2^{T} \boldsymbol{R} \boldsymbol{p}_1 \\ (\boldsymbol{R} \boldsymbol{p}_1)^{T} \boldsymbol{p}_2 & -(\boldsymbol{R} \boldsymbol{p}_1)^{T} \boldsymbol{R} \boldsymbol{p}_1 \end{bmatrix} \begin{bmatrix} Z_2 \\ Z_1 \end{bmatrix} = \begin{bmatrix} \boldsymbol{p}_2^{T} \boldsymbol{t} \\ (\boldsymbol{R} \boldsymbol{p}_1)^{T} \boldsymbol{t} \end{bmatrix}\]

REMODE中示例代码

// Returns 3D point in reference frame
// Non-linear formulation (ref. to the book 'Autonomous Mobile Robots')
__device__ __forceinline__
float3 triangulatenNonLin(
    const float3 &bearing_vector_ref,
    const float3 &bearing_vector_curr,
    const SE3<float> &T_ref_curr)
{
  const float3 t = T_ref_curr.getTranslation();
  float3 f2 = T_ref_curr.rotate(bearing_vector_curr);
  const float2 b = make_float2(dot(t, bearing_vector_ref),
                               dot(t, f2));
  float A[2*2];
  A[0] = dot(bearing_vector_ref, bearing_vector_ref);
  A[2] = dot(bearing_vector_ref, f2);
  A[1] = -A[2];
  A[3] = dot(-f2, f2);

  const float2 lambdavec = make_float2(A[3] * b.x - A[1] * b.y,
      -A[2] * b.x + A[0] * b.y) / (A[0] * A[3] - A[1] * A[2]);
  const float3 xm = lambdavec.x * bearing_vector_ref;
  const float3 xn = t + lambdavec.y * f2;
  return (xm + xn)/2.0f;
}



矩阵变换、分解与矩阵子空间和线性方程组

[TOC]

Matrix

正规矩阵

\[A^H A = A A^H\]
  • 酉矩阵 $A^H A = A A^H = I$
    • 正交矩阵 $A^T A = AA^T = I$
      • 特殊正交矩阵:$\det(A) = +1$,是一个旋转矩阵
        • Givens矩阵:$G(i,j,\theta)$
      • 瑕旋转矩阵:$\det(A) = -1$,瑕旋转是旋转加上镜射
        • Householder矩阵:$H = I - 2 uu^T$
  • 厄米特矩阵 $A^H = A$
    • 实对称矩阵 $A^T = A$
  • 反厄米特矩阵 $A^H = -A$
    • 实反对称矩阵 $A^T = -A$

正定与半正定矩阵

在线性代数里,正定矩阵(positive definite matrix)有时简称为 正定阵

对任意的非零向量 $\boldsymbol{x}$ 恒有 二次型

\[f = \boldsymbol{x}^T \boldsymbol{A} \boldsymbol{x} > 0\]

则称 $f$ 为 正定二次型,对应的矩阵为 正定矩阵;若 $f \ge 0$,则 对应的矩阵为 半正定矩阵

直观理解

令 $Y=AX$,则 $X^T Y = X^T A X > 0$,所以

\[cos(\theta) = \frac{X^T Y}{\|X\|\|Y\|} > 0\]

因此,从另一个角度,正定矩阵代表一个向量经过它的变化后的向量与其本身的夹角 小于90度

判别对称矩阵A的正定性

  • 求出A的所有特征值
    • 若A的特征值均为正数,则A是正定的;若A的特征值均为负数,则A为负定的。
  • 计算A的各阶顺序主子式
    • 若A的各阶顺序主子式均大于零,则A是正定的;若A的各阶顺序主子式中,奇数阶主子式为负,偶数阶为正,则A为负定的。
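
数值计算中也可以直接借助分解来判断,例如用 Eigen 的一个小示例(LLT 分解成功即正定,或检查最小特征值是否大于零):

#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::Matrix2d A;
    A << 2, 1,
         1, 2;   // 对称矩阵,特征值为 1 和 3,正定

    // 方法一:LLT(Cholesky)分解成功即为正定
    Eigen::LLT<Eigen::Matrix2d> llt(A);
    std::cout << "positive definite: " << (llt.info() == Eigen::Success) << std::endl;

    // 方法二:所有特征值均大于零
    Eigen::SelfAdjointEigenSolver<Eigen::Matrix2d> es(A);
    std::cout << "min eigenvalue: " << es.eigenvalues().minCoeff() << std::endl;

    return 0;
}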

伴随矩阵

余子式

在n阶行列式D中划去任意选定的k行、k列后,余下的元素按原来顺序组成的n-k阶行列式M,称为 行列式D的k阶子式A的余子式

A关于第i行第j列的余子式(记作 $M_{ij}$)是去掉A的第i行第j列之后得到的(n− 1)×(n− 1)矩阵的行列式。

代数余子式

如果k阶子式A在行列式D中的行和列的标号分别为i1,i2,…,ik和j1,j2,…,jk。则在A的余子式M前面添加符号 $(-1)^{\left(i_{1}+i_{2}+, \ldots,+i_{k}\right)+\left(j_{1}+j_{2}+, \ldots,+j_{k}\right)}$,所得到的n-k阶行列式,称为 行列式D的k阶子式A的代数余子式

A的 (代数)余子矩阵 是一个n×n的矩阵C,使得其第i行第j列的元素是A关于第i行第j列的代数余子式:

\[\mathbf{C}_{i j}=(-1)^{i+j} M_{i j}\]

伴随矩阵(adjoint matrix)

A的 代数余子矩阵 的 转置矩阵,也就是说, A的伴随矩阵是一个n×n的矩阵(记作adj(A)),使得其第i 行第j 列的元素是A关于第j 行第i 列的代数余子式

\[\mathbf{A}^* = \operatorname{adj}(\mathbf{A})=\mathbf{C}^{T}\]

自伴随矩阵

\[\mathbf{A}^* = \mathbf{A}\]

可逆矩阵(非奇异矩阵)

\[\mathbf{A}^{-1} = \frac{1}{|\mathbf{A}|} \mathbf{A}^*\]

其他

  • 对角阵:任意正规矩阵 都可以经过 正交变换 变成 对角矩阵

  • 上(下)三角阵

矩阵变换

正交变换

Givens rotation(吉文斯旋转)

在数值线性代数中,吉文斯旋转(Givens rotation)是在两个坐标轴所展开的平面中的旋转。

吉文斯旋转 矩阵:

\[G(i, j, \theta)=\left[\begin{array}{ccccccc} 1 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots & & \vdots \\ 0 & \cdots & c & \cdots & s & \cdots & 0 \\ \vdots & & \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & -s & \cdots & c & \cdots & 0 \\ \vdots & & \vdots & & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & 0 & \cdots & 1 \end{array}\right]\] \[G \triangleq\left[\begin{array}{cc} \cos \phi & \sin \phi \\ -\sin \phi & \cos \phi \end{array}\right]\]

当一个吉文斯旋转矩阵 $G(i, j, \theta)$ 从左侧乘另一个矩阵 $A$ 的时候,$GA$ 只作用于 $A$ 的第 $i$ 和 $j$ 行。

\[\left[\begin{array}{cc} c & -s \\ s & c \end{array}\right]\left[\begin{array}{l} a \\ b \end{array}\right]=\left[\begin{array}{l} r \\ 0 \end{array}\right]\]

明确计算出 $\theta$ 是没有必要的。我们转而直接获取 c, s 和 r。一个明显的解是

\[\begin{aligned} &r \leftarrow \sqrt{a^{2}+b^{2}}\\ &c \leftarrow a / r\\ &s \leftarrow-b / r \end{aligned}\]
  • Givens 旋转在数值线性代数中主要的用途是在向量或矩阵中介入零。例如,这种效果可用于计算矩阵的 QR分解

\[G_{j_{k}} \ldots G_{j_{2}} G_{j_{1}} R_{a}=\left[\begin{array}{c} R^{\prime} \\ 0 \end{array}\right]\]
  • 超过 Householder变换 的一个好处是它们可以轻易的并行化,另一个好处是对于非常稀疏的矩阵计算量更小。

Jacobi rotation

Householder 变换

豪斯霍尔德变换(Householder transformation)或译“豪斯霍德转换”,又称初等反射(Elementary reflection),这一变换将一个向量变换为由一个超平面反射的镜像,是一种线性变换。其变换矩阵被称作豪斯霍尔德矩阵,在一般内积空间中的类比被称作豪斯霍尔德算子。超平面的法向量被称作豪斯霍尔德向量。

豪斯霍尔德变换可以将向量的某些元素置零,同时保持该向量的范数不变。例如,将非零列向量 $\mathbf{x}=\left[x_{1}, \dots, x_{n}\right]^{T}$ 变换为单位基向量 $\mathbf{e}=[1,0, \ldots, 0]^{T}$ 的豪斯霍尔德矩阵为

\[\mathbf{H}=\mathbf{I}-\frac{2}{\langle\mathbf{v}, \mathbf{v}\rangle} \mathbf{v} \mathbf{v}^{H}\]

其中豪斯霍尔德向量 $\mathbf{v}$ 满足:

\[\mathbf{v}=\mathbf{x}+\operatorname{sgn}\left(\mathbf{x}_{1}\right)\|\mathbf{x}\|_{2} \mathbf{e}_{1}\]

Matrix Decomposition

EVD (Eigen Decomposition)

\[A = VDV^{-1}\]
  • $A$ 是 方阵;$D$ 是 对角阵,其 特征值从大到小排列;$V$ 的列向量为 特征向量
  • 若 $A$ 为 对称阵,则 $V$ 为 正交矩阵,其列向量为 $A$ 的 单位正交特征向量

SVD (Singular Value Decomposition)

\[A = UDV^T\]
  • $A$ 为 $m \times n$ 的矩阵;$D$ 是 非负对角阵,其 奇异值从大到小排列;$U$、$V$ 均为 正交矩阵

SVD分解十分强大且适用,因为任意一个矩阵都可以实现SVD分解,而特征值分解只能应用于方阵。

奇异值与特征值

\[AV = UD \Rightarrow Av_i = \sigma_{i} u_i \Rightarrow \sigma_{i} = \frac{Av_i}{u_i} \\[4ex] A^T A = (V D^T U^T) (U D V^T) = V D^2 V^T \\[2ex] A A^T = (U D V^T) (V D^T U^T) = U D^2 U^T\]
  • $A^T A$ 的 特征向量 组成的是SVD中的 $V$ 矩阵
  • $A A^T$ 的 特征向量 组成的是SVD中的 $U$ 矩阵
  • $A^T A$ 或 $A A^T$ 的 特征值 $\lambda_i$ 与 $A$ 的 奇异值 $\sigma_i$ 满足 $\sigma_i = \sqrt{\lambda_i}$

PCA

奇异值减少得特别快,在很多情况下,前10%甚至1%的奇异值的和就占了全部的奇异值之和的99%以上,可以用 最大的 $k$ 个的奇异值和对应的左右奇异向量 来近似描述矩阵

\[A_{m \times n} = U_{m \times m} D_{m \times n} V_{n \times n}^T \approx U_{m \times k} D_{k \times k} V_{k \times n}^T\]

LU Decomposition

\[A = LU\]
  • $A$ 是 方阵;$L$ 是 下三角矩阵;$U$ 是 上三角矩阵

PLU 分解

\[A = PLU\]

事实上,PLU 分解有很高的数值稳定性,因此实用上是很好用的工具。

LDU 分解

\[A = LDU\]

Cholesky Decomposition

\[A = LDL^T\]
  • $A$ 是 对称正定矩阵;$L$ 是 下三角矩阵

classic:

\[A = LL^T \\[2ex] A^{-1} = (L^T)^{-1} L^{-1} = (L^{-1})^T L^{-1}\]

LDLT

QR Decomposition

\[A = QR\]
  • $A$ 为 $m \times n$ 的矩阵;$Q$ 为 酉矩阵;$R$ 是 上三角矩阵

Four Fundamental Subspaces

  • 零空间 与 行空间 正交
  • 左零空间 与 列空间 正交

column space (the range or image) $C(A)$

In linear algebra, the column space (also called the range or image) of a matrix A is the span (set of all possible linear combinations) of its column vectors. The column space of a matrix is the image or range of the corresponding matrix transformation.

row space $C(A^T)$

nullspace $N(A)$

\[A x = 0\]

The nullity of a matrix is the dimension of the null space.

The rank and nullity of a matrix A with n columns are related by the equation:

\[\operatorname{rank}(A) + \operatorname{nullity}(A) = n\]

left nullspace $N(A^T)$

\[A^T x = 0 \longrightarrow x^T A = 0\]

左零空间 求解

已知 \(A \in \mathbb{R}^{m \times n}, \quad \operatorname{rank}(A) = r = n, \quad m > n\)

SVD

\[\begin{aligned} A_{m \times n} &= U_{m \times m} \cdot {\Sigma}_{m \times n} \cdot V^T_{n \times n} \\ &= \left[\begin{array}{ccc|ccc} u_{1} & \cdots & u_{r} & u_{r+1} & \cdots & u_{m} \end{array}\right] \left[\begin{array}{ccc|c} \sigma_{1} & & & \\ & \ddots & & 0 \\ & & \sigma_{r} & \\ \hline & 0 & & 0 \end{array}\right] \left[\begin{array}{c} v_{1}^{T} \\ \vdots \\ v_{r}^{T} \\ \hline v_{r+1}^{T} \\ \vdots \\ v_{n}^{T} \end{array}\right] \\ &= \begin{bmatrix} {U_1}_{m \times r} & {U_2}_{m \times (m-r)} \end{bmatrix} \cdot \begin{bmatrix} {\Sigma}_1 & 0 \\ 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} V^T_1 \\[3pt] V^T_2 \end{bmatrix} \end{aligned}\]

left nullspace

\[U_2^T \cdot A = 0 \longrightarrow N(A^T) = U_2^T \in \mathbb{R}^{(m-r) \times m}\]

column space or the range $C(A)$

\[C(A) = \operatorname{span}(U_1), \quad U_1 \in \mathbb{R}^{m \times r}\]

QR

\[\begin{aligned} A_{m \times n} &= Q_{m \times m} \cdot R_{m \times n} \\ &= \begin{bmatrix} {Q_1}_{m \times r} & {Q_2}_{m \times (m-r)} \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix} \end{aligned}\]

left nullspace

\[Q_2^T \cdot A = 0 \longrightarrow N(A^T) = Q_2^T \in \mathbb{R}^{(m-r) \times m}\]

线性方程组

非齐次线性方程组

\[A_{m \times n} x = b_{m \times 1}\]

在非齐次方程组中,A到底有没有解析解,可以由增广矩阵来判断:

  • r(A) > r(A b) 不可能,因为增广矩阵的秩大于等于系数矩阵的秩
  • r(A) < r(A b) 方程组无解;
  • r(A) = r(A b) = n,方程组有唯一解;
  • r(A) = r(A b) < n,方程组无穷解;

非齐次线性方程组的最小二乘问题

\[x^* = \arg \min_x{\|Ax - b\|}_2^2\]

m个方程求解n个未知数,有三种情况:

  • m=n
    • 且A为非奇异,则有唯一解 $x=A^{-1}b$
  • m>n,超定问题(overdetermined)
    • $x=A^{+}b$
  • m<n,欠定问题(underdetermined)

齐次线性方程组

\[A_{m \times n} x = 0\]

齐次线性方程 解空间维数=n-r(A)

  • r(A) = n
    • A 是方阵,该方程组有唯一的零解
    • A 不是方阵(m>n),解空间只含有零向量
  • r(A) < n
    • 该齐次线性方程组有非零解,而且不唯一(自由度为n-r(A))

齐次线性方程组的最小二乘问题

\[\min{\|Ax\|}_2^2 \quad s.t. \quad \|x\| = 1\]
  • 最小二乘解为 矩阵 $A^TA$ 最小特征值所对应的特征向量
  • $\text{EVD}(A^{T}A)=[V,D]$,找最小特征值对应的V中的特征向量
  • $\text{SVD}(A)=[U,S,V]$,找S中最小奇异值对应的V的右奇异向量

with Eigen
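
下面给出一个用 Eigen 求解上述两类最小二乘问题的简单示例(QR、SVD、正规方程,以及齐次情形的 SVD 解):

#include <iostream>
#include <Eigen/Dense>

int main()
{
    // 非齐次最小二乘:min ||Ax - b||^2(m > n 的超定问题)
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(6, 3);
    Eigen::VectorXd b = Eigen::VectorXd::Random(6);

    Eigen::VectorXd x_qr  = A.colPivHouseholderQr().solve(b);                                 // QR 分解
    Eigen::VectorXd x_svd = A.bdcSvd(Eigen::ComputeThinU | Eigen::ComputeThinV).solve(b);     // SVD
    Eigen::VectorXd x_ne  = (A.transpose() * A).ldlt().solve(A.transpose() * b);              // 正规方程 + Cholesky

    // 齐次最小二乘:min ||Ax||^2, s.t. ||x|| = 1
    // 解为 A 最小奇异值对应的右奇异向量,即 V 的最后一列
    Eigen::JacobiSVD<Eigen::MatrixXd> svd(A, Eigen::ComputeFullV);
    Eigen::VectorXd x_h = svd.matrixV().col(A.cols() - 1);

    std::cout << x_qr.transpose()  << "\n"
              << x_svd.transpose() << "\n"
              << x_ne.transpose()  << "\n"
              << x_h.transpose()   << std::endl;
    return 0;
}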


计算机视觉对极几何之FEH

[TOC]

Overview

  • The gray region is the epipolar plane
  • The orange line is the baseline
  • the two blue lines are the epipolar lines

Basic Epipolar Geometry entities for pinhole cameras and panoramic camera sensors

  • Fundamental Matrix F
    • maps a point in one image to a line (epiline) in the other image
    • Calculated from matching points from both the images. A minimum of 8 such points are required to find the fundamental matrix (while using 8-point algorithm). More points are preferred and use RANSAC to get a more robust result.
  • Essential Matrix E
  • Homography Matrix H

Fundamental Matrix (基础矩阵)

对极约束:

\[\mathbf{p}'^T \cdot \mathbf{F} \cdot \mathbf{p} = 0\]

其中,$\mathbf{p}, \mathbf{p}’$ 为两个匹配 像素点坐标

基础矩阵:

\[\mathbf{F} = \mathbf{K}'^{-T} \mathbf{E} \mathbf{K}^{-1} = \mathbf{K}'^{-T} \mathbf{t}^\wedge \mathbf{R} \mathbf{K}^{-1}\]

$\mathbf{F}$ 的性质:

  • 对其乘以 任意非零常数,对极约束依然满足(尺度等价性,up to scale)

  • 具有 7个自由度
    • F has seven degrees of freedom: a 3×3 homogeneous matrix has eight independent ratios (there are nine elements, and the common scaling is not significant); however, F also satisfies the constraint det F = 0 which removes one degree of freedom.
  • 奇异性约束 $\text{rank}(\mathbf{F})=2$
    • The fundamental matrix F may be written as $F = [e^\prime]_\times H_\pi$, where $H_\pi$ is the transfer mapping from one image to another via any plane $\pi$. Furthermore, since $[e^\prime]_\times$ has rank 2 and $H_\pi$ rank 3, $F$ is a matrix of rank 2.

$\mathbf{F}$ 与 极线和极点 的关系:

\[\mathbf{l}' = \mathbf{F} \cdot \mathbf{p} \\[2ex] \mathbf{l} = \mathbf{F}^T \cdot \mathbf{p}' \\[2ex] \mathbf{F} \cdot \mathbf{e} = \mathbf{F}^T \cdot \mathbf{e}' = 0\]

$\mathbf{F}$ 的计算:

  • Compute from 7 image point correspondences
  • 8点法(Eight-Point Algorithm

Fundamental Matrix Estimation

  • 类似于下面 $\mathbf{E}$ 的估计
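
实践中也可以直接调用 OpenCV 的接口并用 RANSAC 剔除外点,一个小示例:

#include <vector>
#include <opencv2/opencv.hpp>

// 示意:由像素坐标匹配点估计基础矩阵 F(8 点法 + RANSAC)
cv::Mat estimate_F(const std::vector<cv::Point2f> &pts1,
                   const std::vector<cv::Point2f> &pts2)
{
    cv::Mat inlier_mask;
    // 3.0 为 RANSAC 的像素误差阈值,0.99 为置信度
    return cv::findFundamentalMat(pts1, pts2, cv::FM_RANSAC, 3.0, 0.99, inlier_mask);
}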

Essential Matrix (本质矩阵)

对极约束:

\[{\mathbf{p}'}^T \cdot \mathbf{E} \cdot \mathbf{p} = 0\]

其中,$\mathbf{p}, \mathbf{p}’$ 为两个匹配像素点的 归一化平面坐标

本质矩阵:

\[\mathbf{E} = \mathbf{t}^\wedge \mathbf{R} = \mathbf{K}'^{T} \mathbf{F} \mathbf{K} \\[2ex]\]

$\mathbf{E}$ 的性质:

  • A 3×3 matrix is an essential matrix if and only if two of its singular values are equal, and the third is zero
  • 对其乘以 任意非零常数,对极约束依然满足(尺度等价性,up to scale)
  • 根据 $\mathbf{E} = \mathbf{t}^\wedge \mathbf{R}$,$\mathbf{E}$ 的奇异值必定是 $[\sigma, \sigma, 0]^T$ 的形式
  • 具有 5个自由度:平移旋转共6个自由度 - 尺度等价性
    • The essential matrix, E = [t] × R, has only five degrees of freedom: both the rotation matrix R and the translation t have three degrees of freedom, but there is an overall scale ambiguity – like the fundamental matrix, the essential matrix is a homogeneous quantity.
  • 奇异性约束 $\text{rank}(\mathbf{E})=2$(因为 $\text{rank}(\mathbf{t^\wedge})=2$)

$\mathbf{E}$ 与 极线和极点 的关系:

\[\mathbf{l}' = \mathbf{E} \cdot \mathbf{p} \\[2ex] \mathbf{l} = \mathbf{E}^T \cdot \mathbf{p}' \\[2ex] \mathbf{E} \cdot \mathbf{e} = \mathbf{E}^T \cdot \mathbf{e}' = 0\]

$\mathbf{E}$ 的计算:

  • 5点法(最少5对点求解)
  • 8点法(Eight-Point Algorithm

Essential Matrix Estimation

矩阵形式:

\[\begin{bmatrix} x' & y' & 1 \end{bmatrix} \cdot \begin{bmatrix} e_{1} & e_{2} & e_{3} \\ e_{4} & e_{5} & e_{6} \\ e_{7} & e_{8} & e_{9} \end{bmatrix} \cdot \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = 0\]

矩阵 $\mathbf{E}$ 展开,写成向量形式 $\mathbf{e}$,并把所有点(n对点,n>=8)放到一个方程中,齐次线性方程组 如下:

\[\begin{bmatrix} x'^1x^1 & x'^1y^1 & x'^1 & y'^1x^1 & y'^1y^1 & y'^1 & x^1 & y^1 & 1 \\ x'^2x^2 & x'^2y^2 & x'^2 & y'^2x^2 & y'^2y^2 & y'^2 & x^2 & y^2 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x'^nx^n & x'^ny^n & x'^n & y'^nx^n & y'^ny^n & y'^n & x^n & y^n & 1 \\ \end{bmatrix} \cdot \begin{bmatrix} e_{1} \\ e_{2} \\ e_{3} \\ e_{4} \\ e_{5} \\ e_{6} \\ e_{7} \\ e_{8} \\ e_{9} \end{bmatrix} = 0\]

i.e. (the essential matrix lies in the null space of this matrix $\mathbf{A}$)

\[\mathbf{A} \cdot \mathbf{e} = \mathbf{0} \quad s.t. \quad \mathbf{A} \in \mathbb{R}^{n \times 9}, n \geq 8\]

Solve the above in the least-squares sense (the solution is only defined up to scale):

\[\min_{\mathbf{e}} \|\mathbf{A} \cdot \mathbf{e}\|^2 \quad s.t. \quad \|\mathbf{e}\| = 1 \quad \text{or} \quad {\|\mathbf{E}\|}_F = 1\]

Compute the SVD of $\mathbf{A}$ (or the eigendecomposition of $\mathbf{A}^T \mathbf{A}$):

\[\text{SVD}(\mathbf{A}) = \mathbf{U} \mathbf{D} \mathbf{V}^T\]

$\mathbf{e}$ is proportional to the last column of $\mathbf{V}$, which gives $\mathbf{E}$.

To enforce the singularity constraint $\text{rank}(\mathbf{E})=2$, compute the SVD of $\mathbf{E}$ again:

\[\text{SVD}(\mathbf{E}) = \mathbf{U}_E \mathbf{D}_E \mathbf{V}_E^T\]

The estimated $\mathbf{E}$ may not satisfy its intrinsic property (singular values of the form $[\sigma, \sigma, 0]^T$). In that case adjust $\mathbf{D}_E$: assuming $\mathbf{D}_E = \text{diag}(\sigma_1, \sigma_2, \sigma_3)$ with $\sigma_1 \geq \sigma_2 \geq \sigma_3$, set

\[\mathbf{D}_E' = \text{diag}(\frac{\sigma_1+\sigma_2}{2}, \frac{\sigma_1+\sigma_2}{2}, 0)\]

or, more simply (since $\mathbf{E}$ is only defined up to scale),

\[\mathbf{D}_E' = \text{diag}(1, 1, 0)\]

Finally, the corrected estimate of $\mathbf{E}$ is

\[\mathbf{E}' = \mathbf{U}_E \mathbf{D}_E' \mathbf{V}_E^T\]
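
The following minimal Eigen sketch puts these steps together (it is not taken from VINS-Mono; the helper name and point containers are hypothetical, and the inputs are assumed to be normalized image-plane coordinates): build $\mathbf{A}$, take the null-space direction, then project onto the rank-2 manifold.

#include <Eigen/Dense>
#include <vector>

// pts1/pts2: matched points in normalized image-plane coordinates, n >= 8
Eigen::Matrix3d estimateEssential8pt(const std::vector<Eigen::Vector2d>& pts1,
                                     const std::vector<Eigen::Vector2d>& pts2)
{
    const int n = static_cast<int>(pts1.size());
    Eigen::MatrixXd A(n, 9);
    for (int i = 0; i < n; ++i) {
        const double x  = pts1[i].x(), y  = pts1[i].y();
        const double xp = pts2[i].x(), yp = pts2[i].y();
        // one row per pair: [x'x, x'y, x', y'x, y'y, y', x, y, 1]
        A.row(i) << xp * x, xp * y, xp, yp * x, yp * y, yp, x, y, 1.0;
    }

    // e is the right singular vector of A associated with the smallest singular value
    Eigen::JacobiSVD<Eigen::MatrixXd> svdA(A, Eigen::ComputeFullV);
    Eigen::Matrix<double, 9, 1> e = svdA.matrixV().col(8);
    Eigen::Matrix3d E;
    E << e(0), e(1), e(2),
         e(3), e(4), e(5),
         e(6), e(7), e(8);

    // enforce the rank-2 constraint: replace the singular values by (1, 1, 0)
    Eigen::JacobiSVD<Eigen::Matrix3d> svdE(E, Eigen::ComputeFullU | Eigen::ComputeFullV);
    return svdE.matrixU() * Eigen::Vector3d(1.0, 1.0, 0.0).asDiagonal()
                          * svdE.matrixV().transpose();
}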

Rotation and translation from E

The four possible solutions for calibrated reconstruction from E

  • Between the left and right sides there is a baseline reversal
  • Between the top and bottom rows camera B rotates 180° about the baseline
  • Only in (a) is the reconstructed point in front of both cameras

Suppose that the SVD of $E$ is

\[\text{SVD}(E) = U \text{diag}(1, 1, 0) V^T\]

there are (ignoring signs) two possible factorizations $E = t^\wedge R = SR$ as follows

\[\begin{aligned} t^\wedge &= UZU^T \quad \text{or} \quad U(-Z)U^T \\[2ex] R &= UWV^T \quad \text{or} \quad UW^TV^T \end{aligned}\]

where

\[\begin{aligned} W &= \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = R_z(\frac{\pi}{2}) \\[3ex] Z &= \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \text{diag}(1, 1, 0) \cdot W^T \end{aligned}\]
  • $W$ is orthogonal
  • $Z$ is skew-symmetric
  • also, up to sign and scale, $t = U(0,0,1)^T = \mathbf{u}_3$, the last column of $U$

Example code for recovering R and t from E: e2rt.cpp

Matrix3d E;
E << -0.0203618550523477,   -0.4007110038118445,  -0.03324074249824097,
      0.3939270778216369,   -0.03506401846698079,  0.5857110303721015,
     -0.006788487241438284, -0.5815434272915686,  -0.01438258684486258;

cout << "E = \n" << E << endl;

// SVD and fix singular values
JacobiSVD<MatrixXd> svd(E, ComputeThinU | ComputeThinV);
Matrix3d m3U = svd.matrixU();
Matrix3d m3V = svd.matrixV();
Vector3d v3S = svd.singularValues();

double temp = (v3S[0]+v3S[1])/2;
Matrix3d m3S(Vector3d(temp, temp, 0).asDiagonal());

Eigen::Matrix3d m3R_z_p = Eigen::AngleAxisd( M_PI/2, Eigen::Vector3d(0,0,1)).toRotationMatrix();
Eigen::Matrix3d m3R_z_n = Eigen::AngleAxisd(-M_PI/2, Eigen::Vector3d(0,0,1)).toRotationMatrix();
cout << "m3R_z_p = \n" << m3R_z_p << endl;
cout << "m3R_z_n = \n" << m3R_z_n << endl;

// two candidate translations (as skew-symmetric matrices) and two candidate rotations
Matrix3d t_wedge1;
Matrix3d t_wedge2;
t_wedge1 = m3U * m3R_z_p * m3S * m3U.transpose();
t_wedge2 = m3U * m3R_z_n * m3S * m3U.transpose();

Matrix3d R1;
Matrix3d R2;
R1 = m3U * m3R_z_p.transpose() * m3V.transpose();
R2 = m3U * m3R_z_n.transpose() * m3V.transpose();

cout << "R1 = \n" << R1 << endl;
cout << "R2 = \n" << R2 << endl;
cout << "t1 = \n" << Sophus::SO3::vee(t_wedge1) << endl;
cout << "t2 = \n" << Sophus::SO3::vee(t_wedge2) << endl;

// check t^R=E up to scale
Matrix3d tR = t_wedge1 * R1;
cout << "t^R = \n" << tR << endl;

DecomposeE in ORB-SLAM2

void Initializer::DecomposeE(const cv::Mat &E, cv::Mat &R1, cv::Mat &R2, cv::Mat &t) {
    cv::Mat u,w,vt;
    cv::SVD::compute(E,w,u,vt);

    // t is the last column of U (defined only up to sign and scale); normalize it
    u.col(2).copyTo(t);
    t=t/cv::norm(t);

    // W = R_z(pi/2)
    cv::Mat W(3,3,CV_32F,cv::Scalar(0));
    W.at<float>(0,1)=-1;
    W.at<float>(1,0)=1;
    W.at<float>(2,2)=1;

    // two candidate rotations U*W*V^T and U*W^T*V^T; flip sign to keep det(R) = +1
    R1 = u*W*vt;
    if(cv::determinant(R1)<0)
        R1=-R1;

    R2 = u*W.t()*vt;
    if(cv::determinant(R2)<0)
        R2=-R2;
}

CheckRT in ORB-SLAM2
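
The ORB-SLAM2 source is not reproduced here; conceptually, CheckRT triangulates the matches for each candidate (R, t), checks parallax and reprojection error, and counts points that land in front of both cameras. Below is a minimal sketch of just the cheirality (positive-depth) part of that test; countInFront is a hypothetical helper, not the actual ORB-SLAM2 code, and the inputs are assumed to be normalized coordinates.

#include <Eigen/Dense>
#include <vector>

// Triangulate each match under candidate (R, t) and count points with positive
// depth in both camera frames; the candidate with the largest count wins.
int countInFront(const Eigen::Matrix3d& R, const Eigen::Vector3d& t,
                 const std::vector<Eigen::Vector2d>& pts1,
                 const std::vector<Eigen::Vector2d>& pts2)
{
    // projection matrices of the two views: P1 = [I | 0], P2 = [R | t]
    Eigen::Matrix<double, 3, 4> P1 = Eigen::Matrix<double, 3, 4>::Zero();
    P1.leftCols<3>().setIdentity();
    Eigen::Matrix<double, 3, 4> P2;
    P2.leftCols<3>()  = R;
    P2.rightCols<1>() = t;

    int good = 0;
    for (size_t i = 0; i < pts1.size(); ++i) {
        // linear (DLT) triangulation: each view contributes two rows
        Eigen::Matrix4d A;
        A.row(0) = pts1[i].x() * P1.row(2) - P1.row(0);
        A.row(1) = pts1[i].y() * P1.row(2) - P1.row(1);
        A.row(2) = pts2[i].x() * P2.row(2) - P2.row(0);
        A.row(3) = pts2[i].y() * P2.row(2) - P2.row(1);
        Eigen::JacobiSVD<Eigen::Matrix4d> svd(A, Eigen::ComputeFullV);
        Eigen::Vector4d Xh = svd.matrixV().col(3);
        Eigen::Vector3d X  = Xh.head<3>() / Xh(3);   // 3D point in camera-1 frame

        const double z1 = X.z();                     // depth in camera 1
        const double z2 = (R * X + t).z();           // depth in camera 2
        if (z1 > 0 && z2 > 0) ++good;
    }
    return good;
}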

Homography Matrix

  • For planar surfaces, 3D to 2D perspective projection reduces to a 2D to 2D transformation
  • [从零开始一起学习SLAM 神奇的单应矩阵](https://zhuanlan.zhihu.com/p/49435367)

Suppose images $I_1$ and $I_2$ have a pair of matched feature points $p_1$ and $p_2$, and that the corresponding 3D point $P$ lies on a plane satisfying the equation:

\[\mathbf{n}^T \mathbf{P} + d = 0\]

The homography matrix describes the mapping between two images of points that lie on a common plane:

\[\begin{aligned} \mathbf{p}_2 &\simeq \mathbf{K}_2 \left(\mathbf{R} \mathbf{P} + \mathbf{t} \right) \\ &\simeq \mathbf{K}_2 \left(\mathbf{R} \mathbf{P} + \mathbf{t} \cdot\left(-\frac{\mathbf{n}^{T} \mathbf{P}}{\mathbf{d}}\right)\right) \\ &\simeq \mathbf{K}_2 \left(\mathbf{R} - \frac{\mathbf{t} \mathbf{n}^T}{d}\right) \mathbf{P} \\ &\simeq \mathbf{K}_2 \left(\mathbf{R} - \frac{\mathbf{t} \mathbf{n}^T}{d}\right) \mathbf{K}_1^{-1} \mathbf{p}_1 \\ &\simeq \mathbf{H} \cdot \mathbf{p}_1 \end{aligned}\]

Homography matrix:

\[\mathbf{H} = \mathbf{K}_2 (\mathbf{R} - \frac{\mathbf{t} \mathbf{n}^T}{d}) \mathbf{K}_1^{-1}\]

Homography Estimation

Matrix form:

\[\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \cdot \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\]

Equation form (two constraints per point pair):

\[x' = \frac { h_{11}x + h_{12}y + h_{13} } { h_{31}x + h_{32}y + h_{33} } \\[2ex] y' = \frac { h_{21}x + h_{22}y + h_{23} } { h_{31}x + h_{32}y + h_{33} }\]

Since the above uses homogeneous coordinates, all $h_{ij}$ can be multiplied by an arbitrary non-zero factor without changing the equations.

Therefore $\mathbf{H}$ has 8 degrees of freedom and can be computed from a minimum of 4 point correspondences (with no 3 of them collinear).

In practice, one of two constraints, $h_{33}=1$ or $\|\mathbf{H}\|_F=1$, is imposed to remove the scale so that $\mathbf{H}$ has 8 degrees of freedom; a robust OpenCV-based sketch follows below.
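
As a practical aside (a minimal sketch, not VINS-Mono code; pts1/pts2 and estimateH are hypothetical names), OpenCV's findHomography estimates $\mathbf{H}$ with RANSAC from 4 or more correspondences:

#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat estimateH(const std::vector<cv::Point2f>& pts1,
                  const std::vector<cv::Point2f>& pts2)
{
    std::vector<uchar> inlier_mask;   // 1 = inlier, 0 = outlier
    // RANSAC with a 3-pixel reprojection threshold; H maps pts1 to pts2
    cv::Mat H = cv::findHomography(pts1, pts2, cv::RANSAC, 3.0, inlier_mask);
    return H;                         // 3x3 homography, defined up to scale
}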

Constraint 1: set $h_{33} = 1$

Linear system:

\[\mathbf{A} \cdot \mathbf{h} = \mathbf{b}\]

Solve via the normal equations:

\[\mathbf{A}^T \mathbf{A} \cdot \mathbf{h} = \mathbf{A}^T \mathbf{b}\]

so

\[\mathbf{h} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}\]
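
A minimal Eigen sketch of this inhomogeneous DLT (the helper homographyH33 and the point containers are hypothetical; pts1 maps to pts2):

#include <Eigen/Dense>
#include <vector>

// Solve for H with the constraint h33 = 1, illustration only
Eigen::Matrix3d homographyH33(const std::vector<Eigen::Vector2d>& pts1,
                              const std::vector<Eigen::Vector2d>& pts2)
{
    const int n = static_cast<int>(pts1.size());   // n >= 4
    Eigen::MatrixXd A(2 * n, 8);
    Eigen::VectorXd b(2 * n);
    for (int i = 0; i < n; ++i) {
        const double x  = pts1[i].x(), y  = pts1[i].y();
        const double xp = pts2[i].x(), yp = pts2[i].y();
        // x'(h31 x + h32 y + 1) = h11 x + h12 y + h13, and similarly for y'
        A.row(2 * i)     << x, y, 1, 0, 0, 0, -xp * x, -xp * y;
        b(2 * i)     = xp;
        A.row(2 * i + 1) << 0, 0, 0, x, y, 1, -yp * x, -yp * y;
        b(2 * i + 1) = yp;
    }
    // least-squares solution of A h = b (equivalent to the normal equations above)
    Eigen::VectorXd h = A.colPivHouseholderQr().solve(b);
    Eigen::Matrix3d H;
    H << h(0), h(1), h(2),
         h(3), h(4), h(5),
         h(6), h(7), 1.0;
    return H;
}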

Constraint 2: $\|\mathbf{H}\|_F = 1$

Linear system:

\[\mathbf{A} \cdot \mathbf{h} = \mathbf{0}\]

Solve:

\[\mathbf{A}^T \mathbf{A} \cdot \mathbf{h} = \mathbf{0}\]

Solve the above in the least-squares sense (the solution is only defined up to scale):

\[\min_{\mathbf{h}} \|(\mathbf{A}^T \mathbf{A}) \cdot \mathbf{h}\|^2 \quad s.t. \quad \|\mathbf{h}\| = 1 \quad \text{or} \quad {\|\mathbf{H}\|}_F = 1\]

Compute the SVD (equivalently, the eigendecomposition) of $\mathbf{A}^T \mathbf{A}$:

\[\text{SVD}(\mathbf{A}^T \mathbf{A}) = \mathbf{U} \boldsymbol{\Sigma} \mathbf{U}^T\]

Finally, $\mathbf{h}$ is the column of $\mathbf{U}$ (unit eigenvector) associated with the smallest eigenvalue in $\boldsymbol{\Sigma}$; if only 4 point pairs are used, that eigenvalue is 0.

H in PTAM

Computation of the homography matrix in PTAM:

Matrix<3> HomographyInit::HomographyFromMatches(vector<HomographyMatch> vMatches)
{
    unsigned int nPoints = vMatches.size();
    assert(nPoints >= 4);
    int nRows = 2*nPoints;
    if(nRows < 9)
        nRows = 9;
    Matrix<> m2Nx9(nRows, 9);
    for(unsigned int n=0; n<nPoints; n++)
    {
        double u = vMatches[n].v2CamPlaneSecond[0];
        double v = vMatches[n].v2CamPlaneSecond[1];

        double x = vMatches[n].v2CamPlaneFirst[0];
        double y = vMatches[n].v2CamPlaneFirst[1];

        // [u v]T = H [x y]T
        m2Nx9[n*2+0][0] = x;
        m2Nx9[n*2+0][1] = y;
        m2Nx9[n*2+0][2] = 1;
        m2Nx9[n*2+0][3] = 0;
        m2Nx9[n*2+0][4] = 0;
        m2Nx9[n*2+0][5] = 0;
        m2Nx9[n*2+0][6] = -x*u;
        m2Nx9[n*2+0][7] = -y*u;
        m2Nx9[n*2+0][8] = -u;

        m2Nx9[n*2+1][0] = 0;
        m2Nx9[n*2+1][1] = 0;
        m2Nx9[n*2+1][2] = 0;
        m2Nx9[n*2+1][3] = x;
        m2Nx9[n*2+1][4] = y;
        m2Nx9[n*2+1][5] = 1;
        m2Nx9[n*2+1][6] = -x*v;
        m2Nx9[n*2+1][7] = -y*v;
        m2Nx9[n*2+1][8] = -v;
    }

    if(nRows == 9)
        for(int i=0; i<9; i++)  // Zero the last row of the matrix,
            m2Nx9[8][i] = 0.0;  // TooN SVD leaves out the null-space otherwise

    // The right null-space of the matrix gives the homography...
    SVD<> svdHomography(m2Nx9);
    Vector<9> vH = svdHomography.get_VT()[8];
    Matrix<3> m3Homography;
    m3Homography[0] = vH.slice<0,3>();
    m3Homography[1] = vH.slice<3,3>();
    m3Homography[2] = vH.slice<6,3>();
    return m3Homography;
};

Rotation and translation from H

  • Motion and structure from motion in a piecewise planar environment

Handwritten notes

Degenerate Case

The computation of the two-view geometry requires that the matches originate from a 3D scene and that the motion is more than a pure rotation. If the observed scene is planar, the fundamental matrix is only determined up to three degrees of freedom. The same is true when the camera motion is a pure rotation; in this last case (there is only one center of projection) depth cannot be observed. In the absence of noise, detecting these degenerate cases would not be too hard. Starting from real, and thus noisy, data, the problem is much harder, since the remaining degrees of freedom in the equations are then determined by noise. Different models are therefore evaluated: the fundamental matrix (corresponding to a 3D scene and more than a pure rotation), a general homography (corresponding to a planar scene), and a rotation-induced homography. Selecting the model with the smallest residual would always yield the most general model.

  • If the scene is planar, nearly planar, or there is low parallax, the motion can be explained by a homography.

  • A non-planar scene with enough parallax can only be explained by the fundamental matrix, but a homography explaining a subset of the matches can also be found if they lie on a plane or have low parallax (they are far away). A concrete selection heuristic is given below.
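
For reference (this heuristic comes from the ORB-SLAM paper, not from VINS-Mono), ORB-SLAM computes symmetric-transfer-error scores $S_H$ and $S_F$ for the homography and fundamental-matrix models and selects the homography when

\[R_H = \frac{S_H}{S_H + S_F} > 0.45\]

which captures the planar, low-parallax, and pure-rotation cases; otherwise the fundamental-matrix model is chosen.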

Reference

  • Multiple View Geometry in Computer Vision, Chapter 9: Epipolar Geometry and the Fundamental Matrix
  • 《视觉SLAM十四讲》 (14 Lectures on Visual SLAM)