A flexible and efficient real-time ORB-based full-HD image feature extraction accelerator

Features

1) Serial-parallel hybrid architecture: the three scales of an octave are processed in parallel, and additional octaves are processed serially, which improves flexibility.

2) In conjunction with this architecture, images are loaded from external memory block by block, and the overlapping data is reused in an on-chip buffer, saving 17% of external memory bandwidth.

3) Thanks to the block-wise dataflow, keypoints are balanced in two dimensions through a hardware binary tree, increasing the valid match ratio by 7%.

4) For orientation estimation and descriptor generation, multi-bank data interleaving, discrete sampling, approximate computing and superscalar computing are adopted, reducing processing time by 92% and 67%, respectively.
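
The effect of balancing keypoints in two dimensions (feature 3) can be illustrated in software. The sketch below is not the accelerator's hardware binary tree, whose internals are not described here. It assumes a much simpler scheme: the image is divided into square blocks and each block keeps only its strongest keypoints. The names `balance_keypoints` and `global_top`, the `(x, y, score)` tuple layout and the 32-pixel block size are all hypothetical and chosen only for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical keypoint layout: (x, y, corner score).
Keypoint = Tuple[int, int, float]


def global_top(keypoints: List[Keypoint], total: int) -> List[Keypoint]:
    """Baseline without balancing: keep the `total` strongest keypoints."""
    return heapq.nlargest(total, keypoints, key=lambda kp: kp[2])


def balance_keypoints(keypoints: List[Keypoint], block: int,
                      per_block: int) -> List[Keypoint]:
    """Keep at most `per_block` strongest keypoints in each block x block cell."""
    cells: Dict[Tuple[int, int], List[Keypoint]] = {}
    for x, y, score in keypoints:
        # Integer division assigns every keypoint to exactly one cell.
        cells.setdefault((x // block, y // block), []).append((x, y, score))
    kept: List[Keypoint] = []
    for cell in sorted(cells):
        kept.extend(heapq.nlargest(per_block, cells[cell], key=lambda kp: kp[2]))
    return kept


def covered_cells(keypoints: List[Keypoint], block: int) -> int:
    """Number of distinct cells that contain at least one keypoint."""
    return len({(x // block, y // block) for x, y, _ in keypoints})


if __name__ == "__main__":
    # Strong keypoints clustered in the top-left cell, weaker ones elsewhere.
    kps = [(1, 1, 0.9), (2, 3, 0.8), (5, 4, 0.7),
           (40, 10, 0.3), (10, 45, 0.2), (50, 50, 0.1)]
    unbalanced = global_top(kps, total=4)
    balanced = balance_keypoints(kps, block=32, per_block=1)
    print("cells covered without balancing:", covered_cells(unbalanced, 32))
    print("cells covered with balancing:   ", covered_cells(balanced, 32))
```

With the same budget of four keypoints, the global selection concentrates on the cluster of strong corners, while the per-block quota spreads the keypoints over more of the image. A wider spatial spread of this kind is the property that feature 3 credits for the higher valid match ratio.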

Architecture of the accelerator

Match ratio comparison (software, hardware without keypoint distribution balancing, hardware with keypoint distribution balancing)