Whole Stomach 3D Reconstruction and Frame Localization
from Monocular Endoscope Video

Sho Suzuki2, Takuji Gotoda2, Kenji Miki3
1Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology
2Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine
3Department of Internal Medicine, Tsujinaka Hospital Kashiwanoha
Abstract

Gastric endoscopy is a common clinical practice that enables medical doctors to diagnose various lesions inside a stomach. To identify the location of a gastric lesion, such as early cancer or a peptic ulcer, within the stomach, this work addresses the reconstruction of a color-textured 3D model of the whole stomach from a standard monocular endoscope video and the localization of any selected video frame on the 3D model. We examine how to enable structure-from-motion (SfM) to reconstruct the whole shape of a stomach from endoscope images, which is a challenging task due to the texture-less nature of the stomach surface. We specifically investigate the combined effect of chromo-endoscopy and color channel selection on SfM to increase the number of feature points. We also design a plane fitting-based algorithm for removing 3D point outliers to improve the 3D model quality. We show that whole stomach 3D reconstruction can be achieved (more than 90% of the frames can be reconstructed) by using red channel images captured under chromo-endoscopy, in which indigo carmine (IC) dye is spread on the stomach surface. In the experimental results, we demonstrate the reconstructed 3D models for seven subjects and the application of lesion localization and reconstruction. The methodology and results presented in this paper could offer a valuable reference to other researchers and could also be an excellent tool for gastric surgeons in various computer-aided diagnosis applications.

Summary

This study was conducted in accordance with the Declaration of Helsinki. The Institutional Review Board at Nihon University Hospital approved the study protocol on March 8, 2018, before patient recruitment. Informed consent was obtained from all patients before they were enrolled. This study was registered with the University Hospital Medical Information Network (UMIN) Clinical Trials Registry (identification No.: UMIN000031776) on March 17, 2018. This study was also approved by the research ethics committee of Tokyo Institute of Technology, where the 3D reconstruction experiments were conducted.

Our goal is to reconstruct the whole stomach shape from an endoscope video using a structure-from-motion (SfM) pipeline. We captured the endoscope videos using a standard endoscope system. The data were saved in AVI format at 30 fps. We also captured an additional checkerboard pattern for camera calibration. For each video, we extracted all RGB frames and separated them into two categories: with and without IC dye. We then removed frames that show little to no movement between successive frames. In our early inspection, we found apparent color artifacts in the RGB images. To account for these artifacts, we chose to use single-channel images as the SfM input. We then investigated the effect of color channel selection and the presence of IC dye on the SfM reconstruction result.
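The frame preparation described above can be illustrated with a minimal Python sketch. It is not the code used in this work: the function names, the RGB channel order, the mean-absolute-difference criterion, and the threshold value are all assumptions made for illustration, and each frame is compared against the last kept frame as a simple stand-in for "little to no movement between successive frames".

```python
import numpy as np


def red_channel(frame_rgb):
    """Return the red channel of an H x W x 3 frame (RGB order is assumed)."""
    return frame_rgb[:, :, 0]


def drop_static_frames(frames, min_mean_abs_diff=2.0):
    """Return indices of frames that differ enough from the last kept frame.

    `frames` is a sequence of 2D single-channel images. The threshold
    `min_mean_abs_diff` is a hypothetical value, not taken from the paper.
    """
    kept = []
    last = None
    for idx, frame in enumerate(frames):
        current = frame.astype(np.float32)
        # Always keep the first frame; afterwards keep only frames whose
        # mean absolute intensity change reaches the threshold.
        if last is None or np.mean(np.abs(current - last)) >= min_mean_abs_diff:
            kept.append(idx)
            last = current
    return kept
```

In practice the frames would first be decoded from the AVI file with a video library, and the retained red channel images would then be passed to the SfM software.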

In our extended journal version, we proposed a new plane fitting-based outlier removal strategy and a local reconstruction pipeline. With the new outlier removal pipeline, a higher-resolution mesh can be produced than with the previous method, which needed to downsample the point cloud. The local reconstruction is proposed to reconstruct the area surrounding any frame selected by the surgeon. We believe it helps the surgeon by providing a more detailed reconstruction of a particular area.
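As a rough illustration of what a plane fitting-based outlier test can look like, the following Python sketch fits a local plane to the nearest neighbors of each 3D point and flags points that lie far from that plane. This page does not give the details of the proposed algorithm, so this is a generic technique under stated assumptions rather than the authors' method: the function name, the neighborhood size `k`, the distance threshold `max_dist`, and the brute-force neighbor search are all hypothetical choices.

```python
import numpy as np


def plane_fit_inlier_mask(points, k=8, max_dist=0.05):
    """Return a boolean mask that is True for points close to a local plane.

    For each point, a plane is fitted (least squares, via SVD) to its k
    nearest neighbors, and the point is kept if its distance to that plane
    is at most `max_dist`. Both parameters are illustrative assumptions.
    """
    points = np.asarray(points, dtype=np.float64)
    n = len(points)
    # Brute-force pairwise squared distances; fine for a small sketch only.
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    inlier = np.zeros(n, dtype=bool)
    for i in range(n):
        # Skip index 0 of the sorted list, which is the point itself.
        nbr_idx = np.argsort(d2[i])[1:k + 1]
        nbrs = points[nbr_idx]
        centroid = nbrs.mean(axis=0)
        # The plane normal is the right singular vector that belongs to
        # the smallest singular value of the centered neighbors.
        _, _, vt = np.linalg.svd(nbrs - centroid)
        normal = vt[-1]
        inlier[i] = abs(np.dot(points[i] - centroid, normal)) <= max_dist
    return inlier
```

For a real SfM point cloud, the brute-force search would be replaced by a spatial index such as a k-d tree, and the threshold would need to be chosen relative to the scale of the reconstruction.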

Flowchart
The flow of overall pipeline

The flow of local reconstruction
Paper
[Accepted for publication, to appear]
[To appear]
[arXiv page] 3D Reconstruction of Whole Stomach from Endoscope Video Using Structure-from-Motion

@misc{widya20193d,
title={3D Reconstruction of Whole Stomach from Endoscope Video Using Structure-from-Motion},
author={Aji Resindra Widya and Yusuke Monno and Kosuke Imahori and Masatoshi Okutomi and Sho Suzuki and Takuji Gotoda and Kenji Miki},
year={2019},
eprint={1905.12988},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Color channel and IC dye

Red channel

Green channel

Blue channel

Color artifacts in RGB images. Examples of endoscope images captured without IC dye (top) and with IC dye (bottom). A color channel misalignment problem is observed in the RGB images. The images that follow show the color channels extracted from the RGB images, in R, G, B order. We can observe that the IC dye adds texture to the stomach surface, especially in the red channel.

Result

All results in this section were reconstructed from red channel images captured with IC dye.

Subject A input images. Red dots in each frame represent extracted feature points.

Subject B input images. Red dots in each frame represent extracted feature points.

Subject C input images. Red dots in each frame represent extracted feature points.

Subject D input images. Red dots in each frame represent extracted feature points.

Subject E input images. Red dots in each frame represent extracted feature points.

Subject F input images. Red dots in each frame represent extracted feature points.

Subject G input images. Red dots in each frame represent extracted feature points. In this part, the local reconstruction result is also shown.