Papers
Title | Keyword | Venue | Link |
--- | --- | --- | --- |
Learning Multi-Scene Absolute Pose Regression with Transformers | Camera Localization; Transformer; Multi-Scene APR | ICCV'21 | [Paper] [Code] |
Image-Based Localization Using Hourglass Networks | Camera Localization; Encoder-Decoder CNN | ICCVW'17 | [Paper] |
Geometric Loss Functions for Camera Pose Regression with Deep Learning | Geometric Reprojection; Homoscedastic Uncertainty; Camera Localization; CNN | CVPR'17 | [Paper] |
Image-Based Localization Using LSTMs for Structured Feature Correlation | Camera Localization; LSTM; CNN; Large-Scale Indoor Localization | ICCV'17 | [Paper] [Dataset] |
Sequential Modeling Enables Scalable Learning for Large Vision Models | Large Vision Model (LVM); Unified Vision Dataset (UVD); Auto-Regressive; Pretrained; Transformer | arXiv'23 | [Paper] [Code] [Dataset] [Project] |
Learning to Detect Scene Landmarks for Camera Localization | Camera Localization; Scene Landmarks; CNN; Indoor-6 | CVPR'22 | [Paper] [Code] [Dataset] |
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions | GPT-4V; Large Multi-modal Models (LMMs); Pre-training; Image Captioning | arXiv'23 | [Paper] [Website] |
From Coarse to Fine: Robust Hierarchical Localization at Large Scale | Camera Localization; 6-DoF; SfM; CNN; Hierarchical Framework | CVPR'19 | [Paper] [Code] |
NavMarkAR: A Landmark-based Augmented Reality (AR) Wayfinding System for Enhancing Spatial Learning of Older Adults | HCI; Augmented Reality (AR); Indoor Navigation; Multimodal Interaction; Older Adults | arXiv'23 | [Paper] |
SparsePose: Sparse-View Camera Pose Regression and Refinement | Camera Pose Regression; Sparse Set; Wide-Baseline Images | CVPR'23 | [Paper] [Code] |
Image Captioning State-of-the-Art: Is it enough for the Guidance of Visually Impaired in an Environment? | Image Captioning; VIPs | CSA'22 | [Paper] |
Grounding Answers for Visual Questions Asked by Visually Impaired People | Visual Question Answering (VQA); Answer Grounding; VIPs; VizWiz-VQA-Grounding | CVPR'22 | [Paper] [Dataset] |
Structure-from-Motion Revisited | SfM; 3D Reconstruction; COLMAP | CVPR'16 | [Paper] [Code (COLMAP)] |
ViNav: A Vision-Based Indoor Navigation System for Smartphones | Indoor Navigation System; Smartphone System; SfM; OCR; Dead Reckoning; WiFi Fingerprinting | TMC'18 | [Paper] |
Understanding the Limitations of CNN-based Absolute Camera Pose Regression | PoseNet; MapNet; Image Retrieval; Absolute Camera Pose Regression (APR); Relative Camera Pose Regression (RPR) | CVPR'19 | [Paper] [Code] |
A Preliminary Study on the Possibility of Scene Captioning Model Integration as an Improvement in Assisted Navigation for Visually Impaired Users | Visually Impaired (VI); NAS; Image Captioning; RGB-D | AsiaSim'23 | [Paper] |
ClipCap: CLIP Prefix for Image Captioning | Image Captioning; Prefix; CLIP; GPT; VLP | arXiv'21 | [Paper] [Code] |
"I Want to Figure Things Out": Supporting Exploration in Navigation for People with Visual Impairments | Human Computer Interaction (HCI); Visually Impaired People (VIPs); Navigation Assistance Systems (NASs); Accessibility | PACMHCI'23 | [Paper] |
Segment Anything | Promptable Segmentation; PFM; SA-1B | ICCV'23 | [Paper] [Code] [Website] |
Camera Pose Auto-Encoders for Improving Pose Regression | Camera Pose Regression; Auto-Encoders; Image Reconstruction | ECCV'22 | [Paper] [Code] |
PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization | Camera Pose Regression; 6-DoF; CNN; Cambridge Landmarks | ICCV'15 | [Paper] [Code] [Dataset] |
Real-time Vision-based Navigation for a Robot in an Indoor Environment | SAM; Obstacle Avoidance; BEV; Path Planning; A* Algorithm | arXiv'23 | [Paper] [Code] [Appendix] [Dataset] [Video] |
IBeaconMap: Automated Indoor Space Representation for Beacon-Based Wayfinding | Indoor Navigation; Beacon Planning; Floor Plan | ICCHP'20 | [Paper] |
Surveys
Title | Keyword | Venue | Link |
--- | --- | --- | --- |
Vision-language navigation: a survey and taxonomy | Vision-and-Language Navigation (VLN); Taxonomy | NCA'23 | [Paper] |
Indoor Navigation Systems for Visually Impaired Persons: Mapping the Features of Existing Technologies to User Needs | VIPs; Indoor Navigation System; Sensor; Assistive Device; Meta-Analysis | Sensors'20 | [Paper] |
A Critical Analysis of Image-based Camera Pose Estimation Techniques | Camera Pose Regression; Structure-Based Localization; Absolute/Relative Pose Regression | arXiv'22 | [Paper] |
Benchmarking
Title | Keyword | Venue | Link |
--- | --- | --- | --- |
Fiducial Markers for Pose Estimation: Overview, Applications and Experimental Comparison of the ARTag, AprilTag, ArUco and STag Markers | Fiducial Markers; Pose Estimation; Localization | JINT'21 | [Paper] |