Details
Original language | English |
---|---|
Article number | 108951 |
Number of pages | 18 |
Journal | Engineering Applications of Artificial Intelligence |
Volume | 137 |
Early online date | 8 Aug 2024 |
Publication status | E-pub ahead of print - 8 Aug 2024 |
Abstract
In the realm of digital twin technology, image localization emerges as a crucial aspect, particularly in the challenging domain of civil engineering construction. Unlike the data-rich environments typical of structure-from-motion (SfM) technologies, the construction phase of civil engineering projects often faces economic constraints that limit data collection. This results in sporadic and localized snapshots, rather than comprehensive spatial and temporal coverage of the entire scene. Such prevalent data sparsity poses significant challenges to achieving accurate image localization. Our research is tailored to address this specific challenge, focusing on single image localization in environments where data is inherently sparse. We introduce a multi-scale convolutional attention network, incorporating feature-fused adversarial components, to effectively navigate the complexities of sparse data typical in civil engineering construction sites. The network employs large kernel convolutions for refined channel and spatial attention, ensuring precise location information transmission, even in data-limited scenarios. This accuracy is further augmented by multi-scale convolutional layers and a multi-level discriminator network, aiming to minimize the domain shift between virtual and real-world imagery. Our approach was rigorously tested and subjected to ablation studies on two public datasets, confirming its efficacy. In indoor settings, we achieved a median localization accuracy of 1.12 m and 9.80°, and in outdoor environments, our best results were 3.69 m and 1.67°. These outcomes highlight the effectiveness of our method in addressing the unique challenges posed by data sparsity in civil engineering construction. We also investigated the impact of domain adaptation on localization accuracy across different feature levels, finding that its effect varies depending on the degree of alignment between virtual and real datasets.
In conclusion, this study offers a significant contribution to image localization in digital twin technology, particularly in the challenging context of data-sparse civil engineering construction processes. It paves the way for future research in optimizing image localization techniques in similar sparse data environments.
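The abstract reports results as a median position error in metres and a median orientation error in degrees. The record does not spell out the evaluation protocol, so the following is only a minimal sketch of how such a pair of numbers is commonly computed in visual localization work. It assumes positions are 3D points, orientations are quaternions in (w, x, y, z) order, and the function names are hypothetical rather than taken from the paper.

```python
import math
from statistics import median


def translation_error(t_pred, t_true):
    """Euclidean distance between predicted and ground-truth positions (metres)."""
    return math.dist(t_pred, t_true)


def rotation_error_deg(q_pred, q_true):
    """Angle in degrees between two orientation quaternions given as (w, x, y, z)."""
    norm_p = math.sqrt(sum(c * c for c in q_pred))
    norm_t = math.sqrt(sum(c * c for c in q_true))
    dot = sum(a * b for a, b in zip(q_pred, q_true)) / (norm_p * norm_t)
    # abs() treats q and -q as the same rotation; min() guards against rounding above 1.
    dot = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(dot))


def median_errors(predictions, ground_truth):
    """Return (median metres, median degrees) over paired (position, quaternion) poses.

    Assumption: each pose is a tuple (position, quaternion); this layout is
    illustrative and not taken from the paper.
    """
    t_errs = [translation_error(p[0], g[0]) for p, g in zip(predictions, ground_truth)]
    r_errs = [rotation_error_deg(p[1], g[1]) for p, g in zip(predictions, ground_truth)]
    return median(t_errs), median(r_errs)
```

Reporting the median rather than the mean keeps a few badly localized images from dominating the summary, which matters when test images are sparse.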
Keywords
- 3D model, Domain adaptation, Large kernel attention, Synthetic dataset, Visual localization
ASJC Scopus subject areas
- Engineering(all)
- Control and Systems Engineering
- Computer Science(all)
- Artificial Intelligence
- Electrical and Electronic Engineering
In: Engineering Applications of Artificial Intelligence, Vol. 137, 108951, 11.2024.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Improving single image localization through domain adaptation and large kernel attention with synthetic data
AU - Yao, Dansheng
AU - Zhu, Hehua
AU - Ren, Bangke
AU - Zhuang, Xiaoying
N1 - Publisher Copyright: © 2024
PY - 2024/8/8
Y1 - 2024/8/8
AB - In the realm of digital twin technology, image localization emerges as a crucial aspect, particularly in the challenging domain of civil engineering construction. Unlike the data-rich environments typical of structure-from-motion (SfM) technologies, the construction phase of civil engineering projects often faces economic constraints that limit data collection. This results in sporadic and localized snapshots, rather than comprehensive spatial and temporal coverage of the entire scene. Such prevalent data sparsity poses significant challenges to achieving accurate image localization. Our research is tailored to address this specific challenge, focusing on single image localization in environments where data is inherently sparse. We introduce a multi-scale convolutional attention network, incorporating feature-fused adversarial components, to effectively navigate the complexities of sparse data typical in civil engineering construction sites. The network employs large kernel convolutions for refined channel and spatial attention, ensuring precise location information transmission, even in data-limited scenarios. This accuracy is further augmented by multi-scale convolutional layers and a multi-level discriminator network, aiming to minimize the domain shift between virtual and real-world imagery. Our approach was rigorously tested and subjected to ablation studies on two public datasets, confirming its efficacy. In indoor settings, we achieved a median localization accuracy of 1.12 m and 9.80°, and in outdoor environments, our best results were 3.69 m and 1.67°. These outcomes highlight the effectiveness of our method in addressing the unique challenges posed by data sparsity in civil engineering construction. We also investigated the impact of domain adaptation on localization accuracy across different feature levels, finding that its effect varies depending on the degree of alignment between virtual and real datasets.
In conclusion, this study offers a significant contribution to image localization in digital twin technology, particularly in the challenging context of data-sparse civil engineering construction processes. It paves the way for future research in optimizing image localization techniques in similar sparse data environments.
KW - 3D model
KW - Domain adaptation
KW - Large kernel attention
KW - Synthetic dataset
KW - Visual localization
UR - http://www.scopus.com/inward/record.url?scp=85200634854&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2024.108951
DO - 10.1016/j.engappai.2024.108951
M3 - Article
AN - SCOPUS:85200634854
VL - 137
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
SN - 0952-1976
M1 - 108951
ER -