5. Challenges and Future Work
Considering the current state of knowledge in spatial technologies and SDTs, we have identified a key set of challenges and opportunities that need immediate attention from researchers and practitioners in order to build a sustainable SDT. In this section, we discuss these challenges and list some important directions for future work.
5.1. Multi-modal and Multi-resolution Data Acquisition
Most existing research in this domain [10] highlights data acquisition and integration as one of the major challenges in SDTs. An SDT involves the acquisition of a wide variety of spatial and associated non-spatial data. Because an SDT needs to use a wide range of devices to capture data of different spatial and temporal resolution, scale, and precision, the quality of these data varies widely. To the best of our knowledge, no longitudinal study has benchmarked the devices used to capture various spatial data so that data from different sources/devices can be integrated seamlessly. For example, the integration of BIM and 3D GIS data still remains a challenge due to the generation process (or data sources) and the differences in the standards used in these two formats.
5.2. NLP for Spatial Queries
Current query processing techniques on SDTs are limited to running SQL queries on relational or NoSQL database systems or running textual queries on map services (e.g., finding a POI on Google Maps). Recent breakthroughs in NLP enable researchers to devise text-to-SQL techniques that automatically translate natural language text to SQL and run the query to retrieve answers from database tables [8]. Although the accuracy of such approaches is still not good enough for commercial use, very large conversational generative language models, such as GPT-3/4 [99] and LaMDA [100], are showing great promise in natural language based query processing in database systems. We envision considerable scope for research in natural language based query processing in SDTs that exploits the power of these very large language models. Because spatial data, their relationships, and the other associated data describing spatial entities make data interaction use cases complex, it would be interesting to see how different spatial (e.g., adjacent/nearby) and structural (e.g., road network) properties can be used along with tabular data to answer user queries on SDTs.
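To make the translation target concrete, the following minimal sketch shows the kind of SQL a text-to-SQL system would need to produce for a question involving a "nearby" relation. Everything in it is an illustrative assumption rather than part of the cited systems: the `poi` table, its rows and coordinates, the `dist_m` helper, and the use of in-memory SQLite with a registered haversine function as a stand-in for a real spatial database.

```python
import math
import sqlite3

def dist_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical points of interest; names and coordinates are made up.
conn = sqlite3.connect(":memory:")
conn.create_function("dist_m", 4, dist_m)  # expose the helper to SQL
conn.execute("CREATE TABLE poi (name TEXT, category TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO poi VALUES (?, ?, ?, ?)",
    [
        ("Library", "library", -37.9100, 145.1340),
        ("Cafe A", "cafe", -37.9110, 145.1345),  # roughly 120 m from the library
        ("Cafe B", "cafe", -37.9300, 145.1600),  # a few kilometres away
    ],
)

# Natural language question: "Which cafes are within 500 m of the library?"
# A text-to-SQL system must map "within 500 m" to a spatial predicate,
# not just to a filter on tabular columns.
TARGET_SQL = """
SELECT c.name
FROM poi AS c
JOIN poi AS l ON l.name = 'Library'
WHERE c.category = 'cafe'
  AND dist_m(c.lat, c.lon, l.lat, l.lon) <= 500
ORDER BY c.name
"""

nearby_cafes = [row[0] for row in conn.execute(TARGET_SQL)]
print(nearby_cafes)
```

The spatial predicate is the part that plain tabular text-to-SQL does not cover. Structural properties such as road-network distance would require yet another kind of operator in place of the straight-line `dist_m`.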
5.3. Benchmarking the Databases and Big Data Platform for SDT
Some recent research works experimentally evaluate different spatial databases and big data platforms, but under limited settings. In [43], the authors compared the performance of Oracle Spatial and PostgreSQL using a small spatial dataset consisting of New York City census blocks and streets, where they used select and range queries to measure the performance. In [101], the authors compared MongoDB and PostgreSQL on spatio-temporal range and proximity queries, using polygon and vessel movement (a sequence of lat-long pairs) datasets. Another recent work [102] compared a GeoSpark based system against three database technologies, namely MongoDB, PostgreSQL, and Amazon EC2 services, where they also used polygon and vessel movement datasets to measure the performance of range and proximity queries. As we have observed from the current research, database technologies in this domain are still not mature and only support basic spatial data and spatial queries. As an SDT generally hosts various forms of data, ranging from 3D building data to continuous streams of energy consumption, the effectiveness of handling these data in existing platforms has not been benchmarked yet.
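The two query types used in these comparisons can be illustrated with a small timing harness. The sketch below is not the methodology of [43], [101], or [102]: it assumes synthetic 2D points in the unit square and a linear-scan baseline, and the function names (`make_points`, `range_query`, `proximity_query`, `time_query`) are hypothetical. In an actual benchmark, each query function would instead issue the equivalent query to the database or platform under test.

```python
import random
import time

def make_points(n, seed=0):
    """Generate n synthetic 2D points in the unit square (reproducible)."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)) for _ in range(n)]

def range_query(points, xmin, ymin, xmax, ymax):
    """Return all points inside the axis-aligned rectangle (linear scan)."""
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]

def proximity_query(points, cx, cy, radius):
    """Return all points within `radius` of (cx, cy) (linear scan)."""
    r2 = radius * radius
    return [p for p in points if (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= r2]

def time_query(fn, *args, repeats=5):
    """Run fn(*args) several times; return (best elapsed seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result

points = make_points(10_000)
range_time, in_box = time_query(range_query, points, 0.2, 0.2, 0.4, 0.4)
prox_time, in_circle = time_query(proximity_query, points, 0.5, 0.5, 0.1)

print(f"range query:     {len(in_box)} hits, best of 5 = {range_time:.6f} s")
print(f"proximity query: {len(in_circle)} hits, best of 5 = {prox_time:.6f} s")
```

Taking the best of several runs reduces noise from caching and scheduling. A benchmark for an SDT would additionally need workloads over 3D building data and continuous streams, which this sketch does not attempt to cover.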
Authors:
(1) Mohammed Eunus Ali, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, Dhaka, 1000, Bangladesh;
(2) Muhammad Aamir Cheema, Faculty of Information Technology, Monash University, 20 Exhibition Walk, Clayton, 3164, VIC, Australia;
(3) Tanzima Hashem, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, Dhaka, 1000, Bangladesh;
(4) Anwaar Ulhaq, School of Computing, Charles Sturt University, Port Macquarie, 2444, NSW, Australia;
(5) Muhammad Ali Babar, School of Computer and Mathematical Sciences, The University of Adelaide, Adelaide, 5005, SA, Australia.