
Top Remote Sensing Datasets for Training and Evaluating AI Models | HackerNoon

News Room
Last updated: 2025/06/10 at 9:52 AM
News Room Published 10 June 2025

Table of Links

  1. Abstract and Introduction
  2. Backgrounds
  3. Type of remote sensing sensor data
  4. Benchmark remote sensing datasets for evaluating learning models
  5. Evaluation metrics for few-shot remote sensing
  6. Recent few-shot learning techniques in remote sensing
  7. Few-shot based object detection and segmentation in remote sensing
  8. Discussions
  9. Numerical experimentation of few-shot classification on UAV-based dataset
  10. Explainable AI (XAI) in Remote Sensing
  11. Conclusions and Future Directions
  12. Acknowledgements, Declarations, and References

4 Benchmark remote sensing datasets for evaluating learning models

This section gives a brief overview of the benchmark datasets commonly used to evaluate algorithms in remote sensing. The datasets are grouped by the type of remote sensing data and the platform from which they were collected. Researchers frequently use these datasets to evaluate and benchmark their algorithms, and although they are not included in the survey works by [8], they are essential for this review.

4.1 Hyperspectral image dataset

4.1.1 Satellite-Based Data

Most of the datasets described here are tailored to multi-label image classification, although a few single-label classification datasets exist.

• Pavia [20, 21]: The Pavia University research team created a hyperspectral image dataset with images consisting of 610 × 610 pixels and 103 spectral bands. Each image in the dataset is a classification map with 9 classes that include mostly urban contexts such as bitumen, brick, and asphalt. The dataset comprises 42,776 labeled images and is specifically designed for multi-label classification.

• Indian Pines [20, 22]: The dataset contains hyperspectral images of a particular landscape in Indiana. It is a multi-label classification dataset where each map consists of 145 × 145 pixels and 224 spectral bands. There are 16 semantic labels available for each map, and the dataset has a total of 10,249 samples.

• Salinas Valley [20]: The Salinas Valley dataset consists of hyperspectral images collected from California, with multi-label classification maps of pixel size 512 × 217 and 224 spectral bands, similar to the Indian Pines dataset. There are 16 semantic classes with 54,129 samples. A subset of the Salinas dataset, referred to as Salinas-A, includes only 86 × 86 image pixels of 6 classes, with a total of 5,348 samples.

• Houston [18]: The Hyperspectral Image Analysis group in collaboration with the NSF Funded Center for Airborne Laser Mapping (NCALM) has acquired images across the University of Houston. This dataset comprises 16 semantic classes of urban objects such as highways, railways, and tennis courts, unlike the Botswana, Indian Pines, and Salinas Valley datasets. The images have 144 spectral bands in the 380 nm to 1050 nm region, and each image has a pixel size of 349 × 1905. The dataset is designed for evaluating multi-label image classification.

• BigEarthNet [23]: The dataset consists of pairs of Sentinel-2 images captured by a multi-spectral sensor, with 590,326 pairs collected from 10 European countries. Each image in the pair has a size of 120 × 120 pixels and covers 13 spectral bands. The dataset is annotated with multiple land-cover classes or labels, making it suitable for multi-label classification evaluation.

• EuroSat [24]: The dataset consists of images obtained from the Sentinel-2 satellite, covering 13 spectral bands with 10 classes and 27,000 labeled samples. It is used for evaluating single-label land cover and land use classification. Each image has a pixel size of 64 × 64.

• SEN12MS [25]: The dataset comprises 180,662 images captured from Sentinel-1 and Sentinel-2, with four cover types categorized using different classification schemes. Each image is of size 256 × 256 and contains different spectral bands. The images are annotated with multiple land-cover labels, but the primary objective is to use these labels to infer the overall context of the scene, such as forest, grasslands, or savanna, which makes the dataset suitable for single-label scene classification. Note that Sentinel-1 images are SAR images, so the dataset is also useful for SAR-based map classification.
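To make the "pixels × bands" description concrete, the following is a minimal sketch of how a hyperspectral scene with the Indian Pines dimensions quoted above (145 × 145 pixels, 224 bands, 16 classes) can be turned into per-pixel samples. The cube and ground-truth map are random stand-ins, and the convention that label 0 marks an unlabeled pixel is an assumption of this sketch, not something stated in the text. A 145 × 145 scene has 21,025 pixels in total, so only a subset of pixels would normally carry a label.

```python
import numpy as np

# Dimensions quoted in the text for Indian Pines.
HEIGHT, WIDTH, BANDS = 145, 145, 224
NUM_CLASSES = 16

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): random reflectance values and a random
# ground-truth map in which 0 means "unlabeled" and 1..16 are class ids.
cube = rng.random((HEIGHT, WIDTH, BANDS), dtype=np.float32)
ground_truth = rng.integers(0, NUM_CLASSES + 1, size=(HEIGHT, WIDTH))


def labeled_pixel_samples(cube, ground_truth):
    """Return (spectra, labels) for every pixel with a non-zero label."""
    pixels = cube.reshape(-1, cube.shape[-1])  # (H*W, bands)
    labels = ground_truth.reshape(-1)          # (H*W,)
    mask = labels > 0                          # drop unlabeled pixels
    return pixels[mask], labels[mask]


spectra, labels = labeled_pixel_samples(cube, ground_truth)
print(spectra.shape, labels.shape)
```

Each row of `spectra` is one sample: a single pixel described by its 224-band spectrum, paired with the class id in `labels`.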

4.1.2 UAV-based dataset

• WHU-Hi [26]: The WHU-Hi dataset, which stands for Wuhan UAV-borne Hyperspectral Image, consists of UAV-based images of various crop types gathered in farming areas in Hubei province, China. It is divided into three sub-datasets: WHU-Hi-LongKou, WHU-Hi-HanChuan, and WHU-Hi-HongHu, each with different individual image sizes, numbers of labels/classes, and spectral bands, which are explained in Table 2. The dataset is suitable for evaluating multi-label classification algorithms.

4.2 VHR image-based dataset

4.2.1 Satellite-based datasets

• UC Merced Landuse [27]: The dataset was designed for single-label land use classification and comprises 2,100 RGB images, each of size 256 × 256 pixels. The dataset consists of 21 classes, predominantly related to urban land use.

• ISPRS Potsdam [28]: The International Society for Photogrammetry and Remote Sensing (ISPRS) developed a dataset for algorithmic evaluation of multi-label map classification. The dataset comprises 38 patches/images, each measuring 6000 × 6000 pixels.

• ISPRS Vaihingen [28]: The dataset was created for multi-label map classification and includes 33 patches/images of varying sizes. The pixel size of each patch is 2494 × 2064.

• RESISC45 [29]: The Northwestern Polytechnical University (NWPU) created a dataset for single-label image scene classification. The dataset contains 31,500 images categorized into 45 classes, with each class consisting of 700 images. The pixel size of each image is 256 × 256.

• WHU-RS19 [30]: The dataset is created using satellite images obtained from Google Earth and contains 19 semantic classes, with approximately 50 samples per class. Samples from the same class are extracted from different regions with varying resolutions, scales, orientations, and illuminations. Each image in the dataset is 600 × 600 pixels in size. It is intended for single-label image scene classification.

• AID [31]: The Aerial Image Database (AID) is a collection of 10,000 satellite images gathered from Google Earth, each sized 600 × 600 pixels. The dataset includes 30 classes primarily related to urban environments. As with the RESISC45 and WHU-RS19 datasets, AID is used for single-label image scene classification purposes.
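Because these single-label scene datasets are typically used for few-shot evaluation, the sketch below shows one way an N-way K-shot episode could be drawn from a dataset with the RESISC45 layout described above (45 classes, 700 images per class, 31,500 images in total). The image identifiers are hypothetical placeholders rather than real file names, and the `sample_episode` helper is an illustration, not a procedure taken from the paper.

```python
import random
from collections import defaultdict

# Layout quoted in the text for RESISC45.
NUM_CLASSES, IMAGES_PER_CLASS = 45, 700

# Hypothetical index of (image_id, class_id) pairs; ids are placeholders.
index = [
    (f"img_{c:02d}_{i:03d}", c)
    for c in range(NUM_CLASSES)
    for i in range(IMAGES_PER_CLASS)
]


def sample_episode(index, n_way, k_shot, n_query, rng):
    """Draw one episode: n_way classes, k_shot support and n_query query
    images per class, with no image shared between the two sets."""
    by_class = defaultdict(list)
    for image_id, class_id in index:
        by_class[class_id].append(image_id)

    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(img, c) for img in picked[:k_shot]]
        query += [(img, c) for img in picked[k_shot:]]
    return support, query


support, query = sample_episode(
    index, n_way=5, k_shot=1, n_query=15, rng=random.Random(0)
)
print(len(index), len(support), len(query))  # 31500 5 75
```

Passing a seeded `random.Random` instance keeps episodes reproducible, which matters when different methods are compared on the same set of episodes.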

4.2.2 UAV-based dataset

• AIDER [12]: The Aerial Image Database for Emergency Response (AIDER) is a collection of 8,540 UAV images categorized into four disaster categories: collapsed buildings, fire, flood, and traffic accidents, along with a non-disaster category labeled as "normal" [32]. This is one of the first UAV-based datasets that can be used as a benchmark for visual-based humanitarian aid or search-and-rescue operations in the RGB spectrum.

• SAMA-VTOL [33]: The SAMA-VTOL aerial image dataset is a new dataset developed from images captured by UAVs. This dataset was created to support a broad spectrum of scientific projects within the field of remote sensing. It is particularly useful for research projects focused on 3D object modeling, urban and rural mapping, and the processing of digital elevation and surface models. The objective is to provide high-resolution, low-cost data that contribute to a better understanding of both urban and rural scenes for various applications.

4.3 SAR image-based dataset

• MSTAR [34]: This dataset consists of 5,950 X-band spectral images, each with a size of 128 × 128 pixels, categorized into 10 classes. It is designed specifically for military object recognition and classification.

• OpenSARShip [35]: The dataset includes 11,346 chips of ships captured by C-band SENTINEL-1 SAR imagery, belonging to 17 ship types, and collected from 41 images. Each chip is labeled with automatic identification system messages indicating different environmental conditions. The image sizes of the chips range from 30 × 30 to 120 × 120 pixels.

SAR-based benchmark datasets are clearly fewer than hyperspectral or VHR-based image datasets. According to Fu et al. [36], collecting SAR-based images with fine annotation is more challenging because acquisition is difficult and interpreting and labeling such images is tedious and time-consuming. Furthermore, Rostami et al. [37] stated that the devices used for generating SAR images are costly and that access to the data is strictly regulated because the data are often classified.

Table 1 Summary of some datasets commonly utilized for few-shot learning algorithmic evaluation in the domain of remote sensing.

In Table 2, we have summarized the discussion on the available datasets, highlighting the data type, number of images and classes, pixel sizes, spectral bands (if any), platform, and classification method.
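The summary fields named above can be sketched as a small record type. The class name `DatasetSummary`, the helper functions, and the field names are illustrative assumptions; the values are copied from the dataset descriptions in this section, with `None` used wherever the text does not state a value. The `data_type` follows the section headings under which each dataset is listed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DatasetSummary:
    name: str
    data_type: str                  # section grouping: "hyperspectral", "VHR", "SAR"
    num_images: int
    num_classes: int
    pixel_size: Tuple[int, int]
    spectral_bands: Optional[int]   # None where the text gives no band count
    platform: Optional[str]         # None where the text gives no platform
    classification: Optional[str]   # None where the text gives no label type


# Values taken from the descriptions in this section only.
DATASETS = [
    DatasetSummary("EuroSat", "hyperspectral", 27_000, 10, (64, 64), 13,
                   "satellite", "single-label"),
    DatasetSummary("UC Merced Landuse", "VHR", 2_100, 21, (256, 256), None,
                   "satellite", "single-label"),
    DatasetSummary("RESISC45", "VHR", 31_500, 45, (256, 256), None,
                   "satellite", "single-label"),
    DatasetSummary("AID", "VHR", 10_000, 30, (600, 600), None,
                   "satellite", "single-label"),
    DatasetSummary("MSTAR", "SAR", 5_950, 10, (128, 128), None, None, None),
]


def select(datasets, data_type=None, classification=None):
    """Filter summaries by data type and/or classification method."""
    return [
        d for d in datasets
        if (data_type is None or d.data_type == data_type)
        and (classification is None or d.classification == classification)
    ]


def images_per_class(summary):
    """Average number of images per class (exact only for balanced datasets)."""
    return summary.num_images / summary.num_classes


vhr = select(DATASETS, data_type="VHR")
print([d.name for d in vhr])
print(images_per_class(DATASETS[2]))  # RESISC45: 31,500 / 45 = 700.0
```

Keeping missing values as `None` rather than guessing makes it explicit which table cells cannot be recovered from the prose alone.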

Authors:

(1) Gao Yu Lee, School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore;

(2) Tanmoy Dam, School of Mechanical and Aerospace Engineering, Nanyang Technological University, 65 Nanyang Drive, 637460, Singapore and Department of Computer Science, The University of New Orleans, New Orleans, 2000 Lakeshore Drive, LA 70148, USA;

(3) Md Meftahul Ferdaus, School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore;

(4) Daniel Puiu Poenar, School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore;

(5) Vu N. Duong, School of Mechanical and Aerospace Engineering, Nanyang Technological University, 65 Nanyang Drive, 637460, Singapore.

