
How AI Detects Cancer in Whole Slide Images | HackerNoon

News Room
Last updated: 2025/07/16 at 8:10 PM
News Room Published 16 July 2025

Authors:

(1) Martim Afonso, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, Lisbon, 1049-001, Portugal;

(2) Praphulla M.S. Bhawsar, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, 20850, Maryland, USA;

(3) Monjoy Saha, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, 20850, Maryland, USA;

(4) Jonas S. Almeida, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, 20850, Maryland, USA;

(5) Arlindo L. Oliveira, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, Lisbon, 1049-001, Portugal and INESC-ID, R. Alves Redol 9, Lisbon, 1000-029, Portugal.

Table of Links

Abstract and 1. Introduction

  2. Materials and Methods

    2.1. Multiple Instance Learning

    2.2. Model Architectures

  3. Results

    3.1. Training Methods

    3.2. Datasets

    3.3. WSI Preprocessing Pipeline

    3.4. Classification and RoI Detection Results

  4. Discussion

    4.1. Tumor Detection Task

    4.2. Gene Mutation Detection Task

  5. Conclusions

  6. Acknowledgements

  7. Author Declaration and References

ABSTRACT

Whole Slide Images (WSI), obtained by high-resolution digital scanning of microscope slides at multiple scales, are the cornerstone of modern Digital Pathology. However, they pose a particular challenge to AI-based/AI-mediated analysis because pathology labeling is typically done at the slide level rather than the tile level. It is not only medical diagnoses that are recorded at the specimen level: the detection of oncogene mutations is also obtained experimentally, and recorded by initiatives like The Cancer Genome Atlas (TCGA), at the slide level. This poses a dual challenge: a) accurately predicting the overall cancer phenotype and b) finding out which cellular morphologies are associated with it at the tile level. To address these challenges, a weakly supervised Multiple Instance Learning (MIL) approach was explored for two prevalent cancer types, Invasive Breast Carcinoma (TCGA-BRCA) and Lung Squamous Cell Carcinoma (TCGA-LUSC). This approach was explored for tumor detection at low magnification levels and for TP53 mutation detection at various magnification levels. Our results show that a novel additive implementation of MIL matched the performance of the reference implementation (AUC 0.96), and was only slightly outperformed by Attention MIL (AUC 0.97). More interestingly from the perspective of the molecular pathologist, these different AI architectures show distinct sensitivities to morphological features (through the detection of Regions of Interest, RoI) at different magnification levels. Tellingly, TP53 mutation was most sensitive to features at the higher magnifications, where cellular morphology is resolved.

1. Introduction

Whole slide imaging is the automated process of digitally scanning whole microscope slides at high resolution. During this process, images of each field of view are taken at different resolutions and joined together to create a single digital image pyramid file, known as a whole slide image (WSI) [5]. This digital format, supported by a number of standard serializations, facilitates the distribution of slides for diagnostic, educational, and research purposes [3]. WSIs play a critical role in cancer diagnosis [6].

Deep learning has been particularly successful in medical imaging applications such as diagnosis [7], sub-type classification [13, 6] and prognosis [27]. However, deep learning of WSI faces three main challenges [9]: handling large image dimensionality at multiple scales; the lack of strongly annotated data; and more generally the difficulties inherent to approaching classification with information retrieval.

The first challenge is image size across the multiple scales of the WSI image pyramid: at the base, resolutions are on the order of 100,000 × 100,000 pixels. The sheer size makes it difficult to feed the images directly as input to computer vision models. To overcome this issue, slides are usually divided into multiple fixed-size patches (also called tiles), which are then used as input to the models.
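As a rough illustration of the tiling step, the sketch below enumerates the top-left corners of non-overlapping, fixed-size tiles for a slide of a given size. The function names, the 512-pixel tile size, and the choice to drop partial tiles at the right and bottom edges are assumptions made for this example, not details taken from the paper's pipeline.

```python
from typing import Iterator, Tuple


def tile_origins(width: int, height: int, tile: int) -> Iterator[Tuple[int, int]]:
    """Yield (x, y) top-left corners of non-overlapping, full-size tiles.

    Partial tiles at the right and bottom edges are skipped (an assumption
    of this sketch; a real pipeline might pad or keep them instead).
    """
    if tile <= 0:
        raise ValueError("tile must be positive")
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            yield x, y


def count_tiles(width: int, height: int, tile: int) -> int:
    """Number of full tiles, computed without enumerating them."""
    return (width // tile) * (height // tile)


# A base level of 100,000 x 100,000 pixels cut into hypothetical 512-pixel tiles.
n_tiles = count_tiles(100_000, 100_000, 512)
print(n_tiles)  # 195 * 195 = 38025 tiles for a single slide
```

Even at this tile size, a single slide yields tens of thousands of patches, which is why each slide is later treated as a collection of tiles rather than as one image.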

The second challenge is that labeling is performed at the slide level (for the whole WSI, not for individual tiles) or even at the patient level (one label per patient, who can have multiple slides). Expert labeling at the pixel level would be costly and exhausting for the pathologist. Without pixel-level annotations, fully supervised approaches cannot be directly employed; weakly supervised [23, 7, 12] or unsupervised [15, 21] approaches are required instead, such as multiple instance learning (MIL) [2], used in the work reported here.

Finally, it can be challenging to retrieve the relevant pathology information from relatively unstructured clinical reports. As a consequence, the interpretability of regions of interest (RoI) identified by deep learning models is often so poor that explanation at the morphology level can only be approached as an exploratory exercise. However, the RoI approach, where recurring morphological patterns are observed, is a familiar procedure in Digital Pathology, where it is sometimes described as “virtual staining”. Accordingly, in the work described here, we “stain” the raw images with heatmaps produced using the deep-learned activation scores.

2. Materials and Methods

2.1. Multiple Instance Learning

Multiple instance learning (MIL) [2] is a weakly supervised learning approach in which instances are grouped into sets called bags. A label is assigned to the entire bag, while the individual instance labels remain unknown. The standard MIL definition treats a bag as a set of instances 𝑋 = {𝑥1 , …, 𝑥𝐾} with neither dependency nor ordering among them. We also assume that the bag size 𝐾 may vary from bag to bag. For each bag 𝑋, there is a binary label 𝑌 ∈ {0,1}. Each instance within the bag has a label 𝑦𝑖 , where 𝑦𝑖 ∈ {0,1}. However, the individual instance labels are unknown and are not accessible during training. Thus, the MIL problem can be written as follows:

$$Y = \begin{cases} 0, & \text{iff } \sum_{i=1}^{K} y_i = 0, \\ 1, & \text{otherwise.} \end{cases}$$

In other words, we assume that all negative bags contain only negative instances and that positive bags contain at least one positive instance. To classify a bag as positive we only have to consider one instance as positive.
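The standard assumption stated above can be written as a tiny function. This is only a sketch of the labeling rule itself: in practice the instance labels are hidden, so the function describes what a model has to learn rather than something that can be computed during training. The name `bag_label` is hypothetical.

```python
from typing import Sequence


def bag_label(instance_labels: Sequence[int]) -> int:
    """Standard MIL assumption: a bag is positive iff any instance is positive."""
    if any(y not in (0, 1) for y in instance_labels):
        raise ValueError("instance labels must be 0 or 1")
    return int(sum(instance_labels) > 0)


negative_bag = [0, 0, 0, 0]   # only negative instances
positive_bag = [0, 0, 1, 0]   # a single positive instance is enough

print(bag_label(negative_bag))  # 0
print(bag_label(positive_bag))  # 1
```

Because the rule depends only on whether any instance is positive, shuffling the instances in a bag never changes its label.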

This standard assumption also implies that models must be permutation-invariant, since there is neither order nor dependency among instances. More specifically, a model needs a permutation-invariant aggregation operator, usually referred to as a MIL pooling operator.

In a more general sense, a MIL model for bag classification can be expressed by a 3-step process [7]:

• A transformation of the instances with a function 𝑓;

• A permutation-invariant function 𝜎 that combines all instance transformations into a final bag representation;

• A final transformation 𝑔 that receives the bag representation and outputs a final bag score.
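The three steps listed above can be sketched as a composition of functions. The particular choices below (a `tanh` instance transform, mean or max pooling, and a sigmoid over the summed bag representation) are toy stand-ins chosen for this example, not the architectures evaluated in the paper.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def mean_pool(embeddings: Sequence[Vector]) -> Vector:
    """Permutation-invariant sigma: element-wise mean over the bag."""
    k = len(embeddings)
    return [sum(column) / k for column in zip(*embeddings)]


def max_pool(embeddings: Sequence[Vector]) -> Vector:
    """Permutation-invariant sigma: element-wise maximum over the bag."""
    return [max(column) for column in zip(*embeddings)]


def mil_bag_score(
    bag: Sequence[Vector],
    f: Callable[[Vector], Vector],
    sigma: Callable[[Sequence[Vector]], Vector],
    g: Callable[[Vector], float],
) -> float:
    """Apply f to every instance, pool with sigma, then score the bag with g."""
    if not bag:
        raise ValueError("bag must contain at least one instance")
    return g(sigma([f(x) for x in bag]))


def toy_f(x: Vector) -> Vector:
    """Hypothetical instance transform (stands in for a learned encoder)."""
    return [math.tanh(v) for v in x]


def toy_g(z: Vector) -> float:
    """Hypothetical bag classifier: sigmoid of the summed bag representation."""
    return 1.0 / (1.0 + math.exp(-sum(z)))


bag = [[0.2, -1.0], [1.5, 0.3], [-0.4, 0.8]]  # three instances, two features each
score_max = mil_bag_score(bag, toy_f, max_pool, toy_g)
score_mean = mil_bag_score(bag, toy_f, mean_pool, toy_g)
```

Swapping `max_pool` for `mean_pool` changes only 𝜎; 𝑓 and 𝑔 are untouched, which is what makes the pooling operator an easy place to plug in alternatives.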

In the case of WSI analysis, we can consider each slide as a bag that contains several patches as instances. We have slide-level labels (bag label), but we do not have pixel/patch-level labels (instance labels).

To provide some degree of interpretability, as well as better results, a variation of the attention mechanism can be used as a MIL pooling operator. Because it acts as a weighted average of instances, the original definition can be adapted to be permutation-invariant, making it a valid MIL pooling operator. The resulting attention scores can be used to build a heatmap showing which parts of an image are responsible for the final classification. Multiple works use variations of the attention mechanism within the MIL framework to achieve this [4, 6, 10, 11, 12, 13, 14, 19, 25, 26].
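A minimal sketch of attention used as a pooling operator follows. It assumes one common scoring form, a softmax over w · tanh(V h) computed per instance; the parameters `V` and `w` would normally be learned and are fixed toy values here, and this section of the paper does not specify the exact form, so the details should be read as illustrative.

```python
import math
from typing import List, Sequence, Tuple


def attention_pool(
    embeddings: Sequence[Sequence[float]],
    V: Sequence[Sequence[float]],
    w: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Return (bag representation, attention weights) for one bag."""
    if not embeddings:
        raise ValueError("bag must contain at least one instance")
    # Unnormalised score per instance: w . tanh(V h_k).
    scores = [
        sum(w_j * math.tanh(sum(v * h for v, h in zip(row, h_k)))
            for w_j, row in zip(w, V))
        for h_k in embeddings
    ]
    # Softmax over the bag, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag representation: attention-weighted average of the instance embeddings.
    dim = len(embeddings[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, embeddings)) for d in range(dim)]
    return pooled, weights


# Toy values (hypothetical): three tile embeddings with two features each.
H = [[0.1, 0.2], [2.0, 1.5], [0.0, -0.3]]
V = [[1.0, 0.5], [-0.5, 1.0]]
w = [1.0, 0.5]
pooled, weights = attention_pool(H, V, w)
```

The `weights` list holds one value per tile and sums to 1, so mapping each weight back to its tile position gives the kind of heatmap described above. Reordering the tiles reorders the weights but leaves `pooled` unchanged.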

This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.
