How 24 Special Queries Optimized A Neural Network’s Recall Rate

Last updated: 2025/07/16 at 4:44 PM

News Room Published 16 July 2025

Table of Links

Abstract and 1. Introduction

Related Work
Method

3.1 Overview of Our Method

3.2 Coarse Text-cell Retrieval

3.3 Fine Position Estimation

3.4 Training Objectives
Experiments

4.1 Dataset Description and 4.2 Implementation Details

4.3 Evaluation Criteria and 4.4 Results
Performance Analysis

5.1 Ablation Study

5.2 Qualitative Analysis

5.3 Text Embedding Analysis
Conclusion and References

Supplementary Material

Details of KITTI360Pose Dataset
More Experiments on the Instance Query Extractor
Text-Cell Embedding Space Analysis
More Visualization Results
Point Cloud Robustness Analysis

Anonymous Authors

1 DETAILS OF KITTI360POSE DATASET

Figure 1: Visualization of the KITTI360Pose dataset. The trajectories of five training sets, three test sets, and one validation set are shown in the dashed borders. One colored point cloud scene and three cells are shown in the middle.

2 MORE EXPERIMENTS ON THE INSTANCE QUERY EXTRACTOR

We conduct an additional experiment to assess how the number of queries affects the performance of our instance query extractor. As detailed in Table 1, we evaluate the localization recall rate using 16, 24, and 32 queries. Using 24 queries yields the highest localization recall rate, i.e., 0.23/0.53/0.64 on the validation set and 0.22/0.47/0.58 on the test set. This finding suggests that, among the settings tested, 24 is the best number of queries for our model.

Table 1: Ablation study of the query number on KITTI360Pose dataset.
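To make the metric concrete, here is a minimal sketch of how a localization recall rate could be computed. The section does not define the metric, so the sketch assumes a query counts as localized when any of its top-k predicted positions lies within a distance threshold of the ground truth. The function name `localization_recall`, the parameters `k` and `eps`, and the toy coordinates are all hypothetical and are not taken from the paper.

```python
import math


def localization_recall(predictions, ground_truth, k, eps):
    """Fraction of queries with a top-k candidate within eps of the true position.

    predictions: one list of (x, y) candidates per query, best candidate first.
    ground_truth: one (x, y) true position per query.
    k, eps: assumed cut-off rank and distance threshold (not from the paper).
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("one candidate list per ground-truth position is required")
    if not ground_truth:
        return 0.0
    hits = 0
    for candidates, truth in zip(predictions, ground_truth):
        # A query is a hit if any of its first k candidates is close enough.
        if any(math.dist(c, truth) <= eps for c in candidates[:k]):
            hits += 1
    return hits / len(ground_truth)


# Toy data (hypothetical): three queries, two candidates each.
toy_predictions = [
    [(0.0, 0.0), (9.0, 9.0)],      # first candidate is close to the truth
    [(20.0, 20.0), (5.0, 4.0)],    # only the second candidate is close
    [(50.0, 50.0), (60.0, 60.0)],  # no candidate is close
]
toy_truth = [(1.0, 0.0), (5.0, 5.0), (0.0, 0.0)]

recall_at_1 = localization_recall(toy_predictions, toy_truth, k=1, eps=5.0)
recall_at_2 = localization_recall(toy_predictions, toy_truth, k=2, eps=5.0)
```

Under these assumptions, an ablation like the one in Table 1 would repeat this computation once per query count and compare the resulting rates.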

3 TEXT-CELL EMBEDDING SPACE ANALYSIS

Fig. 2 shows the aligned text-cell embedding space, visualized with t-SNE [?]. Under the instance-free scenario, we compare our model with Text2Loc [?], which uses a pre-trained instance segmentation model, Mask3D [?], as a prior step. Text2Loc produces a less discriminative space, in which positive cells lie relatively far from the text query feature. In contrast, our IFRP-T2P reduces the distance between positive cell features and text query features within the embedding space, yielding a more informative embedding space. This improvement is critical for the accuracy of text-cell retrieval.

Figure 2: t-SNE visualization of the text features and cell features in the coarse stage.
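The link between embedding distance and retrieval can be illustrated with a small sketch: cells are ranked by their similarity to the text query, so a positive cell that lies closer to the query is retrieved earlier. This is not the authors' implementation. Cosine similarity is an assumed choice of measure, and the function names, cell identifiers, and 3-D vectors below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_cells(text_embedding, cell_embeddings):
    """Return cell ids ordered from most to least similar to the text embedding."""
    scores = {
        cell_id: cosine_similarity(text_embedding, embedding)
        for cell_id, embedding in cell_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical embeddings: the positive cell points almost the same way as the query.
text_query = [1.0, 0.0, 0.0]
cells = {
    "positive_cell": [0.9, 0.1, 0.0],
    "negative_cell_a": [0.0, 1.0, 0.0],
    "negative_cell_b": [-1.0, 0.0, 0.0],
}

ranking = rank_cells(text_query, cells)
```

In this toy setup the positive cell ranks first because it is the closest to the text query, which is the behavior a more discriminative embedding space is meant to encourage.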

Authors:

(1) Lichao Wang, FNii, CUHKSZ;

(2) Zhihao Yuan, FNii and SSE, CUHKSZ;

(3) Jinke Ren, FNii and SSE, CUHKSZ;

(4) Shuguang Cui, SSE and FNii, CUHKSZ;

(5) Zhen Li, corresponding author, SSE and FNii, CUHKSZ.

This paper is available on arXiv under the CC BY-NC-ND 4.0 Deed (Attribution-NonCommercial-NoDerivs 4.0 International) license.
