Authors:
(1) VINÍCIUS YU OKUBO, Dept. Electronic Systems Engineering, Polytechnic School, University of São Paulo, Brazil;
(2) KOTARO SHIMIZU, Department of Applied Physics, The University of Tokyo, Tokyo 113-8656, Japan;
(3) B. S. SHIVARAM, Department of Physics, University of Virginia, Charlottesville, Virginia 22904, USA;
(4) HAE YONG KIM, Dept. Electronic Systems Engineering, Polytechnic School, University of São Paulo, Brazil.
Table of Links
Abstract and I. Introduction
II. Related Work
III. Methodology
IV. Experiments and Results
V. Conclusion and References
ABSTRACT
In material sciences, characterizing faults in periodic structures is vital for understanding material properties. To characterize magnetic labyrinthine patterns, it is necessary to accurately identify junctions and terminals, often featuring over a thousand closely packed defects per image. This study introduces a new technique called TM-CNN (Template Matching – Convolutional Neural Network) designed to detect a multitude of small objects in images, such as defects in magnetic labyrinthine patterns. TMCNN was used to identify these structures in 444 experimental images, and the results were explored to deepen the understanding of magnetic materials. It employs a two-stage detection approach combining template matching, used in initial detection, with a convolutional neural network, used to eliminate incorrect identifications. To train a CNN classifier, it is necessary to create a large number of training images. This difficulty prevents the use of CNN in many practical applications. TM-CNN significantly reduces the manual workload for creating training images by automatically making most of the annotations and leaving only a small number of corrections to human reviewers. In testing, TM-CNN achieved an impressive F1 score of 0.988, far outperforming traditional template matching and CNN-based object detection algorithms.
I. INTRODUCTION
In materials, a wide range of periodic structures are seen accompanied by defects within their spatial arrangement. Some magnetic materials exhibit stripe orders, wherein the orientation of electron spins i.e. the magnetic moments show periodic alterations [1], as well as labyrinthine structures, characterized by magnetic arrangements resulting from the propagation of stripes in various directions throughout space [2]–[4]. The two images in Fig. 1 are representative examples showcasing labyrinthine patterns of magnetic domains in Bismuth-doped Yttrium Iron Garnet (Bi:YIG) films at zero field; the intensity represents the out-of-plane component of the magnetic moments. These labyrinthine patterns may present discernible characteristics that are difficult to quantify. As shown in Fig. 1a, which we label as the “quenched” state, the border of dark and bright domains exhibit a sinuous nature and do not appear as parallel. In contrast, the “annealed” state, shown in Fig. 1b, consists of regions with nearly parallel domains. This state exhibits roughly equal widths of dark and bright domains, and the areas occupied by them are also approximately equal for any sampled region. Therefore, the stripes in the annealed state show greater spatial coherence.
Within these magnetic structures, defects take the form of interruptions in the stripes known as “terminals” and points where multiple stripes conjoin, referred to as “junctions” (Fig. 2). The presence of defects not only serves as a crucial metric for quantifying the deviation of a structure from a perfectly periodic structure, but has also gathered considerable attention in recent years due to its implications on physical phenomena arising from the nontrivial geometric properties associated with magnetic structures [5]. Thus, in the realm of condensed matter physics, experimental identification of the number and positions of such defects plays an important role in characterizing material properties.
Accurately counting and differentiating genuine structures
from misidentifications is crucial to a quantitative physical understanding of the origins and evolution of these patterns. Manual annotation of defects in unfeasible. For instance, we used 444 images with 641,649 structures [6]. Furthermore, manual annotation relies on subjective interpretation of junctions and terminals, which could lead to counting inconsistencies. In order to address these issues, an automated process is required. Here, algorithms for finding objects in an image known as object detectors are an excellent choice and can be very effective. They can be broadly categorized into classical and deep learning-based methods.
A. CLASSICAL OBJECT DETECTION METHODS
Classical object detection methods span a variety of techniques to extract and process features from the image. For instance, template matching [7] is a technique employed to find a pre-defined template within a larger image. This is achieved by scanning the entire image and calculating the correlation between the template and the scanned region. Viola and Jones object detecting algorithm [8] employs multiple template matchings using Haar-like features, each of them serving as a weak detector. These features are ensembled through a boosting strategy to form a strong detector. The Histogram of Oriented Gradients (HOG) [9] represents another approach to extract useful features, dividing the image into cells and calculating a gradient histogram for each. This extracted information can be used by machine learning algorithms, such as support vector machine, to identify detections.
B. DEEP LEARNING OBJECT DETECTION METHODS
In the last decade, deep learning approaches have surpassed traditional machine learning techniques in multiple image processing tasks [10], [11]. For object detection, Girshick et al. introduced R-CNN [12], marking it as one of the pioneering detection techniques rooted in deep learning. RCNN operates as a classification-based model: multiple regions are extracted from an image and each is classified independently. This straightforward method led to significant improvements in detection, achieving new state-of-the-art results in the Pascal VOC dataset [13], compared to earlier techniques like Haar features and HOG. A distinguishing feature of R-CNN is its region proposal step. Directly processing every conceivable region of varying sizes and positions in an image is computationally impractical. Hence, this step selects a simplified set of regions from the original image for individualized classification by the CNN model. Further developments brought by Faster R-CNN [14] have improved both accuracy and speed by integrating the region proposal into the model.
Redmon et al. introduced YOLO [15], a deep-learning detection approach modeled as a regression task. This method partitions the image into a grid and each cell contain their own set of outputs. The grid cell where the object is centered has the task of identifying the position, dimensions and class of the object. A standout benefit of this approach is its efficiency: YOLO processes the image in a single pass, contrasting with R-CNN-based models that are divided into region proposal and classification steps. This significantly reduces inference time, enabling real-time video detection. However, YOLO was not able to achieve the precision and recall rates of Faster R-CNN when tested on the Pascal VOC dataset. Comparisons in small object detection settings have also shown Faster RCNN to outperform YOLO [16].
C. THE PROBLEM AND PROPOSED SOLUTION
Modern digital microscopy allows the easy acquisition of high-resolution experimental images, which contain intricate stripe patterns and a large number of defects, sometimes numbering in the thousands. This makes it difficult to use deep learning, as creating an accurately annotated dataset would be a laborious process due to the large number of defects. Furthermore, both YOLO and Faster R-CNN were not designed to detect thousands of small objects. Benchmarks with both methods show that performance degrades when detecting small objects, with Faster R-CNN maintaining a slight advantage over YOLO [16].
Correlation-based granulometry is a technique that can be used to detect a large number of small objects in the image. It was proposed by Maruta et al. [17], [18] to analyze the distribution of square and circular pores in the macroporous silicon layer in scanning electron microscope images. This technique is called “granulometry” because the objective of the original application was to obtain a histogram of pore distribution as a function of size. It was later used by Araújo et al. [19] to detect individual bean grains, analyze each grain, and calculate the quality of the bean batch. This technique performs multiple template matchings to achieve robustness against angle and shape variations, and then performs non-maximum suppression to avoid finding the same object multiple times. However, this technique could not be applied directly in our problem, because defects in labyrinthine magnetic structures vary greatly in shape and cannot be accurately detected using only template matchings.
We demonstrate that this method substantially outperforms template matching alone, while streamlining the image annotation process and reducing the computational burden typically associated with deep learning detection techniques. We also show that the TM-CNN technique outperforms Faster RCNN, achieving a significantly higher F1-score in junctions and terminals detection.
D. STRUCTURE OF THE ARTICLE
The remainder of this work is organized as follows. Section II presents existing techniques for detecting junctions and terminals, differentiating classical and deep learning approaches. Section III describes the dataset used and the proposed TMCNN technique in detail. Section IV presents experimental results from the detection of defects in magnetic labyrinthine patterns and discusses their significance in understanding physical phenomena. Finally, Section V concludes the article by reflecting on the merits of our work.