Authors:
(1) Santosh T.Y.S.S, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);
(2) Hassan Sarwat, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);
(3) Ahmed Abdou, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);
(4) Matthias Grabmair, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]).
Table of Links
- Abstract and 1. Introduction
- Related Work
- Task, Datasets, Baseline
- RQ 1: Leveraging the Neighbourhood at Inference
  - 4.1. Methods
  - 4.2. Experiments
- RQ 2: Leveraging the Neighbourhood at Training
  - 5.1. Methods
  - 5.2. Experiments
- RQ 3: Cross-Domain Generalizability
- Conclusion
- Limitations
- Ethics Statement
- Bibliographical References
Abstract
Rhetorical Role Labeling (RRL) of legal judgments is essential for tasks such as case summarization, semantic search, and argument mining. However, it poses several challenges: sentence roles must be inferred from context, roles are interrelated, annotated data is limited, and labels are imbalanced. This study introduces novel techniques that enhance RRL performance by leveraging knowledge from semantically similar instances (neighbours). We explore inference-based and training-based approaches, achieving notable improvements on the more challenging macro-F1 metric. For inference-based methods, we explore interpolation techniques that refine label predictions without re-training, while for training-based methods we integrate prototypical learning with our novel discourse-aware contrastive method, both of which operate directly on the embedding space. Additionally, we assess the cross-domain applicability of our methods, demonstrating their effectiveness in transferring knowledge across diverse legal domains.
1. Introduction
In an era of rapid digitalization and exponential growth in legal case volumes, the demand for automated systems that assist legal professionals in tasks like extracting key case elements, summarizing cases, and retrieving relevant cases has surged (Zhong et al., 2020). At the core of these tasks lies Rhetorical Role Labeling (RRL), which involves assigning functional roles, such as preamble, factual content, evidence, or reasoning, to the sentences of a document. Legal documents are characterized by their extensive length, long sentences with unusual word order, frequent cross-references, heavy citation usage, and an intricate lexicon; they often feature expressions uncommon in everyday language and terms borrowed from other languages, to the extent that legal writing is regarded as a sub-language, commonly called legalese (Chalkidis et al., 2022; Haigh, 2013).
The task of RRL faces several distinctive challenges. Firstly, contextual dependencies, shaped by surrounding sentences and the case's overall context, are pivotal in discerning the rhetorical role of each sentence, making RRL a sequential sentence classification task. Secondly, the intertwining nature of rhetorical roles further complicates the task. For instance, the rationale behind a judgment (Ratio of the decision) often overlaps with Precedents and Statutes, necessitating a nuanced understanding of these roles' intricate distinctions (Bhattacharya et al., 2021). Thirdly, obtaining extensive annotated data for specialized domains like law is expensive, as it requires expert annotators. Lastly, certain rhetorical roles are disproportionately represented in the data, leading to significant class imbalance (Malik et al., 2022; Bhattacharya et al., 2021). Traditional up-/down-sampling methods struggle to fully address this challenge because the task operates on sequences of sentences at the document level.
Initially, the RRL task was formulated as sentence classification, treating each sentence in isolation (Ahmad et al., 2020; Walker et al., 2019). Researchers later recast it as sequential sentence classification to address contextual dependencies between sentences (Bhattacharya et al., 2021; Ghosh and Wyner, 2019; Malik et al., 2022; Kalamkar et al., 2022). They introduced two-level hierarchical models, encoding sentences independently at the lower level and contextualizing them with neighbouring sentences at the higher level. While this approach effectively addresses the first challenge of RRL, the other challenges remain open. Recently, Santosh et al. (2023) aimed to mitigate data scarcity through data augmentation, but methods like word deletion, sentence swapping, and backtranslation can introduce noise and disrupt coherence. Moreover, this approach does not effectively address label imbalance or the intricate intertwining of roles.
In this work, we hypothesize that harnessing knowledge from semantically and contextually similar instances can provide valuable insights that capture a broader context and reveal underlying rare patterns. This can deepen the understanding of complex label-semantics relationships, improve nuanced label assignments, and equip the model to handle less common labels, thus addressing the distinctive challenges of RRL. We explore two approaches for harnessing this knowledge: one applied directly at inference time without additional parameters or re-training (Sec. 4), and the other applied during training by incorporating auxiliary loss constraints (Sec. 5). In the inference-based approaches, we interpolate the label distribution predicted by a model with a distribution derived from analogous instances in the training dataset, employing nearest-neighbour-based as well as single- and multiple-prototype-based methods. These methods improve performance, particularly on the more challenging macro-F1 metric, without requiring re-training. For the training-based approaches, we integrate contrastive and prototypical learning, which operate directly on the embedding space by leveraging neighbourhood relationships. Additionally, we introduce a novel discourse-aware contrastive loss to account for the contextual nature of the task. Experimental results on four datasets from the Indian jurisdiction validate our proposed methods.
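To illustrate the inference-time interpolation idea, the sketch below mixes a model's predicted label distribution with one derived from the query sentence's nearest neighbours in the training set. This is a minimal sketch rather than the paper's exact formulation: the random "embeddings", cosine similarity, and the choices of k, temperature, and mixing weight `lam` are all placeholder assumptions.

```python
import numpy as np

def knn_label_distribution(query_emb, train_embs, train_labels, num_labels,
                           k=3, temp=1.0):
    """Label distribution over the query's k nearest training neighbours,
    each neighbour weighted by a softmax over cosine similarities."""
    sims = train_embs @ query_emb / (
        np.linalg.norm(train_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9)
    top = np.argsort(-sims)[:k]                 # indices of the k most similar
    weights = np.exp(sims[top] / temp)
    weights /= weights.sum()                    # normalise to a distribution
    dist = np.zeros(num_labels)
    for w, idx in zip(weights, top):
        dist[train_labels[idx]] += w            # accumulate neighbour labels
    return dist

def interpolate(model_probs, knn_probs, lam=0.7):
    """Mix the model's distribution with the neighbour-derived one."""
    return lam * model_probs + (1.0 - lam) * knn_probs

# Toy demonstration with random stand-ins for sentence embeddings.
rng = np.random.default_rng(0)
train_embs = rng.normal(size=(100, 16))         # 100 "training sentences"
train_labels = rng.integers(0, 4, size=100)     # 4 rhetorical roles
query = train_embs[0]                           # query matches a training point
knn_p = knn_label_distribution(query, train_embs, train_labels, num_labels=4)
model_p = np.full(4, 0.25)                      # a maximally uncertain model
final_p = interpolate(model_p, knn_p)
```

Because interpolation happens purely in probability space, the same trained model can be combined with neighbour evidence at varying strengths of `lam` without any re-training, which is the appeal of this family of methods.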
While it is common to develop models for specific courts or domains because of their unique vocabulary, complex linguistic structures, and particular writing styles, such specialization can hinder the adaptability of these models beyond their original context. In rhetorical role labeling, models may memorize context-specific vocabulary rather than understand the underlying semantics, making cross-domain application challenging (Savelka et al., 2021). In such cases, developing a model for a new context typically requires annotating a new dataset, which is expensive. In our work, we assess the cross-domain generalizability of our methods and observe that they enhance the model's ability to transfer across different legal domains compared to a baseline model lacking these auxiliary techniques (Sec. 6).
This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.