World of Software

New AI Methods Help Machines Understand Legal Text Better | HackerNoon

News Room
Last updated: 2025/04/02 at 3:41 AM
News Room Published 2 April 2025

Authors:

(1) Santosh T.Y.S.S, School of Computation, Information, and Technology; Technical University of Munich, Germany;

(2) Hassan Sarwat, School of Computation, Information, and Technology; Technical University of Munich, Germany;

(3) Ahmed Abdou, School of Computation, Information, and Technology; Technical University of Munich, Germany;

(4) Matthias Grabmair, School of Computation, Information, and Technology; Technical University of Munich, Germany.

Table of Links

Abstract and 1. Introduction

  2. Related Work

  3. Task, Datasets, Baseline

  4. RQ 1: Leveraging the Neighbourhood at Inference

    4.1. Methods

    4.2. Experiments

  5. RQ 2: Leveraging the Neighbourhood at Training

    5.1. Methods

    5.2. Experiments

  6. RQ 3: Cross-Domain Generalizability

  7. Conclusion

  8. Limitations

  9. Ethics Statement

  10. Bibliographical References

Abstract

Rhetorical Role Labeling (RRL) of legal judgments is essential for various tasks, such as case summarization, semantic search and argument mining. However, it presents challenges such as inferring sentence roles from context, interrelated roles, limited annotated data, and label imbalance. This study introduces novel techniques to enhance RRL performance by leveraging knowledge from semantically similar instances (neighbours). We explore inference-based and training-based approaches, achieving notable improvements on the more challenging macro-F1 metric. For inference-based methods, we explore interpolation techniques that bolster label predictions without re-training. For training-based methods, we integrate prototypical learning with our novel discourse-aware contrastive method, both of which work directly on embedding spaces. Additionally, we assess the cross-domain applicability of our methods, demonstrating their effectiveness in transferring knowledge across diverse legal domains.

1. Introduction

In an era of rapid digitalization and exponential growth in legal case volumes, the demand for automated systems that assist legal professionals with tasks such as extracting key case elements, summarizing cases, and retrieving relevant cases has surged (Zhong et al., 2020). At the core of these tasks lies Rhetorical Role Labeling (RRL), which involves assigning functional roles, such as preamble, factual content, evidence, and reasoning, to the sentences in a document. Legal documents are characterized by their extensive length, long sentences with unusual word order, frequent cross-references, heavy use of citations, and intricate lexicon. They often feature expressions that are uncommon in everyday language and terms borrowed from various languages, to the extent that their language is referred to as a sub-language, legalese (Chalkidis et al., 2022; Haigh, 2013).

The task of RRL faces several distinctive challenges. Firstly, contextual dependencies, influenced by surrounding sentences and the case’s context, are pivotal in discerning the rhetorical role of each sentence, which makes RRL a sequential sentence classification task. Secondly, the intertwined nature of rhetorical roles further complicates the task. For instance, the rationale behind a judgment (Ratio of the decision) often overlaps with Precedents and Statutes, necessitating a nuanced understanding of the intricate distinctions between these roles (Bhattacharya et al., 2021). Thirdly, obtaining extensive annotated data for specialized domains like law is expensive because it requires expert annotators. Lastly, certain rhetorical roles are disproportionately represented in the dataset, leading to significant class imbalance (Malik et al., 2022; Bhattacharya et al., 2021). Traditional up/down sampling methods struggle to fully address this challenge because the task operates on sequences of sentences at the document level, so individual sentences cannot be resampled without disturbing their context.
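The effect of class imbalance on evaluation can be illustrated with a small worked example. The sketch below is not from the paper: the label names and counts are hypothetical, and the functions are a plain re-implementation of the standard micro- and macro-averaged F1 definitions. It shows why macro-F1 is the harder metric when a rare role is never predicted.

```python
from collections import Counter

def per_label_f1(gold, pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but wrong
            fn[g] += 1  # g was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

def macro_f1(gold, pred):
    """Unweighted mean of per-label F1: every role counts equally."""
    scores = per_label_f1(gold, pred)
    return sum(scores.values()) / len(scores)

def micro_f1(gold, pred):
    """For single-label classification, micro-F1 equals accuracy."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

# Hypothetical document: 18 "Facts" sentences and 2 "Ratio" sentences.
gold = ["Facts"] * 18 + ["Ratio"] * 2
# A degenerate model that always predicts the majority role.
pred = ["Facts"] * 20

print(f"micro-F1: {micro_f1(gold, pred):.3f}")
print(f"macro-F1: {macro_f1(gold, pred):.3f}")
```

In this toy case micro-F1 is 0.900 while macro-F1 drops to about 0.474, because the rare role contributes an F1 of zero to the unweighted average.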

Initially, the RRL task was formulated as sentence classification, treating each sentence in isolation (Ahmad et al., 2020; Walker et al., 2019). Researchers later recast it as sequential sentence classification to address contextual dependencies between sentences (Bhattacharya et al., 2021; Ghosh and Wyner, 2019; Malik et al., 2022; Kalamkar et al., 2022). They introduced a two-level hierarchical model that encodes sentences independently at the lower level and contextualizes them with neighbouring sentences at the higher level. While this approach effectively addressed the first challenge of RRL, the other challenges remain unaddressed. Recently, Santosh et al. (2023) aimed to address data scarcity through data augmentation, but methods like word deletion, sentence swapping and back-translation can introduce noise and disrupt coherence. Moreover, this approach did not effectively address label imbalance or the intricate intertwining of roles.
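The two-level idea can be sketched with a deliberately simple toy. Everything in it is an assumption made for illustration: the bag-of-words encoder and the neighbour-averaging step are stand-ins for the learned sentence encoder and the learned context layer of the cited models, and the example sentences are invented. The point is only that the same sentence receives the same lower-level vector everywhere but a different contextualized vector in different documents.

```python
def encode_sentence(sentence, dim=8):
    """Lower level (stand-in): a bag-of-words vector that ignores context."""
    vec = [0.0] * dim
    for token in sentence.lower().split():
        # Deterministic toy hash; a real system would use a learned encoder.
        vec[sum(map(ord, token)) % dim] += 1.0
    return vec

def contextualize(sentence_vecs, window=1, self_weight=0.5):
    """Upper level (stand-in): mix each vector with the mean of its neighbours."""
    out = []
    for i, vec in enumerate(sentence_vecs):
        start = max(0, i - window)
        end = min(len(sentence_vecs), i + window + 1)
        neighbours = [sentence_vecs[j] for j in range(start, end) if j != i]
        if not neighbours:
            out.append(list(vec))
            continue
        mean = [sum(col) / len(neighbours) for col in zip(*neighbours)]
        out.append(
            [self_weight * v + (1 - self_weight) * m for v, m in zip(vec, mean)]
        )
    return out

# Two hypothetical documents that end with the same sentence.
doc_a = ["The accused was arrested in 2010.", "The appeal is dismissed."]
doc_b = ["We rely on the precedent cited above.", "The appeal is dismissed."]

lower_a = [encode_sentence(s) for s in doc_a]
lower_b = [encode_sentence(s) for s in doc_b]
upper_a = contextualize(lower_a)
upper_b = contextualize(lower_b)

print(lower_a[1] == lower_b[1])  # identical without context
print(upper_a[1] == upper_b[1])  # differs once neighbours are mixed in
```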

In this work, we hypothesize that harnessing knowledge from semantically and contextually similar instances can provide valuable insights to grasp a broader context and reveal underlying rare patterns. This can enhance the understanding of complex label-semantics relationships, improve nuanced label assignments and equip the model to handle less common labels, thus addressing the distinctive challenges of RRL. We explore two approaches for harnessing this knowledge: one directly at inference time without additional parameters or re-training (Sec. 4), and the other during training by incorporating auxiliary loss constraints (Sec. 5). In the inference-based approaches, we interpolate the label distribution predicted by a model with the distribution derived from analogous instances in the training dataset, employing nearest-neighbour-based, single-prototype-based, and multiple-prototype-based methodologies. These methods enhance performance, particularly on the more challenging macro-F1 metric, without requiring re-training. For training-based approaches, we integrate contrastive and prototypical learning, which operate directly on the embedding space and leverage neighbourhood relationships. Additionally, we introduce a novel discourse-aware contrastive loss to address the contextual nature of the task. Our experimental results on four datasets from the Indian jurisdiction validate our proposed methods.
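The inference-time interpolation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the squared Euclidean distance, the softmax weighting with a `temperature`, the mixing weight `lam`, the use of a per-label mean as the single prototype, and the toy 2-D embeddings are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def softmax_neg(distances, temperature=1.0):
    """Softmax over negative distances: closer items get larger weights."""
    lowest = min(distances)  # subtract the minimum for numerical stability
    exps = [math.exp(-(d - lowest) / temperature) for d in distances]
    total = sum(exps)
    return [e / total for e in exps]

def knn_distribution(query, datastore, labels, k=3, temperature=1.0):
    """Label distribution from the k nearest (embedding, label) training pairs."""
    nearest = sorted((sq_dist(query, emb), lab) for emb, lab in datastore)[:k]
    weights = softmax_neg([d for d, _ in nearest], temperature)
    mass = defaultdict(float)
    for w, (_, lab) in zip(weights, nearest):
        mass[lab] += w
    return [mass[lab] for lab in labels]

def prototype_distribution(query, datastore, labels, temperature=1.0):
    """Single-prototype variant (assumed here: one mean embedding per label).

    Requires every label in `labels` to occur at least once in `datastore`.
    """
    grouped = defaultdict(list)
    for emb, lab in datastore:
        grouped[lab].append(emb)
    protos = {
        lab: [sum(col) / len(embs) for col in zip(*embs)]
        for lab, embs in grouped.items()
    }
    return softmax_neg([sq_dist(query, protos[lab]) for lab in labels], temperature)

def interpolate(model_probs, neighbour_probs, lam=0.5):
    """Mix the model's distribution with the neighbourhood distribution."""
    return [lam * n + (1 - lam) * m for m, n in zip(model_probs, neighbour_probs)]

# Hypothetical 2-D embeddings of training sentences with their roles.
labels = ["Facts", "Ratio"]
datastore = [
    ([0.0, 0.0], "Facts"), ([0.1, 0.0], "Facts"),
    ([1.0, 1.0], "Ratio"), ([0.9, 1.0], "Ratio"), ([1.0, 0.9], "Ratio"),
]
query = [0.95, 0.95]       # embedding of the test sentence
model_probs = [0.6, 0.4]   # the classifier alone leans towards "Facts"

knn_probs = knn_distribution(query, datastore, labels, k=3)
proto_probs = prototype_distribution(query, datastore, labels)
final = interpolate(model_probs, knn_probs, lam=0.5)
print(labels[final.index(max(final))])
```

No parameters are trained here: the datastore is built once from training-set embeddings, and `lam` and `k` would be tuned on held-out data. In this toy case the three nearest neighbours are all labelled "Ratio", so the interpolated prediction flips away from the model's initial preference.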

While it is common to develop models for specific courts or domains due to unique vocabulary, complex linguistic structures and specific writing styles, such specialization can hinder the adaptability of these models beyond their original context. In rhetorical role labeling, models might memorize context-specific vocabulary rather than understand the underlying semantics, making cross-domain application challenging (Savelka et al., 2021). In such cases, developing a model for a new context typically requires annotating a new dataset, which can be expensive. In our work, we assess the cross-domain generalizability of our methods and observe that they enhance the model’s ability to transfer across different legal domains compared to a baseline model lacking these auxiliary techniques (Sec. 6).

This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.
