How AI Can Better Categorize Legal Documents by Learning from Similar Texts | HackerNoon

News Room · Published 2 April 2025

Table of Links

Abstract and 1. Introduction
2. Related Work
3. Task, Datasets, Baseline
4. RQ 1: Leveraging the Neighbourhood at Inference
  4.1. Methods
  4.2. Experiments
5. RQ 2: Leveraging the Neighbourhood at Training
  5.1. Methods
  5.2. Experiments
6. RQ 3: Cross-Domain Generalizability
7. Conclusion
8. Limitations
9. Ethics Statement
10. Bibliographical References

5. RQ 2: Leveraging the Neighbourhood at Training

We leverage knowledge from neighbour instances directly in the training process to improve performance. We explore three methods: contrastive learning, single prototypical learning, and multi-prototypical learning. These techniques draw on the same principles as their inference-time counterparts but serve as auxiliary loss constraints during training. Their primary aim is to improve the discriminative capability of embeddings by emphasizing differences between instances with distinct rhetorical roles and similarities among instances sharing the same label.

While the task-specific classification loss focuses on mapping contextualized representations to label outputs with supervision on individual instances, the methods in this section directly operate on embeddings in latent space. They exploit the interplay among instances to establish effective discriminative decision boundaries, serving as a form of regularization.

5.1. Methods

5.1.1. Contrastive Learning

Contrastive learning aims to pull an anchor point closer to related samples while pushing it away from unrelated samples in the embedding space. In a supervised setting, samples with the same label as the anchor are treated as related and samples with a different label as unrelated (Khosla et al., 2020). The loss is calculated as follows:

L_scl = -(1/N) Σ_{i=1}^{N} [ 1 / Σ_{j≠i} δ(c_i, c_j) ] Σ_{j≠i} δ(c_i, c_j) · log [ exp(sim(h_i, h_j)/τ) / Σ_{k≠i} exp(sim(h_i, h_k)/τ) ]    (7)

where δ(c_i, c_j) is 1 if c_i and c_j share the same rhetorical label and 0 otherwise, N denotes the batch size, h_i is the embedding of sentence i, sim(·,·) is a similarity function, and τ is a temperature.
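As a concrete reference, here is a minimal NumPy sketch of a supervised contrastive loss in this style; the cosine similarity, the temperature value, and the function name `supcon_loss` are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def supcon_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss over one batch (Khosla et al., 2020 style).

    Each anchor is contrasted against all other batch items; items sharing
    the anchor's label act as positives, the rest as negatives.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / tau  # pairwise cosine similarity scaled by temperature
    n = len(labels)
    total, count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no in-batch positives contributes nothing
        denom = sum(np.exp(sim[i, k]) for k in range(n) if k != i)
        total += -sum(np.log(np.exp(sim[i, j]) / denom) for j in positives) / len(positives)
        count += 1
    return total / max(count, 1)
```

The loss is small when same-label embeddings are already aligned, and grows as positives drift apart relative to negatives.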

The length of legal documents limits the number of documents that can be accommodated in a single batch, raising concerns about whether minority-class instances have enough positive samples within a batch for effective contrasting. To overcome this limitation, we utilize a memory bank (Wu et al., 2018), progressively reusing encoded representations from previous batches to compute the contrastive loss. In practice, we maintain a fixed-size representation queue for each rhetorical role. As new representations corresponding to specific labels are generated, they are enqueued into the respective queue with their gradients detached. If the queue size for a label exceeds the maximum limit, the oldest element is dequeued. When computing the contrastive loss, we use the same Equation 7; however, in addition to the current batch instances, we employ all representations stored in the memory bank for contrasting, using them as positives or negatives depending on the anchor point's label.
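The per-label queue mechanics can be sketched as follows; the `MemoryBank` class, the `max_size` value, and the use of `np.copy` to stand in for gradient detachment in an autograd framework are all illustrative assumptions:

```python
from collections import defaultdict, deque

import numpy as np

class MemoryBank:
    """Fixed-size queue of detached representations per rhetorical label.

    New representations are enqueued per label; once a queue is full,
    the oldest entry is dropped automatically (deque maxlen semantics).
    """

    def __init__(self, max_size=128):
        self.queues = defaultdict(lambda: deque(maxlen=max_size))

    def enqueue(self, embeddings, labels):
        for z, c in zip(embeddings, labels):
            # np.copy stands in for gradient detachment (e.g. tensor.detach())
            self.queues[c].append(np.copy(z))

    def all_items(self):
        """Stored (embedding, label) pairs, usable as extra positives/negatives."""
        return [(z, c) for c, q in self.queues.items() for z in q]
```

At loss time, the bank's items are simply appended to the batch's candidate set, with positive/negative status decided by comparing each stored label against the anchor's label.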

To incorporate the notion of context from surrounding sentences into contrastive learning, we introduce a novel discourse-aware contrastive loss. It is based on the idea that sentences in close proximity within a document, sharing the same label, should exhibit stronger proximity in the embedding space than sentences with the same label positioned farther apart in the document. To implement this, we introduce a penalty inversely proportional to the absolute difference in their positions: positive sentence pairs that are closer in the document incur a higher penalty, encouraging them to be closer in the embedding space than pairs originating from greater distances within the document. The discourse-aware loss is as follows:

L_dis = -(1/N) Σ_{i=1}^{N} [ 1 / Σ_{j≠i} δ(c_i, c_j) ] Σ_{j≠i} δ(c_i, c_j) · β_{ij} · log [ exp(sim(h_i, h_j)/τ) / Σ_{k≠i} exp(sim(h_i, h_k)/τ) ]    (8)

where β_{ij} is a penalty that incorporates positional information. When c_i and c_j come from different documents, such as cross-document positives/negatives from the memory bank or across the batch, we apply the lowest possible penalty, treating the pair as maximally distant relative to in-document positives. This additional contrastive loss is applied alongside the classification loss during training.
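One way to realize such a penalty is an inverse-distance weight. The exact functional form of β is not given in this excerpt, so `discourse_penalty` below is a hypothetical instantiation, with `max_dist` as an assumed "maximally distant" cap for cross-document pairs:

```python
def discourse_penalty(pos_i, pos_j, same_doc, max_dist=512):
    """Position-aware penalty beta for a positive sentence pair.

    Pairs close together in the same document receive a higher penalty
    (so their contrastive term weighs more, pulling them closer in
    embedding space); cross-document pairs get the lowest penalty,
    as if they were maximally far apart.
    """
    dist = abs(pos_i - pos_j) if same_doc else max_dist
    return 1.0 / (1.0 + dist)
```

The returned β would multiply the per-pair positive term of the contrastive loss, so in-document neighbours dominate the gradient while distant or cross-document positives contribute only weakly.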

5.1.2. Single Prototypical Learning

In single prototypical learning, one learnable prototype vector is maintained per rhetorical label, and an auxiliary loss aligns samples with their label's prototype from two complementary views: a sample-centric view, which pulls each sample embedding toward the prototype of its label, and a prototype-centric view, which pulls each prototype toward the embeddings of its labelled samples. Together, both views shape the embedding space by aligning prototypes with their corresponding samples, forming distinct clusters for different labels, each centered around a specific prototype vector.
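The two views can be sketched as follows, assuming cosine similarity with a temperature and softmax cross-entropy alignment; the function name, loss forms, and temperature are illustrative, not the paper's exact objective:

```python
import numpy as np

def prototype_losses(embeddings, labels, prototypes, tau=0.1):
    """Two alignment views for single prototypical learning (illustrative).

    Sample-centric: each sample should score highest against its own
    label's prototype. Prototype-centric: each prototype should score
    highest against the samples carrying its label.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sim = z @ p.T / tau  # (num_samples, num_labels) similarity matrix

    # Sample-centric view: classify each sample among all prototypes.
    probs = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)
    sample_loss = -np.mean([np.log(probs[i, labels[i]]) for i in range(len(labels))])

    # Prototype-centric view: classify each prototype among the samples,
    # rewarding mass assigned to samples of its own label.
    probs_t = np.exp(sim.T) / np.exp(sim.T).sum(axis=1, keepdims=True)
    proto_loss = 0.0
    for c in range(len(prototypes)):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            proto_loss += -np.log(sum(probs_t[c, i] for i in members))
    return sample_loss, proto_loss / len(prototypes)
```

Both terms would be added to the task loss as auxiliary regularizers, so gradients flow into both the encoder and the prototype vectors.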

5.1.3. Multi Prototypical Learning

Instead of using a single prototype for each label, this approach employs multiple prototypes for each label to capture the diverse variations within the sentences of the same label. To implement this, a set of M prototypes per label is randomly initialized and a diversity loss (Zhang et al., 2022) is integrated to penalize prototypes of the same label if they are too similar to each other. This ensures that prototypes of the same label are distributed across the embedding space, capturing the multifaceted nuances under each label. The Sample Centric View is also modified to ensure that each sample is in close proximity to at least one prototype among all the prototypes of the same class.
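A rough sketch of the multi-prototype idea follows, with a nearest-prototype sample-centric term and a pairwise-similarity diversity penalty; the array shapes, names, and exact loss forms are assumptions, not the paper's implementation:

```python
import numpy as np

def multi_proto_losses(embeddings, labels, prototypes):
    """Multi-prototype losses (illustrative sketch).

    prototypes: shape (num_labels, M, dim) — M prototypes per label.
    Sample-centric term: pull each sample toward its *nearest* same-label
    prototype only. Diversity term: mean pairwise similarity among
    same-label prototypes, penalizing prototypes that collapse together
    (Zhang et al., 2022 style).
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=2, keepdims=True)

    # Sample-centric: 1 - similarity to the closest prototype of the label.
    sample_loss = 0.0
    for i, c in enumerate(labels):
        sims = p[c] @ z[i]            # similarity to each of label c's M prototypes
        sample_loss += 1.0 - sims.max()  # only the nearest prototype attracts the sample
    sample_loss /= len(labels)

    # Diversity: average off-diagonal similarity within each label's prototype set.
    num_labels, m, _ = p.shape
    div_loss = 0.0
    for c in range(num_labels):
        gram = p[c] @ p[c].T
        div_loss += gram[~np.eye(m, dtype=bool)].mean()
    return sample_loss, div_loss / num_labels
```

Minimizing the diversity term spreads a label's M prototypes apart, while the nearest-prototype term lets different clusters of same-label sentences attach to different prototypes instead of one shared centroid.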

Authors:

(1) Santosh T.Y.S.S, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);

(2) Hassan Sarwat, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);

(3) Ahmed Abdou, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);

(4) Matthias Grabmair, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]).


This paper is available on arXiv under a CC BY 4.0 (Attribution 4.0 International) license.
