Improving Legal Document Labeling by Comparing Similar Sentences | HackerNoon


Table of Links

Abstract and 1. Introduction

  2. Related Work

  3. Task, Datasets, Baseline

  4. RQ 1: Leveraging the Neighbourhood at Inference

    4.1. Methods

    4.2. Experiments

  5. RQ 2: Leveraging the Neighbourhood at Training

    5.1. Methods

    5.2. Experiments

  6. RQ 3: Cross-Domain Generalizability

  7. Conclusion

  8. Limitations

  9. Ethics Statement

  10. Bibliographical References

4. RQ 1: Leveraging the Neighbourhood at Inference

In this section, we leverage knowledge from semantically similar training instances directly during inference, without extra training overhead. We interpolate the label distribution predicted by the baseline model with a distribution derived from the training instances most similar to the test instance. This avoids the need to memorize rare patterns implicitly in the model parameters, enhancing the model's ability to handle long-tail cases (classes with few instances, or rare patterns within frequent classes), especially in limited-data settings. We explore three different methods for obtaining the distribution from similar training instances.
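As a rough sketch of this interpolation (the paper's exact formulation is referenced later as Eq. 5 but is not reproduced in this excerpt; the mixing weight λ below is an assumption), the final distribution over rhetorical roles y for a test sentence s can be written as a linear mixture:

$p(y \mid s) = \lambda \, p_{\text{neigh}}(y \mid s) + (1 - \lambda) \, p_{\text{model}}(y \mid s)$

where $p_{\text{model}}$ is the baseline model's prediction and $p_{\text{neigh}}$ is the distribution derived from similar training instances by one of the three methods below.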

4.1. Methods

4.1.1. Interpolation with kNN

In this method, we construct a datastore of training instances and then retrieve the nearest neighbours of the test instance to compute the interpolated label distribution during inference.

Datastore Construction After training, we obtain the contextualized representation c_i of every sentence in each document of the training set using the trained model. We construct the datastore with a single forward pass over each training document. The datastore (K, V) is the set of all pairs of contextualized representations and rhetorical role labels constructed from all the training examples D as:

$(K, V) = \{(c_i, y_i) \mid (s_i, y_i) \in d, \; d \in D\}$
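A minimal Python sketch of this construction, assuming a hypothetical `encode_document` helper that wraps the trained model and returns one contextualized embedding per sentence (the helper and the document dictionary layout are illustrative, not part of the paper):

```python
import numpy as np

def build_datastore(train_docs, encode_document):
    """Build the (K, V) datastore of embedding-label pairs.

    train_docs: iterable of dicts with "sentences" and "labels" (assumed layout)
    encode_document: callable returning a (num_sentences, d) array of
                     contextualized embeddings for one document.
    """
    keys, values = [], []
    for doc in train_docs:                              # single forward pass per document
        embeddings = encode_document(doc["sentences"])  # contextualized representations c_i
        keys.append(embeddings)
        values.extend(doc["labels"])                    # rhetorical role of each sentence
    return np.vstack(keys), np.array(values)            # K: (N, d), V: (N,)
```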

Interpolation During inference, we query the datastore using the contextualized representation of every sentence in the test document to find its k nearest neighbours N according to Euclidean distance. We then derive a label distribution p_kNN from the retrieved neighbours by applying a softmax over their negative distances, aggregating the probability mass of each label across all of its occurrences among the neighbours (labels that do not appear in N receive zero probability). Intuitively, the closer a neighbour is to the test instance, the larger its weight.
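A small sketch of the retrieval and interpolation steps, under the same assumptions as above (NumPy only; the temperature and the interpolation weight `lam` are illustrative hyperparameters, and the exact form of the paper's Eq. 5 may differ):

```python
import numpy as np

def knn_label_distribution(query, keys, values, num_labels, k=8, temperature=1.0):
    """Label distribution p_kNN from the k nearest datastore entries.

    query:  (d,) contextualized embedding of the test sentence
    keys:   (N, d) datastore embeddings; values: (N,) integer role labels.
    """
    dists = np.linalg.norm(keys - query, axis=1)   # Euclidean distances to all entries
    nn = np.argsort(dists)[:k]                     # indices of the k nearest neighbours
    logits = -dists[nn] / temperature              # softmax over negative distances
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    p_knn = np.zeros(num_labels)
    for idx, w in zip(nn, weights):
        p_knn[values[idx]] += w                    # aggregate mass per label occurrence
    return p_knn

def interpolate(p_model, p_knn, lam=0.5):
    """Linear mixture of the baseline model and neighbourhood distributions."""
    return lam * p_knn + (1.0 - lam) * p_model
```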

4.1.2. Interpolation with Single Prototype

Instead of storing all the training instances in the datastore, we store one prototype per label, which captures the essential semantics of the various sentences expressing each rhetorical role and significantly reduces the datastore's memory footprint. To create the prototype for a label, we average the contextualized embeddings of all sentences that share that rhetorical role. Intuitively, these prototypes can be viewed as the centers of the label clusters in the embedding space, surrounded by sentences expressing the same label. The interpolation process closely resembles the kNN approach (Eq. 5), with the key difference that the interpolation directly involves the prototypes, without a prior retrieval step.
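A sketch of single-prototype construction under the same assumptions (one mean embedding per rhetorical role); at inference, these prototypes would take the place of the datastore keys and values above, with no nearest-neighbour retrieval step:

```python
import numpy as np

def single_prototypes(embeddings, labels):
    """One prototype per rhetorical role: the mean contextualized embedding
    of all training sentences carrying that role.

    embeddings: (N, d) sentence embeddings; labels: (N,) integer role labels.
    Returns (prototypes, prototype_labels), aligned by index.
    """
    unique = np.unique(labels)
    protos = np.stack([embeddings[labels == l].mean(axis=0) for l in unique])
    return protos, unique
```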

4.1.3. Interpolation with Multiple Prototypes

Instead of using a single prototype for each rhetorical role, we use multiple prototypes per label. This choice is driven by the fact that instances with the same rhetorical role can be expressed in distinctly different ways, resulting in diverse contextual embeddings scattered across the embedding space. Averaging these embeddings into a single prototype can dilute this specificity, whereas multiple prototypes capture the distinct variations within each label. To accomplish this, we cluster the instances belonging to each rhetorical role using k-means, yielding multiple prototypes per label from the k centroids. The interpolation step remains the same (Eq. 5), involving all of these prototypes without any retrieval step.
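A sketch of multi-prototype construction using scikit-learn's k-means (the number of clusters per label is an illustrative hyperparameter, not a value from the paper):

```python
import numpy as np
from sklearn.cluster import KMeans

def label_prototypes(embeddings, labels, clusters_per_label=4, seed=0):
    """Multiple prototypes per rhetorical role via k-means on that role's embeddings.

    embeddings: (N, d) contextualized sentence embeddings from the trained model
    labels:     (N,) integer rhetorical-role labels
    Returns (prototypes, prototype_labels): centroids and the label each belongs to.
    """
    protos, proto_labels = [], []
    for label in np.unique(labels):
        emb = embeddings[labels == label]
        k = min(clusters_per_label, len(emb))   # guard against very small classes
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(emb)
        protos.append(km.cluster_centers_)
        proto_labels.extend([label] * k)
    return np.vstack(protos), np.array(proto_labels)
```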

Authors:

(1) Santosh T.Y.S.S, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);

(2) Hassan Sarwat, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);

(3) Ahmed Abdou, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]);

(4) Matthias Grabmair, School of Computation, Information, and Technology; Technical University of Munich, Germany ([email protected]).


This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.
