Authors:
(1) Martyna Wiącek, Institute of Computer Science, Polish Academy of Sciences;
(2) Piotr Rybak, Institute of Computer Science, Polish Academy of Sciences;
(3) Łukasz Pszenny, Institute of Computer Science, Polish Academy of Sciences;
(4) Alina Wróblewska, Institute of Computer Science, Polish Academy of Sciences.
Editor’s note: This is Part 7 of 10 of a study on improving the evaluation and comparison of tools used in natural language preprocessing. Read the rest below.
Table of Links
- Abstract and 1. Introduction and related works
- 2. NLPre benchmarking
  - 2.1. Research concept
  - 2.2. Online benchmarking system
  - 2.3. Configuration
- 3. NLPre-PL benchmark
  - 3.1. Datasets
  - 3.2. Tasks
- 4. Evaluation
  - 4.1. Evaluation methodology
  - 4.2. Evaluated systems
  - 4.3. Results
- Conclusions
- Appendices
- Acknowledgements
- Bibliographical References
- Language Resource References
4. Evaluation
4.1. Evaluation methodology
To maintain the de facto standard for NLPre evaluation, we apply the evaluation measures defined for the CoNLL 2018 shared task and implemented in the official evaluation script.[11] In particular, we focus on F1 and AlignedAccuracy, which is similar to F1 but does not take possible misalignments of tokens, words, or sentences into account.
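For reference, the official script can either be invoked from the command line with the gold and system CoNLL-U files as arguments, or imported as a module. The snippet below is a minimal sketch of the latter, assuming the script has been downloaded alongside the code; the file names are placeholders, and the metric names follow the publicly available conll18_ud_eval.py.

```python
# Minimal sketch: computing CoNLL 2018 scores with the official script.
# Assumes conll18_ud_eval.py has been downloaded into the working directory;
# "gold.conllu" and "system.conllu" are placeholder file names.
import conll18_ud_eval as ud_eval

# Load the gold-standard and system-produced CoNLL-U files.
gold = ud_eval.load_conllu_file("gold.conllu")
system = ud_eval.load_conllu_file("system.conllu")

# evaluate() returns a dict of per-metric Score objects
# (e.g. Tokens, Sentences, Words, UPOS, XPOS, Lemmas, UAS, LAS).
scores = ud_eval.evaluate(gold, system)

for metric in ("Tokens", "Sentences", "Words", "UPOS", "XPOS", "Lemmas", "UAS", "LAS"):
    score = scores[metric]
    # AlignedAccuracy is undefined for segmentation metrics, hence the None check.
    aligned = "-" if score.aligned_accuracy is None else f"{100 * score.aligned_accuracy:.2f}"
    print(f"{metric}: F1 = {100 * score.f1:.2f}, AlignedAccuracy = {aligned}")
```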
In our evaluation process, we follow the default training procedures suggested by the authors of the evaluated systems, i.e. we do not conduct any hyperparameter search and instead leave the recommended model configurations as-is. We also do not further fine-tune the selected models.
[11] https://universaldependencies.org/conll18/conll18_ud_eval.py