Using Code-LLMs for Structured Commonsense Reasoning | HackerNoon


Table of Links

  • Abstract and 1 Introduction
  • 2 COCOGEN: Representing Commonsense structures with code and 2.1 Converting (T, G) into Python code
  • 2.2 Few-shot prompting for generating G
  • 3 Evaluation and 3.1 Experimental setup
  • 3.2 Script generation: PROSCRIPT
  • 3.3 Entity state tracking: PROPARA
  • 3.4 Argument graph generation: EXPLAGRAPHS
  • 4 Analysis
  • 5 Related work
  • 6 Conclusion, Acknowledgments, Limitations, and References
  • A Few-shot model size estimates
  • B Dynamic prompt creation
  • C Human Evaluation
  • D Dataset statistics
  • E Sample outputs
  • F Prompts
  • G Designing Python class for a structured task
  • H Impact of Model size
  • I Variation in prompts

A Few-shot model size estimates

As OpenAI has not released details of the size of its few-shot models, we estimate their relative strengths and weaknesses on code and text generation by calculating the average loss per token. To calculate the average loss of each model on code, we use the implementation provided by Xu et al. (2022).[5] Perplexity on text was evaluated on 30 random Wikipedia pages from WikiPlots,[6] following a similar procedure. The structure- and text-generation capabilities of the models are apparent from the results in Table 7: DAVINCI outperforms CODEX on text generation but is worse on code generation, and vice versa. CURIE underperforms both DAVINCI and CODEX significantly. Importantly, these results show that CODEX and DAVINCI are of comparable capacity, making their comparison fair.
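The measurement itself is easy to reproduce with an open model. The sketch below is a minimal illustration: GPT-2 via Hugging Face transformers is an assumed stand-in here, since the OpenAI models scored in the paper are not available for local evaluation.

```python
# Minimal sketch: average loss (cross-entropy, in nats) per token of a causal LM.
# GPT-2 is an assumed stand-in; the paper scores OpenAI's CURIE/DAVINCI/CODEX.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def avg_loss_per_token(text: str) -> float:
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean
        # next-token cross-entropy over the sequence.
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

# Lower loss on code than on prose would suggest a code-specialized model.
print(avg_loss_per_token("def add(a, b):\n    return a + b"))
print(avg_loss_per_token("The quick brown fox jumps over the lazy dog."))
```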

Table 7: Average loss per token of the three few-shot models used in this work. TEXT refers to the average loss over 30 Wikipedia pages, and CODE is the loss over Python scripts in the evaluation split of PolyCoder.

B Dynamic prompt creation

As an alternative to a fixed prompt, there is growing interest in customizing the in-context examples for each test example T_test. Popular techniques typically train a retriever, which is used to fetch the examples in the training set that are closest to T_test (Liu et al., 2021; Rubin et al., 2021; Poesia et al., 2021).
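As a concrete illustration of this retrieval step, the sketch below uses the off-the-shelf all-mpnet-base-v2 encoder as a stand-in for a task-tuned retriever f; the training inputs are invented for the example.

```python
# Sketch: pick the in-context examples closest to the test input T_test.
# all-mpnet-base-v2 is an assumed stand-in for a task-tuned retriever f.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-mpnet-base-v2")

train_inputs = [          # hypothetical training inputs x_i
    "bake a cake",
    "plant a tree",
    "make a cup of coffee",
]
train_emb = encoder.encode(train_inputs, convert_to_tensor=True)

def retrieve(x_test: str, k: int = 2) -> list[str]:
    """Return the k training inputs whose embeddings are closest to f(x_test)."""
    query = encoder.encode(x_test, convert_to_tensor=True)
    hits = util.semantic_search(query, train_emb, top_k=k)[0]
    return [train_inputs[h["corpus_id"]] for h in hits]

# The retrieved examples are then serialized into the few-shot prompt.
print(retrieve("brew some tea"))
```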

Specifically, Poesia et al. (2021) train a retriever with a target-similarity tuning (TST) objective over a corpus D of (x, y) examples. TST learns an embedding function f such that, for any pair of examples (x_i, y_i) and (x_j, y_j), y_i ∼ y_j implies f(x_i) ∼ f(x_j). For a new input x, f(x) is then used to retrieve the closest examples from D.

We follow Poesia et al. (2021) and train a knowledge-similarity tuner (KST). We use mpnet-base[7] with SentenceTransformers (Reimers and Gurevych, 2019) to fine-tune a retrieval function f by minimizing the following loss:
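(The formula below is our reconstruction from the TST description above, not a verbatim quote of the paper's equation.)

$$
\mathcal{L}(\theta) \;=\; \sum_{(x_i,\,y_i),\,(x_j,\,y_j)\,\in\,D}
\Big( \cos\big(f_\theta(x_i),\, f_\theta(x_j)\big) \;-\; \operatorname{sim}(y_i,\, y_j) \Big)^2
$$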

where f_θ is parameterized using a transformer.
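In SentenceTransformers, a similarity-regression objective of this shape can be expressed with CosineSimilarityLoss. The sketch below is an illustrative approximation only: the pairs and their sim(y_i, y_j) labels are invented, and this is not the paper's released training code.

```python
# Sketch: fine-tune the retriever so that cos(f(x_i), f(x_j)) tracks the
# similarity of the target structures sim(y_i, y_j). Labels here are invented.
from sentence_transformers import InputExample, SentenceTransformer, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("all-mpnet-base-v2")

train_pairs = [
    InputExample(texts=["bake a cake", "bake cookies"], label=0.9),
    InputExample(texts=["bake a cake", "plant a tree"], label=0.1),
]
loader = DataLoader(train_pairs, shuffle=True, batch_size=2)
loss = losses.CosineSimilarityLoss(model)

# CosineSimilarityLoss regresses the cosine of the two embeddings onto label.
model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=0)
```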

Table 8 and Table 9 show results of using KST with PROSCRIPT and EXPLAGRAPHS, respectively. While KST is highly effective for edge prediction, the results are mixed for EXPLAGRAPHS and PROSCRIPT: for PROSCRIPT, KST yields marginal gains, whereas for EXPLAGRAPHS a number of training examples have overlapping themes (Table 10), so creating the prompt dynamically reduces the effective information in the prompt.


[5] https://github.com/VHellendoorn/Code-LMs#evaluation

[6] https://github.com/markriedl/WikiPlots


Authors:

(1) Aman Madaan, Language Technologies Institute, Carnegie Mellon University, USA ([email protected]);

(2) Shuyan Zhou, Language Technologies Institute, Carnegie Mellon University, USA ([email protected]);

(3) Uri Alon, Language Technologies Institute, Carnegie Mellon University, USA ([email protected]);

(4) Yiming Yang, Language Technologies Institute, Carnegie Mellon University, USA ([email protected]);

(5) Graham Neubig, Language Technologies Institute, Carnegie Mellon University, USA ([email protected]).
