How CODEX Model Size Influences COCOGEN’s Output Quality | HackerNoon

News Room
Last updated: 2025/04/24 at 12:56 PM
News Room Published 24 April 2025

Table of Links

Abstract and 1 Introduction

2 COCOGEN: Representing Commonsense structures with code and 2.1 Converting (T,G) into Python code

2.2 Few-shot prompting for generating G

3 Evaluation and 3.1 Experimental setup

3.2 Script generation: PROSCRIPT

3.3 Entity state tracking: PROPARA

3.4 Argument graph generation: EXPLAGRAPHS

4 Analysis

5 Related work

6 Conclusion, Acknowledgments, Limitations, and References

A Few-shot models size estimates

B Dynamic prompt Creation

C Human Evaluation

D Dataset statistics

E Sample outputs

F Prompts

G Designing Python class for a structured task

H Impact of Model size

I Variation in prompts

G Designing Python class for a structured task

Figure 7 shows three different designs for EXPLAGRAPHS. For PROSCRIPT, the formats we tried include representing the script as a Networkx[8] class (Figure 8), as a DOT-like class (Figure 9), and as a tree (Figure 10).
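
To make the difference between these formats concrete, here is a minimal sketch that renders one small script graph in three ways: as Networkx-style code, as a DOT-like literal, and as a nested tree. The goal, the steps, and the function names are hypothetical and chosen for illustration; they are not the templates from Figures 8 to 10. The first function only emits code as text, so Networkx itself does not need to be installed.

```python
# Hypothetical example script: a goal plus "step A happens before step B" edges.
GOAL = "bake a cake"
EDGES = [
    ("find recipe", "gather ingredients"),
    ("gather ingredients", "mix ingredients"),
    ("mix ingredients", "bake"),
]


def edges_to_networkx_code(goal, edges):
    """Emit Python source text that would build the graph with Networkx."""
    lines = ["import networkx as nx", "", f"# goal: {goal}", "G = nx.DiGraph()"]
    for src, dst in edges:
        lines.append(f"G.add_edge({src!r}, {dst!r})")
    return "\n".join(lines)


def edges_to_dot_like(goal, edges):
    """Emit a DOT-like literal description of the same graph."""
    body = "\n".join(f'    "{src}" -> "{dst}";' for src, dst in edges)
    return f'digraph "{goal}" {{\n{body}\n}}'


def edges_to_tree(edges):
    """Nest each step under its predecessor (assumes the graph is acyclic).

    A step with several predecessors is duplicated under each of them.
    """
    children = {}
    targets = set()
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
        targets.add(dst)
    roots = [src for src in children if src not in targets]

    def build(node):
        return {child: build(child) for child in children.get(node, [])}

    return {root: build(root) for root in roots}


if __name__ == "__main__":
    print(edges_to_networkx_code(GOAL, EDGES))
    print(edges_to_dot_like(GOAL, EDGES))
    print(edges_to_tree(EDGES))
```

All three outputs carry the same edges; what changes is how much of the structure is expressed through code constructs and how much through a literal description.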

H Impact of Model size

The CODEX model released by OpenAI is available in two versions[9]: code-davinci-001 and code-davinci-002. While the exact sizes of the models are unknown because of their proprietary nature, the OpenAI API states that code-davinci-002 is the "most capable Codex model." Tables 16 and ?? compare COCOGEN + code-davinci-001 with COCOGEN + code-davinci-002. Note that both code-davinci-001 and code-davinci-002 can fit 4000 tokens, so the number of in-context examples was identical for the two settings. The results show that for identical prompts, COCOGEN + code-davinci-002 vastly outperforms COCOGEN + code-davinci-001, showing the importance of having a better underlying code generation model.
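
The point about the shared 4000-token window can be illustrated with a rough sketch: if two models have the same budget, a greedy packer selects the same in-context examples for both. The whitespace token counter, the `reserve_for_output` parameter, and the function names below are assumptions made for illustration; a real setup would count tokens with the model's own tokenizer.

```python
def count_tokens(text):
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def pack_examples(examples, query, budget=4000, reserve_for_output=0):
    """Greedily keep examples, in order, while the prompt fits the budget.

    `budget` is the context size in tokens; `reserve_for_output` is a
    hypothetical allowance left free for the generated completion.
    """
    used = count_tokens(query) + reserve_for_output
    chosen = []
    for example in examples:
        cost = count_tokens(example)
        if used + cost > budget:
            break  # the next example would overflow the context window
        chosen.append(example)
        used += cost
    return chosen


if __name__ == "__main__":
    demo_examples = ["step one two"] * 10  # 3 "tokens" each under this counter
    # The same budget yields the same examples, whichever model receives them.
    print(len(pack_examples(demo_examples, "query", budget=10)))
```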

Figure 5: Example graphs for each of the tasks used for COCOGEN: PROSCRIPT (top-left), EXPLAGRAPHS (top-right), and PROPARA (bottom).

Table 13: Performance of CODEX on the three different formats presented in Figure 7 for EXPLAGRAPHS.

Table 14: Performance of CODEX-001 and CODEX-002 on the different formats presented in Figures 10 and 9 for PROSCRIPT edge prediction. We find that the literal format (Figure 9), which combines structure with literal output, performs the best for CODEX-002.

Model size vs. sensitivity to the prompt. Table 14 shows the performance of CODEX-001 (smaller) and CODEX-002 (larger; also see Appendix A) on identical prompts. Our experiments suggest that as model size increases, the model might become progressively less sensitive to the design of the prompt.

I Variation in prompts

We run each experiment with 4 different random seeds, where the random seeds decide the order of examples in the prompt. We find minimal variance across 3 runs that use different fixed prompts. Further, as shown in Tables 18, 19, 20, and 21, all improvements of COCOGEN over DAVINCI are statistically significant (p-value < 0.001).
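
As a minimal sketch of this protocol, the snippet below uses a seed to fix the order of the in-context examples and then summarizes per-seed scores as a mean and standard deviation. The function names are hypothetical, the scores in the usage lines are placeholders rather than results from the paper, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import random
import statistics


def ordered_examples(examples, seed):
    """Return the examples in an order that is fully determined by `seed`."""
    rng = random.Random(seed)
    shuffled = list(examples)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled


def summarize(scores):
    """Mean and sample standard deviation of one score per seed."""
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    examples = ["example A", "example B", "example C", "example D"]
    for seed in (0, 1, 2):
        print(seed, ordered_examples(examples, seed))
    # Placeholder scores, one per seed; not taken from Tables 18 to 21.
    print(summarize([1.0, 2.0, 3.0]))
```

Because each seed maps to exactly one ordering, a run can be reproduced by reusing its seed.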

Figure 6: A PROSCRIPT plan (top) and the corresponding Python code (bottom).

Table 18: PROSCRIPT script generation: mean and standard deviation across three different random seeds.

Table 21: PROPARA: mean and standard deviation across three different random seeds.

Table 19: PROSCRIPT edge prediction: mean and standard deviation across three different random seeds.

Table 15: CODEX results on PROSCRIPT generation for various Python source formats.

Figure 7: Templates tried for EXPLAGRAPHS.

Table 16: CODEX-001 vs. CODEX-002 on PROSCRIPT script generation.

Figure 8: PROSCRIPT as a Networkx class.

Figure 9: Representing a PROSCRIPT graph literally.

Table 20: EXPLAGRAPHS: mean and standard deviation across three different random seeds.

Figure 10: PROSCRIPT with a tree encoding.


[9] as of June 2022


Authors:

(1) Aman Madaan, Language Technologies Institute, Carnegie Mellon University, USA;

(2) Shuyan Zhou, Language Technologies Institute, Carnegie Mellon University, USA;

(3) Uri Alon, Language Technologies Institute, Carnegie Mellon University, USA;

(4) Yiming Yang, Language Technologies Institute, Carnegie Mellon University, USA;

(5) Graham Neubig, Language Technologies Institute, Carnegie Mellon University, USA.
