ChatGPT gets code questions wrong 52% of the time

ylai@lemmy.ml · 11 months ago

ChatGPT gets code questions wrong 52% of the time

Immersive_Matthew@sh.itjust.works · 11 months ago

I am using ChatGPT4+ with the code interpreter and I am finding it closer to 90% accurate when writing 50-200 lines of C# code in Unity. Beyond 200 lines it starts to have more issues and the accuracy drops. It has saved me so much time refactoring my project.

SirGolan@lemmy.sdf.org · 11 months ago

Yeah. They buried it in there (and for some of their experiments just said “ChatGPT” which could mean either), but they used 3.5 and oddly enough, 3.5 gets 48% on HumanEval.

fristislurper · edited 11 months ago

They “buried” it in the methodology section, where they describe how they generate prompts. This is the place I expect this to be mentioned, or am I missing something? Where else would they put it?

SirGolan@lemmy.sdf.org · 11 months ago

It’s a pretty important fact since there’s a huge difference between 3.5 and 4. Mentioning it once in one place is not great, plus they also just mention ChatGPT without specifying 3.5 or 4 earlier in that paragraph. The problem I have is that this has led the press (and hence many other people) to think ChatGPT is terrible at coding when in fact the GPT-4 version is actually pretty decent.