I haven’t been in development for nearly 20 years now, but I assumed it worked like this:
You generate unit tests for a very specific function of rather limited magnitude, then you let AI generate the function. How could this work otherwise?
Bonus points if you let the AI divide your overall problem into smaller problems of manageable magnitudes. That wouldn’t involve code generation as such…
Am I wrong with this approach?
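To make the workflow above concrete, here’s a minimal sketch of the test-first pattern being described. All names (`slugify`, the test cases) are hypothetical examples; the function body is the part you’d have the AI fill in after writing the tests yourself.

```python
import unittest

# The tests are written first; the body of slugify is the piece
# you'd hand off to the AI to generate against them.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

class TestSlugify(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_extra_whitespace(self):
        self.assertEqual(slugify("  A   B "), "a-b")

if __name__ == "__main__":
    unittest.main()
```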
At that point you should be able to just write the code yourself.
The A"I" will either make mistakes even under defined bounds, or it will never make any mistakes ever in which case it’s not an autocomplete, it’s a compiler and we’ve just gone full circle.
The complexity here lies in having to craft a comprehensive enough spec. Correctness is one aspect, but another is performance. If the AI craps out code that passes your tests but does it in a really inefficient way, then it’s still a problem.
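A toy illustration of that failure mode: both functions below pass identical tests, but one is unusable at scale. A test suite that only checks small inputs can’t tell them apart.

```python
def fib_slow(n: int) -> int:
    # Exponential time: naively recomputes every subproblem.
    return n if n < 2 else fib_slow(n - 1) + fib_slow(n - 2)

def fib_fast(n: int) -> int:
    # Linear time: iterate with two accumulators.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Both agree on every small input a test suite is likely to check...
assert all(fib_slow(n) == fib_fast(n) for n in range(20))
# ...but fib_slow(40) already takes noticeably long, while
# fib_fast handles n in the thousands instantly.
```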
Also worth noting that you don’t actually need AI to do such things. For example, Barliman is a tool that can do program synthesis. Given a set of tests to pass, it attempts to complete the program for you. Synthesis is performed using logic programming. Not only is it capable of generating code, but it can also reuse code it’s already come up with as a basis for solving bigger problems.
https://github.com/webyrd/Barliman
here’s a talk about how it works https://www.youtube.com/watch?v=er_lLvkklsk
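For a feel of what test-driven synthesis means, here’s a deliberately crude Python sketch of the core idea: enumerate candidate expressions and return the first one consistent with every input/output example. Barliman does this far more cleverly, via a relational Scheme interpreter in miniKanren; this brute force just shows the shape of the problem.

```python
import itertools

# Tiny expression grammar over one variable x.
ATOMS = ["x", "0", "1", "2"]
OPS = ["+", "-", "*"]

def candidates(depth):
    """Yield expression strings up to the given nesting depth."""
    if depth == 0:
        yield from ATOMS
        return
    yield from candidates(depth - 1)
    for op in OPS:
        for left, right in itertools.product(list(candidates(depth - 1)), repeat=2):
            yield f"({left} {op} {right})"

def synthesize(examples, max_depth=2):
    """Return the first expression satisfying all (input, output) pairs."""
    for expr in candidates(max_depth):
        if all(eval(expr, {"x": x}) == y for x, y in examples):
            return expr

# "Tests" as input/output pairs: we want f(x) = 2*x + 1.
result = synthesize([(0, 1), (1, 3), (5, 11)])
print(result)  # some expression equivalent to 2*x + 1
```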
I tend to write a comment of what I want to do, and have Copilot suggest the next 1-8 lines for me. I then check the code if it’s correct and fix it if necessary.
For small tasks it’s usually good enough, and I’ve already written a comment explaining what the code does. It can also be convenient to use it to explore an unknown library or functionality quickly.
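The comment-first pattern looks something like this (the function and its behavior are hypothetical examples, and the body stands in for the kind of completion an assistant suggests, which you then verify):

```python
# Parse "key=value" pairs from a semicolon-separated string,
# skipping segments that don't contain '='.
def parse_pairs(s: str) -> dict[str, str]:
    # The comment above is what you write; lines like these are
    # what the assistant proposes in response. Check before keeping.
    return dict(
        seg.split("=", 1)
        for seg in s.split(";")
        if "=" in seg
    )
```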
“Unknown library” often means a rather small, sparsely documented, and little-used library for me, tho. Which means AI makes everything even worse by hallucinating.
I meant a library unknown to me specifically. I do encounter hallucinations every now and then but usually they’re quickly fixable.
It’s made me a little bit faster, sometimes. It’s certainly not like a 50-100% increase or anything, maybe like a 5-10% at best?