> write a set of unit-tests against a set of abstract classes used as arguments of such unit-tests.
An exhaustive set of use cases to confirm a vibe-coded, AI-generated app would be an app in itself. Experienced developers know which subsets of tests are critical, avoiding much of that work.
> Experienced developers know which subsets of tests are critical, avoiding much of that work.
And they do know this for programs written by other experienced developers, because they know where to expect "linearity" and where to expect steps in the output function. (Testing 0, 1, 127, 128, and 255 is important; 89 and 90 likely are not, unless the domain knowledge says otherwise.) This is not necessarily correct for statistically derived algorithm descriptions.
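To make the boundary-value point concrete, here is a minimal Python sketch. The clamp_to_byte function and its tests are invented for illustration, not taken from any real codebase:

```python
# Hypothetical example: boundary-value tests for an imagined clamp_to_byte(x)
# that clips an integer into the 0..255 range. The interesting inputs sit at
# the steps of the output function, not in the flat middle of the range.
import unittest


def clamp_to_byte(x: int) -> int:
    """Clip x into the unsigned byte range [0, 255]."""
    return max(0, min(255, x))


class ClampBoundaryTests(unittest.TestCase):
    def test_boundaries(self):
        # 0, 1, 127, 128, 255: the edges and the sign-bit step of a byte.
        self.assertEqual(clamp_to_byte(-1), 0)
        self.assertEqual(clamp_to_byte(0), 0)
        self.assertEqual(clamp_to_byte(1), 1)
        self.assertEqual(clamp_to_byte(127), 127)
        self.assertEqual(clamp_to_byte(128), 128)
        self.assertEqual(clamp_to_byte(255), 255)
        self.assertEqual(clamp_to_byte(256), 255)

    # 89 and 90 add nothing here: the function is linear between the steps,
    # so a handful of boundary cases carries most of the information.


if __name__ == "__main__":
    unittest.main()
```

The economy of testing only 0, 1, 127, 128, 255 rests on assuming the implementation behaves predictably between those steps; a statistically derived implementation gives no such guarantee, so the "critical subset" is much harder to pick.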
That depends a bit on whether you view and use unit-tests as
a) a check that the spec is implemented correctly, OR
b) the spec itself, or part of it.
I know people have different views on this, but if unit-tests are not the spec, or part of it, then we must formalize the spec in some other way.
If the spec is not written in some formal way, then I don't think we can automatically verify whether or not the implementation satisfies it. (That's what the cartoon was about.)
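To make the a)/b) distinction concrete, here is a small hypothetical sketch in pytest style; the discount example and every name in it are made up:

```python
# Hypothetical sketch of the a)/b) distinction; all names are invented.

def discount(price: float, is_member: bool) -> float:
    """Stand-in for the implementation under test."""
    return price * 0.9 if is_member else price


# a) Tests that CHECK the spec: the requirement lives elsewhere (a written
#    spec, here transcribed into an executable reference model), and the
#    test only verifies that the implementation agrees with it.
def spec_discount(price: float, is_member: bool) -> float:
    """Reference model transcribed from the written spec."""
    return price * 0.9 if is_member else price


def test_discount_matches_spec():
    for price in (0.0, 9.99, 100.0):
        for is_member in (False, True):
            assert discount(price, is_member) == spec_discount(price, is_member)


# b) Tests that ARE the spec: these assertions are the only place where
#    the required behaviour is written down at all.
def test_members_get_ten_percent_off():
    assert discount(100.0, is_member=True) == 90.0
    assert discount(100.0, is_member=False) == 100.0
```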
> then we must formalize the spec in some other way.
For most projects, the spec is formalized in carefully written natural language (like any other spec in other professions), and that is mostly fine.
If you want your unit tests to be the spec, as I wrote in https://news.ycombinator.com/item?id=46667964, you would need quite A LOT of them. I would rather learn to write proofs than try to exhaustively enumerate a (near) infinite number of input/output combinations. Unit-tests are simply the wrong tool, because they amount to picking excerpts from the library of all possible books. I don't think that is what people mean by e.g. TDD.
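A back-of-the-envelope illustration of the "library of all possible books" point (the throughput figure is just an assumption to get an order of magnitude):

```python
# Even a trivial function over two 32-bit integers has far more input
# combinations than any test suite could ever enumerate.
inputs_per_arg = 2 ** 32            # distinct 32-bit integer values
total_cases = inputs_per_arg ** 2   # all (a, b) pairs: 2**64 ~ 1.8e19

tests_per_second = 1_000_000        # a generous assumption
seconds_per_year = 60 * 60 * 24 * 365

years_to_exhaust = total_cases / (tests_per_second * seconds_per_year)
print(f"{total_cases:.3e} cases, ~{years_to_exhaust:.0f} years to run them all")
# ~585,000 years: example-based tests can only ever sample this space,
# which is why they work better as checks on a spec than as the spec itself.
```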
What the cartoon is about is that any formal(-enough) way to describe program behaviour will just be yet another programming tool/language. If you have some novel way of program specification, someone will write a compiler and then we might use it, but it will still be programming and LLMs ain't that.
I agree (?) that using AI vibe-coding can be a good way to produce a prototype for stakeholders, to see if the AI output is actually something they want.
The problem I see is how to evolve such a prototype toward more correct specs, or toward changed specs in the future, because AI output is non-deterministic -- and "vibes" are ambiguous.
Giving the AI more specs, or modified specs, means it has to re-interpret them, and since its output is non-deterministic it can re-interpret vibey specs differently each time and thus diverge in a new direction.
Using unit-tests as (at least part of) the spec would be a way to keep the specs stable and unambiguous. If the AI keeps re-interpreting vibey, ambiguous specs, then the specs are effectively unstable, which means the final output has a hard time converging to a stable state.
I've asked this before, not knowing much about AI-assisted software development: is there an LLM that, given a set of unit-tests, will generate an implementation that passes those tests? And is such a practice commonly used in the community, and if not, why not?
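For what it's worth, I imagine such a loop would look roughly like the sketch below. It is only a sketch: ask_llm() is a placeholder for whatever model API one would use, and nothing here refers to a real tool.

```python
# Hypothetical sketch: treat a pytest file as the spec, ask a model for an
# implementation, and loop until the tests pass.
import pathlib
import subprocess


def ask_llm(prompt: str) -> str:
    """Placeholder: call your LLM of choice and return generated code."""
    raise NotImplementedError


def generate_until_green(test_file: str, impl_file: str, max_rounds: int = 5) -> bool:
    tests = pathlib.Path(test_file).read_text()
    feedback = ""
    for _ in range(max_rounds):
        prompt = (
            "Write a Python module that makes these tests pass:\n"
            f"{tests}\n{feedback}"
        )
        pathlib.Path(impl_file).write_text(ask_llm(prompt))
        run = subprocess.run(
            ["pytest", test_file, "-q"], capture_output=True, text=True
        )
        if run.returncode == 0:
            return True  # the tests (the spec) are satisfied
        feedback = f"\nPrevious attempt failed:\n{run.stdout[-2000:]}"
    return False
```

Even when such a loop goes green, the implementation is only guaranteed to satisfy the sampled cases in the test file, not the broader intent behind them.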