3 Comments
Andrew Simard

As an aging solo developer, I never had much interest in testing or testing frameworks while working on my own stuff. With AI, though, I'm all-in on testing everything, even in simpler apps, because tests provide a natural mechanism for putting blinders on models so they get to their destination without taking a wrecking ball through a project. So this evolution to Task > Evaluate > Test is, I think, exactly what we want models to be doing? And if we compartmentalize our code enough, we'll seemingly always be giving them "simple" tasks to implement, when in reality these will often be smaller pieces of a larger project where that extra bit of effort is not wasted. If anything, I see these models doing more as a way to combat complacency in our own coding efforts: the `I *could* do that but I'm in a rush and it's not MVP so let's skip it` kind of mindset versus `what would I do in this situation if I had enough talent and resources`. I'm liking the latter approach myself.

Η Προώθηση Της Γνώσης

Usually a better workflow is to ask the agent to create a plan and wait for review. That way you can read what the agent intends to implement before it starts implementing. You would have seen the over-engineering intention early on.

For fixing bugs: sometimes we simply ask an agent to fix a bug. Usually, if the bug is complex, we first ask the agent to investigate, debug, and find the root cause. Modifying the request to:

'Find the root cause for [X], show me the problem, suggest a fix, and wait for review'

may have handled your complaint? The idea is to treat the agent as a co-worker with whom we have to communicate, but I guess we all tend to forget that occasionally.

Η Προώθηση Της Γνώσης

On second thought, I would like to add:

Using software patterns is an excellent way to reduce complexity and fix errors, or to safeguard for future expansion of capabilities.

So I guess the model didn't 'read' your expectation to KISS because it was tutorial/example code. Maybe the expectation was not expressed at all, or was not implied strongly enough, or maybe a SOTA model wasn't the best choice for 'simpler' work.