7 Comments
Eloy Ruiz's avatar

Which Grok model is reviewed here?

Ehijele Abhulimhen's avatar

Grok Build 0.1 was used.

Larry's avatar

Any chance you might make the app used for the review process available, so that other models can be tested and compared against the known results?

Ugrin Radkov's avatar

Not gut person taoby have girty not been in your house since last night but person woman was just 2 years of me

matt wilkie's avatar

Interesting. I wonder what the total review cost would be with an iterative approach: in round 1, have Grok identify bugs and fix them; in round 2, run Opus over the cleaned code. Would a Grok review plus an Opus review be more economical than a one-shot, Opus-only review?

For the purposes of determining total review cost, the fixing expense would be excluded.
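
A rough sketch of how that comparison could be set up. Every token count and price below is a made-up placeholder, not a measured figure from the review, and the function names are hypothetical. The fixing cost is left out, as stated above.

```python
def review_cost(tokens_in, tokens_out, price_in, price_out):
    """Cost of one review pass. Prices are per million tokens."""
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000


def two_round_is_cheaper(cheap_pass, strong_pass_after_cleanup, strong_pass_alone):
    """True when round 1 (cheap model) plus round 2 (strong model on the
    cleaned code) costs less than a single strong-model review.
    The cost of fixing the bugs between rounds is excluded."""
    return cheap_pass + strong_pass_after_cleanup < strong_pass_alone


# Placeholder numbers only: same codebase size in every pass, with the
# strong model assumed to write fewer findings once the code is cleaned.
grok_pass = review_cost(200_000, 10_000, price_in=0.5, price_out=2.0)
opus_alone = review_cost(200_000, 20_000, price_in=10.0, price_out=50.0)
opus_after = review_cost(200_000, 8_000, price_in=10.0, price_out=50.0)

print(two_round_is_cheaper(grok_pass, opus_after, opus_alone))
```

Note that round 2 still has to read the whole codebase, so under these assumptions any saving comes only from the second pass producing less output, and it has to exceed the cost of the Grok pass.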

Skylar's avatar

Including a comparable OpenAI model such as Codex would have made for a more interesting comparison than just the GPT model.

Ahmet Sezen's avatar

Very surprised by the GPT-5.5 result. In my experience, GPT has been better than Opus.