The 96% vs 48% gap is uncomfortable. I've been in both buckets - suspicious of AI output but not rigorously reviewing it every time.
Running deliberate, structured experiments helps break that pattern. I put Sonnet 4.6 through two very different tasks this week with explicit assumptions going in. One matched expectations cleanly. The other got uncomfortably personal in a way I hadn't planned for.
Turns out deliberately probing a model vs casually trusting it are completely different modes that produce completely different results. Here's what the experiments actually showed: https://thoughts.jock.pl/p/sonnet-46-two-experiments-one-got-personal
Thanks for sharing that, Pawel! Right, experimenting and finding out what works is key. And hallucinations are common enough that distrust should be ON by default; then, depending on the guardrails and constraints you have set up, you can get comfortable with a bit less checking once those are at a really good level.
Hah, "guardrails" is a word of 2026 for sure :D
And yes - I agree. We still need a "human in the loop" for some high-risk tasks.