The Boring Newsletter, 3/8/2026

In Which I Dramatically Vanquish AI in a Tax Research Showdown

Hi Friendos,

A few days ago, the New York Times published an article headlined, “A Word to the Wise: Don’t Trust A.I. to File Your Taxes” (gift link). “Duh,” I thought, and of course then I read the article. They had four different chatbots (Gemini, ChatGPT, Claude, and Grok) test eight different tax situations, and on average the robots got the tax refund amounts wrong by more than $2k. The article says this is not surprising due to the probabilistic nature of large language models, in contrast to the if-then logic of “tax software like TurboTax” which is “built for mathematical precision.” But it concludes, “Tax experts have suggested that the tools are still a helpful assistant to use alongside manual research.”

I’ve been trying a couple different AI tools (ChatGPT and Microsoft Copilot) with varying degrees of success. My results are much better when I write detailed prompts and end them with “Ask me any relevant questions.” After my recent wrist surgery, AI greatly outperformed my phoning-it-in physical therapist and pinpointed an exercise I was overlooking. AI helped draft a perfect appeal letter to my insurance company after I got a prescription coverage denial. But AI completely bombed at finding contractors for a home repair project, giving a bunch of nonexistent URLs.

With my own tax research, it’s been a mixed bag. I think the best use is to have AI point you to relevant IRS publications and relevant articles from experts. The conclusions drawn by AI chatbots are sometimes right, sometimes wrong, and that’s not good enough for tax questions.

Yesterday, I was researching max IRA contributions and deductions for a certain tax situation, and Microsoft Copilot said my answer was $2,000. I thought $7,500 was probably correct. I found my way to IRS Publication 590-A, “Contributions to Individual Retirement Arrangements (IRAs),” which had an example that was exactly on point to my question and confirmed the $7,500 I suspected. Job done.

But…if the answer was $7,500, why did the robot say $2,000? I asked the robot to explain its logic, but the answers didn’t make sense to me. We went in circles a couple times and I admit, I got testy with it and the robot said “let’s take a deep breath.” Not my finest moment! But aren’t we all that passionate about IRA contribution limits?

I asked for specifics and the robot said it arrived at $2,000 via a worksheet in Pub 590-A. “Ok,” I thought, “I’ll take a look at that,” so I asked it for the page number of the worksheet. I searched for it, but what the robot flagged related to a different topic, not IRA contribution limits. I complained to the robot:

Pfft, the robot referenced a worksheet that doesn’t exist. So was the $2,000 answer drawn from some unreliable source among the oodles of online discussions of IRAs? The robot described a bunch of sources but didn’t give actual website links.

It then gave me 7 URLs with quotations from each source, and I diligently went to each URL. Each one either did not bring up a website at all, brought up an error message, or loaded a webpage that led to the same $7,500 answer I had already concluded was correct. I complained to the robot again:

Whoa! The robot said it was wrong. It also said I was right!!

Victory! I got a stupid robot to admit an error about an esoteric IRA question.

A couple years ago, I would have gone to IRS.gov and typed in “IRA.” It might have taken a few minutes, but I would have found my way to Pub 590-A, seen the on-point example provided by the IRS, and called it a day. Instead of only using AI to lead me to the correct source, yesterday I tried to have it give me the final answer and ended up wasting a bunch of time and temporarily doubting myself.

I said earlier that I’ve had better results with AI when I tell it to ask me any relevant questions. If we ask all the relevant questions while preparing a tax return, well…that’s just regular tax software and it already exists. I don’t think we need fancy new technology to help ordinary people navigate our overly complicated tax system. Technology is awesome at solving technological problems and not awesome at solving political problems.

-Stephanie
