I’m starting to get this question more and more from do-it-yourself types hoping to save money. If computers can beat a world chess champion, they argue, why can’t they perform as well as or better than my local tax advisor?
We’ll explore this issue in more detail below, but the short answer is no, artificial intelligence is not at the point where it can analyze tax issues and reliably produce accurate responses.
Equally important, the IRS is not at the point where it will waive penalties if you relied on a chatbot.
For those of you who haven’t heard, ChatGPT is a chatbot that has been getting a tremendous amount of press lately. The name stands for Chat Generative Pre-Trained Transformer, and for our purposes it is a generative artificial intelligence search engine. I’m sure knowledgeable techies would take issue with my simple definition, but I’m writing this article for you, not them, so let’s move on. ChatGPT allows users to ask questions, and the responses it generates are conversational and read as if they were written by a human, specifically a very smart human who has instant recall of everything on the internet.
The company behind ChatGPT is OpenAI, and Microsoft is its biggest investor. Other companies have developed their own generative AI engines, but ChatGPT is getting the lion’s share of press these days. Some commentators believe ChatGPT is special only because it’s public, and that other generative AI search engines are just as good if not better. Microsoft added generative AI to Bing, Meta has BlenderBot, and Google has Bard. Not surprisingly, there is even a website dedicated to maintaining the largest AI tools directory.
I spent some time testing ChatGPT in a very unscientific way, posing a variety of easy and hard tax questions to it. The responses were quite surprising, but not for their accuracy. They were well written, and the language was so natural that it was hard to believe a person had not written them.
The responses had a consistent and logical organization but were short on analysis. After reading a few of them, I got the impression that the engine was simply piecing together relevant sentences from various parts of the internet. What was obviously missing was any analysis of the facts and any application of relevant tax principles to the questions presented.
I did notice that the quality of the responses depended on the specificity of the questions, and there is an art to posing questions that elicit the most accurate responses. Some call this “prompt engineering,” the art of knowing how to talk to an AI tool, and it is probably something we will all get better at over time. When dealing with an intricate and complex subject like tax, though, users need some baseline familiarity with the subject matter of their queries just to know whether they are asking the right questions.
The engine answered one of the hardest questions correctly, but it provided no reasoning, and it’s hard to rely on a black box that essentially just says yes or no. More than half of the responses were incorrect or contained irrelevant information. For now, at least, I can recommend ChatGPT and other AI-powered tools only as secondary or even tertiary tax research tools. They are definitely not ready to replace even a modestly trained tax advisor.
There are tax-specific AI-powered search engines out there, and I can say firsthand that at least one of them works pretty well. But even the tax-specific search engines are best used as research tools for finding the most relevant authorities, which you can then analyze on your own. At least for now, AI cannot replace human tax analysis, but who knows how long humans can continue to outpace computers.