First, there was ChatGPT. Now, a number of large language models offer you the opportunity to use generative artificial intelligence to help you get answers. But which model is best? And at what tasks do different models excel? The Wall Street Journal put five different bots (ChatGPT, Claude, Perplexity, Copilot, and Gemini) to the test on a number of different topics to help understand their strengths and weaknesses.
While the quality of the prompt (what you are asking the bot) matters a great deal in determining the quality of the answer you get, asking different bots the same question can often yield different answers. The bots were asked, for example, what the best age to get pregnant is. Gemini didn’t really answer the question, saying, “The best time to get pregnant is whenever you feel confident and prepared to raise a child,” while Perplexity shared some of the advantages of having kids at a later age: more maturity, better financial stability, and stronger partnerships.
When asked what to do after inheriting an IRA with $1 million in it, Copilot did little more than offer congratulations for inheriting such a large sum of money. Gemini, meanwhile, redeemed itself by sharing information about when to withdraw funds and by advising against rushing into any decisions without first seeking the advice of someone like a financial advisor.
In other tests, Copilot finished last among the bots when it came to writing on work topics, but finished at the top of the list when it came to creative writing.
Personally, I have started prompting multiple bots with the same query and then comparing their answers, either choosing the best one or combining information from several. The upshot is that relying on one bot for everything might be the easiest course of action, but it’s likely not going to get you the best information on a consistent basis.