Monday, June 3, 2024

Five Top Tech Takeaways: Microsoft Unveils AI PCs, Google's $125B Cloud Glitch, Battle of the Bots, Glue Pizza, and GPT-4 Outshines Humans in Financial Forecasting

Make an image of a battle royale with the 5 robots with logos instead of heads. This includes the Google logo, Microsoft logo, OpenAI logo, Perplexity AI logo, and Anthropic logo

Battle of the Bots: WSJ's Comprehensive GenAI Evaluation 

A comprehensive test by the Wall Street Journal compared five top AI chatbots—OpenAI's ChatGPT, Microsoft's Copilot, Google's Gemini, Perplexity, and Anthropic's Claude—on everyday skills. The tests included tasks in health, finance, cooking, work writing, creative writing, summarization, current events, coding, and speed. Perplexity emerged as the overall winner, excelling in summarization and current events, while ChatGPT performed best in health advice and speed. Each chatbot demonstrated unique strengths and weaknesses, highlighting the rapid evolution and diverse capabilities of AI technology.

Key Takeaways
  • Perplexity outperformed other AI chatbots in summarization and current events tasks.
  • Claude excelled at work and creative writing.
  • ChatGPT had the fastest response time.

Author's note: 

The Wall Street Journal's exercise is a valuable reminder that organizations should evaluate Generative AI (GenAI) vendors as rigorously as they would any other software procurement. The article provides a solid foundation for testing and evaluating these tools. However, effective testing is only possible once the potential business benefits have been clearly identified.

To ensure a comprehensive evaluation, organizations should first determine which functions will use the AI-powered chatbots and establish clear guidelines for how the tools will be used and managed within the company. This groundwork enables the development of targeted test prompts, which in turn allows a more accurate assessment of which chatbot best meets the organization's needs. By aligning testing with the identified business objectives, companies can make informed decisions when selecting a GenAI vendor for their specific use cases.
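One way to make that alignment concrete is a weighted scorecard: each business use case gets a weight and a set of test prompts, reviewers rate each vendor's answers, and the vendors are ranked by weighted average. The sketch below is only an illustration of the idea. The use-case names, weights, vendor labels, and ratings are all hypothetical placeholders, not results from the Wall Street Journal test or any real evaluation.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    weight: float       # relative business importance (hypothetical)
    prompts: list[str]  # targeted test prompts for this use case


def weighted_scores(use_cases, ratings):
    """Return {vendor: weighted average rating}.

    ratings[vendor][use_case_name] is a reviewer rating on a 1-5 scale.
    """
    total_weight = sum(uc.weight for uc in use_cases)
    return {
        vendor: sum(uc.weight * by_case[uc.name] for uc in use_cases) / total_weight
        for vendor, by_case in ratings.items()
    }


def rank_vendors(use_cases, ratings):
    """Sort vendors from highest to lowest weighted score."""
    scores = weighted_scores(use_cases, ratings)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical use cases, weighted by how much the business depends on them.
use_cases = [
    UseCase("client_email_drafting", 3.0, ["Draft a reminder about outstanding documents."]),
    UseCase("document_summarization", 2.0, ["Summarize this engagement letter in five bullets."]),
    UseCase("code_review", 1.0, ["Review this SQL query for errors."]),
]

# Placeholder reviewer ratings (1-5); replace with scores from your own testing.
ratings = {
    "vendor_a": {"client_email_drafting": 4, "document_summarization": 3, "code_review": 5},
    "vendor_b": {"client_email_drafting": 3, "document_summarization": 5, "code_review": 2},
}

ranking = rank_vendors(use_cases, ratings)
for vendor, score in ranking:
    print(f"{vendor}: {score:.2f}")
```

Because the weights come from the business objectives rather than from the vendors, changing which functions matter most can change the ranking even when the raw ratings stay the same.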

Microsoft Unveils AI-Powered Copilot+ PCs


Microsoft introduced Copilot+ PCs, a new category of AI-enhanced Windows PCs with advanced silicon capable of 40+ TOPS (trillion operations per second), extended battery life, and new AI features. These PCs feature Recall for memory-like data retrieval, Cocreator for real-time AI image creation, and live audio translation from 40+ languages. Starting at $999, they will be available from June 18 and include models from Microsoft Surface and other major brands.

Key Takeaways
  • Microsoft launched Copilot+ PCs, integrating advanced AI capabilities and powerful silicon.
  • Features include Recall for data retrieval, Cocreator for AI-driven image creation, and live translation.
  • The devices, starting at $999, will be available from June 18 from multiple major brands.
(Source: Microsoft)

From Glue on Pizza to Eating Rocks: Google's AI Under Fire

Google's new AI-generated search overviews have been widely mocked for providing bizarre and incorrect responses to user queries. Examples include recommending eating rocks (advice drawn from a humor website), suggesting glue to keep cheese on pizza, and sharing incorrect and offensive information about former President Obama. These errors highlight significant limitations and "hallucinations" in AI technology, prompting criticism and calls for better safeguards and accuracy in AI-generated content.

Key Takeaways
  • Google's AI search overviews have produced absurd and incorrect answers, causing social media backlash.
  • Errors included recommending glue for pizza and sharing false information about former President Obama.
  • Google's AI issues underline broader challenges in AI technology, especially regarding accuracy and reliability.

GPT-4 Outshines Humans in Financial Forecasting

A study by the Booth School of Business at the University of Chicago found that OpenAI's GPT-4 outperforms human financial analysts in predicting earnings changes from financial statements. Using "chain-of-thought" prompts, GPT-4 achieved a 60% accuracy rate, compared with accuracy in the low-50% range for human analysts. Additionally, trading strategies based on GPT-4's forecasts yielded more profitable results than traditional stock market approaches, suggesting significant potential for AI in financial decision-making.

Key Takeaways
  • GPT-4 surpasses human analysts in financial earnings predictions.
  • The AI model achieved a 60% accuracy rate, higher than that of human analysts.
  • GPT-4-based trading strategies were more profitable than traditional stock market approaches.
(Source: Business Insider)

Google's Cloud Glitch: UniSuper's $125 Billion Account Erased

Google inadvertently deleted the Google Cloud account of UniSuper, an Australian pension fund managing $125 billion. The incident left over half a million fund members without account access for about a week. UniSuper restored service via a backup account with another cloud provider. Google Cloud CEO Thomas Kurian and UniSuper CEO Peter Chun acknowledged the severity of the situation and gave assurances that measures have been taken to prevent a recurrence.

Key Takeaways
  • Google accidentally erased a $125 billion pension fund's Google Cloud account, affecting over half a million users.
  • The issue was resolved using a backup account with another cloud provider.
  • The Google Cloud and UniSuper CEOs stated that measures are in place to prevent similar incidents.
(Source: Yahoo)

Author: Malik Datardina, CPA, CA, CISA. Malik works at Auvenir as a GRC Strategist who is working to transform the engagement experience for accounting firms and their clients. The opinions expressed here do not necessarily represent UWCISA, UW, Auvenir (or its affiliates), CPA Canada or anyone else. This post was written with the assistance of an AI language model. The model provided suggestions and completions to help me write, but the final content and opinions are my own.
