Showing posts with label CNBC. Show all posts

Friday, January 31, 2025

Understanding DeepSeek's AI Breakthrough: 5 Videos to Get You Up to Speed!

With the seismic impact of DeepSeek on AI, the stock market, and geopolitics, we wanted to follow up on our previous post with a deeper exploration of the topic. For this post, we gathered five videos that will help you get up to speed on the unfolding drama.

Vid1: CNBC Covers the Ensuing Market Meltdown

CNBC discusses the impact of DeepSeek, a new AI model from China, on the global tech industry. DeepSeek's efficiency and performance, which surpass some American models, are causing a major sell-off in AI-related stocks, hitting companies like Nvidia particularly hard. The video explores concerns about DeepSeek's potential access to advanced technology and the implications for US technological dominance. The discussion also touches upon the shift towards open-source AI models and the uncertainty surrounding future investments in AI development. Finally, the video highlights the rapid advancement of AI technology and its potential societal impact, comparing the situation to the Sputnik moment of the space race.

Vid2: AI Enthusiast, Matt Wolfe, Gives His Take

Matt Wolfe, who closely follows the AI space, discusses DeepSeek R1, a new Chinese open-source AI model that has caused significant market reactions. DeepSeek's impressive performance, achieved with significantly less computing power than comparable models like GPT-4, is attributed to its efficient training methods and innovative design. Controversy surrounds DeepSeek's claims regarding its resource usage, with some suggesting the company downplayed the actual computational resources employed. Despite this, the video argues the model's impact may be positive, possibly lowering the barrier to entry for AI development and increasing overall demand for GPUs. The video also covers DeepSeek's image generation model, Janus-Pro-7B, and provides instructions on how to access and use DeepSeek.

Vid3: A Geopolitical Perspective on the DeepSeek Saga

Here is ColdFusion's take on the DeepSeek story. The video discusses the sudden emergence of DeepSeek R1, a free, open-source Chinese AI model that rivals, and in some ways surpasses, leading American AI models. Its unexpectedly low development cost and superior efficiency have sent shockwaves through the US stock market and prompted a reassessment of AI development strategies. Concerns about intellectual property theft are raised, alongside the geopolitical implications of this technological advancement. The narrative explores the innovative techniques behind DeepSeek R1's performance and the competitive landscape it has created, highlighting the resulting cost reductions and potential for rapid AI progress globally.

Vid4: If you are using DeepSeek, Your Data is Going to China!

Skill Leap AI discusses serious privacy concerns regarding the DeepSeek website and app: vague data retention policies, data storage in China (which raises compliance issues with international laws), a lack of transparency in data usage, and insufficient age verification. The creator outlines these issues after reviewing the platform's privacy policy and terms of service using ChatGPT. To mitigate these risks, the video suggests using locally installed versions of DeepSeek R1 or using DeepSeek's integration within Perplexity AI, a US-based search engine. Finally, the video promises a future comparison of DeepSeek R1 and ChatGPT's o1 model.

Vid5: A Video Walkthrough of Dario Amodei's take on DeepSeek's Capabilities

In this video, Matt Berman takes a look at Dario Amodei's take on the DeepSeek saga. Amodei, the CEO of Anthropic, OpenAI's chief rival, wrote an essay discussing the implications of DeepSeek's AI model, R1, particularly concerning its potential data acquisition from OpenAI and the resulting impact on the AI industry and geopolitical landscape. The essay analyzes three key dynamics of AI development: scaling laws, shifting the curve, and shifting the paradigm, emphasizing the escalating costs and exponential advancements in AI capabilities. Concerns about China's access to advanced GPUs and its potential to achieve artificial general intelligence (AGI) are also highlighted, underscoring the importance of export controls. Finally, the essay argues that DeepSeek's cost-effective model, while impressive, does not represent a fundamental shift in AI economics and that the market's overreaction was unwarranted.

Author: Malik Datardina, CPA, CA, CISA. Malik works at Auvenir as a Sr. AI Product Manager who is working to transform the engagement experience for accounting firms and their clients. The opinions expressed here do not necessarily represent UWCISA, UW, Auvenir (or its affiliates), CPA Canada or anyone else. This post was written with the assistance of an AI language model. The model provided suggestions and completions to help me write, but the final content and opinions are my own.


Friday, January 24, 2025

DeepSeek: How a Chinese Open-Source AI is Disrupting Silicon Valley's Dominance

In my ongoing exploration of emerging AI technologies, I recently encountered a fascinating CNBC deep dive into DeepSeek, a Chinese AI firm that's reshaping the competitive landscape of artificial intelligence. The investigation sparked my interest in understanding how this relatively unknown company has managed to develop AI models that rival industry giants like OpenAI and Google, while maintaining an open-source approach and significantly lower development costs. Through comprehensive research and analysis of DeepSeek's technical innovations, particularly their DeepSeek-V3 and DeepSeek-R1 models, I've uncovered insights into how this disruptor is challenging conventional wisdom about AI development costs and accessibility. This article examines DeepSeek's technological breakthroughs, their implications for the global AI race, and what their emergence means for making advanced AI accessible to all.

 

Intro

DeepSeek, a relatively unknown Chinese AI firm with roots in the quantitative stock trading firm High-Flyer [1], has sent ripples through the AI community with its release of DeepSeek-V3 and DeepSeek-R1, two powerful open-source AI systems. These models are impressive not only for their technical capabilities, which rival those of industry giants like OpenAI and Google, but also for their remarkably low development costs and open-source accessibility. This has sparked considerable discussion about the evolving AI landscape and the intensifying competition between the US and China in AI development.

DeepSeek-V3: A Technical Marvel

With 671 billion parameters, DeepSeek-V3 surpasses even Meta's Llama 3.1 in scale [2]. What distinguishes it, however, is its Mixture-of-Experts (MoE) architecture, which activates only the expert sub-networks needed for a given input rather than the whole model, resulting in significant cost savings and improved efficiency [3]. Despite its vast total parameter count, DeepSeek-V3 uses just 37 billion parameters at a time during actual tasks [3]. This efficient design allows it to achieve high performance with significantly less computational power and cost than its peers [4].
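To make the idea of sparse activation concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration only, not DeepSeek's implementation: the eight toy experts, the router scores, and the function names (`route_top_k`, `moe_layer`) are all assumptions made for this example, and real MoE layers operate on tensors with learned routers.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Renormalise so the selected experts' weights sum to 1.
    return [(i, probs[i] / norm) for i in top]

def moe_layer(x, experts, router_scores, k=2):
    """Run only the k selected experts and blend their outputs."""
    return sum(weight * experts[i](x) for i, weight in route_top_k(router_scores, k))

# Hypothetical setup: eight toy "experts", each a simple scalar function.
experts = [lambda x, scale=s: scale * x for s in range(1, 9)]
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

selected = route_top_k(router_scores, k=2)
result = moe_layer(1.0, experts, router_scores, k=2)

# Share of experts used in this toy layer, and the share of parameters
# active in DeepSeek-V3 according to the figures quoted above.
toy_active_share = len(selected) / len(experts)
v3_active_share = 37 / 671

print(f"experts used: {[i for i, _ in selected]}")
print(f"toy layer runs {toy_active_share:.0%} of its experts")
print(f"DeepSeek-V3 activates about {v3_active_share:.1%} of its parameters")
```

The point of the sketch is the ratio: only the selected experts do any work for a given input, which is why a model with 671 billion total parameters can run with roughly the compute of a 37-billion-parameter one.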

DeepSeek-V3 excels in various text-based tasks, including coding, translation, and writing [2]. It has achieved top scores on popular AI benchmarks, such as HumanEval, GSM8K, and MMLU, challenging both open and closed-source models [3]. Notably, DeepSeek-V3 outperformed Meta's Llama 3.1, OpenAI's GPT-4o, and Anthropic's Claude 3.5 Sonnet in accuracy across various tasks, from complex problem-solving to math and coding [6].

However, it's important to acknowledge that DeepSeek-V3 is primarily focused on text-based tasks and does not possess multimodal abilities [2]. Like many large language models, it may also inherit biases from its training data, requiring careful consideration in real-world applications [2].

DeepSeek-R1: Mastering Reasoning

DeepSeek further solidified its position with the release of DeepSeek-R1, a reasoning model designed to tackle complex problems with a focus on logical inference, mathematical problem-solving, and real-time decision-making [7]. This sets it apart from traditional language models, which primarily focus on text generation and comprehension.

DeepSeek-R1's development began with DeepSeek-R1-Zero, a foundational model trained exclusively via reinforcement learning [8]. While R1-Zero showed promise in reasoning, it faced challenges with readability and output coherence. DeepSeek addressed these issues in R1 by incorporating cold-start data and a multi-stage reinforcement learning process [9].

DeepSeek-R1 has demonstrated remarkable performance on various benchmarks, including AIME (American Invitational Mathematics Examination) and MATH [1]. While DeepSeek claimed that R1 exceeded the performance of OpenAI's o1 on these benchmarks, independent analysis by The Wall Street Journal found that o1 was faster in solving AIME problems [1]. Nevertheless, R1's performance remains competitive with leading models in the field.
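Scores on benchmarks such as AIME and MATH are commonly reported as Pass@1, the chance that a single sampled answer is correct. The sketch below shows one widely used way to estimate pass@k from several samples per problem. It is an illustration under stated assumptions: the sample counts are made up, the function name `pass_at_k` is ours, and the exact evaluation protocol behind the published DeepSeek and OpenAI numbers may differ.

```python
from math import comb

def pass_at_k(n, c, k):
    """Estimate the chance that at least one of k samples is correct,
    given that c of n generated samples were correct."""
    if n - c < k:
        # Too few incorrect samples to fill k draws: success is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical results: (samples generated, samples correct) per problem.
per_problem = [(10, 8), (10, 10), (10, 3)]

# For k=1 the estimate reduces to the fraction of correct samples.
scores = [pass_at_k(n, c, k=1) for n, c in per_problem]
benchmark_pass_at_1 = sum(scores) / len(scores)

print(f"per-problem pass@1: {scores}")
print(f"benchmark pass@1: {benchmark_pass_at_1:.1%}")
```

A benchmark-level Pass@1 is then simply the average of the per-problem values, which is why small differences between two models can fall within sampling noise.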





| Feature | DeepSeek-R1 | OpenAI o1 |
| --- | --- | --- |
| Architecture | Mixture-of-Experts (MoE) | - |
| Parameters | 671 billion total, 37 billion active | - |
| AIME 2024 (Pass@1) | 79.8% | 79.2% |
| MATH-500 (Pass@1) | 97.3% | 96.4% |
| Codeforces (Percentile) | 96.3 | 96.6 |
| Cost (per million tokens) | $2.19 | $60 |
| Key Features | Open-source, transparent reasoning | Chain-of-thought processing |
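The price gap is easier to appreciate with a quick calculation. The sketch below uses only the two per-million-token prices quoted in the comparison; the 50-million-token workload and the helper name `cost_usd` are hypothetical, and real bills depend on how each provider prices input versus output tokens.

```python
# Prices per million tokens, as quoted in the comparison above (USD).
R1_PRICE_PER_MILLION = 2.19
O1_PRICE_PER_MILLION = 60.00

def cost_usd(tokens, price_per_million):
    """Cost of processing a given number of tokens at a per-million price."""
    return tokens / 1_000_000 * price_per_million

# Hypothetical workload: 50 million tokens in a month.
workload_tokens = 50_000_000

r1_cost = cost_usd(workload_tokens, R1_PRICE_PER_MILLION)
o1_cost = cost_usd(workload_tokens, O1_PRICE_PER_MILLION)
price_ratio = O1_PRICE_PER_MILLION / R1_PRICE_PER_MILLION

print(f"DeepSeek-R1: ${r1_cost:,.2f}")
print(f"OpenAI o1:   ${o1_cost:,.2f}")
print(f"o1 costs about {price_ratio:.1f}x as much per token")
```

At the quoted prices, o1 is roughly 27 times more expensive per token, while the benchmark scores in the comparison differ by about a percentage point or less.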

DeepSeek-R1 has also reportedly shown promising results in financial analysis, outperforming the S&P 500 and maintaining superior Sharpe and Sortino ratios compared to the market [7]. Furthermore, it exhibits a unique ability to provide transparent reasoning, offering insights into its decision-making process [10]. However, it's worth noting that the model tends to align with the official Chinese government position on sensitive political topics [10].
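For readers unfamiliar with the two risk measures mentioned above, the sketch below shows how Sharpe and Sortino ratios are typically computed from a series of periodic returns. The return series is invented for illustration and has nothing to do with the cited analysis; the ratios are per-period (not annualized), and the zero risk-free rate and zero target return are simplifying assumptions.

```python
import math
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the volatility of excess returns."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)

def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by downside deviation only."""
    excess = [r - target for r in returns]
    # Only returns below the target count as risk.
    downside = math.sqrt(mean([min(0.0, e) ** 2 for e in excess]))
    return mean(excess) / downside

# Hypothetical monthly returns for a strategy.
strategy_returns = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04]

sharpe = sharpe_ratio(strategy_returns)
sortino = sortino_ratio(strategy_returns)

print(f"Sharpe ratio:  {sharpe:.2f}")
print(f"Sortino ratio: {sortino:.2f}")
```

The Sharpe ratio penalizes all volatility, while the Sortino ratio penalizes only losses, so a strategy whose swings are mostly gains scores higher on Sortino than on Sharpe.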

DeepSeek-VL: Expanding into Vision-Language

Beyond its language models, DeepSeek is also exploring vision-language (VL) capabilities with DeepSeek-VL [11]. This model is designed for real-world vision and language understanding applications, with the ability to process logical diagrams, web pages, formulas, scientific literature, and natural images [11]. DeepSeek-VL showcases the company's commitment to advancing AI research across multiple modalities.

Open-Source and Cost-Effective: AI For All

One of the most significant aspects of DeepSeek's models is their open-source nature. This allows developers and researchers to freely access, modify, and deploy the models, fostering collaboration and innovation within the AI community [4]. This open approach contrasts with the proprietary models of many US-based companies and has the potential to democratize access to advanced AI technologies [12].

DeepSeek has achieved these results with significantly lower development costs. DeepSeek-V3 was reportedly trained in around 55 days at a cost of US$5.58 million, using considerably fewer resources than its peers [1]. This cost-effectiveness challenges the existing paradigm in the AI industry, where high performance has typically been associated with high costs [4].
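A back-of-the-envelope calculation shows what these reported figures imply. The sketch below takes the US$5.58 million and 55 days from the paragraph above; the rental rate of US$2 per GPU-hour is an assumption introduced here for illustration, so the derived GPU-hour and cluster-size numbers are rough estimates, not reported facts.

```python
# Figures reported above.
TRAINING_COST_USD = 5_580_000
TRAINING_DAYS = 55

# Assumption for illustration: rental price per GPU-hour in USD.
ASSUMED_USD_PER_GPU_HOUR = 2.0

# Total GPU time the budget would buy at the assumed rate.
gpu_hours = TRAINING_COST_USD / ASSUMED_USD_PER_GPU_HOUR

# Wall-clock hours in the reported training window.
wall_clock_hours = TRAINING_DAYS * 24

# Number of GPUs that would need to run continuously for the whole period.
implied_gpus = gpu_hours / wall_clock_hours

# Average spend per day of training.
cost_per_day = TRAINING_COST_USD / TRAINING_DAYS

print(f"GPU-hours at the assumed rate: {gpu_hours:,.0f}")
print(f"Implied cluster size: about {implied_gpus:,.0f} GPUs")
print(f"Average cost per day: ${cost_per_day:,.0f}")
```

Under these assumptions the budget corresponds to a cluster on the order of two thousand GPUs running for the full period, which gives a sense of scale for the "fewer resources" claim.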

Implications for the US-China AI Race

DeepSeek's emergence has raised concerns in the US about China's growing AI capabilities [13]. The US government has implemented restrictions on China's access to advanced AI chips, aiming to curb its progress in AI development [14]. However, DeepSeek has demonstrated that Chinese researchers can develop world-class AI models with limited resources and by leveraging open-source technologies [15]. This raises questions about the effectiveness of these restrictions and their potential to inadvertently spur innovation in China by forcing researchers to focus on efficiency and alternative approaches [14].

DeepSeek's success has been attributed to several factors, including its efficient MoE architecture, innovative training methods, and a focus on maximizing resource utilization [16]. The company's ability to develop high-performing models at a fraction of the cost of its US counterparts has put pressure on companies like Meta and OpenAI to rethink their AI strategies [17].

Expert Opinions and Analysis

Experts in the AI field have recognized DeepSeek's significant contributions to AI development. They highlight the models' impressive performance, cost-effectiveness, and open-source nature as key factors that could reshape the AI landscape [7]. Some experts suggest that DeepSeek's approach could lead to a democratization of AI, making advanced AI capabilities more accessible to a wider range of developers and researchers [12]. Others emphasize the potential for DeepSeek to accelerate innovation and competition in the AI industry, potentially leading to breakthroughs in various fields [18].

Use Cases and Applications

DeepSeek's models have shown potential in a variety of applications. DeepSeek-R1, for example, has been used to run complex reasoning tasks on smartphones, generate code for rotating objects with collision detection, and even build a clone of the AI-powered conversational search engine Perplexity AI [19]. The models have also shown promise in areas like software development, business operations, and education [3].

Synthesis

DeepSeek's emergence as a major player in the AI landscape has significant implications for the future of AI development. Its open-source approach, combined with its high-performing and cost-effective models, challenges the dominance of US-based companies and has the potential to democratize access to advanced AI technologies. This could foster a more diverse and inclusive AI ecosystem, with wider participation from developers and researchers worldwide.

DeepSeek's success also highlights the growing capabilities of Chinese AI research and the potential for open-source technologies to disrupt the AI industry. As the AI race intensifies, DeepSeek's innovative approach and commitment to accessibility could shape the future of AI development and its impact on the global technology landscape.

Works cited

1. DeepSeek - Wikipedia, accessed January 24, 2025, https://en.wikipedia.org/wiki/DeepSeek

2. DeepSeek's New Open Source AI Model - Perplexity, accessed January 24, 2025, https://www.perplexity.ai/page/deepseek-s-new-open-source-ai-YwAwjp_IQKiAJ2l1qFhN9g

3. DeepSeek: Everything you need to know about this new LLM in one place - Daily.dev, accessed January 24, 2025, https://daily.dev/blog/deepseek-everything-you-need-to-know-about-this-new-llm-in-one-place

4. The Open Source Revolution in AI: DeepSeek's Challenge to the Status Quo, accessed January 24, 2025, https://c3.unu.edu/blog/the-open-source-revolution-in-ai-deepseeks-challenge-to-the-status-quo

5. deepseek-ai/DeepSeek-R1 - Hugging Face, accessed January 24, 2025, https://huggingface.co/deepseek-ai/DeepSeek-R1

6. How China's new AI model DeepSeek is threatening U.S. dominance - NBC10 Philadelphia, accessed January 24, 2025, https://www.nbcphiladelphia.com/news/business/money-report/how-chinas-new-ai-model-deepseek-is-threatening-u-s-dominance/4087759/

7. DeepSeek R1 vs OpenAI o1: Installation, Features, Pricing - Cody, accessed January 24, 2025, https://meetcody.ai/blog/deepseek-r1-open-source-installation-features-pricing/

8. deepseek-ai/DeepSeek-R1 - Demo - DeepInfra, accessed January 24, 2025, https://deepinfra.com/deepseek-ai/DeepSeek-R1

9. DeepSeek unveils DeepSeek-R1, a reasoning model that beats OpenAI-o1 | Technology News - The Indian Express, accessed January 24, 2025, https://indianexpress.com/article/technology/artificial-intelligence/deepseek-r1-a-reasoning-model-that-beats-openai-o1-9791318/

10. The Chinese AI model DeepSeek-R1 is as good as OpenAI o1 if you don't ask it about Tiananmen Square - Mezha.Media, accessed January 24, 2025, https://mezha.media/en/2025/01/24/the-chinese-ai-model-deepseek-r1-is-as-good-as-openai-o1-if-you-don-t-ask-it-about-tiananmen-square/

11. DeepSeek-VL: Towards Real-World Vision-Language Understanding - GitHub, accessed January 24, 2025, https://github.com/deepseek-ai/DeepSeek-VL

12. DeepSeek R1 vs OpenAI o1 : The AI Underdog That's Eating OpenAI's Lunch - Medium, accessed January 24, 2025, https://medium.com/@cognidownunder/deepseek-r1-vs-openai-o1-the-ai-underdog-thats-eating-openai-s-lunch-7cb72eac8458

13. A Chinese startup is sparking concern over US's AI dominance - IO+, accessed January 24, 2025, https://ioplus.nl/en/posts/a-chinese-startup-is-sparking-concern-over-uss-ai-dominance

14. US May Be Losing Edge to China in the AI Race. Here's Why | Vantage with Palki Sharma, accessed January 24, 2025, https://www.youtube.com/watch?v=y52ITdDPbyo

15. This Chinese AI Startup is giving tough competition to Google, OpenAI, other Silicon Valley giants - The Economic Times, accessed January 24, 2025, https://m.economictimes.com/news/international/us/this-chinese-ai-startup-is-giving-tough-competition-to-google-openai-other-silicon-valley-giants/articleshow/117527183.cms

16. DeepSeek: Bridging Performance and Efficiency in Modern AI | by Nandini Lokesh Reddy, accessed January 24, 2025, https://medium.com/@nandinilreddy/deepseek-bridging-performance-and-efficiency-in-modern-ai-106181a85693

17. Meta AI in panic mode as free open-source DeepSeek gains traction and outperforms for far less - Tech Startups, accessed January 24, 2025, https://techstartups.com/2025/01/24/meta-ai-in-panic-mode-as-free-open-source-deepseek-outperforms-at-a-fraction-of-the-cost/

18. The DeepSeek Revolution: How a Chinese Start-Up Is Reshaping the AI Landscape, accessed January 24, 2025, https://www.1950.ai/post/he-deepseek-revolution-how-a-chinese-start-up-is-reshaping-the-ai-landscape

19. DeepSeek-R1 is taking the AI community by storm: 6 wild use cases - The Indian Express, accessed January 24, 2025, https://indianexpress.com/article/technology/artificial-intelligence/deepseek-r1-is-taking-the-ai-community-by-storm-some-wild-use-cases-9795163/

