DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch
Key Takeaways
- Chinese AI startup DeepSeek has launched DeepSeek-V3, an ultra-large AI model with 671 billion total parameters.
- Released on Hugging Face under DeepSeek’s own license, the model uses a mixture-of-experts (MoE) architecture that activates only a subset of its parameters for each token (about 37 billion), which lowers the compute needed per token while preserving accuracy.
- On reported benchmarks, DeepSeek-V3 outperforms leading open-source models such as Meta’s Llama 3.1-405B and performs on par with proprietary models from OpenAI and Anthropic.
- The release underscores the narrowing gap between open-source and closed AI models; DeepSeek’s stated aim is to drive progress toward artificial general intelligence (AGI).
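
To make the "selectively activating parameters" idea concrete, here is a minimal sketch of generic top-k mixture-of-experts routing in plain Python. It is an illustration only and is not DeepSeek-V3's actual routing scheme: the toy experts, the router weights, the sizes, and the names `make_expert` and `moe_forward` are all assumptions made for this example.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies its input by a fixed scale."""
    def expert(x):
        return [scale * v for v in x]
    return expert


def moe_forward(x, router_weights, experts, top_k=2):
    """Route one token through only the top_k highest-scoring experts."""
    # One router score per expert: dot product of its weight row with the token.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)

    # Keep the top_k experts; every other expert is skipped for this token.
    ranked = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]

    # Renormalize the gate values over the chosen experts and mix their outputs.
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        gate = probs[i] / norm
        out = [o + gate * v for o, v in zip(out, experts[i](x))]
    return out, chosen


random.seed(0)
num_experts, dim = 8, 4  # toy sizes chosen for illustration
experts = [make_expert(scale=i + 1) for i in range(num_experts)]
router_weights = [
    [random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)
]

token = [0.5, -1.0, 0.25, 2.0]
output, active = moe_forward(token, router_weights, experts, top_k=2)
print(f"active experts: {active} ({len(active)} of {num_experts})")
```

Only the selected experts run for a given token, so the compute per token depends on `top_k` rather than on the total number of experts. That is the general reason an MoE model can have a very large total parameter count while using a much smaller active set per token.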