Llama 3.1 405B, The Memphis Supercluster, ChatQA-2
Llama 3.1 With 405B Parameters Is Out, Elon Musk’s xAI Unveils World’s Most Powerful AI Training Cluster, & More...
🔜 Upcoming AI Events
July 26-28, 2024 6th International Conference on Artificial Intelligence and Computer Science (AICS 2024), Wuhan, China. Register Now
July 30-31, 2024 Fortune Brainstorm AI, Singapore. Register Now
🌐 Top AI Highlights
Llama 3.1 With 405B Parameters Is Out!

Meta has announced the release of Llama 3.1, its latest and most advanced AI model, available in three versions, including a 405-billion-parameter model. The release underscores Meta’s commitment to open-source AI, making Llama 3.1 freely accessible to everyone. Meta partnered with Nvidia, whose GPUs powered the model’s training. Unlike many other AI companies, Meta is distributing the model through tech giants like AWS, Google Cloud, and Microsoft Azure rather than monetizing it directly.
The primary goal of making Llama open source is to attract top talent and reduce computing costs while fostering a community of developers who enhance Meta’s tools. The release coincided with a conference featuring Meta CEO Mark Zuckerberg and Nvidia CEO Jensen Huang, highlighting the deepening partnership between the two companies. The Llama 3.1 family spans models from 8B to 405B parameters, covering applications from complex reasoning tasks to chatbots and coding assistants. U.S.-based WhatsApp users and visitors to Meta.AI can interact with Meta’s digital assistant powered by Llama 3.1.
Meta is also prioritizing AI safety through collaborations with organizations like NIST and ML Commons, conducting thorough risk assessments, sharing resources, and developing tools like Llama Guard 3 and Prompt Guard to detect and mitigate risks. While the 405B model is too large to run on regular computers, Llama 3.1 is supported by cloud providers like Databricks, Groq, AWS, and Google Cloud, making it accessible for developers to run custom versions.
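For developers who want to try the family locally, the smaller variants fit on a single high-end GPU. Here is a minimal sketch using the Hugging Face transformers chat pipeline; the repo id and settings are assumptions based on Meta’s published checkpoints, not details from the announcement.

```python
# Minimal sketch: chatting with a smaller Llama 3.1 model locally.
# Assumes transformers >= 4.43 and access to the gated meta-llama repo.
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed repo id
    device_map="auto",   # place layers on available GPUs
    torch_dtype="auto",  # use bf16/fp16 where supported
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize Llama 3.1 in one sentence."},
]
out = chat(messages, max_new_tokens=128)
# The pipeline returns the full conversation; the last turn is the reply.
print(out[0]["generated_text"][-1]["content"])
```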
Elon Musk’s xAI Unveils World’s Most Powerful AI Training Cluster: The Memphis Supercluster

Elon Musk’s AI startup, xAI, has launched the “Memphis Supercluster” in Memphis, Tennessee, which it claims is the world’s most powerful AI training cluster. This collaboration between xAI, X (formerly Twitter), and Nvidia features 100,000 liquid-cooled Nvidia H100 GPUs interconnected via a single RDMA fabric. The setup aims to train the world’s most powerful AI model, with completion expected by December 2024. The project, costing between $3 billion and $4 billion, represents the largest capital investment by a new-to-market company in Memphis.
The Memphis Supercluster, which draws up to 150 megawatts of electricity, has faced concerns from local residents over its energy and water usage. Nevertheless, the project marks a significant economic milestone for Memphis and showcases the collaborative efforts of xAI, X, and Nvidia, with hardware support from Supermicro. The supercluster began training at 4:20 a.m. local time, underscoring the urgency and ambition behind the initiative.
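As a rough sanity check on that figure, the GPUs alone account for roughly half of the quoted draw. This back-of-envelope estimate uses the H100’s rated board power; the overhead split is an assumption, not a disclosed number.

```python
# Back-of-envelope: how much of the 150 MW do the GPUs themselves use?
gpus = 100_000
tdp_watts = 700  # H100 SXM rated board power
gpu_draw_mw = gpus * tdp_watts / 1e6
print(f"GPU draw alone: {gpu_draw_mw:.0f} MW")  # 70 MW
# The remainder covers CPUs, networking, storage, and cooling (assumed).
print(f"Left for everything else: {150 - gpu_draw_mw:.0f} MW")
```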
The supercluster positions xAI ahead in the AI development race, significantly outscaling other powerful supercomputers like Frontier and Aurora. Despite the challenges, its advanced infrastructure promises substantial advances in AI capabilities, marking a pivotal moment for an industry it aims to redefine.
😍 Enjoying so far, share it with your friends!
NVIDIA’s New Model ChatQA-2 Rivals GPT-4 in Long Context and RAG Tasks

NVIDIA has developed Llama3-ChatQA-2-70B, a large language model built on Meta’s Llama3 that rivals GPT-4-Turbo in handling contexts up to 128,000 tokens and excels in retrieval-augmented generation (RAG) tasks. It delivers strong performance across benchmarks on both long-context and short-context tasks, and handles medium-length tasks up to 32,000 tokens effectively. The context window was extended from 8,000 to 128,000 tokens through continued pre-training on SlimPajama data followed by a three-stage instruction-tuning process, enabling the model to outperform many state-of-the-art models, including GPT-4-Turbo, on specific benchmarks.
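The paper has its own exact recipe; the sketch below only illustrates the general pattern behind this kind of context extension: raising the RoPE base frequency and position limit in a transformers config before continued pre-training on long sequences. The base model id and numeric values here are illustrative assumptions, not NVIDIA’s settings.

```python
# Sketch of a RoPE-based context-window extension setup (illustrative only).
from transformers import AutoConfig, AutoModelForCausalLM

base = "meta-llama/Meta-Llama-3-70B"  # assumed starting checkpoint
config = AutoConfig.from_pretrained(base)
config.rope_theta = 5_000_000          # illustrative: larger base slows rotation
config.max_position_embeddings = 131_072  # target ~128K context

model = AutoModelForCausalLM.from_pretrained(
    base, config=config, torch_dtype="auto"
)
# Continued pre-training on long SlimPajama sequences would follow here.
```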
The researchers' innovative approach in extending the context window and employing advanced training techniques allows Llama3-ChatQA-2-70B to handle extensive and fragmented contexts, crucial for complex AI tasks. The model integrates a state-of-the-art long-context retriever to mitigate context fragmentation issues, significantly enhancing its performance in long-context understanding tasks. Extensive evaluations show that ChatQA 2 achieves comparable or superior accuracy to proprietary models, particularly excelling in RAG tasks and demonstrating the benefits of combining long-context solutions with advanced retrievers.
This development marks a significant advancement for open-source language models, closing the performance gap with proprietary models like GPT-4. The study provides detailed technical recipes and evaluation benchmarks for reproducibility, highlighting the potential of open-access models to achieve state-of-the-art performance in sophisticated AI tasks. The ongoing progress in making high-performing LLMs more accessible and versatile for various applications is a key takeaway from this research.
🚀 Tech Glimpse of the Week
AI start-up Cohere raises $500mn as it seeks to take on OpenAI
AI startup Cohere has raised $500 million in a funding round led by Inovia Capital, with contributions from Nvidia, Oracle, and Salesforce. This funding will help Cohere compete with OpenAI by expanding its natural language processing tools for enterprises.
Cybersecurity firm Wiz calls off $23 bln Google takeover
Wiz has canceled its $23 billion acquisition deal with Alphabet. The acquisition aimed to strengthen Google's cloud security against Microsoft and Amazon. Wiz, valued at $12 billion, has grown rapidly since its founding in 2020. Despite its collapse, the deal underscores the rising trend of cybersecurity mergers and acquisitions.
Nvidia preparing version of new flagship AI chip for Chinese market
Nvidia is preparing modified versions of its AI chips, including the H20, L20, and L2, for the Chinese market to comply with U.S. export restrictions. These chips have reduced computing power to meet regulatory requirements and are expected to be mass-produced by the second quarter of 2024.
Lanchi Ventures Expands to Hong Kong, Bridging Chinese AI Startups Globally
Lanchi Ventures, formerly BlueRun China, is expanding to Hong Kong to tap into its financial resources and talent pool. The venture firm aims to connect Chinese AI startups with global markets, focusing on early-stage investments in AI, 3D interactive tech, and robotics. Despite geopolitical tensions, Lanchi encourages Chinese firms to adopt a global perspective and build world-class products.
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
The paper introduces SlowFast-LLaVA (SF-LLaVA), a training-free video large language model (LLM) that captures detailed spatial semantics and long-range temporal context effectively. It uses a two-stream SlowFast design: a Slow pathway processes frames at a low rate while preserving detailed spatial features, and a Fast pathway processes frames at a high rate to focus on motion cues. Experimental results show that SF-LLaVA outperforms existing training-free methods on various video tasks and sometimes matches state-of-the-art fine-tuned models.
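A schematic sketch of that two-stream token design in PyTorch; the frame counts, pooling sizes, and feature shapes below are illustrative assumptions rather than the paper's exact configuration.

```python
# Schematic SlowFast token construction for a frozen video LLM.
import torch
import torch.nn.functional as F

def slowfast_tokens(frame_feats: torch.Tensor) -> torch.Tensor:
    # frame_feats: [T, H, W, D] patch features for T uniformly sampled frames
    T, H, W, D = frame_feats.shape

    # Slow pathway: few frames, full spatial resolution (spatial detail).
    slow = frame_feats[:: max(T // 8, 1)]           # ~8 frames
    slow = slow.reshape(-1, D)                      # [8*H*W, D]

    # Fast pathway: all frames, heavy spatial pooling (motion cues).
    fast = frame_feats.permute(0, 3, 1, 2)          # [T, D, H, W]
    fast = F.adaptive_avg_pool2d(fast, (4, 4))      # [T, D, 4, 4]
    fast = fast.permute(0, 2, 3, 1).reshape(-1, D)  # [T*16, D]

    # Concatenate both token streams as visual input to the LLM.
    return torch.cat([slow, fast], dim=0)

tokens = slowfast_tokens(torch.randn(32, 24, 24, 1024))
print(tokens.shape)  # [8*576 + 32*16, 1024] = [5120, 1024]
```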
👥 Connect & Feedback!
👉 Join Us:
📧 Advertise In Weekly AI News:
📧 Contact directly at [email protected]
😍 Share with your friends!
