Showing results for tags 'llama 3'.

Found 2 results

Sort By
- Date
- Relevancy

Building Enterprise GenAI Apps with Meta Llama 3 on Databricks

Databricks posted a topic in Artificial Intelligence

We are excited to partner with Meta to release the latest state-of-the-art large language model, Meta Llama 3 , on Databricks. With Llama... View the full article
- April 18, 2024
- - genai
  - meta
  - (and 3 more)
    Tagged with:
    
    genai
    
    meta
    
    llama
    
    llama 3
    
    databricks
Meta sheds more light on how it is evolving Llama 3 training — it relies for now on almost 50,000 Nvidia H100 GPU, but how long before Meta switches to its own AI chip?

TechRadar posted a topic in Artificial Intelligence

Meta has unveiled details about its AI training infrastructure, revealing that it currently relies on almost 50,000 Nvidia H100 GPUs to train its open source Llama 3 LLM. The company says it will have over 350,000 Nvidia H100 GPUs in service by the end of 2024, and the computing power equivalent to nearly 600,000 H100s when combined with hardware from other sources. The figures were revealed as Meta shared details on its 24,576-GPU data center scale clusters. Meta's own AI chips The company explained “These clusters support our current and next generation AI models, including Llama 3, the successor to Llama 2, our publicly released LLM, as well as AI research and development across GenAI and other areas.“ The clusters are built on Grand Teton (named after the National Park in Wyoming), an in-house-designed, open GPU hardware platform. Grand Teton integrates power, control, compute, and fabric interfaces into a single chassis for better overall performance and scalability. The clusters also feature high-performance network fabrics, enabling them to support larger and more complex models than before. Meta says one cluster uses a remote direct memory access network fabric solution based on the Arista 7800, while the other features an NVIDIA Quantum2 InfiniBand fabric. Both solutions interconnect 400 Gbps endpoints. "The efficiency of the high-performance network fabrics within these clusters, some of the key storage decisions, combined with the 24,576 NVIDIA Tensor Core H100 GPUs in each, allow both cluster versions to support models larger and more complex than that could be supported in the RSC and pave the way for advancements in GenAI product development and AI research," Meta said. Storage is another critical aspect of AI training, and Meta has developed a Linux Filesystem in Userspace backed by a version of its 'Tectonic' distributed storage solution optimized for Flash media. This solution reportedly enables thousands of GPUs to save and load checkpoints in a synchronized fashion, in addition to "providing a flexible and high-throughput exabyte scale storage required for data loading". While the company's current AI infrastructure relies heavily on Nvidia GPUs, it's unclear how long this will continue. As Meta continues to evolve its AI capabilities, it will inevitably focus on developing and producing more of its own hardware. Meta has already announced plans to use its own AI chips, called Artemis, in servers this year, and the company previously revealed it was getting ready to manufacture custom RISC-V silicon. More from TechRadar Pro Meta has done something that will get Nvidia and AMD very, very worriedMeta set to use own AI chips in its servers in 2024Intel has a new rival to Nvidia's uber-popular H100 AI GPU View the full article
- March 21, 2024
- - meta
  - ai
  - (and 4 more)
    Tagged with:
    
    meta
    
    ai
    
    ai chips
    
    nvidia
    
    llama
    
    llama 3

Sign In

Search the Community

Search By Tags

Search By Author

Content Type

Forums

Calendars

Find results in...

Find results that contain...

Date Created

Start

End

Last Updated

Start

End

Filter by number of...

Minimum number of comments

Minimum number of replies

Minimum number of reviews

Minimum number of views

Joined

Start

End

Group

Website URL

LinkedIn Profile URL

About Me

Cloud Platforms

Cloud Experience

Development Experience

Current Role

Skills

Certifications

Favourite Tools

Interests

Building Enterprise GenAI Apps with Meta Llama 3 on Databricks

Meta sheds more light on how it is evolving Llama 3 training — it relies for now on almost 50,000 Nvidia H100 GPU, but how long before Meta switches to its own AI chip?

Forum Statistics