Bing Chat so hungry for GPUs, Microsoft will rent them from Oracle
(www.theregister.com)
This is the best summary I could come up with:
Demand for Microsoft's AI services is apparently so great – or Redmond's resources so tight – that the software giant plans to offload some of the machine-learning models used by Bing Search to Oracle's GPU supercluster as part of a multi-year agreement announced Tuesday.
The partnership essentially boils down to: Microsoft needs more compute resources to keep up with the alleged "explosive growth" of its AI services, and Oracle just happens to have tens of thousands of Nvidia A100s and H100 GPUs available for rent.
Microsoft was among the first to integrate a generative AI chatbot into its search engine with the launch of Bing Chat back in February.
You all know the drill by now: you can feed prompts, requests, or queries into Bing Chat, and it will try to look up information, write bad poetry, generate pictures and other content, and so on.
In this case, Microsoft is using the system alongside its Azure Kubernetes Service to orchestrate Oracle's GPU nodes to keep up with what's said to be demand for Bing's AI features.
Oracle claims its cloud super-clusters, which presumably Bing will use, can each scale to 32,768 Nvidia A100s or 16,384 H100 GPUs using an ultra-low-latency Remote Direct Memory Access (RDMA) network.
The original article contains 580 words, the summary contains 207 words. Saved 64%. I'm a bot and I'm open source!