The chaotic, at times comical unraveling of OpenAI, the much-watched generative AI venture, began like an unexpectedly intense family argument arriving early for Thanksgiving. If it has reached any resolution, that seems to owe much to a firm but friendly intervention from Microsoft, playing the responsible adult in the room. However the remaining twists and turns play out, Microsoft stepping in to keep OpenAI’s technology stable, if not the company itself, looks less like a choice than a necessity.
More Than Money
Microsoft’s recent $10 billion investment in OpenAI wasn’t a trivial sum, even if it was partly paid for by extensive layoffs that tarnished the impressive cultural transformation CEO Satya Nadella has led at the company. But the investment has quickly become a funding Ouroboros: a significant portion of the money Microsoft poured into OpenAI appears to have gone straight back into Azure cloud computing to run OpenAI’s large language models.
Set aside the ambitious but distant plans for an AGI that may or may not ever arrive. Microsoft, keen to rebrand itself as “the AI company”, and particularly “the Copilot company”, rather than the traditional “Windows company”, effectively gets the technology underpinning ChatGPT for roughly half of what it paid for Nuance in 2021, or a little less than the (inflation-adjusted) $7.5 billion it paid for GitHub in 2018. Not all of the investment went on cloud computing, but for scale, Microsoft’s capital expenditure for Q1 2023 alone was $7.8 billion.
Even with its own substantial team of AI researchers and a large collection of foundation models, Microsoft is heavily committed to the OpenAI LLMs behind ChatGPT, both because of the hardware and software investments it has made to support them and because dependencies on OpenAI’s technology now run through almost every division and product line.
Nadella’s opening keynote at the Ignite conference was peppered with mentions of OpenAI, including a preview of the GPT-4 Turbo models, and Microsoft’s own products are just as thoroughly infused with OpenAI technology, with Copilots a central component across its various offerings.
Optimizing the Cost Efficiency of Foundation Models
Training LLMs and other foundation models demands enormous amounts of data, time and computational resources, so Microsoft treats them as platforms: build a small number of models once, then reuse them in progressively more tailored and specialized ways.
For the past five years, Microsoft has been building the framework for generating Copilots, reworking everything from basic infrastructure and data center design (it brought a new data center online every three days in 2023) to the development environment its own software teams use to work more efficiently.
Starting with GitHub Copilot, nearly every Microsoft product line now includes one or more Copilot features. That goes well beyond generative AI for consumers and office workers: alongside Microsoft 365 Copilot, Windows Copilot, and Copilots in Teams and Dynamics, there’s the newly renamed Bing Chat and GPT-powered tools in Power BI, plus Copilots everywhere from security products like Microsoft 365 Defender to Azure infrastructure, Microsoft Fabric and Azure Quantum Elements.
Microsoft’s customers are also building their own custom Copilots on the same framework; Nadella highlighted examples including Airbnb, BT, Nvidia and Chevron. The new Copilot Studio is a low-code tool for building those custom Copilots using business data, along with Copilot plugins for popular tools such as JIRA, SAP, ServiceNow and Trello. All of this puts OpenAI technology on track to be essentially ubiquitous.
To make this possible, Microsoft has built an internal pipeline that takes new foundation models from OpenAI, tries them out in smaller services like the Power Platform and Bing, and uses what it learns to fold them into more specialized AI services for developers. The company has standardized on Semantic Kernel and Prompt flow for orchestrating AI services from conventional programming languages like Python and C#, and has wrapped a friendlier developer interface around them in the new Azure AI Studio tool. These tools let developers build and understand LLM-powered applications without delving into the intricacies of the LLMs themselves, but their effectiveness depends on Microsoft’s proficiency with the underlying OpenAI models.
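As a rough illustration of what that orchestration layer looks like in practice, here is a minimal sketch using the Semantic Kernel Python package (the pre-1.0 API current at the time of writing); the deployment name, endpoint and key are hypothetical placeholders:

```python
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

kernel = sk.Kernel()

# Register an Azure OpenAI deployment as the kernel's chat service.
# Deployment name, endpoint and key below are placeholders.
kernel.add_chat_service(
    "chat",
    AzureChatCompletion(
        deployment_name="my-gpt-4-deployment",
        endpoint="https://my-resource.openai.azure.com/",
        api_key="AZURE_OPENAI_KEY_GOES_HERE",
    ),
)

# A "semantic function" is just a prompt template the kernel can run
# against whichever model is plugged in behind it.
summarize = kernel.create_semantic_function(
    "{{$input}}\n\nSummarize the text above in one sentence."
)

print(summarize("Semantic Kernel lets Python and C# code orchestrate LLM calls."))
```

Even in a toy example, the point of the abstraction is visible: the application code addresses “a chat service”, and which OpenAI model sits behind it is an operational detail.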
Investing in Hardware Requires Genuine Dedication
Depending on which foundation models it was relying on, Microsoft would likely still have made substantial investments in the Nvidia and AMD GPUs that power OpenAI’s technology, in the high-bandwidth InfiniBand networking that connects the nodes, and in last year’s acquisition of Lumenisity for hollow core fiber (HCF) manufacturing.
Microsoft is open about how much of this work has been done in collaboration with OpenAI, from the Nvidia-powered Azure AI supercomputers that place in the Top500 rankings to refinements to the new Maia 100. And importantly, Microsoft doesn’t just build Azure supercomputers for OpenAI: they’re a public demonstration of the infrastructure available to other customers, and of the extensive set of services running on it, which covers nearly every product and service in Microsoft’s portfolio.
But previously, its main approach to AI acceleration was FPGAs, because of the flexibility they offer: the same hardware that initially sped up Azure networking became an accelerator for real-time AI inferencing in Bing search, and then a service developers could use to scale out their own deep neural networks on AKS. As new AI models and approaches appeared, Microsoft could reprogram its FPGAs into soft custom processors to accelerate them, far faster than designing new hardware accelerators that would quickly become obsolete.
With FPGAs, Microsoft didn’t have to commit to the system architecture, data types or operators it guessed AI would need for the next couple of years: it could keep changing its soft accelerators whenever it needed to, and the functionality of an FPGA circuit can even be reloaded partway through a job.
But last week, Microsoft announced the first generation of its own custom silicon: the Azure Maia AI Accelerator, complete with a custom liquid cooling system and rack design, built specifically for “large language model training and inferencing” and destined to run OpenAI models for Bing, GitHub Copilot, ChatGPT and the Azure OpenAI Service. That’s a major investment that should significantly reduce the cost (and water use) of both training and running OpenAI models: savings that only materialize if training and running OpenAI models remains a major workload.
In essence, Microsoft has built a specialized hardware accelerator for OpenAI, slated to appear in its data centers next year, with further designs already in the pipeline. Upheaval at the close partner it was built for could hardly come at a worse time.
Keeping the Wheels Turning
Whatever interest Microsoft may have expressed over the years, the original idea was never to acquire OpenAI: Microsoft deliberately chose to work with an external team, so that the AI training and inference platform it was building would serve a broad range of needs rather than being designed solely around Microsoft’s own requirements.
However, as OpenAI’s models consistently outpaced the competition, Microsoft bet on them ever more heavily. A year after launch, ChatGPT has 100 million weekly users, and demand has been heavy enough that OpenAI had to temporarily pause ChatGPT Plus signups because it couldn’t keep up with capacity. That figure doesn’t even include OpenAI usage by Microsoft’s own customers.
Whether you use ChatGPT directly from OpenAI or an OpenAI model built into a Microsoft product, the infrastructure underneath is Azure; the line between what Microsoft classifies as a ‘first-party service’ (built in-house) and a ‘third-party service’ (from an external supplier) has become increasingly blurred.
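That blurring is visible even at the API level. The sketch below uses the openai Python package (v1) and is illustrative only, with a hypothetical endpoint and deployment name; it shows how nearly identical the two routes to the same models are:

```python
from openai import OpenAI, AzureOpenAI

# Calling OpenAI directly...
openai_client = OpenAI(api_key="sk-...")

# ...or calling the same model family through the Azure OpenAI Service.
# The resource endpoint below is a placeholder.
azure_client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com/",
    api_key="AZURE_OPENAI_KEY_GOES_HERE",
    api_version="2023-05-15",
)

messages = [{"role": "user", "content": "Whose GPUs does this request land on?"}]

# Identical call shape; only the client construction differs. For Azure,
# "model" names a deployment rather than the model itself.
r1 = openai_client.chat.completions.create(model="gpt-4", messages=messages)
r2 = azure_client.chat.completions.create(model="my-gpt-4-deployment", messages=messages)

print(r1.choices[0].message.content)
print(r2.choices[0].message.content)
```

Either way the request lands on Azure hardware; what differs is little more than the billing relationship and the client constructor.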
In theory, Microsoft could change its mind and move to a different foundation model, especially since most of the major players’ foundation models already run on Azure. But switching horses midstream is complex and costly, risking significant delays and real damage to the company’s reputation and stock price. It would be far more prudent to make sure OpenAI’s technology survives and thrives, whatever happens to the OpenAI company.
While OpenAI’s developer relations team has been reassuring customers that services will continue, reports suggest OpenAI customers are already evaluating alternatives like Anthropic and Google, and that includes Azure OpenAI customers Microsoft would prefer not to lose. LangChain, a startup whose orchestration framework integrates deeply with the Azure OpenAI Service, has been walking developers through how much prompt engineering has to change when moving to a different large language model (LLM), and most of its current examples are written for OpenAI models.
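To make that switching cost concrete, here is an illustrative sketch in the style of LangChain’s late-2023 Python API; the deployment name is a placeholder and credentials are assumed to come from the usual environment variables. Changing providers is a one-line edit, but the prompt that line feeds usually isn’t portable:

```python
from langchain.chat_models import AzureChatOpenAI, ChatAnthropic
from langchain.prompts import ChatPromptTemplate

# A prompt tuned against one model family rarely transfers unchanged:
# system-message handling, formatting quirks and few-shot examples
# all tend to need reworking for a different LLM.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a terse release-notes writer."),
    ("human", "Summarize these commits:\n{commits}"),
])

# Credentials and API version are read from environment variables;
# the deployment name is a placeholder.
llm = AzureChatOpenAI(deployment_name="my-gpt-4-deployment")
# llm = ChatAnthropic(model="claude-2")  # the one-line provider swap

chain = prompt | llm  # LangChain Expression Language pipeline
print(chain.invoke({"commits": "fix: retry on 429s\nfeat: add token streaming"}).content)
```

The model object swaps cleanly; it’s everything encoded in the prompt, and in expectations about the model’s behavior, that has to be re-engineered and re-tested.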
The OpenAI Dependency
If the conversations happening inside Microsoft, across virtually every division and product line, mirror the ones happening outside it, bringing as much OpenAI expertise in-house as possible is crucial for smoothing whatever transition might be needed should OpenAI fragment or decline.
While Microsoft holds a broad perpetual license to all OpenAI intellectual property up to the point of AGI (if that ever arrives), generative AI is evolving too quickly for merely maintaining the current models to be enough: Microsoft is counting on future LLMs like GPT-5, and that requires a working partnership with OpenAI.
Despite its name, OpenAI has never been primarily an open source organization; it has released little, and none of its core LLMs. There is a parallel with Microsoft’s own gradual embrace of open source, where the pivotal moments came not just from releasing its own core projects but from taking dependencies on outside open source projects like Docker and Kubernetes in Windows Server and Azure.
The dependency Microsoft has taken on OpenAI is even more significant, and it comes with obvious challenges around stability and governance. One way or another, Microsoft is committed to making sure that what it needs from OpenAI survives.