OpenAI, the San Francisco-headquartered developer behind ChatGPT, a generative artificial intelligence (AI) chatbot, relies on Microsoft Azure’s cloud computing services. To fuel the compute, storage, database, and networking requirements of ChatGPT and OpenAI’s other key generative AI products, Microsoft has made a multi-year, $10 billion investment into OpenAI, in exchange for a reported 49% stake in the company at a $29 billion whole-company valuation. Microsoft’s multi-billion dollar investment will help finance the immense cloud infrastructure needs of OpenAI to train and run its various models on the Microsoft Azure cloud platform.
OpenAI and ChatGPT use Microsoft Azure’s cloud infrastructure to deliver the performance and scale necessary to run their artificial intelligence (AI) training and inference workloads. High-performance computing (HPC), data storage, and global availability are foundational to ChatGPT’s systems.
Dgtl Infra provides an in-depth overview of the cloud infrastructure, delivered by Microsoft Azure, that supports OpenAI and its artificial intelligence-based large language model (LLM) product ChatGPT. Additionally, we highlight Azure OpenAI – Microsoft’s cloud artificial intelligence (AI) service, as well as Microsoft’s current and future OpenAI integrations into its consumer and enterprise products.
ChatGPT and OpenAI’s use of Microsoft Azure’s Cloud
Microsoft Azure is the exclusive cloud provider to OpenAI and, in turn, supports the company’s artificial intelligence-based large language model (LLM) product ChatGPT. Through its compute, storage, database, and networking resources, Microsoft Azure powers all OpenAI workloads across research, products, and application programming interface (API) services. Most notably, OpenAI’s family of models include GPT-3 for human-like language generation, Codex for code generation in many programming languages, and Dall-E 2 for realistic image generation and editing:
- GPT-3: allows users to generate human-like text, perform language translation, summarize text, and more, using an AI-based natural language processing (NLP) model. ChatGPT is a smaller version of GPT-3, with fewer parameters in its neural network
- Codex: a system that generates computer code for software developers by suggesting lines of code and entire functions in real-time. Codex, as the underlying AI model, has been turned into a product called GitHub Copilot, which can generate code snippets. Notably, GitHub is another Microsoft-owned service primarily used by developers
- Dall-E 2: an image generating system that allows users to produce art from strings of text. Dall-E 2 is used by Microsoft Designer, a new graphic design tool, for creating elements such as pictures, icons, illustrations, and infographics
All of these OpenAI models, and more, can be directly accessed and used through Azure OpenAI, a service provided by Microsoft – which is described in greater detail in the next section.
Overall, Microsoft Azure and its cloud computing architecture deliver the performance and scale required by OpenAI’s artificial intelligence (AI) training and inference workloads through high-performance computing (HPC), data storage and processing, global availability, elasticity, and cost-effectiveness:
High-Performance Computing (HPC)
ChatGPT consumes enormous amounts of computing power, driven by both the creation and operation of its artificial intelligence (AI) systems. Additionally, the language model performs a large number of input/output (I/O) operations, such as reading or writing data to storage devices and exchanging information between devices over a communications network. In order to avoid the significant capital investments that would otherwise be necessary to build its own data centers and networking capabilities, OpenAI – ChatGPT’s developer – has partnered with Microsoft Azure.
OpenAI has used Microsoft Azure to build and deploy multiple AI “supercomputing systems” at massive scale, which OpenAI employs to train all of its models. Said differently, these are high-performance computing (HPC) systems that use exascale supercomputers specifically designed and optimized for running large-scale AI training and inference workloads, such as deep learning, machine learning (ML), and natural language processing (NLP).
Typically, HPC systems use a combination of powerful processors, high-speed memory, and specialized hardware, such as graphics processing units (GPUs), to accelerate the processing of large amounts of data.
As a large language model (LLM), ChatGPT was trained through deep learning, involving the use of neural networks with many layers, to process and understand its input dataset – which for ChatGPT was over 570 gigabytes of text data. To speed-up this training process, GPUs are often used. As an example, to train the latest version of GPT-3, OpenAI used 175 billion parameters, 16,000 CPU cores, and thousands of NVIDIA V100 GPUs.
Microsoft Azure’s HPC Capabilities for ChatGPT
Microsoft Azure is the only global public cloud service provider (CSP) that offers AI supercomputers with “massive scale-up and scale-out capabilities” – meaning the ability to add more processing power and nodes to the system, respectively. To this end, Microsoft Azure’s Voyager-EUS2 supercomputer currently ranks as #14 out of the TOP500 supercomputers worldwide and is the highest-ranked among global cloud service providers.
Voyager-EUS2: located in Microsoft Azure’s East US 2 cloud region in Richmond, Virginia, this supercomputer has processing power of 39.531 petaflops – meaning the system can process 10^15 (peta) floating-point operations per second.
With its AI supercomputers, coupled with GPU and networking solutions, Microsoft Azure can uniquely deliver the optimized performance and scale required by OpenAI’s artificial intelligence (AI) training and inference workloads. Below are specific examples of how Microsoft Azure, combined with its supercomputers, differentiate the platform in AI from the other major cloud service providers (i.e., Amazon Web Services and Google Cloud).
AI Model Training – Compute Throughput per GPU
Microsoft Azure provides almost 2x higher compute throughput per GPU, as compared to its cloud service provider competitors, and near-linear scaling to thousands of GPUs, due to its networking and systems software optimization.
AI Model Inferencing (BERT, 99% Offline) – Inference Queries per $
For inferencing, Microsoft Azure is more cost-effective than its cloud service provider competitors, delivering up to 2x the performance, in terms of inference queries per dollar.
Data Storage and Processing
OpenAI requires scaled data storage to store and manage the large dataset of text used to train and run the ChatGPT model, as well as to store the model’s parameters and other model-related information. Common types of data repositories that ChatGPT may use are a database, data warehouse, data lake, or some combination of these, depending on the structured, semi-structured, and unstructured data that the system uses and collects.
Given that Microsoft Azure is the exclusive cloud provider to OpenAI, the following data storage mechanisms could be used by ChatGPT:
- Azure SQL Database: a relational database service, designed to store and manage structured data, like the text related to the model. Alternatively, an open-source relational database management system, such as PostgreSQL, could be installed and managed in the cloud
- Azure Blob Storage: a storage solution for large amounts of unstructured data
- Azure Data Lake Storage: a centralized, scalable, and cost-effective storage solution that stores unstructured, semi-structured, and structured data from multiple diverse sources
OpenAI could use Apache Hadoop for data processing in training its large language models (LLMs), like ChatGPT. Apache Hadoop is an implementation of Hadoop, which itself is an open-source framework for distributed storage and processing of large datasets. For example, Apache Hadoop would allow OpenAI to store ChatGPT’s large dataset of text across many commodity servers in Microsoft Azure’s cloud data centers, as well as enable the parallel processing of that data using a distributed computing model (i.e., where tasks or workloads are split and executed across multiple servers).
OpenAI provides API access to models like ChatGPT in over 155 countries, regions, and territories, making the product accessible to users via an internet connection. However, ChatGPT is not available in more than 40 countries worldwide including Afghanistan, Belarus, Cambodia, Cameroon, China, Cuba, Egypt, Iran, North Korea, Paraguay, Russia, Saudi Arabia, Ukraine, Venezuela, and Vietnam.
ChatGPT’s global availability is based in-part on the availability of Microsoft Azure’s 60+ cloud regions worldwide, which span more than 35 countries. Additionally, Microsoft Azure’s global infrastructure makes use of edge data centers and content delivery networks (CDNs) to deliver ChatGPT’s content and data to users in countries where it does not have a large-scale data center presence.
Collectively, Microsoft Azure’s large-scale and globally distributed data center and networking infrastructure allows for low latency and high-speed access to the ChatGPT model, regardless of the location of the user. In turn, these capabilities improve the performance of ChatGPT’s artificial intelligence (AI) workloads.
Additionally, deploying ChatGPT on Microsoft Azure’s cloud infrastructure ensures that the model is highly available and can handle a large number of requests simultaneously. More specifically, Microsoft Azure accomplishes this by using load balancing techniques to distribute incoming traffic across multiple instances of the model.
To this end, Microsoft Azure can replicate the ChatGPT model across multiple data centers in different geographic regions, so that if one data center goes down, the model can still be accessed from another one. Ultimately, replication helps to ensure that the model is always available, even in the event of a natural disaster occurring where one of its data centers is located.
OpenAI and ChatGPT benefit from elasticity in the cloud by being able to easily and automatically scale up or down compute, storage, database, and networking resources as needed. For example, when there is high demand for the ChatGPT model, such as during a spike in traffic to ChatGPT’s website, Microsoft Azure can automatically provision more resources (e.g., CPU and memory) to the model, to handle the increased load. Conversely, when demand subsides, resources can be deprovisioned to save costs. In turn, elasticity allows for the efficient use of resources and can help maintain control over expenses.
By OpenAI and ChatGPT using Microsoft Azure’s cloud computing services – instead of building-out their own data centers – the company and its model can “rent” the infrastructure, benefitting from savings in operating expenses, through a pay-per-use pricing model, and capital expenditures, through limited upfront investment.
By using Microsoft Azure, OpenAI is able to pay only when ChatGPT consumes compute, storage, database, and networking resources, and pay only for how much ChatGPT consumes. This pricing model can help maintain control over operating expenses, particularly since demand for the ChatGPT model fluctuates daily and even hourly.
Instead of investing capital expenditures into building and maintaining data centers, servers, and graphics processing units (GPUs), OpenAI and ChatGPT can rely on Microsoft Azure’s existing cloud infrastructure. As such, Microsoft Azure can provision the virtual machines (VMs) and GPUs to train and run OpenAI and ChatGPT’s large language models (LLMs). For example, Microsoft Azure offers the NV and NC series VMs, which are equipped with NVIDIA GPUs and are optimized for compute-intensive and GPU-intensive workloads.
Azure OpenAI – Microsoft’s Cloud AI Service
Azure OpenAI is a service provided by Microsoft that allows businesses and developers to directly access and use pre-trained, large-scale generative AI models from OpenAI, such as GPT-3.5, to build their own artificial intelligence (AI) applications. Importantly, Azure OpenAI allows developers to integrate the capabilities of OpenAI models into their own projects and build new applications based on its capabilities – often by leveraging Microsoft Azure’s existing AI solutions like Azure Cognitive Services and Azure Machine Learning. Additionally, Azure OpenAI provides a means for developers to globally scale and deploy OpenAI models in a secure and reliable way.
In January 2023, Microsoft announced the general availability of the Azure OpenAI service, allowing more businesses to apply for access to OpenAI’s API services including GPT-3.5 (an upgraded version of OpenAI’s GPT-3 model), DALL-E 2, and Codex. Through the Azure OpenAI service, businesses will soon also be able to access ChatGPT, a fine-tuned version of GPT-3.5 that has been trained and runs inference on Azure AI infrastructure.
How to Use Azure OpenAI
Azure OpenAI is generally available, but the service currently requires an Azure subscription and registration via application by completing this form. Once access is granted to the Azure OpenAI service, customers can use OpenAI’s models through REST APIs (a set of rules for accessing web-based software), Python SDK, or by creating a resource which will link them to Azure OpenAI Studio, a web-based interface. Using Azure OpenAI Studio, customers can set up a deployment to make API calls against a provided base model (e.g., text-davinci-002) or a custom model.
The Azure OpenAI Studio lets developers experiment with prompts and test their ideas with OpenAI before incorporating them into their code. Once developers are finished experimenting, they can call a particular service from their code and build them into their applications.
Microsoft’s OpenAI Integration in Consumer and Enterprise
Microsoft has already integrated OpenAI’s models into several of its ‘first-party’ consumer and enterprise products, namely Microsoft 365, Dynamics 365, Power Platform, GitHub Copilot, and Microsoft Designer. Ultimately, the company’s goal with these integrations is to improve user experience and enable more efficient and accurate automated tasks.
- Microsoft 365: OpenAI’s models are integrated into Microsoft’s productivity and collaboration tools, such as Office and Microsoft Teams. In Office, OpenAI’s models can be used to assist with writing and formatting tasks. While in Microsoft Teams, the models can help with scheduling and meeting transcription
- Dynamics 365: OpenAI’s models are integrated into Microsoft’s enterprise resource planning (ERP) and customer relationship management (CRM) software to improve the customer service experience. For example, using OpenAI’s natural language understanding technology, Dynamics 365 can comprehend and respond to customer inquiries more efficiently
- Power Platform: OpenAI’s models are integrated into Microsoft’s business automation tool to allow for natural language processing capabilities, such as natural language generation, understanding, and dialog. For example, Power BI, which is Microsoft’s business intelligence and data visualization tool, leverages GPT-3-powered natural language to automatically generate formulas to perform calculations
- GitHub Copilot: OpenAI’s models are integrated into Microsoft’s product for generating code snippets
- Microsoft Designer: OpenAI’s models are integrated into Microsoft’s graphic design tool, for creating elements such as pictures, icons, illustrations, and infographics
Future OpenAI Integrations in Microsoft’s Consumer and Enterprise Products
Microsoft’s Chief Executive Officer, Satya Nadella, has stated that his company plans to incorporate artificial intelligence (AI) tools into all of its ‘first-party’ products and “every layer of the stack” for other businesses to build on. Examples of Microsoft’s products that could, in the future, integrate with OpenAI’s models include Bing, LinkedIn, OneDrive, and Xbox:
- Bing: OpenAI’s models can be used to improve the search engine’s natural language processing (NLP) to better understand search intent, generate more informative search snippets for Bing’s results page, and produce more relevant autocomplete predictions for Bing search queries
- LinkedIn: OpenAI’s models can be used on the professional network platform to automatically summarize resumes of job candidates for recruiters, generate text for job descriptions and posts by companies, and improve the matching capabilities of LinkedIn – suggesting more relevant job opportunities to job seekers and more suitable candidates to recruiters
- OneDrive: OpenAI’s models can be used to improve the file hosting service’s automatic tagging and categorization of files, natural language search capabilities – versus having to remember the exact file name or location, and automatic summarization of the content of documents
- Xbox: OpenAI’s models can be used to improve the video gaming service’s recommendation engine for games on Xbox Game Pass and Xbox Cloud Gaming, in-game dialog and character interactions, and voice recognition capabilities to allow the use voice commands to control a game
“The next big platform wave, as I said, is going to be AI and be strong. We also believe a lot of the enterprise value gets created by just being able to catch these waves and then have those waves impact every part of our tech stack and also create new solutions and new opportunities.” Satya Nadella – Microsoft Corporation – Chairman & CEO – 01/24/2023