Red Hat to Deliver Enhanced AI Inference Across AWS

Red Hat AI on AWS Trainium and Inferentia AI chips to provide customers with greater choice, flexibility and efficiency for production AI workloads

Red Hat, the world's leading provider of open source solutions, today announced an expanded collaboration with Amazon Web Services (AWS) to power enterprise-grade generative AI (gen AI) on AWS with Red Hat AI and AWS AI silicon. With this collaboration, Red Hat focuses on empowering IT decision-makers with the flexibility to run high-performance, efficient AI inference at scale, regardless of the underlying hardware.

The rise of gen AI and the subsequent need for scalable inference are pushing organizations to reevaluate their IT infrastructure. As a result, IDC predicts that “by 2027, 40% of organizations will use custom silicon, including ARM processors or AI/ML-specific chips, to meet rising demands for performance optimization, cost efficiency, and specialized computing.”[1] This underscores the need for optimized solutions that can improve processing power, minimize costs and enable faster innovation cycles for high-performance AI applications.

Red Hat’s collaboration with AWS empowers organizations with a full-stack gen AI strategy by bringing together Red Hat’s comprehensive platform capabilities with AWS cloud infrastructure and AI chipsets, AWS Inferentia2 and AWS Trainium3. Key aspects of the collaboration include:

  • Red Hat AI Inference Server on AWS AI chips: Red Hat AI Inference Server, powered by vLLM, will be enabled to run with AWS AI chips, including AWS Inferentia2 and AWS Trainium3, to deliver a common inference layer that can support any gen AI model. This helps customers achieve higher performance, lower latency and cost-effectiveness for scaling production AI deployments, with up to 30-40% better price performance than current comparable GPU-based Amazon EC2 instances.
  • Enabling AI on Red Hat OpenShift: Red Hat worked with AWS to develop an AWS Neuron operator for Red Hat OpenShift, Red Hat OpenShift AI and Red Hat OpenShift Service on AWS, a comprehensive and fully managed application platform on AWS, providing customers with a more seamless, supported pathway to run their AI workloads with AWS accelerators.
  • Ease of access and deployment: By supporting AWS AI chips, Red Hat will offer enhanced and easier access to high-demand, high-capacity accelerators for Red Hat customers on AWS. In addition, Red Hat recently released the amazon.ai Certified Ansible Collection for Red Hat Ansible Automation Platform to enable orchestrating AI services on AWS.
  • Upstream community contribution: Red Hat and AWS are collaborating to optimize an AWS AI chip plugin upstreamed to vLLM. As the top commercial contributor to vLLM, Red Hat is committed to enabling vLLM on AWS to help accelerate AI inference and training capabilities for users. vLLM is also the foundation of llm-d, an open source project focused on delivering inference at scale and now available as a commercially supported feature in Red Hat OpenShift AI 3.

Red Hat has a long history of collaboration with AWS to enable customers from the datacenter to the edge. This latest milestone now aims to address the evolving needs of organizations as they integrate AI into their hybrid cloud strategies to achieve optimized, efficient gen AI outcomes.

Visit Red Hat at AWS re:Invent 2025 at booth #839 to learn more about Red Hat’s collaboration with AWS.

Availability

The AWS Neuron community operator is now available in the Red Hat OpenShift OperatorHub for customers using Red Hat OpenShift or Red Hat OpenShift Service on AWS. Red Hat AI Inference Server support for AWS AI chips is expected to be available in developer preview in January 2026.

Supporting Quotes

Joe Fernandes, vice president and general manager, AI Business Unit, Red Hat

“By enabling our enterprise-grade Red Hat AI Inference Server, built on the innovative vLLM framework, with AWS AI chips, we're empowering organizations to deploy and scale AI workloads with enhanced efficiency and flexibility. Building on Red Hat's open source heritage, this collaboration aims to make generative AI more accessible and cost-effective across hybrid cloud environments.”

Colin Brace, vice president, Annapurna Labs, AWS

“Enterprises demand solutions that deliver exceptional performance, cost efficiency, and operational choice for mission-critical AI workloads. AWS designed its Trainium and Inferentia chips to make high-performance AI inference and training more accessible and cost-effective. Our collaboration with Red Hat provides customers with a supported path to deploying generative AI at scale, combining the flexibility of open source with AWS infrastructure and purpose-built AI accelerators to accelerate time-to-value from pilot to production.”

Jean-François Gamache, chief information officer and vice president, Digital Services, CAE

“Modernizing our critical applications with Red Hat OpenShift Service on AWS marks a significant milestone in our digital transformation. This platform supports our developers in focusing on high-value initiatives – driving product innovation and accelerating AI integration across our solutions. Red Hat OpenShift provides the flexibility and scalability that enable us to deliver real impact, from actionable insights through live virtual coaching to significantly reducing cycle times for user-reported issues.”

Anurag Agrawal, founder and chief global analyst, Techaisle

“As AI inference costs escalate, enterprises are prioritizing efficiency alongside performance. This collaboration exemplifies Red Hat’s ‘any model, any hardware’ strategy by combining its open hybrid cloud platform with the distinct economic advantages of AWS Trainium and Inferentia. It empowers CIOs to operationalize generative AI at scale, shifting from cost-intensive experimentation to sustainable, governed production.”

[1] IDC FutureScape: Worldwide Cloud 2025 Predictions, October 28, 2024, Doc #US52640724

About Red Hat, Inc.

Red Hat is the open hybrid cloud technology leader, delivering a trusted, consistent and comprehensive foundation for transformative IT innovation and AI applications. Its portfolio of cloud, developer, AI, Linux, automation and application platform technologies enables any application, anywhere—from the datacenter to the edge. As the world's leading provider of enterprise open source software solutions, Red Hat invests in open ecosystems and communities to solve tomorrow's IT challenges. Collaborating with partners and customers, Red Hat helps them build, connect, automate, secure and manage their IT environments, supported by consulting services and award-winning training and certification offerings.

Forward-Looking Statements

Except for the historical information and discussions contained herein, statements contained in this press release may constitute forward-looking statements within the meaning of the Private Securities Litigation Reform Act of 1995. Forward-looking statements are based on the company’s current assumptions regarding future business and financial performance. These statements involve a number of risks, uncertainties and other factors that could cause actual results to differ materially. Any forward-looking statement in this press release speaks only as of the date on which it is made. Except as required by law, the company assumes no obligation to update or revise any forward-looking statements.

Red Hat, the Red Hat logo and OpenShift are trademarks or registered trademarks of Red Hat, Inc. or its subsidiaries in the U.S. and other countries.
