Google Cloud offers a range of Arm-powered servers in Compute Engine through the C4A and A4X machine series. Arm architecture is optimized for power efficiency, and as a result can yield better price-performance.
Arm processors are common in standard servers because they are more power efficient than x86 processors. Mobile phones and laptops are other examples of devices that run on Arm processors. Arm CPUs use a reduced instruction set, and the simpler instructions can deliver high performance with lower battery and power consumption.
For example, C4A uses Google's custom Arm processor, Axion, which is based on the Arm Neoverse V2 processor. The Neoverse V2 is the first V-series CPU to have Armv9 performance, power, and security enhancements. It is designed for high performance computing, machine learning, and general-purpose cloud computing. Consider using C4A general-purpose Arm virtual machines (VMs) for any of the following purposes:
- Run compute-intensive workloads that require the ability to scale usage quickly when needed.
- Optimize for price-performance on Arm-compatible workloads.
- Build on modern, open source software stacks.
- Develop and test mobile or embedded systems which use an Arm CPU.
- Evaluate whether your workload is suitable for an Arm CPU.
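As a first step in evaluating whether a workload runs on Arm, it can help to confirm which architecture the code is actually executing on. The following is a minimal sketch using Python's standard `platform` module. The helper name `is_arm64` and the set of accepted machine strings are assumptions made for illustration; 64-bit Arm hosts commonly report `aarch64` (Linux) or `arm64` (macOS).

```python
import platform

# Machine strings commonly reported by 64-bit Arm hosts.
# This set is an assumption for illustration; extend it if your OS reports another name.
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine: str) -> bool:
    """Return True if the given machine string names a 64-bit Arm architecture."""
    return machine.lower() in ARM64_NAMES


if __name__ == "__main__":
    machine = platform.machine()
    print(f"machine={machine} arm64={is_arm64(machine)}")
```

Running this script on both an x86 VM and an Arm VM is a quick way to verify that a build or test pipeline is exercising the architecture you intend.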
To use GPUs with an Arm-based CPU, choose the A4X machine series, which runs on the NVIDIA GB200 NVL72 platform. VMs created using this machine series have attached NVIDIA GB200 Grace Blackwell Superchips. This machine series is optimized for massively parallelized Compute Unified Device Architecture (CUDA) compute workloads, such as machine learning (ML) and high performance computing (HPC).
A4X machine series
A4X is the first Compute Engine VM with both Arm-based CPUs and attached GPUs. A4X offers machine types that have up to 140 vCPUs and 884 GB of memory. A4X uses NVIDIA GB200 GPUs, which offer 180 GB of memory per GPU. A4X has two sockets with NVIDIA Grace Arm CPUs connected to four B200 GPUs with fast chip-to-chip (NVLink C2C) communication. A4X is available in the a4x-highgpu-4g machine type.
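The per-instance figures above can be combined with simple arithmetic. The sketch below uses only the numbers quoted in this section (4 GPUs, 180 GB per GPU, 140 vCPUs); the constant and function names are hypothetical and chosen for illustration.

```python
# Figures quoted in this section for the a4x-highgpu-4g machine type.
A4X_GPUS = 4
A4X_GPU_MEMORY_GB = 180
A4X_VCPUS = 140


def total_gpu_memory_gb(gpus: int, memory_per_gpu_gb: int) -> int:
    """Aggregate GPU memory across all GPUs attached to one instance."""
    return gpus * memory_per_gpu_gb


def vcpus_per_gpu(vcpus: int, gpus: int) -> float:
    """Average number of vCPUs available per attached GPU."""
    return vcpus / gpus


print(total_gpu_memory_gb(A4X_GPUS, A4X_GPU_MEMORY_GB))  # 720
print(vcpus_per_gpu(A4X_VCPUS, A4X_GPUS))  # 35.0
```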
Storage options for A4X instances
A4X can be used with Google Cloud Hyperdisk attached storage and comes with 12,000 GiB of Local SSD. Compute Engine automatically attaches the Local SSD disks to your A4X instances during instance creation.
OS images
A4X instances support public Arm-based OS images. You can also create custom images using a public Arm-based OS image.
C4A machine series
C4A is the first Arm-based VM built on Google's Axion Arm64-based CPU. C4A offers machine types with up to 72 vCPUs and 576 GB of DDR5-5600 memory. C4A is available in standard, highmem, and highcpu machine types.
C4A is built on Titanium, which uses network offloads and enables per-VM Tier_1 networking performance of up to 100 Gbps with the gVNIC network interface. C4A also supports the NVMe disk interface with Hyperdisk Balanced and Hyperdisk Extreme disks.
Simultaneous multithreading
For the C4A machine series, each vCPU is backed by a single core with no simultaneous multithreading (SMT). Thus, C4A VMs deliver greater performance per vCPU compared to a VM with SMT enabled. While SMT provides benefits to certain workloads, single-threaded cores are ideal for compute-intensive workloads because the processes can access the entire core instead of sharing it with other processes.
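To make the difference concrete, the sketch below compares how many physical cores back the same vCPU count with and without SMT. The value of two threads per core for the SMT case is an assumption (a common SMT configuration), not a figure from this page, and the function name is hypothetical.

```python
def physical_cores(vcpus: int, threads_per_core: int) -> int:
    """Number of physical cores backing a VM with the given vCPU count."""
    if threads_per_core < 1 or vcpus % threads_per_core != 0:
        raise ValueError("vcpus must be a positive multiple of threads_per_core")
    return vcpus // threads_per_core


# C4A: one vCPU per core (no SMT), using the 72-vCPU maximum from this page.
print(physical_cores(72, threads_per_core=1))  # 72

# Hypothetical SMT-enabled VM with the same vCPU count and 2 threads per core (assumed).
print(physical_cores(72, threads_per_core=2))  # 36
```

Under these assumptions, a 72-vCPU VM without SMT is backed by twice as many physical cores as an SMT-enabled VM of the same size, which is why each vCPU has a whole core to itself.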
OS images
C4A VMs support public Arm-based OS images. You can also create custom images using a publicly available Arm-based image.
Tau T2A machine series
The Tau T2A Arm machine series runs on the 64-core Ampere Altra Arm processor with an all-core frequency of 3.0 GHz. Tau T2A makes it possible to run workloads that run best, or exclusively, on Arm.
The Tau T2A machine series has predefined machine types of up to 48 physical cores with 4 GB of memory per vCPU. Tau T2A machine types run within a single NUMA node.
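Because Tau T2A memory scales at a fixed 4 GB per vCPU, the memory of a machine type follows directly from its vCPU count. The sketch below assumes one vCPU per physical core (no SMT) and a maximum of 48; the function name is hypothetical, and not every vCPU count in the range necessarily corresponds to an available predefined machine type.

```python
T2A_MEMORY_GB_PER_VCPU = 4  # ratio stated in this section
T2A_MAX_VCPUS = 48          # largest predefined size stated in this section


def t2a_memory_gb(vcpus: int) -> int:
    """Memory in GB implied by the 4 GB per vCPU ratio."""
    if not 1 <= vcpus <= T2A_MAX_VCPUS:
        raise ValueError(f"vcpus must be between 1 and {T2A_MAX_VCPUS}")
    return vcpus * T2A_MEMORY_GB_PER_VCPU


print(t2a_memory_gb(48))  # 192
```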
Tau T2A machine types support only the NVMe interface for storage, and Google Virtual NIC (gVNIC) for networking. Virtio-Net and SCSI interfaces are not supported. All publicly available Arm OS images are configured to use the NVMe and gVNIC interfaces. gVNIC is a network interface that is designed specifically for Compute Engine. It provides better performance and supports higher network bandwidths and throughput.
For this machine series, each vCPU is backed by a single core with no simultaneous multithreading (SMT).
Workload recommendations
The C4A machine series is an excellent choice for a wide range of scale-out and compute-intensive workloads, especially when price performance is a key concern. Consider C4A when you are deploying workloads such as the following:
- ML data processing
- ML inferencing and model serving
- App serving, web serving, and game serving
- Embedded systems development
- CI/CD development on Arm
- Video and image encoding, transcoding, and processing
- Digital advertising exchanges and serving
- Cache servers
- Computational drug discovery
- Android development
- Autonomous or conventional automotive software development
What's next
- Review the specifications and features of the A4X machine series.
- Review the specifications for the C4A machine series.
- Learn about available CPU platforms for Google Cloud.
- Create and start a Compute Engine instance using an Arm OS image.