Cloud Architecture Battlegrounds: AWS vs. Azure

If you come from a traditional enterprise routing and switching background, navigating the architectural nuances of public clouds can feel like decoding two entirely separate dialects. In our deep dive into AWS, we saw a network built around a rigid, deterministic, hardware-driven philosophy—where software-defined routing policies are baked directly into custom ASIC-driven line cards (The Nitro System).

Microsoft Azure takes an entirely divergent path. Instead of optimizing for rigid hardware silicon, Azure treats cloud networking as an extensible software compiler problem running on dynamic hardware accelerators (FPGAs).

If AWS behaves like a fleet of custom-engineered hardware chassis switches running fixed microcode, Azure behaves like a massive distributed cluster of software-programmable Match-Action tables operating inside an inline “bump-in-the-wire” programmable fabric.

This guide dismantles Azure’s network virtualization layer under the hood, contrasting its engineering choices directly against the structural underpinnings of AWS.

1. The Silicon Layer: ASICs vs. FPGAs

The fundamental divergence between AWS and Azure begins at the silicon level on the host server’s motherboard.

AWS (The ASIC Approach): AWS Nitro is built primarily on application-specific integrated circuits (ASICs) and custom System-on-Chips (SoCs). ASICs offer deterministic, screaming-fast line-rate performance with a rigid, immutable feature set. If AWS wants to roll out a radical new physical layer encapsulation format, they generally have to tape out, test, and physically deploy a brand new generation of physical Nitro chips across their global fleet.
Azure (The FPGA Approach): Azure relies on FPGAs (Field-Programmable Gate Arrays) via an architecture called SmartNIC / AccelNet (Accelerated Networking). An FPGA is essentially “silicon Lego”: a massive sea of generic logic blocks and programmable interconnects that can be re-wired by loading a new bitstream, synthesized from a hardware description language such as Verilog or VHDL, remotely and without touching the board. If Microsoft invents or shifts to a new network virtualization standard, it can push a firmware update that rewrites the hardware circuits on servers already in production, rather than waiting for a new chip generation.

The “Bump-in-the-Wire” Layout

In early Azure implementations, Microsoft pioneered a hardware design called a “Bump-in-the-Wire” topology. The server motherboard features a standard commodity NIC (such as a Mellanox/NVIDIA ConnectX chip) connected to the host CPU via the PCIe bus.

However, the physical optical cable coming from the Top-of-Rack (ToR) switch doesn’t plug into that NIC. It plugs into an inline, custom Microsoft FPGA board (originally Project Catapult, now evolved into Azure Boost / MANA).

[ Host CPU (Hyper-V Host) ]
       │
       ▼ (PCIe Bus)
[ Standard Commodity NIC ]
       │
       ▼ (Internal PCIe / SerDes Link)
[ Azure SmartNIC / FPGA (Altera Agilex / Catapult) ]  <── "The Bump-in-the-Wire"
       │
       ▼ (Physical Optical Interface)
[ Physical Clos Fabric (SONiC-powered Top-of-Rack Switch) ]

Every single packet entering or exiting an Azure server passes through the FPGA first. The FPGA acts as an inline hardware pre-processor, executing software-defined infrastructure policies directly at the physical layer before passing the frames down the line.

2. Control Plane & SDN Overlay: VFP vs. The Mapping Service

While AWS tossed out routing protocols in the overlay to build a centralized, transactional key-value database (The Mapping Service), Azure built its SDN overlay around a massive, distributed, layered software switch engine called the Virtual Filtering Platform (VFP).

Virtual Filtering Platform (VFP)

VFP acts as a programmable extension operating inside the Hyper-V virtual switch layer. If you are familiar with OpenFlow or Open vSwitch (OVS), VFP feels immediately recognizable. It uses a strict Match-Action Table (MAT) pipeline model.

Instead of evaluating a single flat routing table, VFP pushes packets sequentially through a series of discrete, logical policy layers:

[ Incoming / Outbound Packet ]
              │
              ▼
    ┌───────────────────┐
    │   VNET Layer      │  (Overlay Encapsulation - VXLAN/NVGRE)
    └─────────┬─────────┘
              ▼
    ┌───────────────────┐
    │   ACL Layer       │  (Network Security Groups - NSGs)
    └─────────┬─────────┘
              ▼
    ┌───────────────────┐
    │   NAT / SLB Layer │  (Software Load Balancing / Public VIP NAT)
    └───────────────────┘

Each layer has its own dedicated, centralized SDN controller subsystem. These controllers dynamically program Match-Action rules (rules based on the packet’s L2/L3/L4 headers) directly into the VFP engine using southbound APIs.
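
The layered model can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not VFP's actual data structures: the `Rule`, `Layer`, and `run_pipeline` names, the dictionary packet format, and the addresses and VNI are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical packet model: header fields only.
Packet = dict


@dataclass
class Rule:
    match: Callable[[Packet], bool]               # predicate over header fields
    action: Callable[[Packet], Optional[Packet]]  # returns a packet, or None to drop


@dataclass
class Layer:
    name: str
    rules: list = field(default_factory=list)

    def process(self, pkt):
        # First matching rule wins; with no match the packet passes through unchanged.
        for rule in self.rules:
            if rule.match(pkt):
                return rule.action(pkt)
        return pkt


def run_pipeline(layers, pkt):
    # Push the packet through each layer in order; a drop short-circuits the rest.
    for layer in layers:
        pkt = layer.process(pkt)
        if pkt is None:
            return None
    return pkt


# Layer order mirrors the diagram above: VNET, then ACL, then NAT.
pipeline = [
    Layer("vnet", [Rule(lambda p: True, lambda p: {**p, "vni": 5001})]),
    Layer("acl", [
        Rule(lambda p: p["dport"] == 443, lambda p: p),   # allow HTTPS
        Rule(lambda p: True, lambda p: None),             # drop everything else
    ]),
    Layer("nat", [Rule(lambda p: p["src"].startswith("10."),
                       lambda p: {**p, "src": "203.0.113.10"})]),
]

print(run_pipeline(pipeline, {"src": "10.0.0.4", "dst": "198.51.100.7", "dport": 443}))
print(run_pipeline(pipeline, {"src": "10.0.0.4", "dst": "198.51.100.7", "dport": 22}))
```

Each layer only knows its own rules, which is what lets separate controllers program them independently. The port-22 packet is dropped at the ACL layer and never reaches NAT.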

The Overlay Protocol: VXLAN vs. HNV (NVGRE)

Unlike AWS, which relies heavily on a Geneve-like custom encapsulation layer enforced by Nitro, Azure historically pioneered NVGRE (Network Virtualization using Generic Routing Encapsulation) and subsequently shifted to VXLAN for its multi-tenant overlay tunnels.

While AWS completely proxies ARP to hide Layer 2 structures, Azure’s VFP allows for more traditional SDN software handling of tenant frames, encapsulating standard tenant Ethernet structures into UDP-backed VXLAN frames directly within the host’s forwarding pipeline.
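
To make the encapsulation step concrete, here is a minimal sketch of the standard 8-byte VXLAN header as defined in RFC 7348. Whether Azure's on-the-wire format matches the RFC field for field is an assumption. The function names are illustrative, and the outer Ethernet/IP/UDP headers are deliberately left out.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits.
    # Word 2: VNI (24 bits) + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the VXLAN header to a tenant Ethernet frame (outer headers omitted)."""
    return vxlan_header(vni) + inner_frame


def decapsulate(payload: bytes):
    """Return (vni, inner_frame) from a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, payload[8:]


print(vxlan_header(5001).hex())  # 0800000000138900
```

The 24-bit VNI is what separates tenants: two customers can both use 10.0.0.0/24 because their frames travel under different VNIs.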

3. Host Offloading: AccelNet Fast-Path vs. Slow-Path

To avoid burning precious host CPU cycles executing VFP software layers inside Hyper-V, Microsoft built AccelNet (Accelerated Networking). This is Azure’s equivalent to the ENA/Nitro hardware pipeline, but executed via the FPGA using a brilliant software-to-hardware compilation model.

When a customer VM initiates a brand new connection flow, the FPGA doesn’t know what to do with the first packet because its on-chip flow table has no entry for that flow yet.

[ New Flow Packet ] ──► [ FPGA Cache Miss ] ──► [ Slow-Path: Pushed to Host Hyper-V VFP ]
                                                               │
                                                               ▼
[ FPGA Cache Hit ]  ◄── [ Hardware Flow Rule Injected ] ◄── [ Rule Compiled by Software ]
        │
        ▼ (Subsequent Packets)
[ Fast-Path: Line-Rate Execution in Silicon ]

The Slow-Path (First Packet): The first packet triggers an FPGA cache miss. The FPGA punts the packet up to the host CPU, where the complex VFP software switch in the Hyper-V host parses the packet, runs it through the VNET/ACL/NAT lookup layers, and validates the connection policy.
The Compile Phase: Once the Hyper-V host determines what should happen to that packet flow, it compiles the result into a single hardware flow rule: a Unified Flow (UF) match key paired with a Header Transposition (HT) action.
The Fast-Path (Subsequent Packets): The host injects this compiled rule into the FPGA’s hardware lookup tables over the PCIe bus. Every subsequent packet belonging to that same TCP/UDP flow bypasses the host CPU entirely. The FPGA parses the headers, applies the cached NSG verdict, performs the NAT translation, adds the VXLAN encapsulation, and sends the packet out the optical port at line rate in under 15 microseconds.
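
The three phases above amount to an exact-match flow cache sitting in front of a slower policy engine. A minimal sketch, assuming a 5-tuple flow key and a string standing in for the compiled action; `FlowCacheDatapath` and `example_policy` are hypothetical names, not AccelNet interfaces.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    proto: str
    src_port: int
    dst_port: int


class FlowCacheDatapath:
    """Toy model: exact-match flow table in front of a slow-path policy function."""

    def __init__(self, slow_path):
        self.slow_path = slow_path  # stands in for the layered VFP lookup on the host
        self.flow_table = {}        # stands in for the FPGA's hardware flow table
        self.hits = 0
        self.misses = 0

    def handle(self, key: FlowKey) -> str:
        if key in self.flow_table:
            # Fast path: one exact-match lookup, no policy evaluation.
            self.hits += 1
            return self.flow_table[key]
        # Slow path: evaluate policy once, then install the result for later packets.
        self.misses += 1
        action = self.slow_path(key)
        self.flow_table[key] = action
        return action


def example_policy(key: FlowKey) -> str:
    # Hypothetical policy: allow HTTPS and encapsulate it, drop everything else.
    return "allow+encap" if key.dst_port == 443 else "drop"


dp = FlowCacheDatapath(example_policy)
flow = FlowKey("10.0.0.4", "10.0.0.5", "tcp", 50000, 443)
for _ in range(3):
    dp.handle(flow)
print(dp.misses, dp.hits)  # 1 2
```

Only the first packet pays for policy evaluation. The sketch leaves out entry eviction and invalidation when policy changes, which a real datapath has to handle.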

4. Scale-Out L4/L7 Services: Ananta vs. Hyperplane

AWS leverages its custom Hyperplane DPDK fleets to run NAT Gateways, Network Load Balancers, and Transit Gateways as distributed out-of-band compute groups.

Azure handles this completely differently. Because every single host server in an Azure data center already contains a hyper-programmable, line-rate FPGA, Azure can distribute Layer 4 load balancing directly to the compute hosts themselves.

Software Load Balancer (SLB) Architecture

Azure’s Software Load Balancer (SLB) is split into two components: the MUX (Multiplexer) and the Host Agent.

The MUX: When public internet traffic hits an Azure Edge site, it routes via BGP Anycast to a fleet of dedicated MUX servers. The MUX boxes do not execute stateful session keeping for the lifetime of a connection. They merely check the incoming packet, run a stateless consistent hash to determine which destination backend host should receive the flow, and wrap the packet in an outer routing encapsulation.
The Host FPGA (Direct Server Return): When the packet lands on the destination backend server, the inline FPGA intercepts it. The FPGA decapsulates the packet, performs the destination NAT (Virtual IP $\rightarrow$ Private IP) using its hardware AccelNet pipeline, and hands it to the VM.

The CCIE Key Concept (Direct Server Return): When the VM sends a response packet back to the client, the FPGA executes Direct Server Return (DSR). It strips the VM’s private IP and rewrites the source header to match the Public Virtual IP (VIP) directly in silicon. The packet goes straight back out to the internet, completely bypassing the MUX fleet on the return path. The MUX tier therefore carries only inbound traffic, which removes the return-path half of the choke point found in traditional load balancer loops.
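
Both halves of this design can be sketched briefly. The text does not say which hashing scheme the MUX uses, so the example swaps in rendezvous (highest-random-weight) hashing as one stateless way to get consistent flow-to-backend mapping. The function names and addresses are illustrative assumptions.

```python
import hashlib


def pick_backend(flow, backends):
    """Rendezvous hashing: the same flow always maps to the same backend,
    with no per-flow state held on the MUX."""
    flow_id = "|".join(map(str, flow)).encode()

    def weight(backend):
        # Score every (flow, backend) pair; the highest score wins.
        return hashlib.sha256(flow_id + b"@" + backend.encode()).digest()

    return max(backends, key=weight)


def dsr_rewrite(reply, vip):
    """Return-path rewrite on the host: swap the private source IP for the public VIP."""
    return {**reply, "src_ip": vip}


backends = ["10.1.0.4", "10.1.0.5", "10.1.0.6"]
flow = ("198.51.100.7", 51515, "203.0.113.10", 443, "tcp")

chosen = pick_backend(flow, backends)
reply = {"src_ip": chosen, "dst_ip": "198.51.100.7", "src_port": 443}
print(dsr_rewrite(reply, "203.0.113.10"))
```

Because any MUX computes the same answer for the same flow, anycast can deliver packets to whichever MUX is closest. Removing a backend only remaps the flows that were pinned to it.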

5. Security & Isolation Boundaries: NSGs vs. Security Groups

In AWS, Security Groups are stateful and NACLs are stateless, with both constrained by rigid multi-variable hardware ceilings enforced by the Nitro card’s fixed onboard SRAM tables ($\text{Groups} \times \text{Rules} \le 1{,}000$).
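
The product ceiling is easiest to see with numbers. A small sketch of the arithmetic, where the only input taken from the text is the 1,000 ceiling; the helper names and the sample group and rule counts are illustrative.

```python
SG_RULE_CEILING = 1_000  # product ceiling quoted above


def fits(groups: int, rules_per_group: int) -> bool:
    """True if a groups x rules combination stays within the ceiling."""
    return groups * rules_per_group <= SG_RULE_CEILING


def max_groups(rules_per_group: int) -> int:
    """Largest number of groups allowed for a given rules-per-group setting."""
    return SG_RULE_CEILING // rules_per_group


print(fits(5, 200), fits(6, 200))      # True False
print(max_groups(60), max_groups(250)) # 16 4
```

Raising one dimension forces the other down: a larger per-group rule allowance leaves room for fewer groups on the same interface.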

Azure flattens this topology. Azure Network Security Groups (NSGs) are completely stateful, but they combine the logic of both subnet-level filtering and instance-level filtering into a single unified architecture.
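
A rough sketch of how the two filtering levels combine for inbound traffic, based on Azure's documented NSG behavior: rules are processed in priority order with the lowest number first, processing stops at the first match, and inbound traffic must be allowed by the subnet NSG and then by the NIC NSG. The `NsgRule` shape is simplified to a destination port, and all names and rule values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NsgRule:
    priority: int             # lower number is evaluated first
    dst_port: Optional[int]   # None matches any port
    allow: bool


def evaluate(rules, dst_port: int) -> bool:
    """First match in priority order decides; no match is treated as deny."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.dst_port is None or rule.dst_port == dst_port:
            return rule.allow
    return False


def inbound_allowed(subnet_nsg, nic_nsg, dst_port: int) -> bool:
    # Inbound traffic has to pass the subnet-level NSG and then the NIC-level NSG.
    return evaluate(subnet_nsg, dst_port) and evaluate(nic_nsg, dst_port)


subnet_nsg = [NsgRule(100, 443, True), NsgRule(200, 22, True), NsgRule(4096, None, False)]
nic_nsg = [NsgRule(100, 443, True), NsgRule(4096, None, False)]

print(inbound_allowed(subnet_nsg, nic_nsg, 443))  # True
print(inbound_allowed(subnet_nsg, nic_nsg, 22))   # False
```

Port 22 is allowed at the subnet but denied at the NIC, so it is blocked overall. The same rule format serves both attachment points, which is the unification described above.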

Adaptive Software Compilation

Because Azure runs an FPGA-driven VFP model, there is no multiplicative ceiling comparable to AWS’s 1,000-rule product limit. When you define an NSG with hundreds of complex rules, service tags, and application security groups, Azure’s control plane treats those rules as an abstract policy matrix and compiles them into Match-Action structures for the VFP pipeline.

Rules are not searched one by one for every packet, which would degrade performance as rule sets grow. Following the fast-path model from section 3, the full policy is evaluated once when a flow is established, and the verdict is installed in the FPGA as part of that flow’s entry. Every later packet in the flow resolves with a single exact-match lookup, so per-packet latency stays at the microsecond level even with very large rule counts.

6. Summary Comparison: AWS vs. Azure Under the Hood

Architectural Component	AWS Under the Hood	Azure Under the Hood
Silicon Architecture	Rigid, custom fixed-logic ASICs/SoCs (Nitro Cards).	Programmable, dynamic FPGAs (AccelNet / Azure Boost / MANA).
SDN Architecture	Mapping Service: Centralized transactional key-value store lookups.	Virtual Filtering Platform (VFP): Centralized controllers programming layered Match-Action Tables.
Offload Mechanism	SR-IOV / ENA: Hardware virtual functions streaming packets directly into dedicated Nitro ASICs.	AccelNet Fast-Path / Slow-Path: First packet hits host CPU Hyper-V; subsequent packets compile to hardware circuits.
L4 Load Balancing	Hyperplane: Centralized, DPDK-accelerated anycasted compute clusters handling stateful routing out-of-band.	MUX + Host FPGA: Distributed stateless hashing at the edge combined with Direct Server Return (DSR) in host silicon.
Policy Constraints	Strict hardware-enforced multiplicative scale ceilings ($\text{SGs} \times \text{Rules} \le 1{,}000$).	Flexibly compiled and optimized directly into FPGA programmable logic gates without linear scaling penalties.
Underlay Control	Proprietary custom interior protocol fabrics.	Heavily driven by SONiC (Software for Open Networking in the Cloud) running on merchant silicon.

Written by Paul Carvill