Cloud Architecture Battlegrounds: AWS vs. Azure

If you come from a traditional enterprise routing and switching background, navigating the architectural nuances of public clouds can feel like decoding two entirely separate dialects. In our deep dive into AWS, we saw a network built around a rigid, deterministic, hardware-driven philosophy—where software-defined routing policies are baked directly into custom ASIC-driven line cards (The Nitro System).

Microsoft Azure takes an entirely divergent path. Instead of optimizing for rigid hardware silicon, Azure treats cloud networking as an extensible software compiler problem running on dynamic hardware accelerators (FPGAs).

If AWS behaves like a fleet of custom-engineered hardware chassis switches running fixed microcode, Azure behaves like a massive distributed cluster of software-programmable Match-Action tables operating inside an inline “bump-in-the-wire” programmable fabric.

This guide takes apart Azure’s network virtualization layer piece by piece, contrasting each engineering choice directly against its AWS counterpart.


1. The Silicon Layer: ASICs vs. FPGAs

The fundamental divergence between AWS and Azure begins at the silicon level on the host server’s motherboard.

The “Bump-in-the-Wire” Layout

In early Azure implementations, Microsoft pioneered a hardware design called a “Bump-in-the-Wire” topology. The server motherboard features a standard commodity NIC (such as a Mellanox/NVIDIA ConnectX chip) connected to the host CPU via the PCIe bus.

However, the physical optical cable coming from the Top-of-Rack (ToR) switch doesn’t plug into that NIC. It plugs into an inline, custom Microsoft FPGA board (originally Project Catapult, now evolved into Azure Boost / MANA).

[ Host CPU (Hyper-V Host) ]
       │
       ▼ (PCIe Bus)
[ Standard Commodity NIC ]
       │
       ▼ (Internal PCIe / SerDes Link)
[ Azure SmartNIC / FPGA (Altera Agilex / Catapult) ]  <── "The Bump-in-the-Wire"
       │
       ▼ (Physical Optical Interface)
[ Physical Clos Fabric (SONiC-powered Top-of-Rack Switch) ]

Every single packet entering or exiting an Azure server passes through the FPGA first. The FPGA acts as an inline hardware pre-processor, executing software-defined infrastructure policies directly at the physical layer before passing the frames down the line.


2. Control Plane & SDN Overlay: VFP vs. The Mapping Service

While AWS tossed out routing protocols in the overlay to build a centralized, transactional key-value database (The Mapping Service), Azure built its SDN overlay around a massive, distributed, layered software switch engine called the Virtual Filtering Platform (VFP).

Virtual Filtering Platform (VFP)

VFP acts as a programmable extension operating inside the Hyper-V virtual switch layer. If you are familiar with OpenFlow or Open vSwitch (OVS), VFP feels immediately recognizable. It uses a strict Match-Action Table (MAT) pipeline model.

Instead of evaluating a single flat routing table, VFP pushes packets sequentially through a series of discrete, logical policy layers:

[ Incoming / Outbound Packet ]
              │
              ▼
    ┌───────────────────┐
    │   VNET Layer      │  (Overlay Encapsulation - VXLAN/NVGRE)
    └─────────┬─────────┘
              ▼
    ┌───────────────────┐
    │   ACL Layer       │  (Network Security Groups - NSGs)
    └─────────┬─────────┘
              ▼
    ┌───────────────────┐
    │   NAT / SLB Layer │  (Software Load Balancing / Public VIP NAT)
    └───────────────────┘

Each layer has its own dedicated, centralized SDN controller subsystem. These controllers dynamically program Match-Action rules (rules based on the packet’s L2/L3/L4 headers) directly into the VFP engine using southbound APIs.
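To make the layered model concrete, here is a toy sketch of a Match-Action pipeline in Python. Everything in it is an assumption for illustration: the names `Rule`, `Layer`, and `process`, the dict-based packet, the VNI value, and the pass-through behavior when no rule matches. It is not VFP's API. It only shows first-match rules inside each layer, with layers applied in the order drawn above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Packet = dict  # simplified packet: {"src": ..., "dst": ..., "dport": ...}


@dataclass
class Rule:
    match: Callable[[Packet], bool]                # predicate over header fields
    action: Callable[[Packet], Optional[Packet]]   # new packet, or None to drop


@dataclass
class Layer:
    name: str
    rules: list  # evaluated in order; first match wins

    def apply(self, pkt: Packet) -> Optional[Packet]:
        for rule in self.rules:
            if rule.match(pkt):
                return rule.action(pkt)
        return pkt  # assumption: no matching rule means pass through unchanged


def process(layers, pkt):
    """Push a packet through each layer in sequence; stop if a layer drops it."""
    for layer in layers:
        pkt = layer.apply(pkt)
        if pkt is None:
            return None
    return pkt


# Hypothetical policy, one layer per box in the diagram.
vnet = Layer("VNET", [
    Rule(lambda p: p["dst"].startswith("10."),
         lambda p: {**p, "encap_vni": 5001}),      # tag with an overlay ID
])
acl = Layer("ACL", [
    Rule(lambda p: p["dport"] == 443, lambda p: p),  # allow HTTPS
    Rule(lambda p: True, lambda p: None),            # default deny
])
nat = Layer("NAT/SLB", [])                           # no rules in this sketch

PIPELINE = [vnet, acl, nat]
allowed = process(PIPELINE, {"src": "10.0.0.4", "dst": "10.0.0.5", "dport": 443})
dropped = process(PIPELINE, {"src": "10.0.0.4", "dst": "10.0.0.5", "dport": 22})
```

Because each layer only sees the packet its predecessor emitted, a controller can reprogram one layer's rules without touching the others, which is the property the per-layer controller design relies on.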

The Overlay Protocol: VXLAN vs. HNV (NVGRE)

Unlike AWS, which relies heavily on a Geneve-like custom encapsulation layer enforced by Nitro, Azure historically pioneered NVGRE (Network Virtualization using Generic Routing Encapsulation) and subsequently shifted to VXLAN for its multi-tenant overlay tunnels.

While AWS completely proxies ARP to hide Layer 2 structures, Azure’s VFP allows for more traditional SDN software handling of tenant frames, encapsulating standard tenant Ethernet structures into UDP-backed VXLAN frames directly within the host’s forwarding pipeline.
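For readers who have not looked at the wire format, the sketch below builds and parses the 8-byte VXLAN header as defined in RFC 7348. It assumes the standard layout and omits the outer Ethernet, IP, and UDP headers; whether Azure's on-wire format matches the RFC in every detail is not something this article establishes. The function names and the VNI value are hypothetical.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN (RFC 7348)


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # "I" flag: the VNI field is valid
    # Layout: flags (8 bits) | reserved (24) | VNI (24) | reserved (8)
    return struct.pack("!II", flags << 24, vni << 8)


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prefix a tenant Ethernet frame with a VXLAN header (UDP payload only)."""
    return vxlan_header(vni) + inner_frame


def decapsulate(payload: bytes):
    """Return (vni, inner_frame) from a VXLAN UDP payload."""
    word0, word1 = struct.unpack("!II", payload[:8])
    if not (word0 >> 24) & 0x08:
        raise ValueError("VNI flag not set")
    return word1 >> 8, payload[8:]


tenant_frame = b"tenant-ethernet-frame"      # placeholder bytes, not a real frame
udp_payload = encapsulate(tenant_frame, 5001)
vni, inner = decapsulate(udp_payload)
```

The 24-bit VNI is what gives the overlay roughly 16 million tenant segments, compared with the 4,094 usable IDs of a 12-bit VLAN tag.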


3. Host Offloading: AccelNet Fast-Path vs. Slow-Path

To avoid burning precious host CPU cycles executing VFP software layers inside Hyper-V, Microsoft built AccelNet (Accelerated Networking). This is Azure’s equivalent to the ENA/Nitro hardware pipeline, but executed via the FPGA using a brilliant software-to-hardware compilation model.

When a customer VM spins up or initiates a brand new connection flow, the FPGA doesn’t know what to do with the first packet because its on-chip flow cache has no entry for that flow yet.

[ New Flow Packet ] ──► [ FPGA Cache Miss ] ──► [ Slow-Path: Pushed to Host Hyper-V VFP ]
                                                                  │
                                                                  ▼
[ FPGA Cache Hit ]  ◄── [ Hardware Flow Rule Injected ] ◄── [ Rule Compiled by Software ]
        │
        ▼ (Subsequent Packets)
[ Fast-Path: Line-Rate Execution in Silicon ]
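The miss-then-install cycle in the diagram can be modeled as an exact-match cache sitting in front of a slower policy engine. The sketch below is a software analogy only: `FlowKey`, `FlowCache`, and `evaluate_policy` are hypothetical names, the action is a plain string standing in for a compiled header transformation, and there is no eviction or connection teardown.

```python
from typing import Callable, Dict, NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int
    dst_port: int


Action = str  # stand-in for a compiled header transformation


class FlowCache:
    """Toy exact-match flow table fronting a slower policy engine."""

    def __init__(self, slow_path: Callable[[FlowKey], Action]):
        self._slow_path = slow_path
        self._table: Dict[FlowKey, Action] = {}
        self.hits = 0
        self.misses = 0

    def forward(self, key: FlowKey) -> Action:
        action = self._table.get(key)
        if action is not None:
            self.hits += 1            # fast path: one lookup, no policy evaluation
            return action
        self.misses += 1
        action = self._slow_path(key)  # slow path: full layered evaluation
        self._table[key] = action      # install the result for later packets
        return action


def evaluate_policy(key: FlowKey) -> Action:
    """Hypothetical slow path: stands in for the full VFP layer walk."""
    return "allow+encap" if key.dst_port == 443 else "drop"


cache = FlowCache(evaluate_policy)
flow = FlowKey("10.0.0.4", "10.0.0.5", 6, 50000, 443)
results = [cache.forward(flow) for _ in range(3)]
```

Only the first packet of the flow pays for policy evaluation; the next two are served from the table, which is the behavior the fast path exists to provide.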

4. Scale-Out L4/L7 Services: Ananta (SLB) vs. Hyperplane

AWS leverages its custom Hyperplane DPDK fleets to run NAT Gateways, Network Load Balancers, and Transit Gateways as distributed out-of-band compute groups.

Azure handles this completely differently. Because every single host server in an Azure data center already contains a hyper-programmable, line-rate FPGA, Azure can distribute Layer 4 load balancing directly to the compute hosts themselves.

Software Load Balancer (SLB) Architecture

Azure’s Software Load Balancer (SLB) is split into two components: the MUX (Multiplexer) and the Host Agent.

The CCIE Key Concept (Direct Server Return): When the VM sends a response packet back to the client, the FPGA performs Direct Server Return (DSR). It rewrites the packet’s source address from the VM’s private IP to the Public Virtual IP (VIP) directly in silicon. The packet then goes straight back out to the internet, bypassing the MUX fleet entirely on the return path. Only inbound traffic ever crosses the MUX, which removes the return half of the choke point found in traditional load balancer loops.
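A minimal sketch of the two halves of that path, under stated assumptions: the MUX picks a backend with a stateless hash of the 5-tuple, and the host rewrites addresses in each direction. The function names, the SHA-256-modulo selection, and the documentation-range IP addresses are all hypothetical; the real hashing scheme and the encapsulation between MUX and host are not modeled.

```python
import hashlib

VIP = "203.0.113.10"                          # hypothetical public VIP
DIPS = ["10.0.0.4", "10.0.0.5", "10.0.0.6"]   # hypothetical backend private IPs


def mux_pick_dip(five_tuple, dips=DIPS):
    """Inbound, at the MUX: a stateless hash of the 5-tuple selects a backend."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return dips[int.from_bytes(digest[:8], "big") % len(dips)]


def host_inbound(pkt, dip):
    """Inbound, at the host: rewrite destination VIP -> private IP for the VM."""
    return {**pkt, "dst": dip}


def host_outbound_dsr(pkt, vip=VIP):
    """Return path, at the host: rewrite source private IP -> VIP.

    The reply leaves for the client directly; no MUX function is called.
    """
    return {**pkt, "src": vip}


request = {"src": "198.51.100.7", "dst": VIP, "sport": 40000, "dport": 443}
key = (request["src"], request["dst"], 6, request["sport"], request["dport"])

dip = mux_pick_dip(key)                       # MUX touches the inbound packet only
delivered = host_inbound(request, dip)

reply = {"src": dip, "dst": request["src"], "sport": 443, "dport": 40000}
on_wire = host_outbound_dsr(reply)            # client sees the VIP as the source
```

Because the hash depends only on the packet's own header fields, any MUX instance computes the same backend for the same flow without sharing per-flow state in this toy model.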


5. Security & Isolation Boundaries: NSGs vs. Security Groups

In AWS, Security Groups are stateful and NACLs are stateless, with both constrained by rigid multi-variable hardware ceilings enforced by the Nitro card’s fixed onboard SRAM tables ($\text{Groups} \times \text{Rules} \le 1{,}000$).

Azure flattens this topology. Azure Network Security Groups (NSGs) are completely stateful, but they combine the logic of both subnet-level filtering and instance-level filtering into a single unified architecture.

Adaptive Software Compilation

Because Azure runs an FPGA-driven VFP model, there are no arbitrary hardware lookup multipliers like AWS’s 1,000-rule limit. When you define an NSG with hundreds of complex rules, service tags, and application security groups, Azure’s control plane treats those rules as an abstract policy matrix and compiles them down into Match-Action structures optimized for the FPGA’s programmable logic.

Instead of searching a linear sequence of rules one-by-one in hardware memory pipelines (which degrades performance as arrays grow), the FPGA acts as an instantiated, single-pass decision matrix. It evaluates the complete policy footprint simultaneously in hardware gates during the fast-path pipeline execution, maintaining microsecond-level latency profiles even with immense rule counts.
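The difference between scanning rules at packet time and compiling them ahead of time can be shown with a small software analogy. This is not how the FPGA is programmed; it is a sketch that assumes a hypothetical `NsgRule` shape with only a protocol and a destination port range, and it precomputes a verdict per port so that lookup cost no longer depends on rule count. The one detail taken from NSG behavior is that a lower priority number is evaluated first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NsgRule:
    priority: int   # lower number is evaluated first, as in NSGs
    proto: str      # "tcp" or "udp"
    port_lo: int
    port_hi: int
    allow: bool


def linear_match(rules, proto, port, default=False):
    """Packet-time scan: cost grows with the number of rules."""
    for r in sorted(rules, key=lambda r: r.priority):
        if r.proto == proto and r.port_lo <= port <= r.port_hi:
            return r.allow
    return default


def compile_rules(rules, default=False):
    """Ahead-of-time compile: one precomputed verdict per (proto, port)."""
    table = {p: [default] * 65536 for p in ("tcp", "udp")}
    # Apply lowest-precedence rules first so higher-precedence rules overwrite them.
    for r in sorted(rules, key=lambda r: r.priority, reverse=True):
        row = table[r.proto]
        for port in range(r.port_lo, r.port_hi + 1):
            row[port] = r.allow
    return table


def compiled_match(table, proto, port):
    """Packet-time lookup: a single index, independent of rule count."""
    return table[proto][port]


RULES = [
    NsgRule(100, "tcp", 443, 443, True),     # allow HTTPS
    NsgRule(200, "tcp", 0, 1023, False),     # deny other well-known ports
    NsgRule(300, "tcp", 0, 65535, True),     # allow the rest of TCP
]
TABLE = compile_rules(RULES)
```

The compile step does all the priority resolution once, so adding rules makes compilation slower but leaves the per-packet lookup unchanged, which is the trade the section describes.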


6. Summary Comparison: AWS vs. Azure Under the Hood

| Architectural Component | AWS Under the Hood | Azure Under the Hood |
| --- | --- | --- |
| Silicon Architecture | Rigid, custom fixed-logic ASICs/SoCs (Nitro Cards). | Programmable, dynamic FPGAs (AccelNet / Azure Boost / MANA). |
| SDN Architecture | Mapping Service: Centralized, transactional key-value store lookups. | Virtual Filtering Platform (VFP): Centralized controllers programming layered Match-Action Tables. |
| Offload Mechanism | SR-IOV / ENA: Hardware virtual functions streaming packets directly into dedicated Nitro ASICs. | AccelNet Fast-Path / Slow-Path: First packet hits host CPU Hyper-V; subsequent packets compile to hardware circuits. |
| L4 Load Balancing | Hyperplane: Centralized, DPDK-accelerated anycasted compute clusters handling stateful routing out-of-band. | MUX + Host FPGA: Distributed stateless hashing at the edge combined with Direct Server Return (DSR) in host silicon. |
| Policy Constraints | Strict hardware-enforced multiplicative scale ceilings ($\text{SGs} \times \text{Rules} \le 1{,}000$). | Flexibly compiled and optimized directly into FPGA programmable logic gates without linear scaling penalties. |
| Underlay Control | Proprietary custom interior protocol fabrics. | Heavily driven by SONiC (Software for Open Networking in the Cloud) running on merchant silicon. |

Written by Paul Carvill

Enterprise → cloud → AI networking. I write the breakdowns I wish I’d had. New field notes roughly twice a month.
