Welcome back!!! We are at Part 4 of the blog series on NSX multitenancy. In this article, we will discuss using a stateful active-active T0 gateway as the Provider gateway, the different T1 gateway modes available in NSX projects and VPCs, and the supported attachment options with the Provider gateway.
Earlier this year, I wrote a comprehensive deep-dive blog series on NSX Stateful Active-Active Gateways, and I strongly recommend having a read before proceeding.
Part 1 : Stateful Active-Active Single Tier Routing
Part 2 : Stateful Active-Active Two Tier Routing
Part 3 : Routing Considerations and Packet Walks
Part 4 : Edge Sub-clusters and Failure Domains
Let’s get started:
Supported Stateful Active-Active Gateway Topologies
Only a limited set of topologies is supported with stateful active-active gateways. At the time of writing, the following topologies are available.
- Stateful A/A T1 gateways attached to stateful A/A T0 gateway sharing the same edge cluster
- DR-only T1 gateways attached to stateful A/A T0 gateway
- Stateful A/S T1 gateways attached to stateful A/A T0 gateway where T1 and T0 gateways are on separate edge clusters
That also means:
- If the tenants (Projects) need to deploy an active-active T1 gateway, they need to leverage the same edge cluster as the Provider T0 gateway.
- If the tenants (Projects) need to deploy an active-standby T1 gateway attached to the stateful active-active T0 gateway, they need to leverage a separate edge cluster.
So, it is important that the Enterprise Administrator shares the right resources (edge clusters) with the tenants (projects) from the default space, so that they can align their topologies with one of the supported combinations above.
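The supported combinations listed above can be written down as a small check. This is only a sketch of the rules as described in this post, not an NSX API; the function name, the enum, and the edge cluster names used in the example are hypothetical.

```python
from enum import Enum


class T1Mode(Enum):
    ACTIVE_ACTIVE = "stateful A/A"
    ACTIVE_STANDBY = "stateful A/S"
    DR_ONLY = "DR-only"


def is_supported(t1_mode, t1_edge_cluster, t0_edge_cluster):
    """Check a T1 attachment to a stateful A/A T0 against the rules in this post."""
    if t1_mode is T1Mode.DR_ONLY:
        # No T1 SR is created, so the T1 edge cluster does not come into play.
        return True
    if t1_mode is T1Mode.ACTIVE_ACTIVE:
        # A/A T1 must share the edge cluster of the stateful A/A T0.
        return t1_edge_cluster == t0_edge_cluster
    if t1_mode is T1Mode.ACTIVE_STANDBY:
        # A/S T1 must sit on a separate edge cluster.
        return t1_edge_cluster is not None and t1_edge_cluster != t0_edge_cluster
    return False


# Hypothetical edge cluster names, for illustration only.
print(is_supported(T1Mode.ACTIVE_ACTIVE, "provider-ec", "provider-ec"))
print(is_supported(T1Mode.ACTIVE_STANDBY, "provider-ec", "provider-ec"))
```

With these rules, the first call reports a supported topology and the second an unsupported one, which is why the tenant needs access to both edge clusters if both T1 modes are to be offered.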
Setting up the Provider T0 Gateway (Stateful Active-Active)
I have deployed a new Provider T0 gateway, this time in stateful active-active mode on an edge cluster with two edge nodes. Note that the Provider T0 configuration is done from the default space by the Enterprise Administrator.
A pair of edge nodes in the edge cluster forms an edge sub-cluster to store the stateful information. As we have only two edge nodes in the cluster, we will have only one edge sub-cluster.
The T0 gateway has 4 uplink (external) interfaces and we organize them into two interface groups based on their VLAN membership:
- Interface Group 0 : Uplink interfaces over VLAN 1006
- Interface Group 1 : Uplink interfaces over VLAN 1007
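The grouping by VLAN membership can be illustrated with a few lines of Python. This is a sketch only: the uplink and edge node names are hypothetical, and I am assuming each of the two edge nodes has one uplink on each VLAN. Only the VLAN IDs (1006 and 1007) come from the lab.

```python
from collections import defaultdict

# Hypothetical uplink inventory; names are made up for illustration.
uplinks = [
    {"name": "en01-uplink-v1006", "edge_node": "edge01", "vlan": 1006},
    {"name": "en01-uplink-v1007", "edge_node": "edge01", "vlan": 1007},
    {"name": "en02-uplink-v1006", "edge_node": "edge02", "vlan": 1006},
    {"name": "en02-uplink-v1007", "edge_node": "edge02", "vlan": 1007},
]


def group_by_vlan(uplinks):
    """Group uplink names by VLAN and number the groups in ascending VLAN order."""
    by_vlan = defaultdict(list)
    for uplink in uplinks:
        by_vlan[uplink["vlan"]].append(uplink["name"])
    return {index: by_vlan[vlan] for index, vlan in enumerate(sorted(by_vlan))}


interface_groups = group_by_vlan(uplinks)
for index, members in interface_groups.items():
    print(f"Interface Group {index}: {members}")
```

Each resulting group contains one uplink per edge node, all on the same VLAN.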
We have BGP configured between the Provider T0 gateway and the upstream ToR switches for dynamic route advertisement.
Everything we have done here is much the same as what we covered on stateful active-active gateway configuration in the four-part blog series mentioned above.
We have two edge clusters deployed:
- One for the Provider T0 gateway to support active-active stateful services (vxdc01-c01-ec01-PROVIDER)
- One for the tenants to support legacy Active-Standby T1 gateways (vxdc01-c01-ec01-TENANT-AppSec)
Both edge clusters are prepared on the default overlay transport zone (currently a requirement for NSX projects).
Assigning Resources to Projects
We will assign the below resources to the project “Project_AppSec”:
- Provider T0 gateway running in stateful active-active mode
- Edge cluster hosting the Provider T0 gateway (vxdc01-c01-ec01-PROVIDER)
- Edge cluster for the project T1 gateways (dedicated – vxdc01-c01-ec01-TENANT-AppSec)
Stateful Active-Active Gateway options in Projects
We have the below stateful active-active gateway options in Projects.
- Stateful A/A project T1 gateway attached to stateful A/A Provider T0 gateway
- Stateful A/S project T1 gateway attached to stateful A/A Provider T0 gateway
- DR-only (stateless) project T1 gateway attached to stateful A/A Provider T0 gateway
Let’s discuss these one by one.
1. Stateful A/A project T1 gateway attached to stateful A/A Provider T0 gateway
To deploy a stateful active-active T1 gateway in a project, we need to leverage the same edge cluster as the Provider T0 gateway. This will be the edge cluster named “vxdc01-c01-ec01-PROVIDER”.
The system auto-configures edge subclusters and interface groups for T1 SR uplinks and T1 SR-DR backplane. For more details on the internal constructs and traffic flows, please check out my four-part blog series on stateful active-active gateways that I mentioned at the beginning of this blog post.
Let’s test the traffic flow from a workload VM attached to the active-active T1 gateway in the project to an external destination, 8.8.4.4.
We see that edge node 02 is authoritative for the stateful information for this flow (to destination 8.8.4.4), and traffic should be punted to edge node 02 if it is hashed to edge node 01 by the host transport nodes. For the northbound flow, traffic punting happens at the T1 DR – SR backplane interface of the T1 gateway.
Let’s confirm this using traceflow by logging into the project “Project_AppSec” as hari@vxplanet.int, who is the Enterprise Admin (the Project Admin has no visibility into traceflow results beyond the project T1 gateway).
The below sketch shows the traffic flow as seen in the traceflow results (South -> North)
2. Stateful A/S project T1 gateway attached to stateful A/A Provider T0 gateway
To deploy a stateful active-standby T1 gateway in the project, attached to the stateful active-active Provider T0 gateway, we need to leverage a different edge cluster from the one used for the Provider T0 gateway. This will be the edge cluster named “vxdc01-c01-ec01-TENANT-AppSec”.
Let’s test the traffic flow from a workload VM attached to the active-standby T1 gateway in the project to an external destination, 8.8.4.4.
We see that, after the T1 DR lookup in the host transport node, traffic is tunneled to the T1 Active edge node (edge node 04), from where it is ECMP’ed to edge node 01 or edge node 02 (which are on edge cluster vxdc01-c01-ec01-PROVIDER).
Because the T0 gateway is active-active, we see that edge node 02 is authoritative for the stateful information for this flow (to destination 8.8.4.4), and traffic should be punted to edge node 02 if it is hashed to edge node 01 by edge node 04. For the northbound flow, traffic punting happens at the T0 DR – SR backplane interface of the T0 gateway.
Let’s confirm this using traceflow by logging into the project “Project_AppSec” as hari@vxplanet.int, who is the Enterprise Admin (the Project Admin has no visibility into traceflow results beyond the project T1 gateway).
The below sketch shows the traffic flow as seen in the traceflow results (South -> North)
3. DR-only (stateless) project T1 gateway attached to stateful A/A Provider T0 gateway
A DR-only T1 gateway (stateless) doesn’t require an edge cluster as there is no T1 SR component created.
Let’s test the traffic flow from a workload VM attached to the DR-only T1 gateway in the project to an external destination, 8.8.4.4.
The T1 DR and T0 DR lookups happen locally on the host transport node. Because the T0 gateway is active-active, we see that edge node 02 is authoritative for the stateful information for this flow (to destination 8.8.4.4), and traffic should be punted to edge node 02 if it is hashed to edge node 01 by the host transport node. For the northbound flow, traffic punting happens at the T0 DR – SR backplane interface of the T0 gateway.
Let’s confirm this using traceflow by logging into the project “Project_AppSec” as hari@vxplanet.int, who is the Enterprise Admin (the Project Admin has no visibility into traceflow results beyond the project T1 gateway).
The below sketch shows the traffic flow as seen in the traceflow results (South -> North)
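The three northbound paths described above for project T1 gateways can be summarised in a short Python sketch. It only models the hop order and where the punt happens, as described in this post. The function name and the hop labels are hypothetical, and the authoritative edge node is passed in as an input rather than computed, because the hashing NSX uses to pick it is not modelled here.

```python
def northbound_path(t1_mode, hashed_to, authoritative, t1_active=None):
    """Return the ordered hops of a south-to-north flow towards a stateful A/A T0.

    t1_mode       : "A/A", "A/S" or "DR-only"
    hashed_to     : provider edge node picked by ECMP hashing
    authoritative : provider edge node holding the state for this flow
    t1_active     : active T1 SR edge node (only used for "A/S")
    """
    hops = ["host transport node"]
    if t1_mode == "A/S":
        # Traffic is tunneled to the active T1 SR first, then ECMP'ed to the T0.
        hops.append(f"{t1_active} (T1 SR, active)")
    hops.append(hashed_to)
    if hashed_to != authoritative:
        # A/A T1 punts at the T1 DR-SR backplane; the other modes at the T0 one.
        backplane = "T1" if t1_mode == "A/A" else "T0"
        hops.append(f"{authoritative} (punted via {backplane} DR-SR backplane)")
    return hops


# Edge node 02 is authoritative for the flow to 8.8.4.4 in all three tests above.
for mode in ("A/A", "A/S", "DR-only"):
    print(mode, northbound_path(mode, "edge01", "edge02", t1_active="edge04"))
```

When the flow is hashed directly to the authoritative node, no punt hop is added, which matches the case where the traceflow shows no extra edge-to-edge hop.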
Stateful Active-Active Gateway options in VPCs
A VPC T1 gateway is automatically instantiated under the respective project context whenever a VPC is created. This VPC T1 gateway is not exposed inside the VPC, but only to the VPC’s parent project in read-only mode. The lifecycle of the VPC T1 gateway is managed by the system and, unlike project T1 gateways, we cannot modify its HA mode. We have the below stateful active-active gateway options in VPCs:
- Stateful A/S VPC T1 gateway attached to stateful A/A Provider T0 gateway
- DR-only (stateless) VPC T1 gateway attached to stateful A/A Provider T0 gateway
Let’s discuss these one by one.
1. Stateful A/S VPC T1 gateway attached to stateful A/A Provider T0 gateway
A stateful active-standby VPC T1 gateway is created when North-South services are enabled for the VPC. As this is attached to the stateful active-active Provider T0 gateway, we need to leverage a different edge cluster for the VPC from the one used for the Provider T0 gateway. This will be the edge cluster named “vxdc01-c01-ec01-TENANT-AppSec”.
This is the error message we get if we choose the Provider edge cluster for the VPC.
2. DR-only (stateless) VPC T1 gateway attached to stateful A/A Provider T0 gateway
A DR-only VPC T1 gateway is created when stateful services are not enabled for the VPC. That is, the options for N-S services and Default Outbound NAT are disabled in the VPC.
If the DHCP option in VPC is not set to None, we require an edge cluster for the segment DHCP services. For this, both the edge clusters – vxdc01-c01-ec01-PROVIDER (Provider) and vxdc01-c01-ec01-TENANT-AppSec (Project dedicated) are supported.
If the DHCP option in the VPC is set to None, then the edge cluster is optional.
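The VPC rules above can be condensed into one small decision function. This is a sketch of the behaviour as described in this post, not an NSX API call. The function and parameter names are hypothetical, and I am assuming that enabling either N-S services or Default Outbound NAT is what makes the VPC T1 gateway stateful.

```python
PROVIDER_EC = "vxdc01-c01-ec01-PROVIDER"
TENANT_EC = "vxdc01-c01-ec01-TENANT-AppSec"


def vpc_edge_cluster_options(ns_services, default_outbound_nat, dhcp_enabled):
    """Return the resulting VPC T1 mode and the edge cluster choices for the VPC."""
    if ns_services or default_outbound_nat:
        # Stateful A/S VPC T1: must use a different edge cluster from the Provider T0.
        return {
            "t1_mode": "stateful A/S",
            "edge_cluster_required": True,
            "allowed_edge_clusters": [TENANT_EC],
        }
    # DR-only VPC T1: an edge cluster is needed only for segment DHCP services.
    return {
        "t1_mode": "DR-only",
        "edge_cluster_required": dhcp_enabled,
        "allowed_edge_clusters": [PROVIDER_EC, TENANT_EC],
    }


print(vpc_edge_cluster_options(ns_services=True, default_outbound_nat=True, dhcp_enabled=True))
print(vpc_edge_cluster_options(ns_services=False, default_outbound_nat=False, dhcp_enabled=False))
```

Here `dhcp_enabled=False` stands for the DHCP option being set to None in the VPC.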
Okay, it’s time to take a break and have a coffee. We will meet in Part 5 to discuss edge cluster considerations and failure domains. Stay tuned!!!
I hope the article was informative.
Thanks for reading.
Continue reading? Here are the other parts of this series:
Part 1 – Introduction & Multitenancy models :
https://vxplanet.com/2023/10/24/nsx-multitenancy-part-1-introduction-multitenancy-models/
Part 2 – NSX Projects :
https://vxplanet.com/2023/10/24/nsx-multitenancy-part-2-nsx-projects/
Part 3 – Virtual Private Clouds (VPCs)
https://vxplanet.com/2023/11/05/nsx-multitenancy-part-3-virtual-private-clouds-vpcs/
Part 5 : Edge Cluster Considerations and Failure Domains
https://vxplanet.com/2024/01/23/nsx-multitenancy-part-5-edge-cluster-considerations-and-failure-domains/
Part 6 : Integration with NSX Advanced Load balancer
https://vxplanet.com/2024/01/29/nsx-multitenancy-part-6-integration-with-nsx-advanced-load-balancer/