
Private interconnectivity with VMware Cloud Foundation as a Service

By SAMI KURONEN posted Thu April 10, 2025 08:36 AM

  

This blog belongs to the Tech Series of VMware Cloud Foundation as a Service (VCFaaS). In the earlier blogs, we discussed VCFaaS concepts and Virtual Data Center networking, public networking, and IPsec VPNs. This blog focuses on private connectivity. In this context, private connectivity refers firstly to accessing IBM Cloud services using the IBM Cloud private network, and secondly to connecting with other IBM Cloud platforms (VPC, Power Virtual Servers or Classic) and on-premises networks using IBM Cloud Interconnectivity Services. Let’s start with what comes out of the box.

Accessing IBM Cloud Services over private networks

When a VDC is deployed, either with private-only or with public and private network connections, you can access the IBM Cloud Services networks (166.9.0.0/16 and 161.26.0.0/16) by default. These networks host various key services, such as IBM Cloud DNS, NTP, Microsoft and Red Hat repositories, and Ubuntu and Debian APT (Advanced Packaging Tool) mirrors. This network path also provides access to Cloud Service Endpoints (CSEs) for various other IBM Cloud services, such as IBM Cloud Object Storage, Databases or Monitoring endpoints. To allow this traffic flow, the automation has created default NAT and FW rules in the edge and provider gateways.

The VDC’s edge gateway has a default FW rule, which allows this egress traffic from the attached routed VDC networks.

The provider gateway has SNAT rules for traffic accessing the specified IBM Cloud Services networks.

The default rules allow access to DNS and the private repositories/mirrors when you provision a Virtual Machine using the Catalog images, but nothing else. This means that you must create additional FW and SNAT rules, for example to access resources on the Internet, such as Git.
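
If you want to verify this default path from a newly provisioned catalog VM, a quick check like the following sketch is usually enough. The resolver IP shown is one of the commonly documented IBM Cloud private DNS resolvers in the 161.26.0.0/16 service network, and the package command depends on your image, so treat both as assumptions and adjust to your environment.

# Check which resolvers the catalog image was configured with
cat /etc/resolv.conf

# Query an IBM Cloud private DNS resolver directly (assumed resolver IP)
dig +short cloud.ibm.com @161.26.0.10

# Refresh package metadata from the private mirrors (Ubuntu/Debian example;
# use 'sudo dnf makecache' on RHEL-based images)
sudo apt-get update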

Using IBM Cloud Interconnectivity Services with VCFaaS

IBM Cloud offers multiple IaaS platforms for various workloads. Virtual Private Cloud (VPC) provides a platform for cloud-native IaaS applications using software-defined technologies. Power Virtual Server instances enable you to deploy IBM Power servers as logical partitions (LPARs) on Power Virtual Server subnets. Classic Infrastructure provides a very flexible way to deploy Bare Metal Servers or self-managed VMware Cloud Foundation instances. In addition, you might need to access on-premises networks over WAN, MPLS and so on; this connectivity is provided with IBM Cloud Direct Link. IBM Cloud Transit Gateway is the “glue” that stitches these platforms together. A Transit Gateway acts as a virtual router between the platforms, where you can connect your platforms, or multiple instances of these platforms from the same or different IBM Cloud accounts, as logical connections.

In addition to the native connections, Transit Gateway also supports Generic Routing Encapsulation (GRE) tunnels as a connection type. A special feature of GRE tunnels with Transit Gateway is that the outer side and the inner side live in different VRFs, in other words in different routing domains. You can use either Classic Infrastructure or VPC as the underlying transport network for the GRE traffic. Customer traffic is encapsulated inside the tunnel, and routing information is exchanged with BGP between the connecting device and the Transit Gateway routers. The transport network just carries the outer side of the GRE tunnels between the configured endpoints without exposing the customer traffic inside the tunnel. Two types of GRE tunnels are currently provided (unbound and redundant), and the key difference between them is how high availability is built for the tunnels with the underlying Transit Gateway routers and what can be used as the transport network.

Let’s jump to the use cases. Why would you use IBM Cloud Interconnectivity Services with your VCFaaS workloads? The most typical use case is that customers want to connect to their own private network using MPLS or other WAN technologies. Another use case is that you want to combine the power of other IBM Cloud IaaS platforms with your VMware workloads. In both use cases, you need to connect the specific virtual data center to a Transit Gateway. Then you can add other required connections to the Transit Gateway based on your connectivity needs. If you need on-premises connectivity, you provision an IBM Cloud Direct Link and connect it to the Transit Gateway. Or if you need some workloads in VPC or Power Virtual Servers, you connect the specific instances to the same Transit Gateway, as sketched below.
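
As a rough sketch of what those connections look like with the IBM Cloud CLI, the commands below attach a VPC and a Direct Link gateway to an existing Transit Gateway. The gateway ID, names and CRNs are placeholders, and the exact flags may differ by plugin version, so verify them with ibmcloud tg connection-create --help before use.

# Placeholder Transit Gateway ID
TGW_ID=<transit-gateway-id>

# Native VPC connection (network ID is the VPC CRN)
ibmcloud tg connection-create "$TGW_ID" \
    --name vpc-workloads \
    --network-type vpc \
    --network-id <vpc-crn>

# Direct Link connection for on-premises reachability
ibmcloud tg connection-create "$TGW_ID" \
    --name onprem-directlink \
    --network-type directlink \
    --network-id <direct-link-gateway-crn>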

How about firewalls? Each virtual data center provides gateway firewalls on the Edge and Provider Gateways. If needed, you could also integrate this with the so-called “hub and spoke” topology in VPC. Maybe that needs a blog of its own… so let’s keep this pretty simple for now.

In the previous diagram, note the “OPTIONAL” comment on the public connectivity. This means that you can deploy your VDCs fully private. Then you have only private network connectivity to the VDC, and you can access the VMs or vApps only through the private connectivity. If you are just building the solution, or you want a backup management connection for the VMs, how could you do that? Remember that IBM Cloud VPC provides a variety of services that you can utilize over this private interconnectivity option. A Client VPN is one good example: you can use the VPN-as-a-service solution in VPC, which gives a client-to-site user experience using OpenVPN clients. The VPN service offering includes secrets management using IBM Cloud Secrets Manager, where you can also create the required certificates for the VPN clients. You can then create VPN routing and network address translation rules in the VPN service to ease the connectivity arrangement, whether it is temporary or more permanent in nature.
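
As a small sketch of that management path, assuming a client-to-site VPN server has been set up in the VPC and the VPN routes, NAT rules and VDC gateway firewalls allow the traffic, connecting from a workstation could look like this. The profile name and the target VM IP are placeholders.

# Bring up the tunnel with the OpenVPN client and the downloaded profile
sudo openvpn --config my-vdc-mgmt.ovpn &

# Once connected, a VM on a routed VDC network should be reachable privately
ssh admin@192.168.223.10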

Another example is load balancing. At the moment, VCFaaS does not offer a native load balancing service. If your applications need public (or private) facing load balancing, you could use an application load balancer (ALB) from VPC. With an ALB, you could even provide public ingress access to your applications through VPC on a private-only VDC deployment. This architectural option can improve network security, as you could centralize inspection in VPC-hosted next-gen firewalls, and the VDC could be physically isolated from the Internet.
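
A minimal sketch of that pattern, assuming the VDC and the VPC share a Transit Gateway: the ALB’s back-end pool members are simply VDC VM IPs, and clients only ever see the ALB’s hostname. The hostname and IPs below are placeholders.

# From a VPC instance: confirm the ALB's back-end target in the VDC is
# reachable over the Transit Gateway (this is the path the ALB will use)
curl -sS http://192.168.223.10:8080/healthz

# From the Internet: traffic enters through the ALB, while the VDC stays private
curl -sS https://<alb-hostname>/healthz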

Another good example is related to DNS. Say you would like to include the private DNS service or other IBM Cloud native services in your solution. For example, if you want to use the private DNS service provided by IBM Cloud, you can create custom resolvers in the VPC and use them together with your VPC and VMware workloads. You can then configure this DNS in the virtual data center networks, so it is handed out when IPs are allocated to your VMs. You can also access other IBM Cloud services (such as Object Storage or Databases) using your own IP address space through Virtual Private Endpoints (VPEs), which you provision in your VPC subnets using an IP address that fits your IP addressing plan.
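
For example, a quick resolution test from a VM in the VDC might look like the following sketch. The custom resolver IP and the record name are placeholders for a resolver location in a VPC subnet and a hypothetical private DNS zone.

# Resolve a private zone record through the VPC custom resolver
dig +short app01.internal.example.com @10.240.0.5

# The same resolver can also answer Virtual Private Endpoint hostnames, so
# IBM Cloud services resolve to VPE IPs inside your own address plan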

With these examples, you should have a high-level idea of how you can build a solution for these needs. As you may have noticed, the same basic interconnectivity pattern provides the connectivity for all these options. With the interconnectivity, your VMware workloads can also have access to new services, such as Red Hat OpenShift (ROKS) or Kubernetes (IKS) hosted on VPC. Think about what this means for hosting your VMware workloads in IBM Cloud. I think these interconnectivity options are very important. It can also make you wonder: do you really have to build everything for your VMware workloads and applications by yourself, or could you use the native cloud services to support your infrastructure? How can this help with the application modernization targets that many customers are pursuing?

Setting up Transit Gateway connectivity

Now let’s take a quick look at the actual connectivity setup with the Transit Gateway. As mentioned, IBM Cloud Transit Gateways support GRE tunnels. With VCFaaS, GRE tunnels are used to connect your VDC to the Transit Gateway. You can request the connection using the IBM Cloud Portal, and the connection setup is pretty much automated, almost end to end. You do not have to be a specialist to configure any of this.

First, access your VDC using the IBM Cloud Portal.

To add a connection group, you need the Transit Gateway ID and its location; you can look these up with the CLI, as shown below. In my example, I used a Transit Gateway located in Dallas. Yes, you can use Transit Gateways in other regions, and you can have multiple Transit Gateways connected to the same VDC.
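
If you do not have the Transit Gateway ID at hand, you can list the gateways with the IBM Cloud CLI (this assumes the Transit Gateway CLI plugin is installed in your environment):

# List Transit Gateways in the account; note the ID and location columns
ibmcloud login
ibmcloud tg gateways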

Once you add the connection group, the automation builds configurations for six (6) connections (unbound GRE tunnels). It allocates IP addresses and BGP configurations for the GRE tunnels. You can then use the generate CLI command option, which creates IBM Cloud CLI commands for the tunnel setup in the Transit Gateway. You can copy-paste these into the IBM Cloud Shell, or alternatively pass the configurations to a colleague or another team who has the rights to configure the specific Transit Gateway. And yes, the connected Transit Gateway can belong to another IBM Cloud account, too.

Each tunnel’s configuration looks like the following example. The gateway IPs are the outer GRE tunnel IPs, and the tunnel IPs are the inner IPs, which are also used for the BGP peering setup.

ibmcloud tg connection-create-gre 0431de56-b8b7-XXXX-XXXX-XXXXXXXXXXXX \
    --name mag-043-vdc-dj-us-south-1-1a \
    --zone us-south-1 \
    --network-type unbound_gre_tunnel \
    --remote-bgp-asn 4260955119 \
    --network-account-id f7ae49cc04f348edb34ac01af90ee00e \
    --base-network-type classic \
    --local-gateway-ip 198.19.36.7 \
    --remote-gateway-ip 10.45.55.64 \
    --local-tunnel-ip 198.18.88.73 \
    --remote-tunnel-ip 198.18.88.74; \

Once the six GRE tunnels have been added to the Transit Gateway, you should see them all reach the status “Attached” on the IBM Cloud Portal page.
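
You can also verify the same from the CLI by listing the connections on the Transit Gateway (using the same masked gateway ID as in the example above):

# List all connections on the Transit Gateway and check their status
ibmcloud tg connections 0431de56-b8b7-XXXX-XXXX-XXXXXXXXXXXX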

How do you advertise the VDC networks to the Transit Gateway? The above setup advertises any RFC 1918 routes to the Transit Gateway, as long as they are known by the Provider Gateway. How do you make them known, then? Each VDC network has a setting called route advertisement, which is enabled by default for all new VDC networks when you create them. You can disable this setting per network, if needed. When it is enabled, your network is advertised to the Provider Gateway, which in turn advertises it to the Transit Gateway using the BGP routing protocol.

After this, you should be able to connect to the VMs from the connected network. But a reminder: check the Gateway Firewall rules and NAT settings, as explained in the previous blog. Do they allow this traffic, or do you have a NAT rule that does not work well with this setup?

Deep dive and debugging Transit Gateway connectivity

Now a section for the networking geeks. What happens behind the scenes? Let me try to summarize. Eventually, you get a setup as shown in the following diagram. The GRE tunnels are configured on your Provider Gateway, and each tunnel is configured as a BGP peer. The configuration uses AS path prepends to prioritise the routing paths and make failover work. Network failover is automatic in case an edge node breaks or there is an ongoing maintenance. The Provider Gateway (NSX Tier 0 VRF) runs in active-standby mode, which means that only one of the edge nodes actively routes (or forwards) traffic. Both edge nodes run BGP, and all BGP sessions are up (if all nodes are currently live and healthy).

Occasionally, you might see a single GRE tunnel go down, for example due to maintenance on a single Transit Gateway router. This does not mean that you would suffer a network connectivity outage, as the solution is designed to be highly available across the MZR. And this is one of the reasons why you see so many GRE tunnels. With Transit Gateway’s unbound GRE tunnels, each Transit Gateway router is considered a single point of failure, so routers in the other zones back each other up, and the solution provides a highly available setup across the MZR. With multiple tunnels, a single GRE tunnel outage does not break your connectivity, and in most cases you are OK even if only one of the GRE tunnels is live on the active edge node.

If you log in to the VMware Cloud Director console, you can see all BGP sessions and their statuses. When the tunnels have been set up correctly, you see the Connection Status “Established”. The statuses shown are just normal BGP states.

On the Neighbors tab, you see the route maps used for each neighbor. You can also see the route map and prefix list configurations through the other tabs. Route maps are read-only, but you can customise the prefix lists. For example, if you use IP addresses outside RFC 1918, you must allow them in the specific prefix list.

The Provider Gateway’s IP routing tables and its BGP advertised and learned routes can be downloaded as CSV files. Yes, I agree… this format differs a bit from what I would like to see, too. You need to spend some time to understand the output, but after that it is somewhat logical. The information is there.

Let’s take a deeper look. First, note the edge_path column, which tells you which of the edge nodes you are looking at; in the example below I have filtered the list to show only routes from edge node 1 (/infra/…/edge-nodes/1). There are several route types, and the most interesting ones for this example are (b) and (t1c), where (b) stands for BGP and (t1c) stands for a connected network on a Tier 1 Gateway. So what we can see here is that this VDC has learned three prefixes from the Transit Gateway (b) and one prefix from the Edge Gateway, that is the Tier 1 Gateway, as a connected network. (t0c) entries are the connected networks of the Tier 0 on this edge node, so these also include the GRE tunnel interfaces (see the connected networks with a /30 mask).
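
If you prefer the command line over a spreadsheet, the downloaded CSV can be sliced with standard tools. This is only a sketch: the column holding the route type depends on the export format, so treat the field position as an assumption and adjust to your file.

# Routes seen by edge node 1 only
grep 'edge-nodes/1' routes.csv

# Of those, keep only the BGP-learned (b) entries (assumes the route type
# is the first CSV column)
grep 'edge-nodes/1' routes.csv | awk -F',' '$1 == "b"'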

Then one could ask: how do I know which of the two nodes is active? This is a tricky question, and you do not see it directly, but you can deduce the information. Does it matter which node is active? Normally no, it does not matter, as they are hosted in the same data center and the same cluster. But if you want to know how to deduce that information… it is a bit tricky, as I said. Bear with me for a moment.

When troubleshooting, it is very useful to see which networks you advertise out to the BGP neighbors and how, or vice versa, what you learn from the neighbors. As shown earlier, you can get this information in a similar way, by downloading the CSV file via the portal. In the previous VMware Cloud Director portal screenshot, I showed an example VDC network, 192.168.223.0/24, where route advertisement was active. So let’s take a look at whether that actually works. Let’s collect all BGP peers’ advertised routes and sort them into a single spreadsheet; a quick way to combine the downloads is sketched below. Then you also see how the route prioritisation actually works. Notice the as_path column and the variable AS path lengths; this is controlled in the route maps. Refer to the previous diagram with the BGP peers, see the numbers on the GRE tunnels… and connect the dotted lines. That depicts how the advertisements are controlled and how the edge nodes advertise the routes to the Transit Gateway routers. And the shortest AS path wins… typically, if we simplify this a bit.
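
A minimal sketch for combining the per-peer downloads into one sortable file (file names are placeholders; tail -q assumes GNU coreutils):

# Keep one header row, then append all peers' rows sorted by prefix
head -n 1 peer1-advertised.csv > all-advertised.csv
tail -q -n +2 peer*-advertised.csv | sort >> all-advertised.csv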

And now, let’s go back to identifying the active vs. standby node. See the magic 65000 65000 65000 in the as_path column in front of the advertised AS paths. This “string” is sent by the standby node with every prefix advertisement, basically to signal the other side: “please do not send me any traffic, but I am here, ready to take over”. When and if the edge nodes change roles, the standby becomes active and starts advertising the prefixes without the 65000 65000 65000 in the AS path. So, using the previous example of (t0c) connected networks, you see that 198.18.88.84/30 is a locally connected network for edge node 1, and that node is the standby, as it advertised the prefixes with the prepended 65000 65000 65000… I agree, it should be easier, but this is how it is built in Director 10.6. How useful was this information? I guess it depends on your role and background. I would say that for most people this is “nice to know” and hardly ever needed. For people responsible for network operations or design, it might be somewhat useful.
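
Using the combined file from the previous sketch, spotting the standby node is then a one-liner: whichever edge node’s advertisements carry the prepend is the standby.

# Advertisements carrying the standby signal
grep '65000 65000 65000' all-advertised.csv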

What’s next

I hope this blog helps you understand the importance and the possibilities of using IBM Cloud Interconnectivity Services with your VMware Cloud Foundation as a Service instances. This blog has explained how you can expand the connectivity to use other IBM Cloud services with your VMware workloads. It also included pretty detailed implementation information and debugging tips for connectivity issues. Normally you do not need any of this, but depending on your role, you can take what you need from here.

In the next blog, we’ll talk about expanding virtual data center connectivity to other virtual data centers using Transit Gateway, how to share networks using Data Center Groups, and why you would need or use virtual data centers without a network edge. Stay tuned…

#VMware
#Tech_Series_Using_VMware_Cloud_Foundation_as_a_Service
#IBMCloud

 
