Always On VPN Multisite with Azure Traffic Manager

Eliminating single points of failure is crucial to ensuring the highest levels of availability for any remote access solution. For Windows 10 Always On VPN deployments, the Windows Server 2016 Routing and Remote Access Service (RRAS) and Network Policy Server (NPS) servers can be load balanced to provide redundancy and high availability within a single datacenter. Additional RRAS and NPS servers can be deployed in another datacenter or in Azure to provide geographic redundancy if one datacenter is unavailable, or to provide access to VPN servers based on the location of the client.

Multisite Always On VPN

Unlike DirectAccess, Windows 10 Always On VPN does not natively include support for multisite. However, multisite geographic redundancy can be implemented using Azure Traffic Manager.

Azure Traffic Manager

Traffic Manager is part of Microsoft’s Azure public cloud solution. It provides Global Server Load Balancing (GSLB) functionality by resolving DNS queries for the VPN public hostname to the IP address of the optimal VPN server.

Advantages and Disadvantages

Using Azure Traffic Manager has some benefits, but it is not without some drawbacks.

Advantages – Azure Traffic Manager is easy to configure and use. It requires no proprietary hardware to procure, manage, and support.

Disadvantages – Azure Traffic Manager offers only limited health check options. Azure Traffic Manager’s HTTPS health check only accepts HTTP 200 OK responses as valid. Most TLS-based VPNs will respond with an HTTP 401 Unauthorized, which Azure Traffic Manager considers “degraded”. The only option for endpoint monitoring is a simple TCP connection to port 443, which is a less accurate indicator of endpoint availability.

Note: This scenario assumes that RRAS with Secure Socket Tunneling Protocol (SSTP) or another third-party TLS-based VPN server is in use. If IKEv2 is to be supported exclusively, it will still be necessary to publish an HTTP or HTTPS-based service for Azure Traffic Manager to monitor site availability.
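
To illustrate what a TCP-only health check can and cannot tell you, the minimal Python sketch below opens a TCP connection to port 443 and reports success or failure. It is not Traffic Manager’s actual probe implementation; the function name tcp_probe and the host vpn1.example.net are placeholders invented for this example.

```python
import socket


def tcp_probe(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (vpn1.example.net is a placeholder, not a real VPN server):
#   tcp_probe("vpn1.example.net", 443)
```

A successful result only proves that something is listening on the port. It says nothing about whether the VPN service behind it can authenticate users, which is why this check is a less accurate indicator than an application-level one.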

Traffic Routing Methods

Azure Traffic Manager provides four different methods for routing traffic.

Priority – Select this option to provide active/passive failover. A primary VPN server is defined to which all traffic is routed. If the primary server is unavailable, traffic will be routed to another backup server.

Weighted – Select this option to provide active/active failover. Traffic is routed to all VPN servers equally, or unequally if desired. The administrator defines the percentage of traffic routed to each server.

Performance – Select this option to route traffic to the VPN server with the lowest latency. This ensures VPN clients connect to the server that responds the quickest.

Geographic – Select this option to route traffic to a VPN server based on the VPN client’s physical location.
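
The difference between the Priority and Weighted methods can be sketched in a few lines of Python. This is only a conceptual model of the selection logic, not how Traffic Manager is implemented; the Endpoint class, the function names, and the convention that a lower priority value is preferred are assumptions made for this illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    healthy: bool = True
    priority: int = 1  # assumed: lower value is preferred (Priority routing)
    weight: int = 1    # relative share of traffic (Weighted routing)


def pick_priority(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Active/passive: return the healthy endpoint with the lowest priority value."""
    healthy = [e for e in endpoints if e.healthy]
    return min(healthy, key=lambda e: e.priority) if healthy else None


def pick_weighted(endpoints: List[Endpoint], rng: random.Random) -> Optional[Endpoint]:
    """Active/active: pick a healthy endpoint at random, in proportion to its weight."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    return rng.choices(healthy, weights=[e.weight for e in healthy], k=1)[0]
```

With two endpoints weighted 3 and 1, pick_weighted would return the first roughly three times out of four, while pick_priority always returns the primary until it is marked unhealthy.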

Configure Azure Traffic Manager

Open the Azure management portal and follow the steps below to configure Azure Traffic Manager for multisite Windows 10 Always On VPN.

Create a Traffic Manager Resource

  1. Click Create a resource.
  2. Click Networking.
  3. Click Traffic Manager profile.

Create a Traffic Manager Profile

  1. Enter a unique name for the Traffic Manager profile.
  2. Select an appropriate routing method (described above).
  3. Select a subscription.
  4. Create or select a resource group.
  5. Select a resource group location.
  6. Click Create.

Important Note: The name of the Traffic Manager profile cannot be used by VPN clients to connect to the VPN server, since a TLS certificate cannot be obtained for the trafficmanager.net domain. Instead, create a CNAME DNS record that points to the Traffic Manager FQDN and ensure that name matches the subject or a Subject Alternative Name (SAN) entry on the VPN server’s TLS and/or IKEv2 certificates.
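
The name-matching requirement can be illustrated with a short Python sketch. It is a deliberately simplified version of certificate name matching (exact match, or a wildcard as the entire leftmost label), not the full validation logic a VPN client performs; the function name and the hostnames in the usage note are hypothetical.

```python
from typing import Iterable


def hostname_matches(hostname: str, cert_names: Iterable[str]) -> bool:
    """Return True if hostname matches any certificate name (subject or SAN entry)."""
    host_labels = hostname.lower().rstrip(".").split(".")
    for name in cert_names:
        labels = name.lower().rstrip(".").split(".")
        if len(labels) != len(host_labels):
            continue
        # A wildcard is honored only as the entire leftmost label.
        if labels[0] == "*" and len(labels) > 2:
            if labels[1:] == host_labels[1:]:
                return True
        elif labels == host_labels:
            return True
    return False
```

For example, with a certificate issued for vpn.example.net, hostname_matches("vpn.example.net", ["vpn.example.net"]) is true, while the Traffic Manager name itself (something like aovpn.trafficmanager.net) would not match, which is why clients must connect using the CNAME.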

Endpoint Monitoring

Open the newly created Traffic Manager profile and perform the following tasks to enable endpoint monitoring.

  1. Click Configuration.
  2. Select TCP from the Protocol drop-down list.
  3. Enter 443 in the Port field.
  4. Update any additional settings, such as DNS TTL, probing interval, tolerated number of failures, and probe timeout, as required.
  5. Click Save.
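
The settings in step 4 together determine roughly how long a failed site may keep receiving clients. The back-of-the-envelope Python sketch below is a simplified model and not an official formula: it assumes an endpoint is marked unhealthy after one more failed probe than the tolerated number of failures, that probes are one probing interval apart, and that clients may cache the old DNS answer for one full TTL. The function name and the sample values are illustrative only.

```python
def worst_case_failover_seconds(dns_ttl: int, probing_interval: int,
                                tolerated_failures: int, probe_timeout: int) -> int:
    """Rough upper bound on how long clients may still be sent to a failed site."""
    # Detection: the first failed probe plus each tolerated failure, one interval
    # apart, with the final probe allowed to run to its timeout (assumed model).
    detection = (tolerated_failures + 1) * probing_interval + probe_timeout
    # Propagation: resolvers and clients may cache the old answer for one TTL.
    return detection + dns_ttl


# Illustrative values: TTL 60 s, interval 30 s, 3 tolerated failures, timeout 10 s.
print(worst_case_failover_seconds(60, 30, 3, 10))  # 190
```

Lowering the DNS TTL and the tolerated number of failures shortens this window, at the cost of more DNS queries and a greater chance of failing over on a transient glitch.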

Endpoint Configuration

Follow the steps below to add VPN endpoints to the Traffic Manager profile.

  1. Click Endpoints.
  2. Click Add.
  3. Select External Endpoint from the Type drop-down list.
  4. Enter a descriptive name for the endpoint.
  5. Enter the Fully Qualified Domain Name (FQDN) or the IP address of the first VPN server.
  6. Select a geography from the Location drop-down list.
  7. Click OK.
  8. Repeat the steps above for any additional datacenters where VPN servers are deployed.

Summary

Implementing multisite by placing VPN servers in multiple physical locations will ensure that VPN connections can be established successfully even when an entire datacenter is offline. In addition, active/active scenarios can be implemented, where VPN client connections can be routed to the optimal datacenter based on a variety of parameters, including current server load or the client’s current location.

Additional Information

Windows 10 Always On VPN Hands-On Training Classes

19 Comments

  1. Nori

     /  July 30, 2018

    Since RRAS is not supported on VMs in Azure, would you recommend using Azure P2S VPN as an endpoint for Always On VPN?

  2. Eric Yew

     /  July 31, 2018

    I find that using both the device tunnel and the user tunnel with Traffic Manager (or any DNS-type load balancer) does not work well. If a client connects to VPN server 1 with the device tunnel and then connects with the user tunnel to VPN server 2, one of the VPN servers will have corrupted routing, and the only way to fix it is to reboot. This has happened to two of our customers so far. Have you experienced this yourself?

    • Not encountered that myself, but good to know. I’m not particularly enamored with the device tunnel at this point, so I try to avoid its use as much as possible. 🙂

      • Nori

         /  August 2, 2018

        Interesting. Could you elaborate further why you don’t like the device tunnel? I haven’t deployed it, but I was “Microsoft super-excited” when I heard about it.

      • The device tunnel is not authenticated as well as I would like it to be. It requires only a machine certificate, unlike DirectAccess where a certificate and a computer account in Active Directory are required. In addition, there are still some reliability issues with the device tunnel that are frustrating. :/

      • Eric Yew

         /  August 7, 2018

        Same here! Unfortunately, one of our customers needs it, as they have a password reset tool on their login screen which was accessible via DirectAccess and they wanted like-for-like functionality. So they have decided to go with only the device tunnel for now.

      • bargi

         /  August 8, 2018

        Interested to hear your experience with device tunnel and AOVPN in general.
        We started with the Device tunnel at the start of the year with 1709 and then noticed the tunnel dropping after login and svchost.exe_RasMan crashing.(no User tunnel configured, just device)
        At the time MS said it “should” be fixed with 1803 but couldn’t confirm it actually would. We have not started rolling out 1803 yet, given all the issues there have been with the build.

        As a workaround we swapped to the User tunnel and User certs, which we thought was working fine. But then we noticed a worrying number of users who couldn’t connect, with Windows saying there was no valid certificate. Everyone with the problem actually had a valid certificate; for some reason Windows or rasdialer couldn’t see it, but certman could. Deleting the cert and running GPUpdate to pull down a new one fixes the problem immediately. I compared the old and new certs and found no differences other than the obvious (thumbprint etc.).

        So as an even hackier workaround we’re rolling out a User VPN but authenticating with the Computer certificate, as we’ve yet to see any issues with the Computer certificate.

        Agree using Computer certificate authentication is not the best as it completely bypasses NPS/RADIUS servers and any policies.

      • 1803 is much improved in that regard. I’ve not experienced the issue with the device tunnel dropping when the user tunnel is established. However, I’m still hearing reports (and experiencing myself) of overall tunnel instability. Most commonly the device tunnel/user tunnel aren’t re-establishing automatically after a network status change (e.g. moving from one network to another or coming out of sleep mode). Hoping that Microsoft is resolving some of these in vNext for sure. 🙂

      • bargi

         /  August 8, 2018

        btw here’s another bug that MS support took 5 months to reproduce and acknowledge but then said I’d need to pay for it to be looked at any further
        You can only have 1 User AOVPN with Auto connect enabled at one time per Computer.
        eg: Log on as User 1, create a User AOVPN connection, it works as expected and “Automatically Connect” is ticked, log off.
        Log on as User 2, create a User AOVPN connection, it works as expected and “Automatically Connect” is ticked, log off.
        Log on as User 1 and User AOVPN does not connect and “Automatically Connect” is unticked. Tick it, VPN connects as expected, log off.
        Log on as User 2 and User AOVPN does not connect and “Automatically Connect” is unticked. etc, etc

        Granted, this generally isn’t an issue, as the majority of the time laptops are used by a single person.

      • bargi

         /  August 8, 2018

        As a final rant (I promise!)
        Does anyone else find their RRAS servers full of ghost/orphaned connections?
        I tried adjusting timeouts and other settings to get them to clear, but the only thing I found that works is to restart the RRAS service nightly.

      • Are you using IKEv2 or SSTP? Both? That’s not uncommon for IKEv2, really. If a client gets disconnected (loses network connectivity for any reason) the server will keep the SA alive for a period of time in case the client wants to reconnect. However, if the client establishes a completely new connection, then it doesn’t reuse the old ones and they’ll appear as orphaned. They should die out over time though. How long do they seem to hang around for?

  3. bargi

     /  August 1, 2018

    Another workaround for IKE-exclusive setups is to enable PPTP, open TCP 1723 to the RRAS server, and use this for the health check.

    So that PPTP can’t actually be used, lock it down by unticking “Remote Access connections (Inbound only)” and “Demand-dial routing connections (inbound and outbound)” for the PPTP properties under Ports.

    To further secure it, configure your firewall to only allow connections from the MS health check servers defined below.
    https://azuretrafficmanagerdata.blob.core.windows.net/probes/azure/probe-ip-ranges.json
    If your firewall doesn’t support updating automatically from the page, just sign up to a free webpage monitoring site and have it email you when it changes.

    Benefit is the check is directly related to the RRAS service being up/down.

    • bargi

       /  August 1, 2018

      correction to above, you need at least “Remote Access connections (Inbound only)” enabled for RRAS to open the port.

    • Thanks for the tips, Raymond. Much appreciated! Definitely a good idea. 🙂