Always On VPN IKEv2 Load Balancing and NAT

Over the last few weeks, I’ve worked with numerous organizations and individuals troubleshooting connectivity and performance issues associated with Windows 10 Always On VPN, specifically connections using the Internet Key Exchange version 2 (IKEv2) VPN protocol. One issue that appears with some regularity is Windows 10 clients failing to connect with error 809. In this scenario, the server accepts connections without issue for a period of time and then suddenly stops accepting new requests. When this happens, existing connections continue to work without issue in most cases. Frequently this occurs with Windows Server Routing and Remote Access Service (RRAS) servers configured in a clustered array behind an External Load Balancer (ELB).

Network Address Translation

It is not uncommon to use Network Address Translation (NAT) when configuring Always On VPN. In fact, for most deployments the public IP address for the VPN server resides not on the VPN server, but on an edge firewall or load balancer connected directly to the Internet. The firewall/load balancer is then configured to translate the destination address to the private IP address assigned to the VPN server in the perimeter/DMZ or the internal network. This is known as Destination NAT (DNAT). Using this configuration, the client’s original source IP address is left intact. This configuration presents no issues for Always On VPN.

Source Address Translation

When troubleshooting these issues, the common denominator seems to be the use of Full NAT, which includes translating the source address in addition to the destination. As a result, VPN client requests arrive at the VPN server appearing to come not from the client’s original IP address, but from the IP address of the network device (firewall or load balancer) that is translating the request. Full NAT may be explicitly configured by an administrator, or in the case of many load balancers, configured implicitly because the load balancer is effectively proxying the connection.

Known Issues

IKEv2 VPN connections use IPsec for encryption, and by default, Windows limits the number of IPsec Security Associations (SAs) coming from a single IP address. When a NAT device is performing source or full NAT, the VPN server sees all inbound IKEv2 VPN requests as coming from the same IP address. When this happens, clients connecting using IKEv2 may fail to connect, most commonly when the server is under moderate to heavy load.
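
One way to check whether a server is seeing translated source addresses is to count the current IKE main mode SAs per remote address. The following is a minimal sketch, not an official diagnostic. It assumes the Get-NetIPsecMainModeSA cmdlet from the NetSecurity module is available on the RRAS server and that its output exposes a RemoteEndpoint property.

```powershell
# Run in an elevated PowerShell window on the RRAS server.
# Group the current IKE main mode SAs by the remote address the server sees
# (assumption: the SA objects expose a RemoteEndpoint property).
Get-NetIPsecMainModeSA |
    Group-Object -Property RemoteEndpoint |
    Sort-Object -Property Count -Descending |
    Select-Object -First 10 -Property Count, Name
```

If nearly all SAs share one remote address, and that address belongs to the firewall or load balancer rather than to a client, the source address is probably being translated.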

Resolution

The way to resolve this issue is to ensure that any load balancers or NAT devices are not translating the source address but are performing destination NAT only. The following is configuration guidance for F5, Citrix ADC (formerly NetScaler), and Kemp load balancers.

F5

On the F5 BIG-IP load balancer, navigate to the Properties > Configuration page of the IKEv2 UDP 500 virtual server and choose None from the Source Address Translation drop-down list. Repeat this step for the IKEv2 UDP 4500 virtual server.

Citrix ADC

On the Citrix ADC load balancer, navigate to System > Settings > Configure Modes and check the option to Use Subnet IP.

Next, navigate to Traffic Management > Load Balancing > Service Groups and select the IKEv2 UDP 500 service group. In the Settings section click edit and select Use Client IP. Repeat these steps for the IKEv2 UDP 4500 service group.

Kemp

On the Kemp LoadMaster load balancer, navigate to Virtual Services > View/Modify Services and click Modify on the IKEv2 UDP 500 virtual service. Expand Standard Options and select Transparency. Repeat this step for the IKEv2 UDP 4500 virtual service.

Caveat

Making the changes above may introduce routing issues in your environment. When configuring these settings, it may be necessary to configure the VPN server’s default gateway to use the load balancer to ensure proper routing. If this is not possible, consider implementing the workaround below.
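
To see which default gateway the VPN server is currently using, the IPv4 default route can be inspected with Get-NetRoute. This is a minimal sketch; the address in $LoadBalancerIP is a hypothetical placeholder and must be replaced with the load balancer's actual internal address.

```powershell
# Hypothetical value: replace with the internal IP address of your load balancer.
$LoadBalancerIP = '172.16.0.1'

# List the IPv4 default route(s) and flag whether each next hop is the load balancer.
Get-NetRoute -DestinationPrefix '0.0.0.0/0' |
    Select-Object -Property InterfaceAlias, NextHop, RouteMetric,
        @{ Name = 'ViaLoadBalancer'; Expression = { $_.NextHop -eq $LoadBalancerIP } }
```

If ViaLoadBalancer is False on the interface that receives VPN traffic, return traffic may bypass the load balancer, which is the routing issue described above.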

Workaround

To fully resolve this issue, the above changes should be made to ensure the VPN server can see the client’s original source IP address. If that’s not possible for any reason, the following registry key can be configured to increase the number of established SAs allowed from a single IP address. Be advised this is only a partial workaround and may not fully eliminate failed IKEv2 connections. There are other settings in Windows that can prevent multiple connections from a single IP address, and these are not adjustable at this time.

To implement this registry change, open an elevated PowerShell command window on the RRAS server and run the following commands. Repeat these commands on all RRAS servers in the organization.

# Raise the number of established IKE SAs allowed from a single source IP address.
New-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Services\IKEEXT\Parameters\' -Name IkeNumEstablishedForInitialQuery -PropertyType DWORD -Value 50000 -Force

Restart-Service IKEEXT -Force -PassThru
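
After making the change, it may be worth confirming that the value was written and that the IKEEXT service came back up. The following is a small verification sketch using only built-in cmdlets; the variable names are illustrative.

```powershell
$Path = 'HKLM:\SYSTEM\CurrentControlSet\Services\IKEEXT\Parameters'
$Name = 'IkeNumEstablishedForInitialQuery'

# Show the configured value. Returns nothing if the value has not been created.
Get-ItemProperty -Path $Path -Name $Name -ErrorAction SilentlyContinue |
    Select-Object -ExpandProperty $Name

# Confirm the IKEEXT service is running again after the restart.
Get-Service -Name IKEEXT | Select-Object -Property Name, Status
```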

Additional Information

IPsec Traffic May Be Blocked When A Computer is Behind a Load Balancer

Windows 10 Always On VPN IKEv2 Load Balancing with Citrix NetScaler ADC

Windows 10 Always On VPN IKEv2 Load Balancing with F5 BIG-IP

Windows 10 Always On VPN IKEv2 Load Balancing with Kemp LoadMaster

Posted by Richard M. Hicks on April 13, 2020

https://directaccess.richardhicks.com/2020/04/13/always-on-vpn-ikev2-load-balancing-and-nat/

77 Comments

Matt Klein
/ April 13, 2020

We are experiencing this issue described above with both of our VPN servers not accepting new connections after 2-3 days. We typically have around 200 clients per day on the server.

However, we are using direct NAT and DNS round robin so we can see the originating IP so the issue doesn’t appear the same – is it still worth applying this fix or is this possibly an entirely separate issue? We’ve had a ticket open with MS for about 2 weeks with not much movement

- Richard M. Hicks
  / April 13, 2020
  
  It wouldn’t hurt to enable that registry key just to see if it provides any relief. Are you using the Kemp LoadMaster load balancer by chance?
  
  - Matt Klein
    / April 13, 2020
    
    We’re currently not using any LoadBalancer – just DNS Round Robin with 2 IPs that have NAT’s.
    
    We’re testing setting this up for F5 so we have more DR than DNS round Robin provides
    
  - Richard M. Hicks
    / April 13, 2020
    
    Got it. If your VPN server sees the client’s original IP address then you won’t get any benefit from adding that registry setting.
    
  - Matt Klein
    / April 13, 2020
    
    Hope I’m not derailing the point of this article – but are there any other IKEv2 limitations that might cause this exact same issue (all new connections initiate and then drop) after 2-3 days of connections, despite having direct NAT?
    
  - Richard M. Hicks
    / April 13, 2020
    
    Potentially, yes. Reach out to me directly via email and I’ll provide you with more detail.
    
Luke Flack
/ April 13, 2020

Hi Richard, thank you so much for posting this. This issue caused me a great deal of trouble when initially setting up AOVPN behind our F5 load balancer and I now have an explanation as to why, where before I had none.

Dave K
/ April 13, 2020

Thanks for the post, Richard. Another timely article for sure. In our implementation we’ve leveraged a device tunnel with user certificates. In some cases I see that during the ISAKMP negotiation the server is unable to send the server certificate back to the client and complete the negotiation. This results in an event ID 4652 and the server gives up trying after about the 3rd attempt. This is rare, however. We have over 1000 clients connected on two RRAS servers load balanced behind an F5. We’ve engaged both Microsoft support and F5 but not reached a resolution yet.

- Richard M. Hicks
  / April 14, 2020
  
  What version of Windows Server are you running?
  
  - Dave K
    / April 14, 2020
    
    We’re running Server 2019 Datacenter edition.
    
  - Richard M. Hicks
    / April 14, 2020
    
    Have you enabled IKEv2 fragmentation support on the server?
    
  - Dave K
    / April 14, 2020
    
    Yes we have enabled fragmentation. We typically see 11-12 fragments during the ISAKMP negotiation. I blame the certificate size.
    
  - Richard M. Hicks
    / April 14, 2020
    
    Ok, just checking. 🙂 What version of Windows 10 client? I think IKEv2 fragmentation support wasn’t added until 1803. If you’re using 1709 you’ll have problems.
    
  - Dave K
    / April 14, 2020
    
    We’re using Windows 10 Enterprise build 1909. By and large our clients are connecting without issue, but we have a smattering that cannot connect or do not do so unless there’s a detected network change. We’ve been having people disconnect from their home ISP to a wifi hotspot and seen success, but consistency is the goal. 🙂
    
  - Richard M. Hicks
    / April 14, 2020
    
    Great. I’m assuming of course you configured the F5 with a persistency group for UDP 500 and 4500 with “match across services” set as well?
    
- Dave K
  / April 14, 2020
  
  Indeed we have. Your site has been a great source of information for us when it came to configuring load balancing. F5 is investigating now and I hope to hear something back from them soon.
  
  - Richard M. Hicks
    / April 14, 2020
    
    Ok, keep me posted!
    
Ben W
/ April 15, 2020

Hey Richard – we’ve been deploying AOVPN on a pilot set of users – moving toward going wider with it. But I’ve got a small thing bugging me I can’t figure out why.

We’re doing Device based tunnels, behind an F5 using Server 2016 RAS Servers (With the persistence profiles and client IP’s passing through) – but I’ve noticed that when some clients connect – they create multiple tunnels at once – only one is actually live, but the other ones don’t seem to go away – they consume a few ports/ip’s for the duration the client is connected. Not everyone does it – only some.

Is this normal behavior – something fixable?

- Richard M. Hicks
  / April 16, 2020
  
  Definitely not normal, but not sure what would be causing that. Have a look at the event logs on the client and see if there’s any indication as to why the client might have disconnected. Perhaps that will shed some light on what’s happening.
  
  - Ben W
    / April 16, 2020
    
    Fragmented packets were the answer in the end. We were trying to use Server 2016 as 2019 hadn’t been certified for our environment yet – but in the end, after confirming fragmented packets, upgrading to 2019 and enabling server fragmentation, the issue’s gone away and it’s looking a lot better.
    
  - Richard M. Hicks
    / April 17, 2020
    
    Great to hear! 🙂
    
Johan
/ April 18, 2020

Hi
Great article, I have a question regarding the registry setting and what the impact will be.

What is the default value of this registry setting?

And if you increase this value, will you just push the problem in front of you, or do we have some kind of reset after X hours?

- Richard M. Hicks
  / April 18, 2020
  
  The default is 10. If you increase this value you will likely see some benefits in certain scenarios. However, there are other limits in Windows which are hardcoded that may result in connectivity issues. The best way to completely resolve this issue is to configure the load balancer to forward the client’s original source IP address to the VPN server.
  
  - Johan
    / April 19, 2020
    
    Thank you for the answer.
    
  - trotterd
    / April 23, 2020
    
    Hi Richard, I hope you are well.
    
    I am also interested in testing this registry key in one of our implementations.
    
    The scenario I have is that the F5 is load balancing in front of 5 RRAS servers. The source IP address of the laptops is visible to the RRAS servers, so no Source NAT is configured.
    
    I have the Device and User Tunnel deployed to the laptops using IKEv2.
    
    There are lots of couples who work for the company and are all now working from home. So it’s common that married couples are both connecting using their own company-provisioned laptops using AOVPN over the same broadband connection.
    
    I have had a scenario where only one person can successfully connect to the VPN at a time. It’s whoever booted up first and logged on. Then the second person can’t get a connection. If they alternate the boot-up order, the problem switches between the working and non-working user.
    
    Will two AOVPN-enabled laptops using two IKEv2 tunnels from a single IP address cause a scenario where the default limit of 10 security associations is triggered, or should I be looking somewhere else?
    
    Thanks in advance
    
    Dave
    
  - Richard M. Hicks
    / April 23, 2020
    
    Hi Dave. I’m not sure if implementing that registry key will help, but give it a shot and let me know what you find. What I think is more likely is that it is an issue with their on-premises networking equipment (router, firewall, etc.). I’ve heard numerous reports of people having issues with IPsec VPNs behind various ISP equipment. Often times there is an IPsec bypass option which might help. I’d suggest having a look there to see what you can find too.
    
  - trotterd
    / April 27, 2020
    
    Hi Richard, thanks as always for the swift response. I will get a change raised to try this. I don’t think it can harm by increasing the parameter to a higher number. I will update with my findings.
    
    Regards,
    
    Dave
    
  - Richard M. Hicks
    / April 27, 2020
    
    Thanks!
    
dandirk
/ April 22, 2020

We misconfigured our F5 (NAT) and started to get 809 errors.

Corrected the issue passing source IPs.

Odd thing now is some specific clients are still getting hit with 809. More puzzling still is on the affected devices, ethernet will work fine the error just occurs over wifi.

Tried the reg entry mentioned, that didn’t seem to help.

The issue persists through reimaging even. Completely repeatable, same profiles etc.

IPSec errors I see on RAS are either negotiation timeouts or “Max number of established MM SAs to peer exceeded”

Restarts of the server don’t seem to help and haven’t really found a way to clear SAs or find the offending device. Really appears to be an issue with the specific server, tried another one from our stage environment to figure out the F5 NAT issue and that would connect fine.

Any thoughts on how to reset SAs? or find them?

- Richard M. Hicks
  / April 23, 2020
  
  On the VPN server you can view main-mode SAs using the Get-NetIPsecMainModeSA PowerShell command. You can clear them by piping the command to Remove-NetIPsecMainModeSA.
  
Cheryl
/ April 29, 2020

Hi, Has anyone got this working with a F5 in a OneArm deployment with Source NAT used?

Rich
/ May 1, 2020

Hi Richard, please can I express my deepest gratitude towards you and your posts. This really could not have come at a better time. Enabling transparency on the Load Master and reconfiguring the default gateway on the external NIC now means the clients Public IP addresses are being properly displayed.
Thank you very much !!

- Richard M. Hicks
  / May 1, 2020
  
  You can, and thank you! 🙂
  
Paddy Berger
/ May 6, 2020

Hi Richard, I can see you have ticked “use client IP” in the UDP service group, however we are using services instead. What would be the setting within here?

- Paddy Berger
  / May 6, 2020
  
  This is within the Citrix ADC
  
- Richard M. Hicks
  / May 6, 2020
  
  I believe so, yes. Does the option to use Client IP appear when you are using services individually instead of service groups?
  
trotterd
/ May 13, 2020

Hi Richard, I hope you are well.

I was reading your article and interested in the workaround for increasing the number of SA’s from a single IP Address registry key.

You mentioned that there are other settings within Windows that can stop multiple connections from a single IP Address that are not adjustable at this time. I am interested in what those settings are, are you able to elaborate on this?

Regards,

Dave

- Richard M. Hicks
  / May 14, 2020
  
  In addition to limiting the number of SAs from a single source IP address, Windows also limits the number of “in progress” main mode SAs from a single IP address to 35. So, while the registry entry I provided might provide *some* relief, it may not fully resolve the issue. The ultimate way to fix this problem is to ensure the VPN server sees the client’s original public source IP address.
  
William
/ May 14, 2020

Hi Richard, One thing I have never seen covered in this is client IP addressing when load balanced across say 2 VPN servers. Would you typically use two separate IP pools for clients, one on each server, or use one large pool issued by a DHCP server? or are both acceptable.
The platform team here are asking what subnets needs to be carved up in AWS for this to work and for the life of me I can’t find information on this from MS.

Also, have you managed to load balance successfully in AWS using the elastic load balancer? I am skeptical it has the features we need to allow port following/persistency or sticky ports.

- Richard M. Hicks
  / May 14, 2020
  
  I’ve had a post in draft for quite some time that covers this topic. One of these days I’ll get it published, I promise! 😉 To answer your question, yes, unique IP address pools are required per server to ensure proper routing from the internal network. You’ll create routes internally to forward each VPN client IP subnet back to the respective VPN server.
  
  As for load balancing in AWS, you can use the ELB for SSTP, but not for IKEv2. It does not support session persistence across services as required for IKEv2. For that you’ll have to deploy a load balancing appliance such as Kemp or something else.
  
  - William
    / May 20, 2020
    
    Hi Richard, thanks for the reply and congrats on the magazine feature!
    
    Can you clarify whether SSTP works in AWS without NAT for client IP addresses? I understand that AWS instances do not use ARP, so I am wondering about the Proxy ARP that RRAS usually uses to route L2 traffic. I am hoping that it would work with a standard 2 NIC configuration with no NAT involved and client IP addresses handed out from the LAN interface of the RRAS box, but then how would the traffic get routed without using Proxy ARP?
    
    We have been testing for a while now and seem to run into routing issues that’s all.
    
  - Richard M. Hicks
    / May 21, 2020
    
    It’s been a while since I’ve done RRAS in AWS, but I do remember getting it to work in the past without NAT. You’ll need to create a unique subnet for VPN clients, though, and configure routes to point the traffic back to RRAS.
    
  - William
    / May 26, 2020
    
    Thank you Richard. In that instance where there is a separate Client subnet to the VPC do you know whether Client traffic will be routable from a VPC back to a datacenter via a direct connect?
    
    I have heard that a VPC may not be able to route traffic outside the VPC if it’s not in the VPC summary CIDR block, so we wouldn’t be able to route clients back to the datacenter, and NAT would be the only viable client option.
    
    Never the less we will test but it will take a while to reach that point.
    
  - Richard M. Hicks
    / May 27, 2020
    
    I’m not certain about routing via direct connect to be honest. There are similar restrictions for routing in Azure so it sounds plausible.
    
  - William
    / June 2, 2020
    
    Hi Richard, I have now found the answer to this, and it may be useful for the future.
    
    You cannot route a different Subnet to the VPC CIDR via a virtual gateway, the packets will just get dropped as they don’t originate from the VPC range.
    
    To get around this you must instead use an AWS Transit Gateway to connect your on-premises network. This is supported and will work according to AWS support!
    
    Best regards
    
  - Richard M. Hicks
    / June 4, 2020
    
    Great to hear. Thanks for the update!
    
Rick
/ May 28, 2020

Hi Richard thank you for the article we have been experiencing many of these issues.
Just a question if we use the command Remove-NetIPsecMainModeSA
will endpoints lose connection to VPN?

- Richard M. Hicks
  / May 29, 2020
  
  Yes. The client should try to reconnect automatically when you do this though.
  
Jason
/ June 5, 2020

Great resources Richard! I am running into the same issue with the IKEv2 Device Tunnel with machine cert. We have 2 VPN servers running RRAS 2019 with fragmentation enabled and clients are Windows 10 1909 … receiving error 809 when my attempts go through the external F5 LTM. If I bypass the external F5 with a local host entry and add the external IP of the VPN box to my existing FW rule… directly it always works perfectly. For testing/troubleshooting I removed one of the VPN boxes from the F5, so currently I only have the one VPN box behind the F5 and I’m still getting the intermittent 809 error on the device tunnel. I followed all the advice, set source address translation to NONE and set the default GW of the external NIC of the VPN box to the FLOAT IP of the F5. I also have source address persistence and match across services enabled. So just curious … I currently have 3 separate VIPs on the F5: the 2 UDP ones (500 and 4500) and the SSTP one on 443. I also have 3 separate matching pools with the same corresponding ports. Would it make sense to keep the 3 separate VIPs but perhaps combine the 500 & 4500 into the same pool? I also tried using a wildcard on one single VIP that should allow all ports and all protocols in, and did the same for the pool, but that didn’t seem to help either. The User Tunnel seems to be fine. If I change the source address translation to AUTO MAP and set the VPN server default GW back to the firewall, I still get the intermittent connectivity and I obviously lose the client IP info, as all the connections show the IP of the F5 FLOAT. This is a new deployment and we would like to go into production soon…everything is working perfectly when bypassing the F5, so this is the last component to address. If you have any ideas on what I could try on the LTM it would be greatly appreciated. Thanks.

- Richard M. Hicks
  / June 8, 2020
  
  Looks like you’ve done everything right. Not sure why you are still getting 809 errors. I typically configure the F5 as you did, with three separate pools (one per port) and three individual VIPs. I’m assuming you’re not having any issues with SSTP? Just IKEv2?
  
Prasanth Kuttasseri
/ July 30, 2020

Have a doubt. We have DA now. But the issue is that the client laptops through the DA shows single source IP (the DA gateway IP) which makes it difficult for the Proxy to identify the user and activities. Proxy is becoming crazy while it sees the source from single IP and with multiple credentials. Planning to test the “Always On VPN” .
I have worked with an SSL remote access concentrator where the clients are assigned an IP from a pool and the firewall controls the access. Is Always On VPN the same concept, or different? Can I get an individual virtual pool IP for each client coming out of the VPN server rather than a single source IP (i.e., the VPN gateway IP)?

- Richard M. Hicks
  / August 1, 2020
  
  Got it. You won’t have that problem with Always On VPN. Each VPN client is assigned a unique IP address which is routed through the VPN server. There is no NAT taking place when using Always On VPN.
  
  - Prasanth Kuttasseri
    / August 3, 2020
    
    Sir Richard…thank you. You made the day as from security perspective this is essential to identify the client IP and the activities involved. Thank you
    
Mahesh
/ September 3, 2020

Hi Richard,
We are seeing some client disconnection issues where the only observation is system event IDs 868 and 631. The event log refers to VPN DNS name resolution, but in reality we don’t see this issue. There are other users also connected during that time.
Is there any way to check this? We already tested with replacing the VPN profile and moving to a different VPN server as well…

- Richard M. Hicks
  / September 5, 2020
  
  These commonly occur because of underlying connectivity issue. I’d have a close look at the network connection when the VPN disconnects to determine if it was interrupted at all.
  
Robin
/ February 4, 2021

We have Kemp load balancers, and enabling transparency and changing the gateway of the RAS server to the Kemp address did it. But doing this means the server itself is not able to access the internet. Is this as intended? I was looking at the Server NAT function in Kemp, trying to understand if that could give the server the ability to browse HTTPS sites, as an example. Is this possible?

- Richard M. Hicks
  / February 5, 2021
  
  It should be. I use this configuration all the time without issue. The load balancer just forwards the traffic on to its configured default gateway and everything works. I don’t recall specifically if enabling Server NAT was required though.
  
Marc
/ March 30, 2021

Hi Richard! Thank you for your great articles! We run Kemp LoadMaster on the same subnet as our RRAS servers. Sometimes client connections randomly drop, and when trying to reconnect they receive RAS error 809. I have been able to monitor traffic on the RRAS servers and I see that one server receives traffic on UDP 500 and another one receives traffic on UDP 4500. Port Following is enabled on the LoadMaster Virtual Services and settings are according to the Kemp template. Do you have an idea as to why Port Following is not working as intended? Greetings

- Richard M. Hicks
  / April 1, 2021
  
  No idea. If you’ve enabled port following according to my post on this website, or the guidance provided in the Kemp documentation, it should work. If it is not, I’d open a support call with Kemp and have them take a look. Let me know if they come back with anything interesting!
  
  - Marc
    / June 4, 2021
    
    Hi Richard,
    two months later I’ve come to close my support call with Kemp. I’d love to hear what you think about this:
    There was a lot of sending logs back and forth, but the solution it came down to was to increase the persistence time to 4 days (previously 5 minutes for our VSes). Setting Persistence to 4 days (or higher), lets you enable an option called ‘Refresh Persist’. Kemp Support told me this:
    ‘the persistence is not maintained across different VSes with upd connections.
    That is why this feature [Refresh Persist] has been introduced. ‘
    
    In regards to your deployment guide, I asked if the recommended settings are deprecated, to which Kemp replied:
    
    ‘there should probably be a specific mention in regards to UDP connections maintaining persistence records in AlwaysOn VPN connections.
    AlwaysOn VPNs require long persistence timeouts. ‘
    
    Greetings!
    
  - Richard M. Hicks
    / June 4, 2021
    
    Thanks for the update, Marc. Kemp has stated there is a bug in their load balancer they are working to address. No fix yet, but hope to see it soon. Perhaps that will help with your issue. Right now the workaround they’ve provided is to set the session persistence to something much longer, like several days. BTW, if you could share with me your case number I’d love to follow up with Kemp and continue this discussion with them. You can reach out to me directly if you can share that information. Thanks!
    
Alex Marsh
/ July 15, 2021

Hi Richard

Great article. I believe we’re seeing this manifest in another way and wonder if I’m on the correct lines.
We have a client who is using AOVPN to connect everyone in a branch office to the datacenter. As a result we see large numbers of users from the same public IP address connecting to the AOVPN server. After a couple of weeks it seems that the server stops accepting connections from this IP address after a few days unless the server is restarted.
Would these in-built IKE limits be the cause of the problem?

- Richard M. Hicks
  / July 15, 2021
  
  Absolutely. In this case, it isn’t feasible to remove NAT from the equation. You could try implementing the registry setting detailed in this post to see if it will help. No guarantees though. :/ Also, make sure you’ve installed this update from Microsoft that fixes a known issue where RRAS enters DDoS protection mode, limiting incoming IKEv2 requests.
  
  - Alex Marsh
    / July 16, 2021
    
    I’d missed that update but that certainly explains the connection issue so that’ll need to be applied.
    Irritating that the 35 SA limit can’t be adjusted at all, though in the longer term a site to site VPN will be put in to replace this entirely. With people now returning to offices I suspect this might become more of an issue!
    
  - Richard M. Hicks
    / July 17, 2021
    
    Microsoft must have a reason for imposing the limit, but it would certainly be nice if it were at least configurable.
    
  - Alex Marsh
    / July 20, 2021
    
    I’m also making the assumption here that this wouldn’t have been an issue with Direct Access?
    
  - Richard M. Hicks
    / July 21, 2021
    
    To a lesser extent, certainly. DirectAccess would never break because of NAT the way Always On VPN with IKEv2 does, but there could be other problems. For example, NAT’ing DirectAccess client traffic to the DirectAccess server could result in high CPU utilization on one CPU core, while others aren’t busy.
    
Okja
/ November 12, 2021

Hello! In a specific scenario, I have many users coming from the same Internet access/Public IP. Finally I get the same error and the workaround seems not working in this case. If I force my clients to fallback to SSTP instead of IKE in case of issue, this could be a ‘real’ workaround ?
Many thanks !

- Richard M. Hicks
  / November 16, 2021
  
  No doubt that SSTP is more stable and reliable than IKEv2. I would suggest using SSTP as the default unless you have a very specific reason to use IKEv2 (e.g., security requirements, interoperability with non-Microsoft VPN, etc.).
  
Reuben
/ March 7, 2022

Implemented the kemp load balancer worked fine for one day and then this issue happened, only 2 connections when tested came through to our secondary rras server. Our RRAS servers are behind a kemp load balancer that is on our internal network, the kemp load balancer has a external leg too. Requests come in and hit our FortiGate which has a Nat rule in place, the vs is setup to use the ip address for rras01 which is the external IP on the external nic, this card has been disabled so the ip could be used as the VS ip. The real server ips are configured behind the VS.

I’ve added the fix in. The only thing I’ve not done yet is the traversal tick box on Kemp. I’m looking at the Kemp template to see if ‘subnet originating requests’ is originally ticked or not.

alampcs
/ November 2, 2022

Hi Richard,

Thanks for the great article.
I have a question in regards to the AOVPN server network (NIC) configuration. The typical AOVPN server has 2 NIC’s (one for external facing firewall and other one for client VPN access & internal traffic). It’s not clear in the article how many NIC’s AOVPN server needs ? if we are using two NIC’s then which NIC we have to bind to the Netscaler VIP?
I think if we are using two NIC’s we have to bind the external NIC to Netscaler VIP?

Thanks in advance.

- Richard M. Hicks
  / November 3, 2022
  
  The RRAS VPN server can have one or two NICs. Using one or two NICs is a design choice. If you use two NICs, the load balancer must be configured to deliver the traffic to the server’s external NIC.
  
  https://directaccess.richardhicks.com/2019/08/19/always-on-vpn-and-rras-with-single-nic/
  