VMware Networking Community
eccl1213
Enthusiast

NSX Edge IPSEC VPN retry/timeout

We are running NSX Edge appliances (6.2.5) and have an issue with a particular IPsec VPN tunnel.

It appears from the logs that at some point, dead peer detection declares the peer dead and takes the tunnel down.

However, the Edge appliance then stops trying to re-establish the tunnel. Is there a setting for this anywhere?

Here is a log snippet:

ERROR: asynchronous network error report on vNic_0 (sport=4500) for message to 1.1.1.1 port 4500, complainant 68.1.1.1: No route to host [errno 113, origin ICMP type 3 code 1 (not authenticated)]

[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14743: DPD: No response from peer - declaring peer dead

[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14743: DPD: Restarting Connection

[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14763: rekeying state (STATE_QUICK_I2)

[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14763: rekeying state (STATE_QUICK_I2)

[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14763: ERROR: netlink response for Del SA esp.762d6d9a@1.1.1.1 included errno 3: No such process

[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14767: initiating Main Mode to replace #14743

We then get the following a few seconds later:

ERROR: asynchronous network error report on vNic_0 (sport=4500) for message to 1.1.1.1 port 4500, complainant 68.1.1.1: No route to host [errno 113, origin ICMP type 3 code 1 (not authenticated)]

[authpriv.warning] "10.1.1.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14743: ISAKMP SA expired (LATEST!)

But after that attempt it no longer tries. Hours go by and it does not reattempt. If we disable the tunnel and re-enable it, it comes right back up. It seems like it should keep retrying the connection, at least for a while?
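For anyone wanting to check a longer log for the same pattern, here is a minimal Python sketch that pulls the DPD and Phase 1 events out of lines shaped like the excerpt above. The event labels and the "stalled" heuristic are my own assumptions for illustration, not anything NSX reports; the matched substrings are copied from the log lines in this post.

```python
import re

# Matches lines of the form: "<connection name>" #<state number>: <message>
LINE_RE = re.compile(r'"(?P<conn>[^"]+)"\s+#(?P<state>\d+):\s+(?P<msg>.*)')

# Substrings copied from the log excerpt; the labels on the right are made up here.
EVENTS = [
    ("DPD: No response from peer", "peer_declared_dead"),
    ("DPD: Restarting Connection", "dpd_restart"),
    ("initiating Main Mode", "phase1_initiated"),
    ("ISAKMP SA expired", "phase1_sa_expired"),
]

SAMPLE = [
    'ERROR: asynchronous network error report on vNic_0 (sport=4500) for message to 1.1.1.1 port 4500, complainant 68.1.1.1: No route to host [errno 113, origin ICMP type 3 code 1 (not authenticated)]',
    '[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14743: DPD: No response from peer - declaring peer dead',
    '[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14743: DPD: Restarting Connection',
    '[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14763: rekeying state (STATE_QUICK_I2)',
    '[authpriv.warning] "10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14767: initiating Main Mode to replace #14743',
    '[authpriv.warning] "10.1.1.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24/1x1" #14743: ISAKMP SA expired (LATEST!)',
]


def summarize(lines):
    """Return (state_number, label) for each recognized line, in log order."""
    found = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # e.g. the ICMP "No route to host" line has no #state field
        for needle, label in EVENTS:
            if needle in match.group("msg"):
                found.append((int(match.group("state")), label))
                break
    return found


def looks_stalled(events):
    """Heuristic (an assumption): stalled if the last event is not a new Phase 1 attempt."""
    return bool(events) and events[-1][1] != "phase1_initiated"


if __name__ == "__main__":
    for state, label in summarize(SAMPLE):
        print(state, label)
    print("stalled:", looks_stalled(summarize(SAMPLE)))
```

On the excerpt above, the last recognized event is the ISAKMP SA expiry with no later "initiating Main Mode" line, which is the behavior being asked about.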

Sreec
VMware Employee

So this issue is specific to one tunnel, and the other tunnels from the same VPN device work flawlessly?

10.250.98.19_10.250.48.168/29-1.1.1.1_192.168.22.0/24

Can I have a look at the show config ipsec output from the Edge?

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
Please KUDO helpful posts and mark the thread as solved if answered
eccl1213
Enthusiast

Correct, the other tunnels are stable. 

Here is a sanitized IPsec config. The problem site is named "BAD1" and has a peer ID of 1.1.1.1.

vShield Edge IPsec VPN Config:

{
   "ipsec" : {
      "logging" : {
         "enable" : true,
         "logLevel" : "debug"
      },
      "global" : {
         "extension" : null,
         "caCertificates" : [],
         "id" : null,
         "serviceCertificate" : null,
         "pskForDynamicIp" : null,
         "crlCertificates" : []
      },
      "enable" : true,
      "sites" : [
         {
            "localId" : "9.9.9.9",
            "certificate" : null,
            "enabled" : true,
            "encryptionAlgorithm" : "aes256",
            "peerIp" : "2.2.2.2",
            "extension" : null,
            "description" : null,
            "authenticationMode" : "psk",
            "peerId" : "2.2.2.2",
            "enablePfs" : false,
            "localSubnets" : [
               "10.250.48.168/29"
            ],
            "psk" : "****",
            "peerSubnets" : [
               "192.168.6.0/24"
            ],
            "name" : "Good1",
            "localIp" : "9.9.9.9",
            "dhGroup" : "dh5",
            "mtu" : null
         },
         {
            "localIp" : "9.9.9.9",
            "name" : "Good2",
            "peerSubnets" : [
               "192.168.2.0/24"
            ],
            "mtu" : null,
            "dhGroup" : "dh5",
            "description" : null,
            "authenticationMode" : "psk",
            "enablePfs" : false,
            "peerId" : "3.3.3.3",
            "localSubnets" : [
               "10.250.48.168/29"
            ],
            "psk" : "****",
            "peerIp" : "3.3.3.3",
            "extension" : null,
            "certificate" : null,
            "localId" : "9.9.9.9",
            "enabled" : true,
            "encryptionAlgorithm" : "aes"
         },
         {
            "peerIp" : "4.4.4.4",
            "extension" : null,
            "localId" : "9.9.9.9",
            "certificate" : null,
            "enabled" : true,
            "encryptionAlgorithm" : "aes256",
            "name" : "Good3",
            "peerSubnets" : [
               "192.168.4.0/24"
            ],
            "localIp" : "9.9.9.9",
            "dhGroup" : "dh5",
            "mtu" : null,
            "description" : null,
            "authenticationMode" : "psk",
            "peerId" : "4.4.4.4",
            "enablePfs" : false,
            "localSubnets" : [
               "10.250.48.168/29"
            ],
            "psk" : "****"
         },
         {
            "description" : null,
            "authenticationMode" : "psk",
            "enablePfs" : false,
            "peerId" : "1.1.1.1",
            "localSubnets" : [
               "10.250.48.168/29"
            ],
            "psk" : "****",
            "localIp" : "9.9.9.9",
            "name" : "BAD1",
            "peerSubnets" : [
               "192.168.22.0/24"
            ],
            "mtu" : null,
            "dhGroup" : "dh5",
            "certificate" : null,
            "localId" : "9.9.9.9",
            "encryptionAlgorithm" : "aes",
            "enabled" : true,
            "peerIp" : "1.1.1.1",
            "extension" : null
         },
         {
            "mtu" : null,
            "dhGroup" : "dh5",
            "localIp" : "9.9.9.9",
            "peerSubnets" : [
               "172.16.1.0/24",
               "192.168.1.0/24",
               "172.15.1.0/24"
            ],
            "name" : "Good4",
            "psk" : "****",
            "localSubnets" : [
               "10.250.48.168/29"
            ],
            "description" : null,
            "authenticationMode" : "psk",
            "enablePfs" : false,
            "peerId" : "5.5.5.5",
            "extension" : null,
            "peerIp" : "5.5.5.5",
            "encryptionAlgorithm" : "aes256",
            "enabled" : true,
            "certificate" : null,
            "localId" : "9.9.9.9"
         },
         {
            "encryptionAlgorithm" : "aes256",
            "enabled" : true,
            "certificate" : null,
            "localId" : "9.9.9.9",
            "extension" : null,
            "peerIp" : "6.6.6.6",
            "psk" : "****",
            "localSubnets" : [
               "10.250.48.168/29"
            ],
            "authenticationMode" : "psk",
            "description" : null,
            "enablePfs" : false,
            "peerId" : "6.6.6.6",
            "mtu" : null,
            "dhGroup" : "dh5",
            "localIp" : "9.9.9.9",
            "name" : "Good6",
            "peerSubnets" : [
               "192.168.5.0/24"
            ]
         },
         {
            "psk" : "****",
            "localSubnets" : [
               "10.250.48.168/29"
            ],
            "description" : null,
            "authenticationMode" : "psk",
            "enablePfs" : false,
            "peerId" : "7.7.7.7",
            "mtu" : null,
            "dhGroup" : "dh5",
            "localIp" : "9.9.9.9",
            "peerSubnets" : [
               "192.168.21.0/24"
            ],
            "name" : "Good7",
            "encryptionAlgorithm" : "aes",
            "enabled" : true,
            "certificate" : null,
            "localId" : "9.9.9.9",
            "extension" : null,
            "peerIp" : "7.7.7.7"
         }
      ],
      "disableEvent" : false
   }
}
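To double-check by machine whether the BAD1 site is configured any differently from the stable ones, a rough Python sketch like the following could be run against the pasted JSON. The function name, the list of "identity" keys to ignore, and the trimmed three-site sample are assumptions made for illustration; the field names and values come from the config above.

```python
import json

# Keys that are expected to differ per site and say nothing about negotiation settings.
IDENTITY_KEYS = {"name", "peerIp", "peerId", "peerSubnets", "psk", "description"}

# Trimmed copy of the config above (three of the seven sites, negotiation fields only).
SAMPLE_CONFIG = """
{
   "ipsec" : {
      "sites" : [
         {"name" : "Good1", "peerIp" : "2.2.2.2", "encryptionAlgorithm" : "aes256",
          "dhGroup" : "dh5", "enablePfs" : false, "authenticationMode" : "psk"},
         {"name" : "Good2", "peerIp" : "3.3.3.3", "encryptionAlgorithm" : "aes",
          "dhGroup" : "dh5", "enablePfs" : false, "authenticationMode" : "psk"},
         {"name" : "BAD1", "peerIp" : "1.1.1.1", "encryptionAlgorithm" : "aes",
          "dhGroup" : "dh5", "enablePfs" : false, "authenticationMode" : "psk"}
      ]
   }
}
"""


def unique_settings(config, site_name):
    """Return the settings of site_name whose values appear in no other site."""
    sites = config["ipsec"]["sites"]
    target = next(s for s in sites if s["name"] == site_name)
    others = [s for s in sites if s["name"] != site_name]
    unique = {}
    for key, value in target.items():
        if key in IDENTITY_KEYS:
            continue
        # Keep the setting only if no other site shares the same value.
        if all(other.get(key) != value for other in others):
            unique[key] = value
    return unique


if __name__ == "__main__":
    config = json.loads(SAMPLE_CONFIG)
    print(unique_settings(config, "BAD1"))
```

An empty result means every non-identity setting on BAD1 is shared with at least one working site. That only rules out a local config difference; it says nothing about what the remote device is doing.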

Sreec
VMware Employee

The config looks fine. Are we sure both sides are using NSX-supported Phase 1 and Phase 2 parameters? What is the device on the other end?

eccl1213
Enthusiast

Yup, we have triple-checked the parameters. We know the crypto settings are right because it does connect when we bounce NSX. Both session timers have been double-checked as well.

All the remote devices are SonicWalls, and all the same model.

What is odd to me is that after the tunnel goes down and NSX marks it down, it doesn't keep trying to reconnect. It seems odd for it to just give up.

LaurentMele
Contributor

Hello guys,

Got the same issue here between NSX and SonicWall TZ600 and TZ300. From time to time (unpredictably) the VPN goes down due to DPD.

Whatever config I try, the problem remains. So I went back to the must-have settings for NSX and disabled DPD on the SonicWall box.

I'll keep you informed about stability.

LaurentMele
Contributor

Well, it's not much better. On other installations I have some stable VPNs between NSA2600 and NSX, but it seems that I can't get the same with TZ models.

Bharat24
Contributor

I am having a similar issue with a Cisco Firepower Threat Defense firewall: the tunnel is up for a period and then goes down, and I cannot initiate any traffic from the NSX end to bring the tunnel back up. I am new to NSX, so where can I find more detailed logging to see what errors are thrown when I try to re-establish the connection, and where can I set the DPD setting on NSX?

Any assistance is greatly appreciated.

Bharat.../
