I recently ran into a strange issue with a PPTP VPN connection to a central site. Some users could connect and others could not. They all used Windows 7 SP1, configured the same way, and all the computers were behind NAT/PAT routers, though not necessarily on the same site.

On the VPN server, the only information I could get was this log stating the GRE protocol was unreachable:

Feb 23 09:12:29 pppd[9712]: pppd 2.4.4 started by root, uid 0 
Feb 23 09:12:29 zebra[2422]: interface ppp2 index 359 <POINTOPOINT,NOARP,MULTICAST> added. 
Feb 23 09:12:29 pppd[9712]: Connect: ppp2 <--> /dev/pts/8 
Feb 23 09:12:29 pptpd[9711]: GRE: read(fd=7,buffer=608c80,len=8260) from network failed: status = -1 error = Protocol not available 
Feb 23 09:12:29 pptpd[9711]: CTRL: GRE read or PTY write failed (gre,pty)=(7,6) 
Feb 23 09:12:29 pppd[9712]: Modem hangup 
Feb 23 09:12:29 pppd[9712]: Connection terminated: no multilink.

I saw that the server was receiving ICMP protocol-unreachable packets from the client's WAN IP. Using Wireshark on the client, I confirmed that the client itself was sending them. Strange…
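To make it concrete what such a packet looks like, here is a minimal Python sketch that recognizes it from raw bytes. It relies on two well-known constants: ICMP type 3 with code 2 means "protocol unreachable", and GRE is IP protocol 47. The function names and the hand-built sample packet are my own illustration, not something taken from the capture, and the sketch assumes an IPv4 payload and does not verify checksums.

```python
import struct

ICMP_DEST_UNREACHABLE = 3      # ICMP type: destination unreachable
ICMP_PROTOCOL_UNREACHABLE = 2  # ICMP code within type 3
IPPROTO_GRE = 47               # IP protocol number of the PPTP data channel


def is_gre_protocol_unreachable(icmp_message: bytes) -> bool:
    """Return True if the raw ICMP message (starting at the ICMP header)
    is a protocol-unreachable error that quotes a GRE packet."""
    # 8 bytes of ICMP header plus at least a 20-byte quoted IPv4 header.
    if len(icmp_message) < 28:
        return False
    icmp_type, icmp_code = struct.unpack_from("!BB", icmp_message, 0)
    if (icmp_type, icmp_code) != (ICMP_DEST_UNREACHABLE, ICMP_PROTOCOL_UNREACHABLE):
        return False
    # The error quotes the IP header of the packet that triggered it.
    quoted_ip = icmp_message[8:]
    version = quoted_ip[0] >> 4
    protocol = quoted_ip[9]  # protocol field of the quoted IPv4 header
    return version == 4 and protocol == IPPROTO_GRE


def build_sample(icmp_type: int, icmp_code: int, quoted_protocol: int) -> bytes:
    """Build a hypothetical, minimal ICMP error for illustration only
    (checksums and addresses are left at zero)."""
    icmp_header = struct.pack("!BBHI", icmp_type, icmp_code, 0, 0)
    quoted_ip = bytearray(20)
    quoted_ip[0] = 0x45              # IPv4, header length of 5 words
    quoted_ip[9] = quoted_protocol
    return icmp_header + bytes(quoted_ip)


if __name__ == "__main__":
    print(is_gre_protocol_unreachable(build_sample(3, 2, IPPROTO_GRE)))
```

The check on the quoted header is what ties the error to the VPN: a protocol-unreachable message quoting another protocol would not be the one breaking the GRE tunnel.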

After Googling a bit, I found this post and that post. They described exactly what was happening: some computers sent the ICMP protocol-unreachable packets and others did not. What the computers that did send them had in common was that the Windows firewall was turned off.

Where possible, I turned the Windows firewall on. It blocked those packets, and the VPN connection was then stable and working well.

To fix the issue for computers where turning the firewall on was not possible, I blocked the packets on the NAT/PAT router itself. With a Linux router, you can do it by dropping the packets in the FORWARD chain of the filter table (x.x.x.x being the address of the VPN server):

iptables -I FORWARD -p icmp --icmp-type protocol-unreachable -d x.x.x.x -j DROP

With a Cisco router, you need to add an entry to an extended ACL applied inbound on the LAN interface:

deny icmp any x.x.x.x y.y.y.y protocol-unreachable

If you had no ACL on the interface before, don’t forget to add a permit ip any any after the deny rule and before you apply the ACL to the interface, or you’ll deny all the traffic (Cisco ACLs end with an implicit deny).

Stupid bugs…