


Unusual routing issue

routing

3 replies to this topic

#1 bpoindexter

bpoindexter
  • Members
  • 13 posts

Posted 10 December 2019 - 09:56 AM

I had this situation crop up late last night.  We made some minor changes to our routing (we changed the reachable IPs for our two WAN connections), and the result was a near disaster that is not yet fixed.  IP addresses in this post have been changed for privacy reasons.

 

The router is now routing outbound traffic improperly, contrary to what the routing table actually says.

 

We have two WAN connections, WAN1 and WAN2.  WAN1 is configured on p3, WAN2 on p4.  In the routing table, p3 has a directly attached network with a target network address of 1.1.1.0/29 and a default gateway of 1.1.1.1.  This automatically creates a default route, 0.0.0.0/0, with a gateway address of 1.1.1.1.  The metric on both routes is set to 10.  The public IP address assigned to p3 is 1.1.1.6.

 

For WAN2 we have a directly attached network on p4 with a target network address of 2.2.2.140/30.  The default gateway is 2.2.2.141.  This creates a default route, 0.0.0.0/0, with a gateway address of 2.2.2.141.  The metric on both routes is set to 20.  The public IP address assigned to p4 is 2.2.2.142.
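
For anyone who wants to sanity-check the addressing described above, here is a minimal Python sketch using the standard `ipaddress` module. The dictionary layout and function names are my own for illustration and have nothing to do with the firewall's configuration format. It simply confirms which addresses are usable in each WAN subnet.

```python
import ipaddress

# Addresses as given in the post (already anonymized by the author).
# The dictionary layout is illustrative only, not a firewall config format.
WANS = {
    "WAN1": {"iface": "p3", "network": "1.1.1.0/29",
             "gateway": "1.1.1.1", "local": "1.1.1.6"},
    "WAN2": {"iface": "p4", "network": "2.2.2.140/30",
             "gateway": "2.2.2.141", "local": "2.2.2.142"},
}


def usable_hosts(cidr):
    """Return the usable host addresses of a network as strings."""
    return [str(host) for host in ipaddress.ip_network(cidr).hosts()]


def is_usable(cidr, addr):
    """True if addr is a usable host (not network/broadcast) in cidr."""
    return addr in usable_hosts(cidr)


for name, wan in WANS.items():
    hosts = usable_hosts(wan["network"])
    print(name, wan["iface"], "usable hosts:", hosts[0], "to", hosts[-1])
    print("  gateway usable:", is_usable(wan["network"], wan["gateway"]))
    print("  local IP usable:", is_usable(wan["network"], wan["local"]))
```

Note that 2.2.2.140 is the network address of the /30 rather than a usable host, so the only addresses available on WAN2 are 2.2.2.141 (the gateway) and 2.2.2.142 (the firewall's own IP).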

 

We do not have any source based routing entries at this time.  Also, the aforementioned reachable IP addresses have been removed and are not in place at this time.

 

In our firewall rules, we use a weighted connection object for WAN1 and WAN2 that includes failover.  WAN1 gets a weight of 10, WAN2 gets a weight of 1.  Most connections, obviously, would use WAN1 with a few using WAN2, and with all connections failing over to one if the other goes down.  This configuration worked fine for years right up until the day it didn't.
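
To put numbers on the weighting, here is a rough Python sketch of weighted selection with failover. This is only a model of the behavior I expect from the weights, and it is an assumption on my part, not the firewall's actual algorithm. The function names are made up for illustration.

```python
import random

# Weights from the post: WAN1 = 10, WAN2 = 1.
WEIGHTS = {"WAN1": 10, "WAN2": 1}


def expected_share(weights, up):
    """Fraction of new connections each link that is up would receive."""
    live = {name: w for name, w in weights.items() if name in up}
    total = sum(live.values())
    return {name: w / total for name, w in live.items()}


def pick_link(weights, up, rng=random):
    """Pick a link for one new connection; raise if no link is up."""
    live = [name for name in weights if name in up]
    if not live:
        raise RuntimeError("no WAN link available")
    return rng.choices(live, weights=[weights[n] for n in live], k=1)[0]


both_up = expected_share(WEIGHTS, {"WAN1", "WAN2"})
print({name: round(share, 3) for name, share in both_up.items()})
print(expected_share(WEIGHTS, {"WAN2"}))  # WAN1 down: everything fails over
```

With both links up, roughly 10 out of every 11 new connections would land on WAN1. With either link down, the remaining link takes all of them.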

 

Additionally, we have WAN1's interface (p3) configured in a routed layer 2 bridge with p5 and p6.  This is to accommodate two devices that need public IP addresses in the 1.1.1.0/29 address space.  Again, this has worked fine for a very long time and, indeed, is working now.  This is the one major part of our network config that doesn't seem to have been affected.  I only mention it for reasons that will be clear later in this post.

 

As of today, WAN2 is effectively non-functional.  I've got WAN1 up and running, and it's our primary, so this hasn't yet caused us a disaster.  I'm using a firewall rule on a test machine so that traffic coming from that machine uses the explicit IP for WAN2 (2.2.2.142).  The firewall indicates that this traffic is attempting to source NAT out of the WAN1 interface (which is the bridging interface).  It shows the next hop as 1.1.1.1 instead of the appropriate next hop, 2.2.2.141.

 

At this point I thought I might need some source based routing entries.  I created one for each WAN, more or less matching what was in the routing table.  This caused my WAN2 to work, but also caused the default route associated with WAN1 to be listed as 'off' in the routing table.  Not disconnected, not down, off.  This made no sense to me, so I pulled the source routing so I could at least have my primary WAN1 working.

 

In my troubleshooting, I wondered if my network bridge was causing the issue.  I removed the bridge and restarted the network configuration, leaving p3 with only its public IP address 1.1.1.6 and no bridging. Not only did this not help, but the bridge interface was still present, despite the fact that it had been removed from the configuration.

 

We recently updated our router's firmware to 8.01 and we updated our CC and Range to 7.2.  There have been odd things like this happening since that time.  I have never before encountered a situation where a configuration item, in this case the interface bridge, remained present even after I deleted it and visually verified multiple times that it had been removed from the configuration.

 

I post all of this in search of a solution to get the secondary WAN functioning again, but also as a question.  I am strongly considering wiping the router to its original factory settings and reloading a PAR file to reconfigure it.  Should I go that route, which looks increasingly likely, what is the correct way to do it when using a CC to control my routers?



#2 bpoindexter

bpoindexter
  • Members
  • 13 posts

Posted 11 December 2019 - 10:38 PM

I downgraded to 7.2.5.  The box is sort of working now, but my secondary WAN on 2.2.2.142 is still completely dysfunctional, and from what I'm seeing in the firewall, the routing is simply wrong.  Or, more accurately, the box is routing traffic in a way contrary to what it claims the routing table is.



#3 bpoindexter

bpoindexter
  • Members
  • 13 posts

Posted 16 December 2019 - 11:49 AM

I have made some progress.

 

After downgrading to 7.2.5, I put in source routes for my two ISPs in addition to regular routing table entries.  This has made the 'link balancing' through the use of 'Weighted Connection' in the firewall work correctly.  Either connection can be used successfully by a client now.
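
Here is a simplified Python model of why the source routes might matter. It is a sketch under my own assumptions, not a description of how the firmware actually resolves routes: it assumes that a plain routing table lookup picks the matching route with the lowest metric, and that a source-based table (the `SOURCE_TABLES` dictionary, a hypothetical structure) is consulted first when present. It uses 2.2.2.141 as the WAN2 gateway.

```python
import ipaddress

# Main routing table entries: (destination, gateway, interface, metric).
MAIN_TABLE = [
    ("0.0.0.0/0", "1.1.1.1", "p3", 10),
    ("0.0.0.0/0", "2.2.2.141", "p4", 20),
]

# Hypothetical source-based tables: source network -> (gateway, interface).
SOURCE_TABLES = {
    "1.1.1.0/29": ("1.1.1.1", "p3"),
    "2.2.2.140/30": ("2.2.2.141", "p4"),
}


def next_hop(src, dst, use_source_routes=True):
    """Return (gateway, interface) for a packet from src to dst."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if use_source_routes:
        # A matching source network wins over the main table.
        for net, hop in SOURCE_TABLES.items():
            if src_ip in ipaddress.ip_network(net):
                return hop
    # Main table: longest matching prefix first, then lowest metric.
    candidates = [
        (ipaddress.ip_network(dest), gw, iface, metric)
        for dest, gw, iface, metric in MAIN_TABLE
        if dst_ip in ipaddress.ip_network(dest)
    ]
    best = min(candidates, key=lambda r: (-r[0].prefixlen, r[3]))
    return (best[1], best[2])


print(next_hop("2.2.2.142", "8.8.8.8", use_source_routes=False))
print(next_hop("2.2.2.142", "8.8.8.8", use_source_routes=True))
```

In this model, traffic sourced from 2.2.2.142 goes out via 1.1.1.1 on p3 when there are no source routes, because the WAN1 default route has the lower metric. That matches the wrong next hop I saw in the firewall. With the source routes in place, it goes out via 2.2.2.141 on p4.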

 

The connection on 1.1.1.0/29 still fails if I include reachable IPs to monitor the status of the link.  This appears to be related to the bridging interface setup I'm using.  1.1.1.0/29 is physically attached to p3.  I have two devices that need public IP addresses attached to p5 and p6 on the firewall.  I have a routed layer 2 bridge set up bridging p3, p5, and p6.  That works completely fine, EXCEPT that the ICMP packets sent when something is added to Reachable IPs cannot seem to get out.  As a result, the firewall always assumes my link on p3 is down if I'm using link monitoring.

 

This is somewhat confirmed by opening an SSH terminal on the box and running ping -I p3 google.com, which fails to send and receive ICMP packets.  If I run ping -I phbr-p3-p5-p6 google.com instead, ping packets are sent and received without issue (phbr-p3-p5-p6 is the virtual bridge interface created by the routed L2 bridge).
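
For repeating this comparison without retyping the commands, here is a small Python sketch that wraps the same ping test. It assumes a Linux host with the standard `ping` utility (where `-I` selects the interface and `-c` sets the packet count) and with Python available, which may not be true on the firewall shell itself. The function names are my own.

```python
import subprocess


def build_ping_cmd(iface, host, count=3):
    """Build the ping command line bound to a specific interface."""
    return ["ping", "-I", iface, "-c", str(count), host]


def probe(iface, host, count=3):
    """Return True if ping bound to iface exits with status 0."""
    result = subprocess.run(
        build_ping_cmd(iface, host, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling `probe("p3", "google.com")` and `probe("phbr-p3-p5-p6", "google.com")` one after the other would reproduce the check: on my box the first would be expected to return False and the second True.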

 

This is where the problem is.



#4 Micha Knorpp

Micha Knorpp
  • Members
  • 195 posts

Posted 25 December 2019 - 10:58 AM

Still wondering about that part: "We recently updated our router's firmware to 8.01 and we updated our CC and Range to 7.2" ???

Is the "router" a CG Firewall which is standalone, or managed by the CC? Or something completely different?


regards,
-micha-