OK, first of all: I'm completely new to Adtran devices, although I'm quite experienced in networking. Here is the situation: two Adtran NV3305 routers, each in a different city, connected to each other over a dedicated link. They basically just forward traffic from one to the other, so the network load should be pretty much the same on both. Both devices are the same model and run the same firmware. One of them has a CPU usage of 12% while the other reaches 100%, which I assume is the cause of the slowness and packet loss during peak hours. To give you a clearer idea, this is what I'm talking about:
System load: 1sec:100.00% 1min:100.00% 5min:100.00% Min: 0.00% Max: 100.00%
Context switch load: 0.35%
Task  Task                       Invoked     Exec Time  Runtime  Load %
Id    Name            PRI  STAT  (count)     (usec)     (usec)   (1sec)
0     Idle            0    W     3758795001  270        0        0.00
1     PC Config       12   S     2069263564  1099       201835   20.18
2     Timer-00        37   W     3871048725  4          3174     0.32
3     Nm01            11   W     0           27778923   0        0.00
4     Clock           20   W     13067759    40         57       0.01
5     FrontPanel      249  W     119807049   304        7326     0.73
6     PCI Bridge      200  W     42354920    19         316      0.03
7     RouteTableTick  50   W     7507589     149        154      0.02
8     OSPF            50   W     30109274    1616       1665     0.17
9     IGMPTick        50   W     4406793     56         59       0.01
10    IGMP-Receiver   50   W     1           6          0        0.00
11    IP Events       121  W     7814702     62         71       0.01
12    Thread Pool     10   S     1397        56         0        0.00
13    tcptimer        110  W     510296      29         130      0.01
14    tcpinp          250  W     534027      57         2276     0.23
15    tcpout          250  W     581161      115        1715     0.17
16    WWW             250  W     4479570     148        0        0.00
17    DnsTable        50   W     857904      11         0        0.00
18    DnsProxy        50   W     883569      53         0        0.00
19    Port Manager    20   W     106300841   188        4337     0.43
20    PacketRouting   250  W     4229497708  27658      791699   79.17
21    RSTP            249  W     0           482        0        0.00
22    RSTP            249  W     0           492        0        0.00
23    e1 1/1          100  W     412242286   13         761      0.08
Rest omitted......
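For anyone who wants to rank the tasks in a listing like this one, here is a minimal Python sketch. It is not an Adtran tool: the function name top_tasks, the regular expression, and the assumption that every row has eight whitespace-separated fields (with a task name that may itself contain spaces) are illustrative guesses based only on the rows shown in this post.

```python
import re

# Assumed row layout, inferred from the rows in this post:
# id, name (may contain spaces), priority, state, invoked, exec usec,
# runtime usec, 1-second load percentage.
TASK_LINE = re.compile(
    r"^(?P<id>\d+)\s+(?P<name>.+?)\s+(?P<pri>\d+)\s+(?P<stat>[A-Z])\s+"
    r"(?P<invoked>\d+)\s+(?P<exec_usec>\d+)\s+(?P<runtime_usec>\d+)\s+"
    r"(?P<load>\d+\.\d+)$"
)


def top_tasks(output, limit=3):
    """Return (task name, 1-second load %) pairs, highest load first."""
    tasks = []
    for line in output.splitlines():
        match = TASK_LINE.match(line.strip())
        if match:  # header lines and free text simply do not match
            tasks.append((match.group("name"), float(match.group("load"))))
    return sorted(tasks, key=lambda task: task[1], reverse=True)[:limit]


# A few rows copied from the listing in this post.
SAMPLE = """\
0 Idle 0 W 3758795001 270 0 0.00
1 PC Config 12 S 2069263564 1099 201835 20.18
5 FrontPanel 249 W 119807049 304 7326 0.73
20 PacketRouting 250 W 4229497708 27658 791699 79.17
23 e1 1/1 100 W 412242286 13 761 0.08
"""

for name, load in top_tasks(SAMPLE):
    print(f"{name:15s} {load:6.2f}%")
```

On the rows shown here, PacketRouting (79.17%) and PC Config (20.18%) together account for roughly 99% of the 1-second load.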
I've compared the configurations and they are almost the same, except for this:
interface eth 0/1
ip address 192.168.26.2 255.255.255.0
no ip route-cache
no shutdown
!
This is one of the interfaces on the affected router; the other router does not have the "no ip route-cache" statement. I've been reading about route cache express and ip ffe, but I'm not sure which one I should use, or whether either will solve the problem. Also, I'm administering the router remotely: will changing any of these parameters (route cache, ffe) on the same interface I'm connected through take the interface down? Will I get kicked out? I've attached a sanitized copy of the configuration. Any help will be appreciated.
Thanks
- Thanks for posting your question on the forum!
I don't think you need route-cache or FFE as it appears that this router is just being used to pass traffic through. Based on what you posted, it looks like you are seeing an influx of traffic hitting your router. What is the average CPU usage you are seeing on that router (when it is not maxed out)? How often does it spike up like that? Could you post the output of "show int" from the affected router?
Let us know if you have any questions.
Thanks,
Noor
Noor, thanks for your response.
The router was hanging at 100% CPU load, so I applied both settings, FFE and route cache. The CPU load dropped to 60-70% at peak hours, which I still find very unusual, but at least there's no packet loss now. When not maxed out, the CPU load is around 30%, mostly used by the "PC Config" process. There's also a GRE tunnel configured. When there is traffic on this tunnel, the "PacketRouting" process can eat as much as 30% of the CPU; when the operation that uses the tunnel closes (every day at 6 pm), the CPU use for PacketRouting drops below 1%. Are GRE tunnels supposed to load the CPU that much?
This is the output of the "show int" command. Both of them show similar info, as the router mostly passes traffic through. I also checked that they are both at 100/full.
Router#sh int eth 0/1
eth 0/1 is UP, line protocol is UP
Hardware address is xxxxxxxxxxx
Ip address is 192.168.26.2, netmask is 255.255.255.0
MTU is 1500 bytes, BW is 100000 Kbit
100Mb/s, negotiated full-duplex, configured full-duplex
ARP type: ARPA; ARP timeout is 20 minutes
5 minute input rate 4728560 bits/sec, 3407 packets/sec
5 minute output rate 4744296 bits/sec, 3539 packets/sec
Queueing method: fifo
Output queue: 0/256/0 (size/max total/drops)
1754152716 packets input, 3895491468 bytes
1753959222 unicasts, 51982 broadcasts, 141512 multicasts input
22 unknown protocol, 0 symbol errors, 0 discards
35593 input errors, 0 runts, 0 giants
4782 no buffer, 30811 overruns, 0 internal receive errors
0 alignment errors, 0 crc errors
2966420300 packets output, 3222913683 bytes
2966075398 unicasts, 12546 broadcasts, 332356 multicasts output
0 output errors, 0 deferred, 0 discards
0 single, 0 multiple, 0 late collisions
0 excessive collisions, 0 underruns
0 internal transmit errors, 0 carrier sense errors
0 resets, 0 throttles
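A couple of derived figures make this output easier to read. The following is a rough Python sketch using only the numbers in the listing above; the helper names are made up for illustration. Note that the packet and error counters are cumulative and may have wrapped, so the error ratio is only a ballpark figure.

```python
def avg_packet_bytes(bits_per_sec, packets_per_sec):
    """Average packet size in bytes implied by a bit rate and a packet rate."""
    return bits_per_sec / 8 / packets_per_sec


def error_ratio(errors, packets):
    """Fraction of packets counted as errors (cumulative counters)."""
    return errors / packets


# Values copied from the "show int eth 0/1" output in this post.
in_size = avg_packet_bytes(4_728_560, 3_407)    # 5 minute input rate
out_size = avg_packet_bytes(4_744_296, 3_539)   # 5 minute output rate
in_err = error_ratio(35_593, 1_754_152_716)     # input errors / packets input

# The input error total is exactly the sum of "no buffer" and "overruns".
buffer_related = 4_782 + 30_811

print(f"average input packet:  {in_size:.1f} bytes")
print(f"average output packet: {out_size:.1f} bytes")
print(f"input error ratio:     {in_err:.6%}")
print(f"no buffer + overruns:  {buffer_related} of 35593 input errors")
```

This works out to roughly 173 bytes per packet inbound and 168 bytes outbound, so the interface is carrying many small packets rather than a large volume of data. All 35593 input errors are "no buffer" or "overruns" (4782 + 30811), which would be consistent with the receive path occasionally failing to keep up, although the output alone does not prove that.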
Thanks
- Can you tell me if those errors on the interface are incrementing? Could you post the output of "show events"? Remember to remove any sensitive information.
GRE tunnels are known to be more CPU intensive than other routing functions, so that does not surprise me. However, if your device is staying below 80% on average, then you should be okay as far as CPU processing goes.
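One way to check whether the errors are incrementing is to capture the interface output twice, a few minutes apart, and compare the counters. The Python sketch below illustrates the idea under stated assumptions: the function names and the regular expression are invented for this example, and the second snapshot is hypothetical data made up to show the comparison, not output from the router in this thread.

```python
import re

# Matches "<number> <lowercase counter name>", e.g. "30811 overruns".
COUNTER = re.compile(r"(\d+) ([a-z][a-z ]*[a-z])")


def parse_counters(text):
    """Map counter names to integer values from 'show int' style lines."""
    return {name: int(value) for value, name in COUNTER.findall(text)}


def incrementing(before, after):
    """Return the counters that grew between two snapshots, with the delta."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] > before[name]
    }


# First snapshot: the input error lines posted earlier in this thread.
FIRST = """\
35593 input errors, 0 runts, 0 giants
4782 no buffer, 30811 overruns, 0 internal receive errors
"""

# Second snapshot: HYPOTHETICAL values, invented for this example only.
SECOND = """\
35610 input errors, 0 runts, 0 giants
4782 no buffer, 30828 overruns, 0 internal receive errors
"""

growth = incrementing(parse_counters(FIRST), parse_counters(SECOND))
for name, delta in growth.items():
    print(f"{name}: +{delta}")
```

An empty result would mean the errors are historical, while counters that keep growing during peak hours would point to an ongoing problem.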
Let us know if you have any questions.
Thanks,
Noor
Do you have further questions on this post? If so, please do not hesitate to reply. I will be happy to help in any way I can.
Levi
I went ahead and flagged "Assumed Answered" on this post to make it more visible and help other members of the community find solutions more easily. If you feel like there is a better answer, feel free to come back to this post and select it with the applicable buttons. If you have any additional information on this that others may benefit from, please come back to this post to provide an update. If you still need assistance, we would be more than happy to continue working with you on this - just let us know in a reply.
Thanks,
Levi