MPLS QoS
This post will look at how QoS works in an MPLS environment. First, the default behaviour of MPLS QoS is shown. Next, I’ll explain and demonstrate the three MPLS QoS DiffServ models - Uniform, Pipe, and Short Pipe. As usual, expect both configuration examples and Wireshark captures. Do not expect fancy QoS policies, as this post’s goal is to reveal the concepts of the technology rather than focus on QoS in itself. I will not discuss how policing, shaping, or queuing works, for example.
This topology is used throughout the post:
Eve-ng file if you want to follow along: EVE-NG - MPLS QoS - zartmann
Default QoS Behaviour in MPLS Networks
Without any policies anywhere, let’s try to send traffic from CE1 to CE2 with a marking of CS6. We can use telnet or SSH as IOS marks this traffic with CS6 by default.
To make it easier to tell the devices apart in the Wireshark captures, let’s have a look at the MAC addresses of CE1 and CE2:
! CE1
CE1#sh int e0/0 | in bia
Hardware is AmdP2, address is aabb.cc00.1000 (bia aabb.cc00.1000)
CE1#
! CE2
CE2#sh int e0/0 | in bia
Hardware is AmdP2, address is aabb.cc00.2000 (bia aabb.cc00.2000)
CE2#
The MAC addresses of the SP routers have this format:
PE3#sh int gi1 | in bia
Hardware is CSR vNIC, address is 5000.0003.0000 (bia 5000.0003.0000)
PE3#sh int gi2 | in bia
Hardware is CSR vNIC, address is 5000.0003.0001 (bia 5000.0003.0001)
PE3#
So, it can easily be seen where the packet is sent from/to. The first octet after the OUI contains the router number and the last octet the interface identifier where Gi1 == 0 and Gi2 == 1.
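The addressing scheme described above can be decoded mechanically when reading the captures. The following is a minimal sketch that assumes the lab convention shown for the SP routers (fourth octet is the router number, sixth octet is the interface index); the function name `decode_lab_mac` is made up for illustration, and the scheme does not apply to the CE MAC addresses.

```python
def decode_lab_mac(mac: str) -> tuple[int, int]:
    """Return (router_number, interface_index) for an SP router MAC.

    Assumes the Cisco dotted format used in this lab, e.g. 5000.0003.0001.
    """
    digits = mac.replace(".", "")
    if len(digits) != 12:
        raise ValueError("expected a 48-bit MAC address")
    # Split the 12 hex digits into six octets.
    octets = [int(digits[i:i + 2], 16) for i in range(0, 12, 2)]
    # Octet 4 (index 3) is the router number, octet 6 (index 5) the interface.
    return octets[3], octets[5]


print(decode_lab_mac("5000.0003.0001"))  # (3, 1) -> PE3, Gi2
```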
Let’s SSH to CE2 from CE1:
CE1#ssh -l cisco 2.2.2.2
Password:
CE2#exit
[Connection to 2.2.2.2 closed by foreign host]
CE1#
Now, if we look at the markings of the SSH traffic as it leaves CE1, we do see the CS6 in the IP header:
And as the traffic is received on CE2:
So, with no policies anywhere customers keep their markings end-to-end.
What about the markings within the SP network? For the same traffic, if we look at the ingress PE (PE3) as traffic is sent into the MPLS core:
As this is an L3VPN, we impose a label stack of two labels: the topmost label for transport to the egress PE (PE6), and the bottom (VPN) label for identifying the VRF, outgoing interface, and next hop for the destination. Both labels contain the EXP 6 value.
So, when we impose labels, the default behaviour is to copy the IP Precedence or three most significant (left most) bits of the DiffServ markings from IP into MPLS EXP in ALL imposed labels.
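As a small illustration of this default mapping, the sketch below shifts a 6-bit DSCP value right by three bits to get the EXP value and applies it to every imposed label. The helper names `dscp_to_exp` and `impose` are hypothetical, and the label values 18 and 21 are simply the ones that appear in the outputs later in this post.

```python
def dscp_to_exp(dscp: int) -> int:
    """Return the three most significant bits of a 6-bit DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp >> 3


def impose(dscp: int, label_values: list[int]) -> list[tuple[int, int]]:
    """Model default imposition: every imposed label gets the same EXP."""
    exp = dscp_to_exp(dscp)
    return [(value, exp) for value in label_values]


CS6 = 0b110000  # DSCP 48

print(dscp_to_exp(CS6))       # 6
print(impose(CS6, [18, 21]))  # [(18, 6), (21, 6)]
```

Note that any DSCP value sharing the same three leading bits maps to the same EXP value, so the mapping is lossy.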
If we change the EXP value of the topmost label at imposition on PE3, what happens at the PHP router where we have implicit null (we pop the top label)? To demonstrate this a simple policy in PE3 is created:
! PE3 change EXP of topmost label:
policy-map qos-to-core
class class-default
set mpls experimental topmost 3
!
int gi1
service-policy output qos-to-core
! Verification:
PE3#sh policy-map interface gi1 out
GigabitEthernet1
Service-policy output: qos-to-core
Class-map: class-default (match-any)
3 packets, 224 bytes
5 minute offered rate 0000 bps, drop rate 0000 bps
Match: any
QoS Set
mpls experimental topmost 3
Marker statistics: Disabled
PE3#
Now, let’s SSH and look at the traffic on PE3 as it leaves Gi1 towards the core:
We see our policy working, as the top label now has EXP 3 for the SSH traffic as it is sent to the core.
And on our PHP router, P5, what happens when traffic is sent to the egress PE (PE6) on Gi2? Let’s look:
The top label is removed and the bottom (VPN label) is left as is, meaning it carries the IP mapped value in the EXP field - EXP 6.
If we want to keep the changed EXP value when traffic is sent to the egress PE, we need to do a label swap and not a pop. To do this we can use the explicit null label instead of the implicit null. This is signaled via LDP by the egress PE (PE6):
! PE6
mpls ldp explicit-null
! Verification on P5 (PHP)
P5#sh mpls forwarding-table 6.6.6.6 32
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
18 explicit-n 6.6.6.6/32 0 Gi2 10.0.56.6
P5#
Now when we SSH to CE2 from CE1, what happens at the PHP router, P5?
With the explicit null label, the markings of our top label are carried all the way to the egress PE.
Behaviours Learned
With the above scenarios we now know this about QoS for MPLS Networks:
- Without policies, the customer’s markings get copied to all imposed labels
- The customer’s markings in the IP header are left untouched
- At label swap the top label keeps its changed value
- At PHP the top label’s EXP value is not copied down to the VPN label’s EXP value
- If explicit null is used, the PHP router transfers the top label’s EXP value to an exp-null label to keep the markings when sending traffic to the egress PE
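The last two behaviours can be summarised with a toy model of what the penultimate hop does to the label stack. This is only a sketch of the behaviour observed in the captures, not router code; the `Label` class and `penultimate_hop` function are made up for illustration, label values 18 and 21 are taken from the outputs later in this post, and 0 is the IPv4 explicit null label value.

```python
from dataclasses import dataclass

EXPLICIT_NULL = 0  # reserved label value for IPv4 explicit null


@dataclass
class Label:
    value: int
    exp: int


def penultimate_hop(stack: list[Label], explicit_null: bool) -> list[Label]:
    """Return the stack forwarded to the egress PE (stack[0] is the top label)."""
    top, rest = stack[0], stack[1:]
    if explicit_null:
        # Swap: the top label becomes explicit null and keeps its EXP value.
        return [Label(EXPLICIT_NULL, top.exp)] + rest
    # Implicit null (PHP): the top label is popped and its EXP value is lost.
    return rest


# Top label re-marked to EXP 3 on the ingress PE, VPN label still EXP 6.
stack = [Label(18, 3), Label(21, 6)]

print(penultimate_hop(stack, explicit_null=False))
print(penultimate_hop(stack, explicit_null=True))
```

With implicit null only the VPN label with EXP 6 is left, while with explicit null the EXP 3 marking survives in the new top label.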
Uniform Mode
This mode is typically for customers that have their own MPLS network. The essence of this mode is that on the egress PE we copy down the MPLS EXP value to the IP header before sending traffic to the CE. In other words, we reflect the MPLS core markings to the unlabeled IP packet. This gives us a uniform end-to-end QoS policy which is desired when you own the entire network.
With this model the egress PE’s policy is based on MPLS EXP values.
Let’s look at how this is done:
If we implement the policy shown above, we should see a CS3 marking in the IP header as the packet is sent from PE6 to CE2.
! P4
class-map match-all exp6
match mpls experimental topmost 6
!
policy-map qos-from-pe
class exp6
police 8000 conform-action transmit exceed-action set-mpls-exp-topmost-transmit 3
!
int gi1
service-policy input qos-from-pe
! PE6
mpls ldp explicit-null
!
class-map match-all exp3
match mpls experimental topmost 3
!
policy-map qos-from-core
class exp3
set qos-group 3
!
int gi1
service-policy input qos-from-core
!
class-map match-all qos-group3
match qos-group 3
!
policy-map qos-to-cust
class qos-group3
set ip precedence 3
!
int gi2
service-policy output qos-to-cust
First, P4 has a policer that marks down EXP 6 traffic exceeding 8 kbps to EXP 3. Our egress PE, PE6, signals explicit null so that it receives the changed EXP value of 3 from P5, our PHP router. Now, to be able to propagate the EXP value into the IP packet before it egresses towards the CE2 router, we need a placeholder. This is because PE6 will do its lookup on the received VPN label, dispose of all labels, and forward the unlabeled IP packet out Gi2 towards the CE. If we were to configure an output policy on Gi2 that copies the EXP value to the IP header, we would not be able to match on the EXP value, as no labels exist at that point. So, an internal value or placeholder is needed. This is what the QoS Group does. As traffic is received we match on EXP 3 and set QoS Group 3. As traffic leaves PE6, we match on QoS Group 3 and set IP prec 3. The final result is this:
Now we have a uniform policy where markings in the core get copied down to the IP packet as it is sent towards the CE. With uniform mode we usually just copy the IP markings into the EXP value at label imposition, thereby not changing anything at ingress to the SP network. Of course, it is up to the administrator what is done with the markings.
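To make the role of the QoS Group placeholder concrete, here is a toy model of the two policies on PE6 in uniform mode. It is a sketch that only mirrors the qos-from-core and qos-to-cust policies for EXP 3; the function name `pe6_uniform_egress` and the tuple representation of labels are assumptions made for illustration.

```python
def pe6_uniform_egress(labels: list[tuple[int, int]], ip_precedence: int) -> int:
    """Toy model of PE6 in uniform mode.

    labels: (label_value, exp) tuples as received on Gi1, top label first.
    Returns the IP precedence of the packet sent out Gi2 to CE2.
    """
    # Input policy on Gi1 (qos-from-core): match topmost EXP 3, set qos-group 3.
    qos_group = 3 if labels and labels[0][1] == 3 else None

    # Label disposition: all labels are removed, so EXP can no longer be matched.
    labels = []

    # Output policy on Gi2 (qos-to-cust): match qos-group 3, set ip precedence 3.
    if qos_group == 3:
        ip_precedence = 3
    return ip_precedence


# Exceeding traffic: P4 re-marked the top label to EXP 3, the IP header has prec 6.
print(pe6_uniform_egress([(0, 3), (21, 6)], ip_precedence=6))  # 3
# Conforming traffic: the top label still has EXP 6, the IP header is unchanged.
print(pe6_uniform_egress([(0, 6), (21, 6)], ip_precedence=6))  # 6
```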
Pipe
Pipe mode is the next QoS model we’ll look at. Here the SP doesn’t touch the markings of the customer. Policy is based solely on the SP markings throughout the SP network. Even the policy on the edge towards the customer is based on the EXP bits, like with uniform mode. With pipe mode we enforce policy at ingress on the PE and set the EXP value - typically on all imposed labels. The customer is told how to mark its traffic, and this will then be mapped correctly into the classes offered by the SP.
The key point with pipe mode is that we configure the policy at egress of the SP network based on EXP values. This means that we must ensure that the markings set at ingress are transferred to the egress PE. Again, the exp-null label can be used.
The policy configuration:
! PE3
class-map match-all ip-prec6
match ip precedence 6
!
policy-map qos-from-ce
class ip-prec6
set mpls experimental imposition 3
!
int gi2
service-policy input qos-from-ce
! PE6
mpls ldp explicit-null
!
class-map match-all exp3
match mpls experimental topmost 3
!
policy-map qos-from-core
class exp3
set qos-group 3
!
int gi1
service-policy input qos-from-core
!
class-map match-all qos-group3
match qos-group 3
!
policy-map qos-to-cust
class qos-group3
bandwidth percent 15
!
int gi2
service-policy output qos-to-cust
Here we’re re-marking the customer’s CS6 value to EXP 3 at ingress on PE3. This should give us a label stack where all imposed labels have EXP 3 set. At egress we still have exp-null configured, although this could arguably be the default imp-null, depending on whether we change the topmost label’s EXP value in the core of the SP network. If we do, we will most likely want to carry this change to the egress PE for the policy towards the customer. For demonstration purposes I’ve configured CBWFQ to guarantee 15% bandwidth for this class when traffic is sent to the customer on PE6. Notice that because we remove all labels, the QoS Group is used as a placeholder. Optionally we could also configure a discard class for congestion avoidance.
Let’s have a look at what actually happens in the packets. Starting with looking at what the packet’s EXP markings are when they leave the ingress PE, PE3:
Both labels have EXP 3 as we told the router to impose.
What about on the egress PE? Let’s have a look at the packet as it both enters PE6 and leaves PE6:
Here we have both the explicit null label and the VPN label. Both with EXP 3 as we set on the ingress PE. And when we send the packet towards CE2:
The customer’s marking is left intact, but the QoS policy at egress is based on our EXP marking.
Short Pipe
The last flavour of QoS DiffServ models for MPLS networks we’ll look at is a variant of the pipe model shown above. With short pipe, the SP configures the egress policy based on the customer’s marking, that is, the L3 markings in the IP header. Because of this, we no longer need to carry the top label’s EXP markings to the egress PE, and we can safely disable explicit null and just use the default of implicit null, meaning that we have PHP enabled.
Notice in the above topology that now we only have a VPN label when sending the packets towards the egress PE. Also, the policy is based on the customer’s markings.
! PE3
class-map match-all ip-prec6
match ip precedence 6
!
policy-map qos-from-ce
class ip-prec6
set mpls experimental imposition 3
!
int gi2
service-policy input qos-from-ce
! PE6
class-map match-all ip-prec6
match ip precedence 6
!
policy-map qos-to-cust
class ip-prec6
bandwidth percent 5
!
int gi2
service-policy output qos-to-cust
This is a somewhat simpler policy, without the QoS Group. Just to verify that imp-null is in effect, we can have a look at P5:
P5#sh mpls forwarding-table 6.6.6.6 32
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
18 Pop Label 6.6.6.6/32 327 Gi2 10.0.56.6
P5#
And if we look at the packets as they enter PE6, we should only see one label - the VPN label:
Just one label as expected. We can look at PE3 just to see if this was in fact the label imposed as the bottom label:
PE3#sh bgp vpnv4 u all labels | in 2.2.2.2/32|Next
Network Next Hop In label/Out label
2.2.2.2/32 6.6.6.6 nolabel/21
PE3#sh ip cef vrf a 2.2.2.2/32
2.2.2.2/32
nexthop 10.0.34.4 GigabitEthernet1 label 18-(local:18) 21
PE3#
Sure enough, label 21 is the VPN label imposed by PE3.
Explicit Null
If no services are configured and you just label switch IP packets in the global table, for example, you’ll need to configure exp-null on the egress PE if you want to be able to configure policies based on MPLS EXP at the egress. Otherwise, a native IP packet will be received on the egress PE.
Summary
This concludes the QoS DiffServ models for MPLS networks. We saw what the default behaviour of routers is and how the three models can be used to provide QoS policies in an SP environment. The uniform model is usually based on the customer’s markings at ingress, and if markings are changed anywhere in the path, we copy these new values down to the IP packet before sending it to the CE. The pipe model is based on the SP markings. Policy on the egress PE towards the customer is also based on the SP’s EXP marking. The customer’s markings are kept intact for pipe models - including short pipe. With short pipe the policy towards the customer is based on the customer’s markings. Here we can have implicit-null, as we do not need EXP values on the egress PE for policy.
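The difference between the three models at the egress PE can be condensed into a few lines. This is just a sketch for comparison: the function name `egress_pe` and the model strings are made up for illustration, and on a real router each outcome is the result of policies you configure yourself.

```python
def egress_pe(model: str, top_exp: int, ip_prec: int) -> tuple[int, int]:
    """Return (marking the egress policy classifies on, IP precedence sent to the CE)."""
    if model == "uniform":
        # Policy is based on EXP, and EXP is copied down to the IP header.
        return top_exp, top_exp
    if model == "pipe":
        # Policy is based on EXP, and the customer's marking is left intact.
        return top_exp, ip_prec
    if model == "short-pipe":
        # Policy is based on the customer's marking, which is left intact.
        return ip_prec, ip_prec
    raise ValueError(f"unknown model: {model}")


# SP marking EXP 3 on the top label, customer marking IP precedence 6 (CS6).
for model in ("uniform", "pipe", "short-pipe"):
    print(model, egress_pe(model, top_exp=3, ip_prec=6))
```

For this input the results are (3, 3) for uniform, (3, 6) for pipe, and (6, 6) for short pipe.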
Please note that lots of documentation will show that the top label’s EXP value is copied down to the VPN label at PHP. This is not the case. Nothing happens automatically except for the things we saw as default behaviours. You can copy the value of the topmost label down to the VPN label, but a policy will have to be configured on the PHP router for this to happen. And with explicit null configured on the egress PE, such a policy wouldn’t make sense.
Explicit null is required for MPLS switched packets with just one label, as PHP would otherwise remove the only label, thereby only giving the egress PE the possibility to do QoS on the markings in the IP header.
As a final comment regarding DiffServ QoS, all policies are based on an administrator’s configured PHB. Nothing happens magically. So, the three models shown in this post are not implemented features of any platform, meaning it isn’t something you turn on to get a given model as a result. You must manually configure what happens at each hop. The three models are there to coarsely reference which policy you use in your network.