
Accelerating SDN and NFV performance

30 Jan

The benefits of analysis acceleration are well known. But should such appliances be virtualized?

As software-defined networks (SDNs) and network functions virtualization (NFV) gain wider acceptance and market share, the general sentiment is that this shift to a pure software model will bring flexibility and agility unknown in traditional networks. Now, network engineers face the challenge of managing this new configuration and ensuring high performance levels at speeds of 10, 40, or even 100 Gbps.

Creating a bridge between the networks of today and the software-based models of the future, virtualization-aware appliances use analysis acceleration to provide real time insight. That enables event-driven automation of policy decisions and real time reaction to those events, allowing the full agility and flexibility of SDN and NFV to unfold.

Issues managing SDN, NFV

Given the considerable investment already made in operations support systems (OSS)/business support systems (BSS) and infrastructure, managing SDN and NFV proves a challenge for most telecom carriers. Such management must now be adapted not only to SDN and NFV, but also to Ethernet and IP networks.

Most installed OSS/BSS systems have as their foundation the Fault, Configuration, Accounting, Performance and Security (FCAPS) management model, first introduced by the ITU-T in 1996. This concept was simplified in the Enhanced Telecom Operations Map (eTOM) to Fulfillment, Assurance, and Billing (FAB). Management systems tend to focus on one of these areas and often do so in relation to a specific part of the network or technology, such as optical access fault management.

The FCAPS and FAB models were founded on traditional voice-centric networks based on PDH and SDH. These were static, engineered, centrally controlled and planned networks where the protocols involved provided rich management information, making centralized management possible.

Still, there have been attempts to inject Ethernet and IP into these management concepts. For example, call detail records (CDRs) have been used for billing voice services, so the natural extension of this concept is to use IP detail records (IPDRs) for billing of IP services. xDRs are typically collected in 15-minute intervals, which are sufficient for billing. In most cases, that doesn’t need to be real time. However, xDRs are also used by other management systems and programs as a source of information to make decisions.

The problem here is that since traditional telecom networks are centrally controlled and engineered, they don’t change in a 15-minute interval. However, Ethernet and IP networks are completely different. Ethernet and IP are dynamic and bursty by nature. Because the network makes autonomous routing decisions, traffic patterns on a given connection can change from one IP packet or Ethernet frame to the next. Considering that Ethernet frames in a 100-Gbps network can be transmitted with as little as 6.7 nsec between each frame, we can begin to understand the significant distinction when working with a packet network.
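The 6.7-nsec figure quoted above can be checked with back-of-the-envelope arithmetic, assuming a worst-case stream of minimum-size 64-byte frames, each preceded by the 8-byte preamble and followed by the 12-byte inter-frame gap:

```python
# Sanity check of the 6.7 ns figure: time on the wire per minimum-size
# Ethernet frame at 100 Gbps. Each frame occupies its 64 bytes plus the
# 8-byte preamble and the 12-byte inter-frame gap.
LINE_RATE_BPS = 100e9          # 100 Gbps
MIN_FRAME_BYTES = 64 + 8 + 12  # frame + preamble + inter-frame gap

frame_time_ns = MIN_FRAME_BYTES * 8 / LINE_RATE_BPS * 1e9
print(f"{frame_time_ns:.2f} ns per frame")  # → 6.72 ns
```

In other words, at 100 Gbps a new minimum-size frame can arrive roughly every 6.7 nsec, which is the window an appliance has to process it.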

Not a lot of management information is provided by Ethernet and IP, either. If a carrier wants to manage a service provided over Ethernet and IP, it needs to collect all the Ethernet frames and IP packets related to that service and reassemble the information to get the full picture. While switches and routers could be used to provide this kind of information, it became obvious that continuous monitoring of traffic in this fashion would affect switching and routing performance. Hence, the introduction of dedicated network appliances that could continuously monitor, collect, and analyze network traffic for management and security purposes.

Network appliances as management tools

Network appliances have become essential for Ethernet and IP, continuously monitoring the network, even at speeds of 100 Gbps, without losing any information. And they provide this capability in real time.

Network appliances must capture and collect all network information for the analysis to be reliable. Network appliances receive data either from a Switched Port Analyzer (SPAN) port on a switch or router that replicates all traffic or from passive taps that provide a copy of network traffic. They then need to precisely timestamp each Ethernet frame to enable accurate determination of events and latency measurements for quality of experience assurance. Network appliances also recognize the encapsulated protocols as well as determine flows of traffic that are associated with the same senders and receivers.
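The flow-identification step can be sketched in a few lines. This is an illustrative toy, not any vendor's API: flows are grouped by the classic 5-tuple, normalized so that both directions of a conversation map to the same flow.

```python
# Illustrative sketch of flow identification: group packets by the 5-tuple
# (src/dst IP, src/dst port, protocol), normalized so A->B and B->A
# land in the same flow. Addresses below are made up.
from collections import Counter

def flow_key(src_ip, dst_ip, src_port, dst_port, proto):
    # Sort the two endpoints so both directions share one key.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (a, b, proto)

packets = [
    ("10.0.0.1", "10.0.0.2", 40000, 80, "TCP"),
    ("10.0.0.2", "10.0.0.1", 80, 40000, "TCP"),   # reply, same flow
    ("10.0.0.1", "10.0.0.3", 40001, 53, "UDP"),
]
flows = Counter(flow_key(*p) for p in packets)
print(len(flows))  # → 2 distinct flows
```

A real accelerator computes a hash of these fields in hardware to load-balance flows across CPU cores; the principle is the same.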

Appliances are broadly used for effective high performance management and security of Ethernet and IP networks. However, the taxonomy of network appliances has grown outside the FCAPS and FAB nomenclature. The first appliances were used for troubleshooting performance and security issues, but appliances have gradually become more proactive, predictive, and preventive in their functionality. The real time capabilities these appliances provide make them essential for effective management of Ethernet and IP networks, so they need to be included in any framework for managing and securing SDN and NFV.

Benefits of analysis acceleration

Commercial off-the-shelf servers with standard network interface cards (NICs) can form the basis for appliances. But they are not designed for continuous capture of large amounts of data and tend to lose packets. For guaranteed data capture and delivery for analysis, hardware acceleration platforms are used, such as analysis accelerators, which are intelligent adapters designed for analysis applications.

Analysis accelerators are designed specifically for analysis and meet the nanosecond-precision requirements for real time monitoring. They’re similar to NICs for communication but differ in that they’re designed specifically for continuous monitoring and analysis of high speed traffic at maximum capacity. Monitoring a 10-Gbps bidirectional connection means the processing of 30 million packets per second. Typically, a NIC is designed for the processing of 5 million packets per second. It’s very rare that a communication session between two parties would require more than this amount of data.
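The 30-million-packets-per-second figure follows from the same minimum-frame arithmetic: a 10-Gbps link monitored in both directions carries 20 Gbps of traffic in the worst case of back-to-back minimum-size frames.

```python
# Where the ~30 million packets/second figure comes from: a 10-Gbps link
# monitored bidirectionally carries 20 Gbps, and the worst case is a
# stream of minimum-size frames (64 B + 8 B preamble + 12 B gap = 84 B).
monitored_bps = 2 * 10e9            # both directions of a 10-Gbps link
bits_per_frame = (64 + 8 + 12) * 8

pps = monitored_bps / bits_per_frame
print(f"{pps / 1e6:.1f} Mpps")      # → 29.8 Mpps, i.e. ~30 million packets/s
```

That is roughly six times the 5 million packets per second a typical NIC is designed to handle.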

Furthermore, analysis accelerators provide extensive functionality for offloading of data pre-processing tasks from the analysis application. This feature ensures that as few server CPU cycles as possible are used on data pre-processing and enables more analysis processing to be performed.

Carriers can assess the performance of the network in real time and gain an overview of application and network use by continuously monitoring the network. The information can also be stored directly to disk, again in real time, as it’s being analyzed. This approach is typically used in troubleshooting to determine what might have caused a performance issue in the network. It’s also used by security systems to detect any previous abnormal behavior.

It’s possible to detect performance degradations and security breaches in real time if these concepts are taken a stage further. The network data that’s captured to disk can be used to build a profile of normal network behavior. By comparing this profile to real time captured information, it’s possible to detect anomalies and raise a flag.
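A toy version of that baseline-comparison idea is sketched below. The metric (packets per second) and the 3-sigma threshold are illustrative assumptions, not a description of any product's algorithm:

```python
# Build a "normal behavior" profile (mean and standard deviation of a
# packets/s series from captured history) and flag live samples that
# deviate by more than 3 sigma. Numbers are made up for illustration.
from statistics import mean, stdev

history = [980, 1010, 995, 1005, 990, 1000, 1015, 985]  # baseline pkts/s
mu, sigma = mean(history), stdev(history)

def is_anomaly(sample, n_sigma=3):
    return abs(sample - mu) > n_sigma * sigma

print(is_anomaly(1002))  # normal rate  → False
print(is_anomaly(4000))  # sudden burst → True
```

Production systems profile many metrics at once (per-flow latency, protocol mix, connection rates), but the compare-against-baseline principle is the same.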

In a policy-driven SDN and NFV network, this kind of capability can be very useful. If performance degradation is flagged, then a policy can automatically take steps to address the issue. If a security breach is detected, then a policy can initiate additional security measures and correlation of data with other security systems. It can also go so far as to use SDN and NFV to reroute traffic around the affected area and potentially block traffic from the sender in question.

By combining the real time capture, capture-to-disk, and anomaly detection of hardware-accelerated network appliances, carriers can maximize SDN and NFV performance through a policy-driven framework.

Requirements, constraints

Network appliances can be used to provide real time insight for management and security in SDN and NFV environments. But a key question remains: Can network appliances be fully virtualized and provide high performance at speeds of 10, 40, or even 100 Gbps?

Because network appliances are already based on standard server hardware with applications designed to run on x86 CPU architectures, they lend themselves very well to virtualization. The issue is performance. Virtual appliances are sufficient for low speed rates and small data volumes but not for high speeds and large data volumes.

Performance at high speed is an issue even for physical-network appliances. That’s why most high performance appliances use analysis acceleration hardware. While analysis acceleration hardware frees CPU cycles for more analysis processing, most network appliances still use all the CPU processing power available to perform their tasks. That means virtualization of appliances can only be performed to a certain extent. If the data rate and amount of data to be processed are low, then a virtual appliance can be used, even on the same server as the clients being monitored.

It must be noted, though, that the CPU processing requirements for the virtual appliance increase once the data rate and volume of data increase. At first, that will mean the virtual appliance will need exclusive access to all the CPU resources available. But even then, it will run into some of the same performance issues as physical-network appliances using standard NIC interfaces with regard to packet loss, precise timestamping capabilities, and efficient load balancing across the multiple CPU cores available.

Network appliances face constraints in the physical world, and virtualization of appliances can’t escape them. These same constraints must be confronted. One way of addressing this issue is to consider the use of physical appliances to monitor and secure virtual networks. Virtualization-aware network appliances can be “service-chained” with virtual clients as part of the service definition. It requires that the appliance identify virtual networks, typically done using VLAN encapsulation today, which is already broadly supported by high performance appliances and analysis acceleration hardware. That enables the appliance to provide its analysis functionality in relation to the specific VLAN and virtual network.

Such an approach can be used to phase in SDN and NFV migration. It’s broadly accepted that there are certain high performance functions in the network that will be difficult to virtualize at this time without performance degradation. A pragmatic solution is an SDN and NFV management and orchestration approach that takes account of physical- and virtual-network elements. That means policy and configuration doesn’t have to concern itself with whether the resource is virtualized or not but can use the same mechanisms to “service-chain” the elements as required.

A mixture of existing and new approaches for management and security will be required due to the introduction of SDN and NFV. They should be deployed under a common framework with common interfaces and topology mechanisms. With this commonality in place, functions can be virtualized when and where it makes sense without affecting the overall framework or processes.

Bridging the gap

SDN and NFV promise network agility and flexibility, but they also bring numerous challenges regarding performance due to the high speeds that networks are beginning to require. It’s crucial to have reliable real time data for management and analytics, which is what network appliances provide. These appliances can be virtualized, but that doesn’t prevent the performance constraints of physical appliances from applying to the virtual versions. Physical and virtual elements must be considered together when managing and orchestrating SDN to ensure that virtualization-aware appliances bridge the gap between current network functions and the up-and-coming software-based model.


Fundamentals of Microwave Radio Communication for IP and TDM

9 Dec


The field of terrestrial microwave communications is experiencing steady technological innovation to accommodate the ever-more-demanding techniques telecom providers and private microwave users employ when deploying microwave radios in their cloud networks.

  • In the beginning of this wireless evolution, the ubiquitous DS1s/E1s and DS3s/E3s crisscrossed networks transporting mainly voice communications, data, and video.
  • With the advent of Carrier Ethernet and IP, new techniques had to be developed to ensure the new Layer 2 radios were up to par with the new wave of traffic requirements, including wideband online-streamed media. These new techniques come in the form of Quality of Service (QoS), Traffic Prioritization, RF Protection and Design, Spectrum Utilization, and Capacity Enhancement.
  • With Carrier Ethernet and IP, network design becomes more demanding and complex in terms of RF, Traffic Engineering, and QoS. However, the propagation concepts remain unchanged from TDM link engineering, while the link throughput of L2 radios doubles, triples, or quadruples by employing enhanced DSP techniques.


Introduction to SCTP and its benefits over TCP and UDP

29 Jun

SCTP (Stream Control Transmission Protocol) was introduced for transporting PSTN signaling messages over IP networks. But thanks to features such as multi-homing and multi-streaming, it became an important part of next generation network technologies, i.e., IMS and LTE.
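One of SCTP's signature features is multi-streaming: several independent message streams inside one association, so a loss on one stream does not stall the others, while TCP's single ordered byte stream suffers head-of-line blocking. The toy simulation below illustrates the difference; it models delivery in plain Python and does not open real SCTP sockets.

```python
# Toy head-of-line-blocking demo: messages are (sequence, stream, payload).
# With TCP semantics one lost segment stalls ALL later data; with SCTP
# multi-streaming, loss stalls only the stream it belongs to.
def deliverable(messages, lost_seq, multi_stream):
    delivered = []
    stalled = set()
    for seq, stream, payload in messages:
        if seq == lost_seq:
            if not multi_stream:      # TCP: everything after the loss waits
                break
            stalled.add(stream)       # SCTP: only this stream waits
            continue
        if multi_stream and stream in stalled:
            continue
        delivered.append(payload)
    return delivered

msgs = [(1, "A", "a1"), (2, "B", "b1"), (3, "A", "a2"), (4, "B", "b2")]
print(deliverable(msgs, lost_seq=2, multi_stream=False))  # TCP:  ['a1']
print(deliverable(msgs, lost_seq=2, multi_stream=True))   # SCTP: ['a1', 'a2']
```

This is exactly why SCTP suits signaling transport: independent transactions (here streams A and B) don't block each other when a packet is lost.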


Easy Solution to AT&T’s GETS Problem

8 May

So I was reading The Switch this morning, and I saw a post from yesterday by Brian Fung about AT&T not being able to prioritize GETS traffic for emergency responders and the government in national security crises, and I thought I would give them a free solution to this problem.  For the full article, here is a link.

For a brief overview: with the current copper lines, the phone companies are able to prioritize calls after a cop/firefighter/FBI agent/President of the United States enters a special code in times of great emergency, when the phone lines are tied up by people checking on loved ones or calling 911 to report an obvious emergency. Think the attack on the Twin Towers, or Hurricane Katrina (examples in the article). Now that AT&T is proposing expanding their VOIP-over-Fibre phone service UVerse to the rest of the US, they are claiming that the nature of the internet means that they cannot prioritize specific VOIP traffic in the case of an emergency. There could only be four possible reasons for this response:

1) Everyone working at AT&T is a moron. Not out of the question

2) They are lying to the US Government in order to shave a few points of margin off their service at the expense of lives. Most likely.

3) They are in fact doing so, but in secret and only for national emergency issues, leaving out first responders. Sneaky.

4) This is 1978 during the DARPA period of the internet. It’s not.

Here is a simple solution that I crafted in the shower for a problem many organizations deal with for VOIP and other IP traffic every day. Took 15 minutes and a bit of drawing. It’s high level, but based on the principles of source/destination IP prioritization.

 ATT Solution Simple

Basically what this says is that a priority phone call user (a firefighter, let’s say) during an emergency can dial a special PIN, which opens up a call to a special VOIP server or is prioritized on the VOIP system in question. That number is a forwarding number similar to the 1-800 long distance forwarders we all know and loathe. When they get that dial tone, they type in their number and it makes their call to their destination, which can be a special line or a regular line. All prioritized IP traffic based on source and destination. Very easy to do. Here are some links using Cisco gear as a baseline.
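The prioritization logic itself fits in a few lines. This is a sketch of the idea, not router code: the server address is an illustrative placeholder (from the TEST-NET-3 documentation range), and a priority queue stands in for the router's queuing discipline.

```python
# Sketch of source/destination-IP prioritization: packets to or from the
# (hypothetical) priority VOIP server are forwarded before everything else.
# heapq pops the lowest priority number first; the counter keeps FIFO order
# within a priority class.
import heapq

PRIORITY_VOIP_SERVER = "203.0.113.10"   # illustrative placeholder address

def enqueue(queue, counter, src, dst, payload):
    prio = 0 if PRIORITY_VOIP_SERVER in (src, dst) else 1
    heapq.heappush(queue, (prio, counter, payload))

q, n = [], 0
for src, dst, data in [
    ("198.51.100.1", "198.51.100.2", "regular call"),
    ("198.51.100.3", "203.0.113.10", "GETS call"),
]:
    enqueue(q, n, src, dst, data)
    n += 1

print(heapq.heappop(q)[2])  # → 'GETS call' is forwarded first
```

On real gear this classification is what policy based routing or a QoS class-map does: match on source/destination address, then mark or queue accordingly.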

This link describes ‘Policy based routing‘ on Cisco IOS devices (routers) that would need to be programmed to prioritize traffic coming to and from the Priority VOIP server.

This link is a little less technical and gives an overview of traffic shaping.

Because UVerse is an IP-based routing platform, it follows the same rules about traffic shaping as other WAN networks. AT&T’s argument is that because all VOIP traffic looks the same to their routers, they cannot differentiate emergency traffic from regular traffic, and so cannot use traffic shaping to relieve congestion. I have shown in less than an hour, with a very crude drawing and a couple of links, that the real obstacle is simply a desire to save money on an emergency destination (which can be a ‘Virtual IP‘ that simply prioritizes routes, or the priorities can be created dynamically, as this paper proves).

Basically AT&T does not want to create a separate server to handle emergency traffic, so is unwilling to utilize this solution (or come up with a better one) in order to save on some equipment costs and the man hours required to update their routers with the prioritized routes.

If you think this wouldn’t work, or have an even better solution, tell me in the comments!


IPv4 and IPv6 dual-stack PPPoE

13 Mar

The lab covers a scenario of adding basic IPv6 access to an existing PPPoE (PPP for IPv4).

PPPoE is established between the CPE (Customer Premises Equipment), which is the PPPoE client, and the PPPoE server, also known as the BNG (Broadband Network Gateway).


Figure1: IPv4 and IPv6 dual-stack PPPoE

The PPPoE server plays the role of the authenticator (local AAA) as well as the authentication and address pool server (figure1). Obviously, a more centralized prefix assignment and authentication architecture (using AAA RADIUS) is more scalable for broadband access scenarios (figure2).

For more information about RADIUS attributes for IPv6 access networks, start from RFC 6911.

Figure2: PPPoE with RADIUS


PPPoE for IPv6 is based on the same PPP model as for PPPoE over IPv4. The main difference in deployment is related to the nature of the routed protocol assignment to CPEs (PPPoE clients).

  • In IPv4 routed mode, each CPE gets its WAN interface IP centrally from the PPPoE server, and it’s up to the customer to deploy an rfc1918 prefix on the local LAN through DHCP.
  • In IPv6, the PPPoE client gets its WAN interface address through SLAAC and a delegated prefix to be used for the LAN segment through DHCPv6.
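The SLAAC mechanism in the IPv6 case can be sketched offline: an address is formed from the advertised /64 prefix plus a modified EUI-64 interface ID derived from the MAC (here the CA00.075C.0008 MAC implied by the client DUID shown later in the lab). Note the lab’s client actually negotiates its interface ID via IPv6CP; this shows the classic EUI-64 derivation for comparison.

```python
# Modified EUI-64 derivation: flip the universal/local bit of the MAC's
# first byte and insert ff:fe in the middle, then OR into the /64 prefix.
# Prefix and MAC are taken from this lab's addressing plan.
import ipaddress

def eui64_address(prefix, mac):
    b = bytearray(int(x, 16) for x in mac.split(":"))
    b[0] ^= 0x02                                   # flip the U/L bit
    iid = bytes(b[:3]) + b"\xff\xfe" + bytes(b[3:])
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(net.network_address) | int.from_bytes(iid, "big"))

print(eui64_address("2001:db8:5ab:10::/64", "ca:00:07:5c:00:08"))
# → 2001:db8:5ab:10:c800:7ff:fe5c:8
```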


Let’s begin with a quick reminder of a basic configuration of PPPoE for IPv4.

PPPoE for IPv4

pppoe-client WAN address assignment

The main steps of a basic PPPoE configuration are:

  • Create a BBAG (BroadBand Access Group).
  • Tie the BBAG to virtual template interface
  • Assign a loopback interface IP (always UP/UP) to the virtual template.
  • Create and assign the address pool (from which client will get their IPs) to the virtual template interface.
  • Create local user credentials.
  • Set the authentication type (chap)
  • Bind the virtual template interface to a physical interface (incoming interface for dial-in).
  • The virtual template interface will be used as a model to generate instances (virtual access interfaces) for each dial-in session.


Figure3: PPPoE server model


ip local pool PPPOE_POOL
bba-group pppoe BBAG
virtual-template 1
interface Virtual-Template1
ip unnumbered Loopback0
ip mtu 1492
peer default ip address pool PPPOE_POOL
ppp authentication chap callin


interface FastEthernet0/0

pppoe enable group BBAG


interface FastEthernet0/1
pppoe enable group global
pppoe-client dial-pool-number 1
interface FastEthernet1/0
ip address
interface Dialer1
mtu 1492
ip address negotiated

encapsulation ppp

dialer pool 1

dialer-group 1

ppp authentication chap callin

ppp chap hostname pppoe-client

ppp chap password 0 cisco

Figure4: PPPoE client model



As mentioned in the beginning, DHCPv4 is deployed at the CPE device to assign rfc1918 addresses to LAN clients, which are then translated, generally using PAT (Port Address Translation), to the IPv4 address assigned to the WAN interface.

You can also configure static NAT or static port-mapping to give public access to internal services.

Address translation

interface Dialer1
ip address negotiated
ip nat outside
interface FastEthernet0/0
ip address
ip nat inside
ip nat inside source list NAT_ACL interface Dialer1 overload

ip access-list standard NAT_ACL

permit any

pppoe-client LAN IPv4 address assignment


ip dhcp excluded-address
ip dhcp pool LAN_POOL
interface FastEthernet0/0
ip address

PPPoE for IPv6

pppoe-client WAN address assignment

All IPv6 prefixes are planned from the 2001:db8::/32 documentation range.


ipv6 local pool PPPOE_POOL6 2001:DB8:5AB:10::/60 64
bba-group pppoe BBAG
virtual-template 1
interface Virtual-Template1
ipv6 address FE80::22 link-local
ipv6 enable
ipv6 nd ra lifetime 21600
ipv6 nd ra interval 4 3
peer default ipv6 pool PPPOE_POOL6

ppp authentication chap callin


interface FastEthernet0/0

pppoe enable group BBAG

IPCP (IPv4) negotiates the IPv4 address to be assigned to the client, whereas IPV6CP negotiates only the interface identifier; the prefix assignment is performed through SLAAC.


interface FastEthernet0/1
pppoe enable group global
pppoe-client dial-pool-number 1
interface Dialer1
mtu 1492
dialer pool 1
dialer-group 1
ipv6 address FE80::10 link-local

ipv6 address autoconfig default

ipv6 enable

ppp authentication chap callin

ppp chap hostname pppoe-client

ppp chap password 0 cisco

The CPE (PPPoE client) is assigned an IPv6 address through SLAAC along with a static default route: ipv6 address autoconfig default

pppoe-client#sh ipv6 interface dialer 1
Dialer1 is up, line protocol is up
IPv6 is enabled, link-local address is FE80::10
No Virtual link-local address(es):

Stateless address autoconfig enabled
Global unicast address(es):

2001:DB8:5AB:10::10, subnet is 2001:DB8:5AB:10::/64 [EUI/CAL/PRE]
valid lifetime 2587443 preferred lifetime 600243

Note from the traffic capture below (figure5) that both IPv6 and IPv4 use the same PPP session (layer2 model, same session ID=0x0006) because the Link Control Protocol is independent of the network layer.

Figure5: Wireshark capture of common PPP layer2 model



pppoe-client LAN IPv6 assignment

The advantage of using DHCPv6 PD (Prefix Delegation) is that the PPPoE server will automatically add a static route to the assigned prefix, very handy!


ipv6 dhcp pool CPE_LAN_DP
prefix-delegation 2001:DB8:5AB:2000::/56
00030001CA00075C0008 lifetime infinite infinite
interface Virtual-Template1

ipv6 dhcp server CPE_LAN_DP

Now the PPPoE client can use the delegated prefix to assign an IPv6 address (::1) to its own interface (fa0/0), leaving the remainder for SLAAC advertisement.

No NAT is needed for the delegated prefixes to be used publicly, so there are no translation states on the PPPoE server. The prefix is directly reachable from outside.
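As a quick check of the numbers: the delegated /56 (2001:DB8:5AB:2000::/56, from the pool above) gives the CPE 256 /64 subnets to spread across its LAN segments, of which the lab uses the first. Python's stdlib ipaddress module can enumerate them:

```python
# Enumerate the /64s contained in the delegated /56 from this lab.
import ipaddress

delegated = ipaddress.IPv6Network("2001:db8:5ab:2000::/56")
lans = list(delegated.subnets(new_prefix=64))

print(len(lans))   # → 256 possible LAN subnets
print(lans[0])     # → 2001:db8:5ab:2000::/64, the one used on fa0/0
```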

For more information about the client ID used for DHCPv6 assignment, please refer to the prior post about DHCPv6.


pppoe-client#sh ipv6 dhcp
This device’s DHCPv6 unique identifier(DUID): 00030001CA00075C0008
interface Dialer1

ipv6 dhcp client pd PREFIX_FROM_ISP
interface FastEthernet0/0
ipv6 address FE80::2000:1 link-local

ipv6 address PREFIX_FROM_ISP ::1/64
ipv6 enable

pppoe-client#sh ipv6 dhcp interface
Dialer1 is in client mode
Prefix State is OPEN
Renew will be sent in 3d11h
Address State is IDLE
List of known servers:
Reachable via address: FE80::22
DUID: 00030001CA011F780008
Preference: 0
Configuration parameters:

IA PD: IA ID 0x00090001, T1 302400, T2 483840

Prefix: 2001:DB8:5AB:2000::/56

preferred lifetime INFINITY, valid lifetime INFINITY

Information refresh time: 0

Prefix name: PREFIX_FROM_ISP

Prefix Rapid-Commit: disabled

Address Rapid-Commit: disabled


Now the customer LAN is assigned globally available IPv6 from the CPE (PPPoE client).

client-LAN#sh ipv6 interface fa0/0
FastEthernet0/0 is up, line protocol is up
IPv6 is enabled, link-local address is FE80::2000:F
No Virtual link-local address(es):

Stateless address autoconfig enabled
Global unicast address(es):

2001:DB8:5AB:2000::2000:F, subnet is 2001:DB8:5AB:2000::/64 [EUI/CAL/PRE]

client-LAN#sh ipv6 route


S ::/0 [2/0]

via FE80::2000:1, FastEthernet0/0

C 2001:DB8:5AB:2000::/64 [0/0]

via FastEthernet0/0, directly connected

L 2001:DB8:5AB:2000::2000:F/128 [0/0]

via FastEthernet0/0, receive

L FF00::/8 [0/0]

via Null0, receive


End-to-end dual-stack connectivity check

client-LAN#ping 2001:DB8:5AB:3::100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:DB8:5AB:3::100, timeout is 2 seconds:
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/45/88 ms
client-LAN#trace 2001:DB8:5AB:3::100
Type escape sequence to abort.
Tracing the route to 2001:DB8:5AB:3::100

1 2001:DB8:5AB:2000::1 28 msec 20 msec 12 msec

2 2001:DB8:5AB:2::FF 44 msec 20 msec 32 msec

3 2001:DB8:5AB:3::100 48 msec 20 msec 24 msec


Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to, timeout is 2 seconds:
Success rate is 100 percent (5/5), round-trip min/avg/max = 52/63/96 ms
Type escape sequence to abort.
Tracing the route to

1 32 msec 44 msec 20 msec

2 56 msec 68 msec 80 msec

3 72 msec 56 msec 116 msec


I assigned PREFIX_FROM_ISP as a locally significant name for the delegated prefix; there is no need to match the name on the DHCPv6 server side.

Finally, the offline lab with all the commands needed for more detailed inspection:





Some thoughts about CDNs, Internet and the immediate future of both

27 Feb


A CDN (Content Delivery Network) is a network overlaid on top of the Internet. Why bother to put another network on top of the Internet? The answer is easy: the Internet of today does not work well for certain things, for instance content services for today’s content types. Any CDN that ever existed was intended just to improve the behaviour of the underlying network in some very specific cases: ‘some services’ (content services, for example) for ‘some users’ (those who pay, or at least those whom someone pays for). CDNs neither want to nor can improve the Internet as a whole.

The Internet is just yet another IP network combined with some basic services, for instance the translation of ‘object names’ into ‘network addresses’ (network names): DNS. The Internet’s ‘service model’ is multi-tenant, collaborative, non-managed, and ‘open’, as opposed to private networks, which are single-owner, built to standards that may vary from one to another, non-collaborative (though they may peer and do business at some points), and managed. It is now accepted that the ‘service model’ of the Internet is not optimal for some things: secure transactions, real time communications, and uninterrupted access to really big objects (coherent sustained flows).

The service model of a network like the Internet, so little managed, so little centralized, with so many ‘open’ contributions, can today guarantee very few things to the end-to-end user, and the more the network grows and interconnects with itself, the fewer good properties it has end to end. It is a paradox, and it relates to the size of complex systems. The basic mechanisms that are good for a network of size X with a connection degree C may not be good for another network 10^6 X in size and/or 100 C in connection degree. Solutions to Internet growth and stability must never compromise its good properties: openness, de-centralisation, multi-tenancy. This growth-and-stability problem is important enough to have several groups working on it: the Future Internet Architecture groups, which exist in the EU, USA, and Asia.

The Internet’s basic tools for service building are a packet service that is non-connection-oriented (UDP), a packet service that is connection-oriented (TCP) and, on top of the latter, a service that is text-query-oriented and stateless (HTTP), where sessions last for just one transaction. A name translation service from object names to network names helps a lot when writing services for the Internet and also allows those applications to keep running even as network addresses change.

For most services/applications, the Internet is an ‘HTTP network’. The spread of NAT and firewalls makes UDP inaccessible to most Internet consumers, and when it comes to TCP, only port 80 is always open; even then, only TCP flows marked with HTTP headers are allowed through many filters. These constraints make today’s Internet a limited place for building services. If you want to reach the maximum possible number of consumers, you have to build your service as an HTTP service.


A decent ‘network’ must be flexible and easy to use. That flexibility includes the ability to find your counterpart when you want to communicate. In the voice network (POTS) we create point-to-point connections. We need to know the other endpoint’s address (phone number), and there is no service inside POTS to discover endpoint addresses, not even a translation service.

In the Internet it was clear from the very beginning that we needed names more meaningful than network addresses. To make the network more palatable to humans, the Internet has been complemented with mechanisms that support ‘meaningful names’. The ‘meaning’ of these names was designed to be a very concrete one: “one name, one network termination”. The semantics applied to these names were borrowed from set theory through the concept of the ‘domain’ (a set of names) with strict inclusion. Name-address pairs are modelled by giving ‘name’ a structure that represents a hierarchy of domains. If a domain includes some other domain, that is expressed by means of a chain of ‘qualifiers’, where a ‘qualifier’ is a string of characters. The way to name a subdomain is to add one more qualifier to the string, and so on and so forth. If two domains do not have any inclusion relationship, then they are necessarily disjoint.

This naming system was originally intended just to identify machines (network terminals), but it can be, and has been, easily extended to identify resources inside machines by adding subdomains. This extension is a powerful tool that offers flexibility in placing objects in the vast space of the network using ‘meaningful names’. It gives us the ability to name machines, files, files that contain other files (folders), and so on. These are all the ‘objects’ that we can place on the Internet for the sake of building services/applications. It is important to realise that only the names that identify machines get translated to network entities (IP addresses). Names that refer to files or ‘resources’ cannot map to IP network entities, and thus it is the responsibility of the service/application to ‘complete’ the meaning of the name.

To implement these semantics on top of the Internet, a ‘names translator’ was built that ended up being called a ‘name server’; the Internet feature is called the Domain Name Service (DNS). A name server is an entity that you can query to resolve a ‘name’ into an IP address. Each name server only ‘maps’ objects placed in a limited portion of the network, and the owner of this area has the responsibility of keeping object names associated with the proper network addresses. DNS gives us just part of the meaning of a name: the part that can be mapped onto the network. The full meaning of an object name is rooted deeply in the service/application in which that object exists. To implement a naming system compatible with DNS domain semantics we can, for instance, use the syntax described in RFC 3986. There we are given the concept of the URI (Uniform Resource Identifier), which is compatible with and encloses the previous concepts of URL (Uniform Resource Locator) and URN (Uniform Resource Name).

For the naming system to be sound and useful, an authority must exist to assign names and manage the 'namespace'. Bearing in mind that the translation process is hierarchical and can be delegated, many interesting intermediation cases are possible that involve cooperation among service owners and between service and network owners. In HTTP the naming system uses URLs: names that help us find a 'resource' inside a machine on the Internet. In the framework that HTTP provides, the resources are files.

What is ‘Content’?

It is not possible to give a non-restrictive definition of 'content' that covers all possible content types for all possible viewpoints. We should agree that 'content' is a piece of information. A file/stream is the technological object that implements 'content' in the HTTP+DNS framework.


We face the problem of optimising the following task: find & recover some content from the Internet.

Observation 1: current names do not have a helpful meaning. URLs (in the HTTP+DNS framework) are 'toponymic' names: they give us an address for a content or machine name. There is nothing in the name that refers to the geographic placement of the content. The name is not 'topographic' (as it would be if, for instance, it contained UTM coordinates). The name is not 'topologic' (it gives no clue about how to get to the content, about the route). In brief: Internet names, URLs, do not have a meaningful structure that could help in optimising the find & recover task.

Observation 2: current translations have no context. DNS (the current implementation) does not record information about the query originator, nor any other context for the query. DNS does not care WHO asks for a name translation, or WHEN, or WHERE, as it is designed for a 1:1 semantic association (one name, one network address), so why would it? We could properly say that DNS, as it is today, has no 'context'. Current DNS is a kind of dictionary.
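To make the contrast concrete, here is a sketch of what a context-aware resolver could look like (hypothetical names, regions, and addresses): the same name resolves differently depending on who asks.

```python
# Hypothetical context-aware resolution: unlike plain DNS, the answer
# depends on the requester's region, not only on the name.

REPLICAS = {
    "cdn.example.com": {"eu": "203.0.113.10", "us": "198.51.100.20"},
}

def resolve_with_context(name, requester_region):
    by_region = REPLICAS.get(name)
    if by_region is None:
        return None
    # Fall back to any replica if the region has no dedicated copy.
    return by_region.get(requester_region) or next(iter(by_region.values()))
```

A European requester and a US requester asking for the same name would be steered to different replicas, which is exactly the kind of behaviour a plain 1:1 dictionary cannot express.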

Observation 3: there is a diversity of content distribution problems. The content distribution problem is usually not a 1-to-1 transmission; it is usually 1-to-many. For one content 'C' at any given time 'T' there are 'N' consumers, with N>>1 most of the time. The keys to quality are delay and integrity (time coherence is a result of delay). Audio-visual content can be consumed in batch or as a stream. A 'live' content can only be consumed as a stream, and it is very important that latency (the time shift T=t1-t0 between an event that happens at t0 and the time t1 at which that event is perceived by the consumer) is as low as possible. A pre-recorded content is consumed 'on demand' (VoD, for instance).

It is important to notice that the 'content distribution problems' for live and recorded content are different, and different again for files and for streams.

A live transmission gives all consumers the same experience simultaneously (broadcast/multicast), but it cannot benefit from networks with storage, as store-and-forward techniques increase delay. It is also impossible to pre-position the content in many places in the network to avoid long-distance transmission, as the content does not exist before consumption time.

An on-demand service cannot be a shared experience. If it is a stream, there is a different stream per consumer. Nevertheless, an on-demand transmission may benefit from store-and-forward networks: it is possible to pre-position the same title in many places across the network to avoid long-distance transmission. This technique in turn impacts the 'naming problem': how will the network know which copy is best for a given consumer?

We soon realise that the content distribution problem is affected by (at least): the geographic position of the content, the geographic position of the consumer, and the network topology.


-to distribute live content, the best network is a broadcast network with low latency: classical radio & TV broadcasting and satellite are optimal options. It is not possible to do 'better' with a switched, routed network such as an IP network. The point is: IP networks simply do NOT do well with one-to-many services. It takes incredible effort for a switched network to carry a broadcast/multicast flow compared to a truly shared medium like radio.

-to distribute on-demand content, the best network is a network with intermediate storage. In those networks a single content must be transformed into M 'instances' that will be stored in many places throughout the network. For the content title 'C', the function 'F' that assigns a concrete instance 'Cn' to a concrete query 'Ric' is the key to optimising content delivery. This function 'F' is commonly referred to as 'request mapping' or 'request routing'.

The Internet + HTTP servers + DNS offer both storage and naming. (Neither HTTP nor DNS is a must.)

There is no 'normalised' storage service on the Internet, but a bunch of interconnected caches. Most of the caches work together as CDNs. A CDN, for a price, can guarantee that 99% of the consumers of your content will get it properly (low delay + integrity). It makes sense to build CDNs on top of HTTP+DNS; in fact, most CDNs today build 'request routing' as an extension of DNS.

A network with intermediate storage should use the following info to find & retrieve content:

-content name (Identity of content)

-geographic position of requester

-geographic position of all existing copies of that content

-network topology (including dynamic status of network)

-business variables (cost associated to retrieval, requester Identity, quality,…)
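The request-routing function 'F' could combine exactly these inputs. A hedged sketch (the distance model, weights, and data are all made up for illustration):

```python
# Illustrative request routing: score every copy of a content item
# against the requester and return the best one. The inputs mirror the
# list above: positions of copies, position of requester, topology
# status (link_cost) and business variables (business_cost).
import math

def distance(a, b):
    """Crude planar distance between two (lat, lon) positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def best_copy(requester_pos, copies, link_cost, business_cost):
    """copies: {server: position}; link_cost/business_cost: {server: float}."""
    def score(server):
        return (distance(requester_pos, copies[server])
                + link_cost.get(server, 0.0)       # network topology / status
                + business_cost.get(server, 0.0))  # retrieval cost, quality...
    return min(copies, key=score)
```

With a requester near Madrid and copies in Madrid and Tokyo, the Madrid copy wins unless its link or business cost is penalised heavily enough to flip the decision.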

Nowadays there are services (some paid) that give us the geographic position of an IP address: MaxMind, IPinfoDB, etc. Many CDNs leverage these services for request routing.

It seems that there are solutions to geo-positioning, but we still have a naming problem. A CDN must offer a 'standard face' to content requesters. As we have said, content dealers usually host their content in HTTP servers and build URLs based on HTTP+DNS, so CDNs are forced to build an interface to the HTTP+DNS world. On the internal side, today the most relevant CDNs use non-standard mechanisms to interconnect their servers (IP spoofing, DNS extensions, Anycast, ...).


-add context to object queries: identify the requester position through DNS. Today some networks use proprietary versions of 'enhanced DNS' (Google runs one of them). The enhancement is usually implemented by transporting the IP address of the requester in the DNS request and preserving this info across DNS messages so it can be used for DNS resolution. We would prefer to use the geo-position rather than the IP address. This geo-position is available in terminals equipped with GPS, and can also be available in static terminals if an admin provides positioning info when the terminal is started.

-add topological + topographical structure to names: enhance DNS+HTTP. A web server may know its geographic position and build object names based on UTM. An organization may handle domains named after UTM. This kind of solution is plausible because server mobility is 'slow': servers do not need to change position frequently, and their IP addresses could be 'named' in a topographic way. It is more complicated to include topological information in names. This complexity is addressed through successive name-resolution and routing processes that painstakingly give us back the IP addresses in a dynamic way, consuming the efforts of BGP and classical routing (IS-IS, OSPF).

Nevertheless, it is possible to give servers names that could be used collaboratively with the current routing systems. The AS number could be part of the name. It is even possible to increase 'topologic resolution' by introducing a sub-AS number. Currently Autonomous Systems (ASes) are neither subdivided topologically nor linked to any geography, which prevents us from using the AS number as a geo-locator; there are organisations spread over the whole world that have a single AS. Thus the AS number is a political ID, not a geo-ID or a topology-ID. An organisational revolution would be to eradicate overly spread-out and/or overly complex ASes. This goal could be achieved by breaking such ASes into smaller parts, each confined to a delimited geo-area and with a simple topology. Again we would need a sub-AS number. There are mechanisms today that could serve to create a rough implementation of geo-referenced ASes, for instance BGP communities.

-request routing performed mainly by network terminals: /etc/hosts sync. The abovementioned improvements in the structure of names would allow web browsers (or any SW client that recovers content) to do their request routing locally. It could be done entirely in the local machine using a local database of structured names (similar to /etc/hosts), taking advantage of the structure in the names to guess the parts of the mapping not explicitly declared in the local DB. Taking the naming approach to the extreme (super-structured names), the DB would not be necessary: just a set of rules to parse the structure of the name, producing an IP address that identifies the optimal server on which the content carrying that structured name can be found. In practice, any implementation we could imagine will require a DB; the more structured the names, the smaller the DB.
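A sketch of that local routing, assuming a hypothetical structured-name format `content.<AS>.<sub-AS>.cdn.example` in which the name itself carries enough topology to pick a server, plus a small local DB for the exceptions the structure cannot express:

```python
# Client-side request routing over structured names (all names, IPs and
# the name format are invented for illustration).

LOCAL_DB = {  # explicit exceptions, like /etc/hosts entries
    "video1.as65001.s2.cdn.example": "198.51.100.7",
}

AS_MAP = {  # tiny rule table: (AS, sub-AS) -> best server
    ("as65001", "s1"): "192.0.2.10",
    ("as65001", "s2"): "192.0.2.20",
}

def route_locally(name):
    if name in LOCAL_DB:          # explicit mapping wins
        return LOCAL_DB[name]
    parts = name.split(".")
    if len(parts) >= 3:           # parse AS / sub-AS out of the name itself
        return AS_MAP.get((parts[1], parts[2]))
    return None
```

The more the name structure encodes, the fewer entries `LOCAL_DB` needs, which is exactly the trade-off described above.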


It makes sense to think of a CDN that has a proprietary SW client for content recovery, using an efficient naming system that allows 'request routing' to be performed in the client, on the consumer machine, without depending on (unpredictably slow) network services.

Such a CDN would host all content in its own servers, naming objects in a sound way (probably with geographical and topological meaning) so that each consumer with the proper plugin and a minimal local DB can reach the best server in the very first transaction: resolution time is zero! This CDN would rewrite its customers' web pages, replacing names with structured names that are meaningful to the request-routing function. The most dynamic part of the intelligence the plugin requires is a small pre-computed DB that is created centrally and periodically, using all the relevant information to map servers to names; it is updated from the network periodically. The information in this DB includes updated topology info, business policies, and updated lists of servers. It is important to realise that a new naming structure is key to making this approach practical: if the names do not help, the DB will end up being humongous.

Of course this is not so futuristic. Today we have a name cache in the web browser + /etc/hosts + caches in the DNS servers. It is a little subtle to notice what the new scheme improves: it suppresses the first query (and all the first queries after TTL expiration), and there is no influence of TTLs, which are controlled by DNS owners outside the CDN and may even be built into browsers.

This approach may succeed for these reasons:

1-      Not all objects hosted on the Internet are important enough to be indexed in a CDN, and the dynamism of the key routing information is so low that it is feasible to keep all terminals up to date with infrequent sync events.

2-      Today the computing and storage capacity of terminals (even mobile ones) is enough to handle this task, and the time penalty paid is far smaller than in the best possible situation (with the best luck) using collaborative DNS.

3-      Attending to the geographic position of the client, it is possible to download only the part of the map of servers that the client needs to know; it suffices to recover the 'neighbouring' part of the map. In the uncommon case of a chained failure of many neighbouring servers, it is still possible to dynamically download a distant portion of the map.



LTE EPC: Addressing the Mobile Broadband Tidal Wave

24 Feb

The mobile Internet has changed the way people communicate, stay informed, and are entertained. With more compelling services and mobile multimedia computing devices, users are increasingly entering the network and creating an enormous surge in mobile traffic.

To address this new normal, operators must deploy a core network that combines performance with intelligence to meet different traffic demands with an elastic architecture. An intelligent core network allows them to create a robust multimedia environment, enhance and manage the subscriber experience, and monetize network traffic.

Long-Term Evolution (LTE) is the next-generation mobile wireless technology designed to deliver ultrahigh-speed mobile broadband. The primary goals of LTE are increasing bandwidth, improving spectral efficiency, reducing latency, lowering the cost per byte, and enabling improved mobility. This combination aims to enhance a subscriber’s interaction with the network and further accelerate the adoption of mobile multimedia services, such as online television, streaming video, video on demand (VoD), social networking, and interactive gaming.

Radio access solutions are a primary consideration of the LTE deployment strategy, because LTE affects the mobile operators’ most valued asset: spectrum. However, equally important is the multimedia core network.

The Evolved Packet Core: The Next-Generation Packet Core for All Networks

LTE calls for a transition to a “flat”, all-IP core network with open interfaces, called the Evolved Packet Core (EPC). The goal of the EPC is higher throughput, lower latency, simplified mobility between Third-Generation Partnership Project (3GPP) and non-3GPP networks, enhanced service control and provisioning, and efficient use of network resources. Although the EPC has been defined in conjunction with LTE, it is an open next-generation packet core for all networks, including 2.5G, 3G, 4G, non-3GPP, and even fixed networks. In addition, although the EPC represents one of the smallest percentages of overall wireless infrastructure spending, it provides the greatest potential effect on overall network profitability through enablement of new services combined with cost savings from operational efficiencies.

As a result, mobile operators are looking for the best multimedia core solutions to deliver an optimum user experience and migrate to an efficient, intelligent EPC.

Important considerations for the multimedia core network include:

• Support for multiple access network types, including 2.5G, 3G, and 4G; deployment flexibility and network optimization including backhaul

• Smooth and flexible evolution from 2.5G and 3G to 4G

• Massive increase in signaling

• Increased user-plane performance

• Session-state and subscriber management

• Integration of intelligence and policy control at the mobility anchor point

• Security

• Voice-grade reliability

• Reporting, monitoring, accounting, and charging

• Roaming

• Support for multimedia services over the packet switched infrastructure

Cisco is exceptionally well positioned to address these challenges and assist in the migration to an LTE EPC, bringing the products and expertise needed for this evolution.

Cisco ASR 5000 Series Platform

The Cisco® ASR 5000 Series, extended by the Cisco ASR 5500, is elastic; it combines high capacity, high availability, and powerful performance with unparalleled subscriber and network intelligence. Designed for the evolution from 3G to 4G, the Cisco ASR 5000 Series platform is the benchmark for today's and tomorrow's multimedia-enabled core network. The platform uses a simple, flexible distributed architecture that supports multiple access technologies, subscriber mobility management, and call-control capabilities, as well as inline services (Figure 1). With its leading-edge throughput, signaling, and capacity, the Cisco ASR 5000 Series can readily support all EPC network functions.

Figure 1. The Cisco ASR 5000 Series in a Multiaccess Multiservice Environment

EPC Network Functions

The LTE EPC performs a series of network functions that flatten the architecture by minimizing the number of nodes in the network. As a result capital and operational expenditures decrease, thereby trimming the overall cost per megabyte of traffic while improving network performance. Cisco provides the functions defined for the LTE EPC, including the following:

• The Mobility Management Entity (MME) resides in the control plane and manages states (attach, detach, idle, and Radio Access Network [RAN] mobility), authentication, paging, mobility with 3GPP 2.5G and 3G nodes (Serving GPRS Support Node [SGSN]), roaming, and other bearer management functions.

• The Serving Gateway (SGW) sits in the user plane, where it forwards and routes packets to and from the eNodeB and Packet Data Network Gateway (PGW). It also serves as the local mobility anchor for inter-eNodeB handover and roaming between 3GPP systems, including 2.5G and 3G networks.

• The Packet Data Network Gateway (PGW) acts as the interface between the LTE network and packet data networks, such as the Internet or IP Multimedia Subsystem (IMS) networks. It is the mobility anchor point for intra-3GPP and non-3GPP access systems. It also acts as the Policy and Charging Enforcement Function (PCEF) that manages quality of service (QoS), online and offline flow-based charging data generation, deep packet inspection, and lawful intercept.

• The Evolved Packet Data Gateway (ePDG) is the element responsible for interworking between the EPC and untrusted non-3GPP networks, such as a wireless LAN.

• Release 8 Serving GPRS Support Node (SGSN), also known as the S4 SGSN, provides control, mobility, and user-plane support between the existing 2.5G and 3G core and the EPC. It provides the S4 interface that is equivalent to the Gn interface used between the SGSN and the Gateway GPRS Support Node (GGSN).

The Cisco Difference

Cisco multimedia core platforms are built to address the needs of the mobile multimedia core market.

Cisco brings a history of innovative solutions that already meet many of the requirements of the EPC, such as integrated intelligence, simplified network architecture, high-bandwidth performance capabilities, and enhanced mobility.

Therefore, Cisco solutions can support 2.5G and 3G today and, through in-service software upgrades (ISSUs), will support mobile broadband functions as LTE networks are deployed. These platforms can support multiple functions in a single node, allowing a single platform to concurrently act as an MME, Release 8 SGSN and SGW, SGW and PGW, or even as a 2.5G and 3G and LTE EPC node. Mobile operators who want a smooth network migration can maximize the return on their investments and offer an exceptional experience to their customers.

Specific key features include:

Network Flexibility:

• Common platform for all network functions

• Integration and colocation of multiple core functions

• Software architecture that enables service reconfiguration and online upgrades

• Evolution from 3G to LTE

• Single operations, administration, and management (OA&M), policy, and charging integration

Superior Overall Performance:

• High performance across all parameters – signaling, throughput, density, and latency

• Linear scaling of network functions and services

• Support for 2.5G, 3G, and LTE service on any card running anywhere in the system

• Resources distributed across the entire system

Integrated Intelligence with Policy Enforcement:

• Integrated deep packet inspection, service control, and steering

• Value-added inline services

• Integrated policy enforcement with tightly coupled policy and charging

• Support for integrated Session Initiation Protocol (SIP) and IMS functions

• Consolidated accounting and billing

Outstanding Reliability:

• No sessions lost because of any single hardware or software failure

• Automatic recovery of fully established subscriber sessions

• Interchassis session recovery or geographic redundancy

• Network Equipment Building Standards (NEBS) Level 3 certification


Although the deployment of LTE RANs receives considerable attention, the EPC has emerged as critical for delivering next-generation mobile broadband services. As such, mobile operators must look for solutions that can address today’s requirements while positioning them for future technologies.

Cisco is focused on the elastic multimedia core network and the challenges it presents to the mobile operator. We have led the industry with intelligent, high-performance solutions that have changed the packet core environment to a true multimedia core network. We will continue to harness this proven experience and expertise to become your trusted advisor and deliver best-in-class solutions that evolve the mobile operator’s network and help deliver on the promise of true mobile broadband.


Provisioning a New ScaleIO Volume in a VMware Environment

21 Dec

I am going to cover the straightforward act of creating a new volume from a storage pool, mapping it to a ScaleIO Data Client (SDC), and then presenting it to the VMware cluster.


The first step is to ensure we have enough space to configure a new volume of the size we desire. GUI or CLI will suffice:




I've decided to provision a 512 GB volume and, as can be seen from the screenshots above, I have plenty of space. So on to it.

The following command creates a volume:

scli --add_volume --protection_domain_name <protection domain> --storage_pool_name <storage pool> --size <size in GB> --volume_name <name of volume to be created>

You need to know the protection domain, the storage pool, the size of the volume you want and then a friendly name to be given to the volume. Keep it descriptive for easier management later.

I typically SSH into the primary MDM (directly or via the virtual IP, it doesn't matter) and run the command there; this saves me from having to add the MDM IP to the commands each time.


Simple enough, right?

So now we have to map the new volume to one or more SDCs. I want to present it to my entire ESXi cluster, which consists of 4 hosts, so I will have to map it to all four respective SDCs. SCLI provides two options for mapping a volume: you can either map it to an individual SDC or to all of the SDCs at once. The benefit of the latter option is that it saves time, but when it means all SDCs, it means all of them: any new SDC added to the protection domain at a later point in time will automatically be mapped to this volume. So use the "all SDCs" option with care. In this case I am just going to map it to one SDC at a time, so the volume will be restricted to just the ones I want until I choose to manually expand it. And drumroll…the syntax:

scli --map_volume_to_sdc --volume_name <volume name> --sdc_ip <IP address of SDC>

If you do not remember the IPs of your SDCs, run the following:

scli --sdc --query_all_sdc

It will list the IPs of all of your SDCs like so:


Let’s map. For my four SDCs it will take four commands:
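Those four commands differ only in the SDC IP, so they are easy to script. A quick sketch (the volume name and IPs below are placeholders; this only builds the command strings, which you would then run on the primary MDM, e.g. via subprocess.run):

```python
# Build one scli mapping command per SDC in the cluster.
def map_volume_commands(volume, sdc_ips):
    return [
        f"scli --map_volume_to_sdc --volume_name {volume} --sdc_ip {ip}"
        for ip in sdc_ips
    ]

cmds = map_volume_commands("esx_vol01", ["10.0.0.11", "10.0.0.12",
                                         "10.0.0.13", "10.0.0.14"])
for c in cmds:
    print(c)
```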


SDCs will periodically rescan for new mapped volumes, but if you want to force the process there is a way: you can run an executable in the SDC directory to check for newly mapped volumes. From the SDC(s), run the following operation (note that this is not a scli command):

/opt/scaleio/ecs/sdc/bin/drv_cfg --rescan

This will force the detection of any newly mapped volumes. New volumes seen by an SDC will show up in /proc/partitions with the prefix "scini". The rescan, followed by a cat of /proc/partitions, can be seen in the image below, with the new 512 GB volume named scinib.
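If you prefer not to eyeball the cat output, the "scini"-prefixed devices can be picked out programmatically. A sketch (a sample of /proc/partitions output is inlined so it runs anywhere; on a real SDC, read the file instead):

```python
# Pick out ScaleIO ("scini"-prefixed) devices from /proc/partitions.
# On a real SDC: text = open("/proc/partitions").read()
SAMPLE = """\
major minor  #blocks  name
   8        0   83886080 sda
 252        0  536870912 scinia
 252       16  536870912 scinib
"""

def scini_devices(text):
    devs = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) == 4 and fields[3].startswith("scini"):
            devs.append(fields[3])
    return devs
```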


Once the SDC can properly see the new volume, you need to map it to the proper SCSI initiators (iSCSI for VMware). I am going to map it to all four hosts in my cluster. To do this you need your volume name and one of the following:

  • SCSI initiator friendly name
  • SCSI initiator ID

I am going to use my initiator names to map the volumes. If you do not remember any of this, run:

scli --query_all_scsi_initiators

This will get you all the information you need. Well, maybe: in a large environment you might need to compare IQNs to the target hosts if the friendly name isn't specific enough to tell.

Optionally, during the mapping process you can indicate the LUN ID; otherwise the next available one will be used.

scli --map_volume_to_scsi_initiator --volume_name <volume name> --initiator_name <initiator name>


So now all four of my SDCs can access the new volume, and every ESXi host has an iSCSI path to the volume for each SDC. At this point, rescan the ESXi cluster and the volume will appear. Note that the recommended Native Multipathing (NMP) Path Selection Policy (PSP) is Fixed, so only one path will be active at a time; therefore you want to make sure that on each host the preferred iSCSI path is the one to the respective local SDC (the SDC VM running locally on that ESXi host).

So when the device is discovered, go into the configuration of the ScaleIO device and select the proper path. The path can be identified by the IP address of the SDC towards the end of the IQN. In the example below I am configuring the device on the first ESXi server in the cluster, so I selected the IQN containing the IP of the SDC VM running on it as the preferred path. This way I/O stays internal to the ESXi server and doesn't have to traverse the physical network unless the SDC VM becomes unavailable, in which case it will switch to a different, less optimal, path. Well, at least for the first hop: some data is almost certainly hosted on a different SDS, so it will have to cross the physical network between the local SDC and a remote SDS to access those storage segments.



All ready to use! Make sure you also configure the setup for optimal performance; a great post on doing so can be found here:



IPv4 vs IPv6

14 Oct

Every device that connects to the Internet uses a unique address called an IP address, which works much like a home address. Pieces of data called "packets" are transferred via the Internet between machines, which in turn gives us the fully functioning inner workings of the online community. In order for two machines or devices to communicate via the Internet, they must transfer these "packets" of data back and forth. The packets cannot be transferred if the devices do not each have their own unique address.


IPv4 stands for "Internet Protocol version 4". IPv4 is the older, more widely supported version of the Internet addressing scheme. But ultimately there are no longer any free IPv4 addresses; all of them have been allocated.

• Older version of the protocol.

• Supports 32-bit addresses, which translates to 2^32 IP addresses available for assignment (about 4.29 billion total).

• Addresses are 4 bytes in length.

• Variable-size header – time-consuming for routers to handle.

• Addressing is divided into network classes A, B, C (large, medium, small nets); local use is limited to the link only.

• Point-to-point communication and local broadcast (depending on physical link features); limited multicast; experimental anycast (not globally available).

• Header does not identify packet flows for QoS handling by routers.

• Both routers and the sending host fragment packets.

• Header includes a checksum.

• ARP uses broadcast ARP requests to resolve an IP address to a MAC/hardware address.

• Internet Group Management Protocol (IGMP) manages membership in local subnet groups.

• Broadcast addresses are used to send traffic to all nodes on a subnet.

• Addresses are configured either manually or through DHCP.

• Hosts must support a 576-byte packet size (possibly fragmented).



IPv6 stands for "Internet Protocol version 6".

• IPv6 uses 128-bit addresses, giving a maximum of 2^128 available addresses: 340,282,366,920,938,463,463,374,607,431,768,211,456, or roughly three hundred and forty trillion trillion trillion. To use up every single IPv6 address we would need to stack ten billion computers on top of each other over the entire world, including the sea.

• Addresses are 16 bytes in length.

• Fixed-size header (40 octets) – more efficient for routers to process.

• IPv4 compatibility; addressing is hierarchical by registry, provider, subscriber, and subnet, or hierarchical by geographic region; local use by link or site.

• Over 70% of the address space is reserved for future expansion.

• Multicast (sends to many interfaces at once) by link, by site, by organization, or by any grouping.

• Header contains a Flow Label field, which identifies a packet flow for QoS handling by routers.

• Routers do not fragment packets.

• Header does not include a checksum.

• Optional data is supported as extension headers.

• Multicast Neighbor Solicitation messages resolve IP addresses to MAC addresses.

• IPv6 uses a link-local scope all-nodes multicast address instead of broadcast.

• Does not require manual configuration or DHCP. Hosts must support a 1280-byte packet size (without fragmentation).
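The size difference between the two address spaces is easy to check with Python's standard ipaddress module (the addresses below are from the documentation ranges, used purely as examples):

```python
# Compare the IPv4 and IPv6 address spaces with the stdlib ipaddress module.
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

v4_space = 2 ** 32   # about 4.29 billion addresses
v6_space = 2 ** 128  # 340,282,366,920,938,463,463,374,607,431,768,211,456

assert len(v4.packed) == 4    # 4 bytes on the wire
assert len(v6.packed) == 16   # 16 bytes on the wire
assert v6_space // v4_space == 2 ** 96  # IPv6 has 2^96 times as many addresses
```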


