
How will AR and VR transform 5G?

25 Sep

Augmented Reality technology will push the boundaries of connectivity and drive innovation in a 5G world.

Let’s take a glimpse into the future. You’re in your car on your way to a meeting. The car is equipped with an advanced driver assistance system (ADAS). There’s a turn up ahead, but you’re in an unfamiliar neighborhood and the street signs aren’t easy to read.

You aren’t sure if you’re supposed to turn off this main road at this intersection or the one just past it. Fortunately, the ADAS is a step ahead of you: the street information is superimposed into your visual field as you approach the turn. It changes the picture you see as you approach the decision point, clearly showing your navigational path.

Now that you made the turn off the main street and are looking for the building number of your destination, the building information appears in your visual field, even though the actual building number is partially obscured by a tree. The system finds an available parking space nearest to the door you need to enter and navigates you directly to it. You arrive at your meeting on time, having avoided the stress of making a wrong turn, or having to find space in a crowded parking lot.

Augmenting our reality

While the activity this system facilitates is mundane, I’d argue the technology – the augmentation and virtualization of our reality – is exciting. These evolutions have been envisioned in the science fiction realm for more than a quarter century already. William Gibson’s descriptions of “cyberspace” in his 1984 novel Neuromancer are among the earliest known popular references to these virtual worlds. Michael Crichton’s 1994 novel Disclosure (and the film of the same name, released later that year) contains a scene where the main character searches through a secure computer file system via virtual reality (VR) glasses. Long a staple of sci-fi films, VR technology plays a central role in Tron (1982), The Matrix (1999) and several other films. Now in 2019, virtual and augmented reality technology is finally approaching widespread use.

Yet as immersive and captivating as VR is, there are likely to be far more real-world applications for a more “mobile” augmented reality (AR) technology. The fully immersive VR experience is – at least for now – best when confined to a specific space, like a laboratory, classroom or gaming room; that is, VR operates in a “closed world”. On the other hand, AR can begin to more seamlessly integrate with our daily life – the “open world” – because it combines important bits of supplemental visual information with our perception of the physical world around us.

Connectivity is the defining, and sometimes limiting, factor of this augmented reality. Augmenting our reality requires mobility because we are constantly on the go. Connectivity is one of the main reasons why this technology that’s been envisioned for more than 30 years hasn’t yet made it into mainstream use. There’s simply too much data to be encoded, transported and delivered to our devices for existing network technology to handle it at scale. This is why we’re going to need 5G – and future wireless networks – to make AR (and any kind of mobile VR) a practical reality.

How will AR and VR transform 5G?

Some people might ask the question “How will 5G transform AR and VR?” To us at InterDigital, the more interesting angle is to look at its inverse: How will AR and VR transform 5G? We’ll return to our ADAS example to illustrate. But first, a quick look back to AR in 4G.

The most well-known AR applications thus far have been games: especially the wildly popular Pokémon Go and the recent Harry Potter: Wizards Unite. These have been possible in a 4G network environment because – as exciting and compelling as the games are – the data delivered through the game is relatively compact. There are a fixed (and relatively static) number of game scenarios and challenges. The game itself can work within the constraints of 4G in large part because there isn’t much data to deliver, and a lot of the intensive computing can be done in the cloud. Users of these games can tolerate a fair amount of latency, and the user is generally only traveling along at walking speed. Catching a Pokémon or fending off a dementor attack is fun in part because you’re on foot when you do it.

However, most of the background technology will have to change when AR applications reach the level of future ADAS systems. When a car is traveling at freeway speeds, an ADAS will need to serve up both relatively static information – street signs, building numbers, intersection images – and more dynamic data, like notifications about traffic, weather and road conditions, and other travel warnings.

The ADAS of the future will also need to deliver that information from a variety of viewing angles at any given location. Depending on how the application is coded, this kind of ADAS will require a huge amount of computing power and bandwidth. Some of the computing can be done onboard the vehicle, but much of it will need to happen at the network edge, where computing horsepower and energy are more plentiful. Besides, it makes more sense for street maps and visual information to be stored close to the actual device where they will be used. If a car is driving in Los Angeles, there’s no need (and probably insufficient storage anyway) for the Chicago, New York and Seattle maps to be onboard that car. The application data is thus not going to be replicated in every car in exactly the same way.

From top to bottom, the areas 5G will improve include broadband, mobile networks and IoT.

(Image credit: InterDigital)

To understand the bandwidth needs of this future, consider this system alone: multiply the data demands of a single ADAS-equipped car by the many thousands of cars in each city. Not only do we need a bigger pipe for all of that data, we need edge servers to deliver the right data with the lowest possible latency (much more critical in a fast-moving vehicle), and those networks need software-defined slices that provision all of these resources for this particular type of application. A 5G ecosystem is better suited to meet these high-volume, low-latency network needs. 4G – designed primarily for a mobile voice, data and video user – just isn’t built that way.
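To make the scale concrete, here is a back-of-envelope sketch. Both numbers are invented for illustration – nothing in the article specifies a per-car data rate or a vehicle count.

```python
# Back-of-envelope estimate of aggregate AR/ADAS bandwidth in one city.
# Every figure here is an illustrative assumption, not a measured value.

per_car_mbps = 50        # assumed downlink for one vehicle's AR overlay stream
cars_active = 20_000     # assumed ADAS-equipped vehicles active at peak

aggregate_gbps = per_car_mbps * cars_active / 1000
print(f"{aggregate_gbps:.0f} Gbps")   # 1000 Gbps, i.e. about 1 Tbps in one city
```

Even with modest assumptions, one city’s worth of vehicles lands in terabit-per-second territory, which is why edge servers and network slicing matter here.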

The major difference of 5G

A major part of the reason why 5G architecture is so fundamentally different from its predecessors is because of the vast array of new use cases that are coming into focus. Augmented reality is just one of them.

AR technology, enabled by 5G, will by no means be confined to the realm of the automobile or search. Other application domains will include augmented reality tele-medicine, “augmented shopping” (including the virtual fitting of clothing and accessories), augmented reality repair or installation of equipment, augmented reality tourism and travel guides, and many more.

Someday this technology will become so commonplace that we’ll have trouble remembering life before it. And 5G will be the connectivity technology that made that ubiquity possible.


Artificial Intelligence might soon take over architecture and design

17 Aug
AI: Research and Reports

Artificial Intelligence (AI) has always been a topic of debate – is it good for us? Are we walking towards a better future or an inevitable doom? According to an ongoing research program by the McKinsey Global Institute, every occupation includes multiple types of activities, and each has a different potential for automation. Almost all occupations can be partially automated. And so, almost half of all the work done by humans could eventually be taken over by a highly intelligent computer.

According to studies, almost all professions can be automated. Photo credit Marcin Wichary / Wikicommons

AI: Architecture and Its Future

According to the Economist, 47% of the work done by humans will have been replaced by robots by 2037, including jobs traditionally associated with a university education. Having said that, a recent study at University College London (UCL) and Bangor University found that although automation and artificial intelligence will not replace architects for the time being, the discipline will undergo massive transformations in the near future. Computers can take over tedious, repetitive activities, “optimising the production of technical material and allowing, among other things, the atomisation of architectural offices. Each time, fewer architects are needed to develop more complex projects.”

AI can replace a lot of repetitive activities. Photo credit Beaver, Brian/ Wikicommons

AI: A Boon or a Bane?

To create new designs, architects usually draw on past construction, design, and building data. Rather than architects putting their minds together to create something new, it is alleged that a computer will be able to sift through tons of previous data in a millisecond, make recommendations and enhance the architectural design process. With AI, an architect could easily research and test several ideas at the same time, sometimes even without the need for pen and paper. An architect could also pull up city- or zone-specific data, building codes, and redundant design data, and generate design variations. Even on the construction side, it is said that AI can assist with actually building something with little to no manpower. Will this eventually lead to clients and organisations simply turning to a computer for masterplans and construction?
Researchers at Oxford suggest that even with AI coming onto the scene, the essential value of architects as professionals who can understand and evaluate a problem and synthesise unique and insightful solutions will likely remain unchallenged.



7 Jun

To understand SD-LAN, let’s backtrack a bit and look at the architecture and technologies that led to its emergence.

First, what is SDN?

Software-defined networking (SDN) is a new architecture that decouples the network control and forwarding functions, enabling network control to become directly programmable and the underlying infrastructure to be abstracted for applications and network services.

This allows network engineers and administrators to respond quickly to changing business requirements because they can shape traffic from a centralized console without having to touch individual devices. It also delivers services to where they’re needed in the network, without regard to what specific devices a server or other device is connected to.

Functional separation, network virtualization, and automation through programmability are the key technologies.

But SDN has two obvious shortcomings:

  • It’s really about protocols rather than operations, staff, or end-user-visible features, functions, and capabilities.
  • It has relatively little impact at the access layer (intermediary and edge switches and access points, in particular). Yet these are critical elements that define wireless LANs today.

And so, what is SD-WAN?

Like SDN, software-defined WAN (SD-WAN) separates the control and data planes of the WAN and enables a degree of control across multiple WAN elements, physical and virtual, which is otherwise not possible.

However, while SDN is an architecture, SD-WAN is a buyable technology.

Much of the technology that makes up SD-WAN is not new; rather it’s the packaging of it together – aggregation technologies, central management, the ability to dynamically share network bandwidth across connection points.

Its ease of deployment, central manageability, and reduced costs make SD-WAN an attractive option for many businesses, according to Gartner analyst Andrew Lerner, who tracks the SD-WAN market closely. Lerner estimates that an SD-WAN can be up to two and a half times less expensive than a traditional WAN architecture.

So where and how does SD-LAN fit in?

SD-LAN builds on the principles of SDN in the data center and SD-WAN to bring specific benefits of adaptability, flexibility, cost-effectiveness, and scale to wired and wireless access networks.

All of this happens while providing mission-critical business continuity to the network access layer.

Put simply: SD-LAN is an application- and policy-driven architecture that unchains hardware and software layers while creating self-organizing and centrally-managed networks that are simpler to operate, integrate, and scale.

1) Application optimization prioritizes and changes network behavior based on the applications in use

  • Dynamic optimization of the LAN, driven by app priorities
  • Ability to focus network resources where they serve the organization’s most important needs
  • Fine-grained application visibility and control at the network edge

2) Secure, identity-driven access dynamically defines what users, devices, and things can do when they access the SD-LAN.

  • Context-based policy control polices access by user, device, application, location, available bandwidth, or time of day
  • Access can be granted or revoked at a granular level for collections of users, devices and things, or just one of those, on corporate, guest and IoT networks
  • IoT networks increase the chances of security breaches, since many IoT devices, cameras and sensors have limited built-in security. IoT devices need to be uniquely identified on the Wi-Fi network, which is made possible by software-defined private pre-shared keys.
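The context-based policy dimensions listed above can be pictured as a simple rule object. This is a purely hypothetical sketch – the field names and matching function are invented for illustration; real SD-LAN products define their own policy schemas.

```python
# Hypothetical context-based access policy, sketched as plain data.
# All field names here are invented for illustration.

policy = {
    "match": {
        "user_group": "contractors",   # who is connecting
        "device_type": "iot-camera",   # what kind of device
        "network": "corporate",        # which network they are joining
    },
    "action": {
        "allow": True,
        "max_bandwidth_mbps": 5,       # throttle limited-trust devices
        "vlan": 42,                    # segment them from corporate traffic
    },
}

def is_allowed(policy: dict, request: dict) -> bool:
    """Grant access only when every match field equals the request's value."""
    matches = all(request.get(k) == v for k, v in policy["match"].items())
    return matches and policy["action"]["allow"]
```

A device that fails even one match field (say, it joins the guest network instead of the corporate one) would be denied by this rule and fall through to whatever rule comes next.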

3) Adaptive access self-optimizes, self-heals, and self-organizes wireless access points and access switches.

  • Control without the controllers—dynamic control protocols are used to distribute a shared control plane for increased resiliency, scale, and speed
  • Ability to intelligently adapt device coverage and capacity through use of software-definable radios and multiple connection technologies (802.11a/b/g/n/ac Wave 1/Wave 2, MIMO/MU-MIMO, BLE, and extensibility through USB)
  • A unified layer of wireless and wired infrastructure devices, with shared policies and management
  • The removal of hardware dependency, providing seamless introduction of new access points and switches into existing network infrastructure. All hardware platforms should support the same software.

4) Centralized cloud-based network management reduces cost and complexity of network operations with centralized public or private cloud networking.

  • Deployment in public or private cloud with a unified architecture for flexible operations
  • Centralized management for simplified network planning, deployment, and troubleshooting
  • Ability to distribute policy changes quickly and efficiently across geographically distributed locations

5) Open APIs with programmable interfaces allow tight integration of network and application infrastructures.

  • Programmability that enables apps to derive information from the network and enables the network to respond to app requirements.
  • A “big data” cloud architecture to enable insights from users, devices, and things

As you can see, there is a lot that goes into making SD-LAN work. It takes complex technology to solve complex problems, but it allows IT departments to work faster and smarter in the process.


The CORD Project: Unforeseen Efficiencies – A Truly Unified Access Architecture

8 Sep

The CORD Project, according to ON.Lab, is a vision, an architecture and a reference implementation.  It’s also “a concept car” according to Tom Anschutz, distinguished member of tech staff at AT&T.  What you see today is only the beginning of a fundamental evolution of the legacy telecommunication central office (CO).

The Central Office Re-architected as a Datacenter (CORD) initiative is the most significant innovation in the access network since the introduction of ADSL in the 1990’s.  At the recent inaugural CORD Summit, hosted by Google in Sunnyvale, thought leaders at Google, AT&T, and China Unicom stressed the magnitude of the opportunity CORD provides. CO’s aren’t going away.  They are strategically located in nearly every city’s center and “are critical assets for future services,” according to Alan Blackburn, vice president, architecture and planning at AT&T, who spoke at the event.

Service providers often deal with numerous disparate and proprietary solutions – one architecture and infrastructure for each service, often multiplied by two vendors. The end result is a dozen unique, redundant and closed management and operational systems. CORD solves this primary operational challenge, making it a powerful solution that could reduce operational expenditure (OPEX) by as much as 75 percent from today’s levels.

Economics of the data center

Today, central offices are comprised of multiple disparate architectures, each purpose built, proprietary and inflexible.  At a high level there are separate fixed and mobile architectures.  Within the fixed area there are separate architectures for each access topology (e.g., xDSL, GPON, Ethernet, XGS-PON etc.) and for wireless there’s legacy 2G/3G and 4G/LTE.

Each of these infrastructures is separate and proprietary, from the CPE devices to the big CO rack-mounted chassis to the OSS/BSS backend management systems.    Each of these requires a specialized, trained workforce and unique methods and procedures (M&Ps).  This all leads to tremendous redundant and wasteful operational expenses and makes it nearly impossible to add new services without deploying yet another infrastructure.

The CORD Project promises the “Economics of the Data Center” with the “Agility of the Cloud.”  To achieve this, a primary component of CORD is the Leaf-Spine switch fabric.  (See Figure 1)

The Leaf-Spine Architecture

Connected to the leaf switches are racks of “white box” servers.  What’s unique and innovative in CORD are the I/O shelves.  Instead of the traditional data center with two redundant WAN ports connecting it to the rest of the world, in CORD there are two “sides” of I/O.  One, shown on the right in Figure 2, is the Metro Transport (I/O Metro), connecting each Central Office to the larger regional or large city CO.  On the left in the figure is the access network (I/O Access).

To address the access networks of large carriers, CORD has three use cases:

  • R-CORD, or residential CORD, defines the architecture for residential broadband.
  • M-CORD, or mobile CORD, defines the architecture of the RAN and EPC of LTE/5G networks.
  • E-CORD, or Enterprise CORD, defines the architecture of Enterprise services such as E-Line and other Ethernet business services.

There’s also an A-CORD, for Analytics that addresses all three use cases and provides a common analytics framework for a variety of network management and marketing purposes.

Achieving Unified Services

The CORD Project is a vision of the future central office and one can make the leap that a single CORD deployment (racks and bays) could support residential broadband, enterprise services and mobile services.   This is the vision.   Currently regulatory barriers and the global organizational structure of service providers may hinder this unification, yet the goal is worth considering.  One of the keys to each CORD use case, as well as the unified use case, is that of “disaggregation.”  Disaggregation takes monolithic chassis-based systems and distributes the functionality throughout the CORD architecture.

Let’s look at R-CORD and the disaggregation of an OLT (Optical Line Terminal), which is a large chassis system installed in COs to deploy GPON.  GPON (Gigabit Passive Optical Network) is widely deployed for residential broadband and triple-play services.  It delivers 2.5 Gbps downstream and 1.25 Gbps upstream, shared among 32 or 64 homes.  This disaggregated OLT is a key component of R-CORD.  The disaggregation of other systems is analogous.

To simplify, an OLT is a chassis that has the power supplies, fans and a backplane.  The latter is the interconnect technology to send bits and bytes from one card or “blade” to another.   The OLT includes two management blades (for 1+1 redundancy), two or more “uplink” blades (Metro I/O) and the rest of the slots filled up with “line cards” (Access I/O).   In GPON the line cards have multiple GPON Access ports each supporting 32 or 64 homes.  Thus, a single OLT with 1:32 splits can support upwards of 10,000 homes depending on port density (number of ports per blade times the number of blades times 32 homes per port).
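The capacity formula above is easy to check with a couple of lines of arithmetic. The blade and port counts for the chassis OLT are illustrative assumptions; only the 1:32 split (and the 48-port shelf figure used later in the article) come from the text.

```python
# Homes served, per the formula in the text:
# ports per blade x number of line-card blades x homes per port.
# Blade and port counts are illustrative assumptions.

homes_per_port = 32            # 1:32 split, from the text

# Traditional chassis OLT.
ports_per_blade = 16           # assumed
line_card_blades = 20          # assumed
total_homes = ports_per_blade * line_card_blades * homes_per_port
print(total_homes)             # 10240 -- "upwards of 10,000 homes"

# Disaggregated 48-port Access I/O shelf.
shelf_homes = 48 * homes_per_port
print(shelf_homes)             # 1536
```

The contrast between the two results is the economic point: a 1 U shelf serving ~1,500 homes replaces a chassis sized for ~10,000.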

Disaggregation maps the physical OLT to the CORD platform.  The backplane is replaced by the leaf-spine switch fabric. This fabric “interconnects” the disaggregated blades.  The management functions move to ONOS and XOS in the CORD model.   The new Metro I/O and Access I/O blades become an integral part of the innovated CORD architecture as they become the I/O shelves of the CORD platform.

This Access I/O blade is also referred to as the GPON OLT MAC and can support 1,536 homes with a 1:32 split (48 ports times 32 homes/port).   In addition to the 48 ports of access I/O they support 6 or more 40 Gbps Ethernet ports for connections to the leaf switches.

This is only the beginning and by itself has a strong value proposition for CORD within the service providers.  For example, if you have 1,540 homes “all” you have to do is install a 1 U (Rack Unit) shelf.  No longer do you have to install another large chassis traditional OLT that supports 10,000 homes.

The New Access I/O Shelf

The access network is by definition a local network, and localities vary greatly across regions – in many cases on a neighborhood-by-neighborhood basis.  Thus, it’s common for an access network or broadband network operator to have multiple access network architectures.  Most ILECs leveraged their telephone-era twisted-pair copper cables, which connected practically every building in their operating area, to offer some form of DSL service.  Located (maybe) near the OLT in the CO are the racks and bays of DSLAMs/access concentrators and FTTx chassis (fiber to the curb, pedestal, building, remote, etc.).  Keep in mind that each piece of DSL equipment has its own unique management systems, spares, methods and procedures (M&Ps), and so on.

With the CORD architecture, supporting DSL-based services only requires developing a new I/O shelf.  The rest of the system is the same.  Now both your GPON and DSL/FTTx infrastructures “look” like a single system from a management perspective.  You can offer the same service bundles (with obvious limits) to your entire footprint.  After the packets from the home leave the I/O shelf they are just “packets” and can leverage the unified VNFs and backend infrastructures.

At the inaugural CORD Summit (July 29, 2016, in Sunnyvale, CA), the R-CORD working group added G.fast, EPON, XG-PON, XGS-PON and DOCSIS.  (NG-PON2 is supported with optical inside plant.)  Each of these access technologies represents an Access I/O shelf in the CORD architecture.  The rest of the system is the same!

Since CORD is a “concept car,” one can envision even finer granularity.  Driven by Moore’s Law and focused R&D investments, it’s plausible that each of the 48 ports on the I/O shelf could be defined simply by downloading software and connecting the specific Small Form-factor pluggable (SFP) optical transceiver.  This is big.  If an SP wanted to upgrade a port servicing 32 homes from GPON to XGS PON (10 Gbps symmetrical) they could literally download new software and change the SFP and go.  Ideally as well, they could ship a consumer self-installable CPE device and upgrade their services in minutes.  Without a truck roll!

Think of the alternative:  Qualify the XGS-PON OLTs and CPE, Lab Test, Field Test, create new M&P’s and train the workforce and engineer the backend integration which could include yet another isolated management system.   With CORD, you qualify the software/SFP and CPE, the rest of your infrastructure and operations are the same!

This port-by-port granularity also benefits smaller CO’s and smaller SPs.    In large metropolitan CO’s a shelf-by-shelf partitioning (One shelf for GPON, One shelf of xDSL, etc) may be acceptable.  However, for these smaller CO’s and smaller service providers this port-by-port granularity will reduce both CAPEX and OPEX by enabling them to grow capacity to better match growing demand.

CORD can truly change the economics of the central office.  Here, we looked at one aspect of the architecture namely the Access I/O shelf.   With the simplification of both deployment and ongoing operations combined with the rest of the CORD architecture the 75 percent reduction in OPEX is a viable goal for service providers of all sizes.


5G Network Architecture – A High-Level Perspective

27 Jul



  • A Cloud-Native 5G Architecture is Key to Enabling Diversified Service Requirements
  • 5G Will Enrich the Telecommunication Ecosystem
    • The Driving Force Behind Network Architecture Transformation
    • The Service-Driven 5G Architecture
  • End-to-End Network Slicing for Multiple Industries Based on One Physical Infrastructure
  • Reconstructing the RAN with Cloud
    • 1 Multi-Connectivity Is Key to High Speed and Reliability
    • 2 MCE
  • Cloud-Native New Core Architecture
    • 1 Control and User Plane Separation Simplifies the Core Network
    • 2 Flexible Network Components Satisfy Various Service Requirements
    • 3 Unified Database Management
  • Self-Service Agile Operation
  • Conclusion: Cloud-Native Architecture is the Foundation of 5G Innovation

Download: 5G-Nework-Architecture-Whitepaper-en

Parallel Wireless breaks lines with new radio architecture

28 Jan
Parallel Wireless takes wraps off reference femtocell and function-packed gateway product with aim of realigning costs of enterprise wireless.

The US start-up that is trying to reimagine the cost structures of building wireless networks has released details of two new products designed to drive an entirely new cost structure for major enterprise wireless deployments.

Parallel Wireless has announced a reference design (white label) Cellular Access Point femtocell built on an Intel chipset. Alongside the ODM-able femto it has released its upgraded HetNet Gateway Orchestrator – a solution that integrates several network gateway elements (HeNB, FemtoGS, Security GW, ePDG, TWAG), plus SON capability, as Virtual Network Functions on standard Intel hardware, enabled by Intel Open Network Platform Server and DPDK accelerators.

Showing the functions absorbed as VNFs into the HetNet Gateway


The net result, Parallel Wireless claims, is an architecture that can enable much cheaper deployments than current large scale wireless competitors. More cost-stripping comes with the femto reference design which is intended to be extremely low cost to manufacture.

Parallel Wireless price comparison

The company claimed that comparable system costs place it far below the likes of SpiderCloud’s E-RAN, Ericsson’s Radio Dot and Huawei’s LampSite solutions.

The brains of the piece is the HetNet Gateway, which provides X2, Iuh, Iur and S1 interface support, thereby providing unified mobility management across WCDMA, LTE and WiFi access. As an NFV-enabled element it also fits in with MEC architectures and can be deployed at different points in the network, depending on where the operator deems fit.


Parallel Wireless vision of the overall architecture

One challenge for Parallel will be to convince operators that the HetNet Gateway is the element they need in their network to provide the SON, orchestration, X2 brokering and so on of the RAN. Not only is it challenging them to move to an Intel-based virtualised architecture for key gateway and security functions, but also given the “open” nature of NFV, in theory there is no particular need for operators to move to Parallel’s implementation as the host of these VNFs.

Additionally, it’s a major structural change to make just to be able to address the enterprise market, attractive as it is. Of course, you wouldn’t expect Parallel’s ambitions to stop at the enterprise use case – this is likely the company biting off the first chunk of the market it thinks best suits its Intel-based vRAN capabilities.

And Parallel would no doubt also point out that the HNG is not solely integrated with Parallel access points, and could be used to manage other vendors’ equipment, giving operators a multi-vendor, cross-mode control point in the network.

Another challenge for the startup is that it is introducing its concept at a time when the likes of Altiostar, with its virtualised RAN, and Artemis (now in an MoU with Nokia), with its pCell, are introducing new concepts to outdoor radio. Indoors, the likes of SpiderCloud and Airvana (CommScope) market themselves along broadly similar lines. For instance, Airvana already tags its OneCell as providing LTE at the economics of WiFi. Another example: SpiderCloud’s Intel-based services control node is positioned by the vendor as fitting into the virtualised edge vision, and SpiderCloud was a founding member of the ETSI MEC ISG.

In other words, it is going to take some time for all of this to shake out. There can be little doubt, however, that the direction of travel is NFV marching further towards the edge, on standard hardware. Parallel, then, is positioning itself on that road. Can it hitch a ride?

LTE Fundamentals: Channels, Architecture and Call Flow

7 Jan
LTE Overview
LTE/EPC Network Architecture
LTE/EPC Network Elements
LTE/EPC Mobility & Session Management
LTE/EPC Procedure
LTE/EPS overview
Air Interface Protocols

LTE Radio Channels
Transport Channels and Procedure
LTE Physical Channels and Procedure
LTE Radio Resource Management




How to get started with infrastructure and distributed systems

4 Jan
Most of us developers have had experience with web or native applications that run on a single computer, but things are a lot different when you need to build a distributed system that synchronizes dozens, sometimes hundreds, of computers to work together.

I recently received an email from someone asking me how to get started with infrastructure design, and I thought I would share what I wrote him in a blog post, in case it can help more people who want to get started as well.


A basic example: a distributed web crawler

For multiple computers to work together, you need some kind of synchronization mechanism. The most basic ones are databases and queues. Some of your computers are producers or masters, and others are consumers or workers. The producers write data to a database or enqueue jobs in a queue, and the consumers read from the database or queue. The database or queue system runs on a single computer, with some locking, which guarantees that the workers don’t pick up the same work or data to process.
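On a single machine, the same producer/consumer pattern can be sketched with Python’s thread-safe queue standing in for a shared queue service; its built-in locking is what keeps two workers from picking up the same job. The URLs here are placeholders.

```python
# Minimal producer/consumer sketch. queue.Queue provides the locking
# that guarantees no two workers pick up the same item.

import queue
import threading

jobs = queue.Queue()
results = []
lock = threading.Lock()

def worker():
    while True:
        url = jobs.get()
        if url is None:          # sentinel value: no more work
            jobs.task_done()
            break
        with lock:               # protect the shared results list
            results.append(f"processed {url}")
        jobs.task_done()

workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()

# The producer enqueues work; the workers consume it.
for url in (f"http://example.com/page{i}" for i in range(10)):
    jobs.put(url)
for _ in workers:
    jobs.put(None)               # one sentinel per worker

jobs.join()                      # block until every item is processed
for t in workers:
    t.join()
print(len(results))              # 10
```

In a distributed system the in-process queue is replaced by a standalone queue service, but the shape of the code – producers putting, workers getting – stays the same.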

Let’s take an example. Imagine you want to implement a web crawler that downloads web pages along with their images. One possible design for such a system will require the following components:

  • Queue: the queue contains the URLs to be crawled. Processes can add URLs to the queue, and workers can pick up URLs to download from the queue.
  • Crawlers: the crawlers pick URLs from the queue, either web pages or images, and download them. If a URL is a webpage, the crawlers also look for links in the page, and push all those links to the queue for other crawlers to pick them up. The crawlers are at the same time the producers and the consumers.
  • File storage: The file storage stores the web pages and images in an efficient manner.
  • Metadata: a database, either MySQL-like, Redis-like, or any other key-value store, will keep track of which URL has been downloaded already, and if so where it is stored locally.

The queue and the crawlers are their own sub-systems, they communicate with external web servers on the internet, with the metadata database, and with the file storage system. The file storage and metadata database are also their own sub-systems.

Figure 1 below shows how we can put all the sub-systems together to have a basic distributed web crawler. Here is how it works:

1. A crawler gets a URL from the queue.
2. The crawler checks in the database if the URL was already downloaded. If so, just drop it.
3. The crawler enqueues the URLs of all links and images in the page.
4. If the URL was not downloaded recently, get the latest version from the web server.
5. The crawler saves the file to the File Storage system: it talks to a reverse proxy that takes incoming requests and dispatches them to storage nodes.
6. The File Storage distributes load and replicates data across multiple servers.
7. The File Storage updates the metadata database so we know which local file is storing which URL.
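The steps above can be sketched in a few lines of Python. The queue, metadata database, and file storage are stubbed out with in-memory structures, and all the function and variable names are my own, not from the article:

```python
from collections import deque

frontier = deque(["http://example.com/"])  # the queue of URLs to crawl
metadata = {}                              # url -> local path (the metadata DB)
file_storage = {}                          # local path -> page bytes

def fetch(url):
    # Stub for the real HTTP download; returns (body, links found in page).
    return b"<html>...</html>", ["http://example.com/a", "http://example.com/"]

def crawl_once():
    url = frontier.popleft()               # 1. get a URL from the queue
    if url in metadata:                    # 2. already downloaded? drop it
        return
    body, links = fetch(url)               # 4. fetch from the web server
    frontier.extend(links)                 # 3. enqueue the links in the page
    path = "store/%d" % len(file_storage)  # 5-6. save via the file storage
    file_storage[path] = body
    metadata[url] = path                   # 7. record url -> local file

while frontier:
    crawl_once()

print(len(metadata), "pages stored")
```

The point of the sketch is the shape of the loop, not the stubs: each of the in-memory structures becomes its own networked sub-system in the distributed version, and the loop body becomes the code running inside each crawler.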


Figure 1: Architecture of a basic distributed web crawler

The advantage of a design like the one above is that you can scale each sub-system up independently. For example, if you need to crawl faster, just add more crawlers. Maybe at some point you'll have too many crawlers and you'll need to split the queue into multiple queues. Or maybe you realize you have to store more images than anticipated, so just add a few more storage nodes to your file storage system. If the metadata becomes too much of a centralized point of contention, turn it into distributed storage, using something like Cassandra or Riak. You get the idea.

And what I have presented above is just one way to build a simple crawler. There is no right or wrong way, only what works and what doesn’t work, considering the business requirements.

Talk to people who are doing it

The only way to truly learn how to build a distributed system is to maintain or build one, or to work with someone who has built something big before. But obviously, if the company you're currently working at doesn't have the scale or the need for such a thing, then my advice is pretty useless…

Find groups in your geographic area that talk about NoSQL data storage systems, Big Data systems, etc. In those groups, identify the people who are working on large-scale systems and ask them questions about the problems they have and how they solve them. This is by far the most valuable thing you can do.

Basic concepts

There are a few basic concepts and tools that you need to know about, some sort of alphabet of distributed systems that you can later on pick from and combine to build systems:

    • Concepts of distributed systems: read a bit about the basic concepts in the field of distributed systems, such as consensus algorithms, consistent hashing, consistency, availability, and partition tolerance.
    • RDBMS: relational database management systems, such as MySQL or PostgreSQL. RDBMSs are one of the most significant inventions of humankind of the last few decades. They're like Excel spreadsheets on steroids. If you're reading this article I'm assuming you're a programmer and have already worked with relational databases. If not, go read about MySQL or PostgreSQL right away!
    • Queues: queues are the simplest way to distribute work among a cluster of computers. There are specific projects tackling the problem, such as RabbitMQ or ActiveMQ, and sometimes people just use a table in a good old database to implement a queue. Whatever works!
    • Load balancers: if queues are the basic mechanism for a cluster of computers to pull work from a central location, load balancers are the basic tool to push work to a cluster of computers. Take a look at Nginx and HAProxy.
    • Caches: sometimes accessing data from disk or a database is too slow, and you want to cache things in RAM. Look at projects such as Memcached and Redis.
    • Hadoop/HDFS: Hadoop is a very widespread distributed computing and distributed storage system. Knowing the basics of it is important. It is based on the MapReduce system developed at Google, and is documented in the MapReduce paper.
    • Distributed key-value stores: storing data on a single computer is easy. But what happens when a single computer is no longer enough to store all the data? You have to split your storage across two computers or more, and therefore you need mechanisms to distribute the load, replicate data, etc. Some interesting projects doing that are Cassandra and Riak.
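Of the concepts listed above, consistent hashing is the one that benefits most from a concrete example. Here is a toy hash ring in Python; the node names and the number of virtual nodes are arbitrary choices of mine, not anything prescribed:

```python
import bisect
import hashlib

class HashRing:
    """A toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each physical node gets `vnodes` positions on the ring,
        # which smooths out the key distribution.
        self.ring = sorted(
            (self._hash("%s#%d" % (node, i)), node)
            for node in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first virtual node at or past hash(key).
        i = bisect.bisect(self.keys, self._hash(key)) % len(self.keys)
        return self.ring[i][1]

ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in ("user:1", "user:2", "user:3")}

# The selling point: removing a node only remaps the keys that lived on
# it. Keys that were on node-a or node-b stay exactly where they were.
smaller = HashRing(["node-a", "node-b"])
moved = [k for k, n in before.items()
         if n != "node-c" and smaller.node_for(k) != n]
```

With a plain `hash(key) % len(nodes)` scheme, removing one node would remap almost every key; with the ring, `moved` stays empty because the surviving virtual nodes keep their positions.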

Read papers and watch videos

There is a ton of content online about large architectures and distributed systems. Read as much as you can. Sometimes the content can be very academic and full of math: if you don't understand something, no big deal, put it aside, read about something else, and come back to it 2-3 weeks later and read it again. Repeat until you understand; as long as you keep coming back to it without forcing it, you will understand eventually. Some references:

Introductory resources

Real-world systems and practical resources

Theoretical content

Build something on your own

There are plenty of academic courses available online, but nothing replaces actually building something. It is always more interesting to apply the theory to solving real problems, because even though it’s good to know the theory on how to make perfect systems, except for life-critical applications it’s almost never necessary to build perfect systems.

Also, you’ll learn more if you stay away from generic systems and instead focus on domain-specific systems. The more you know about the domain of the problem to solve, the more you are able to bend requirements to produce systems that are maybe not perfect, but that are simpler, and which deliver correct results within an acceptable confidence interval. For example for storage systems, most business requirements don’t need to have perfect synchronization of data across replica servers, and in most cases, business requirements are loose enough that you can get away with 1-2%, and sometimes even more, of erroneous data. Academic classes online will only teach you about how to build systems that are perfect, but that are impractical to work with.

It’s easy to bring up a dozen of servers on DigitalOcean or Amazon Web Services. At the time I’m writing this article, the smallest instance on DigitalOcean is $0.17 per day. Yes, 17 cents per day for a server. So you can bring up a cluster of 15 servers for a weekend to play with, and that will cost you only $5.

Build whatever random thing you want to learn from, use queuing systems, NoSQL systems, caching systems, etc. Make it process lots of data, and learn from your mistakes. For example, things that come to my mind:

      • Build a system that crawls photos from a bunch of websites like the one I described above, and then have another system to create thumbnails for those images. Think about the implications of adding new thumbnail sizes and having to reprocess all images for that, having to re-crawl or having to keep the data up-to-date, having to serve the thumbnails to customers, etc.
      • Build a system that gathers metrics from various servers on the network. Metrics such as CPU activity, RAM usage, disk utilization, or any other random business-related metrics. Try using TCP and UDP, try using load balancers, etc.
      • Build a system that shards and replicates data across multiple computers. For example, your complete dataset is A, B, and C, and it's split across three servers: A1, B1, and C1. Then, to deal with server failure, you want to replicate the data and have exact copies of those servers in A2, B2, C2 and A3, B3, C3. Think about the failure scenarios: how you would replicate data, how you would keep the copies in sync, etc.
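As a starting point for the shard-and-replicate exercise in the last bullet, here is a minimal in-memory sketch. The A/B/C layout mirrors the example above; the hashing choice and all function names are my own:

```python
import hashlib

SHARDS = ["A", "B", "C"]
REPLICAS = 3  # A1..A3, B1..B3, C1..C3 as in the example

def shard_for(key):
    # Hash the key to pick one of the three shards deterministically.
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return SHARDS[h % len(SHARDS)]

def replicas_for(key):
    # All replica servers holding the key's shard, e.g. ["B1", "B2", "B3"].
    shard = shard_for(key)
    return ["%s%d" % (shard, i) for i in range(1, REPLICAS + 1)]

# One dict per server stands in for each storage node.
servers = {"%s%d" % (s, i): {}
           for s in SHARDS for i in range(1, REPLICAS + 1)}

def put(key, value):
    # Write to every replica of the key's shard, so a read can survive
    # the failure of any single server in that replica group.
    for server in replicas_for(key):
        servers[server][key] = value

put("user:42", {"name": "Ada"})
group = replicas_for("user:42")
```

From here, the interesting work is exactly what the bullet asks: what happens when a write reaches only two of the three replicas, and how the copies get back in sync afterwards.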

Look at systems and web applications around you, and try to come up with simplified versions of them:

      • How would you store the map tiles for Google Maps?
      • How would you store the emails for Gmail?
      • How would you process images for Instagram?
      • How would you store the shopping cart for Amazon?
      • How would you connect drivers and users for Uber?

Once you’ve build such systems, you have to think about what solutions you need to deploy new versions of your systems to production, how to gather metrics about the inner-workings and health of your systems, what type of monitoring and alerting you need, how you can run capacity tests so you can plan enough servers to survive request peaks and DDoS, etc. But those are totally different stories!

I hope that this article helped explain how you can get started with infrastructure design and distributed systems. If you have any other resources you want to share, or if you have questions, just drop a comment below!


Next Generation Telecommunication Payload Based On Photonic Technologies

4 Jul


This study investigated and identified the benefits of applying photonic technologies to the channelization section of a telecom payload (P/L). A set of units has been selected for further development towards the definition of a Photonic Payload In-Orbit Demonstrator (2PIOD). The study had the following objectives:

1. To define a set of payload requirements for future satellite TLC missions. These requirements and the relevant P/L architecture have been used in the project as reference payloads ("TN1: Payload Requirements for future Satellite Telecommunication Missions").

2. To review the relevant photonic technologies for signal processing and communications on board telecommunication satellites, and to identify novel approaches to photonic digital communication and processing for use in space scenarios for future satellite communications missions ("TN2: to review and select Photonic Technologies for the Signal Processing and Communication functions relevant to future Satellite TLC P/L").

3. To define a preliminary design and layouts of innovative digital and analogue payload architectures making use of photonic technologies, to compare the preliminary designs of the photonic payloads with the corresponding conventional implementations, and to outline the benefits that can justify the use of photonic technologies in future satellite communications missions ("TN3: Preliminary Designs of Photonic Payload architecture concepts, Trade off with Electronic Design and Selection of Photonic Payloads to be further investigated").

4. To identify the TRL of the potential photonic technologies and the possible telecommunication payload architectures selected in the previous phase, and to define the roadmap for the development, qualification and flight of photonic items and payloads ("TN4: Photonic Technologies and Payload Architecture Development Roadmap").


The study made it possible to:

  • identify the benefits of migrating from conventional to photonic technology;
  • identify the critical optical components that need a delta-development;
  • identify a photonic payload for an in-orbit demonstrator.

Project Plan

Study Logic of the Project: 


Identify the benefits of applying photonic technologies in TLC P/L.

Define a mission/payload architecture showing a real technical and economic interest in optical technology versus microwave technology.

Establish new design rules for optical/microwave engineering.

Develop hardware with an emerging technology in the space domain.


If optical technology proves to be a disruptive technology compared to microwave technology, a new product family could be developed at EQM level in order to meet the evolving needs of the business segment.


The main benefit expected from applying photonic technologies to TLC P/L architectures is to provide new, flexible payload architecture opportunities with higher performance than conventional implementations. Further benefits are expected in terms of:

  • Payload Mass;
  • Payload Volume;
  • Payload Power Consumption and Dissipation;
  • Data and RF Harness;
  • EMC/EMI and RF isolation issues. 

All these features impact directly on:

  • Payload functionality;
  • Selected platform size;
  • Launcher selection.

In the end, an overall cost reduction in the manufacturing of a payload/satellite is expected.

Current Status (dated: 09 Jun 2014)

The study is complete.


LTE Security: Backhaul to the Future

20 Feb

It’s hard to hit moving targets, but subscribers to 4G and LTE networks need to be assured that their data has better protection than just being part of a high volume, fast-moving flow of traffic. This is a key issue with LTE architectures – the connection between the cell site and the core network is not inherently secure.

Operators have previously not had to consider the need for secure backhaul.

2G and 3G services use TDM and ATM backhaul, which proved relatively safe against external attacks.  What’s more, 3rd Generation Partnership Project (3GPP) based 2G and 3G services provide inbuilt encryption from the subscriber’s handset to the radio network controller.  But in LTE networks, while traffic may be encrypted from the device to the cell site (eNB), the backhaul from the eNB to the IP core is unencrypted, leaving the traffic (and the backhaul network) vulnerable to attack and interception.

This security problem is compounded by the rapid, widespread deployment of microcell base stations that provide extra call and data capacity in public spaces, such as shopping centres and shared office complexes.  Analyst firm Heavy Reading expects that the global number of cellular sites will grow by around 50% by the end of 2015, to approximately 4 million.  Many of these new sites will be micro and small cells, driven by the demand to deliver extra bandwidth to subscribers at lower cost.

Microcell security matters

These small base stations placed in publicly-accessible areas typically only have a minimum of physical security when compared to a conventional base station.  This creates the risk of malicious parties tampering with small cell sites to exploit the all-IP LTE network environment, to probe for weaknesses from which to gain access to other nodes, and stage an attack on the mobile core network.  These attacks could involve access to end-user data traffic, denial-of-service on the mobile network, and more.

Furthermore, operators are starting to experience pressure to deliver strong security for subscribers’ data, because of competitive pressure from rivals and the need to assure both current and future customers that their mobile traffic is fully protected against interception and theft.

As a result, backhaul from the eNB to the mobile core and mobile management entity (MME) needs securing, to protect both unencrypted traffic and the operator’s core network.  Especially when the backhaul network is provided by a third party, is shared with another operator or provider, or uses an Internet connection – which are all common scenarios for MNOs looking to deploy backhaul with the lowest overall cost of deployment and ownership.  While these types of backhaul network deliver lower costs, they also reduce the overall trustworthiness of the network.  So how should MNOs protect backhaul infrastructure against security risks, to boost subscriber trust and protect data and revenues?

Tunnel vision

To mitigate the risks of attack on backhaul networks, and to protect the S1 interface between the eNB and mobile core, 3GPP recommends using IPsec to enable authentication and encryption of IP traffic, and firewalling at both eNB and on the operator’s mobile core.  The 3GPP-recommended model involves IPsec tunnels being initiated at the cell site, carrying both bearer and signalling traffic across the backhaul network and being decrypted in the core network by a security gateway.  IPsec is already used in femtocell, IWLAN (TTG) and UMA/GAN deployments, and a majority of infrastructure vendors support the use of IPsec tunnels in their eNB solutions.
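To make the 3GPP model concrete, here is a hypothetical sketch of what such a tunnel might look like in strongSwan's `ipsec.conf` format on the eNB side. All identities, addresses, subnets, and cipher choices below are illustrative assumptions, not values from any standard or product:

```conf
# Hypothetical IKEv2/IPsec tunnel from an eNB to the core security
# gateway (SEG), initiated at the cell site as 3GPP recommends.
conn s1-backhaul
    keyexchange=ikev2
    authby=pubkey                      # certificate-based mutual auth
    left=%defaultroute                 # the eNB side of the tunnel
    leftid=@enb-0451.ran.operator.example
    leftcert=enb-0451.pem
    leftsubnet=10.45.1.0/24            # carries S1-U and S1-MME traffic
    right=segw.core.operator.example   # security gateway at the core edge
    rightsubnet=10.0.0.0/8
    ike=aes256-sha256-modp2048!
    esp=aes256-sha256!
    auto=start                         # bring the tunnel up at boot
```

In a real deployment the certificates, subnets, and cipher suites would come from the operator's security policy, and the gateway end would terminate many such tunnels, one (or more) per eNB.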

However, while IPsec is the standard approach to security recommended by 3GPP, there are common concerns about its deployment, based on factors such as the operator’s market position and customer profile;  the cost and complexities of deployment; and how IPsec deployment might impact on overall network performance.

MNOs need to be confident that their IPsec deployments are highly scalable, and offer high availability to cater for the expected explosive growth in LTE traffic and bandwidth demands.  This in turn means using security solutions that offer true carrier-grade throughput capabilities as well as compliance with latest 3GPP security standards, while being flexible enough to adapt to the operator’s needs as they evolve.  At the same time, the IPsec solution should be as cost-effective as possible, to minimise impact on budgets.

Scalable security

To address these concerns, the IPsec security solution should run on commercial off-the-shelf platforms embedded in virtualized hypervisors.  This avoids the costs and complexity of having to aggregate backhaul traffic to a central network point, or complementing existing solutions with additional hardware, while also enabling rapid deployment and easier management.  A virtualized solution also gives excellent scalability to support operators’ future needs.

In terms of network performance, the solution should also support both single and multiple IPsec tunnels from the eNBs to the network core, which enables the use of flexible QoS network optimisation based on specific criteria such as the tunnel ID or service used – while making the security transparent to the subscriber.  This also enables the operator to offer dedicated IPsec tunnels to different customer groups – such as public safety users – to segregate different types of sensitive traffic from each other.

Using a flexible security platform that offers advanced IPsec capability and supports other advanced security applications, MNOs can protect their subscribers’ data and the network core against the risks of interception and attack, and easily manage the security deployment.  This in turn helps them to secure their subscribers’ data, loyalty and ongoing revenues.

Clavister has a range of backhaul security solutions you can see here.

