Thread Network

21 May

Thread is not a new standard, but rather a combination of existing open standards from the IEEE and IETF that defines a uniform, interoperable wireless network stack enabling communication between devices of different manufacturers. Thread uses the IPv6 protocol as well as the energy-efficient IEEE 802.15.4 PHY/MAC wireless standard.

Use of the IPv6 standard allows components in a Thread network to be easily connected to existing IT infrastructure. The Thread network stack covers the layers from the physical layer up to the transport layer. UDP serves as the transport layer, on top of which various application layers such as CoAP or MQTT-SN can be used; UDP also supports proprietary application layers such as Nest Weave. The layers used by most applications, and those that service the network infrastructure, are defined uniformly by Thread, while application layers are implemented according to end-user requirements.
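As a concrete illustration of the stack just described, here is a minimal sketch of an application-layer CoAP request over UDP using the Python aiocoap library; the IPv6 address and resource path are hypothetical placeholders, not anything mandated by Thread.

```python
import asyncio
from aiocoap import Context, Message, GET

async def main():
    # CoAP runs over UDP, matching the Thread transport layer described above.
    ctx = await Context.create_client_context()
    # Hypothetical mesh-local IPv6 address and resource path of a Thread node.
    request = Message(code=GET, uri="coap://[fd00::212:4b00:1]/sensors/temperature")
    response = await ctx.request(request).response
    print("Result:", response.code, response.payload.decode())

asyncio.run(main())
```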

Two security mechanisms are used within the Thread network layers: MAC-layer encryption and Datagram Transport Layer Security (DTLS). MAC-layer encryption protects traffic content above the PHY/MAC layers. DTLS is implemented in conjunction with UDP and encrypts application data, but not packet data from the lower layers (IPv6). Thread also enables mesh network topologies: routing algorithms ensure that messages within a network reach the target node using IPv6 addressing, and when a single node fails, Thread changes the network topology to preserve network integrity. Thread additionally supports multiple parallel Ethernet or wireless networks established via Border Routers, which ensures reliability through network redundancy. Its mesh topology and support for inexpensive nodes make Thread well suited to home automation.

The following image shows a possible setup of such a topology. Rectangular boxes represent Border Routers such as the phyGATE-AM335 (alternatively the phyGATE-i.MX7 or phyGATE-K64) or the phySTICK. The two Border Routers in the image establish the connection to the IT infrastructure via Ethernet or WiFi. The pentagon icons represent nodes, such as phyWAVEs and phyNODEs, that are addressable and can relay messages within the Thread mesh network. Nodes depicted by circles, which can also be phyWAVEs and phyNODEs, can be configured for low power and can operate for an extended time on a single battery.

Source: http://www.phytec.eu/products/internet-of-things/

IoT: New Paradigm for Connected Government

9 May

The Internet of Things (IoT) is a continuously connected network of embedded objects/devices with unique identifiers, communicating via standard protocols and without any human intervention. It provides encryption, authorization and identification over device protocols such as MQTT, STOMP or AMQP to securely move data from one network to another. In connected government, IoT helps deliver better citizen services, provides transparency, improves employee productivity and yields cost savings. It helps deliver contextual, personalized service to citizens, enhances security and improves quality of life. With secure and accessible information, government business becomes more efficient and data-driven, changing the lives of citizens for the better. An IoT-focused connected-government solution helps in rapidly developing preventive and predictive analytics, in optimizing business processes, and in prebuilding integrations across multiple departmental applications. In summary, IoT opens up new opportunities for government to share information, innovate, make more informed decisions and extend the scope of machine and human interaction.
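To make the protocol point concrete, here is a minimal sketch of securely publishing a sensor reading over MQTT with TLS, written against the Python paho-mqtt 1.x client API; the broker hostname, topic, credentials and CA path are hypothetical placeholders.

```python
import json
import paho.mqtt.client as mqtt

client = mqtt.Client()
# Enable TLS so the data moves between networks encrypted (hypothetical CA bundle path).
client.tls_set(ca_certs="/etc/ssl/certs/ca-certificates.crt")
client.username_pw_set("agency-sensor-01", "s3cret")    # hypothetical credentials

client.connect("broker.example.gov", 8883)              # 8883 = MQTT over TLS
payload = json.dumps({"sensor": "air-quality-17", "pm25": 12.4})
client.publish("city/environment/air", payload, qos=1)  # at-least-once delivery
client.disconnect()
```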

Introduction
The Internet of Things (IoT) is a seamlessly connected system of embedded sensors/devices that communicate using standard, interoperable protocols without human intervention.

The vision of any Connected Government in the digital era is “To develop connected and intelligent IoT based systems to contribute to government’s economy, improving citizen satisfaction, safe society, environment sustainability, city management and global need.”

IoT has data feeds from various sources such as cameras, weather and environmental sensors, traffic signals, parking zones, and shared video-surveillance services. Processing this data leads to better coordination between government and IoT agencies, and to the development of better services for citizens.

Market research predicts that, by 2020, up to 30 billion devices with unique IP addresses will be connected to the Internet [1]. The “Internet of Everything” is expected to have an economic impact of more than $14 trillion by 2020 [2], and by 2020 the Internet of Things could be powered by a trillion sensors [3]. In 2019, the IoT device market is projected to be double the size of the smartphone, PC, tablet, connected-car and wearable markets combined [4]. By 2020, component costs will have come down to the point that connectivity will become a standard feature, even for processors costing less than $1 [5].

This article articulates the drivers and objectives for connected government using IoT. It also describes various scenarios in which IoT is used across departments in connected government.

IoT Challenges Today
The trend in government seems to be IoT adoption on an agency-by-agency basis, leading to different policies, strategies and standards, and to fragmented analysis and use of data. There are a number of challenges preventing the adoption of IoT in governments. The main challenges are:

  • Complexity: Lack of funding, skills and experience with digital technologies, as well as culture and strategic leadership commitment, are challenges today.
  • Data Management: Government needs to manage huge volumes of data related to departments, citizens, land and GIS. This data needs to be encrypted and secured; maintaining data privacy and integrity is a big challenge.
  • Connectivity: IoT devices require good network connectivity to deliver their data payloads and continuous streams of unstructured data, such as patient medical records, rainfall reports or disaster information. Maintaining continuous network connectivity is a challenge.
  • Security: Moving information back and forth between departments, citizens and third parties in a secure mode is a basic requirement in government, and IoT introduces new risks and vulnerabilities that leave users exposed to various kinds of threats.
  • Interoperability: This requires not only that systems be networked together, but also that the data from each system be interoperable. In the majority of cases, IoT is fragmented and lacks interoperability due to differing OEMs, operating systems, versions, connectors and protocols.
  • Risk and Privacy: Devices sometimes gather and provide personal data without the user’s active participation or approval, and sometimes gather very private information about individuals from indirect interactions, violating privacy policies.
  • Integration: Governments need an integration platform that can connect any application, service, data source or device within the government ecosystem. Building a solution that comprises an integrated “all-in-one” platform providing device connectivity, event analytics and enterprise connectivity is a big challenge.
  • Regulatory and Compliance: Adoption of regulations by IoT agencies is a challenge.
  • Governance: One of the major concerns across government agencies is the lack of a big picture, or an integrated view, of IoT implementation; it has been pushed by various departments in a siloed fashion. Also, government leaders lack a complete understanding of IoT technology and its potential benefits.

IoT: Drivers for Connected Government
IoT can increase value both by collecting better information about how effectively government servants, programs and policies are addressing challenges, and by helping government deliver citizen-centric services based on real-time, situation-specific conditions. The various stakeholders leveraging IoT in connected government are depicted below.

Figure 1: Stakeholders leveraging IoT in connected government

Information Flow in an IoT Scenario
The information flow in government using IoT has five stages (the 5Cs): Collection, Communication, Consolidation, Conclusion and Choice. A minimal code sketch of these stages follows Figure 2 below.

  1. Collection: Sensors/devices collect data on the physical environment, for example measuring air temperature, location, or device status. Sensors passively measure or capture information with no human intervention.
  2. Communication: Devices share the information with other devices or with a centralized platform. Data is seamlessly transmitted among objects or from objects to a central repository.
  3. Consolidation: Information from multiple sources is captured and combined at one point. Data is aggregated as devices communicate with each other, and rules determine the quality and importance of the data.
  4. Conclusion: Analytical tools help detect patterns that signal a need for action, or anomalies that require further investigation.
  5. Choice: Insights derived from analysis either initiate an action or frame a choice for the user. Real time signals make the insights actionable, either presenting choices without emotional bias or directly initiating the action.

Figure 2: IoT Information Flow
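The following is a minimal, self-contained Python sketch of the 5C flow under simplified assumptions: sensor readings are simulated, the “network” is an in-memory list, and the choice is a plain threshold rule. It illustrates the five stages, not any particular government platform.

```python
import random
import statistics

def collect():
    """Collection: a sensor passively measures the environment."""
    return {"sensor": "air-temp-03", "celsius": random.gauss(22, 4)}

def communicate(reading, channel):
    """Communication: transmit the reading to a central repository."""
    channel.append(reading)

def consolidate(channel):
    """Consolidation: aggregate readings from all sources at one point."""
    return [r["celsius"] for r in channel]

def conclude(values):
    """Conclusion: analytics detect patterns that may signal a need for action."""
    return statistics.mean(values)

def choose(mean_temp, threshold=30.0):
    """Choice: the insight either initiates an action or frames a choice."""
    return "activate cooling" if mean_temp > threshold else "no action"

channel = []
for _ in range(10):
    communicate(collect(), channel)
print(choose(conclude(consolidate(channel))))
```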

Role of IoT in Connected Government
The following section highlights the various government domains and typical use cases in the connected government.

Figure 3: IoT Usage in Connected Government

a. Health
IoT-based healthcare applications and systems enhance the traditional technology used today. These devices help increase the accuracy of medical data collected from the large set of devices connected to various applications and systems. They also help gather data that improves the precision of medical care delivered through sophisticated integrated healthcare systems.

IoT devices give direct, 24/7/365 access to the patient in a less intrusive way than other options. IoT-based analytics and automation allow providers to access patient reports prior to the patient’s arrival at the hospital, improving responsiveness in emergency healthcare.

IoT-driven systems are used for continuous monitoring of patient status. These monitoring systems employ sensors to collect physiological information that is analyzed and stored in the cloud, where doctors can access it for further analysis and review. This provides a continuous, automated flow of information and helps improve the quality of care through alerting systems.

A patient’s health data is captured using various sensors, analyzed, and sent to a medical professional for proper remote medical assistance.

b. Education
IoT customizes and enhances education by allowing optimization of all content and forms of delivery. It reduces the costs and labor of education by automating common tasks outside of the actual education process.

IoT technology improves the quality of education, professional development, and facility management. The key areas in which IoT helps are:

  • Student tracking: IoT facilitates the customization of education to give every student access to what they need. Each student can control their experience and participate in instructional design, with performance data primarily shaping that design. This delivers highly effective education while reducing costs.
  • Instructor tracking: IoT provides instructors with easy access to powerful educational tools. Educators can use IoT to act as one-on-one instructors, providing specific instructional designs for each student.
  • Facility monitoring and maintenance: The application of the technology also improves the professional development of educators.
  • Data from other facilities: IoT enhances the knowledge base used to devise educational standards and practices by introducing large, high-quality, real-world datasets into the foundation of educational design.

c. Construction
IoT-enabled devices/sensors are used for automatic monitoring of public-sector buildings, facilities and large infrastructure. They are used for managing energy consumption such as air conditioning and electricity usage; for example, lights or air conditioners left on in empty rooms result in revenue loss.

d. Transport
IoT can be used across transport systems for traffic control, parking and more, providing improved communication, control and data distribution.

IoT-based sensor information obtained from street cameras, motion sensors and officers on patrol is used to evaluate traffic patterns in crowded areas. Commuters are informed of the best possible routes to take, using real-time traffic sensor data, to avoid being stuck in traffic jams.

e. Smart City
IoT simplifies examining various factors such as population growth, zoning, mapping, water supply, transportation patterns, food supply, social services, and land use. It supports cities through its implementation in major services and infrastructure such as transportation and healthcare, and it also manages other areas like water control, waste management, and emergency management. Its real-time, detailed information facilitates prompt decisions in emergency management. IoT can also automate motor vehicle services for testing, permits, and licensing.

f. Power
IoT simplifies the process of energy monitoring and management while maintaining low cost and a high level of precision. IoT-based solutions are used for efficient, smart utilization of energy, for example in smart-grid and smart-meter implementations.

Energy-system reliability is achieved through IoT-based analytics, which helps prevent system overloading or throttling and detects threats to system performance and stability, protecting against losses such as downtime, damaged equipment, and injuries.

g. Agriculture
IoT minimizes human intervention in farming functions, analysis and monitoring. IoT-based systems detect changes to crops, the soil environment and more.

IoT in agriculture contributes to:

  • Crop monitoring: Sensors can be used to monitor crops and the health of plants using the data collected. Sensors can also be used for early detection of pests and disease.
  • Food safety: The entire supply chain, from farm through logistics to retail, is becoming connected. Farm products can be tagged with RFID, increasing customer confidence.
  • Climate monitoring: Sensors can be used to monitor temperature, humidity, light intensity and soil moisture, and these data can be sent to a central system to trigger alerts and automate water, air and crop control (see the sketch after this list).
  • Logistics monitoring: Location-based sensors can be used to track vegetables and other farm products during transport and storage, enhancing scheduling and automating the supply chain.
  • Livestock monitoring: Farm animals can be monitored via sensors to detect potential signs of disease. The data can be analyzed by the central system and relevant information sent to the farmers.
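As a simple illustration of the climate-monitoring item above, here is a minimal Python sketch of a threshold rule that turns soil-moisture readings into an irrigation alert; the sensor-reading function and the threshold are hypothetical, standing in for whatever field hardware and agronomy rules a real deployment would use.

```python
import random

def read_soil_moisture() -> float:
    """Hypothetical sensor read; returns volumetric water content in percent."""
    return random.uniform(5.0, 45.0)

MOISTURE_MIN = 20.0   # hypothetical agronomic threshold (%)

def check_field(n_sensors: int = 8) -> None:
    readings = [read_soil_moisture() for _ in range(n_sensors)]
    average = sum(readings) / len(readings)
    if average < MOISTURE_MIN:
        # In a real system this would publish an alert or open a valve.
        print(f"ALERT: average moisture {average:.1f}% below {MOISTURE_MIN}%: start irrigation")
    else:
        print(f"OK: average moisture {average:.1f}%")

check_field()
```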

Conclusion
There are many opportunities for the government to use the IoT to make government services more efficient. IoT cannot be analyzed or implemented properly without collaborative efforts between Industry, Government and Agencies. Government and Agencies need to work together to build a consistent set of standards that everyone can follow.

On the domain front, Connected Government solutions use IoT in the following ways:

  • Public-safety departments leverage IoT for the protection of citizens, for example using video images and sensors to provide predictive analysis so that government can secure citizens gathering at parades or inaugural events.
  • On the healthcare front, advanced IoT analytics delivers better, more granular care of patients. Real-time access to patients’ reports and monitoring of patients’ health status improve emergency healthcare.
  • In education, IoT helps with content delivery, monitoring of students and faculty, and improving the quality of education and professional development.
  • In the energy sector, IoT allows a variety of energy control and monitoring functions. It simplifies energy monitoring and management while maintaining low cost and a high level of precision, and it helps prevent system overloading while improving system performance and stability.
  • IoT strategy is being utilized in the agricultural industry for productivity, pest control, water conservation and continuous production based on improved technology and methods.

On the technology front:

  • IoT connects billions of devices and sensors to create new and innovative applications. To support these applications, a reliable, elastic and agile platform is essential; cloud computing is one of the enabling platforms for IoT.
  • Connected Government solutions can manage the large number of devices and the volume of data emitted by IoT. This large volume of new information allows new collaboration between government, industry and citizens, and helps in rapidly developing IoT-focused preventive and predictive analytics.
  • IoT optimizes business processes through process automation and prebuilt integrations across multiple departmental applications. This opens up new opportunities for government to share information, innovate, save lives, make more informed decisions, and extend the scope of machine and human interaction.

References

  1. “Gartner Says It’s the Beginning of a New Era: The Digital Industrial Economy.” Gartner.
  2. “Embracing the Internet of Everything to Capture Your Share of $14.4 Trillion.” Cisco.
  3. “With a Trillion Sensors, the Internet of Things Would Be the ‘Biggest Business in the History of Electronics.’” Motherboard.
  4. “The ‘Internet of Things’ Will Be the World’s Most Massive Device Market and Save Companies Billions of Dollars.” Business Insider.
  5. “Facts and Forecasts: Billions of Things, Trillions of Dollars.” Siemens.

Source: http://iotbootcamp.sys-con.com/node/4074527

IoT, encryption, and AI lead top security trends for 2017

28 Apr

The Internet of Things (IoT), encryption, and artificial intelligence (AI) top the list of cybersecurity trends that vendors are trying to help enterprises address, according to a Forrester report released Wednesday.

As more and more breaches hit headlines, CXOs can find a flood of new cybersecurity startups and solutions on the market. More than 600 exhibitors attended RSA 2017—up 56% from 2014, Forrester noted, with a waiting list rumored to be several hundred vendors long. And more than 300 of these companies self-identify as data security solutions, up 50% from just a year ago.

“You realize that finding the optimal security solution for your organization is becoming more and more challenging,” the report stated.

In the report, titled The Top Security Technology Trends To Watch, 2017, Forrester examined the 14 most important cybersecurity trends of 2017, based on the team’s observations from the 2017 RSA Conference. Here are the top five security challenges facing enterprises this year, and advice for how to mitigate them.

1. IoT-specific security products are emerging, but challenges remain

The adoption of consumer and enterprise IoT devices and applications continues to grow, along with concerns that these tools can increase an enterprise’s attack surface, Forrester said. The Mirai botnet attacks of October 2016 raised awareness about the need to protect IoT devices, and many vendors at RSA used this as an example of the threats facing businesses. While a growing number of companies claim to address these threats, the market is still underdeveloped, and IoT security will require people and policies as much as technological solutions, Forrester stated.

“[Security and risk] pros need to be a part of the IoT initiative and extend security processes to encompass these IoT changes,” the report stated. “For tools, seek solutions that can inventory IoT devices and provide full visibility into the network traffic operating in the environment.”

2. Encryption of data in use becomes practical

Encryption of data at rest and in transit has become easier to implement in recent years, and is key for protecting sensitive data generated by IoT devices. However, many security professionals struggle to overcome encryption challenges such as classification and key management.

Enterprises should consider homomorphic encryption, a system that allows you to keep data encrypted as you query, process, and analyze it. Forrester offers the example of a retailer who could use this method to encrypt a customer’s credit card number, and keep it to use for future transactions without fear, because it would never need to be decrypted.
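As a hedged illustration of computing on encrypted data, here is a sketch using the open-source python-paillier library (`phe`), which implements the additively homomorphic Paillier scheme. It is a simplified stand-in for the retailer scenario above, since Paillier supports only addition and scalar multiplication on ciphertexts, not arbitrary computation.

```python
from phe import paillier

# The data owner generates a keypair and encrypts a sensitive value.
public_key, private_key = paillier.generate_paillier_keypair()
encrypted_total = public_key.encrypt(120.50)    # e.g., a running account balance

# A third party can add to the encrypted value without ever decrypting it.
new_charge = 35.25
encrypted_total = encrypted_total + new_charge  # homomorphic addition

# Only the key holder can decrypt the result.
print(private_key.decrypt(encrypted_total))     # 155.75
```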

3. Threat intelligence vendors clarify and target their services

A strong threat intelligence partner can help organizations avoid attacks and adjust security policies to address vulnerabilities. However, it can be difficult to cut through the marketing jargon used by these vendors to determine the value of the solution. At RSA 2017, Forrester noted that vendors are trying to improve their messaging to help customers distinguish between services. For example, companies including Digital Shadows, RiskIQ, and ZeroFOX have embraced the concept of “digital risk monitoring” as a complementary category to the massive “threat intelligence” market.

“This trend of vendors using more targeted, specific messaging to articulate their capabilities and value is in turn helping customers avoid selection frustrations and develop more comprehensive, and less redundant, capabilities,” the report stated. To find the best solution for your enterprise, you can start by developing a cybersecurity strategy based on your vertical, size, maturity, and other factors, so you can better assess what vendors offer and if they can meet your needs.

4. Implicit and behavioral authentication solutions help fight cyberattacks

A recent Forrester survey found that, of firms that experienced at least one breach from an external threat actor, 37% reported that stolen credentials were used as a means of attack. “Using password-based, legacy authentication methods is not only insecure and damaging to the employee experience, but it also places a heavy administrative burden (especially in large organizations) on S&R professionals,” the report stated.

Vendors have responded: Identity and access management (IAM) vendors are incorporating a number of data sources, such as network forensic information, security analytics data, user store logs, and shared hacked-account information, into their IAM policy enforcement solutions. Forrester also found authentication solutions that use signals like device location, sensor data, and mouse and touchscreen movement to determine a normal baseline of behavior for users and devices, which is then used to detect anomalies.
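To make the baseline idea concrete, here is a minimal sketch of behavioral anomaly detection using a z-score over a single behavioral feature (typing-interval timing); real products combine many signals and far more sophisticated models, so treat the feature choice and threshold here as illustrative assumptions.

```python
import statistics

# Hypothetical historical keystroke intervals (seconds) for one user.
baseline = [0.21, 0.19, 0.24, 0.22, 0.20, 0.23, 0.18, 0.22, 0.21, 0.20]
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(observed_interval: float, threshold: float = 3.0) -> bool:
    """Flag a session whose typing rhythm deviates far from the user's baseline."""
    z = abs(observed_interval - mean) / stdev
    return z > threshold

print(is_anomalous(0.22))  # False: consistent with the baseline
print(is_anomalous(0.55))  # True: likely a different actor or a bot
```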

Forrester recommends verifying vendors’ claims about automatic behavioral profile building, and asking the following questions:

  • Does the solution really detect behavioral anomalies?
  • Does the solution provide true interception and policy enforcement features?
  • Does the solution integrate with existing SIM and incident management solutions in the SOC?
  • How does the solution affect employee experience?

5. Algorithm wars heat up

Vendors at RSA 2017 latched onto terms such as machine learning, security analytics, and artificial intelligence (AI) to solve enterprise security problems, Forrester noted. While these areas hold great promise, “current vendor product capabilities in these areas vary greatly,” the report stated. Therefore, it’s imperative for tech leaders to verify that vendor capabilities match their marketing messaging, to make sure that the solution you purchase can actually deliver results, Forrester said.

While machine learning and AI do have roles to play in security, they are not a silver bullet, Forrester noted. Security professionals should focus instead on finding vendors that solve problems you are dealing with, and have referenceable customers in your industry.

Source: http://globalbigdataconference.com/news/140973/iot-encryption-and-ai-lead-top-security-trends-for-2017.html

You Can’t Hack What You Can’t See

1 Apr
A different approach to networking leaves potential intruders in the dark.
Traditional networks consist of layers that increase cyber vulnerabilities. A new approach features a single non-Internet protocol layer that does not stand out to hackers.

A new way of configuring networks eliminates security vulnerabilities that date back to the Internet’s origins. Instead of building multilayered protocols that act like flashing lights to alert hackers to their presence, network managers apply a single layer that is virtually invisible to cybermarauders. The result is a nearly hack-proof network that could bolster security for users fed up with phishing scams and countless other problems.

The digital world of the future has arrived, and citizens expect anytime-anywhere, secure access to services and information. Today’s work force also expects modern, innovative digital tools to perform efficiently and effectively. But companies are neither ready for the coming tsunami of data, nor are they properly armored to defend against cyber attacks.

The amount of data created in the past two years alone has eclipsed the amount of data consumed since the beginning of recorded history. Incredibly, this amount is expected to double every few years. There are more than 7 billion people on the planet and nearly 7 billion devices connected to the Internet. In another few years, given the adoption of the Internet of Things (IoT), there could be 20 billion or more devices connected to the Internet.

And these are conservative estimates. Everyone, everywhere will be connected in some fashion, and many people will have their identities on several different devices. Recently, IoT devices have been hacked and used in distributed denial-of-service (DDoS) attacks against corporations. Coupled with the advent of bring your own device (BYOD) policies, this creates a recipe for widespread disaster.

Internet protocol (IP) networks are, by their nature, vulnerable to hacking. Most if not all of these networks were put together by stacking protocols to solve different elements in the network. This starts with 802.1X at the lowest layer, the IEEE standard for connecting to local area networks (LANs) or wide area networks (WANs). Stacked on top of that is usually something called the Spanning Tree Protocol, designed to eliminate loops on redundant paths in a network; these loops are deadly to a network.

Other layers are added to generate functionality (see The Rise of the IP Network and Its Vulnerabilities). The result is a network constructed on stacks of protocols, and those stacks are replicated throughout every node in the network. Each node passes traffic to the next node before the traffic reaches its destination, which could be 50 nodes away.

This M.O. is the legacy of IP networks. They are complex, have a steep learning curve, take a long time to deploy, are difficult to troubleshoot, lack resilience and are expensive. But there is an alternative.

A better way to build a network is based on a single protocol—an IEEE standard labeled 802.1aq, more commonly known as Shortest Path Bridging (SPB), which was designed to replace the Spanning Tree Protocol. SPB’s real value is its hyperflexibility when building, deploying and managing Ethernet networks. Existing networks do not have to be ripped out to accommodate this new protocol. SPB can be added as an overlay, providing all its inherent benefits in a cost-effective manner.

Some very interesting and powerful effects are associated with SPB. Because it uses what is known as a media-access-control-in-media-access-control (MAC-in-MAC) scheme to communicate, it naturally shields any IP addresses in the network from being sniffed or seen by hackers outside of the network. If the IP address cannot be seen, a hacker has no idea that the network is actually there. Combined with hypersegmentation into as many as 16 million different virtual network services, this makes it almost impossible to hack anything in a meaningful manner. Each network segment only knows which devices belong to it, and there is no way to cross over from one segment to another; for example, a hacker who gained access to an HVAC segment could not also access a credit card segment.

As virtual LANs (VLANs) allow for the design of a single network, SPB enables distributed, interconnected, high-performance enterprise networking infrastructure. Based on a proven routing protocol, SPB combines decades of experience with intermediate system to intermediate system (IS-IS) and Ethernet to deliver more power and scalability than any of its predecessors. Using the IEEE’s next-generation VLAN, called an individual service identifier (I-SID), SPB supports 16 million unique services, compared with the VLAN limit of 4,000. Once SPB is provisioned at the edge, the network core automatically interconnects like I-SID endpoints to create an attached service that leverages all links and equal-cost connections using an enhanced shortest-path algorithm.

Making Ethernet networks easier to use, SPB preserves the plug-and-play nature that established Ethernet as the de facto protocol at Layer 2, just as IP dominates at Layer 3. And, because improving Ethernet enhances IP management, SPB enables more dynamic deployments that are easier to maintain than attempts that tap other technologies.

Implementing SPB obviates the need for the hop-by-hop implementation of legacy systems. If a user needs to communicate with a device at the network edge—perhaps in another state or country—that other device now is only one hop away from any other device in the network. Also, because an SPB system uses IS-IS routing and a MAC-in-MAC scheme, everything can be added instantly at the edge of the network.

This accomplishes two major points. First, adding devices at the edge allows almost anyone to add to the network, rather than turning to highly trained technicians alone. In most cases, a device can be scanned to the network via a bar code before its installation, and a profile authorizing that device to the network also can be set up in advance. Then, once the device has been installed, the network instantly recognizes it and allows it to communicate with other network devices. This implementation is tailor-made for IoT and BYOD environments.

Second, if a device is disconnected or unplugged from the network, its profile evaporates, and it cannot reconnect to the network without an administrator reauthorizing it. This way, the network cannot be compromised by unplugging a device and plugging in another for evil purposes.

SPB has emerged as an unhackable network. Over the past three years, U.S. multinational technology company Avaya has used it for quarterly hackathons, and no one has been able to penetrate the network in those 12 attempts. In this regard, it truly is a stealth network implementation. But it also is a network designed to thrive at the edge, where today’s most relevant data is being created and consumed, capable of scaling as data grows while protecting itself from harm. As billions of devices are added to the Internet, experts may want to rethink the underlying protocol and take a long, hard look at switching to SPB.

Source: http://www.afcea.org/content/?q=you-can%E2%80%99t-hack-what-you-can%E2%80%99t-see

Using R for Scalable Data Analytics

1 Apr

At the recent Strata conference in San Jose, several members of the Microsoft Data Science team presented the tutorial Using R for Scalable Data Analytics: Single Machines to Spark Clusters. The materials are all available online, including the presentation slides and hands-on R scripts. You can follow along with the materials at home, using the Data Science Virtual Machine for Linux, which provides all the necessary components like Spark and Microsoft R Server. (If you don’t already have an Azure account, you can get $200 credit with the Azure free trial.)

The tutorial covers many different techniques for training predictive models at scale and deploying the trained models as predictive engines within production environments. Among the technologies you’ll use are Microsoft R Server running on Spark, the SparkR package, the sparklyr package and H2O (via the rsparkling package). It also touches on some non-Spark methods, like the bigmemory and ff packages for R (and various other packages that make use of them), and using the foreach package for coarse-grained parallel computations. You’ll also learn how to create prediction engines from these trained models using the mrsdeploy package.

(Figure: the mrsdeploy workflow)

The tutorial also includes scripts for comparing the performance of these various techniques, both for training the predictive model:

(Figure: training performance comparison)

and for generating predictions from the trained model:

(Figure: scoring performance comparison)

(The above tests used 4 worker nodes and 1 edge node, all with 16 cores and 112 GB of RAM.)

You can find the tutorial details, including slides and scripts, at the link below.

Strata + Hadoop World 2017, San Jose: Using R for scalable data analytics: From single machines to Hadoop Spark clusters

 

Source: http://blog.revolutionanalytics.com/big-data/

Streaming Big Data: Storm, Spark and Samza

1 Apr

There are a number of distributed computation systems that can process Big Data in real time or near-real time. This article will start with a short description of three Apache frameworks, and attempt to provide a quick, high-level overview of some of their similarities and differences.

Apache Storm

In Storm, you design a graph of real-time computation called a topology and feed it to the cluster, where the master node distributes the code among worker nodes that execute it. In a topology, data is passed around between spouts, which emit data streams as immutable sets of key-value pairs called tuples, and bolts, which transform those streams (count, filter, etc.). Bolts themselves can optionally emit data to other bolts down the processing pipeline.

(Figure: Storm architecture)

Apache Spark

Spark Streaming (an extension of the core Spark API) doesn’t process streams one item at a time like Storm. Instead, it slices them into small batches over time intervals before processing them. The Spark abstraction for a continuous stream of data is called a DStream (for Discretized Stream). A DStream is a sequence of micro-batches, each represented as an RDD (Resilient Distributed Dataset). RDDs are distributed collections that can be operated on in parallel by arbitrary functions and by transformations over a sliding window of data (windowed computations).

(Figure: Spark Streaming architecture)
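As a concrete micro-batching example, here is a minimal word count using the classic PySpark DStream API; it assumes a local Spark installation and a text source on localhost:9999 (for example `nc -lk 9999`), both of which are setup assumptions rather than anything required by Spark itself.

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[2]", "WordCount")
ssc = StreamingContext(sc, 5)          # 5-second micro-batches

# Each batch of lines arrives as one RDD inside the DStream.
lines = ssc.socketTextStream("localhost", 9999)
counts = (lines.flatMap(lambda l: l.split())
               .map(lambda w: (w, 1))
               .reduceByKey(lambda a, b: a + b))
counts.pprint()                        # print each batch's counts

ssc.start()
ssc.awaitTermination()
```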

Apache Samza

Samza’s approach to streaming is to process messages as they are received, one at a time. Samza’s stream primitive is not a tuple or a DStream, but a message. Streams are divided into partitions, and each partition is an ordered sequence of read-only messages, each with a unique ID (offset). The system also supports batching, i.e. consuming several messages from the same stream partition in sequence. Samza’s execution and streaming modules are both pluggable, although Samza typically relies on Hadoop’s YARN (Yet Another Resource Negotiator) and Apache Kafka.

(Figure: Samza architecture)

Common Ground

All three real-time computation systems are open-source, low-latency, distributed, scalable and fault-tolerant. They all allow you to run your stream-processing code through parallel tasks distributed across a cluster of computing machines, with fail-over capabilities. They also provide simple APIs to abstract the complexity of the underlying implementations.

The three frameworks use different vocabularies for similar concepts:

(Table: vocabulary for similar concepts across the three frameworks)

Comparison Matrix

A few of the differences are summarized in the table below:

(Table: comparison matrix)

There are three general categories of delivery patterns:

  1. At-most-once: messages may be lost. This is usually the least desirable outcome.
  2. At-least-once: messages may be redelivered (no loss, but duplicates). This is good enough for many use cases.
  3. Exactly-once: each message is delivered once and only once (no loss, no duplicates). This is a desirable feature although difficult to guarantee in all cases.

Another aspect is state management. There are different strategies to store state. Spark Streaming writes data into the distributed file system (e.g. HDFS). Samza uses an embedded key-value store. With Storm, you’ll have to either roll your own state management at your application layer, or use a higher-level abstraction called Trident.
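To show one of these strategies concretely, here is a sketch of stateful counting in the PySpark DStream API using updateStateByKey, which carries running totals across micro-batches; the checkpoint directory is required by that API, and the socket source is again a setup assumption.

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[2]", "StatefulCount")
ssc = StreamingContext(sc, 5)
ssc.checkpoint("/tmp/spark-checkpoint")   # required for stateful operations

def update_count(new_values, running_total):
    # Fold this batch's counts into the state carried over from earlier batches.
    return sum(new_values) + (running_total or 0)

lines = ssc.socketTextStream("localhost", 9999)
totals = (lines.flatMap(lambda l: l.split())
               .map(lambda w: (w, 1))
               .updateStateByKey(update_count))
totals.pprint()

ssc.start()
ssc.awaitTermination()
```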

Use Cases

All three frameworks are particularly well-suited to efficiently process continuous, massive amounts of real-time data. So which one to use? There are no hard rules, at most a few general guidelines.

If you want a high-speed event processing system that allows for incremental computations, Storm would be fine for that. If you further need to run distributed computations on demand, while the client is waiting synchronously for the results, you’ll have Distributed RPC (DRPC) out-of-the-box. Last but not least, because Storm uses Apache Thrift, you can write topologies in any programming language. If you need state persistence and/or exactly-once delivery though, you should look at the higher-level Trident API, which also offers micro-batching.

A few companies using Storm: Twitter, Yahoo!, Spotify, The Weather Channel...

Speaking of micro-batching, if you must have stateful computations and exactly-once delivery and don’t mind a higher latency, you could consider Spark Streaming, especially if you also plan for graph operations, machine learning or SQL access. The Apache Spark stack lets you combine several libraries with streaming (Spark SQL, MLlib, GraphX) and provides a convenient unifying programming model. In particular, streaming algorithms (e.g. streaming k-means) allow Spark to facilitate decisions in real time.

(Figure: the Apache Spark stack)

A few companies using Spark: Amazon, Yahoo!, NASA JPL, eBay Inc., Baidu…

If you have a large amount of state to work with (e.g. many gigabytes per partition), Samza co-locates storage and processing on the same machines, allowing it to work efficiently with state that won’t fit in memory. The framework also offers flexibility with its pluggable API: its default execution, messaging and storage engines can each be replaced with your choice of alternatives. Moreover, if you have a number of data-processing stages from different teams with different codebases, Samza’s fine-grained jobs would be particularly well suited, since they can be added or removed with minimal ripple effects.

A few companies using Samza: LinkedIn, Intuit, Metamarkets, Quantiply, Fortscale…

Conclusion

We only scratched the surface of The Three Apaches. We didn’t cover a number of other features and more subtle differences between these frameworks. Also, it’s important to keep in mind the limits of the above comparisons, as these systems are constantly evolving.

The IoT: It’s a question of scope

1 Apr

There is a part of the rich history of software development that will be a guiding light, and will support creation of the software that will run the Internet of Things (IoT). It’s all a question of scope.

Figure 1 is a six-layer architecture, showing what I consider to be key functional and technology groupings that will define software structure in a smart connected product.

Figure 1

The physical product is on the left. “Connectivity” in the third box allows the software in the physical product to connect to back-end application software on the right. Compared to a technical architecture, this is an oversimplification. But it will help me explain why I believe the concept of “scope” is so important for everyone in the software development team.

Scope is a big deal
The “scope” I want to focus on is a well-established term used to explain name binding in computer languages. There are other uses, even within computer science, but for now, please just exclude them from your thinking, as I am going to do.

The concept of scope can be truly simple. Take the name of some item in a software system. Now decide where within the total system this name is a valid way to refer to the item. That’s the scope of this particular name.

(Related: What newcomers to IoT plan for its future)

I don’t have evidence, but I imagine that the concept arose naturally in the earliest days of software, with programs written in machine code. The easiest way to handle variables is to give them each a specific memory location. These are global variables; any part of the software that knows the address can access and use these variables.

But wait! It’s 1950 and we’ve used all 1KB of memory! One way forward is to recognize that some variables are used only by localized parts of the software. So we can squeeze more into our 1KB by sharing memory locations. By the time we get to section two of the software, section one has no more use for some of its variables, so section two can reuse those addresses. These are local variables, and as machine code gave way to assembler languages and high-level languages, addresses gave way to names, and the concept of scope was needed.

But scope turned out to be much more useful than just a way to share precious memory. With well-chosen rules on scope, computer languages used names to define not only variables, but whole data structures, functions, and connections to peripherals as well. You name it, and, well yes, you could give it a name. This created new ways of thinking about software structure. Different parts of a system could be separated from other parts and developed independently.
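As a quick refresher on the concept, here is a small Python sketch of lexical scope: the same kind of name can refer to different items depending on where it appears, and a closure gives a function private, persistent state. The names are illustrative only.

```python
counter = 0                      # module scope: a "global" name

def make_counter():
    count = 0                    # local to make_counter, invisible outside

    def increment():
        nonlocal count           # binds to the enclosing function's variable
        count += 1
        return count

    return increment             # the closure carries its own private state

tick = make_counter()
print(tick(), tick(), tick())    # 1 2 3
print(counter)                   # still 0: the module-scope name was untouched
```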

A new software challenge
There’s a new challenge for IoT software, and this challenge applies to all the software across the six boxes in Figure 1. This includes the embedded software in the smart connected device and the enterprise applications that monitor and control the device, as well as the software handling access control and product-specific functions.

The challenge is the new environment for this software. These software types and the development teams behind them are very comfortable operating in essentially “closed” environments. For example, the embedded software used to be just a control system; its universe was the real-time world of sensors and actuators together with its memory space and operating system. Complicated, but there was a boundary.

Now, it’s connected to a network, and it has to send and receive messages, some of which may cause it to update itself. Still complicated, and it has no control over the timing, sequence or content of the messages it receives. Timing and sequence shouldn’t be a problem; that’s like handling unpredictable screen clicks or button presses from a control panel. But content? That’s different.

Connectivity creates broadly similar questions about the environment for the software across all six layers. Imagine implementing a software-feature upgrade capability. Whether it’s try-before-you-buy or a confirmed order, the sales-order processing system is the one that holds the official view of what the customer has ordered. So a safe, transaction-oriented application like SOP is now exposed to challenging real-world questions. For example, how many times, and at what frequency, should it retry after a device fails to acknowledge an upgrade command within the specified time?

An extensible notion
The notion of scope can be extended to help development teams handle this challenge. It doesn’t deliver the solutions, but it will help team members think about and define structure for possible solution architectures.

For example, Figure 2 looks at software in a factory, where the local scope of sensor readings and actuator actions in a work-cell automation system is in contrast to the much broader scope of quality and production metrics, which can drive re-planning of production, adjustment of machinery, or discussions with suppliers about material quality.

Figure 2

Figure 3 puts this example from production in the context of the preceding engineering development work, and the in-service life of this product after it leaves the factory.

Figure 3

Figure 4 adds three examples of new IoT capabilities that will need new software: one in service (predictive maintenance), and two in the development phase (calibration of manufacturing models to realities in the factory, and engineering access to in-service performance data).

Figure 4

Each box is the first step to describing and later defining the scope of the data items, messages, and sub-systems involved in the application. Just as for the 1950s machine-code programmers, one answer is “make everything global”—or, in today’s terms, “put everything in a database in the cloud.” And as in 1950, that approach will probably be a bit heavy on resources, and therefore fail to scale.

Dare I say data dictionary?
A bit old school, but there are some important extensions to ensure a data dictionary articulates not only the basic semantics of a data item, but also its reliability, availability, and likely update frequency. IoT data may not all be in a database; a lot of it starts out in the real world, so attributes like the time and cost of updates may be relevant. For the development team, stories, scrums and sprints come first. But after a few cycles, the data dictionary can be the single reference that ensures everyone can discuss the required scope for every artifact in the system-of-systems.
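Here is a minimal sketch of what such an extended data-dictionary entry might hold, expressed as a plain Python dictionary; the field names (semantics, reliability, availability, update cost, scope) follow the attributes discussed above, and the example artifact itself is hypothetical.

```python
# One hypothetical entry in an IoT system's data dictionary.
data_dictionary = {
    "spindle_temperature": {
        "semantics": "Bearing temperature of milling spindle, degrees C",
        "type": "float",
        "unit": "degC",
        "reliability": "sensor accuracy +/-0.5 degC",
        "availability": "sampled every 500 ms while machine is powered",
        "update_cost": "negligible (local bus); cloud sync every 60 s",
        "scope": ["work-cell controller", "plant quality metrics"],
        "created_by": "work-cell automation system",
        "read_by": ["predictive maintenance", "production re-planning"],
    }
}

entry = data_dictionary["spindle_temperature"]
print(entry["scope"])   # which subsystems may legitimately refer to this name
```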

Software development teams for every type of software involved in an IoT solution (for example, embedded, enterprise, desktop, web and cloud) will have an approach (and possibly different approaches) to naming, documenting, and handling design questions: Who creates, reads, updates or deletes this artifact? What formats do we use to move data inside one subsystem, or between subsystems? Which subsystem is responsible for orchestrating a response to a change in a data value? Given a data dictionary, and a discussion about the importance of scope, these teams should be able to discuss everything that happens at their interfaces.

Different programming languages have different ways of defining scope. I believe it’s worth reviewing a few of these, and maybe exploring some boundaries by looking at more esoteric languages. This will remind you of all the wonderful possibilities and unexpected pitfalls of using, communicating, and sharing data and other information technology artifacts. The rules the language designers have created may well inspire you to develop guidelines, and maybe specific rules, for your IoT system. You’ll be saving your IoT system development team a lot of time.

Source: http://sdtimes.com/analyst-view-iot-question-scope/

The Cost of a DDoS Attack on the Darknet

17 Mar

Distributed Denial of Service attacks, commonly called DDoS, have been around since the 1990s. Over the last few years they have become increasingly commonplace and intense. Much of this change can be attributed to three factors:

1. The evolution and commercialization of the dark web

2. The explosion of connected (IoT) devices

3. The spread of cryptocurrency

This blog discusses how each of these three factors affects the availability and economics of spawning a DDoS attack and why they mean that things are going to get worse before they get better.

Evolution and Commercialization of the Dark Web

Though dark web/deep web services are not served up in Google for the casual Internet surfer, they exist and are thriving. The dark web is no longer a place created by Internet Relay Chat or other text-only forums. It is a full-fledged part of the Internet where anyone can purchase any sort of illicit substance or service. There are vendor ratings, much like Yelp reviews for “normal” vendors, as well as support forums and staff, customer satisfaction guarantees and surveys, and service catalogues. It is a vibrant marketplace where competition abounds, vendors offer training, and reputation counts.

Those looking to attack someone with a DDoS can choose a vendor, indicate how many bots they want to purchase for an attack, specify how long they want access to them, and what country or countries they want them to reside in. The more options and the larger the pool, the more the service costs. Overall, the costs are now reasonable. If the attacker wants to own the bots used in the DDoS onslaught, according to SecureWorks, a centrally-controlled network could be purchased in 2014 for $4-12/thousand unique hosts in Asia, $100-$120 in the UK, or $140 to $190 in the USA.

Also according to SecureWorks, in late 2014 anyone could purchase a DDoS training manual for $30 USD. Users could utilize single tutorials for as low as $1 each. After training, users can rent attacks for between $3 to $5 by the hour, $60 to $90 per day, or $350 to $600 per week.

Since 2014, prices have declined by about 5% per year due to bot availability and competing firms’ pricing pressures.

The Explosion of Connected (IoT) Devices

Botnets were traditionally composed of endpoint systems (PCs, laptops, and servers), but the rush for connected homes, security systems, and other non-commercial devices created a new landing platform for attackers wishing to increase their bot volumes. These connected devices generally have low security in the first place and are habitually misconfigured by users, leaving the default access credentials open through firewalls for remote communications by smart-device apps. To make it worse, once devices are created and deployed, manufacturers rarely produce any patches for the embedded OS and applications, making them ripe for compromise. A recent report distributed by Forescout Technologies identified how easy it is to compromise home IoT devices, especially security cameras. These devices contributed to the creation and proliferation of the Mirai botnet, which was wholly composed of IoT devices across the globe. Attackers can now rent access to 100,000 IoT-based Mirai nodes for about $7,500.

With over 6.4 billion IoT devices currently connected and an expected 20 billion devices online by 2020, the IoT botnet business is booming.

The Spread of Cryptocurrency

To buy a service, there must be a means of payment. In the underground no one trusts credit cards. PayPal was an okay option, but it left a significant audit trail for authorities. The rise of cryptocurrency such as Bitcoin provides an accessible means of payment without a centralized documentation authority that law enforcement could use to track the sellers and buyers. This is perfect for the underground market. So long as cryptocurrency holds its value, the dark web economy has a transactional basis to thrive.

Summary

DDoS attacks are very disruptive and relatively inexpensive. The attack on security journalist Brian Krebs’s blog site in September 2016 severely impacted his anti-DDoS service provider’s resources. The attack lasted for about 24 hours, reaching a record bandwidth of 620 Gbps, delivered entirely by a Mirai IoT botnet. In this particular case, it is believed that the original botnet was created and controlled by a single individual, so the only cost to deliver it was time. The cost to Krebs was just a day of being offline.

Krebs is not the only one to suffer from DDoS. In attacks against Internet-reliant companies like Dyn, which caused the unavailability of Twitter, the Guardian, Netflix, Reddit, CNN, Etsy, GitHub, Spotify, and many others, the cost is much higher; losses can reach many millions of dollars. This means a site that costs several thousand dollars to set up and maintain, and that generates millions of dollars in revenue, can be taken offline for a few hundred dollars, making DDoS a highly cost-effective attack. With low cost, high availability, and a resilient control infrastructure, DDoS is not going to fade away, and some groups like Deloitte believe that attacks in excess of 1 Tbps will emerge in 2017, with the volume of attacks reaching as high as 10 million over the course of the year. Companies relying on their web presence for revenue need to strongly consider their DDoS strategy and understand how they will defend themselves to stay afloat.

Why the industry accelerated the 5G standard, and what it means

17 Mar

The industry has agreed, through 3GPP, to complete the non-standalone (NSA) implementation of 5G New Radio (NR) by December 2017, paving the way for large-scale trials and deployments based on the specification starting in 2019 instead of 2020.

Vodafone proposed the idea of accelerating development of the 5G standard last year, and while stakeholders debated various proposals for months, things really started to roll just before Mobile World Congress 2017. That’s when a group of 22 companies came out in favor of accelerating the 5G standards process.

By the time the 3GPP RAN Plenary met in Dubrovnik, Croatia, last week, the number of supporters grew to more than 40, including Verizon, which had been a longtime opponent of the acceleration idea. They decided to accelerate the standard.

At one time over the course of the past several months, as many as 12 different options were on the table, but many operators and vendors were interested in a proposal known as Option 3.

According to Signals Research Group, the reasoning went something like this: if vendors knew the Layer 1 and Layer 2 implementation, they could turn FPGA-based solutions into silicon and start designing commercially deployable solutions. Although operators eventually will deploy a new 5G core network, there’s no need to wait for a standalone (SA) version: they can continue to use their existing LTE EPC and still meet their deployment goals.

“Even though a lot of work went into getting to this point, now the real work begins. 5G has officially moved from a study item to a work item in 3GPP.”

Meanwhile, a fundamental feature has emerged in wireless networks over the last decade, and we’re hearing a lot more about it lately: The ability to do spectrum aggregation. Qualcomm, which was one of the ring leaders of the accelerated 5G standard plan, also happens to have a lot of engineering expertise in carrier aggregation.

“We’ve been working on these fundamental building blocks for a long time,” said Lorenzo Casaccia, VP of technical standards at Qualcomm Technologies.

Casaccia said it’s possible to aggregate LTE with itself or with Wi-Fi, and the same core principle can be extended to LTE and 5G. The benefit, he said, is that you can essentially introduce 5G more casually and rely on the LTE anchor for certain functions.

In fact, carrier aggregation, or CA, has been emerging over the last decade. Dual-carrier HSPA+ was available, but CA really became popularized with LTE-Advanced. U.S. carriers like T-Mobile US have boasted about offering CA since 2014, and Sprint frequently talks about its ability to do three-channel CA. One can argue that aggregation is one of the fundamental building blocks enabling the 5G standard to be accelerated.

Of course, even though a lot of work went into getting to this point, now the real work begins. 5G has officially moved from a study item to a work item in 3GPP.

Over the course of this year, engineers will be hard at work as the actual writing of the specifications needs to happen in order to meet the new December 2017 deadline.

AT&T, for one, is already jumping the gun, so to speak, preparing for the launch of standards-based mobile 5G as soon as late 2018. That’s a pretty remarkable turn of events given rival Verizon’s constant chatter about being first with 5G in the U.S.

Verizon is doing pre-commercial fixed broadband trials now and plans to launch commercially in 2018 at last check. Maybe that will change, maybe not.

Historically, there’s been a lot of worry over whether other parts of the world will get to 5G before the U.S. Operators in Asia in particular are often proclaiming their 5G-related accomplishments and aspirations, especially as it relates to the Olympics. But exactly how vast and deep those services turn out to be is still to be seen.

Further, there’s always a concern about fragmentation. Some might remember years ago, before LTE sort of settled the score, when the biggest challenge in wireless tech was keeping track of the various versions: UMTS/WCDMA, HSPA and HSPA+, cdma2000, 1xEV-DO, 1xEV-DO Revision A, 1xEV-DO Revision B and so on. It’s a bit of a relief to no longer be talking about those technologies. And most likely, those working on 5G remember the problems in roaming and interoperability that stemmed from these fragmented network standards.

But the short answer to why the industry is in such a hurry to get to 5G is easy: Because it can.

Like Qualcomm’s tag line says: Why wait? The U.S. is right to get on board the train. With any luck, there will actually be 5G standards that marketing teams can legitimately cite to back up claims about this or that being 5G. We can hope.

Source: http://www.fiercewireless.com/tech/editor-s-corner-why-hurry-to-accelerate-5g

KPN Fears 5G Freeze-Out

17 Mar
  • KPN Telecom NV (NYSE: KPN) is less than happy with the Dutch government’s policy on spectrum, and says that the rollout of 5G in the Netherlands and the country’s position at the forefront of the move to a digital economy is under threat if the government doesn’t change tack. The operator is specifically frustrated by the uncertainty surrounding the availability of spectrum in the 3.5GHz band, which has been earmarked by the EU for the launch of 5G. KPN claims that the existence of a satellite station at Burum has severely restricted the use of this band. It also objects to the proposed withdrawal of 2 x 10MHz of spectrum that is currently available for mobile communications. In a statement, the operator concludes: “KPN believes that Dutch spectrum policy will only be successful if it is in line with international spectrum harmonization agreements and consistent with European Union spectrum policy.”
  • Russian operator MegaFon is trumpeting a new set of “smart home” products, which it has collectively dubbed Life Control. The system, says MegaFon, uses a range of sensors to handle tasks related to the remote control of the home, and also encompasses GPS trackers and fitness bracelets. Before any of the Life Control products will work, however, potential customers need to invest in MegaFon’s Smart Home Center, which retails for 8,900 rubles ($150).
  • German digital service provider Exaring has turned to ADVA Optical Networking (Frankfurt: ADV) ‘s FSP 3000 platform to power what Exaring calls Germany’s “first fully integrated platform for IP entertainment services.” Exaring’s new national backbone network will transmit on-demand TV and gaming services to around 23 million households.
  • British broadcaster UKTV, purveyor of ancient comedy shows on the Dave channel and more, has unveiled a new player on the YouView platform for its on-demand service. It’s the usual rejig: new home screen, “tailored” program recommendations and so on. The update follows YouView’s re-engineering of its platform, known as Next Generation YouView.

Source: http://www.lightreading.com/mobile/spectrum/eurobites-kpn-fears-5g-freeze-out/d/d-id/731160?

 
