Orchestration for NFV Needs Scale and Resiliency


For communication service providers (CSPs) and cloud operators, being able to access a huge range of support software is one of the key benefits of network functions virtualization (NFV). Software packages, such as Open vSwitch, OpenStack, KVM and Linux, give cloud operators and CSPs a foundation on which to create new services based on virtual network functions (VNFs).

However, the telco network is not the same as the data center, so cloud-centric solutions may not transfer directly. In particular, operators are concerned about the scale of deployments that can be supported and about the resiliency of those deployments. How do we scale virtualized services while maintaining the necessary reliability?

Use Web-Centric Methods to Drive Scale

Fortunately, web-scale deployments in data centers have solved the problem of scale with a tiered architecture that includes web interfaces, load-balancing and distributed databases. A typical architecture is shown:

This load-balanced architecture has four essential elements:

  • Client applications (e.g., web browsers) present HTTP requests to a single IP address.
  • A front-end proxy or load balancer receives incoming HTTP requests sent to the single IP address and then distributes those HTTP messages across a number of servers, which have separate private addresses.
  • The application layer (NFV orchestration in this case) runs a variable number of servers depending on the load being managed. The application layer must be written such that each request is independent from other requests, and so that the persistent state is stored outside the application server. When implemented properly, the number of servers can be increased or decreased with load, providing horizontal scalability.
  • A distributed database completes the architecture by sharing state between the application servers.
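The four elements above can be illustrated with a minimal sketch. Everything in it is hypothetical and exists only for illustration: `SharedStore` is an in-memory stand-in for the distributed database, `AppServer` for a stateless application server, and `RoundRobinProxy` for the front-end load balancer. It is not how any particular product is implemented.

```python
from itertools import cycle


class SharedStore:
    """Stand-in for the distributed database: the only place state lives."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class AppServer:
    """Stateless application server: keeps nothing between requests."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, order_id, action):
        # Read state from the shared store, update it, and write it back.
        history = self.store.get(order_id, [])
        updated = history + [(action, self.name)]
        self.store.put(order_id, updated)
        return updated


class RoundRobinProxy:
    """Front end: a single entry point that spreads requests over servers."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def submit(self, order_id, action):
        return next(self._servers).handle(order_id, action)


store = SharedStore()
proxy = RoundRobinProxy([AppServer(f"app-{i}", store) for i in range(3)])

# Three related requests land on three different servers,
# yet each server sees the full history through the shared store.
proxy.submit("order-1", "create")
proxy.submit("order-1", "activate")
result = proxy.submit("order-1", "verify")
```

Because no server holds state of its own, adding or removing entries from the server list changes capacity without affecting correctness, which is the horizontal scalability described above.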

Application to NFV Orchestration

In the case of NFV orchestration, the load will grow with the clouds being managed, the number of VNFs instantiated and the number of virtualized services created. Given the distributed nature of NFV deployments, the total load can grow quite quickly.

To handle the anticipated scaling requirements, ADVA Optical Networking’s Ensemble Orchestrator is built around a standard web-scale architecture. The following image shows specifics of how Ensemble Orchestrator implements this architecture:

In the case of NFV orchestration, the controlling application could be a person or the operator’s OSS/BSS systems, either of which submits orchestration service orders via the RESTful northbound interface of Ensemble Orchestrator. In a typical deployment of Ensemble Orchestrator, this interface is fronted by the industry-standard open source programs HAProxy and Keepalived, but other proxies and load balancers can also be used.

The proxy layer performs the load-balancing function by distributing incoming service orders across multiple instances of Ensemble Orchestrator through their RESTful interfaces, which are served by a Tomcat server. Each RESTful transaction is handled independently by the Ensemble Orchestrator core, which uses tools such as BPMN and Drools to process the work. Persistent data is stored in the distributed database and propagated by replication. In the case of Ensemble Orchestrator, the database is configured as synchronous multi-master.

Service orders might actually represent only a small part of the overall workload. Ensemble Orchestrator also runs continuous tasks to support lifecycle management of VNFs, including:

  • Monitoring orchestrated services
  • Monitoring the NFV infrastructure and remediating faults
  • Monitoring the VNFs and applying horizontal scaling when needed by elastic VNFs

Multiple instances of Ensemble Orchestrator can work together because they conform to web-scale design, whereby all persistent state is placed in the database cluster, rather than being cached in local memory. This enables a related set of API requests to be load-balanced across multiple instances of Ensemble Orchestrator.

All instances of Ensemble Orchestrator are aware of each other. If an instance disappears due to a software crash or server failure, other instances take note of any cleanup that might be required. All internal work tasks are queued for execution in database queues. As a result, an instance of Ensemble Orchestrator, whether existing or new, can look at the database queues to find available tasks. The result is a self-balancing workload model.
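The queue-based model described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's actual schema or logic: the `tasks` table, the `claim` and `release_tasks_of` helpers, and the instance names are all hypothetical, and an in-memory SQLite database stands in for the distributed database cluster.

```python
import sqlite3


def make_queue():
    """Create a hypothetical task queue table in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tasks ("
        "id INTEGER PRIMARY KEY, name TEXT, owner TEXT, done INTEGER DEFAULT 0)"
    )
    return db


def enqueue(db, name):
    db.execute("INSERT INTO tasks (name) VALUES (?)", (name,))
    db.commit()


def claim(db, instance):
    """Claim the oldest unowned task; return (id, name) or None."""
    row = db.execute(
        "SELECT id, name FROM tasks "
        "WHERE owner IS NULL AND done = 0 ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # The 'owner IS NULL' guard means only one claimant can win the task.
    cur = db.execute(
        "UPDATE tasks SET owner = ? WHERE id = ? AND owner IS NULL",
        (instance, row[0]),
    )
    db.commit()
    return row if cur.rowcount == 1 else None


def release_tasks_of(db, failed_instance):
    """Cleanup after a failure: return unfinished tasks to the queue."""
    cur = db.execute(
        "UPDATE tasks SET owner = NULL WHERE owner = ? AND done = 0",
        (failed_instance,),
    )
    db.commit()
    return cur.rowcount


db = make_queue()
for task in ("monitor-service", "check-infrastructure", "scale-vnf"):
    enqueue(db, task)

first = claim(db, "orch-a")
second = claim(db, "orch-b")

# orch-a disappears; a surviving instance returns its work to the queue,
# and a newly started instance picks it up.
released = release_tasks_of(db, "orch-a")
reclaimed = claim(db, "orch-c")
```

Since any instance can claim any task, faster or less busy instances naturally pull more work, and a new instance becomes useful as soon as it can reach the database.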

Resiliency is a fundamental aspect of this architecture. The load-balancing function distributes the load across multiple servers and the database is also distributed. Any server that fails will no longer be included in the load distribution. The result is that there is no single point of failure.
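As a rough illustration of how a failed server drops out of the load distribution, the sketch below removes a backend from rotation the first time a request to it fails. The `Backend` and `FailoverBalancer` classes are hypothetical; real load balancers such as HAProxy typically rely on configurable health checks rather than this simplified logic.

```python
class Backend:
    """Hypothetical orchestrator instance behind the load balancer."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class FailoverBalancer:
    """Round-robin balancer that drops backends that fail."""

    def __init__(self, backends):
        self.active = list(backends)
        self._next = 0

    def dispatch(self, request):
        while self.active:
            backend = self.active[self._next % len(self.active)]
            try:
                reply = backend.handle(request)
            except ConnectionError:
                # Remove the failed backend and retry on the next one.
                self.active.remove(backend)
                continue
            self._next += 1
            return reply
        raise RuntimeError("no healthy backends left")


backends = [Backend("orch-1"), Backend("orch-2"), Backend("orch-3")]
balancer = FailoverBalancer(backends)

r1 = balancer.dispatch("order-1")
backends[1].healthy = False          # simulate a server failure
r2 = balancer.dispatch("order-2")    # retried transparently on another server
r3 = balancer.dispatch("order-3")
```

The caller never sees the failure: the request that would have gone to the failed server is served by a remaining one, and later requests are spread over the survivors only.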

Take Advantage of Orchestration to Support Cloud in a Box

Service providers are also worried about the resiliency, scalability and supportability of OpenStack, particularly in the case of a distributed model where a central OpenStack controller is connected to a number of remote compute nodes. BT’s Peter Willis presented a well-known list of these issues at the SDN & OpenFlow World Congress in Düsseldorf. Many of these issues can be addressed by moving to a cloud-in-a-box model where each compute node includes an instance of OpenStack.

While the cloud-in-a-box approach solves one set of issues, it creates a massive increase in the number of OpenStack instances. Fortunately, we now have a solution for this new issue: a horizontally scalable orchestrator.

Each instance of Ensemble Orchestrator works independently on discrete tasks. A given instance can communicate with any instance of OpenStack as needed, with no fixed linkage between particular OpenStack clouds and particular orchestrator instances.

As a result, Ensemble Orchestrator can scale out to support large numbers of OpenStack instances, providing a more efficient and scalable path to deploy virtualized services.

Result: an Orchestration Solution That Meets Carrier Requirements

By leveraging standard web-centric methods of load-balancing and horizontal scale, Ensemble Orchestrator enables service providers to start deploying virtualized solutions based on NFV, and then scale those solutions as demand grows. Furthermore, Ensemble Orchestrator uses that horizontal scalability to provide resiliency using standard low-cost servers. Finally, the scalability of Ensemble Orchestrator overcomes the limitations of OpenStack by enabling a distributed cloud-in-a-box architecture.
