The CORBA benchmarking suite that was used to generate many of the results reported in these papers is available for downloading as part of the TAO release.
As middleware-based distributed applications become more pervasive, the need to improve the scalability of those applications becomes more important. Improving scalability can be achieved through the use of a load balancing service. Earlier generations of middleware-based load balancing services were simplistic, however, since they focused on specific use-cases and environments. These limitations made it hard to use the same load balancing service for anything other than a small class of distributed applications. Moreover, the lack of generality forced continuous redevelopment of application-specific load balancing services. Not only did this redevelopment increase distributed application deployment costs, but it also increased the potential of producing non-optimal load balancing implementations, since time-proven load balancing service optimizations could not be reused directly without undue effort. Recent advances in the design and implementation of middleware load balancing services overcome these limitations through several techniques, all of which are present in Cygnus, which is an adaptive load balancing service based on the CORBA middleware standard.
This paper presents the following contributions to research on adaptive middleware-based load balancing techniques: (1) it presents the results of empirical benchmarks that systematically evaluate different load balancing strategies provided in Cygnus by measuring how they improve scalability, and (2) it illustrates when adaptive load balancing, as opposed to non-adaptive, is suitable for use in a middleware-based distributed application.
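The difference between non-adaptive and adaptive load balancing can be illustrated with a minimal sketch. It is not Cygnus code: the class names, the `simulate` helper, and the workload are hypothetical, and "load" here is simply the cumulative cost assigned to each replica. Round-robin stands in for a non-adaptive strategy and least-loaded for an adaptive one.

```python
import itertools


class RoundRobin:
    """Non-adaptive strategy: cycles through replicas, ignoring their load."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def pick(self, loads):
        return next(self._cycle)


class LeastLoaded:
    """Adaptive strategy: uses the reported loads to pick the idlest replica."""

    def __init__(self, replicas):
        self._replicas = list(replicas)

    def pick(self, loads):
        return min(self._replicas, key=lambda r: loads[r])


def simulate(strategy, request_costs, replicas):
    """Assign each request to a replica; return cumulative load per replica."""
    loads = {r: 0 for r in replicas}
    for cost in request_costs:
        loads[strategy.pick(loads)] += cost
    return loads


replicas = ["A", "B"]
# Hypothetical workload: expensive and cheap requests alternate.
costs = [10, 1, 10, 1, 10, 1]
rr_loads = simulate(RoundRobin(replicas), costs, replicas)
ll_loads = simulate(LeastLoaded(replicas), costs, replicas)
```

With this workload, round-robin sends every expensive request to the same replica, whereas the adaptive strategy spreads the cost more evenly. When request costs are uniform, the two strategies behave almost identically, which is one reason the choice between them depends on the application.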
Many distributed applications require a scalable event-driven communication model that decouples suppliers from consumers and simultaneously supports advanced quality of service (QoS) properties and event filtering mechanisms. The CORBA Notification Service provides a publish/subscribe mechanism that is designed to support scalable event-driven communication by routing events efficiently between many suppliers and consumers, enforcing various QoS properties (such as reliability, priority, ordering, and timeliness), and filtering events at multiple points in a distributed system.
This paper provides several contributions to research on scalable notification services. First, we present the CORBA Notification Service architecture and illustrate how it addresses limitations with the earlier CORBA Event Service. Second, we explain how we addressed key design challenges faced when implementing the Notification Service in TAO, which is our high-performance, real-time ORB. Finally, we discuss the optimizations used to improve the scalability of TAO's Notification Service.
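The publish/subscribe model with event filtering described above can be sketched in a few lines. This is an illustration only, not the CORBA Notification Service API: `EventChannel`, `subscribe`, and `push` are hypothetical names, plain Python predicates stand in for filter objects, and QoS properties are not modeled.

```python
class EventChannel:
    """Toy channel that decouples suppliers from consumers."""

    def __init__(self):
        self._subscriptions = []  # (filter predicate, callback) pairs

    def subscribe(self, callback, event_filter=lambda event: True):
        """Register a consumer callback with an optional filter predicate."""
        self._subscriptions.append((event_filter, callback))

    def push(self, event):
        """Deliver an event to every consumer whose filter accepts it."""
        delivered = 0
        for event_filter, callback in self._subscriptions:
            if event_filter(event):
                callback(event)
                delivered += 1
        return delivered


channel = EventChannel()
all_events, alarms = [], []
channel.subscribe(all_events.append)  # receives everything
channel.subscribe(alarms.append, lambda e: e["priority"] >= 5)  # filtered

channel.push({"type": "heartbeat", "priority": 1})
channel.push({"type": "overheat", "priority": 9})
```

The supplier calls `push` without knowing which consumers exist, and each consumer sees only the events its filter accepts.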
Communication software for hand-held devices must be flexible and efficient to deliver the necessary Quality of Service (QoS) to multimedia applications such as real-time audio and video, video on-demand, electronic mail and fax, and Internet telephony. CORBA Object Request Brokers (ORBs) are an emerging middleware standard targeted for distributed applications. The stringent memory constraints imposed by hand-held device hardware necessitate a minimal footprint for ORB-based applications.
This paper provides three contributions to developing efficient ORB middleware for hand-held devices. First, we describe protocol implementation optimizations we employed to develop a time- and space-efficient interpretive IIOP protocol engine. Second, we describe IDL compiler optimizations for generating efficient stubs and skeletons that use our IIOP protocol engine. Finally, we empirically compare the performance and memory footprint of interpretive marshaling versus compiled marshaling for a wide range of IDL data types.
Our optimizations to the interpretive IIOP protocol engine improve its performance substantially, and it is now comparable to the performance of compiled marshaling. Moreover, our IDL compiler optimizations yielded stubs and skeletons whose footprint is substantially smaller than those using compiled marshaling.
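The trade-off between interpretive and compiled marshaling can be sketched as follows. This is a simplified illustration, not TAO's protocol engine: `POINT_DESCRIPTION` is a hypothetical stand-in for a run-time type description, the function names are invented, and CDR alignment rules are ignored for brevity.

```python
import struct

# Hypothetical type description, playing the role of a run-time type code:
# a list of (field name, struct format character) pairs.
POINT_DESCRIPTION = [("x", "i"), ("y", "i"), ("label", "B")]


def marshal_interpretive(description, value):
    """One generic engine: walks the type description at run time."""
    out = bytearray()
    for name, fmt in description:
        out += struct.pack(">" + fmt, value[name])
    return bytes(out)


def marshal_point_compiled(value):
    """Type-specific code, as an IDL compiler might emit for one data type."""
    return struct.pack(">iiB", value["x"], value["y"], value["label"])


point = {"x": 3, "y": -4, "label": 7}
interpreted = marshal_interpretive(POINT_DESCRIPTION, point)
compiled = marshal_point_compiled(point)
```

Both functions produce the same bytes. The interpretive engine is written once and shared by every type, so each new type costs only a small description, which keeps the footprint low. The compiled version avoids the per-field dispatch but needs separate code for each type.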
Multi-threading allows long-running operations to execute simultaneously without impeding the progress of other operations. Likewise, multi-threading can minimize latency and ensure predictability in real-time systems. This paper describes and evaluates common CORBA multi-threading architectures used by ORB implementations, including CORBAplus, HP ORB Plus, miniCOOL, MT-Orbix, TAO, and VisiBroker.
There is increasing demand to extend object-oriented middleware, such as OMG CORBA, to support applications with stringent quality of service (QoS) requirements. However, conventional CORBA Object Request Broker (ORB) implementations incur high latency and low scalability when used for performance-sensitive applications. These inefficiencies discourage developers from using CORBA for mission/life-critical applications such as real-time avionics, telecom call processing, and medical imaging.
This paper provides two contributions to the research on CORBA performance. First, we systematically analyze the latency and scalability of two widely used CORBA ORBs, VisiBroker and Orbix. These results reveal key sources of overhead in conventional ORBs. Second, we describe techniques used to improve latency and scalability in TAO, which is a high-performance, real-time implementation of CORBA. Although conventional ORBs do not yet provide adequate QoS guarantees to applications, our research results indicate it is possible to implement ORBs that can support high-performance, real-time applications.
The Internet Inter-ORB Protocol (IIOP) enables heterogeneous CORBA-compliant Object Request Brokers (ORBs) to interoperate over TCP/IP networks. IIOP uses the Common Data Representation (CDR) transfer syntax to map CORBA Interface Definition Language (IDL) data types into a bi-canonical wire format. Due to excessive marshaling/demarshaling overhead, data copying, and high levels of function call overhead, conventional implementations of IIOP yield poor performance over high-speed networks. To meet the demands of emerging distributed multimedia applications, CORBA-compliant ORBs must support both interoperable and highly efficient IIOP implementations.
This paper provides two contributions to the study and design of high-performance CORBA IIOP implementations. First, we precisely pinpoint the key sources of overhead in the SunSoft IIOP implementation (which is the standard reference implementation of IIOP written in C++) by measuring its performance for transferring richly-typed data over a high-speed ATM network. Second, we empirically demonstrate the benefits that stem from systematically applying protocol optimizations to SunSoft IIOP. These optimizations include: optimizing for the common case; eliminating obvious waste; replacing general-purpose methods with specialized, efficient ones; precomputing values, if possible; storing redundant state to speed up expensive operations; passing information between layers; and optimizing for the cache.
Applying these optimization principles to SunSoft IIOP improved its performance 1.8 times for doubles, 3.3 times for longs, 3.75 times for shorts, 5 times for chars/octets, and 4.2 times for richly-typed structs over ATM networks. Our optimized implementation is now competitive with existing commercial ORBs using the static invocation interface (SII) and 2 to 4.5 times (depending on the data type) faster than commercial ORBs using the dynamic skeleton interface (DSI). Moreover, our optimizations are fully CORBA compliant and we maintain strict interoperability with other IIOP implementations such as Visigenic's VisiBroker and IONA's Orbix.
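To give a feel for the marshaling work that CDR involves, the sketch below encodes a few primitive values in either byte order, padding each one to a boundary equal to its size. It is a simplified illustration, not SunSoft IIOP code: `cdr_encode` and the `FORMATS` table are hypothetical names, only four primitive types are covered, and alignment is measured from the start of the buffer.

```python
import struct

# struct format characters for a few primitive types; in this sketch a
# type's size is also its alignment.
FORMATS = {"octet": "B", "short": "h", "long": "i", "double": "d"}


def cdr_encode(fields, little_endian=False):
    """Encode (type name, value) pairs, padding each to its natural boundary."""
    prefix = "<" if little_endian else ">"
    out = bytearray()
    for type_name, value in fields:
        fmt = FORMATS[type_name]
        size = struct.calcsize(prefix + fmt)
        out += b"\x00" * (-len(out) % size)  # pad to a multiple of size
        out += struct.pack(prefix + fmt, value)
    return bytes(out)


fields = [("octet", 1), ("long", 2), ("short", 3), ("double", 1.0)]
big = cdr_encode(fields)
little = cdr_encode(fields, little_endian=True)
```

Both encodings have the same length and padding and differ only in the byte order of each value. Even in this toy version, every field incurs a table lookup, a padding computation, and a copy, which hints at why per-field overhead matters for richly-typed data.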
This paper presents two contributions to the study of CORBA performance over high-speed networks. First, we measure the latency of various types and sizes of twoway client requests using a pair of widely used implementations of CORBA -- Orbix 2.1 and VisiBroker for C++ 2.0. Second, we use Orbix and VisiBroker to measure the scalability of CORBA servers in terms of the number of objects they can support efficiently. These experiments extend our previous work on CORBA performance for bandwidth-sensitive applications (such as satellite surveillance, medical imaging, and teleconferencing).
Our results show that the latency for CORBA implementations is relatively high and server scalability is relatively low. Our latency experiments show that non-optimized internal buffering in CORBA implementations can cause substantial delay variance, which is unacceptable in many real-time or constrained-latency applications. Likewise, our scalability experiments reveal that neither Orbix nor VisiBroker can handle a large number of objects in a single server process.
The Common Object Request Broker Architecture (CORBA) is intended to simplify the task of developing distributed applications. Although it is well-suited for conventional RPC-style applications, several limitations become evident when CORBA is used for a broader range of performance-sensitive applications running in heterogeneous environments over high-speed networks. This paper illustrates the performance limitations of existing CORBA implementations in terms of their support for dynamic operation invocation and inter-ORB interoperability. The results indicate that ORB implementors must perform a considerable number of optimizations before CORBA will be suitable for performance-sensitive applications on high-speed networks.
Conventional implementations of communication middleware (such as CORBA and traditional RPC toolkits) incur considerable overhead when used for performance-sensitive applications over high-speed networks. As gigabit networks become pervasive, inefficient middleware will force programmers to use lower-level mechanisms to achieve the necessary transfer rates, which is a serious problem for mission/life-critical applications (such as satellite surveillance and medical imaging).
This paper presents results comparing the performance of several widely used communication middleware mechanisms on a high-speed ATM network. The middleware ranged from lower-level mechanisms (such as socket-based C interfaces and C++ wrappers for sockets) to higher-level mechanisms (such as hand-optimized RPC and two implementations of CORBA -- Orbix and ORBeline). These measurements reveal that the lower-level C and C++ implementations outperform the CORBA implementations significantly (the best CORBA throughput for remote transfer was roughly 75 to 80 percent of the best C/C++ throughput for sending scalar data types and only around 31 percent for sending structs), and the hand-optimized RPC code performs slightly better than the CORBA implementations. Our goal in precisely pinpointing the sources of overhead for communication middleware is to develop scalable and flexible CORBA implementations that can deliver gigabit data rates to applications.
This paper describes the design and performance of an object-oriented communication framework being developed by Kodak Health Imaging Systems and the Electronic Radiology Laboratory at Washington University School of Medicine. The framework is designed to meet the demands of next-generation electronic medical imaging systems, which must transfer large quantities of data efficiently and flexibly in a distributed environment. A novel aspect of this framework is its seamless integration of flexible high-level CORBA distributed object computing middleware with efficient low-level socket network programming mechanisms. In the paper, we outline the design goals and software architecture of our framework, describe how we resolved design challenges, and illustrate the performance of the framework over high-speed ATM networks.
This paper makes two contributions to the development and evaluation of object-oriented communication software. First, it reports performance results from benchmarking several network programming mechanisms (C sockets, ACE C++ wrappers, and two versions of CORBA, Orbix and ORBeline) on Ethernet and ATM networks. These results illustrate that developers of high-bandwidth, low-delay applications (such as interactive medical imaging or teleconferencing) must evaluate their performance requirements and the efficiency of their communication infrastructure carefully before adopting a distributed object solution. Second, the paper describes the software architecture and design principles of the ACE object-oriented network programming components. These components encapsulate UNIX and Windows NT network programming interfaces (such as sockets, TLI, and named pipes) with C++ wrappers. Developers of object-oriented communication software have traditionally had to choose between high-performance, lower-level interfaces provided by sockets or TLI or less efficient, higher-level interfaces provided by communication frameworks like CORBA, DCE, or Distributed COM (DCOM). ACE represents a midpoint in the solution space by improving the correctness, programming simplicity, portability, and reusability of performance-sensitive communication software.
Last modified 11:34:34 CDT 28 September 2006