ANTS TECHNOLOGY MAY IMPROVE COMPUTER PROGRAMMING

July 28, 2000

by Peter C. Patton, Ph.D.

San Diego, CA — CHoPP, for Columbia Homogeneous Parallel Processor, is a parallel processor architecture developed by Herbert Sullivan when he was at Columbia University in New York in the 1970s. Sullivan used the acronym CHoPP as the name for a venture capital corporation he founded, seeking funds to build a highly parallel supercomputer based on the CHoPP architecture and the patents that resulted from his work at Columbia. He intended that this supercomputer would take advantage of concurrency at all levels of granularity in a computer program, even in existing programs that had been written for sequential or vector machines. A prototype was built at the firm’s facility in San Diego, California, and the first model was sold to the San Diego Supercomputer Center. The machine was never delivered due to difficulty raising capital, but the prototype demonstrated that the basic concepts of the CHoPP architecture were effective and, indeed, could be implemented in computer hardware.

The firm was reduced to a small consulting and design staff that became involved in a number of government-sponsored studies and design projects revolving around parallel processing. During this period Sullivan came to realize that there was very little difference in the performance of various shared memory architectures: average latency increases approximately with the log of the number of processors, and processing power increases almost linearly until the number of processors reaches roughly half the average concurrency of the program. Increasing the concurrency of the programs was much more important than any particular architecture. Much of the uniqueness of the CHoPP architecture, particularly the patented techniques for exploiting latent concurrency in existing programs, could also be implemented in user-level control software. The increasing use of distributed computing models, such as MPP and, more importantly, networks, led CHoPP to concentrate on using increased concurrency to reduce the effect of network latency on performance, in addition to the original goal of allowing nodes of hundreds or thousands of processors to scale. The original patents were sufficiently broad to cover this unforeseen application to software. However, recent developments at CHoPP and the increasing possibility of gaining software patents have led management of the firm to file specific software patents covering a software development technology called ANTs: Asynchronous Non-preemptive Tasks. CHoPP has now applied for a patent for ANTs as a non-preemptive threading technology.

The people at CHoPP realized that, while the commercial software market was having difficulty scaling beyond a handful of processors, CHoPP had technology that permitted scaling into the thousands of processors. While the scalability and robustness of asynchronous, stateless communication has been made obvious by the success of the World Wide Web, its application to general purpose computing has been hidden by the overhead of existing methods. CHoPP has been using a low-overhead mechanism for asynchronous distributed control for some time. The thirty-year history of the development of this technology has been distilled into a simple and elegant set of primitive operations that are object oriented, easy to use, and compatible with existing software systems. CHoPP now offers this environment for sale as the ANTs Software System. It is available now in C++, with a Java version to follow. ANTs will also be available as part of MinimalJ, the most secure and portable client ever developed.

When Herb Sullivan first told me about the software implementation of his CHoPP technology, I reminded him of a termite colony. The mystery of a termite colony arises from the number of bits of information required to describe the architecture of a termite mound, a six-foot-high structure commonly seen on the African savanna. The information required to design the architecture of the mound surpasses the number of neurons in a given termite’s brain; no individual termite’s brain is large enough to store the entire design. So where does the colony store it? Herb’s response was that, of course, the architecture is not stored explicitly: each termite knows only its preprogrammed function, and they all work together in a non-preemptive multi-tasking environment, each doing its own thing, which in sum is the thing of the colony. He then added that ants do the same sorts of things as termites but do them in the daylight, and are beautiful, shiny black or red creatures rather than ugly, pasty white or semi-transparent ones, so he would rather call the technology ants. The acronym, he pointed out, works out much better as well.

ANTs are a programming technology for breaking large, complex software systems into independent components, thereby making them more reliable, more efficient, and easier to maintain. The technology achieves these results because, like the ant or termite colony, it distributes control structures throughout the computation. ANTs computations are a collection of essentially independent component sub-programs whose interactions are described within the component sub-programs themselves and do not depend on a central control structure. ANTs execute concurrently (that is, they appear to process data simultaneously) and interact only when they explicitly exchange data. Thus ANTs computations are concurrent computations in which all components appear to execute simultaneously unless they are waiting for data to be provided by another ANT. An inherent property of ANTs computations is that they execute concurrently on parallel multiprocessor or distributed hardware. The benefit of the ANTs programming technique can be seen from two perspectives. On the one hand, ANTs simplify programming by eliminating the need for central control structures; on the other, ANTs enhance efficiency by effectively exploiting the parallel resources of a multi-processor.

The computer architect has faced two challenges: improving computational efficiency (by avoiding the waste of time or space) and reducing complexity (by logical simplification and reduction of system component count). In terms of computational efficiency, progress has been made in two key areas. Improvement in communication between software and hardware has come through operating system advances and threading technology. Efficiency has also been improved through the use of multiple processors; in this domain, performance improves because multiple processors can cooperate to complete a given computation in minimum time. Computer program resource demands are satisfied by calls to an operating system. Traditionally, hardware resources are assigned to competing processes (which may or may not be parts of one or more concurrent programs) by a technique called multi-tasking or multi-threading. This process takes place in the twilight zone between the application program and the operating system upon which it depends for the assignment of resources. Today’s state-of-the-art technology in this area is synchronous, preemptive multi-threading; that is, when an application asks the operating system to accomplish a task, the system opens a thread and locks it, assigning the requested task to that thread. The thread remains locked until the task is complete or the operating system decides some more important task needs that thread’s assigned resources. These multiple preemptive threads are then scheduled by a single thread responsible only for scheduling all the others.

In conventional procedural programming practice, the control program specifies the sequence in which component parts of a computation are executed. Normally, this sequence ensures that program segments do not execute until the data they need is available, and also ensures that appropriate program segments execute in time to provide data before it is needed for the computation to progress. As computations become larger and more complex, control structures become more complex and limit the size and effectiveness of the application. Complexity rises exponentially with the number of function points in an application program; at 1,000 or more function points, complexity usually becomes overwhelming and the application is abandoned. While systems this large do not necessarily require more bits than the number of neurons in a single human brain, they do seem to be well beyond the ability of a single application architect to describe, or to predict the real-world performance of, the application system.

In software engineering, the usual method for solving complexity issues has been to break complex problems down into a number of simpler, more manageable parts. Unfortunately, the programmer must then also undertake the coordination of the various components of the program; there has been no technique for separating the central control mechanism and encapsulating it into the program’s components. In fact, the control structure becomes either highly centralized in an explicit way or interwoven implicitly into the fabric of the application program. The result is that the parts of the program are no longer interchangeable or amenable to independent upgrades, and the entire program must be analyzed when local changes or maintenance activities are undertaken. The advantages of breaking up the program into quasi-independent segments are undermined by centralized or embedded control structures.

Due to its central control structure, preemptive threading technology may get in the way of the effort to reduce complexity. ANTs solve this problem by eliminating the central control structure and making the interaction between program components explicit, asynchronous, and easily accessible for later changes. These components, or ANTs, acquire input data, process that data, and provide the results as output to be received by other ANTs, which remain in a suspended state until the data they need becomes available. The programmer is no longer required to be concerned with scheduling or explicitly sequencing tasks, because the ANTs control kernel (ACK) automatically schedules the execution of any program component, or ANT, whenever the data it awaits becomes available for processing. There is no longer any need to sequentially order processes, and the computations will proceed concurrently from the point of view of the programmer. In fact they will proceed in parallel on a multiprocessor or distributed server system with parallelism up to min{processors available, concurrency in the program}, which is one of Sullivan’s theorems. Dependencies of one part of the program on another are automatically respected, because no ANT can execute until the ANTs on which it depends have provided the data it needs to execute.
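The data-driven dispatch described here can be sketched in miniature. This is a hypothetical illustration, not CHoPP's proprietary ACK: the names MiniScheduler, Task, produce, and dispatch are invented for the sketch, and a real kernel would suspend and resume tasks with far lower overhead rather than scanning a waiting list.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch of data-driven scheduling: a task becomes runnable
// only when every named datum it waits on has been produced.
struct Task {
    std::vector<std::string> needs;   // data this task waits on
    std::function<void()> run;        // work to do once runnable
};

struct MiniScheduler {
    std::map<std::string, bool> available;   // which data exist yet
    std::vector<Task> waiting;

    void produce(const std::string& name) {  // an ANT "sends" a datum
        available[name] = true;
        dispatch();
    }
    void add(Task t) { waiting.push_back(std::move(t)); dispatch(); }

    void dispatch() {                        // run every task whose inputs arrived
        std::vector<Task> ready;
        for (auto it = waiting.begin(); it != waiting.end();) {
            bool ok = true;
            for (auto& n : it->needs) ok = ok && available[n];
            if (ok) { ready.push_back(std::move(*it)); it = waiting.erase(it); }
            else ++it;
        }
        for (auto& t : ready) t.run();       // may produce() more data, re-entering dispatch
    }
};
```

The point of the sketch is that no central control sequence exists: the order of execution falls out of which data has arrived, just as the passage describes.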

The efficiency of ANTs depends on the special properties of the ACK, which is at the heart of CHoPP’s proprietary technology. The ACK sits between the program, which may be written in any language (commonly C++ or JAVA), and any operating system (typically UNIX or NT). ANTs communicate with messages sent to mailboxes, and the ACK acts as postmaster. Some of the functions of the ACK are usually provided by conventional operating system kernels, but these have two shortcomings for ANTs. First, they are not efficient enough to manage a program broken into so many small parts; second, neither operating systems nor databases currently provide the single object oriented interface required for today’s programs. The efficiency issue can be understood by observing that the natural decomposition of a problem often produces numerous fine-grain components. For example, a business transaction may spawn as many as 50 internal computing/database transactions or object references. A natural programming style would turn each such activity into an ANT. Conventional operating systems would associate each such component with a synchronous operating system process, or thread; scheduling and task-switching overhead would then be incurred every time a thread or process is activated or suspended. The useful work done by these small-grain components, say 50 instructions or so, would be very small compared to the thousands required for thread management.

By contrast, the ACK executes 10 to 30 instructions of overhead when an ANT is activated. The overhead is so low that the programmer can ignore size considerations in partitioning a computation into its natural components, using segments as large or as small as the application or its algorithm dictates. The second reason the ACK must be substituted for certain operating system functions is that protocols for thread management are inconsistent with the view that ANTs are independent, encapsulated objects. Threads communicate via unrestricted shared memory that provides no tools for private protection or for structured shared access. As a result, threads are notoriously difficult to use; they provide mechanisms for concurrent (and parallel) execution only at the expense of increased programming complexity. ANTs, which make the interaction between components explicit and asynchronous, provide an improvement in conventional encapsulation, while permitting concurrent execution as a welcome side benefit.
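The back-of-the-envelope arithmetic behind these figures can be made explicit. The 2,000-instruction thread cost below is an assumed stand-in for the article's "thousands"; the 50-instruction component and 10-30 instruction ACK figures come from the text.

```cpp
#include <cassert>

// Efficiency of a 50-instruction component under two task-switching regimes.
// The 2000-instruction thread cost is an assumption standing in for "thousands".
const int useful_work = 50;      // instructions of useful work per component
const int thread_cost = 2000;    // assumed per-activation thread overhead
const int ack_cost = 20;         // ACK overhead, within the stated 10-30 range

const double thread_efficiency =
    double(useful_work) / (useful_work + thread_cost);  // roughly 2% useful work
const double ack_efficiency =
    double(useful_work) / (useful_work + ack_cost);     // roughly 70% useful work
```

Under these assumptions a thread-per-component design spends almost all of its time on management, while the ACK leaves the majority of cycles for the application.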

ANTs are more efficient than threads for mathematical reasons, not simply because of coding tricks. An ANT program is a partially ordered set and therefore has more degrees of freedom in execution order than does a deterministic set of threads. Because the thread control mechanism is used to manage hardware and I/O, fairness and prioritization are essential; fairness is inherently blocking and prioritization is inherently centralized. Lock/Unlock and similar primitives are inherently blocking because any locked thread must place itself on a list and the unlocking thread must be able to restart it; one or the other must wait while the list is altered. Once the thread count is sufficiently high, blocking dominates execution time. ANTs are non-blocking. ANTs cannot span multiple applications with a single mechanism, operate in real time, or assign priorities; they can only guarantee that all the work in a batch is completed before the program halts. This simplification is what allows ANTs to be non-blocking and to have distributed control.

Traditional programming techniques have difficulties when, as is usually the case, the event that determines a context switch occurs before the operations necessary to permit it are complete. The resulting flag variables inevitably lead to errors. An ANTs program makes the context switch immediately, and the ANT representing the new context waits for the data to be ready for it. This simplifies the code and removes the largest barrier to concurrent execution. The two technologies differ in six fundamental aspects, as shown in the table below. ANTs operate in user space whereas threads operate in kernel space; this lowers overhead and improves protection between processes. The operating system is not aware of ANTs; it deals with an ANT program as if it were a single process.

ANTs impose no priorities and, unlike threads, are non-preemptive. Priorities can be established only by reference to a global view of execution, a view that determines some order relationship between threads. Such a global view is in conflict with the notion of independent execution. Priorities are appropriate in the execution of operating system calls (for the management of hardware resources, for example); in these cases ANTs make OS calls and interface with threads in the usual ways. ANTs are encapsulated objects with protected code and data. Threads are not objects; they are kernel-managed scheduling entities.

ANTs execute reentrant code and have private data that is not accessible from other ANTs. They communicate by means of structured mailbox objects only. These mailboxes provide the primary ANTs message-passing control mechanism, because when ANTs access mailboxes they are suspended or continue execution according to the availability of data in the mailboxes. Threads, by contrast, communicate via shared memory, which provides no protection or structure for thread communication and is a significant source of error-prone code.

The programmer sees the synchronization and coordination of ANTs as based entirely on data availability; ANTs execute as soon as the data they need becomes available. This data flow view appears natural to the programmer. Threads are synchronized and coordinated using Wait/Signal or Lock/Unlock primitives. Their role in the application program is unspecified and extrinsic to the application logic or algorithm, and often manifested within obscure or opaque coding sequences. This contributes to the complexity and error-proneness of thread usage. When an ANT is suspended and another starts to execute, certain housekeeping functions, an activity called task switching, must be carried out. The number of task-switching instructions required averages between 10 and 30 for ANTs; the analogous operation for threads takes a thousand or more instructions. This distinction is at the heart of ANTs efficiency and also contributes to ANTs programmability. Such low task-switching overhead allows the user to be indifferent to task granularity.

ANTs and Threads Compared

                           ANTs                     Threads
Execution Environment      User Space               Kernel Space
Priority Preemption        No Priority Preemption   Priority Based Scheduling
Encapsulation              Private Data and Code    None
Communication              Message Passing          Shared Memory
Synchronization Criteria   Data Availability        Lock/Unlock
Task Switching             10-30 Instructions       1000+ Instructions

ANTs technology originated as a programming technique for exploiting parallel processing in multiprocessor computer systems. Much of the early work on concurrent programming for parallel processors involved explicit mapping of the application onto the computer architecture by the programmer, including discovering new algorithms suitable for specific architectures. In many embedded applications, the parallel processor was even specifically designed to execute a given algorithm. From the earliest days of the CHoPP architecture, the design team sought automatic mapping technology to remove this burden from the programmer. During the past decade, while the CHoPP team has been developing ANTs as the software realization of its hardware architecture, multiprocessor operating systems have developed around the threads technology first used at Carnegie Mellon for Mach, which became the kernel of AIX. More recently the JAVA programming language was developed with lightweight threads in an effort to reduce complexity.

Threads and ANTs both incorporate automatic mapping techniques that enable the programmer to create concurrent programs without reference to the specific hardware platform on which the application will run. ANTs cannot perform many of the real-time operating system functions for which threads were designed, and thus an ANTs program makes system calls to thread implemented functions whenever needed. On the other hand, threads are poorly adapted as a basis for a modern application programming system, because they introduce too much overhead, they defeat encapsulation, and they rely on a central control paradigm.

The original CHoPP goal of eliminating the manual mapping of programs onto multiprocessor hardware was achieved through ANTs by a significant reduction in the overhead required for intertask communication. To effectively exploit the parallel resources of a multiprocessor, it is necessary to express the application as a collection of tasks that can be distributed among the available processors. The natural decomposition of an application produces interdependent, cooperating tasks that communicate frequently, which in turn requires the processors hosting them to communicate frequently. Over the past decade interprocessor communication has improved dramatically, and so has networked communication between distributed processors. In fact, processors in IBM SP Series systems connected by T3 lines are indifferent as to whether they communicate with another processor on the same Omega switch or, over the network, with a processor thousands of miles away. Today, the hardware-generated time required for interprocess communication is about the same as an ordinary processor memory access on a cache miss. The software-generated time required for threads to communicate, when they may potentially suspend, is two to three orders of magnitude higher than the hardware time. Thus threads are unsuited for communication-intensive applications: they generate so much overhead that they cancel the benefits of parallel processing.

ANTs use the conventions and tools of object oriented programming technology to provide encapsulation, naming, and reuse. The initial implementation of ANTs was done using C++; although a runtime library and several precompiled files are required to program in ANTs, no special programming tools are necessary. A JAVA version of ANTs is in its final stages of testing and will soon be available in demonstration form from an ANTs Web site. ANTs provide the programmer with a small set of reusable object classes, foremost of which are the ANTclass and the Mailboxclass. The actual ANTs runtime environment is invisible to the user and is entirely self-adjusting, since it executes object code on different platforms without user intervention. The programmer will need to give some attention to issues arising from partial ordering of their ANT “colony” or set of tasks in order to gain the full performance benefit of the technology when using a parallel processor.

Objects of the ANTclass have private data; all public data is restricted to that contained in Mailboxes. This data is accessed by communication instructions, which are method calls to the Mailboxclass. Objects of the ANTclass execute concurrently to the extent that processors are available to host them. Their order of execution is determined by the Mailboxclass methods invoked for interprocess communication, not by any syntactic ordering of the ANTs within the program. Unlike threads, ANTs operate in user space under the ACK in the ANTs runtime package.

The Mailbox object has several private states, two of which are Valid and Invalid. If the Mailbox state is Valid, the Mailbox contains a pointer to public data. The methods of the Mailboxclass are called communication instructions because these are the only way that ANT objects communicate with each other. Sample communication instructions for the C++ version are:

void Cmailbox::Send(void* pointername);  // places void pointer in mailbox and marks mailbox valid

void* Cmailbox::Receive(void);           // waits until mailbox is valid, then returns the pointer placed by Send

void Cmailbox::Reuse(void);              // invalidates mailbox

These functions allow the ANTs programmer to send mail to a particular ANT by storing a pointer in its mailbox, to receive mail by waiting until a particular mailbox is filled, and to clear a mailbox for later reuse.
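A minimal sketch of these semantics follows. The Cmailbox body here is invented for illustration (the article gives only the signatures, and the real ANTs runtime is proprietary): a genuine Receive would suspend the calling ANT until the mailbox becomes Valid, whereas this single-threaded stand-in only models the Valid/Invalid states, and the IsValid helper is an addition for the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for the Cmailbox described above. In the real ANTs
// runtime, Receive would suspend the calling ANT until the mailbox is valid.
class Cmailbox {
    void* data = nullptr;
    bool valid = false;
public:
    void Send(void* p) { data = p; valid = true; }  // place pointer, mark valid
    bool IsValid() const { return valid; }          // helper added for the sketch
    void* Receive() {                               // returns pointer placed by Send
        // real ACK: suspend here while !valid
        return valid ? data : nullptr;
    }
    void Reuse() { data = nullptr; valid = false; } // invalidate mailbox
};
```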

Naturally this syntax is a bit different for the JAVA version of ANTs since the JAVA language does not support pointers. The JAVA versions are:

public class Mailbox {
    Mailbox(Class type);               // TypeDesc is always a reference
    void Send(type ObjectReference);   // in JAVA, Mailbox is strongly typed
    type Receive();
    void Reuse();
}

The JAVA Mailboxclass allows the programmer the same three functions without the use of pointers. The constructor of a Mailbox checks a list of available Mailbox types and creates the correct one; if it does not exist, a new class is created and added to the list. Unlike in C++, a JAVA Mailbox may pass only a single type of object. The advantages of this approach are that all ANTs programs are strongly typed and no runtime type checking is necessary. The disadvantages are that it is necessary to subclass Mailbox (or use TypeDesc) to know the actual class of a mailbox, and that certain programs that create dynamic classes without end may see their symbol and vector tables grow, much like a C memory leak. Runtime type checking will still be done on lists of mailboxes, because they are underspecified; JAVA casting rules make this necessary, and dynamic testing is too slow to be useful.
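The strong-typing idea can be sketched in C++ with a template, since the article's C++ Cmailbox passes untyped void pointers while the JAVA Mailbox carries exactly one type. TypedMailbox is a hypothetical name invented for the sketch, not part of the ANTs API; as before, a real Receive would suspend rather than assume the slot is filled.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Sketch of the strongly typed mailbox idea: like the JAVA Mailbox, each
// instantiation can carry only one type, so no runtime type checking is
// needed. Hypothetical; not the ANTs API.
template <typename T>
class TypedMailbox {
    std::optional<T> slot;                 // empty == Invalid
public:
    void Send(T value) { slot = std::move(value); }
    T Receive() const { return *slot; }    // real ACK would suspend while empty
    void Reuse() { slot.reset(); }
    bool Valid() const { return slot.has_value(); }
};
```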

Experiments with ANTs have shown a factor-of-four performance improvement without changing the code, other than adding ANTs wrappers to natural, easily recognized application program components. Another promising component-based technology for distributed object computing has appeared recently as part of the greater JAVA phenomenon, and a comparison with ANTs is interesting.

JavaBeans is a portable, platform-independent component model written in the JAVA language and developed in collaboration with software industry leaders. It enables developers to write reusable components once and run them anywhere, taking advantage of the platform-independent nature of JAVA technology and the object distribution capability of the Internet. JavaBeans acts as a bridge between proprietary component models and provides a seamless and powerful means for developers to build components that will run in ActiveX container applications, for example. The JavaBeans API provides a framework for defining reusable, embeddable, modular software components. The JavaBeans Specification defines a bean as a reusable software component that can be manipulated visually in a builder tool. Any object that conforms to certain basic rules can be a bean; there is no Bean class that all beans are required to subclass. A bean exports properties, events, and methods. A property is a piece of the bean’s internal state that can be set and queried by the programmer; a bean may generate events and may export methods as public methods defined by the bean. In addition to these common object features a bean may have indexed properties, bound properties, and constrained properties. An indexed property has an array value for which the bean provides methods to get and set individual elements of the array. A bound property sends out a notification when its value changes, and a constrained property does the same but also allows that change to be vetoed by listeners.

ANTs always adopt the component mechanism of their host language. So, if a JAVA ANT is compiled as a JavaBean, it is componentized as a bean. Since all ANTs calls are virtual and JAVA supports only virtual calls, this makes little difference, except that JAVA cannot compile across a bean boundary and thus performance suffers. The JAVA communication model is not scalable, but the use of ANTs technology fixes that problem. For example, a JAVA program compiled as ANTs ran four times faster than its non-ANTs version.

The reason ANTs simplify code is clarified by referring to the old 4GL arguments between the “template” school and the “rules” school. The template school permitted similar instances of code to be produced cookie-cutter style by filling out a form. The rules school created code from a list of rules specifying what ought to occur in any circumstance. The difficulty both schools faced was that they could only create 90% of a program, and moreover, there was no means to go from the 3GL output of the methodology back to the original 4GL input. Any change to the 4GL wiped out the 3GL supplement that finished the program. The most important contribution of object oriented programming to productivity may be the fact that it provides a reversible template technology. However, the objections of the rules side to templates are still valid: the implementations of formulas, tests for conditions, names of methods, and types of data are spread out in the code.

Cataloging and altering these a posteriori code augmentations becomes virtually impossible in a large program, and the complex job of determining dependencies and order is still left to the programmer. ANTs add to object oriented programming the most important benefits of rules-based programming: the ability to ignore ordering, the separation of rules from the rest of the program, and the removal of specific symbols from the object language source code. While earlier systems used sophisticated algorithms to accomplish these things, ANTs simplify matters by associating the execution of code with the Boolean states of data references; there was little else that the sophisticated systems could do anyway. Virtually all of the objections of the rules school to object technology are eliminated without resorting to preprocessors or arcane syntax, and without giving up a single benefit.

Mailboxes contain references to data, not the data itself. The structure of C++ and JAVA, the only intended target languages for ANTs, makes the cost of that indirection insignificant, since data is rarely passed by value in a real program. Since pointers and references differ only as to whether arithmetic is allowed, and since ANTs discourage address arithmetic, there is little practical difference.

JAVA performance seems to have several significant bottlenecks. Those most often complained about will become moot with the advent of optimizing compilers for servers and frequently used applications. More difficult is the synchronization of threads. When a thread encounters a lock it must test it and, if it is already locked, place itself on a wait list; to do that, it must lock the list. When a thread releases a lock, it, or its proxy, must check whether a thread is waiting to gain the lock. If the list is locked, only bad choices remain: the thread can either spin on the lock or suspend, leaving the next thread wanting the lock waiting. If the locked-out thread finds the list locked, it must suspend and wait. Many implementations use a third thread to maintain the list. Regardless of the strategy, a reliable system is computationally costly. With a reasonably large number of preemptively multi-tasked threads, synchronization will come to dominate execution time. For example: if a list remains locked for a microsecond, and the preemptive slice is ten milliseconds, then one time in ten thousand the thread locking the list will be preempted while holding it. Because preemption usually lasts much longer than the time slice, a busy list will almost always be locked by a preempted thread.
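The arithmetic of that example can be checked directly. The 1-microsecond lock hold and 10-millisecond slice are the article's illustrative figures; the 30-millisecond preemption length is an assumed stand-in for a stall "much longer than the time slice".

```cpp
#include <cassert>

// The article's figures: a list lock held for 1 us under a 10 ms preemptive
// slice. The 30 ms preemption duration is an assumption for the sketch.
const double lock_held_us = 1.0;
const double slice_us     = 10'000.0;
const double preempted_us = 30'000.0;    // assumed stall when the holder is preempted

// chance the lock holder is preempted mid-critical-section
const double p_preempted = lock_held_us / slice_us;           // 1 in 10,000
// expected extra wait imposed on contenders, per lock acquisition
const double expected_stall_us = p_preempted * preempted_us;  // about 3 us
```

Even at one-in-ten-thousand odds, the rare stall is so long that it dominates the expected cost of the lock, which is the article's point.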

When the odds are high that a thread will pause for a lock, the time for a suspend/resume pair must be included in the cost of the lock. When two or more threads are polling for a place on the wait list, and the odds of getting in are low, the cost incurred can be many times that of a suspend/resume pair. The non-blocking structure of ANTs fixes this unfortunate situation, which often occurs in large distributed object information systems.

Because JAVA disallows all of the performance-enhancing hacks that C programmers use to make elaborate control programs work (like table-driven state machines, which are rather slow without pointer arithmetic), JAVA tends to have performance problems with parsers and formatters, for example. ANTs make most of these programs run faster than their C equivalents by replacing the table entries with ANTs waiting on mailboxes. Because JAVA is structured as a call-return, synchronous language, distributed systems must either use some programmer-devised optimization or fail from network delay. ANTs are inherently asynchronous and naturally exploit the concurrency of a program to moderate latency.

A principal goal of ANTs from the beginning was to reduce the need to map a program onto the architecture of the processor that would execute it. In today’s business computing environment, that translates into avoiding API sets. Programs that must operate on various machines have a difficult time dealing with variations in their threading models, event models, message queuing, and other system functions. Even systems that support only a single operating system face problems scaling to clusters, distributing computing, handling network latency, and properly packaging system calls. ANTs permit a single code base that can run without alteration on a single computer, on clusters, on distributed networks, and on large parallel processors. Examples of large-scale enterprise servers particularly well suited for the ANTs programming model are the IBM SP Series and the Unisys Aquanta 32-processor enterprise server, both of which can be clustered into sets of 4-8 processor groups.

Writing for a cluster usually involves mapping functions onto specific processors and contending with a difficult API. There are several reasons for this: the call-return model does not start work on data until it is needed (the function that creates the data is called as a request for the data); synchronization is performed by event notification, which is a system function; event notification across OS instances uses a different mechanism than notification within one OS; and migration of computation from one node to another is difficult or impossible. Moreover, Object Oriented Programming (OOP) does not provide enough encapsulation to hide machine boundaries, and asynchronous communication is provided by an external entity (the message queue), which is slow and clumsy.

ANTs employ asynchronous communication as their basic primitive. Which instance of the OS in a cluster an ANT actually runs on is a matter of considerable indifference to ANTs. ANTs are linked to the list of runnable tasks by the arrival of all requested data and, therefore, are neither notified nor polled. Because no distinction is made about where an ANT runs, the cluster API is not a factor. Because ANTs automatically load-balance in a cluster, no special mapping is required. Because ANTs do not rely on the system for synchronization, distributed and dissimilar systems are no problem. (Special harnesses for fault tolerance have been designed, but not implemented.) Hand mapping synchronously communicating processes does not work very well; ANTs are a significant improvement. Windows NT and, to a lesser degree, UNIX are famous for their system inefficiency. ANTs dramatically reduce the role of the operating system, improving efficiency and scalability. Thread management overhead is lowered 90% using ANTs and can be distributed. This is enough to make NT scale to very large parallel systems.
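The data-driven scheduling described above can be sketched as follows. This is not CHoPP’s implementation, just a minimal single-threaded model with hypothetical names, showing the rule that a task is linked to the runnable list only when all of its requested data has arrived, and then runs to completion without preemption.

```python
class Task:
    """A task becomes runnable only when every requested input has arrived."""
    def __init__(self, name, needs, action):
        self.name = name
        self.needs = set(needs)      # names of inputs the task is waiting on
        self.action = action         # run once, to completion (non-preemptive)
        self.inbox = {}

    def deliver(self, key, value):
        """Store one input; report whether the task is now runnable."""
        self.inbox[key] = value
        return self.needs <= self.inbox.keys()

class Scheduler:
    def __init__(self):
        self.tasks = {}
        self.runnable = []           # tasks with a full inbox, nothing else

    def add(self, task):
        self.tasks[task.name] = task

    def send(self, task_name, key, value):
        # No notification, no polling: arrival of the last input is what
        # links the task onto the runnable list.
        if self.tasks[task_name].deliver(key, value):
            self.runnable.append(self.tasks[task_name])

    def run(self):
        results = {}
        while self.runnable:
            task = self.runnable.pop(0)          # run to completion
            results[task.name] = task.action(task.inbox)
        return results

sched = Scheduler()
sched.add(Task("total", ["a", "b"], lambda inbox: inbox["a"] + inbox["b"]))
sched.send("total", "a", 1)   # nothing runs yet: "b" has not arrived
sched.send("total", "b", 2)   # all inputs present, task becomes runnable
print(sched.run())            # {'total': 3}
```

Because nothing in `send` cares where a task lives, the same model extends to a cluster: delivering the last input on any node makes the task runnable there, with no cluster-specific API in the program itself.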

The call-return model of programming embeds the name of the function called in the calling code; ANTs embed the name of the desired value. Moreover, that name is local to a component, so components can be interchanged without altering the source of other components. Rules are easy to add and modify, because “whenever Event X happens” is the fundamental primitive of ANTs and is usually the trigger for a rule. Call-return programs must embed the notification code in the calling sequences, however many of them there may be, and that makes change difficult.
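A sketch of the “whenever Event X happens” primitive, with hypothetical names throughout: rules are registered against an event name rather than wired into any caller’s code, so adding or changing a rule touches no calling sequence.

```python
RULES = {}  # event name -> list of rule actions

def whenever(event):
    """Register a rule to fire whenever the named event happens."""
    def register(fn):
        RULES.setdefault(event, []).append(fn)
        return fn
    return register

@whenever("invoice.posted")
def notify(payload):
    # A hypothetical rule; callers never name this function directly.
    return f"notify: {payload}"

def happen(event, payload):
    """Fire every rule registered for the event; callers know only the name."""
    return [rule(payload) for rule in RULES.get(event, [])]

print(happen("invoice.posted", "INV-7"))  # ['notify: INV-7']
```

Swapping `notify` for a different rule, or adding a second one, changes nothing in the code that raises the event, which is the change-isolation the paragraph claims for ANTs over call-return.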

The reliable superclassing of objects makes JAVA an ideal language for forms-routing systems (stations may need only the functionality of the superclass, and keeping the class description of every form on every station is a management problem), but the absence of useful synchronization mechanisms in JAVA thwarts most efforts in this direction. ANTs provide robust synchronization at this global level, reduce the cost of network latency, and improve component encapsulation.

Large online applications with routing requirements for many forms and, similarly, directed map simulations are notoriously hard to parallelize. ANTs make it easy because ANTs programs are partially ordered directed maps. Many common instruction lists, such as recipes, are partially ordered directed maps. For instance, it is necessary to break the eggs before separating them, but it is not necessary to break them before sifting the flour, or vice versa; it is necessary to do all three before mixing the flour and the eggs. It is interesting that cooking a meal is often used as an everyday example of parallel processing, even though the recipes for preparing the meal’s components or dishes are procedural rather than descriptive. With complex sets of instructions, it is often difficult to find an order for the instructions that will not halt them before completion. ANTs naturally express the latent partial ordering and the directed map. An ANTs program will not halt before completion if any path completes, regardless of whether the programmer recognizes the proper path or whether there is even a consistent set of rules for recognizing it. This simplifies the job of writing the program and, by eliminating a complicated central control, can improve performance by a factor of a hundred. In addition, the program will run at maximum concurrency on a multiprocessor without modification.
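The recipe example maps directly onto a dependency graph. The sketch below uses Python’s standard-library topological sorter to show which steps may run concurrently at each stage; the step names come from the text, while the use of `graphlib` is purely illustrative and not part of ANTs.

```python
from graphlib import TopologicalSorter

# The recipe from the text as a dependency graph: each step maps to the
# set of steps that must finish first.
steps = {
    "break eggs": set(),
    "sift flour": set(),
    "separate eggs": {"break eggs"},
    "mix flour and eggs": {"separate eggs", "sift flour"},
}

ts = TopologicalSorter(steps)
ts.prepare()
while ts.is_active():
    ready = sorted(ts.get_ready())   # steps with all prerequisites done
    print(ready)                     # these may run concurrently
    ts.done(*ready)
```

The first batch contains both "break eggs" and "sift flour", which is exactly the latent concurrency the paragraph describes: the partial order, not the programmer, determines what can run in parallel, and no ordering of the steps can stall before completion.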

Lawson Software has the first-release C++ version of ANTs installed on a ten-processor Aquanta XR/6 NT server in its Research & Development benchmark laboratory. The technology is being evaluated as a means of improving performance over the strict preemptive multithreading of the NT operating system. We have an application written in JAVA and look forward to the forthcoming JAVA version of ANTs to evaluate performance improvements over JAVA’s lightweight but still preemptive threading.

ANTs are an enabling technology not only for parallel processing but for distributed object oriented systems as well. Since ANTs are language and operating system independent (implemented in both C++ and JAVA for NT, UNIX and any JVM), the components of a distributed object-based application can be assembled from objects written in either language (or any other, with JAVA wrappers), running on any OS that supports them or on any JVM.

One might ask why anyone would want such a complex distributed application platform. The answer is that nobody does. Such systems are usually “received structures” and must be operated as effectively as possible until they can be replaced by more highly integrated successors. It is unusual but not unknown for customers of an ISV providing cross-industry applications on multiple platforms to interoperate software suites on AS/400 servers for manufacturing, UNIX servers for CAD/CAM and perhaps supply chain, and NT servers for European subsidiaries and some corporate functions such as HR. As these applications migrate to object technology with increasing functional and geographic distribution and ever more complex interoperability, software tools like ANTs will enable dramatic performance increases over preemptive threading technology.

Our experience in the development and application of parallel processing over the past thirty years, and in distributed object oriented systems over the past ten, tells us that ANTs represent a significant breakthrough technology with a very promising future in the Age of JAVA. We have long known that the real world operates concurrently; physicists tell us that time is all that keeps everything from happening at once. The object paradigm offers an opportunity to map real world business objects, such as invoices, into computer-stored, manipulated and distributed electronic objects in a natural way. ANTs allow us to escape the constraints of the computer operating system’s built-in deterministic, sequential ordering and to map onto our computer models the partially ordered real world activities that make up actual business processes.

The author wishes to acknowledge many helpful discussions with Clifford Hersh, Vice President for Product Development at CHoPP Computer Corporation, and with Dr. Richard Wright, recently of Lawson Software and now with The Pitts Library at Emory University, Atlanta, GA.

Peter Patton was formerly Vice President of International Operations for the Lawson Software Group, based in Minnesota. Patton has held numerous prestigious high-technology positions, including chief information officer (CIO) of the University of Pennsylvania and lead IT consultant to the National Technology Transfer Center, NASA’s Jet Propulsion Laboratory, and Dun and Bradstreet’s DataQuest Division. He was founding director of the Minnesota Supercomputer Institute, vice president and director of Parallel Processing Architecture at MCC, and director of computing at the University of Minnesota. He is the author of 16 books and 86 journal articles. Patton holds a Harvard A.B. in Engineering and Applied Physics, an M.A. in Mathematics from the University of Kansas, and a Ph.D. in Aerospace Engineering, summa cum laude, from the University of Stuttgart in Germany.

For more information, see http://www.antssoftware.com/
