Operating systems, aside from abstracting the hardware, provide a set of services to their users: process management and scheduling, file systems, sophisticated network stacks. This article argues that in today's highly distributed environment the need for distributed, highly scalable computing is so common that it deserves to be provided as an operating system service. It describes our experience and ongoing work on such a service.
Distributed computing requires developers to redefine the very concept of an application. The application is no longer a single executable; rather, it's a set of components running on different nodes in the network. This redefinition is widely understood and accepted today. However, when speaking of unlimited scalability we have to get used to the fact that applications can scale beyond a single enterprise and become global systems with no single point of ownership. The application then consists of components running on different nodes in the network, most of which are out of our control and often even unknown to us.
Let me give an example: Think of a stock quote distribution system. An exchange, such as NASDAQ, publishes messages containing stock prices. These are passed to news services such as Reuters and Bloomberg, which in turn re-distribute them to their customers: banks, hedge funds etc. When a bank gets the stock quotes, it re-distributes them to all its branch offices, say to New York, London and Tokyo. Each branch office then re-distributes the stock quotes to individual stock traders within that branch office.
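The fan-out in this example can be sketched as a tree in which every node simply passes each message on to its direct subscribers. This is only a minimal in-process illustration: the `Node` class, the `build_example` helper and the "ACME" symbol are hypothetical names made up for the sketch, and a real deployment would of course have a network and a different owner at each hop.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One hop in the distribution tree (hypothetical, in-process stand-in)."""
    name: str
    subscribers: List["Node"] = field(default_factory=list)
    received: List[Tuple[str, float]] = field(default_factory=list)

    def subscribe(self, child: "Node") -> "Node":
        # Attach a downstream node and return it so that calls can be chained.
        self.subscribers.append(child)
        return child

    def publish(self, symbol: str, price: float) -> None:
        # Record the quote locally, then re-distribute it to direct subscribers.
        self.received.append((symbol, price))
        for child in self.subscribers:
            child.publish(symbol, price)


def build_example() -> Tuple[Node, List[Node]]:
    # Exchange -> news services -> bank -> branch offices -> traders.
    exchange = Node("NASDAQ")
    reuters = exchange.subscribe(Node("Reuters"))
    exchange.subscribe(Node("Bloomberg"))
    bank = reuters.subscribe(Node("bank"))
    traders = []
    for city in ("New York", "London", "Tokyo"):
        branch = bank.subscribe(Node(city))
        traders.append(branch.subscribe(Node(f"trader@{city}")))
    return exchange, traders


exchange, traders = build_example()
exchange.publish("ACME", 12.5)  # "ACME" is a made-up symbol
```

Note that the exchange knows only about its direct subscribers. It has no reference to the bank, the branch offices or the traders, which mirrors the fact that most of the system is outside any single party's control.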
The real system is even more complex. The point, however, is that the global NASDAQ stock quote distribution system is never thought of as a single application. The reason is that each node runs a different software stack, different protocols are in use, the system is often highly specific, the common re-distribution element cannot be easily untangled from the business logic of the particular entity, etc. We don't even have a vocabulary to speak about applications that are global, truly distributed and transcending the "single point of control" model.
Nowadays, the POSIX standard more or less defines what an operating system should or should not do. So, to make the scalability service an integral part of the operating system, it should be accessible via the POSIX API rather than via a custom interface. Exposing the functionality through a non-standard interface would be a major obstacle to acceptance and widespread implementation of the scalability service.
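To make "accessible via the POSIX API" concrete, here is a rough sketch of the familiar socket calls that applications already use, written with Python's `socket` module, which is a thin wrapper over them. The assumption is that a scalability service would be reached through the same create/send/receive/close calls. `AF_UNIX` datagram sockets serve purely as a stand-in, and the function name and the payload are made up for illustration.

```python
import socket


def send_quote_over_posix_sockets(quote: bytes) -> bytes:
    # Stand-in transport: a connected pair of AF_UNIX datagram sockets.
    # A scalability service is assumed to plug in behind these same calls.
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        left.send(quote)          # corresponds to POSIX send()
        return right.recv(1024)   # corresponds to POSIX recv()
    finally:
        left.close()              # corresponds to POSIX close()
        right.close()


echoed = send_quote_over_posix_sockets(b"ACME 12.5")
```

The application code here uses nothing beyond the standard socket interface, which is the property the argument above relies on.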