NAME

lonc - LON-CAPA daemon that provides persistent TCP connections to the other servers in the network.


SYNOPSIS

Usage: lonc

lonc should only be run as the user www. It is a command-line script invoked by loncron; a typical user is not expected to start lonc manually from the command line. (In other words, DO NOT START lonc YOURSELF.)


OVERVIEW

Physical Overview

Physically, the Network consists of relatively inexpensive upper-PC-class server machines which are linked through the commodity internet in a load-balancing, dynamically content-replicating and failover-secure way.

All machines in the Network are connected with each other through two-way persistent TCP/IP connections. Clients (B, F, G and H in Fig. Overview of Network) connect to the servers via standard HTTP. There are two classes of servers, Library Servers (A and E in Fig. Overview of Network) and Access Servers (C, D, I and J in Fig. Overview of Network).

Library Servers are used to store all personal records of a set of users, and are responsible for their initial authentication when a session is opened on any server in the Network. For Authors, Library Servers also host their construction area and the authoritative copy of the current and previous versions of every resource that was published by that author. Library servers can be used as backups to host sessions when all access servers in the Network are overloaded. Otherwise, for learners, access servers are used to host the sessions. Library servers need to have strong I/O capabilities.

Access Servers provide LON-CAPA service to users, using the library servers as their data source. The network is designed so that the number of concurrent sessions can be increased over a wide range by simply adding access servers before having to add library servers. Preliminary tests showed that a library server could handle up to 10 access servers fully parallel. Access servers can generally run on cheaper hardware than library servers require.

The Network is divided into domains, which are logical boundaries between participating institutions. These domains can be used to limit the flow of personal user information across the network, set access privileges and enforce royalty schemes. LON-CAPA domains bear no relationship to any other domain, including domains used by the DNS system; LON-CAPA domains may be freely configured in any manner that suits your use pattern.

Example Transactions

Fig. Overview of Network also depicts examples for several kinds of transactions conducted across the Network.

An instructor at client B modifies and publishes a resource on her Home Server A. Server A has a record of all server machines currently subscribed to this resource, and replicates it to servers D and I. However, server D is currently offline, so the update notification gets buffered on A until D comes online again. Servers C and J are currently not subscribed to this resource.

Learners F and G have open sessions on server I, and the new resource is immediately available to them.

Learner H tries to connect to server I for a new session. However, the machine is not reachable, so he connects to another Access Server, J, instead. This server currently does not have all necessary resources locally present to host learner H, but subscribes to them and replicates them as they are accessed by H.

Learner H solves a problem on server J. Library Server E is H's Home Server, so this information gets forwarded to E, where the records of H are updated.

lond, lonc, and lonnet

Fig. Overview of Network Communication elaborates on the details of this network infrastructure. It depicts three servers (A, B and C) and a client who has a session on server C.

As the client accesses different resources in the system through server C, different handlers, which are incorporated as modules into the child processes of the web server software, process these requests.

Our current implementation uses mod_perl inside of the Apache web server software. As an example, server C currently has four active web server software child processes. The chain of handlers dealing with a certain resource is determined by both the server content resource area (see below) and the MIME type, which in turn is determined by the URL extension. For most URL structures, both an authentication handler and a content handler are registered.

Handlers use a common library lonnet to interact with both locally present temporary session data and data across the server network. For example, lonnet provides routines for finding the home server of a user, finding the server with the lowest loadavg, sending simple command-reply sequences, and sending critical messages such as a homework completion, etc. For a non-critical message, the routines reply with a simple ``connection lost'' if the message could not be delivered. For critical messages, lonnet tries to re-establish connections, re-send the command, etc. If no valid reply could be received, it answers ``connection deferred'' and stores the message in buffer space to be sent at a later point in time. Also, failed critical messages are logged.
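The difference between non-critical and critical messages can be sketched as follows. This is a minimal illustration, not the lonnet code: the class Messenger, its method names, the send callable and the retry count are all hypothetical, and only the two reply strings are taken from the text above.

```python
from collections import deque


class Messenger:
    """Hypothetical stand-in for the message routines described above."""

    def __init__(self, send, max_retries=3):
        # send(server, command) returns a reply string, or None on failure.
        self.send = send
        self.max_retries = max_retries   # assumed value, not from the source
        self.deferred = deque()          # buffered critical messages, oldest first
        self.failure_log = []            # record of failed critical messages

    def reply(self, server, command):
        """Non-critical message: one attempt, no buffering."""
        answer = self.send(server, command)
        return answer if answer is not None else "connection lost"

    def critical(self, server, command):
        """Critical message: retry, then buffer and log if nothing comes back."""
        for _ in range(self.max_retries):
            answer = self.send(server, command)
            if answer is not None:
                return answer
        self.deferred.append((server, command))
        self.failure_log.append((server, command))
        return "connection deferred"
```

A caller would use reply() for queries it can simply repeat later, and critical() for anything that must eventually reach the remote server, such as a homework completion.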

The interface between lonnet and the Network is established by a multiplexed UNIX domain socket, denoted DS in Fig. Overview of Network Communication. The rationale behind this rather involved architecture is that httpd processes (Apache children) dynamically come and go on the timescale of minutes, based on workload and number of processed requests. Over the lifetime of an httpd child, however, it has to establish several hundred connections to several different servers in the Network.

On the other hand, establishing a TCP/IP connection is resource consuming for both ends of the line, and to optimize this connectivity between different servers, connections in the Network are designed to be persistent on the timescale of months, until either end is rebooted. This mechanism will be elaborated on below.

Establishing a connection to a UNIX domain socket is far less resource consuming than establishing a TCP/IP connection. lonc is a proxy daemon that forks off a child for every server in the Network. Which servers are members of the Network is determined by a lookup table, such as the one in Fig. Examples of Hosts. In order, the entries denote an internal name for the server, the domain of the server, the type of the server, the host name and the IP address.
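Reading such a lookup table might look like the sketch below. The field order follows the text above; the colon separator, the handling of blank and "#" lines, and every entry in SAMPLE are assumptions made for illustration only.

```python
from typing import NamedTuple


class Host(NamedTuple):
    name: str       # internal name for the server
    domain: str     # LON-CAPA domain of the server
    role: str       # type of the server, e.g. library or access
    hostname: str   # host name
    ip: str         # IP address


def parse_hosts(text, sep=":"):
    """Return a dict mapping internal server names to Host records."""
    hosts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # assumed comment convention
            continue
        fields = line.split(sep)
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
        host = Host(*fields)
        hosts[host.name] = host
    return hosts


# Hypothetical entries, not taken from any real table.
SAMPLE = """\
# name:domain:type:hostname:ip
exl1:example:library:lib1.example.edu:192.0.2.10
exa1:example:access:acc1.example.edu:192.0.2.11
"""
```

With a table like this, a proxy daemon would start one child per entry other than its own host.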

The lonc parent process maintains the population and listens for signals to restart or shut down, as well as USR1. Every child establishes a multiplexed UNIX domain socket for its server and opens a TCP/IP connection to the lond daemon (discussed below) on the remote machine, which it keeps alive. If the connection is interrupted, the child dies, whereupon the parent makes several attempts to fork another child for that server.

When starting a new child (a new connection), first an init-sequence is carried out, which includes receiving the information from the remote lond which is needed to establish the 128-bit encryption key - the key is different for every connection. Next, any buffered (delayed) messages for the server are sent.

In normal operation, the child listens to the UNIX socket, forwards requests to the TCP connection, gets the reply from lond, and sends it back to the UNIX socket. lonc also takes care of the encryption and decryption of messages.
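One request/reply cycle of such a child can be sketched as below. This is only an illustration of the forwarding pattern: the function names, the newline-terminated framing and the encrypt/decrypt callables are assumptions, and no real cipher or key exchange is shown.

```python
import socket


def read_line(conn):
    """Read bytes up to and including a newline, or until the peer closes."""
    data = bytearray()
    while not data.endswith(b"\n"):
        chunk = conn.recv(1)
        if not chunk:
            break
        data += chunk
    return bytes(data)


def relay_once(local_conn, remote_conn, encrypt, decrypt):
    """Forward one request from the local socket to the remote peer and back.

    local_conn stands for the UNIX domain socket side, remote_conn for the
    persistent TCP connection. Framing by newline is an assumption.
    """
    request = read_line(local_conn).rstrip(b"\n")
    remote_conn.sendall(encrypt(request) + b"\n")
    reply = decrypt(read_line(remote_conn).rstrip(b"\n"))
    local_conn.sendall(reply + b"\n")
    return request, reply
```

A long-running child would call relay_once in a loop and exit when either read returns no data, which is the point at which the parent would fork a replacement.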

lond is the remote end of the TCP/IP connection and acts as a remote command processor. It receives commands, executes them, and sends replies. In normal operation, a lonc child is constantly connected to a dedicated lond child on the remote server, and the same is true vice versa (two persistent connections per server combination).

lond listens to a TCP/IP port (denoted P in Fig. Overview of Network Communication) and forks off enough child processes to have one for each other server in the network plus two spare children. The parent process maintains the population and listens for signals to restart or shut down. Client servers are authenticated by IP.

When a new client server comes online, lond sends a signal USR1 to lonc, whereupon lonc tries again to reestablish all lost connections, even if it had given up on them before - a new client connecting could mean that that machine came online again after an interruption.

The gray boxes in Fig. Overview of Network Communication denote the entities involved in an example transaction of the Network. The Client is logged into server C, while server B is her Home Server. Server C can be an access server or a library server, while server B is a library server. She submits a solution to a homework problem, which is processed by the appropriate handler for the MIME type ``problem''. Through lonnet, the handler writes information about this transaction to the local session data. To make a permanent log entry, lonnet establishes a connection to the UNIX domain socket for server B. lonc receives this command, encrypts it, and sends it through the persistent TCP/IP connection to the TCP/IP port of the remote lond. lond decrypts the command, executes it by writing to the permanent user data files of the client, and sends back a reply regarding the success of the operation. If the operation had been unsuccessful, or the connection had broken down, lonc would write the command into a FIFO buffer to be sent again later. lonc now sends a reply regarding the overall success of the operation to lonnet via the UNIX domain socket, which is eventually received back by the handler.

Dynamic Resource Replication

Since resources are assembled into higher order resources simply by reference, in principle it would be sufficient to retrieve them from the respective Home Servers of the authors. However, there are several problems with this simple approach: since the resource assembly mechanism is designed to facilitate content assembly from a large number of widely distributed sources, individual sessions would depend on a large number of machines and network connections being available, and would thus be rather fragile. Also, frequently accessed resources could potentially drive individual machines in the network into overload situations.

Finally, since most resources depend on content handlers on the Access Servers to be served to a client within the session context, the raw source would first have to be transferred across the Network from the respective Library Server to the Access Server, processed there, and then transferred on to the client.

To enable resource assembly in a reliable and scalable way, a dynamic resource replication scheme was developed. Fig. ``Dynamic Replication'' shows the details of this mechanism.

Anytime a resource out of the resource space is requested, a handler routine is called which in turn calls the replication routine. As a first step, this routine determines whether or not the resource is currently in replication transfer (Step D1a). During replication transfer, the incoming data is stored in a temporary file, and Step D1a checks for the presence of that file. If transfer of a resource is actively going on, the controlling handler receives an error message, waits for a few seconds, and then calls the replication routine again. If the resource is still in transfer, the client will receive the message ``Service currently not available''.

In the next step (Step D1b), the replication routine checks if the URL is locally present. If it is, the replication routine returns OK to the controlling handler, which in turn passes the request on to the next handler in the chain.

If the resource is not locally present, the Home Server of the resource author (as extracted from the URL) is determined (Step D2). This is done by contacting all library servers in the author's domain (as determined from the lookup table, see Fig. 1.1.2B). In Step D2b, a query is sent to the remote server asking whether or not it is the Home Server of the author. In Step D2c, the remote server answers the query with True or False. If the Home Server was found, the routine continues; otherwise it contacts the next server (Step D2a). If no server could be found, a ``File not Found'' error message is issued. In our current implementation, already identified Home Servers are also written into a cache for faster access if resources by the same author are needed again (not shown in the figure).
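The Home Server search in Step D2 can be sketched as a loop over the library servers of the author's domain. The function name, the dictionary layout of the host entries and the is_home callable (standing in for the remote query of Steps D2b and D2c) are hypothetical.

```python
def find_home_server(author, domain, hosts, is_home, cache):
    """Return the internal name of the author's Home Server, or None.

    hosts   -- iterable of dicts with "name", "domain" and "role" keys
               (an assumed in-memory form of the lookup table)
    is_home -- callable(server_name, author, domain) -> bool, standing in
               for the query sent to the remote server
    cache   -- dict used to remember Home Servers that were already found
    """
    key = (author, domain)
    if key in cache:
        return cache[key]
    for host in hosts:                                   # Step D2a
        if host["domain"] != domain or host["role"] != "library":
            continue
        if is_home(host["name"], author, domain):        # Steps D2b, D2c
            cache[key] = host["name"]
            return host["name"]
    return None   # the caller would report "File not Found"
```

Passing the cache in as an argument keeps the sketch free of global state; a second lookup for the same author returns without contacting any server.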

In Step D3a, the routine sends a subscribe command for the URL to the Home Server of the author. The Home Server first determines if the resource is present, and if the access privileges allow it to be copied to the requesting server (D3b). If this is true, the requesting server is added to the list of subscribed servers for that resource (Step D3c). The Home Server will reply with either OK or an error message, which is determined in Step D4. If the remote resource was not present, the error message ``File not Found'' will be passed on to the client; if the access was not allowed, the error message ``Access Denied'' is passed on. If the operation succeeded, the requesting server sends an HTTP request for the resource out of the /raw server content resource area of the Home Server.

The Home Server will then check if the requesting server is part of the network, and if it is subscribed to the resource (Step D5b). If it is, it will send the resource via HTTP to the requesting server without any content handlers processing it (Step D5c). The requesting server will store the incoming data in a temporary data file (Step D5a) - this is the file that Step D1a checks for. If the transfer could not complete, an appropriate error message is sent to the client (Step D6). Otherwise, the transferred temporary file is renamed as the actual resource, and the replication routine returns OK to the controlling handler (Step D7).
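The requesting server's side of Steps D1a through D7 can be summarized in a short sketch. Everything named here is hypothetical: the temporary-file suffix, the three callables that stand in for network operations, the "transfer failed" string and the default wait time. Only the step order and the other messages come from the description above.

```python
import os
import time

TRANSFER_SUFFIX = ".in.transfer"   # hypothetical name for the temporary file


def replicate(url, local_path, find_home, subscribe, fetch_raw):
    """Make sure a resource is locally present; return a status string."""
    tmp_path = local_path + TRANSFER_SUFFIX
    if os.path.exists(tmp_path):                 # Step D1a
        return "in transfer"
    if os.path.exists(local_path):               # Step D1b
        return "OK"
    home = find_home(url)                        # Step D2
    if home is None:
        return "File not Found"
    status = subscribe(home, url)                # Steps D3a to D4
    if status != "OK":
        return status        # "File not Found" or "Access Denied"
    try:
        with open(tmp_path, "wb") as tmp:        # Step D5a
            tmp.write(fetch_raw(home, url))      # Steps D5b, D5c on the remote side
    except OSError:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return "transfer failed"                 # Step D6
    os.replace(tmp_path, local_path)             # Step D7
    return "OK"


def request_resource(url, local_path, find_home, subscribe, fetch_raw, wait=3.0):
    """Controlling handler: retry once if a transfer is already under way."""
    args = (url, local_path, find_home, subscribe, fetch_raw)
    result = replicate(*args)
    if result == "in transfer":
        time.sleep(wait)
        result = replicate(*args)
        if result == "in transfer":
            return "Service currently not available"
    return result
```

Writing to a temporary file and renaming it only after the transfer completes means that a half-transferred resource is never served as if it were complete.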

Fig. ``Dynamic Replication: Change'' depicts the process of modifying a resource. When an author publishes a new version of a resource, the Home Server will contact every server currently subscribed to the resource (Step U1), as determined from the list of subscribed servers for the resource generated in Step D3c. The subscribing servers will receive and acknowledge the update message (Step U1c). The update mechanism finishes when the last subscribed server has been contacted (messages to unreachable servers are buffered).
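On the Home Server, the notification round of Step U1 amounts to a loop over the subscriber list with buffering for unreachable servers. The sketch below assumes in-memory dictionaries and a hypothetical notify callable that returns True when the update message was acknowledged.

```python
def publish_update(url, subscribers, notify, pending):
    """Notify every server subscribed to url; buffer messages that fail.

    subscribers -- dict mapping a URL to the set of subscribed server names
    notify      -- callable(server, url) -> bool, True if acknowledged
    pending     -- dict mapping a server name to URLs still to be announced
    """
    acknowledged = []
    for server in sorted(subscribers.get(url, ())):
        if notify(server, url):
            acknowledged.append(server)
        else:
            pending.setdefault(server, []).append(url)
    return acknowledged
```

In the example transaction described earlier, servers D and I are subscribed and D is offline, so I would be acknowledged and the message for D would stay in the pending buffer until D comes back.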

Each subscribing server will check if the resource in question had been accessed recently, that is, within a configurable amount of time (Step U2).

If the resource had not been accessed recently, the local copy of the resource is deleted (Step U3a) and an unsubscribe command is sent to the Home Server (Step U3b). The Home Server will check if the server had indeed originally subscribed to the resource (Step U3c) and then delete the server from the list of subscribed servers for the resource (Step U3d).

If the resource had been accessed recently, the modified resource will be copied over using the same mechanism as in Steps D5a through D7, which corresponds to Steps U4a through U6 in the replication figure.
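The decision each subscribing server makes in Steps U2 through U6 can be sketched as below. The function and parameter names are hypothetical, and the three callables stand in for the local delete, the unsubscribe command and the replication mechanism.

```python
def handle_update(url, last_access, max_idle, now,
                  delete_local, unsubscribe, replicate):
    """React to an update message for url on a subscribing server.

    last_access -- dict mapping a URL to the time of its last access
    max_idle    -- the configurable amount of time that counts as "recent"
    now         -- current time, in the same units as last_access
    """
    accessed = last_access.get(url)
    if accessed is not None and now - accessed <= max_idle:   # Step U2
        replicate(url)          # Steps U4a through U6
        return "refreshed"
    delete_local(url)           # Step U3a
    unsubscribe(url)            # Step U3b
    return "dropped"
```

Dropping resources that have not been used recently keeps the subscriber lists on the Home Server short, so later updates touch fewer machines.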

Load Balancing

lond provides a function to query the server's current loadavg. A configuration parameter sets the loadavg value that is to be considered 100% load, for example, 2.00.

Access servers can have a list of spare access servers, /home/httpd/lonTabs/spares.tab, to which they offload sessions depending on their own workload. This check is done by the login handler, which redirects the login information and session to the least busy spare server if the server itself is overloaded. An additional round-robin IP scheme is possible. See Fig. ``Load Balancing Sample'' for an example of a load-balancing scheme.
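The arithmetic behind this check is small enough to sketch. The function names are hypothetical, and treating 100% as the point at which a server counts as overloaded is an assumption; the text only states that a configured loadavg value corresponds to 100%.

```python
def load_percent(loadavg, full_load=2.00):
    """Express a loadavg as a percentage of the configured 100% value."""
    return 100.0 * loadavg / full_load


def choose_server(own_load, spare_loads, full_load=2.00):
    """Return None to keep the session locally, or the least busy spare.

    spare_loads -- dict mapping spare server names to their current loadavg
    """
    # Assumption: a server counts as overloaded at or above 100%.
    if load_percent(own_load, full_load) < 100.0 or not spare_loads:
        return None
    return min(spare_loads, key=spare_loads.get)
```

With full_load set to 2.00, a server at loadavg 1.0 is at 50% and keeps its sessions, while a server at loadavg 2.5 hands a new login to whichever spare reports the lowest loadavg.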


DESCRIPTION

lonc provides persistent TCP connections to the other servers in the network through multiplexed UNIX domain sockets.

lonc forks off child processes that correspond to the other servers in the network. Management of these processes can be done at the parent process level or the child process level.

After forking off the children, the lonc parent executes a main loop which simply waits for processes to exit. As a process exits, a new process managing a link to the same peer as the exiting process is created.
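The shape of this main loop can be sketched with the process operations passed in as callables. The names are hypothetical, and max_exits exists only so that the illustration terminates; in a real daemon spawn would fork a child and wait_for_exit would block until a child exits (for example via os.wait).

```python
def supervise(peers, spawn, wait_for_exit, max_exits):
    """Start one child per peer and replace each child that exits.

    spawn         -- callable(peer) -> pid of the newly started child
    wait_for_exit -- callable() -> pid of a child that has exited
    max_exits     -- number of exits to process before returning
    """
    children = {spawn(peer): peer for peer in peers}
    restarts = []
    for _ in range(max_exits):
        pid = wait_for_exit()
        peer = children.pop(pid, None)
        if peer is None:
            continue                    # not one of our children
        children[spawn(peer)] = peer    # new child for the same peer
        restarts.append(peer)
    return children, restarts
```

Keeping the pid-to-peer mapping in the parent is what lets it recreate a link to the same peer as the process that exited.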

logs/lonc.log is the location of log messages.

The process management is now explained in terms of Linux shell commands, subroutines internal to this code, and signal assignments: