lonc - LON-CAPA client daemon that maintains persistent TCP connections to the other servers in the Network.
Usage: lonc
Should only be run as user=www. This is a command-line script which is invoked by loncron. There is no expectation that a typical user will manually start lonc from the command-line. (In other words, DO NOT START lonc YOURSELF.)
Physically, the Network consists of relatively inexpensive upper-PC-class server machines which are linked through the commodity internet in a load-balancing, dynamically content-replicating and failover-secure way.
All machines in the Network are connected with each other through two-way persistent TCP/IP connections. Clients (B, F, G and H in Fig. Overview of Network) connect to the servers via standard HTTP. There are two classes of servers, Library Servers (A and E in Fig. Overview of Network) and Access Servers (C, D, I and J in Fig. Overview of Network).
Library Servers are used to store all personal records of a set of users, and are responsible for their initial authentication when a session is opened on any server in the Network. For Authors, Library Servers also host their construction area and the authoritative copy of the current and previous versions of every resource that was published by that author. Library servers can be used as backups to host sessions when all access servers in the Network are overloaded. Otherwise, for learners, access servers are used to host the sessions. Library servers need to have strong I/O capabilities.
Access Servers provide LON-CAPA service to users, using the library servers as their data source. The network is designed so that the number of concurrent sessions can be increased over a wide range simply by adding access servers before having to add additional library servers. Preliminary tests showed that a library server could handle up to 10 access servers running fully in parallel. Access servers can generally run on cheaper hardware than library servers require.
The Network is divided into domains, which are logical boundaries between participating institutions. These domains can be used to limit the flow of personal user information across the network, set access privileges and enforce royalty schemes. LON-CAPA domains bear no relationship to any other domain, including domains used by the DNS system; LON-CAPA domains may be freely configured in any manner that suits your use pattern.
Fig. Overview of Network also depicts examples for several kinds of transactions conducted across the Network.
An instructor at client B modifies and publishes a resource on her Home Server A. Server A has a record of all server machines currently subscribed to this resource, and replicates it to servers D and I. However, server D is currently offline, so the update notification gets buffered on A until D comes online again. Servers C and J are currently not subscribed to this resource.
Learners F and G have open sessions on server I, and the new resource is immediately available to them.
Learner H tries to connect to server I for a new session; however, the machine is not reachable, so he connects to another Access Server, J, instead. This server currently does not have all necessary resources locally present to host learner H, but subscribes to them and replicates them as they are accessed by H.
Learner H solves a problem on server J. Library Server E is H's Home Server, so this information gets forwarded to E, where the records of H are updated.
Fig. Overview of Network Communication elaborates on the details of this network infrastructure. It depicts three servers (A, B and C) and a client who has a session on server C.
As C accesses different resources in the system, different handlers, which are incorporated as modules into the child processes of the web server software, process these requests.
Our current implementation uses mod_perl inside of the Apache web server
software. As an example, server C currently has four active
web server software child processes. The chain of handlers dealing
with a certain resource is determined by both the server content
resource area (see below) and the MIME type, which in turn is
determined by the URL extension. For most URL structures, both an
authentication handler and a content handler are registered.
Handlers use a common library, lonnet, to interact with
both locally present temporary session data and data across the server
network. For example, lonnet provides routines for finding the home
server of a user, finding the server with the lowest loadavg, sending
simple command-reply sequences, and sending critical messages such as
a homework completion, etc. For a non-critical message, the routines
reply with a simple ``connection lost'' if the message could not be
delivered. For critical messages, lonnet tries to re-establish
connections, re-send the command, etc. If no valid reply could be
received, it answers ``connection deferred'' and stores the message in
buffer space to be sent at a later point in time. Also, failed
critical messages are logged.
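
The difference between critical and non-critical messages can be sketched as follows. This is a minimal illustration in Python, not the lonnet code itself (which is Perl); the function name, the retry count, the `transport` callable, and the in-memory queue and log are all assumptions made for the example.

```python
from collections import deque

CON_LOST = "connection lost"
CON_DEFERRED = "connection deferred"


def send_message(message, transport, critical=False, retries=3,
                 deferred=None, failure_log=None):
    """Send `message` through `transport` (hypothetical callable).

    `transport` returns a reply string or raises ConnectionError.
    Non-critical messages get one attempt; critical ones are retried,
    then buffered for later delivery and logged.
    """
    attempts = retries if critical else 1
    for _ in range(attempts):
        try:
            reply = transport(message)
        except ConnectionError:
            continue  # try to re-establish and re-send
        if reply:
            return reply
    if not critical:
        return CON_LOST
    if deferred is not None:
        deferred.append(message)      # buffer for a later attempt
    if failure_log is not None:
        failure_log.append(message)   # failed critical messages are logged
    return CON_DEFERRED
```

A caller would pass a `deque` as `deferred` so that buffered messages can later be re-sent in the order they were queued.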
The interface between lonnet and the Network is established by a
multiplexed UNIX domain socket, denoted DS in Fig. Overview of
Network Communication. The rationale behind this rather involved
architecture is that httpd processes (Apache children) dynamically
come and go on the timescale of minutes, based on workload and number
of processed requests. Over the lifetime of an httpd child, however,
it has to establish several hundred connections to several different
servers in the Network.
On the other hand, establishing a TCP/IP connection is resource consuming for both ends of the line, and to optimize this connectivity between different servers, connections in the Network are designed to be persistent on the timescale of months, until either end is rebooted. This mechanism will be elaborated on below.
Establishing a connection to a UNIX domain socket is far less resource
consuming than establishing a TCP/IP connection. lonc is a proxy daemon
that forks off a child for every server in
the Network. Which servers are members of the Network is determined by
a lookup table, such as the one in Fig. Examples of Hosts. In order,
the entries denote an internal name for the server, the domain of the
server, the type of the server, the host name and the IP address.
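
As an illustration of how such a lookup table might be read, here is a short Python sketch. The five fields and their order come from the description above; the colon separator, the `#` comment syntax, the `Host` type, and the sample entries are assumptions made for the example, not the documented file format.

```python
from typing import NamedTuple


class Host(NamedTuple):
    name: str       # internal name for the server
    domain: str     # LON-CAPA domain
    role: str       # type of the server, e.g. library or access
    hostname: str
    ip: str


def parse_hosts(text, sep=":"):
    """Parse a hosts table into {internal name: Host}.

    The separator is an assumption; blank lines and lines starting
    with '#' are skipped.
    """
    hosts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split(sep)]
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
        host = Host(*fields)
        hosts[host.name] = host
    return hosts


# Hypothetical entries, using documentation-only names and addresses.
SAMPLE = """\
# name:domain:type:hostname:ip
exl1:example:library:lib1.example.edu:192.0.2.10
exa1:example:access:acc1.example.edu:192.0.2.11
"""
```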
The lonc parent process maintains the population and listens for
signals to restart or shut down, as well as USR1. Every child
establishes a multiplexed UNIX domain socket for its server and opens
a TCP/IP connection to the lond daemon (discussed below) on the remote
machine, which it keeps alive. If the connection is interrupted, the
child dies, whereupon the parent makes several attempts to fork
another child for that server.
When starting a new child (a new connection), first an init-sequence
is carried out, which includes receiving the information from the
remote lond which is needed to establish the 128-bit encryption key;
the key is different for every connection. Next, any buffered
(delayed) messages for the server are sent.
In normal operation, the child listens to the UNIX socket, forwards
requests to the TCP connection, gets the reply from lond, and sends
it back to the UNIX socket. lonc also takes care of the encryption
and decryption of messages.
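
One request/reply cycle of such a child can be sketched like this in Python. It is only an illustration of the forwarding pattern: the function name is hypothetical, `encrypt` and `decrypt` are placeholders for whatever cipher the connection negotiated, and the sketch assumes each message fits in a single `recv` call.

```python
import socket


def relay_once(local_conn, remote_conn, encrypt, decrypt, bufsize=4096):
    """Forward one request from the local socket to the remote peer
    and pass the reply back. Returns False when the local side closed.

    `encrypt` and `decrypt` are caller-supplied bytes -> bytes functions.
    """
    request = local_conn.recv(bufsize)
    if not request:
        return False                        # local client went away
    remote_conn.sendall(encrypt(request))   # forward over the TCP link
    reply = remote_conn.recv(bufsize)       # wait for the remote reply
    local_conn.sendall(decrypt(reply))      # hand it back locally
    return True
```

In a test setup, `socket.socketpair()` can stand in for both the UNIX domain socket and the TCP connection.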
lond is the remote end of the TCP/IP connection and acts as
a remote command processor. It receives commands, executes them, and
sends replies. In normal operation, a lonc child is constantly
connected to a dedicated lond child on the remote server, and
vice versa (two persistent connections per server
combination).
lond listens to a TCP/IP port (denoted P in Fig. Overview of Network Communication) and forks off enough child processes to have one for each other server in the network plus two spare children. The parent process maintains the population and listens for signals to restart or shut down. Client servers are authenticated by IP address.
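
The child count and the IP check described above amount to very little logic. The following Python sketch shows one way to express them; the function names and the `{internal name: ip}` mapping are assumptions made for the example.

```python
def children_needed(hosts, own_name, spares=2):
    """One child per other server in the network, plus spare children."""
    others = [name for name in hosts if name != own_name]
    return len(others) + spares


def is_known_client(peer_ip, hosts, own_name):
    """Accept a connection only if its IP belongs to another known server."""
    return any(ip == peer_ip
               for name, ip in hosts.items()
               if name != own_name)
```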
When a new client server comes online, lond sends a USR1 signal
to lonc, whereupon lonc tries again to reestablish all lost
connections, even if it had given up on them before - a new client
connecting could mean that that machine came online again after an
interruption.
The gray boxes in Fig. Overview of Network Communication denote the
entities involved in an example transaction of the Network. The Client
is logged into server C, while server B is her Home
Server. Server C can be an access server or a library server, while
server B is a library server. She submits a solution to a homework
problem, which is processed by the appropriate handler for the MIME
type ``problem''. Through lonnet, the handler writes information
about this transaction to the local session data. To make a permanent
log entry, lonnet establishes a connection to the UNIX domain
socket for server B. lonc receives this command, encrypts it,
and sends it through the persistent TCP/IP connection to the TCP/IP
port of the remote lond. lond decrypts the command, executes it
by writing to the permanent user data files of the client, and sends
back a reply regarding the success of the operation. Had the operation
been unsuccessful, or had the connection broken down, lonc
would have written the command into a FIFO buffer to be sent again
later. lonc now sends a reply regarding the overall success of the
operation to lonnet via the UNIX domain socket, which is eventually
received back by the handler.
Since resources are assembled into higher order resources simply by reference, in principle it would be sufficient to retrieve them from the respective Home Servers of the authors. However, there are several problems with this simple approach: since the resource assembly mechanism is designed to facilitate content assembly from a large number of widely distributed sources, individual sessions would depend on a large number of machines and network connections being available, and would thus be rather fragile. Also, frequently accessed resources could potentially drive individual machines in the network into overload situations.
Finally, since most resources depend on content handlers on the Access Servers to be served to a client within the session context, the raw source would first have to be transferred across the Network from the respective Library Server to the Access Server, processed there, and then transferred on to the client.
To enable resource assembly in a reliable and scalable way, a dynamic resource replication scheme was developed. Fig. ``Dynamic Replication'' shows the details of this mechanism.
Anytime a resource out of the resource space is requested, a handler routine is called which in turn calls the replication routine. As a first step, this routine determines whether or not the resource is currently in replication transfer (Step D1a). During replication transfer, the incoming data is stored in a temporary file, and Step D1a checks for the presence of that file. If transfer of a resource is actively going on, the controlling handler receives an error message, waits for a few seconds, and then calls the replication routine again. If the resource is still in transfer, the client will receive the message ``Service currently not available''.
In the next step (Step D1b), the replication routine checks if the URL is locally present. If it is, the replication routine returns OK to the controlling handler, which in turn passes the request on to the next handler in the chain.
If the resource is not locally present, the Home Server of the resource author (as extracted from the URL) is determined (Step D2). This is done by contacting all library servers in the author's domain (as determined from the lookup table, see Fig. 1.1.2B). In Step D2b a query is sent to the remote server asking whether or not it is the Home Server of the author. In Step D2c, the remote server answers the query with True or False. If the Home Server was found, the routine continues, otherwise it contacts the next server (Step D2a). If no server could be found, a ``File not Found'' error message is issued. In our current implementation, an identified Home Server is also written into a cache for faster access if resources by the same author are needed again (not shown in the figure).
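
Steps D2a through D2c, together with the cache, can be sketched as a simple loop. This Python example is illustrative only: `find_home_server`, the `is_home_server` query callable, and the dictionary-based cache are hypothetical names, not part of the LON-CAPA code.

```python
def find_home_server(author, domain, library_servers, is_home_server, cache):
    """Return the Home Server of `author`, or None if none claims them.

    `library_servers` maps a domain to its list of library servers;
    `is_home_server(server, author, domain)` stands in for the remote
    query and returns True or False.
    """
    key = (author, domain)
    if key in cache:                                   # already identified
        return cache[key]
    for server in library_servers.get(domain, []):     # Step D2a
        if is_home_server(server, author, domain):     # Steps D2b and D2c
            cache[key] = server
            return server
    return None   # the caller reports "File not Found"
```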
In Step D3a, the routine sends a subscribe command for the URL to
the Home Server of the author. The Home Server first determines if the
resource is present, and if the access privileges allow it to be
copied to the requesting server (D3b). If this is true, the
requesting server is added to the list of subscribed servers for that
resource (Step D3c). The Home Server will reply with either OK or
an error message, which is determined in Step D4. If the remote
resource was not present, the error message ``File not Found'' will be
passed on to the client; if the access was not allowed, the error
message ``Access Denied'' is passed on. If the operation succeeded, the
requesting server sends an HTTP request for the resource out of the
/raw server content resource area of the Home Server.
The Home Server will then check if the requesting server is part of the network, and if it is subscribed to the resource (Step D5b). If it is, it will send the resource via HTTP to the requesting server without any content handlers processing it (Step D5c). The requesting server will store the incoming data in a temporary data file (Step D5a) - this is the file that Step D1a checks for. If the transfer could not complete, an appropriate error message is sent to the client (Step D6). Otherwise, the transferred temporary file is renamed as the actual resource, and the replication routine returns OK to the controlling handler (Step D7).
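
The requesting server's side of Steps D1a through D7 can be summarized in a short Python sketch. It is a simplified illustration under stated assumptions: the temporary-file suffix, the status strings other than those quoted above, and the three injected callables (`find_home`, `subscribe`, `fetch_raw`) are all hypothetical.

```python
import os


def replicate(url, local_path, find_home, subscribe, fetch_raw):
    """Make `url` locally available at `local_path`; return a status string."""
    tmp_path = local_path + ".in.transfer"    # hypothetical temp-file name
    if os.path.exists(tmp_path):              # Step D1a: transfer under way
        return "in transfer"
    if os.path.exists(local_path):            # Step D1b: already present
        return "OK"
    home = find_home(url)                     # Step D2
    if home is None:
        return "File not Found"
    status = subscribe(home, url)             # Steps D3a to D4
    if status != "OK":
        return status                         # e.g. "Access Denied"
    try:
        with open(tmp_path, "wb") as tmp:     # Step D5a: temporary file
            tmp.write(fetch_raw(home, url))   # Steps D5b and D5c
    except OSError:                           # includes ConnectionError
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return "transfer failed"              # Step D6
    os.replace(tmp_path, local_path)          # Step D7: rename into place
    return "OK"
```

Renaming the finished temporary file, rather than writing to the final path directly, means other processes never see a partially transferred resource.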
Fig. ``Dynamic Replication: Change'' depicts the process of modifying a resource. When an author publishes a new version of a resource, the Home Server will contact every server currently subscribed to the resource (Step U1), as determined from the list of subscribed servers for the resource generated in Step D3c. The subscribing servers will receive and acknowledge the update message (Step U1c). The update mechanism finishes when the last subscribed server has been contacted (messages to unreachable servers are buffered).
Each subscribing server will check if the resource in question had been accessed recently, that is, within a configurable amount of time (Step U2).
If the resource had not been accessed recently, the local copy of the resource is deleted (Step U3a) and an unsubscribe command is sent to the Home Server (Step U3b). The Home Server will check if the server had indeed originally subscribed to the resource (Step U3c) and then delete the server from the list of subscribed servers for the resource (Step U3d).
If the resource had been accessed recently, the modified resource will be copied over using the same mechanism as in Steps D5a through D7, which correspond to Steps U4a through U6 in the replication figure.
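
The decision a subscribing server makes on receiving an update notice (Steps U2 through U6) can be sketched as below. The Python function, its parameters, and the returned status strings are assumptions for illustration; `unsubscribe` and `refetch` stand in for the messages and transfer described above.

```python
import os
import time


def handle_update(local_path, last_access, max_idle_seconds,
                  unsubscribe, refetch, now=None):
    """React to an update notice for the resource stored at `local_path`."""
    now = time.time() if now is None else now
    if now - last_access > max_idle_seconds:   # Step U2: not used recently
        if os.path.exists(local_path):
            os.remove(local_path)              # Step U3a: drop local copy
        unsubscribe()                          # Step U3b: tell Home Server
        return "unsubscribed"
    refetch()                                  # Steps U4a to U6: copy again
    return "updated"
```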
lond provides a function to query the server's current loadavg. As
a configuration parameter, one can set the value of loadavg
that is to be considered 100%, for example, 2.00.
Access servers can have a list of spare access servers,
/home/httpd/lonTabs/spares.tab, to offload sessions depending on
their own workload. This check is done by the login handler. It
redirects the login information and session to the least busy spare
server if it is itself overloaded. An additional round-robin IP scheme
is possible. See Fig. ``Load Balancing Sample'' for an example of a
load-balancing scheme.
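
The load calculation and the choice of a spare server can be sketched in a few lines of Python. The 2.00 reference value is the example given above; the function names, the 100% threshold for "overloaded", and the `{server: loadavg}` mapping are assumptions made for illustration.

```python
def load_percent(loadavg, full_load=2.00):
    """Express a loadavg as a percentage of the configured 100% value."""
    return 100.0 * loadavg / full_load


def pick_server(own_load, spare_loads, full_load=2.00):
    """Return None to host the session locally, or the least busy spare.

    Assumes a server counts as overloaded at 100% or more.
    """
    if load_percent(own_load, full_load) < 100.0 or not spare_loads:
        return None
    return min(spare_loads, key=spare_loads.get)
```

This sketch does not check whether the chosen spare is itself overloaded; a real login handler might add that test.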
lonc provides persistent TCP connections to the other servers in the network through multiplexed domain sockets.
lonc forks off child processes that correspond to the other servers in the network. Management of these processes can be done at the parent process level or the child process level.
After forking off the children, the lonc parent executes a main loop which simply waits for processes to exit. As a process exits, a new process managing a link to the same peer as the exiting process is created.
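
The fork-and-wait pattern of the parent's main loop can be sketched as follows. This Python example is not the lonc code (which is Perl); `supervise`, `run_child`, and the `max_restarts` cap are hypothetical, and the cap exists only so the sketch terminates. It relies on `os.fork`, so it runs on POSIX systems only.

```python
import os


def supervise(peers, run_child, max_restarts=2):
    """Fork one child per peer; when a child exits, fork a replacement
    for the same peer, up to `max_restarts` times per peer.

    Returns {peer: number of restarts performed}.
    """
    pid_to_peer = {}
    restarts = {peer: 0 for peer in peers}

    def spawn(peer):
        pid = os.fork()
        if pid == 0:                 # in the child
            try:
                run_child(peer)      # would manage the link to `peer`
            finally:
                os._exit(0)          # never return into the parent's loop
        pid_to_peer[pid] = peer      # in the parent

    for peer in peers:
        spawn(peer)

    while pid_to_peer:               # main loop: wait for children to exit
        pid, _status = os.wait()
        peer = pid_to_peer.pop(pid, None)
        if peer is None:
            continue                 # not one of our children
        if restarts[peer] < max_restarts:
            restarts[peer] += 1
            spawn(peer)              # new child for the same peer
    return restarts
```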
logs/lonc.log is the location of log messages.
The process management is now explained in terms of Linux shell commands, subroutines internal to this code, and signal assignments:
PID (recorded in the lonc.pid file): This is the process ID number of the parent lonc process.
SIGTERM and SIGINT:
Parent signal assignment: $SIG{INT} = $SIG{TERM} = \&HUNTSMAN;
Child signal assignment: $SIG{INT} = 'DEFAULT'; (SIGTERM is also DEFAULT.) The child dies and a SIGALRM is sent to the parent, waking the parent from slumber to start a new child.
Command-line invocations: kill -s SIGTERM PID or kill -s SIGINT PID
Subroutine HUNTSMAN: This is only invoked for the lonc parent PID. This kills all the children, and then the parent. The lonc.pid file is cleared.
SIGHUP:
Current bug: This signal can only be processed the first time on the parent process. Subsequent SIGHUP signals have no effect.
Parent signal assignment: $SIG{HUP} = \&HUPSMAN;
Child signal assignment: none (nothing happens)
Command-line invocation: kill -s SIGHUP PID
Subroutine HUPSMAN: This is only invoked for the lonc parent PID. This kills all the children, and then the parent. The lonc.pid file is cleared.
SIGUSR1:
Parent signal assignment: $SIG{USR1} = \&USRMAN;
Child signal assignment: $SIG{USR1} = \&logstatus;
Command-line invocation: kill -s SIGUSR1 PID
Subroutine USRMAN: When invoked for the lonc parent PID, SIGUSR1 is sent to all the children, and the status of each connection is logged.