In large projects it is often convenient to
distribute expensive processing (e.g., large compiles)
to other hosts.
Distributed processing is also required
when a tool is available only on a small number of hosts.
mimk's ability to invoke multiple processes in parallel,
combined with the qremote, qdsrv, qhost, and
cush facilities, supports distributed or remote processing.
All one has to do is to specify the list of hosts that
may be used to execute specific tools.
 |
|
_Servers_[service] |
To specify that a tool is to be distributed or run on specific hosts,
one sets a qvrs associative array variable called _Servers_,
indexed by the name of the tool, to the list of
hosts that may be used to run the tool.
For example:
set _Servers_[cc] host1 host2 host3
indicates that cc commands are to be distributed to host1, host2,
and host3.
When qef processes its input, it modifies
the definition of any _T_service for which a
_Servers_[service] is defined, to indicate to mimk
that the service is to be distributed.
mimk in turn transforms the commands so indicated
by inserting "qremote -DRSservice" before the tool's
name.
So if the recipe was:
_T_cc -c ... echo.c ...
mimk will actually execute:
qremote -DRScc cc -c ... echo.c ...
Note: |
The "cc" in the executed command is replaced by the value of
_T_cc if that variable is defined. |
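The transformation described above can be sketched in a few lines of Python. This is only an illustration, not mimk's actual implementation: the function name `distribute` and the `servers` dictionary are hypothetical stand-ins for the _Servers_[] variable, and the sketch collapses the separate qef and mimk steps into one.

```python
import shlex

def distribute(command, servers):
    """Prefix a command with 'qremote -DRS<tool>' when the tool has servers.

    'servers' is a hypothetical stand-in for the _Servers_[] array:
    it maps a tool name to the list of hosts that may run it.
    """
    argv = shlex.split(command)
    if not argv:
        return command
    tool = argv[0]
    # Tools without a (non-empty) host list are left untouched.
    if not servers.get(tool):
        return command
    return shlex.join(["qremote", "-DRS" + tool] + argv)

servers = {"cc": ["host1", "host2", "host3"]}
print(distribute("cc -c echo.c", servers))
print(distribute("ld -o echo echo.o", servers))
```

Only commands whose tool has a host list are rewritten; everything else runs locally as before.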
When qremote is invoked with an -Sservice flag,
it retrieves the value of _Servers_[service],
checks which hosts in the list are available (using qhost),
and then uses qdsrv to select the least recently used host (see below)
from the list.
The argument command (i.e., "cc -c ...") is then run on the
selected host.
Note: |
The -D flag indicates that the command is to be run
in the current directory, albeit from the remote host.
The -R flag indicates that any recursive calls to qremote
should be ignored. Recursive calls might arise if a tool (e.g.,
purify) runs another tool (e.g., cc), both of which have
_Servers_[] settings. |
|
Limitations |
The target hosts must be able to access the directory on the
current host in which qef is being run.
Each target host must have an installed Q-Tree and
must be running qhost.
The cost of running qremote, rsh, and cush, plus the
overhead of the messages to the qdsrv and qhost
servers, is not trivial.
The savings from running a process on another host must
offset these overheads, so distribution should be limited to
expensive processes.
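The trade-off can be written as a simple comparison. The sketch below is illustrative only: the function name and all timings are hypothetical, not measurements, and it ignores the additional benefit of freeing the local host for other parallel work.

```python
def worth_distributing(local_seconds, remote_seconds, overhead_seconds):
    """Return True when the remote run, including overhead, beats the local run.

    overhead_seconds stands for the combined cost of qremote, rsh, cush,
    and the qdsrv and qhost messages (a hypothetical figure).
    """
    return remote_seconds + overhead_seconds < local_seconds

# Hypothetical timings: a large compile versus a trivial command.
print(worth_distributing(60.0, 40.0, 5.0))
print(worth_distributing(0.5, 0.4, 2.0))
```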
|
Where to set _Servers_ |
_Servers_[service] can be set in any of the
qvrs files, including the traits file for the host.
It can be modified as required in a more local qvrs file
(e.g., qeffile), but settings should probably be limited to
platform-specific files, as there are no checks (as yet) to ensure that the
target platform is compatible with the current host.
|
Least Recently Used Host |
qremote needs to pick a host from
the _Servers_[] list.
It first sends a message to each target host's qhost server,
which also returns the host's Q-Tree directory and the
user's home directory.
The responses determine which hosts are available.
qremote then sends the list of available hosts to qdsrv, which maintains
an internal list of hosts and when they were last selected.
qdsrv then returns the name of the least recently selected
host.
It might be preferable to select a host based on its current load;
however, that is difficult to determine on some systems.
Furthermore, the return on investment is minimal, as it is likely
that one is cycling through the list of server hosts, and the
least-recently-selected scheme does that effectively.
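The least-recently-selected scheme can be sketched as follows. This is a minimal illustration of the idea, not qdsrv's actual code: the class `HostSelector`, its logical clock, and the tie-breaking rule (never-selected hosts are taken in list order) are all assumptions made for the example.

```python
import itertools

class HostSelector:
    """Remember when each host was last selected and return the stalest one."""

    def __init__(self):
        self._last_selected = {}          # host name -> logical time of last pick
        self._clock = itertools.count(1)  # logical clock instead of wall time

    def select(self, available):
        """Return the least recently selected host from 'available', or None."""
        if not available:
            return None
        # Hosts never selected before count as time 0 and so win first;
        # min() keeps list order among ties.
        host = min(available, key=lambda h: self._last_selected.get(h, 0))
        self._last_selected[host] = next(self._clock)
        return host

selector = HostSelector()
hosts = ["host1", "host2", "host3"]
print([selector.select(hosts) for _ in range(4)])
```

With a stable list of available hosts the selector simply cycles through them, which is the behaviour the text describes. If a host drops out of the available list, it is skipped without disturbing the ordering of the others.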
|
MAXMKPROCS |
This qvrs variable specifies the maximum number of parallel
processes to be run by mimk.
When the distributed processing facility is in use, this variable
can be increased to take advantage of the available servers.
For example, one could set MAXMKPROCS to three times the number of
servers available, as in:
set MAXMKPROCS @(expr 3 * @_Servers_[cc]~l)
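The arithmetic behind this setting is just a multiplication. As a hedged illustration, with a hypothetical `servers` dictionary standing in for _Servers_[] and a hypothetical function name:

```python
def max_mk_procs(servers, tool, per_host=3):
    """Return per_host times the number of hosts listed for 'tool'."""
    return per_host * len(servers.get(tool, []))

servers = {"cc": ["host1", "host2", "host3"]}
print(max_mk_procs(servers, "cc"))  # three hosts, three processes each
```

The factor of three is taken from the example above; a suitable multiplier depends on the hosts and the workload.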
|
cook30.qh - 1.4 - 03/11/06 |
|