OVERVIEW
----------------------------
The hdmongui.py program can be used to launch and/or monitor
the health of the monitoring farm processes. It launches and
kills processes by running the start_monitoring script (which
can also be run by hand without this program). The monitoring
itself is done by continuously communicating with each farm
process via cMsg.
cMsg Connection UDL
----------------------------
Communication with the farm processes is done via cMsg using
the server given by the JANACTL_UDL environment variable. This
is normally set to:
cMsg://gluondb1/cMsg/janactl
but if it is not set, the following default is used:
cMsg://localhost/cMsg/janactl
This can be changed by setting the environment variable to a
different value and relaunching hdmongui.py. Note, however,
that monitoring processes that were already started using a
different UDL will continue to communicate only through that
UDL and must be killed and restarted in order to use the new UDL.
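As a minimal sketch, the fallback behavior is equivalent to the
following Python fragment (illustrative only; not the actual
hdmongui.py code):

    import os

    # Use the UDL from the JANACTL_UDL environment variable if set,
    # otherwise fall back to the local default described above.
    udl = os.environ.get("JANACTL_UDL", "cMsg://localhost/cMsg/janactl")
    print("using UDL:", udl)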
LEVELS
----------------------------
The monitoring system is designed to support multiple "levels"
of monitoring. The main application of this is to have some
nodes process events at a high rate to fill histograms with
large statistics, while other nodes do a more complete analysis
of each event at a slower rate. There is no limit to the number of
levels one can have. The levels are set by files matching the
following naming pattern:
${DAQ_HOME}/config/monitoring/hdmon*.conf
where DAQ_HOME is an environment variable (normally set to
something like "/home/hdops/CDAQ/daq_dev_v0.31/daq")
and the "*" part defines the level name. Level names can
be as simple as "1" and "2" or more complex strings like
"testing". These would correspond to configuration file
names "hdmon1.conf", "hdmon2.conf", and "hdmontesting.conf".
To create a new level, simply create a new configuration
file in the monitoring directory. Keep in mind, though, that
users will usually want to start monitoring for all levels,
so don't leave extra configuration files lying around that
may accidentally get used by unsuspecting shift workers.
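As an illustration, the mapping from configuration files to level
names can be reproduced with a short Python sketch like the one
below (how hdmongui.py itself discovers levels may differ):

    import glob
    import os

    # Level names are whatever sits between "hdmon" and ".conf"
    # in the configuration file names.
    conf_dir = os.path.join(os.environ["DAQ_HOME"], "config", "monitoring")
    levels = []
    for path in sorted(glob.glob(os.path.join(conf_dir, "hdmon*.conf"))):
        fname = os.path.basename(path)                    # e.g. "hdmon2.conf"
        levels.append(fname[len("hdmon"):-len(".conf")])  # -> "2"
    print("levels found:", levels)                        # e.g. ['1', '2', 'testing']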
NODE ASSIGNMENTS
----------------------------
The nodes assigned to run processes for each level are
specified in the file:
${DAQ_HOME}/config/monitoring/nodes.conf
Each line has two items. The first specifies what type
of process to run and the second is the node name. For
example:
mon2 gluon110
would specify that a level "2" monitoring process should
be run on gluon110.
Two important things to note:
- A node can appear more than once in this file
and so may have more than one process launched
there
- Other node assignments may be made in this file
including L3 trigger nodes and the RootSpy archiver
node
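As a minimal sketch, a file in this format could be parsed with
the following Python fragment. (The handling of blank lines and
"#" comments is an assumption here, not a documented feature of
the format.)

    from collections import defaultdict

    # Map each process type (e.g. "mon2") to the list of nodes it
    # should run on. A node may appear in several lists since it
    # can host more than one process.
    assignments = defaultdict(list)
    with open("nodes.conf") as f:   # path abbreviated for illustration
        for line in f:
            line = line.split("#", 1)[0].strip()  # assumed comment syntax
            parts = line.split()
            if len(parts) < 2:
                continue                          # skip blank/short lines
            ptype, node = parts[0], parts[1]
            assignments[ptype].append(node)
    print(dict(assignments))        # e.g. {'mon2': ['gluon110'], ...}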
EVENT SOURCE
----------------------------
The source of events used by the monitoring processes may
be specified in multiple ways. If the "COOL" box is checked,
then the parameters for the ET system are extracted from the
current CODA configuration (COOL configuration). This is
usually what shift workers will want.
If the "COOL" box is not checked, then the value in the
"User specified" entry is used. This has the same format
as all DANA (i.e. sim-recon) programs accept. Thus, you
can actually put a file name there, but that would be of
limited use since all monitoring processes would read from
it independently. In most all cases, one should specify an
ET system from which to read events. In the Hall-D syntax,
an ET system is specified by:
ET:filename:station:host:port
where:
filename is the name of the ET memory mapped file
which is usually something like
/tmp/et_hdops_ERsoftROC
station is the name of the station on the ET
system to attach to. It will be created
if it does not exist. Usually, this should
be something like "MON" or "MON1".
host hostname of the ET system to attach to
(e.g. gluonraid2)
port TCP port the ET system is listening on.
         This is often 11111 or 23921, but may be
         something else altogether.
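Putting the example values above together, a complete ET source
string would look like:

    ET:/tmp/et_hdops_ERsoftROC:MON:gluonraid2:11111

As a minimal Python sketch (illustrative only), the individual
fields can be recovered with a simple split, which works because
the memory mapped file name contains no colons:

    # Example ET source string built from the fields described above.
    source = "ET:/tmp/et_hdops_ERsoftROC:MON:gluonraid2:11111"

    # Naive split on ":"; valid as long as the file name itself
    # contains no colon characters.
    prefix, filename, station, host, port = source.split(":")
    assert prefix == "ET"
    print(filename, station, host, int(port))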