GroupEventStoreToolkit for CLEOc EventStore
What is it and why should I use it?
EventStore
provides powerful access to CLEOc data. To achieve this
functionality it uses an underlying 'database' as well as a set of auxiliary
files known as key and location files.
The
GroupEventStoreToolkit
allows you to:
- create a new EventStore (personal or group size)
- update the content of an EventStore
- add new data to an EventStore (pds, binary files and/or specific "skims", e.g. qcd, etc.)
- create necessary key/location files
- keep the history of your actions to/with an EventStore
- dump the content of an EventStore
- dump the content of pds/key/location files
It is a set of standalone Python scripts, which can be run regardless of your CLEO environment.
Requirements
This section describes the tools required to run
GroupEventStoreToolkit outside of Cornell.
At Cornell none of these tools/settings is required.
To set up GroupEventStoreToolkit on your PC, download the toolkit from
the CLEO CVS tree. You'll also need two Python modules: MySQLdb and sqlite.
The following software needs to be installed:
-
Python (version 2.x). It is free software
and most recent Linux/Solaris distributions include it by default.
It can be downloaded from www.python.org.
-
MySQLdb python module. It is free software (released under GNU license)
and can be downloaded from
here.
To install it just run
python setup.py build
followed by
python setup.py install
commands.
-
PySQLite python module. It is free software and
can be downloaded from
here.
To install it just run
python setup.py build
followed by
python setup.py install
commands.
-
If you plan to use MySQL as a backend for EventStore you need to install and
run a MySQL database server. Installation details as well as the source code can be
found on the MySQL web site. EventStore
is supposed to work with version 3.23 and above, although version 4 of MySQL is
preferred (due to full support of InnoDB transactions).
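A quick way to confirm that the two Python modules are visible to your interpreter is a small import check. The sketch below is not part of the toolkit: the helper name check_modules is made up for illustration, and the module names MySQLdb and sqlite are taken from the text above. It is written to run under both Python 2.x and Python 3.

```python
def check_modules(names):
    """Try to import each module; return {name: True/False}."""
    result = {}
    for name in names:
        try:
            __import__(name)
            result[name] = True
        except ImportError:
            result[name] = False
    return result


if __name__ == "__main__":
    # Module names as given in the Requirements text.
    for name, found in sorted(check_modules(["MySQLdb", "sqlite"]).items()):
        print("%-8s %s" % (name, "found" if found else "MISSING"))
```

A module reported as MISSING only means this particular interpreter cannot import it; run the check with the same Python you use for the toolkit.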
How To Use?
There are three main tools you should use:
- ESBuilder
- ESDump
- ESFileContent
All of these tools have many options to play with.
Use -help for more information.
For examples, see usage sections below.
A few words about EventStore terminology. Your data are grouped within a
grade. A grade is a data collection which needs to be
versioned. For instance, data processed by pass2 can be called the
p2-processed grade, and raw data can be called the daq grade. Within
a grade you may subdivide your data into smaller chunks,
also known as skims. In EventStore terminology such a chunk is called a view.
A view is a collection of events within a grade. Good
examples would be the qcd or 2photon views. EventStore doesn't distinguish
between real data and Monte Carlo; both are treated in the same way.
For MC, a grade can be pi0 MC events and
a view can be .........
To access your data and maintain versioning information EventStore
uses the concept of a time stamp (in the format YYYYMMDD). Once you have
entered (injected) your data into EventStore it remembers their
history/version by recording a valid time stamp. When you use
EventStore you may specify the closest date you want to run with.
To describe in detail the way you stored your data you may use a
specific version. For instance, the string P2-20041110-Feb13_03
may be informative enough to tell that this is P2 data processed
on 20041110 using the Feb13_03 release.
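To make the two notions concrete, the sketch below validates a YYYYMMDD time stamp and splits a specific-version string shaped like the example above. It is purely illustrative: the three-part layout is only a convention suggested by that example (a plain name such as MyDAnalysis is used elsewhere in this document), and the function names are hypothetical.

```python
from datetime import datetime


def parse_time_stamp(stamp):
    """Parse a YYYYMMDD string into a date; raise ValueError if malformed.

    Note: the special time stamp 'zero' used as a default by ESBuilder
    is not handled here.
    """
    if len(stamp) != 8 or not stamp.isdigit():
        raise ValueError("time stamp must be 8 digits (YYYYMMDD): %r" % stamp)
    return datetime.strptime(stamp, "%Y%m%d").date()


def split_specific_version(sv_name):
    """Split a string like 'P2-20041110-Feb13_03' into (label, date, release)."""
    label, stamp, release = sv_name.split("-", 2)
    return label, parse_time_stamp(stamp), release


label, date, release = split_specific_version("P2-20041110-Feb13_03")
print(label, date.isoformat(), release)
```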
By default GroupEventStoreToolkit uses SQLite as the underlying DB since
it doesn't require any knowledge or administration on the user's side.
More advanced features can be achieved using the MySQL backend.
ESBuilder is the main tool to manage your data in EventStore
The ESBuilder
allows you to create and update an EventStore database.
Just run this tool and you'll immediately get the full list of options
with their description:
Usage: ESBuilder [ -newDB ] [ -add dir or file or pattern of files ]
[ -grade grade ] [ -time timeStamp ]
[ -svName name ] [ -view skim ] [ -no-key ]
[ -output dir ] [ -esdb whichDBToUse ] [ -verbose ]
[ -idleMode ] [ -delete grade timeStamp ]
[ -move fileIn fileOut ] [ -sqlite fileName ]
[ -mysql host [ -user userName -password password ] ]
All options can be specified in any order. Let's explore a few useful ones.
Create a new EventStore DB from a list of
files located in the directory MC (you can either use an absolute or
relative path to this location):
This will generate the default EventStore DB with name sqlite.db
in your current directory using all pds files found in the
MC directory. The default grade is 'physics' and the time stamp is zero. The
new 'specific version' has been assigned in the format
<machine_name>_<underlyingOS>_<localtime>.
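For illustration, a string of that shape can be assembled with the Python standard library as sketched below. The exact fields ESBuilder uses are not given here, so the way the local time is formatted (%Y%m%d_%H%M%S) and the function name are assumptions.

```python
import platform
import time


def default_specific_version(now=None):
    """Build '<machine_name>_<underlyingOS>_<localtime>' (time layout assumed)."""
    machine = platform.node() or "unknown"
    os_name = platform.system() or "unknown"
    # time.localtime(None) means "now"; pass seconds since the epoch to override.
    local = time.strftime("%Y%m%d_%H%M%S", time.localtime(now))
    return "%s_%s_%s" % (machine, os_name, local)


print(default_specific_version())
```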
If you want to control these parameters you need to provide them.
Let's invoke the same action with more parameters:
- ESBuilder.py -add /home/vk/MC/*_hot_*.pds -newDB
-svName MyDAnalysis -grade physics -time 20040415
Here we created a new EventStore DB with grade 'physics', time stamp
'20040415' and 'specific version' 'MyDAnalysis'. We also
used a data file pattern (*_hot_*.pds) to add
only the files in the /home/vk/MC directory which match it.
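The pattern follows the usual shell wildcard rules. A quick way to preview which names such a pattern would select is Python's fnmatch module. The file names below are invented for the example, and whether ESBuilder expands the pattern itself or relies on the shell is not stated here.

```python
import fnmatch

# Hypothetical directory listing.
candidates = [
    "run1_hot_v1.pds",
    "run2_cold_v1.pds",
    "run3_hot_v2.pds",
    "run3_hot_v2.idxa",
]

# Keep only the names matching the pattern used in the ESBuilder example.
selected = fnmatch.filter(candidates, "*_hot_*.pds")
print(selected)
```

Only the two .pds files with _hot_ in their names are selected; the _cold_ file and the .idxa file are skipped.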
To add more data to EventStore you no longer need the -newDB
flag and can proceed as follows:
- ESBuilder.py -add
$HOME/dir1/file1.pds -svName MyAnalysis -grade physics -time 20040415
- ESBuilder.py -add
/cdat/tem/dir/file2.pds -svName MyAnalysis -grade physics -time 20040415
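When many files have to be injected with the same settings, the command lines can be generated rather than typed. The sketch below only builds argument lists from options shown in the usage message above. The helper name and the file paths are placeholders, and actually running the commands is left commented out because it requires ESBuilder.py to be reachable from your shell.

```python
def esbuilder_add_command(path, sv_name, grade, time_stamp, new_db=False):
    """Return the ESBuilder.py argument list for adding one file."""
    cmd = ["ESBuilder.py", "-add", path,
           "-svName", sv_name, "-grade", grade, "-time", time_stamp]
    if new_db:
        cmd.append("-newDB")
    return cmd


# Placeholder paths; replace with your own files.
files = ["/home/vk/dir1/file1.pds", "/cdat/tem/dir/file2.pds"]
for path in files:
    cmd = esbuilder_add_command(path, "MyAnalysis", "physics", "20040415")
    print(" ".join(cmd))
    # To execute instead of printing:
    # import subprocess; subprocess.check_call(cmd)
```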
If you need to specify a particular location for the underlying DB just use
the following syntax:
- ESBuilder.py ......... -sqlite $HOME/sqlite_pi0.db
So far we have added data with the default view. But what if you
want to add a specific type of data which is different from the default
view='all'? For instance, suppose you made a list of D-tagging events and saved
it in ASCII Suez IDXA format. You can add this information to
EventStore by using the -view
option:
- ESBuilder.py -add
/disk1/file.idxa -svName MyAnalysis -view dtag
Now our EventStore knows about D-tagging events whose view is 'dtag'.
But what if you lost track of what you have done and want to reproduce
exactly the same DB again?
Every time you invoke the ESBuilder script it
keeps track of your actions in the esdb.history ASCII file.
Just look in the directory where you keep your EventStore DB
and read the content of this file. It will have the complete history of your
commands. In addition, the esdb.log file is updated on every DB injection.
It contains the PID, time stamp and SQL statement information, which
can be useful for debugging purposes. The
-verbose
option is also useful for understanding the details of EventStore
operations.
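Since esdb.history is a plain ASCII file, it can be read with any tool. The sketch below prints its non-empty lines with line numbers. The layout of each line is not documented here, so the code makes no attempt to interpret it; the function name and the default directory are assumptions.

```python
import os


def read_history(db_dir="."):
    """Return the non-empty lines of esdb.history found in db_dir."""
    path = os.path.join(db_dir, "esdb.history")
    with open(path) as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


if __name__ == "__main__":
    # Only print if a history file exists in the current directory.
    if os.path.exists("esdb.history"):
        for number, entry in enumerate(read_history(), 1):
            print("%3d  %s" % (number, entry))
```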
ESDump is a dump utility for your EventStore DB
The ESDump
prints the content of the underlying databases. The output may be more relevant for
experts, but if you want to use it, here is how.
This script allows you to either print the content of all databases:
or a specific one (e.g., VersionDB):
- ESDump -sqlite ES.db -dbTable VersionDB
The -sqlite option is used to specify which file to use.
The default is the sqlite.db file. If you need to access a MySQL DB just
use the
- -mysql lnx248.lns.cornell.edu
option with the appropriate MySQL EventStore DB server.
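If ESDump is not at hand, the table names in a SQLite file can also be listed directly from Python. This is a sketch under assumptions, not a replacement for ESDump: it uses the standard sqlite3 module, which reads only SQLite 3 files (the text does not say which format the toolkit writes), and the demonstration runs on a throwaway in-memory database whose VersionDB columns are invented.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in the connected SQLite database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# Demonstration on an in-memory database; the columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VersionDB (id INTEGER PRIMARY KEY, note TEXT)")
print(list_tables(conn))
conn.close()
```

To try it on a real file, pass the file path to sqlite3.connect instead of ":memory:".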
ESFileContent
The
fileContent
tool allows you to see the content of data (pds, binary), key and location files.
When it is invoked
without any additional options it summarizes the content of the file.
If you need more information you may add the -v option.
Be aware that the verbose output can be very long.
The script recognizes the format of the input file automatically.
You may want to use this tool while debugging a problem or when submitting a
bug report to the software group.