GroupEventStoreToolkit for CLEOc EventStore


What is it and why should I use it?

EventStore provides powerful access to CLEOc data. To achieve this functionality it uses an underlying 'database' as well as a set of auxiliary files known as key and location files.

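The GroupEventStoreToolkit allows you to create and update your own EventStore database, dump the content of its underlying databases, and inspect the content of data, key and location files.

It is a set of standalone Python scripts, which can be run regardless of your CLEO environment.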

Requirements

This section describes the tools required to run GroupEventStoreToolkit outside of Cornell. At Cornell none of these tools or settings is required.
To set up GroupEventStoreToolkit on your PC, download the toolkit from the CLEO CVS tree. You'll also need two Python modules: MySQLdb and sqlite. The following software needs to be installed:

How To Use?

There are three main tools you should use:
  • ESBuilder
  • ESDump
  • ESFileContent
All of these tools have many options to play with.
Use -help for more information.
For examples, see usage sections below.

A few words about EventStore notation. Your data are grouped within a grade. A grade is a data collection that needs to be versioned. For instance, data processed by pass2 can be called the p2-processed grade, and raw data can be called the daq grade.

Within a grade you may subdivide your data into smaller chunks, a.k.a. skims. In EventStore notation such a chunk is called a view: a collection of events within a grade. Good examples would be the qcd or 2photon views.

EventStore doesn't distinguish between real data and Monte Carlo. Both are treated in the same way. For MC, the grade can be pi0 MC events and the view can be .........

To access your data and maintain versioning information, EventStore uses the concept of a time stamp (in the format YYYYMMDD). Once you have entered (injected) your data into EventStore, it remembers their history/version by recording a valid time stamp. When you use EventStore you may specify the closest date you want to run with.

To describe in detail how you stored your data, you may use a specific version. For instance, the string P2-20041110-Feb13_03 may be informative enough to tell that this is P2 data processed by 20041110 using the Feb13_03 release.

By default GroupEventStoreToolkit uses SQLite as the underlying DB, since it doesn't require any knowledge or administration on the user's side. More advanced features can be achieved using the MySQL backend.
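As an illustration of the time-stamp idea described above, here is a minimal Python sketch. It assumes that the "closest date" is the most recent injected time stamp that is not later than the requested one; the toolkit's actual selection rule may differ. The function name and the sample time stamps are hypothetical.

```python
def closest_time_stamp(available, requested):
    """Return the latest stamp in `available` that is not after `requested`.

    Time stamps are integers in YYYYMMDD form (0 is the default stamp),
    so plain integer comparison preserves calendar order.
    Assumption: this is how a "closest date" is chosen. Returns None when
    every available stamp is later than the requested one.
    """
    candidates = [stamp for stamp in available if stamp <= requested]
    return max(candidates) if candidates else None


# Hypothetical injection history for one grade.
stamps = [0, 20040415, 20041110]

print(closest_time_stamp(stamps, 20040601))
print(closest_time_stamp(stamps, 20050101))
```

Storing the time stamp as a single YYYYMMDD integer keeps the comparison trivial, which is presumably one reason for choosing this format.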

ESBuilder is the main tool to manage your data in EventStore

ESBuilder allows you to create and update an EventStore database. Just run this tool and you'll immediately get the full list of options with their descriptions:

Usage: ESBuilder [ -newDB ] [ -add dir or file or pattern of files ]
                 [ -grade grade ] [ -time timeStamp ]
                 [ -svName name ]  [ -view skim ] [ -no-key ]
                 [ -output dir ] [ -esdb whichDBToUse ] [ -verbose ]
                 [ -idleMode ] [ -delete grade timeStamp ]
                 [ -move fileIn fileOut ] [ -sqlite fileName ]
                 [ -mysql host [ -user userName -password password ] ]
All options can be specified in any order. Let's explore a few useful ones.

To create a new EventStore DB from a list of files located in the directory MC (you can use either an absolute or a relative path to this location), run ESBuilder with the -newDB and -add options. This will generate the default EventStore DB with the name sqlite.db in your current directory, using all pds files found in the MC directory. The default grade is 'physics' and the default time stamp is zero. A new 'specific version' is assigned in the format <machine_name>_<underlyingOS>_<localtime>. If you want to control these parameters, you need to provide them.
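To make the default 'specific version' format more concrete, the following Python sketch builds a string of the form <machine_name>_<underlyingOS>_<localtime>. It is not taken from ESBuilder itself: the function name is hypothetical, and the layout of the time part (YYYYMMDD_HHMMSS) is an assumption, since this document does not specify it.

```python
import platform
import time


def default_specific_version(now=None):
    """Build a <machine_name>_<underlyingOS>_<localtime> style string.

    Assumption: the exact time layout used by ESBuilder is not given in
    this document; YYYYMMDD_HHMMSS is only a placeholder choice.
    """
    machine = platform.node() or "unknown"
    os_name = platform.system() or "unknown"
    # time.localtime(None) means "the current local time".
    stamp = time.strftime("%Y%m%d_%H%M%S", time.localtime(now))
    return f"{machine}_{os_name}_{stamp}"


print(default_specific_version())
```

A generated name like this tells you where and when the data were injected, but not what they contain, which is why providing your own name with -svName is usually preferable.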

The same action can be invoked with more parameters: -grade, -time and -svName. For instance, you can create a new EventStore DB with grade 'physics', time stamp '20040415' and 'specific version' 'MyDAnalysis', and use a data file pattern (*_hot_*.pds) to add only those files in the /home/vk/MC directory that match it.

To add more data to EventStore you no longer need the -newDB flag. If you need to specify a particular location of the underlying DB, pass its file name with the -sqlite option.
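The effect of a file pattern such as *_hot_*.pds can be previewed with Python's standard fnmatch module. This is only a sketch: it assumes that ESBuilder applies ordinary shell-style wildcard matching, and the file names below are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical file names; only the pattern comes from the text above.
files = [
    "run1_hot_a.pds",
    "run1_cold_a.pds",
    "run2_hot_b.pds",
    "notes_hot_.txt",
]

pattern = "*_hot_*.pds"

# Keep only the names that match the shell-style pattern (case-sensitive).
selected = [name for name in files if fnmatchcase(name, pattern)]
print(selected)
```

Checking a pattern this way before an injection helps avoid adding unwanted files to the DB.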

So far we have added data under the default view, 'all'. But what if you want to add a specific type of data that belongs to a different view? For instance, you made a list of D-tagging events and saved it in the ASCII Suez IDXA format. You can add this information to EventStore by using the -view option, here with the view name 'dtag'.

Now our EventStore knows about D-tagging events whose view is 'dtag'.
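The invocations described in this section can be summarized with a short Python sketch that assembles ESBuilder command lines from the flags listed in the usage synopsis. The helper function, the IDXA file name and the choice to pass the file pattern as a single quoted argument are assumptions made for illustration; the sketch only prints the commands and does not run them.

```python
import shlex


def esbuilder_command(add, new_db=False, grade=None, time_stamp=None,
                      sv_name=None, view=None, sqlite=None):
    """Assemble an ESBuilder argument list from flags in the usage synopsis."""
    args = ["ESBuilder"]
    if new_db:
        args.append("-newDB")
    args += ["-add", add]
    optional = (("-grade", grade), ("-time", time_stamp),
                ("-svName", sv_name), ("-view", view), ("-sqlite", sqlite))
    for flag, value in optional:
        if value is not None:
            args += [flag, str(value)]
    return args


examples = [
    # New DB with all defaults, from the MC directory.
    esbuilder_command("MC", new_db=True),
    # New DB with explicit grade, time stamp and specific version.
    esbuilder_command("/home/vk/MC/*_hot_*.pds", new_db=True, grade="physics",
                      time_stamp=20040415, sv_name="MyDAnalysis"),
    # Add a D-tagging list to an existing DB (hypothetical file name).
    esbuilder_command("dtag.idxa", view="dtag"),
]

for args in examples:
    print(" ".join(shlex.quote(arg) for arg in args))
```

Since all options can be specified in any order, the order used here is arbitrary. Run ESBuilder with -help to confirm the exact syntax before using such commands.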

But what if you lost track of what you have done and want to reproduce exactly the same DB again? Every time you invoke the ESBuilder script, it records your actions in the esdb.history ASCII file. Look in the directory where you keep your EventStore DB and read the content of this file. It will have the complete history of your commands. In addition, the esdb.log file is updated on every DB injection. It contains the PID, time stamp and SQL statement information, which can be useful for debugging purposes. The -verbose option is also useful for understanding the details of EventStore operations.

ESDump is a dump utility for your EventStore DB

ESDump prints the content of the underlying databases. The output may be more relevant for experts, but it is available to any user. This script can print either the content of all databases or the content of a specific one (e.g., VersionDB).

The -sqlite option is used to specify which file to use. The default is the sqlite.db file. If you need to access a MySQL DB, use the corresponding option with the appropriate MySQL EventStore DB server (run ESDump with -help for the exact syntax).
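If you only want a quick look at what the SQLite file contains, Python's standard sqlite3 module can list its tables and row counts. This is a minimal sketch rather than a replacement for ESDump: it assumes the file is in a format that the sqlite3 module can read (SQLite 3), makes no assumption about the EventStore table names, and the function name is hypothetical.

```python
import os
import sqlite3


def list_tables(db_file="sqlite.db"):
    """Return {table_name: row_count} for every table in an SQLite file."""
    conn = sqlite3.connect(db_file)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            # Table names are quoted because they come from the file itself.
            query = 'SELECT COUNT(*) FROM "{}"'.format(name.replace('"', '""'))
            counts[name] = conn.execute(query).fetchone()[0]
        return counts
    finally:
        conn.close()


# Only inspect the default file if it is already there, so that
# sqlite3.connect() does not create an empty one by accident.
if os.path.exists("sqlite.db"):
    for table, rows in list_tables().items():
        print(f"{table}: {rows} rows")
```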

ESFileContent

The ESFileContent tool allows you to see the content of data (pds, binary), key and location files. When it is invoked without any options, it summarizes the content of the file. If you need more information, you may add the -v option. Be aware that the verbose output can be very long. The script recognizes the format of the input file automatically. You may want to use this tool while debugging a problem or submitting a bug report to the software group.


Last revised: Mon Nov 29 23:25:54 EST 2004
Maintainer: Valentin Kuznetsov