The goal of this excercise is to train a Boosted Decision Tree (BDT) using the analysis TTree's discussed in the previous session. Simulation samples for the hadronic background (bggen) and signal MC (gen_3pi) have been generated, and using the trunk/home/pmatt/plugins/n3pi plugin generated analysis TTrees, which are stored on the VM at the path $GLUEX_DATA/ttree. Introductory Scripts: No exercises, just code example to introduce TMVA $GLUEX_DATA/session6b/TMVAClassification.C Input Data: $GLUEX_DATA/session6b/tree_n3pi_bggen_train.root (2M generated bggen events) $GLUEX_DATA/session6b/tree_n3pi_signal_train.root (50K generated signal events) If you are using the VM image from a jump drive given to you at the workshop you will need to copy the file tree_n3pi_bggen_train.root (1.3GB) from the jump drive to the path $GLUEX_DATA/session6b/ as it would not fit in a single image on the jump drives used. This copy can be done by dragging the file onto the VM desktop and then copying it to the proper path. Steps: 1) First setup the working directory by creating soft links to the relevant data: ln -s $GLUEX_DATA/session6b/ data 2) There are some variables we would like to compute which are not included in the default analysis TTree, such as whether or not the thrown topology is the 3pi n signal or not. For this we use a TSelector to process the analysis TTree and output a friend TTree which contains the new variables we need for the BDT training by executing the following macro root -l preTrain_n3pi.C 3) Next we train the BDT using the TMVA package, which reads in both the analysis TTree and the friend TTree and outputs a ROOT file (n3pi_trainTMVA.root) which has lots of diagnostics to check the training performance. Train the BDT with the following macro root -l train_n3piTMVA.C > logTraining & 4) Use the TMVA GUI to look at the training performace in n3pi_trainTMVA.root, which can be launched using the following macro root -l '$ROOTSYS/tmva/test/TMVAGui.C("n3pi_trainTMVA.root")' 5) Modify train_n3piTMVA.C to include the Measured__MissingMass variable and train the BDT again and compare to the BDT without this variable. Note: need to change baseName string in train_n3piTMVA.C as well to not overwrite previous weight and TMVA output ROOT files. Additional exercise (on your own): * Use TSelector to construct new variables to add to friend tree and include in BDT training to improve selection. For example, include calorimeter information from BCAL/FCAL hits not matched to reconstructed tracks.