%<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
\section{Physics Analysis}\label{amp-gpu}
\label{sec:physics_analysis}
In the following section we present a conceptual overview of the envisioned plan for taking the
outputs of reconstruction and simulation and conducting a physics analysis with them. Since the
core focus of GlueX is spectroscopy and the search for exotic mesons, we will primarily discuss
how such searches will take place. The same general analysis principles are applicable to other
types of physics analyses.
Briefly, GlueX will study reactions of the type $\gamma p\to X^+n$ or
$\gamma p \to X^0p$, where $X$ is an intermediate resonance of interest that decays to a
collection of stable hadrons. For the initial phases of GlueX running, the stable hadrons of
interest are $\pi^\pm$, $\pi^0$, $\eta$, and $\eta^\prime$. The goal is to isolate some collection
of stable hadrons, {\it e.g.,} $\pi^+\pi^-\pi^+$, and then study the initial state and intermediate resonances.
For example, the final state $\pi^+\pi^-\pi^+$ may be populated by decays $a_2^+\to f_2\pi^+,~f_2\to\pi^+\pi^-$ or
$\pi_2^+\to \rho^0\pi^+,~\rho^0\to\pi^+\pi^-$. For any collection of final state particles we want to identify
all such initial states $X$ and measure the quantum numbers (total angular momentum $J$,
parity $P$, and charge conjugation $C$) of $X$. The quantum numbers of the initial state are determined by
performing an unbinned maximum likelihood fit to the angular distributions of the data; this is discussed
in more detail below. In summary, the analysis process has two key steps: (1) the selection of a signal-rich
sample of events in which a particular collection of final state hadrons is produced, and (2) the subsequent
``amplitude analysis" of the angular distributions of these events in order to extract intermediate
resonances and their quantum numbers.
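The fit described above can be written schematically as follows (the notation here is generic rather than specific to the GlueX software): the measured angular distribution is modeled as a coherent sum of decay amplitudes,
\[
I(\Omega;\,\{V_\alpha\}) \;=\; \sum_{\mathrm{incoherent}} \Big|\, \sum_{\alpha} V_\alpha\, A_\alpha(\Omega) \Big|^2,
\]
where each $A_\alpha(\Omega)$ encodes the decay of a state $X$ with definite $J^{PC}$, the outer sum runs over incoherent contributions such as spin states, and the complex production parameters $V_\alpha$ are the free parameters determined by the unbinned maximum likelihood fit.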
\subsection{Event Selection}
After the raw data events are reconstructed they will be classified into several categories. While the classification
may become detailed at later stages of the experiment, we expect initial classification to require
that there be a tagged incident photon whose energy is approximately equal to the energy observed
in the detector. Because the level-1 hardware trigger for the experiment is very
loose and many photons are produced outside the tagging range, we expect that about 10\% of all
triggered events will satisfy this initial selection criterion, and these events will be the starting point for
physics analysis by members of the collaboration. This sample of events is expected to be resident on disk and all
analyzers will conduct their own skims of these data. We plan to retain the minimum amount of high-level
information needed to allow complete analysis of the event while reducing the disk footprint.
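The initial classification can be sketched in a few lines; the function and tolerance below are purely illustrative and are not part of the GlueX software.

```python
# Illustrative sketch of the initial event classification: keep events with
# a tagged incident photon whose energy roughly matches the total energy
# observed in the detector. The matching window is a hypothetical value.

TOLERANCE_GEV = 0.5  # hypothetical energy-matching window


def passes_initial_selection(tagged_energies, observed_energy,
                             tolerance=TOLERANCE_GEV):
    """Return True if any tagged photon matches the observed energy."""
    return any(abs(e_tag - observed_energy) < tolerance
               for e_tag in tagged_energies)


# A 9.0 GeV tagged photon with 8.8 GeV observed in the detector passes;
# an event whose tagged photons are all far from the observed energy fails.
```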
The next stage in the analysis process will be to define selection criteria that isolate a particular
set of final state particles. In order to avoid unintentional bias at this stage in the analysis we plan to use
samples of inclusive Monte Carlo generated with the version of Pythia tuned for photoproduction,
\emph{bggen}. As results emerge from GlueX, it is expected that this generator will be refined to more
accurately reflect the true background reactions. We will have a sample of Pythia-generated
photoproduction events available for analysis that is at least a factor of two larger than the data
sample collected with the experiment. This sample will allow users to understand the dominant
backgrounds for a particular analysis and develop event selection criteria to minimize those backgrounds.
At this stage of the analysis, it is expected that each analyzer may need to make up to ten passes through the
entire sample in order to refine the selection criteria for a particular analysis. While the exact mechanism for
conducting and managing these ``skims" of the data has yet to be developed, we would like to efficiently accommodate
a variety of user preferences. We are examining systems such as EventStore~\cite{Jones:2005eh}, used by CLEO-c,
which essentially provides a mechanism to efficiently index events and provide random access to lists of events.
While this system would reduce the disk footprint of a skim, it would require the user to perform analysis tasks, for example,
generating a histogram of a variable, inside the JANA framework. On the other end of the spectrum would be
a skim that produces a traditional $n$-tuple that can be easily manipulated with a package such as ROOT. Such an
approach results in duplication of data, but it tends to make subsequent analysis tasks easier for some users and
would be ideal for sparse skims of the data. By having several options available, users may choose the
technique that provides the highest efficiency for a particular analysis.
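The key idea behind an EventStore-style skim is that it records pointers to events rather than copies of them. The toy class below is a minimal sketch of that indexing scheme; all names are illustrative and do not correspond to the actual EventStore API.

```python
# Minimal sketch of an EventStore-style event index: a skim stores
# (file, record) pointers so the full event can be fetched on demand
# from the original files, avoiding duplication of event data on disk.
from collections import defaultdict


class EventIndex:
    def __init__(self):
        # skim name -> ordered list of (file name, record number) pointers
        self._skims = defaultdict(list)

    def tag(self, skim, file_name, record):
        """Add one event to the named skim."""
        self._skims[skim].append((file_name, record))

    def events(self, skim):
        """Random-access list of event locations for a given skim."""
        return self._skims[skim]


index = EventIndex()
index.tag("3pi", "run0101.evio", 42)
index.tag("3pi", "run0101.evio", 97)
print(len(index.events("3pi")))  # prints 2
```

A full $n$-tuple skim, by contrast, would copy the selected event data outright, trading disk space for direct access from a package such as ROOT.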
\subsection{Amplitude Analysis}
Once an analyzer has decided on a set of event selection criteria to select a final state of interest, the amplitude
analysis procedure can be started. The goal of amplitude analysis is to use all of the physical observables, {\it e.g.,}
decay angles and invariant masses, to extract information about the intermediate resonances. In order to do this,
one constructs a model probability density function that describes the density of events in the multi-dimensional
phase space of observables. This model contains free parameters, which are typically production amplitudes,
masses, or widths of various resonances that are determined through an unbinned maximum likelihood fit to the
data. Typically three collections of events are input to the fit: (a) the actual data that pass all event selection criteria;
(b) a sample of generated signal events, uniform in phase space, that is several times larger than the data; and (c)
the sample (b) after it has been subjected to the event selection criteria used to select the data events. The Monte
Carlo samples (b) and (c) are used to incorporate the acceptance of the detector and event selection
algorithm into the physics model that describes the decay.
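The role of the Monte Carlo samples can be made explicit in generic notation (the symbols below are schematic and not tied to a particular fitting package). Writing $\eta(x)$ for the combined detector and selection acceptance, the negative log likelihood takes the form
\[
-\ln \mathcal{L}(\theta) \;=\; -\sum_{i\,\in\,\mathrm{data}} \ln I(x_i;\theta) \;+\; \int \eta(x)\, I(x;\theta)\, dx,
\]
where the acceptance-weighted normalization integral is estimated, up to constant factors, from the accepted Monte Carlo events of sample (c):
\[
\int \eta(x)\, I(x;\theta)\, dx \;\approx\; \frac{1}{N_{\mathrm{gen}}} \sum_{j\,\in\,\mathrm{acc.\ MC}} I(x_j;\theta),
\]
with $N_{\mathrm{gen}}$ the number of generated events in sample (b). This is how the acceptance enters the physics model without ever being computed analytically.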
It is expected that the amplitude analysis process will be repeated many times during the course of an analysis as
the analyzer systematically tries different parameterizations of the physics model. Some analyses may require hundreds
or thousands of fits to be performed on the data to evaluate systematic uncertainties in the analysis.
As expected, performing an unbinned likelihood fit with many parameters to a large set of data presents a computational
challenge. However, the problem lends itself well to parallel computing since evaluating the log likelihood at each fit
iteration reduces to computing several large sums over the input event samples. Each term in the sum is essentially the
probability of producing a given event subject to the values of the fit parameters for that iteration. Once the event samples
have been distributed to multiple host machines these sums can be computed in parallel. The only information
exchange between the ``fit manager" and the compute nodes is then the values of partial sums and updated sets of
parameters. For large data samples, the fit time scales like $1/N$, where $N$ is the number of nodes used to
compute sums. Recently, graphical processing units (GPUs) have been utilized as an economical means of
performing the parallel computing needed for amplitude analysis. Commodity GPUs available for several hundred
dollars have provided one to two orders of magnitude increase in the speed at which amplitude analysis fits
can be performed. Large data sets can be spread over multiple GPUs hosted by multiple CPUs. In such a configuration
the popular Message Passing Interface (MPI) toolkit for parallel processing can be used to conduct fits on
multiple GPUs simultaneously. The collaboration has developed an amplitude analysis framework for performing
such fits. Given the easy access to GPU hardware, it is expected that most collaborating universities will be
able to accomplish amplitude analysis tasks with a modest collection of GPU resources at their home institution.
For each event, one only needs to input the four vectors of the final state particles into the amplitude analysis software;
therefore, the disk size of the event samples is comparatively small (tens to hundreds of GB), which will facilitate
easy analysis away from the centralized Jefferson Lab computing resources.
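The partial-sum structure of the likelihood evaluation can be illustrated with a short serial sketch; in the real MPI/GPU configuration each chunk would live on a separate node or device, and only the one partial sum per chunk would be communicated back to the fit manager each iteration. The model function and event values below are toy stand-ins, not GlueX code.

```python
# Serial sketch of the parallel log-likelihood sum. Each "chunk" plays the
# role of the events resident on one compute node (or GPU); the fit manager
# only ever sees one partial sum per chunk per fit iteration.
import math


def log_intensity(event, params):
    # Toy stand-in for the per-event probability model I(x; theta).
    return math.log(params["scale"] * event)


def partial_sum(events, params):
    # Evaluated independently on each node over its share of the events.
    return sum(log_intensity(e, params) for e in events)


def neg_log_likelihood(event_chunks, params):
    # The fit manager combines the partial sums; with N nodes this
    # reduction is the only inter-node communication, so the per-iteration
    # time scales like 1/N for large samples.
    return -sum(partial_sum(chunk, params) for chunk in event_chunks)


events = [1.0, 2.0, 3.0, 4.0]
chunks = [events[:2], events[2:]]  # distributed once, reused every iteration
nll = neg_log_likelihood(chunks, {"scale": 1.0})
```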
\subsection{A Test Case: $\gamma p \to \pi^+\pi^-\pi^+ n$}
In order to test the GlueX analysis framework, we conducted an amplitude analysis of mock data to study
the $\pi^+\pi^-\pi^+$ system produced in $\gamma p$ collisions. The event sample
corresponded to what we might expect to accumulate in several hours of data taking at beam intensities
comparable to those planned for the first production physics runs at GlueX.
The Pythia-based generator, \emph{bggen}, was used to generate inclusive $\gamma{p}$ photoproduction at $E_\gamma = 9$~GeV.
The signal events ($\gamma p \to \pi^+\pi^-\pi^+n$) were generated at a level of about 2.5\% of
the total hadronic cross section.
After optimizing all analysis criteria, a signal selection efficiency of 25\% and a signal-to-background
ratio of 2:1 were achieved. About 20\% of the total background originated from kaons misidentified as pions.
The other backgrounds included protons being misidentified as pions or extra $\pi^0$'s in the event
that went undetected. This study, conducted in 2011, motivated a more detailed simulation of particle identification
systems and tracking resolution along with enhancements in tracking efficiency. This work is still
under development, and we expect that these enhanced algorithms along with improvements in
analysis technique, such as kinematic fitting, will provide at least an order of magnitude further background
suppression. Reducing the background to the percent level is essential for enhancing sensitivity in the amplitude analysis.
The sensitivity to small amplitudes that is provided by the GlueX detector acceptance and resolution was tested
by performing an amplitude analysis on a sample of purely generated $\gamma p\to \pi^+\pi^-\pi^+ n$ events
that had been subjected to full detector simulation and reconstruction as discussed above. Several conventional resonances,
the $a_1$, $\pi_2$, and $a_2$, were generated along with a small ($<2\%$) component of exotic $\pi_1$.
The result of the fit is shown in Figure~\ref{fig:amp_analysis}. In such a fit, the data are divided into bins of $3\pi$
invariant mass. Each bin is fit independently, and the fit parameters are production amplitudes and phases of the
different resonances that ultimately populate the $3\pi$ final state. In this mock data sample, all of the $3\pi$ resonances
are modeled by a simple Breit-Wigner, and one can see that both the Breit-Wigner lineshape and phase can be
extracted from the small exotic wave decaying into $3\pi$ as well as the other, dominant resonances.
This study indicates that with a pure sample of
reconstructed decays, the GlueX detector provides excellent sensitivity to rare exotic decays.
The analysis sensitivity will ultimately be limited by the ability to suppress and
parametrize backgrounds in the amplitude analysis that arise from improperly reconstructed events, as noted above.
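For reference, a simple Breit-Wigner amplitude of the kind used for the mock-data resonances can be written in its standard relativistic form (the exact parameterization used in the study may differ, e.g.\ through barrier factors or a mass-dependent width):
\[
BW(m) \;=\; \frac{1}{m_0^2 - m^2 - i\, m_0 \Gamma_0},
\]
where $m$ is the $3\pi$ invariant mass and $m_0$ and $\Gamma_0$ are the nominal resonance mass and width. Both the magnitude and the phase motion of such an amplitude are what the bin-by-bin fit recovers in Figure~\ref{fig:amp_analysis}.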
The analysis of $\gamma p \to \pi^+\pi^-\pi^+ n$ represents the first full end-to-end analysis of simulated GlueX data.
While the statistics are only a fraction of what we expect to collect with the final experiment, the exercise has motivated
continued refinement of the reconstruction algorithms. Similar studies of other physics channels are also being
carried out by members of the collaboration, and the preliminary results demonstrate a notable improvement in efficiency and
background rejection over that presented above. We expect this process to continue over the next three years,
leading up to the first GlueX data. As the final production computing resources become available we plan to
increase the scale of our mock analyses in order to ensure that we have an analysis framework that is capable of
meeting the demands of the experiment.
\begin{figure}
\begin{center}
\includegraphics[width=0.9\linewidth]{figs/amp_analysis_fig}
\caption{\label{fig:amp_analysis}A sample amplitude analysis result for the $\gamma p \to \pi^+\pi^-\pi^+ n$ channel with
GlueX. (top) The invariant mass spectrum as a function of $M(\pi^+\pi^-\pi^+)$ is shown by the solid histogram. The results
of the amplitude decomposition into resonant components in each bin are shown with points and error bars. (bottom) The
exotic amplitude, generated at a relative strength of 1.6\%, is cleanly extracted (red points). The black points show the
phase between the $\pi_1$ and $a_1$ amplitudes.}
\end{center}
\end{figure}