Final Year Physics Project Interim Report

Anthony Hashemi
Department of Physics, University of Warwick, Coventry CV4 7AL, United Kingdom

I. Introduction

The Standard Model of particle physics describes the fundamental particles (fermions and bosons) and their corresponding interactions [1]: the electromagnetic, strong and weak forces. Fermions have half-integer spin, and each has a corresponding antiparticle different from itself. Bosons have integer spin and can be their own antiparticle. They are responsible for the fundamental forces between particles, and include the photon, the gluons, and the weak (W and Z) bosons. All the predicted particles of the Standard Model have now been discovered experimentally, the last being the Higgs boson, which was confirmed in the past few years [2, 3].
However, it has many parameters that must be determined by experiment and includes only three of the four fundamental forces; it excludes gravity, as the graviton is yet to be found [4]. It also does not explain cosmological phenomena such as dark matter [5] or what happened to all the antimatter after the big bang (CP violation) [6]. Therefore, it is believed to approximate a larger, complete theory.

The Standard Model organises fermions (quarks and leptons) by flavour (six quark flavours and six lepton flavours). Of the three Standard Model interactions, only the weak force can change flavour, and then only in charged-current transitions (mediated by a $W^+$ or $W^-$ boson). Flavour transitions between fermions of the same charge (and different flavour), known as flavour-changing neutral-current (FCNC) processes, therefore receive no contribution from tree-level diagrams [7] and occur only through higher-order loop processes. Tree-level diagrams are Feynman diagrams [8] without loops; higher-order processes have diagrams containing loops of virtual particles and represent higher orders of perturbation theory. (Virtual particles are transient intermediate states that are off their mass shell; they arise in perturbation theory, where particle interactions are described in terms of their exchange.)

The Cabibbo-Kobayashi-Maskawa (CKM) matrix [9] describes flavour-changing weak decays; specifically, it gives the likelihood of a transition between up-type and down-type quarks. We will use the Wolfenstein parametrization of the CKM matrix [10], which is approximately diagonal: the diagonal elements correspond to the charge-changing transitions within each of the three quark generations ($u \leftrightarrow d$, $c \leftrightarrow s$, $t \leftrightarrow b$), while the off-diagonal elements, which connect different generations and enter the higher-order loop transitions, are small compared to the diagonal ones. Therefore, in the Standard Model, higher-order loop processes are suppressed by the Glashow-Iliopoulos-Maiani (GIM) mechanism [11] and by the CKM matrix being approximately diagonal.

Figure 1: The CKM matrix. Each element encodes the strength of the transition between the two quarks in its subscript, and its squared magnitude is proportional to the transition probability. For example, $|V_{ud}|^2$ gives the relative probability of an up (or anti-up) quark transitioning into a down (or anti-down) quark or vice versa. In the Wolfenstein parametrization shown, $\lambda = 0.2257^{+0.0009}_{-0.0010}$ is the expansion parameter, $\rho = 0.135^{+0.031}_{-0.016}$, $\eta = 0.349^{+0.015}_{-0.017}$ and $A = 0.814^{+0.021}_{-0.022}$. The Wolfenstein CKM matrix is therefore almost diagonal.
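To make the "almost diagonal" statement concrete, the following minimal sketch evaluates the standard Wolfenstein form of the CKM matrix to third order in $\lambda$, using the central values quoted in the Figure 1 caption. The function name and the choice to ignore the quoted uncertainties are assumptions made for illustration only.

```python
# Central values from the Figure 1 caption (uncertainties are ignored here).
LAMBDA, A, RHO, ETA = 0.2257, 0.814, 0.135, 0.349


def wolfenstein_ckm(lam, a, rho, eta):
    """Return the 3x3 CKM matrix in the Wolfenstein form, to O(lambda^3)."""
    return [
        [1 - lam**2 / 2, lam, a * lam**3 * complex(rho, -eta)],
        [-lam, 1 - lam**2 / 2, a * lam**2],
        [a * lam**3 * complex(1 - rho, -eta), -a * lam**2, 1],
    ]


ckm = wolfenstein_ckm(LAMBDA, A, RHO, ETA)

# Magnitudes |V_ij|: rows are u, c, t and columns are d, s, b.
magnitudes = [[abs(v) for v in row] for row in ckm]

for label, row in zip(("u", "c", "t"), magnitudes):
    print(label, "  ".join(f"{m:.4f}" for m in row))
```

The diagonal magnitudes come out close to one, while the elements linking the first and third generations are of order $\lambda^3$, which is the suppression referred to in the text.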
CP violation is the violation of the combined charge-conjugation (C) and parity (P) symmetry. Charge conjugation is the symmetry between particles and antiparticles (exchanging positive and negative charges) and parity is the symmetry under inversion of the spatial coordinates. CP violation has been tested in the kaon system [12] but not across the whole SM, and there is a discrepancy between the CP violation provided by the CKM matrix and the amount required to explain the low abundance of antimatter in the universe. Therefore, looking for CP violation in B decays beyond that of the SM CKM matrix is vital to account for the baryon asymmetry in the universe. Some examples of new physics (NP) are low-energy supersymmetric extensions of the SM (SUSY) [13], where there are extra sources of CP violation and the CKM mixing is applied to the new particles.

Rare B meson decays are decays of B mesons, i.e. mesons containing a bottom (b) quark or antiquark, in which a fermion changes flavour through a suppressed process. Rare decays are suppressed because they involve higher-order loops, small (combinations of) CKM matrix elements, or both. These features do not necessarily hold in physics beyond the Standard Model.
As the SM describes fundamental particles and interactions very well, any deviations from the SM that NP brings must be small. Such changes are therefore most likely to be seen in processes that are suppressed in the SM, as rare decays are, which makes rare decays of interest when investigating NP. The first rare B decay was observed by the CLEO experiment [14] in the $b \to s$ process, and the subsequent BaBar [15, 16] and Belle [17] experiments, among others, have given strong constraints on NP models.

Figure 2: LHCb schematic. http://lhcb-public.web.cern.ch/lhcbpublic/en/Detector/Detector-en.html

At the LHC [18] there are four large experiments: ATLAS [19], ALICE [20], CMS [21] and LHCb [22] (as well as other smaller ones). LHCb is a forward spectrometer covering the pseudorapidity range $2 < \eta < 5$ whose objective is to look for evidence that the Standard Model (SM) of particle physics is not complete, specifically by looking at rare decays of B hadrons (particles containing a b or anti-b quark) as explained above. Before the LHC, the rare B decays studied agreed with the SM to the level of precision possible.
LHCb allows greater precision because of the high-energy proton-proton collisions, which give a large $b\bar{b}$ production cross section of $\mathcal{O}(100\,\mu\mathrm{b})$ [23], and so offers the possibility of showing discrepancies with the SM.

II. Theory

The $B^0$ is made of an anti-b quark and a d quark. The $\pi^0$ is made up of d anti-d in this decay (it can also be u anti-u). Then there is the dimuon pair, $\mu^+ \mu^-$. At the quark level the decay is therefore anti-b to anti-d, $\mu^+$ and $\mu^-$.

B mesons produced by proton-proton collisions in LHCb stay close to the line of the beam pipe, so the detector is a series of different components stacked behind each other along that line. First, the Vertex Locator (VeLo) detector is where the proton beams collide. The B meson is detected indirectly by matching the theoretical decay distance of the B meson (just over 1 mm) to a particle decay distance in the VeLo.
The decay distance of a particle in the VeLo is measured from the position of the particle decay and the position of the proton collision. Then the Ring Imaging Cherenkov (RICH) detectors identify the different charged particles from a B decay, such as pions, kaons and protons, by measuring the emission of Cherenkov radiation, which allows their velocity to be determined. Between the two RICH detectors there is a magnet, which allows the momentum of charged decay particles to be calculated from the curvature of their paths in the field. Silicon and outer trackers then record the trajectories of charged particles: charged particles interact with silicon atoms in the silicon tracker, giving an electric current, and they may ionise gas molecules in the outer tracker, also giving an electric current. Combining the velocity from the RICH detectors with the momentum and charge obtained from the trajectories in the magnetic field gives the mass and charge, and therefore the identity of the particle.
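The last step above, turning a measured momentum and velocity into a mass, follows from $p = \gamma m \beta$ in natural units, so that $m = p\sqrt{1-\beta^2}/\beta$. The sketch below illustrates this relation only; the function names and the 10 GeV example momentum are hypothetical, and a real particle-identification algorithm is considerably more involved.

```python
import math


def mass_from_momentum_and_velocity(p_mev, beta):
    """Mass in MeV from momentum (MeV) and velocity beta = v/c, using p = gamma*m*beta."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1 for a massive particle")
    return p_mev * math.sqrt(1.0 - beta**2) / beta


def beta_from_momentum_and_mass(p_mev, mass_mev):
    """Velocity beta = p / E that a particle of the given mass and momentum would have."""
    return p_mev / math.hypot(p_mev, mass_mev)


# Hypothetical 10 GeV track: each mass hypothesis corresponds to a different velocity.
MOMENTUM_MEV = 10_000.0
KNOWN_MASSES_MEV = {"pion": 139.57, "kaon": 493.68, "proton": 938.27}

for name, mass in KNOWN_MASSES_MEV.items():
    beta = beta_from_momentum_and_mass(MOMENTUM_MEV, mass)
    recovered = mass_from_momentum_and_velocity(MOMENTUM_MEV, beta)
    print(f"{name}: beta = {beta:.6f}, reconstructed mass = {recovered:.2f} MeV")
```

At this momentum the three velocities differ only in the third or fourth decimal place, which is why a dedicated velocity measurement is needed to separate the hypotheses.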
The $\pi^0$ is not charged and so cannot be detected by the RICH, magnet or tracking detectors, so it (and other neutral particles) needs an extra detector. This is provided by the calorimeter system, which stops particles as they hit metal plates, releasing showers of secondary particles which then excite molecules in plastic scintillator layers, which emit ultraviolet light. The calorimeters then determine the energy deposited from the amount of UV light, which is proportional to the energy of the particles. The other particles in the B decay are the dimuon pair, $\mu^+ \mu^-$. Although they are charged, they need a separate detector system from the other charged particles and cannot be measured by the calorimeter system, as they pass through it with very little energy loss: being much heavier than electrons they radiate little, and they do not interact via the strong force. The muon system detects essentially only muons, as nearly all other particles are stopped by the calorimeter system. It is made up of five rectangular stations filled with a gas mixture of carbon dioxide, argon and tetrafluoromethane, which the muons ionise; the resulting charge is collected by electrodes.

A Monte Carlo (MC) simulation of proton-proton collisions is used so that the real data can be compared with what we expect it to look like.
The simulated processes include the production of the $B^0$ in the proton-proton collision and its decay, the hadronics (propagation of particles) and the electronics of the detector. Charges in the simulated electronics correspond to hits, which allow tracks and particles to be constructed in the same way as the real particle charges in the detectors do.

All the events in our dataset are potential $B^0$ decays, but most of them are background. They are selected by combining two muons with a neutral pion while placing requirements on the impact parameters of these daughter particles and of the candidate B, requiring the B momentum to point back to the primary vertex, and applying further requirements to the muons. There are different types of background (misidentified groups of particles) present in the data. One type is combinatorial background, where the particles in an event come from different decays; another is peaking background, where the particles come from a single decay but at least one of them is misidentified; and lastly there are partially reconstructed backgrounds, where at least one of the final-state particles in the decay is not reconstructed [24].

To choose which variables are best to cut on, we use Receiver Operating Characteristic (ROC) curves [25]. ROC curves are graphs where $1 - B$ is plotted against $S$, where $B$ is the background efficiency (accepted background events / total number of background events) and $S$ is the signal efficiency (accepted signal events / total number of signal events). The accepted events are those kept by the cut on that variable (the direction of the cut being the one that removes more background than signal).
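The ROC construction just defined can be sketched in a few lines. This is a minimal illustration with made-up toy values, not the ROOT code used in the project: `roc_points`, `roc_area` and `best_roc_area` are hypothetical helper names, cuts are scanned at every distinct value in the samples, and the area under the curve is computed with the trapezoid rule as one common single-number summary.

```python
def roc_points(signal, background):
    """ROC points (S, 1 - B) for cuts that keep events with value > cut."""
    values = sorted(set(signal) | set(background))
    cuts = [values[0] - 1.0] + values  # the first cut accepts every event
    points = []
    for cut in reversed(cuts):  # tightest cut first, so S rises from 0 to 1
        s_eff = sum(x > cut for x in signal) / len(signal)
        b_eff = sum(x > cut for x in background) / len(background)
        points.append((s_eff, 1.0 - b_eff))
    return points


def roc_area(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def best_roc_area(signal, background):
    """Area for the better of the two cut directions (keep above or keep below)."""
    keep_above = roc_area(roc_points(signal, background))
    keep_below = roc_area(roc_points([-x for x in signal], [-x for x in background]))
    return max(keep_above, keep_below)


# Toy values only: a fully separated variable and a useless one.
separated = roc_area(roc_points([5.0, 6.0, 7.0, 8.0], [1.0, 2.0, 3.0, 4.0]))
useless = roc_area(roc_points([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]))
print(separated, useless)
```

Negating the values in `best_roc_area` is simply a way of reusing the same scan for cuts that keep events below the threshold.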
A square ROC curve would be the optimum (full separation) and the worst would be a straight line between (0,1) and (1,0) (no separation). With this convention all curves have an area of at least 0.5, and the greater the area, the better the variable is to cut on.

Once good variables have been chosen to cut on to remove background (but preserve most signal), the best value of each variable to cut on can be found using a figure of merit. Here this is chosen to be the Punzi figure of merit, $P = S / (1 + \sqrt{B \, N_B})$, where $N_B$ is the total number of background events.
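As a minimal sketch of how such a scan might look, the code below reads the figure of merit as $P = S / (1 + \sqrt{B \, N_B})$ and evaluates it for a handful of candidate cuts. The toy values, the cut direction (keep events below the cut) and the helper names are assumptions for illustration, not the analysis code.

```python
import math


def punzi(signal_eff, background_eff, n_background):
    """Punzi figure of merit P = S / (1 + sqrt(B * N_B))."""
    return signal_eff / (1.0 + math.sqrt(background_eff * n_background))


def best_cut(signal, background, cuts, n_background=None):
    """Scan cuts of the form value < cut and return (cut, P) with the largest P."""
    if n_background is None:
        n_background = len(background)  # assumption: the sample itself defines N_B
    best = None
    for cut in cuts:
        s_eff = sum(x < cut for x in signal) / len(signal)
        b_eff = sum(x < cut for x in background) / len(background)
        score = punzi(s_eff, b_eff, n_background)
        if best is None or score > best[1]:
            best = (cut, score)
    return best


# Toy values only.
toy_signal = [1.0, 2.0, 3.0, 4.0]
toy_background = [3.5, 5.0, 6.0, 7.0]
chosen_cut, chosen_score = best_cut(toy_signal, toy_background, [2.5, 3.25, 4.5, 8.0])
print(chosen_cut, chosen_score)
```

In this toy case the loosest cuts keep all the signal but are penalised by the background they let through, so the scan settles on an intermediate value.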
Maximising the Punzi figure of merit gives the best balance between background rejection and signal loss [26].

Classifiers predict the outcome of a binary question. Multivariate classifiers combine multiple input variables into a single output variable, which can give better separation than any single variable, in which trends may not be visible. There are multiple types of multivariate classification methods, such as Boosted Decision Trees (BDT) and Artificial Neural Networks (ANN).
A decision tree (DT) is a sequence of nodes connected by branches, with a binary decision made at each node. A final classification is given on reaching the end of a path through the tree (a leaf). A DT is trained on a dataset for which the required outcome is already known.
When looking for a particle decay signal, the signal sample is taken from simulated data and the background sample from real data (selected by looking outside the predicted mass range of the signal; if that is not possible, simulated background data can be used instead). A decision tree alone is a rather poor classifier, as it tends to overfit the training set and so generalises badly. A random forest is an ensemble learning method made up of a collection of decision trees, whose classification is the most common classification among the separate trees. This corrects the overtraining of a single DT, as the average over all the trees is insensitive to fluctuations of specific training samples. A BDT is likewise an ensemble of decision trees, but one built with a boosting method so that misclassified events are improved on. AdaBoost (Adaptive Boosting), used here, gives each event in the training set a weight (its probability of being used in a training subset); a higher weight indicates a greater probability. It trains a tree on this subset of the training data, then increases the weights of misclassified events while reducing the weights of correctly classified events, and repeats the process until the requested number of trees has been trained or the weighted fraction of misclassified events reaches 50%. The final output is the sum of the outputs of all the classifiers, weighted according to their errors: a classifier with less than 50% accuracy is given a negative weight, one with exactly 50% accuracy zero weight, and one with more than 50% accuracy a positive weight.

An Artificial Neural Network (ANN), on the other hand, is a model loosely based on the human brain, specifically its network of neurons.
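Before moving on to neural networks, the AdaBoost procedure described above can be illustrated with a minimal sketch. This is textbook discrete AdaBoost with one-cut decision stumps on a toy one-dimensional data set; it is not TMVA's implementation, it applies the event weights directly in the error rather than resampling a training subset, and all names and data values are hypothetical.

```python
import math


def stump_predict(stump, x):
    """A stump (threshold, sign) predicts `sign` below the threshold, `-sign` above."""
    threshold, sign = stump
    return sign if x < threshold else -sign


def best_stump(xs, ys, weights):
    """Return the stump with the smallest weighted error, and that error."""
    values = sorted(set(xs))
    thresholds = [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    best, best_err = None, float("inf")
    for threshold in thresholds:
        for sign in (+1, -1):
            stump = (threshold, sign)
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(stump, x) != y)
            if err < best_err:
                best, best_err = stump, err
    return best, best_err


def adaboost(xs, ys, n_rounds):
    """Train up to n_rounds stumps; returns a list of (alpha, stump)."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(n_rounds):
        stump, err = best_stump(xs, ys, weights)
        if err >= 0.5:  # no better than guessing: stop boosting
            break
        err = max(err, 1e-12)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weights of misclassified events, lower the others, renormalise.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def classify(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data: +1 is "signal", -1 is "background"; no single cut separates them.
XS = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

boosted = adaboost(XS, YS, n_rounds=3)
print([classify(boosted, x) for x in XS])
```

The formula for `alpha` is what produces the weighting described in the text: it is positive when a stump is right more often than wrong, zero at exactly 50%, and would be negative below that.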
Like a BDT, it learns from a training set but does so differently. ANNs usually consist of layers of interconnected nodes (artificial neurons): an input layer, one or more hidden layers and an output layer. They can be either feed-forward (signals travel only from input to output) or recurrent (signals can travel in both directions, allowing older results to be used again and giving the system a sort of 'memory'). Here the focus is on feed-forward types, specifically multilayer perceptrons (MLP).
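As a concrete picture of the feed-forward structure, the sketch below pushes an input through one hidden layer and one output node, with each node taking a weighted sum of its inputs and applying a sigmoid activation. The weights are chosen by hand purely for illustration (they are not trained, and the layer layout is an assumption), so that the network reproduces the XOR function, a simple problem that no single straight-line cut can solve.

```python
import math


def sigmoid(z):
    """Smooth activation function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Each node outputs sigmoid(weighted sum of its inputs + its bias)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(inputs, layers):
    """Feed the signal forward through the layers, from input to output."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Hand-picked weights: two hidden nodes (roughly OR and AND) and one output node.
XOR_LAYERS = [
    ([[20.0, 20.0], [20.0, 20.0]], [-10.0, -30.0]),  # hidden layer
    ([[20.0, -20.0]], [-10.0]),                      # output layer
]

for a in (0, 1):
    for b in (0, 1):
        output = mlp_forward([a, b], XOR_LAYERS)[0]
        print(a, b, round(output))
```

In a trained network these weights would start from random values and be adjusted by the learning procedure rather than set by hand.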
A simple perceptron has an input layer and an output node. The inputs are fed into the output node, but each connection is weighted (initially at random) and the weights change as the network learns. The output node takes the weighted sum of the input values and applies an activation function to give an output. Simple perceptrons, however, can only model linear problems well. So for signal/background classification an MLP is required, in which there are one or more 'hidden' layers between the input and the output. Each hidden neuron acts like the output node in the simple case, but feeds into another layer of hidden nodes or finally into an output node. The non-linear activation functions applied in the hidden neurons allow a non-linear classification; with purely linear activations the network would reduce to a single linear model. The errors in the classification are then corrected by propagating the error backward (backpropagation), readjusting the weights accordingly at each layer [27].

Figure 3: ROC for B0_MMERR. Area = 0.93.

Figure 4: Punzi figure of merit for B0_MMERR. It has a maximum at 29.9, which is the best value to cut on.

Figure 5: gamma1_PY for MC background and real signal, showing a strange issue in the calorimeter.

III. Work Done in Term 1

Firstly, I familiarised myself with ROOT and C++ and revisited Python. Then I worked on the B decay kinematics questions provided: calculating the $B^0$ decay distance, the momenta of an example $B^0$ decay, impact parameters and kaon decay distances (all in ROOT).
Then I determined the number of signal events in a toy sample by fitting to the mass variable and integrating the area under the fitted curve. Then I started analysing the data from CERN, specifically looking for variables that would give a good cut to remove background from the real data so that a signal peak could be seen in the B0_M variable. This was done by superimposing two distributions of each variable: one from the MC data in the signal region, given by B0_BKGCAT < 51, and the other from the real data in the background-only range, i.e. B0_M < 4900 MeV or B0_M > 5800 MeV (outside the MC signal range of B0_M). After looking at these, I chose cuts for the variables whose two distributions overlapped least, to remove background without losing much signal. Single cuts and cuts applied in parallel gave no obvious signal when applied to the real-data B0_M plot, although they made it look much more exponential, so they removed a large amount of background covering the whole range.
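The two sample definitions and the idea of "least overlap" can be sketched as follows. The thresholds are the ones quoted in the text; everything else is an assumption for illustration: events are represented as plain dictionaries, the branch names are written with underscores, and the overlap is quantified as the shared area of two normalised histograms, whereas in the project it was judged from the superimposed plots.

```python
def is_mc_signal(event):
    """MC events counted as signal: B0_BKGCAT < 51."""
    return event["B0_BKGCAT"] < 51


def is_data_sideband(event):
    """Real-data events in the background-only mass range (MeV)."""
    mass = event["B0_M"]
    return mass < 4900.0 or mass > 5800.0


def normalised_histogram(values, low, high, n_bins):
    """Fraction of in-range entries per bin; entries outside [low, high) are ignored."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for value in values:
        if low <= value < high:
            index = min(int((value - low) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def histogram_overlap(sample_a, sample_b, low, high, n_bins):
    """Shared area of the two normalised histograms: 0 = disjoint, 1 = identical."""
    hist_a = normalised_histogram(sample_a, low, high, n_bins)
    hist_b = normalised_histogram(sample_b, low, high, n_bins)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


# Toy events only, with hypothetical values.
mc_events = [{"B0_BKGCAT": 0, "B0_M": 5280.0}, {"B0_BKGCAT": 60, "B0_M": 5100.0}]
data_events = [{"B0_M": 4700.0}, {"B0_M": 5300.0}, {"B0_M": 6000.0}]

n_mc_signal = sum(is_mc_signal(e) for e in mc_events)
n_sideband = sum(is_data_sideband(e) for e in data_events)
print(n_mc_signal, n_sideband)
```

A variable with a small overlap between the MC signal and sideband distributions is a candidate for a cut, which is the selection criterion described above.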
Subsequently I plotted ROC curves for these good variables to find the best of them, ranking them by area under the curve. The top 10 variables were (in order from best to worst): B0_MMERR, pi0_P, pi0_PT, pi0_MMERR, B0_TAUERR, gamma1_P, B0_DTF_Chi2, B0_IPCHI2_OWNPV, B0_TAUCHI2, B0_ETA.

Unexpected features were observed in the pi0_PY and gamma1(2)_PY distributions. After consideration this was thought to be an issue in the calorimeter, which was confirmed by plotting, for both the MC and real data (in both signal and background regions), atan(gamma1_PY/gamma1_PZ) against atan(gamma1_PX/gamma1_PZ), which shows where particles are detected in the calorimeter. For the real data there was a strip along the x axis of the calorimeter, but in the MC data only the background had this. The MC data in both regions also had a well-defined rectangular region near the centre of the calorimeter that is not present in the real data. The reason for this is not yet properly understood, and it will need addressing in term 2.

Then I started using TMVA, with only the BDT method active and with the given starting variables in the TMVA code, using the MC signal and real background data as the training sample. The output, from testing on the real data (non-training subset), gave a good ROC curve (area > 0.9).

I also started looking at branching fractions for different possible background decays, to see how significant they are.

Figure 6: Plots for both the MC and real data (in both signal and background regions) of atan(gamma1_PY/gamma1_PZ) against atan(gamma1_PX/gamma1_PZ). Discrepancies between the images of the MC and real data indicate an issue in the calorimeter, confirming the assumption made.

IV. Plan for Term 2

We will apply the TMVA BDT output acquired in term 1 to the original data by adding the output weight to the original tree, plotting a Punzi figure of merit curve for it and finding the optimum value to cut on.
Then we will try different combinations of the best variables found earlier as inputs to TMVA and compare their ROC curves to find the best combination. This should take about a week, after which some signal should be identifiable in the data. We will also test, and possibly use, other forms of machine learning, such as the neural networks discussed in the Theory section, again using the same TMVA code. This could take between 1 and 3 weeks, depending on how many variations of the TMVA methods are used and how far we tune their parameters. We will remove large amounts of background by identifying the causes of background discussed in the Theory section, including the J/psi reconstruction of the dimuon pair and different types of combinatorial backgrounds such as ... whose branching fractions were assessed in term 1. We will simulate particle decays for these background sources in a similar way to the B decay kinematics problems at the beginning of term 1. This should take about 1 to 2 weeks and should allow us to put constraints on the data.
This will hopefully be accompanied by acquiring a new, streamlined data set with all the J/psi contributions removed. We will also remove the data from the strip around the calorimeter that causes the issue in the PY variables discussed in term 1. This should remove substantial background without losing much signal.
Finally, the branching fractions which we started looking at in term 1 will be used to focus on certain types of background.

References

[1] Kibble T W B 2014 ArXiv e-prints (Preprint 1412.4094)
[2] Aad G et al. (ATLAS Collaboration) 2012 Phys. Lett. B716 1–29 (Preprint 1207.7214)
[3] 2014 Nature Physics 10 557–560 (Preprint 1401.6527)
[4] Dyson F 2013 International Journal of Modern Physics A 28 1330041 (Preprint http://www.worldscientific.com/doi/pdf/10.1142/S0217751X1330041X) URL http://www.worldscientific.com/doi/abs/10.1142/S0217751X1330041X
[5] Abdallah J et al. 2015 Physics of the Dark Universe 9-10 8–23 ISSN 2212-6864 URL http://www.sciencedirect.com/science/article/pii/S2212686415000163
[6] Y N 2000 165 (Preprint hep-ph/9911321)
[7] Glashow S L, Iliopoulos J and Maiani L 1970 Phys. Rev. D2 1285–1292
[8] Kumericki K 2016 ArXiv e-prints (Preprint 1602.04182)
[9] Gershon T 2012 Pramana 79 1091–1108 (Preprint 1112.1984)
[10] Wolfenstein L 1983 Phys. Rev. Lett. 51(21) 1945–1947 URL https://link.aps.org/doi/10.1103/PhysRevLett.51.1945
[11] Maiani L 2013 ArXiv e-prints (Preprint 1303.6154)
[12] D'Ambrosio G and Isidori G 1998 International Journal of Modern Physics A 13 1–93 (Preprint hep-ph/9611284)
[13] Sohnius M F 1985 Phys. Rept. 128 39–204
[14] Alam M S et al. (CLEO Collaboration) 1995 Phys. Rev. Lett. 74(15) 2885 URL https://link.aps.org/doi/10.1103/PhysRevLett.74.2885
[15] Lees J P et al. (BABAR Collaboration) 2012 Phys. Rev. Lett. 109(19) 191801 URL https://link.aps.org/doi/10.1103/PhysRevLett.109.191801
[16] Lees J P et al. (BABAR Collaboration) 2012 Phys. Rev. D 86(11) 112008 URL https://link.aps.org/doi/10.1103/PhysRevD.86.112008
[17] Limosani A et al. 2009 Physical Review Letters 103 241801 (Preprint 0907.1384)
[18] Jr A A A et al. (LHCb) 2008 Journal of Instrumentation 3 S08005 URL http://stacks.iop.org/1748-0221/3/i=08/a=S08005
[19] Aad G et al. (ATLAS) 2008 JINST 3 S08003
[20] Aamodt K et al. (ALICE) 2008 JINST 3 S08002
[21] Chatrchyan S et al. (CMS) 2008 JINST 3 S08004
[22] Alves Jr A A et al. (LHCb) 2008 JINST 3 S08005
[23] LHCb Collaboration 2010 Physics Letters B 694 209–216 (Preprint 1009.2731)
[24] Aaij R et al. (LHCb) 2012 JHEP 12 125 (Preprint 1210.2645)
[25] Fawcett T 2006 Pattern Recogn. Lett. 27 861–874 ISSN 0167-8655 URL http://dx.doi.org/10.1016/j.patrec.2005.10.010
[26] Punzi G 2003 79 (Preprint physics/0308063)
[27] Hoecker A et al. 2007 ArXiv Physics e-prints (Preprint physics/0703039)