Event selection

From ATLAS-TRIUMF


Event Selection for Real Data

A log of my steps to get ELSSI / TAGs working for me, so I don't forget. Should not contain any ATLAS secret information!!!

Creating TAG files with ELSSI is easy; just follow the steps at https://cern.ch/tagservices/ (or go directly to https://elssi.triumf.ca), and use the most recent version, currently 02-01-01.

Log in with your Grid certificate, not your CERN ID certificate, if you have both. If you have several certificates installed, it's better to make your browser ask which one to use rather than letting it choose the default. Of course, that means it asks a lot.

ELSSI is rather slow for some steps. It is faster if you pick a nearby interface (TRIUMF was much faster for me than CERN). If it spins its wheels for minutes on end, that doesn't necessarily mean it won't work in the end.

Pick TRIUMF as the location, then pick a dataset. Pick some (or all) of the proposed runs. Set some physics criteria (or trigger, or DQ, bearing in mind that there weren't many, or perhaps any, all-green runs).

Count events passing your selection. Make sure it seems like a reasonable number for your needs. Go back using the "Back" arrow at the top right of the panel (not the browser Back). Repeat as many times as needed (this is pretty fast).

Enter e-mail address. Retrieve.

A bit later you get an e-mail saying it's done. The resulting TAG file isn't on the Grid, and I had no luck at all using the "--shipInput" option with pathena, so I retrieved it locally from the web interface, then put it on the Grid with

  dq2-put -L TRIUMF-LCG2_SCRATCHDISK -f myNewTagFile -s directory_where_I_put_it user10.IsabelTrigger.myTAGdataset1

Then I could run

  pathena myJobOptions.py --inDS user10.IsabelTrigger.myTAGdataset1 --outDS user10.IsabelTrigger.testOutputFromTAGrunning

and indeed it all worked! I was just running the AnalysisSkeleton_topOptions.py from UserAnalysis, so I ended up with a basic AANT file as output. The only modification to run on TAG files is to add the line:

svcMgr.EventSelector.CollectionType = "ExplicitROOT"

For local running (not pathena, but just testing running with TAGs) you may need a line like

svcMgr.PoolSvc.ReadCatalog += ["xmlcatalog_file:/data/isabel1/PoolFileCatalog.xml"]

which seems to do no harm if you leave it there when you run on the Grid with pathena.
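To keep the two job-option lines above together, here is a minimal Python sketch that just returns them as text, ready to paste or append to a job options file. The function name tag_job_option_lines and its local_catalog argument are made up for illustration; the lines it emits are exactly the ones quoted above, and nothing here talks to Athena itself.

```python
def tag_job_option_lines(local_catalog=None):
    """Return the extra job-option lines needed to read TAG files.

    local_catalog is the path to a PoolFileCatalog.xml; it is only
    needed for local running, so it is left out when None.
    """
    lines = ['svcMgr.EventSelector.CollectionType = "ExplicitROOT"']
    if local_catalog is not None:
        lines.append(
            'svcMgr.PoolSvc.ReadCatalog += ["xmlcatalog_file:%s"]' % local_catalog
        )
    return lines


if __name__ == "__main__":
    # Print the fragment for a local test run with an explicit catalog.
    for line in tag_job_option_lines("/data/isabel1/PoolFileCatalog.xml"):
        print(line)
```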

For the record, the input TAG file generated by ELSSI is in: user10.IsabelTrigger.myELSSIinputTAGfiles and the resulting output AANT file is in user10.IsabelTrigger.testELSSI_TAGinput.2

It contains 497 events from the 2.36 TeV MinBias stream, which are supposed to have NTrk>0 and NJet>0 according to my selection criteria. They don't all have jets in them in the final ntuple, but a lot of them do, so it seems to have worked (I guess the definition of NJet isn't the same?).
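For scripting the upload and submission steps described earlier in this section, here is a minimal Python sketch that only assembles the same dq2-put and pathena command lines from their parts. The helper names (dq2_put_command, pathena_command, to_shell) are made up for illustration; only the flags already quoted in this section are used, and nothing is actually executed.

```python
import shlex


def dq2_put_command(site, tag_file, source_dir, dataset):
    """Argument list for registering a local TAG file in a Grid dataset."""
    return ["dq2-put", "-L", site, "-f", tag_file, "-s", source_dir, dataset]


def pathena_command(job_options, in_ds, out_ds):
    """Argument list for running job options over an input dataset."""
    return ["pathena", job_options, "--inDS", in_ds, "--outDS", out_ds]


def to_shell(argv):
    """Join an argument list into one shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    tag_dataset = "user10.IsabelTrigger.myTAGdataset1"
    print(to_shell(dq2_put_command(
        "TRIUMF-LCG2_SCRATCHDISK", "myNewTagFile",
        "directory_where_I_put_it", tag_dataset)))
    print(to_shell(pathena_command(
        "myJobOptions.py", tag_dataset,
        "user10.IsabelTrigger.testOutputFromTAGrunning")))
```

Printing the strings first and pasting them into the shell keeps a record of exactly what was run.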

Details of my shipInput problem

In principle you don't need the step where you upload the ELSSI TAG file to the Grid. You are supposed to be able to run pathena with an input file that is on your local file system by using the switch "--shipInput" and then putting the input file in your job options as if you were running locally. It does work. But when I try that with a TAG file as input, I get an error on submission complaining that "CollListFileGUID -queryopt Token -src PFN:/my/input/TAG/file RootCollection" gives the error "Column with name 'Token' does NOT exist".

I get the same error if I run that command locally, whether on my ELSSI TAG file or on any standard production TAG file downloaded from the Grid. I can run the command successfully without "-queryopt Token", but that isn't what the job does. Anyhow, the check only seems to be applied to local input files being uploaded, so if you upload the input TAG file to the Grid yourself, you circumvent it and all is well.

Found out how to do it (I think). First, make a file called myInputsFile.txt containing the name of the TAG file from ELSSI:

  echo "myELSSIoutputTAGfile.root" >> myInputsFile.txt

Then submit the job as:

  pathena --shipInput --inputFileList=myInputsFile.txt AnalysisSkeleton_topOptions.py --outDS user10.IsabelTrigger.elssiShipTest

If it asks:

 Enter the number of events per file : 

pick a big number. 10000 worked for me; 100 crashed it with:

 Traceback (most recent call last):
   File "/atlas/ATLASLocalRootBase/x86_64/PandaClient/0.2.12/bin/pathena", line 1468, in <module>
     if totalSize/tmpNSplit > maxTotalSize:
 ZeroDivisionError: integer division or modulo by zero

The remaining problem is that, so far, I can only get it to run the build job; it's not obvious that it is finding the input files properly.
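The two --shipInput steps above (writing the input list, then submitting) can be sketched in Python as follows. This is only an illustration: the helper names write_input_list and ship_input_command are made up, the flags are the ones quoted above, and the script prints the command rather than running pathena.

```python
import os
import shlex
import tempfile


def write_input_list(list_path, tag_files):
    """Write one TAG file name per line, replacing any existing list."""
    with open(list_path, "w") as handle:
        for name in tag_files:
            handle.write(name + "\n")
    return list_path


def ship_input_command(list_path, job_options, out_ds):
    """Argument list for a pathena submission that ships local inputs."""
    return [
        "pathena",
        "--shipInput",
        "--inputFileList=" + list_path,
        job_options,
        "--outDS",
        out_ds,
    ]


if __name__ == "__main__":
    # Work in a scratch directory so nothing in the run area is touched.
    scratch = tempfile.mkdtemp()
    list_path = write_input_list(
        os.path.join(scratch, "myInputsFile.txt"),
        ["myELSSIoutputTAGfile.root"],
    )
    argv = ship_input_command(
        list_path,
        "AnalysisSkeleton_topOptions.py",
        "user10.IsabelTrigger.elssiShipTest",
    )
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Unlike echo with ">>", which appends and so adds a duplicate entry each time it is rerun, the sketch overwrites the list on every call.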

See also here.
