LArMonTools Tips
From ATLAS-TRIUMF
LArMonTools Release Information
Overview of Releases
An overview of the releases can be found here.
All development is currently in the Trunk.
LArMonTools Performance
Memory and CPU time for LArMonTools as a function of tag can be found here.
Memory and CPU time for LArMonTools as a function of tool can be found here.
LArMonTools Coding and Validation Information
This section provides information on tools for validation and debugging. The Software Development Workbook can be found at (https://twiki.cern.ch/twiki/bin/view/Atlas/SoftwareDevelopmentWorkBook). It is strongly recommended that all developers and managers read the ATLAS coding standards (https://twiki.cern.ch/twiki/bin/view/Atlas/CodingStandards).
Information on how to use SVN can be found at (https://twiki.cern.ch/twiki/bin/view/Atlas/SoftwareDevelopmentWorkBookSVN).
Note: make sure your LD_LIBRARY_PATH contains your test_area.
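A quick way to confirm this is to look for an LD_LIBRARY_PATH entry at or below the test area. The sketch below is a standalone Python helper, not part of LArMonTools. It assumes the test area is exported in a TestArea environment variable, which may not match your setup, and the function name is hypothetical.

```python
import os


def path_list_contains(path_list, directory):
    """Return True if any entry of an os.pathsep-separated list is `directory` or lies below it."""
    target = os.path.normpath(directory)
    for entry in path_list.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing ':'
        entry = os.path.normpath(entry)
        if entry == target or entry.startswith(target + os.sep):
            return True
    return False


# Assumption: the test area is exported as $TestArea (adjust to your setup).
test_area = os.environ.get("TestArea", "")
ld_path = os.environ.get("LD_LIBRARY_PATH", "")
if test_area and path_list_contains(ld_path, test_area):
    print("LD_LIBRARY_PATH contains the test area")
else:
    print("test area not found in LD_LIBRARY_PATH (or TestArea is unset)")
```

Comparing whole path components rather than raw substrings avoids a false match when another directory merely shares the same prefix.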
1) The standard validation recipe for releases 15 and 16 is described at (https://twiki.cern.ch/twiki/bin/view/Atlas/RecoRealData). Code should be tested in the appropriate nightly (generally the most recent nightly that compiled for AtlasTier0 release 15.5.X.Y). The required validations typically include one cosmic and one collision simulation transform:
Reco_trf.py AMI=q120
Reco_trf.py AMI=q121
Please double check the standard validation recipe web page (https://twiki.cern.ch/twiki/bin/view/Atlas/RecoRealData) for the most recent procedure.
The code should also be compiled with the options used in the nightly compilation:
gmake -j6 PEDANTIC=1 VERBOSE=1
2) CPU time and memory can be monitored with perfmon. An example is:
RAWtoESD_trf.py inputBSFile=/afs/cern.ch/atlas/project/rig/data/data10_7TeV.00153565.physics_L1CaloEM.merge.RAW._lb0420._0001.1 maxEvents=250 autoConfiguration=everything --athenaopts=--stdcmalloc preExec=rec.doDetailedPerfMon=True,,rec.doNameAuditor=True,,from@PerfMonComps.PerfMonFlags@import@jobproperties@as@pmjp,,pmjp.PerfMonFlags.doPostProcessing=True outputDQMonitorFile=mymon.root outputESDFile=myesd.pool.root outputMuonCalibNtup=mymuoncalib.root outputNTUP_TRKVALIDFile=mytrkvalidntup.root outputTAGComm=mytagcom.pool.root
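In the command above, the preExec value packs several Python statements into one shell word: statements appear to be separated by ",," and spaces inside a statement are written as "@". The following standalone sketch builds and unpacks such a string. The encoding is inferred from the example rather than taken from the transform documentation, and the helper names are hypothetical.

```python
def encode_pre_exec(statements):
    """Pack Python statements into one preExec value ('@' for spaces, ',,' between statements)."""
    return ",,".join(s.strip().replace(" ", "@") for s in statements)


def decode_pre_exec(value):
    """Recover the individual Python statements from a packed preExec value."""
    return [part.replace("@", " ") for part in value.split(",,") if part]


# The four statements carried by the preExec argument in the example command.
statements = [
    "rec.doDetailedPerfMon=True",
    "rec.doNameAuditor=True",
    "from PerfMonComps.PerfMonFlags import jobproperties as pmjp",
    "pmjp.PerfMonFlags.doPostProcessing=True",
]
packed = encode_pre_exec(statements)
print("preExec=" + packed)
```

Decoding a long preExec string this way makes it easier to review which flags a transform command actually sets. A statement that itself contains "@" or ",," would not survive the round trip.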
2) Here is a useful script that runs the BS to ESD step with only the minimum required for LArMonTools. (Note: the ESD to AOD step is not intended to work, and the input BS file name may need to be changed.) The perfmon output can be found in the results_bstoesd directory. Additional job options can be passed to Reco_trf.py by adding them to the preInclude list in the transform arguments.
#!/bin/bash
for i in bstoesd esdtoaod merged; do
    if [ -d "results_${i}/" ]; then
        echo "Remove results_$i before starting."
        exit 1
    fi
done
touch starttime
Reco_trf.py inputBSFile=/scratchdisk2/inugent/daq.ATLAS.0091900.physics.IDCosmic.LB0001.SFO-2._0001.data skipEvents=0 maxEvents=-1 trigStream=IDCosmic beamType=cosmics conditionsTag=COMCOND-ES1C-001-01 geometryVersion=ATLAS-GEO-03-00-00 outputESDFile=myESD.pool.root HIST=myMergedMonitoring.root preExec=rec.abortOnUncheckedStatusCode=False,,rec.doTrigger=False,,rec.doInDet=False,,rec.doMuon=False,,rec.doEgamma=False,,rec.doJetMissingETTag=False,,rec.doMuonCombined=False,,rec.doTau=False,,rec.doMonitoring=True,,rec.doPerfMon=True,,rec.doDetailedPerfMon=True,,DQMonFlags.doGlobalMon=False,,DQMonFlags.doTileMon=False,,DQMonFlags.doCaloMon=False,,DQMonFlags.doMuonCombinedMon=False preInclude=RecExCommon/RecoUsefulFlags.py,RecExCommission/MinimalCommissioningSetup.py,RecJobTransforms/UseOracle.py,RecJobTransforms/debugConfig.py --ignoreunknown --athenaopts='--stdcmalloc'
touch endtime
for i in bstoesd; do
    if [ ! -f ntuple_${i}.pmon.gz -o ntuple_${i}.pmon.gz -ot starttime ]; then
        echo "Job didn't finish like it should (problems in ${i})"
        exit 1
    fi
done
MONFILE_bstoesd=Monitor.root
if [ ! -f $MONFILE_bstoesd -o $MONFILE_bstoesd -ot starttime ]; then
    echo "Did not find bstoesd monitor output file ($MONFILE_bstoesd): "
    exit 1
fi
THISDIR=$PWD
for i in bstoesd; do
    RESULTDIR=$THISDIR/results_${i}
    mkdir $RESULTDIR
    PMONFILE=ntuple_${i}.pmon.gz
    cp $THISDIR/$PMONFILE $RESULTDIR
    cd $RESULTDIR
    PMONSUMMARY=ntuple_$i.perfmon.summary.txt
    perfmon.py $PMONFILE
    # $? must be tested immediately after perfmon.py so it reflects its exit status
    if [ $? != 0 -o ! -f $PMONSUMMARY ]; then
        echo "Problems in perfmon.py $PMONFILE"
        cd $THISDIR
        exit 1
    fi
    /afs/cern.ch/user/s/sschaetz/public/monmanagers.sh $PMONSUMMARY
    if [ ! -f monmanagers.txt -o ! -f monmanagers_CPU.txt ]; then
        echo "Problems in producing monmanagers summary files for $i step."
        cd $THISDIR
        exit 1
    fi
    cd $THISDIR
done
cp $MONFILE_bstoesd $THISDIR/results_bstoesd
touch reallyendtime
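The script brackets the job with marker files (starttime, endtime) and uses the shell test -ot to reject outputs that are missing or older than the start marker, so that stale files from a previous run are not mistaken for new results. The same check can be sketched in Python as follows. The function name is hypothetical, and the snippet is independent of Athena.

```python
import os
import tempfile


def produced_after(output_path, marker_path):
    """Return True if output_path exists and is not older than marker_path."""
    if not os.path.isfile(output_path):
        return False
    return os.path.getmtime(output_path) >= os.path.getmtime(marker_path)


# Small self-contained demonstration using temporary files.
workdir = tempfile.mkdtemp()
marker = os.path.join(workdir, "starttime")
output = os.path.join(workdir, "ntuple_bstoesd.pmon.gz")
for path in (marker, output):
    open(path, "w").close()

os.utime(marker, (1000, 1000))   # marker written at t=1000
os.utime(output, (2000, 2000))   # output written later: accepted
fresh = produced_after(output, marker)

os.utime(output, (500, 500))     # output predates the marker: rejected
stale = produced_after(output, marker)
print(fresh, stale)
```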
3) A job option file for running LArRODMonTool.
from LArROD.LArRODFlags import larRODFlags
larRODFlags.doDSP.set_Value_and_Lock(True) # T
larRODFlags.readRawChannels.set_Value_and_Lock(True) # T
from LArConditionsCommon.LArCondFlags import larCondFlags
larCondFlags.useShape.set_Value_and_Lock(True)
# do single Version ?
larCondFlags.SingleVersion.set_Value_and_Lock(True)
larCondFlags.OFCShapeFolder.set_Value_and_Lock("")
from LArROD.LArRODFlags import larRODFlags
larRODFlags.readDigits.set_Value_and_Lock(True)
4) A modified version of the online job options that does not include any other subdetector.
# -- About data format
# -- I believe it should not be our job to set these flags.
# -- They should be retrieved automatically online. Same for BField.
from AthenaCommon.GlobalFlags import globalflags
globalflags.DataSource.set_Value_and_Lock('data')
globalflags.InputFormat.set_Value_and_Lock("bytestream")
globalflags.ConditionsTag.set_Value_and_Lock('COMCOND-MONC-001-00')
globalflags.DetDescrVersion.set_Value_and_Lock('ATLAS-GEO-03-00-00')
# -- Data type
from AthenaCommon.BeamFlags import jobproperties
jobproperties.Beam.beamType = 'cosmics'
# -- Detector flags
from AthenaCommon.DetFlags import DetFlags
DetFlags.all_setOff() #Switched off to avoid INDET geometry problems
DetFlags.Calo_setOn()
DetFlags.digitize.all_setOff()
# -- Output flags
from RecExConfig.RecFlags import rec
rec.doESD = True
rec.doAOD = False
rec.doCBNT = False
rec.doJiveXML = False
rec.doWriteESD = False
rec.doWriteAOD = False
rec.doWriteTAG = False
rec.readESD = False
rec.readRDO = True
from AthenaCommon.AthenaCommonFlags import athenaCommonFlags
athenaCommonFlags.BSRDOInput = [
'/castor/cern.ch/grid/atlas/DAQ/2009/00127963/physics_L1Calo/data09_calophys.00127963.physics_L1Calo.daq.RAW._lb0000._SFO-1._0001.data'
]
athenaCommonFlags.EvtMax = 500
athenaCommonFlags.SkipEvents = 0
# -- Reco flags
rec.Commissioning = False
rec.doTruth = False
rec.doInDet = False
rec.doMuon = False
rec.doMuonCombined = False # if True, crash in LArMuId from ImpactInCaloAlgInDet
rec.doEgamma = False
rec.doTau = False
rec.doJetMissingETTag = False
rec.doLArg = True
rec.doTile = True
# -- Monitoring flags
rec.doMonitoring = True
from AthenaMonitoring.DQMonFlags import DQMonFlags
DQMonFlags.doGlobalMon = False
DQMonFlags.doEgammaMon = False
DQMonFlags.doMissingEtMon = False
DQMonFlags.doTauMon = False
DQMonFlags.doTileMon = False
DQMonFlags.doJetTagMon = False
# -- Debug flags
rec.doPerfMon = True
rec.doDetailedPerfMon = True
rec.doDumpProperties = True
rec.abortOnUncheckedStatusCode.set_Value_and_Lock(False)
rec.OutputLevel = INFO
################################
# LAR/CALO MONITORING SWITCHES #
################################
if 'CONFIG' not in dir():
    CONFIG = 'LArMon'
print "DEBUG: CONFIG =", CONFIG
if CONFIG=='CaloMon':
    DQMonFlags.doLArMon = False
    DQMonFlags.doCaloMon = True
elif CONFIG=='DSPMon':
    DQMonFlags.doLArMon = True
    DQMonFlags.doCaloMon = False
    rec.doTile = False
    from LArROD.LArRODFlags import larRODFlags
    larRODFlags.doDSP.set_Value_and_Lock(True) # T
    larRODFlags.readRawChannels.set_Value_and_Lock(True) # T
    from LArConditionsCommon.LArCondFlags import larCondFlags
    larCondFlags.useShape.set_Value_and_Lock(True)
    # Turn Clusters reco OFF
    from CaloRec.CaloRecFlags import jobproperties
    jobproperties.CaloRecFlags.doCaloTopoCluster.set_Value_and_Lock(False)
    jobproperties.CaloRecFlags.doCaloCluster.set_Value_and_Lock(False)
    jobproperties.CaloRecFlags.doEmCluster.set_Value_and_Lock(False)
    jobproperties.CaloRecFlags.doCaloEMTopoCluster.set_Value_and_Lock(False)
else: # default = LArMon
    DQMonFlags.doLArMon = True
    DQMonFlags.doCaloMon = False
    rec.doTile = False
    # Turn Clusters reco OFF
    from CaloRec.CaloRecFlags import jobproperties
    jobproperties.CaloRecFlags.doCaloTopoCluster.set_Value_and_Lock(False)
    jobproperties.CaloRecFlags.doCaloCluster.set_Value_and_Lock(False)
    jobproperties.CaloRecFlags.doEmCluster.set_Value_and_Lock(False)
    jobproperties.CaloRecFlags.doCaloEMTopoCluster.set_Value_and_Lock(False)
# -- Dirty fixes - to be understood
rec.doDetStatus = False # Added by Henric as it crashes on access to GLOBAL_OFL
rec.doHist = False # Added by J.L. as it crashes on some TrigHist service
rec.doTrigger = False # Added by J.L. If you turn trigger ON, it's crashing
# -- Some basic reco settings from Walter
include("RecExOnline/SimpleLarCondFlags.py")
# -- To read raw channels from DSP instead of building them from Digits
if CONFIG=='DSPMon': # should be TRUE all the time
    from LArROD.LArRODFlags import larRODFlags
    larRODFlags.readDigits.set_Value_and_Lock(True)
else: # can be switched to True or False
    from LArROD.LArRODFlags import larRODFlags
    # Changed by Fabien for a cosmics run 32 sample
    larRODFlags.readDigits.set_Value_and_Lock(True)
# -- Not sure why we still need this one
LArDigitKey = "FREE"
# -- Main jobOpt
include("RecExCommon/RecExCommon_topOptions.py")
# -- private parameters
if CONFIG=='DSPMon':
    from LArMonTools.LArMonToolsConf import LArRODMonTool
    LArRODMonTool.SkipKnownProblematicChannels = False
    LArRODMonTool.SkipNullPed = False
# -- printout
print "CHECK POINT PRINTING"
globalflags.print_JobProperties()
# -- Over-writes come at the end
MessageSvc = Service("MessageSvc")
MessageSvc.OutputLevel = INFO
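Because the job options only assign CONFIG when it is not already defined, the value can be chosen before the file is executed (for example with the -c option of athena.py), and any value other than 'CaloMon' or 'DSPMon' falls through to the LArMon branch. The standalone sketch below mimics that selection logic outside Athena. The function names and the returned dictionary are illustrative only and simply restate the flag values set above.

```python
DEFAULT_SNIPPET = "if 'CONFIG' not in dir():\n    CONFIG = 'LArMon'\n"


def resolve_config(predefined=None):
    """Run the default-assignment snippet in a fresh namespace and return CONFIG."""
    namespace = {}
    if predefined is not None:
        namespace["CONFIG"] = predefined  # as if set before the job options ran
    exec(DEFAULT_SNIPPET, namespace)
    return namespace["CONFIG"]


def monitoring_switches(config):
    """Restate the switches the job options end up with for a given CONFIG."""
    if config == "CaloMon":
        return {"doLArMon": False, "doCaloMon": True, "doTile": True}
    # 'DSPMon' and every other value (including typos) enable LArMon only.
    return {"doLArMon": True, "doCaloMon": False, "doTile": False}


for choice in (None, "CaloMon", "DSPMon", "caloMon"):
    config = resolve_config(choice)
    print(config, monitoring_switches(config))
```

The last case is the one to watch: a misspelled value such as 'caloMon' is not rejected, it silently selects the default LArMon configuration.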
5) Valgrind is a useful tool for checking memory usage and identifying the source of crashes and memory leaks in your software. Here is an example command to run valgrind, where TopJobOption.py is the top job option that you want to run. More information on valgrind can be found at (https://twiki.cern.ch/twiki/bin/view/Atlas/UsingValgrind).
valgrind --leak-check=yes --trace-children=yes --num-callers=8 --show-reachable=yes `which athena.py` --stdcmalloc TopJobOption.py >! valgrind_fix1.log 2>&1
To get usable symbol information when tracking down memory leaks or crashes in a particular package, first modify the requirements file in the cmt directory of the package you want to examine so that it is built with debug flags and its symbols are not stripped:
private
macro cppdebugflags '$(cppdebugflags_s)'
macro_remove componentshr_linkopts "-Wl,-s"
Then rebuild the package:
gmake clean
gmake binclean
cmt config
source setup.sh
gmake
Now you can run your transform job with valgrind from your working directory.
valgrind --tool=memcheck --trace-children=yes --num-callers=8 --leak-check=yes --show-reachable=yes <your job option commands>
An example is shown below:
valgrind --tool=memcheck --trace-children=yes --num-callers=8 --leak-check=yes --show-reachable=yes /afs/cern.ch/atlas/software/builds/nightlies/15.5.X.Y/AtlasTier0/rel_2/InstallArea/share/bin/Reco_trf.py inputBSFile=/afs/cern.ch/user/g/gencomm/w0/problematicEvents/daq.extract.evt1673836run92048._0001.data,/afs/cern.ch/user/g/gencomm/w0/problematicEvents/daq.extract.evt1756246run91890._0001.data,/afs/cern.ch/user/g/gencomm/w0/problematicEvents/daq.extract.evt1756635run91890._0001.data,/afs/cern.ch/user/g/gencomm/w0/problematicEvents/daq.extract.evt3167858run91890._0001.data skipEvents=0 maxEvents=-1 trigStream=IDCosmic beamType=cosmics conditionsTag=COMCOND-ES1C-001-01 geometryVersion=ATLAS-GEO-03-00-00 outputESDFile=myESD.pool.root HIST=myMergedMonitoring.root preExec=rec.abortOnUncheckedStatusCode=False,,rec.doTrigger=False,,rec.doInDet=False,,rec.doMuon=False,,rec.doEgamma=False,,rec.doJetMissingETTag=False,,rec.doMuonCombined=False,,rec.doTau=False,,rec.doMonitoring=True,,rec.doPerfMon=True,,rec.doDetailedPerfMon=True,,DQMonFlags.doGlobalMon=False,,DQMonFlags.doTileMon=False,,DQMonFlags.doCaloMon=False,,DQMonFlags.doMuonCombinedMon=False preInclude=RecExCommon/RecoUsefulFlags.py,RecExCommission/MinimalCommissioningSetup.py,RecJobTransforms/UseOracle.py,RecJobTransforms/debugConfig.py --ignoreunknown --athenaopts=--stdcmalloc
Note: running Valgrind on Reco_trf.py requires a large amount of memory, so make sure your machine has enough resources.
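With --trace-children=yes the log contains output from several processes, each prefixed by its process ID, so the leak summaries are easy to lose in a long file. A minimal sketch for pulling them out of a log such as valgrind_fix1.log is given here. It assumes the usual memcheck LEAK SUMMARY line format, the function name is hypothetical, and the sample lines are made up for illustration rather than taken from a real job.

```python
import re

LEAK_LINE = re.compile(
    r"^==(?P<pid>\d+)==\s+"
    r"(?P<kind>definitely lost|indirectly lost|possibly lost|still reachable|suppressed):\s+"
    r"(?P<bytes>[\d,]+) bytes in (?P<blocks>[\d,]+) blocks"
)


def leak_summaries(lines):
    """Return {pid: {kind: (bytes, blocks)}} for every LEAK SUMMARY entry found."""
    result = {}
    for line in lines:
        match = LEAK_LINE.match(line)
        if match:
            pid = int(match.group("pid"))
            size = int(match.group("bytes").replace(",", ""))
            blocks = int(match.group("blocks").replace(",", ""))
            result.setdefault(pid, {})[match.group("kind")] = (size, blocks)
    return result


# Synthetic sample in the memcheck summary layout (not real job output).
sample = """\
==4242== LEAK SUMMARY:
==4242==    definitely lost: 1,024 bytes in 3 blocks
==4242==    indirectly lost: 0 bytes in 0 blocks
==4242==      possibly lost: 96 bytes in 6 blocks
==4242==    still reachable: 2,048,576 bytes in 1,200 blocks
""".splitlines()

summary = leak_summaries(sample)
print(summary[4242]["definitely lost"])
```

For a real log, pass the open file instead of the sample, for example leak_summaries(open("valgrind_fix1.log")). If a process prints more than one summary, the last one wins in this sketch.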

