Outline for the art Users Guide
Rob Kutschke
June 2, 2016
Abstract:
This document is the working outline for the Users Guide section of the
art documentation suite. The Users Guide is Part III of the current large
document. The chapter numbers in this document don’t match the chapter
numbers in the current large document. At this writing the outline in the large
document is just a placeholder - this document is the working version of the
outline.
The reader of this document is presumed to be familiar with the material in Parts I
and II of the art documentation suite, the Introduction and the Workbook,
respectively.
Chapter 1
Overview
In this chapter we need to touch on a lot of ideas so that each of the following
chapters can be reasonably standalone. Some of the most elementary material will be
the same as in the Intro - maybe we want to structure the document that way, with
input ?
- What is art?
- Driving the Event loop
- How does art see your experiment’s code: Modules and Services
- The Event Model
- Input and Output
- Scheduled reconstruction
- Reconstruction on Demand
- TFileService
- Event Mixing - loss of provenance.
- Random Number support
- Interface with Geant4
- Schedules
- Concurrency
- Interface with SAM
- Philosophy: Modules may only communicate via the Event. Part of the
mission of art is to document provenance.
Chapter 2
Development and RunTime Environments
- List of external products on which art depends. Short description of each
and references to docs for them.
- Describe UPS - hopefully can ref the new manual. Maybe ref the material
in the Intro.
- How to deal with build systems? Use system in the toyExperiment for
definiteness?
Chapter 3
User Response points in the Event Loop
- This describes the full list of user callable points for both modules and
services.
Chapter 4
Run Time Configuration
- Refer to the appendix for the FHiCL grammar guide.
- Describe art’s view of a FHiCL file; what words, in which scopes, have
special meaning to art.
- Where does FHICL_FILE_PATH go - here or in the appendix?
- For modules and services: relationship between class names, file name of
the .so and the names that appear in the .fcl file.
- Collision detection in the names of .so files for modules and services.
- How to write a specialization of get_if_present<T> for your own class T.
Chapter 5
Modules
- Overview of the 5 types of modules
- What makes a class a module
- Analyzer modules
- Producer modules
- Filter modules
- Input modules
- Output modules
- The MACRO DEFINE_ART_MODULE
- Factory Function
- Registry and interaction with the registry
- Violation of the ODR
Chapter 6
Services
- Writing your own Service
- Legacy Services
- Global Services
- Schedule-Local Services
- Service Interfaces
- Internally art uses services to do some of its work. Some of these have a
default config and you do not need to supply a config in the FHiCL file.
You may optionally provide a config - an example is the scheduler.
- art also supplies some services that are only turned on if a
parameter set is provided in the FHiCL file. Examples: Tracer, Timer,
SimpleMemoryCheck
- When should you get your service handle? c’tor, each member function ...
- Check Anne’s notes from her discussion with Chris for more material.
Chapter 7
Cartoon of art-Flow
- Open FHiCL file
- Create intermediate table
- Apply Command line arguments
- Create Parameter Sets in the Registry
- Instantiate art internal services
- Instantiate user services
- Scan LD_LIBRARY_PATH - find all .so files.
- Open and process any _dict.so and _map.so files
- Parse the paths to figure out the order in which to instantiate modules.
- Instantiate modules
- Drive the event loop.
- The concept of a schedule
- trigger_path: skip to next module, skip to next trigger_path, next event
- Shutdown as safely as possible
Chapter 8
Event Data Model
- Event/Run/SubRun
- EventID
- You must ensure the uniqueness of event IDs.
- This has consequences on how to structure our MC jobs.
- Transient
- Persistent
- Orthogonality of Transient and Persistent
- art::ProductID
- Provenance
- Consistency of input files - needs consistency of provenance as well as shape
consistency
- Merging two input files into one
Chapter 9
Data Products
- Focus on transient rep.
- Refer to the chapter on ROOT IO for properties of persistent data
products.
- Facade Pattern
Chapter 10
InterProduct References
- art::Ptr
- art::PtrVector
- art::Assns
- Discussion of when to prefer Assns or Ptr
Chapter 11
RootInput and RootOutput
- SelectEvents based on trigger path - is this in the RootOutput or in one
of the base classes?
- Drop on input and drop on output
- fast cloning
- art::Wrapper
- Schema: classes.h and classes_def.xml
- Automagic schema evolution
- User written streamers
- Include the guide to creating data products from the art wiki page.
- Structure of an art root file
- Size on disk of a data product.
- Tips on how to read one of these files using native ROOT (you lose
inter-product references but all else should be OK).
- genreflex info should not be duplicated - structure your libraries
accordingly.
- art supplies the genreflex info for some basic data types. These are used
for testing.
Chapter 12
Exception Handling
- Orthogonality of throw and response.
- Response is run time configurable (at the category level?? at other levels?)
- Strong recommendation not to use exceptions for normal flow control. Use
it to signal true errors. Except if an underlying package uses exceptions
for flow control and you are forced to when using that package. Normally
let art catch exceptions.
- fpe
- Do use exception specifications and why.
- cet::exception and art::exception
- Defining exception (and log?) categories for your own experiment.
- assert, cassert vs throw. Debugging stuff should compile to nothing with
production code.
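The split between assert and throw in the last bullet could be illustrated along these lines. This is only a minimal sketch using the standard library; the function names meanEnergy and sumOf are hypothetical, and real art code would more likely throw one of the exception types listed above rather than std::invalid_argument.

```cpp
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical internal helper. Its callers are required to pass a non-empty
// vector; violating that is a programming bug, so it is checked with assert.
// The check compiles to nothing when NDEBUG is defined (production builds).
double sumOf(std::vector<double> const& values) {
  assert(!values.empty() && "caller must check for empty input");
  return std::accumulate(values.begin(), values.end(), 0.0);
}

// Hypothetical user-facing function. Empty input can happen with real data,
// so it is a true error and is signalled with an exception, not an assert.
double meanEnergy(std::vector<double> const& energies) {
  if (energies.empty()) {
    throw std::invalid_argument("meanEnergy: no energies supplied");
  }
  return sumOf(energies) / static_cast<double>(energies.size());
}
```

Calling meanEnergy with an empty vector throws in every build, whereas the assert inside sumOf only fires in debug builds.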
Chapter 13
Message Logger
- Crib CMS docs except for run time config; this may be a good stand-alone
project for Anne?
- Run time config
- is there an equivalent of art::exception? I don’t think so.
Chapter 14
Art Supplied Services
- TFileService - refer to its own full chapter.
- Tracer
- Timing Service
- Simple Memory Check
- RandomNumber Service - refer to its own full chapter.
- Floating point control ?
Chapter 15
Standards and Practices
- Describe standards and practices used within art; this includes use of
C++, use of the tool chain and interactions of pieces of art with other
pieces of art.
- We probably want a “short version” and a “long version” and ask that
everyone be familiar with the short version.
- Suggest standards and practices that might be adopted by the
experiments.
- “Coding standards” - types start upcase etc
- Exception Safety
- Thread Safety
- Looking ahead towards concurrency
- If it’s bigger than a pointer, return by appropriate (safe) pointer type
- What is the appropriate (safe) pointer type? Sometimes it’s a const &.
- About inlining.
- it is always just a hint, never a firm directive
- automatic if definition is together with declaration in the class
declaration
- if defined outside the class declaration, must be preceded by the inline keyword.
- always inline trivial accessors
- never inline anything static
- Guidelines for how to think about other cases.
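The bullets above on return types and inlining might be illustrated with a small header-style sketch. The class Track and its members are hypothetical and exist only for this example; the sketch shows one reading of the guidelines, not an agreed art standard.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class, for illustration only.
class Track {
public:
  Track(std::string label, std::vector<double> hits)
    : label_(std::move(label)), hits_(std::move(hits)) {}

  // Defined inside the class declaration: implicitly inline.
  // The object is bigger than a pointer, so return a const reference.
  std::string const& label() const { return label_; }

  // Declared here, defined below, outside the class declaration.
  std::vector<double> const& hits() const;

  // Small built-in type: return by value; trivial accessor defined in class.
  std::size_t nHits() const { return hits_.size(); }

private:
  std::string label_;
  std::vector<double> hits_;
};

// An out-of-class definition in a header needs the inline keyword; without
// it, including the header in two source files breaks the one-definition rule.
inline std::vector<double> const& Track::hits() const { return hits_; }
```

Because label() and hits() return references to members, the caller must not keep them beyond the lifetime of the Track object.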
Chapter 16
Profiling your Code
Chapter 17
Debugging and Troubleshooting
Appendix A
Rules for names
- Reserved identifiers in art .fcl files
- Must not contain an underscore: process name, module label, instance
name
Appendix B
Code Guards
Describe what they are and why to use them.
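A possible starting point, assuming that "code guards" here means the usual include guards in header files. The header name Point.h, the macro DEMO_POINT_H and the struct Point are hypothetical. To keep the sketch in a single file, the header body is written out twice to imitate the same header being included twice.

```cpp
// Contents of a hypothetical header, Point.h, as seen on first inclusion.
#ifndef DEMO_POINT_H
#define DEMO_POINT_H

struct Point {
  double x;
  double y;
};

#endif // DEMO_POINT_H

// The same header seen a second time, for example because two other headers
// both include Point.h. DEMO_POINT_H is already defined, so the preprocessor
// skips everything up to the #endif and Point is not defined twice.
#ifndef DEMO_POINT_H
#define DEMO_POINT_H

struct Point {
  double x;
  double y;
};

#endif // DEMO_POINT_H
```

Without the guard, the second copy of the struct would be a redefinition and the file would not compile.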
Appendix C
CLHEP
- You are likely to encounter Vector, Random Engine, SystemOfUnits
- CMS reports that we should prefer ROOT TMatrix to CLHEP Matrix
classes.
- For Vector, headers are the documentation.
- There is a writeup for Random
- .icc convention - no longer recommended but you need to know it.
- Some things are nasty: [] uses 0-based indexing but () uses
1-based indexing!
- Hep3Vector and HepLorentzVector do not distinguish between a position,
a displacement, a momentum and a velocity.
- Proper use of SystemOfUnits - repeated multiplying is wrong!
- double a = 123. * CLHEP::mm
says that the numeric constant 123 is in mm and asks the code to
convert it to the internal unit of length. It does NOT assert that a is
in mm.
- double b = a / CLHEP::mm
If a is in CLHEP internal length units, then b will be in mm.
- Never do: using CLHEP or using CLHEP::m; This is just asking for
trouble.
- Geant4 internally uses Hep3Vector, SystemOfUnits and HepRandom.
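The multiply-on-input, divide-on-output rule for SystemOfUnits can be sketched as below. So that the sketch compiles without CLHEP, it uses a stand-in namespace demo_units (hypothetical) that mirrors the CLHEP convention that the millimetre is the internal unit of length; the function names are also hypothetical.

```cpp
// Stand-in for CLHEP's SystemOfUnits, for illustration only.
// As in CLHEP, the internal unit of length is the millimetre.
namespace demo_units {
  constexpr double mm = 1.0;
  constexpr double cm = 10.0 * mm;
  constexpr double m  = 1000.0 * mm;
}

// Multiply when a number enters the program: "this value is in metres,
// convert it to internal units".
constexpr double lengthFromMetres(double valueInMetres) {
  return valueInMetres * demo_units::m;
}

// Divide when a number leaves the program: "express this internal length
// in centimetres".
constexpr double lengthInCentimetres(double internalLength) {
  return internalLength / demo_units::cm;
}

// The mistake warned about above: multiplying a value that is already in
// internal units. The result is not a length in centimetres.
constexpr double wrongLengthInCentimetres(double internalLength) {
  return internalLength * demo_units::cm;
}
```

With these definitions a length of 2.5 m is stored as 2500 internal units and printed as 250 cm, while the repeated multiplication gives 25000, which is wrong.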
Appendix D
C++ Ideas
We anticipate that there will be many places in the code that want to refer to certain
C++ ideas. For example we will want discussions about inheritance, templates,
exceptions, factory methods and other things. These are all examples of things that
most people only need to know a little about and the standard texts have much more
material than is needed. For each of these we need a superficial general discussion
followed by an art-specific detailed discussion. We can put those pieces here and refer
to them as needed.
Some of these may also be mentioned in Standards and Practices. I envisage the
technical discussion here so that the Standards and Practices discussion can presume
that people know the technical bits already.
- Inheritance
- Templates
- Exceptions
- We need a section on when const in different positions means the same
thing and when const in different positions means different things. See
table.
Declaration         | Meaning
--------------------+--------------------------------------------------
T t;                | t is an object of type T
const T t;          | t is an object of type const T
T const t;          | same meaning as the previous line
const T& t;         | t is a reference to an object of type const T
T const& t;         | same meaning as the previous line
T * t;              | t is a pointer to an object of type T
T const * t;        | t is a pointer to an object of type const T
const T * t;        | same meaning as the previous line
T * const t;        | t is a const pointer to an object of type T
T const * const t;  | t is a const pointer to an object of type const T
const T * const t;  | same meaning as the previous line
Table D.1: Meaning of const in the declaration of an object.
The only real weirdness is that “const T” and “T const” mean exactly the same
thing. When pointers are involved there are two notions of const-ness, the
const-ness of the pointer and the const-ness of the pointee; all permutations are
possible.
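These statements can be checked by the compiler. The following is a small sketch using only the standard library; the struct T and the function constDemo are placeholders invented for the example.

```cpp
#include <type_traits>

struct T {};  // placeholder type for the example

// "const T" and "T const" name exactly the same type, also behind & and *.
static_assert(std::is_same<const T, T const>::value, "same type");
static_assert(std::is_same<const T&, T const&>::value, "same type");
static_assert(std::is_same<const T*, T const*>::value, "same type");
static_assert(std::is_same<const T* const, T const* const>::value, "same type");

// A const after the * applies to the pointer, not the pointee: different type.
static_assert(!std::is_same<T const*, T* const>::value, "different types");

// What each kind of const-ness permits.
inline int constDemo() {
  int a = 1;
  int b = 2;

  int const* p = &a;  // pointer to const int
  p = &b;             // OK: the pointer itself may be reseated
  // *p = 3;          // would not compile: the pointee is const

  int* const q = &a;  // const pointer to int
  *q = 3;             // OK: the pointee may be modified
  // q = &b;          // would not compile: the pointer is const

  return *p + *q;     // b + a
}
```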
Appendix E
CETLIB
- Lots of stuff. Point to docs. The existing docs are a table of one-line
descriptions plus what you find in the header code. Point out the string
manips, like pad, trim and split.
- Longer docs
- maybe_ref<T> a simple handle with reference syntax
- map_vector<T> should have been called sparse_vector<T>. Why not
just use std::map<integral_type,T>?
- no T const& operator[](integral_type) const;
- read performance in ROOT IO.
- pow<T>
- exempt_ptr<T> Always a shallow copy.
- value_ptr<T> Always a deep copy
- cet::exception
Appendix F
FHICL - grammar
Appendix G
Enum-Matched-To-String
Appendix H
Finite Precision Arithmetic and Floating Point Arithmetic
- Dynamic range of int, long, double, float
- Concept of machine epsilon for floating point types
- Danger of using == for comparison of floating point types.
- Danger of underflow: if ( e > p ) m = sqrt(e*e - p*p). Instead:
arg = (e-p)*(e+p); m = (arg > 0) ? sqrt(arg) : 0.; (a safesqrt). Much less
likely to underflow. Or even just: m = sqrt((e-p)*(e+p)).
- cet::sum_of_squares and cet::diff_of_squares.
- NaN, signaling and non-signaling.
- Care in using a floating point type as a loop index.
- Rounding from float to int: ceil, floor.
- Don’t do pow(x,n) when you really mean x*x. We have sqr(x) and
pow<n>(x). Does it make sense to ask cetlib to provide both an inline
powi<n>(x) and a non-inline version?
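Two of the points in this list, comparing floating point values and the factored form of the difference of squares, might be illustrated as follows. This is a sketch under stated assumptions: nearlyEqual and safeMass are hypothetical helpers written for this outline, not cetlib functions, and the default tolerance of four machine epsilons is an arbitrary choice.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper: compare two doubles with a relative tolerance instead
// of ==. The tolerance is expressed in units of machine epsilon.
inline bool nearlyEqual(double a, double b,
                        double relTol = 4 * std::numeric_limits<double>::epsilon()) {
  double const scale = std::max(std::fabs(a), std::fabs(b));
  return std::fabs(a - b) <= relTol * scale;
}

// Hypothetical helper: mass from energy and momentum using the factored form
// (e-p)*(e+p) rather than e*e - p*p, with a guard so that sqrt never sees a
// negative argument.
inline double safeMass(double e, double p) {
  double const arg = (e - p) * (e + p);
  return (arg > 0.) ? std::sqrt(arg) : 0.;
}
```

For example, 0.1 + 0.2 == 0.3 is false in IEEE double precision, but nearlyEqual(0.1 + 0.2, 0.3) is true; safeMass(5., 3.) gives 4 and safeMass(3., 5.) gives 0 instead of a NaN.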
Appendix I
Anonymous Namespaces
What is an anonymous namespace, how does it work and when should I use one?
When should I put a function in an anonymous namespace and when should I put
it in a utility library?
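A minimal sketch that the eventual text could build on. The function names bracket and formatLabel are hypothetical; the point is only that a helper used by a single source file can live in an anonymous namespace, which gives it internal linkage.

```cpp
#include <string>

// Helper used only in this source file. Inside an anonymous namespace it has
// internal linkage: other translation units cannot see it, so it cannot
// collide with a function of the same name defined elsewhere.
namespace {
  std::string bracket(std::string const& s) {
    return "[" + s + "]";
  }
}

// Hypothetical function with external linkage; in real code it would be
// declared in a header and called from other files.
std::string formatLabel(std::string const& label) {
  return bracket(label);
}
```

If bracket turned out to be useful in several source files, that would be the signal to move it into a utility library with a proper header.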
Appendix J
Dynamic Libraries
Discuss what a dynamic library file is. How does it differ from a static
library? How does one load them? Maybe point to my toy example?
Appendix K
Checklist
This is a brainstorming list of topics for which I have not yet found a home.
- Suppose that you write events to an art root file out of order. When you
read it back, they will be in order! This is because the file contains a table
that contains event ids with pointers to the data for that event; the table
is sorted by event id.
- If you are reading multiple input files you may encounter the same
Run/SubRun multiple times. By default art holds its Run and SubRun
objects in memory so that it is safe to encounter the same Run/SubRun
multiple times. In a sparse skim this can be a memory pig. There is a
switch to turn off this caching. What is its name?
- The if-combo-container has chapters on “Obtaining Credentials to Access
Fermilab Computing Resources” and on “git”. These don’t really belong
in the User’s Guide but we should find a home for them.
- Decide on “module type” vs “module_type” vs “module class name” and
use it everywhere.
- In the existing docs: scrub “module name” - it should be “module label”
or “module type”
- Where can we put
http://mu2e.fnal.gov/atwork/computing/FrameworkIntro.shtml
which has a list of C++ stuff you need to know - maybe not on day one
but within the first few weeks.
- What is a linkage loop. How do I diagnose it? How do I fix it? Chris
Green’s tool.
- Is there a syntax to tell art to run on just one fully qualified EventID? On
a list of them? On a range? On a list of ranges? These were in cmsRun.
- Document the new TimeTracker and MemoryTracker Services and how to
access the new databases. Provide code samples to do simple things and
a link to SQL documentation.