Intensity Frontier
Common Offline Documentation:
art Workbook and Users Guide
Alpha Release 0.90
June 2, 2016
This version of the documentation is written for version August2015 of the art-workbook code.
Scientific Computing Division
Future Programs and Experiments Department
Scientific Software Infrastructure Group
Principal Author: Rob Kutschke
Other contributors: Marc Paterno, Mike Wang
Editor: Anne Heavey
art Developers: L. Garren, C. Green, K. Knoepfel,
J. Kowalkowski, M. Paterno and P. Russo
Detailed Table of Contents
List of Figures
List of Tables
List of Code and Output Listings
I Introduction
1 How to Read this Documentation
1.1 If you are new to HEP Software...
1.2 If you are an HEP Software expert...
1.3 If you are somewhere in between...
2 Conventions Used in this Documentation
2.1 Terms in Glossary
2.2 Typing Commands
2.3 Listing Styles
2.4 Procedures to Follow
2.5 Important Items to Call Out
2.6 Site-specific Information
3 Introduction to the art Event Processing Framework
3.1 What is art and Who Uses it?
3.2 Why art?
3.3 C++ and C++11
3.4 Getting Help
3.5 Overview of the Documentation Suite
3.5.1 The Introduction
3.5.2 The Workbook
3.5.3 Users Guide
3.5.4 Reference Manual
3.5.5 Technical Reference
3.5.6 Glossary
3.6 Some Background Material
3.6.1 Events and Event IDs
3.6.2 art Modules and the Event Loop
3.6.3 Module Types
3.6.4 art Data Products
3.6.5 art Services
3.6.6 Dynamic Libraries and art
3.6.7 Build Systems and art
3.6.8 External Products
3.6.9 The Event-Data Model and Persistency
3.6.10 Event-Data Files
3.6.11 Files on Tape
3.7 The Toy Experiment
3.7.1 Toy Detector Description
3.7.2 Workflow for Running the Toy Experiment Code
3.8 Rules, Best Practices, Conventions and Style
4 Unix Prerequisites
4.1 Introduction
4.2 Commands
4.3 Shells
4.4 Scripts: Part 1
4.5 Unix Environments
4.5.1 Building up the Environment
4.5.2 Examining and Using Environment Variables
4.6 Paths and $PATH
4.7 Scripts: Part 2
4.8 bash Functions and Aliases
4.9 Login Scripts
4.10 Suggested Unix and bash References
5 Site-Specific Setup Procedure
6 Get your C++ up to Speed
6.1 Introduction
6.2 File Types Used and Generated in C++ Programming
6.3 Establishing the Environment
6.3.1 Initial Setup
6.3.2 Subsequent Logins
6.4 C++ Exercise 1: Basic C++ Syntax and Building an Executable
6.4.1 Concepts to Understand
6.4.2 How to Compile, Link and Run
6.4.3 Discussion
6.4.3.1 Primitive types, Initialization and Printing Output
6.4.3.2 Arrays
6.4.3.3 Equality testing
6.4.3.4 Conditionals
6.4.3.5 Some C++ Standard Library Types
6.4.3.6 Pointers
6.4.3.7 References
6.4.3.8 Loops
6.5 C++ Exercise 2: About Compiling and Linking
6.5.1 What You Will Learn
6.5.2 The Source Code for this Exercise
6.5.3 Compile, Link and Run the Exercise
6.5.4 Alternate Script build2
6.5.5 Suggested Homework
6.6 C++ Exercise 3: Libraries
6.6.1 What You Will Learn
6.6.2 Building and Running the Exercise
6.7 Classes
6.7.1 Introduction
6.7.2 C++ Exercise 4 v1: The Most Basic Version
6.7.3 C++ Exercise 4 v2: The Default Constructor
6.7.4 C++ Exercise 4 v3: Constructors with Arguments
6.7.5 C++ Exercise 4 v4: Colon Initializer Syntax
6.7.6 C++ Exercise 4 v5: Member functions
6.7.7 C++ Exercise 4 v6: Private Data and Accessor Methods
6.7.7.1 Setters and Getters
6.7.7.2 What’s the deal with the underscore?
6.7.7.3 An example to motivate private data
6.7.8 C++ Exercise 4 v7: The inline Specifier
6.7.9 C++ Exercise 4 v8: Defining Member Functions within the Class Declaration
6.7.10 C++ Exercise 4 v9: The Stream Insertion Operator and Free Functions
6.7.11 Review
6.8 Overloading functions
6.9 C++ References
7 Using External Products in UPS
7.1 The UPS Database List: PRODUCTS
7.2 UPS Handling of Variants of a Product
7.3 The setup Command: Syntax and Function
7.4 Current Versions of Products
7.5 Environment Variables Defined by UPS
7.6 Finding Header Files
7.6.1 Introduction
7.6.2 Finding art Header Files
7.6.3 Finding Headers from Other UPS Products
7.6.4 Exceptions: The Workbook, ROOT and Geant4
II Workbook
8 Preparation for Running the Workbook Exercises
8.1 Introduction
8.2 Getting Computer Accounts on Workbook-enabled Machines
8.3 Choosing a Machine and Logging In
8.4 Launching new Windows: Verify X Connectivity
8.5 Choose an Editor
9 Exercise 1: Running Pre-built art Modules
9.1 Introduction
9.2 Prerequisites
9.3 What You Will Learn
9.4 The art Run-time Environment
9.5 The Input and Configuration Files for the Workbook Exercises
9.6 Setting up to Run Exercise 1
9.6.1 Log In and Set Up
9.6.1.1 Initial Setup Procedure using Standard Directory
9.6.1.2 Initial Setup Procedure allowing Self-managed Working Directory
9.6.1.3 Setup for Subsequent Exercise 1 Login Sessions
9.7 Execute art and Examine Output
9.8 Understanding the Configuration
9.8.1 Some Bookkeeping Syntax
9.8.2 Some Physics Processing Syntax
9.8.3 art Command line Options
9.8.4 Maximum Number of Events to Process
9.8.5 Changing the Input Files
9.8.6 Skipping Events
9.8.7 Identifying the User Code to Execute
9.8.8 Paths and the art Workflow
9.8.8.1 Paths and the art Workflow: Details
9.8.8.2 Order of Module Execution
9.8.9 Writing an Output File
9.9 Understanding the Process for Exercise 1
9.9.1 Follow the Site-Specific Setup Procedure (Details)
9.9.2 Make a Working Directory (Details)
9.9.3 Setup the toyExperiment UPS Product (Details)
9.9.4 Copy Files to your Current Working Directory (Details)
9.9.5 Source makeLinks.sh (Details)
9.9.6 Run art (Details)
9.10 How does art find Modules?
9.11 How does art find FHiCL Files?
9.11.1 The -c command line argument
9.11.2 #include Files
9.12 Review
9.13 Test your Understanding
9.13.1 Tests
9.13.2 Answers
10 Exercise 2: Building and Running Your First Module
10.1 Introduction
10.2 Prerequisites
10.3 What You Will Learn
10.4 Initial Setup to Run Exercises
10.4.1 “Source Window” Setup
10.4.2 Examine Source Window Setup
10.4.2.1 About git and What it Did
10.4.2.2 Contents of the Source Directory
10.4.3 “Build Window” Setup
10.4.3.1 Standard Procedure
10.4.3.2 Using Self-managed Working Directory
10.4.4 Examine Build Window Setup
10.5 The art Development Environment
10.6 Running the Exercise
10.6.1 Run art on first.fcl
10.6.2 The FHiCL File first.fcl
10.6.3 The Source Code File First_module.cc
10.6.3.1 The #include Statements
10.6.3.2 The Declaration of the Class First, an Analyzer Module
10.6.3.3 An Introduction to Analyzer Modules
10.6.3.4 The Constructor for the Class First
10.6.3.5 Aside: Omitting Argument Names in Function Declarations
10.6.3.6 The Member Function analyze and the Representation of an Event
10.6.3.7 Representing an Event Identifier with art::EventID
10.6.3.8 DEFINE_ART_MACRO: The Module Maker Macros
10.6.3.9 Some Alternate Styles
10.7 What does the Build System Do?
10.7.1 The Basic Operation
10.7.2 Incremental Builds and Complete Rebuilds
10.7.3 Finding Header Files at Compile Time
10.7.4 Finding Dynamic Library Files at Link Time
10.7.5 Build System Details
10.8 Suggested Activities
10.8.1 Create Your Second Module
10.8.2 Use artmod to Create Your Third Module
10.8.3 Running Many Modules at Once
10.8.4 Access Parts of the EventID
10.9 Final Remarks
10.9.1 Why is there no First_module.h File?
10.9.2 The Three-File Module Style
10.10 Flow of Execution from Source to FHiCL File
10.11 Review
10.12 Test Your Understanding
10.12.1 Tests
10.12.2 Answers
10.12.2.1 FirstBug01
10.12.2.2 FirstBug02
11 General Setup for Login Sessions
11.1 Source Window
11.2 Build Window
12 Keeping Up to Date with Workbook Code and Documentation
12.1 Introduction
12.2 Special Instructions for Summer 2014
12.3 How to Update
12.3.1 Get Updated Documentation
12.3.2 Get Updated Code and Build It
12.3.3 See which Files you have Modified or Added
13 Exercise 3: Some other Member Functions of Modules
13.1 Introduction
13.2 Prerequisites
13.3 What You Will Learn
13.4 Setting up to Run this Exercise
13.5 The Source File Optional_module.cc
13.5.1 About the begin* Member Functions
13.5.2 About the art::*ID Classes
13.5.3 Use of the override Identifier
13.5.4 Use of const References
13.5.5 The analyze Member Function
13.6 Running this Exercise
13.7 The Member Function beginJob versus the Constructor
13.8 Suggested Activities
13.8.1 Add the Matching end Member functions
13.8.2 Run on Multiple Input Files
13.8.3 The Option --trace
13.9 Review
13.10 Test Your Understanding
13.10.1 Tests
13.10.2 Answers
14 Exercise 4: A First Look at Parameter Sets
14.1 Introduction
14.2 Prerequisites
14.3 What You Will Learn
14.4 Setting up to Run this Exercise
14.5 The Configuration File pset01.fcl
14.6 The Source code file PSet01_module.cc
14.7 Running the Exercise
14.8 Member Function Templates and their Arguments
14.8.1 Types Known to ParameterSet::get<T>
14.8.2 User-Defined Types
14.9 Exceptions (as in “Errors”)
14.9.1 Error Conditions
14.9.2 Error Handling
14.9.3 Suggested Exercises
14.10 Parameters and Data Members
14.11 Optional Parameters with Default Values
14.11.1 Policies About Optional Parameters
14.12 Numerical Types: Precision and Canonical Forms
14.12.1 Why Have Canonical Forms?
14.12.2 Suggested Exercises
14.12.2.1 Formats
14.12.2.2 Fractional versus Integral Types
14.13 Dealing with Invalid Parameter Values
14.14 Review
14.15 Test Your Understanding
14.15.1 Tests
14.15.2 Answers
15 Exercise 5: Making Multiple Instances of a Module
15.1 Introduction
15.2 Prerequisites
15.3 What You Will Learn
15.4 Setting up to Run this Exercise
15.5 The Source File Magic_module.cc
15.6 The FHiCL File magic.fcl
15.7 Running the Exercise
15.8 Discussion
15.8.1 Order of Analyzer Modules is not Important
15.8.2 Two Meanings of Module Label
15.9 Review
15.10 Test Your Understanding
15.10.1 Tests
15.10.2 Answers
16 Exercise 6: Accessing Data Products
16.1 Introduction
16.2 Prerequisites
16.3 What You Will Learn
16.4 Background Information for this Exercise
16.4.1 The Data Type GenParticleCollection
16.4.2 Data Product Names
16.4.3 Specifying a Data Product
16.4.4 The Data Product used in this Exercise
16.5 Setting up to Run this Exercise
16.6 Running the Exercise
16.7 Understanding the First Version, ReadGens1
16.7.1 The Source File ReadGens1_module.cc
16.7.2 Adding a Link Library to CMakeLists.txt
16.7.3 The FHiCL File readGens1.fcl
16.8 The Second Version, ReadGens2
16.9 The Third Version, ReadGens3
16.10 Suggested Activities
16.11 Review
16.12 Test Your Understanding
16.12.1 Tests
16.12.2 Answers
17 Exercise 7: Making a Histogram
17.1 Introduction
17.2 Prerequisites
17.3 What You Will Learn
17.4 Setting up to Run this Exercise
17.5 The Source File FirstHist1_module.cc
17.5.1 Introducing art::ServiceHandle
17.5.2 Creating a Histogram
17.5.3 Filling a Histogram
17.5.4 A Few Last Comments
17.6 The Configuration File firstHist1.fcl
17.7 The file CMakeLists.txt
17.8 Running the Exercise
17.9 Inspecting the Histogram File
17.9.1 A Short Cut: the browse command
17.9.2 Using CINT Scripts
17.10 Finding ROOT Documentation
17.10.1 Overwriting Histogram Files
17.10.2 Changing the Name of the Histogram File
17.10.3 Changing the Module Label
17.10.4 Printing From the TBrowser
17.11 Review
17.12 Test Your Understanding
17.12.1 Tests
17.12.2 Answers
18 Exercise 8: Looping Over Collections
18.1 Introduction
18.2 Prerequisites
18.3 What You Will Learn
18.4 Setting Up to Run Exercise
18.5 The Class GenParticle
18.5.1 The Included Header Files
18.5.2 Particle Parent-Child Relationships
18.5.3 The Public Interface for the Class GenParticle
18.5.4 Conditionally Excluded Sections of Header File
18.6 The Module LoopGens1
18.7 CMakeLists.txt
18.8 Running the Exercise
18.9 Variations on the Exercise
18.9.1 LoopGens2_module.cc
18.9.2 LoopGens3_module.cc
18.9.3 LoopGens3a_module.cc
18.10 Review
18.11 Test Your Understanding
18.11.1 Test 1
18.11.2 Test 2
18.11.3 Test 3
18.11.4 Answers
18.11.4.1 Test 1
18.11.4.2 Test 2
18.11.4.3 Test 3
19 3D Event Displays
19.1 Introduction
19.2 Prerequisites
19.3 What You Will Learn
19.4 Setting up to Run this Exercise
19.5 Running the Exercise
19.5.1 Startup and General Layout
19.5.2 The Control Panel
19.5.2.1 The List-Tree Widget and Context-Sensitive Menus
19.5.2.2 The Event-Navigation Pane
19.5.3 Main EVE Display Area
19.6 Understanding How the 3D Event Display Module Works
19.6.1 Overview of the Source Code File EventDisplay3D_module.cc
19.6.2 Class Declaration and Constructor
19.6.3 Creating the GUI and Drawing the Static Detector Components in the beginJob() Member Function
19.6.3.1 The Default GUI
19.6.3.2 Adding the Global Elements
19.6.3.3 Customizing the GUI
19.6.3.4 Adding the Navigation Pane
19.6.4 Drawing the Generated Hits and Tracks in the analyze() Member Function
20 Troubleshooting
20.1 Updating Workbook Code
20.2 XWindows (xterm and Other XWindows Products)
20.2.1 Mac OSX 10.9
20.3 Trouble Building
20.4 art Won’t Run
III User’s Guide
21 git
21.1 Aside: More Details about git
21.1.1 Central Repository, Local Repository and Working Directory
21.1.1.1 Files that you have Added
21.1.1.2 Files that you have Modified
21.1.1.3 Files with Resolvable Conflicts
21.1.1.4 Files with Unresolvable Conflicts
21.1.2 git Branches
21.1.3 Seeing which Files you have Modified or Added
22 art Run-time and Development Environments
22.1 The art Run-time Environment
22.2 The art Development Environment
23 art Framework Parameters
23.1 Parameter Types
23.2 Structure of art Configuration Files
23.3 Services
23.3.1 System Services
23.3.2 FloatingPointControl
23.3.3 Message Parameters
23.3.4 Optional Services
23.3.5 Sources
23.3.6 Modules
24 Job Configuration in art: FHiCL
24.1 Basics of FHiCL Syntax
24.1.1 Specifying Names and Values
24.1.2 FHiCL-reserved Characters and Identifiers
24.2 FHiCL Identifiers Reserved to art
24.3 Structure of a FHiCL Run-time Configuration File for art
24.4 Order of Elements in a FHiCL Run-time Configuration File for art
24.5 The physics Portion of the FHiCL Configuration
24.6 Choosing and Using Module Labels and Path Names
24.7 Scheduling Strategy in art
24.8 Scheduled Reconstruction using Trigger Paths
24.9 Reconstruction On-Demand
24.10 Bits and Pieces
IV Appendices
A Obtaining Credentials to Access Fermilab Computing Resources
A.1 Kerberos Authentication
A.2 Fermilab Services Account
B Installing Locally
B.1 Install the Binary Distributions: A Cheat Sheet
B.2 Preparing the Site Specific Setup Script
B.3 Links to the Full Instructions
C art Completion Codes
D Viewing and Printing Figure Files
D.1 Viewing Figure Files Interactively
D.2 Printing Figure Files
E CLHEP
E.1 Introduction
E.2 Multiple Meanings of Vector in CLHEP
E.3 CLHEP Documentation
E.4 CLHEP Header Files
E.4.1 Naming Conventions and Syntax
E.4.2 .icc Files
E.5 The CLHEP Namespace
E.5.1 using Declarations and Directives
E.6 The Vector Package
E.6.1 CLHEP::Hep3Vector
E.6.1.1 Some Fragile Member Functions
E.6.2 CLHEP::HepLorentzVector
E.6.2.1 HepBoost
E.7 The Matrix Package
E.8 The Random Package
F Include Guards
V Index
Index
The art document suite, which is currently in alpha release form, consists of an introductory section and the first few exercises of the Workbook, plus a glossary and an index. There are also some preliminary (incomplete and unreviewed) portions of the Users Guide included in the compilation.
The Workbook exercises require you to download some code to edit, execute and evaluate. Both
the documentation and the code it references are expected to undergo continual development
throughout 2013 and 2014. The latest is always available at the art Documentation website.
Chapter 12 tells you how to keep up-to-date with improvements and additions to the Workbook
code and documentation.
1.1 If you are new to HEP Software...
Read Parts I and II (the introductory material and the Workbook) from start to finish. The Workbook is aimed at an audience that is familiar with (although not necessarily expert in) Unix, C++ and Fermilab’s UPS product management system, and that understands the basic art framework concepts. The introductory chapters prepare the “just starting out” reader in all these areas.
1.2 If you are an HEP Software expert...
Read chapters 1, 2 and 3: this is where key terms and concepts used throughout the art
document suite get defined. Skip the rest of the introductory material and jump straight into
running Exercise 1 in Chapter 9 of the Workbook. Take the approach of: Don’t need it? Don’t
read it.
1.3 If you are somewhere in between...
Read chapters 1, 2 and 3 and skim the remaining introductory material in Part I to glean what
you need. Along with the experts, you can take the approach of: Don’t need it? Don’t read
it.
Most of the material in this introduction and in the Workbook is written so that it can be understood by those new to HEP computing; if it is not, please let us know (see Section 3.4)!
The first instance of each term that is defined in the glossary is written in italics followed
by a γ (Greek letter gamma), e.g., framework(γ).
Unix commands that you must type are shown in the format unix command. Portions of the command for which you must substitute values are shown in slanted font within the command (e.g., you would type your actual username when you see username).
While art supports OS X as well as flavors of Linux, the instructions for using art are nearly identical for all supported systems. When operating-system-specific instructions are needed, they are noted in the exercises.
When an example Unix command line would overflow the page width, this documentation will use a trailing backslash to indicate that the command is continued on the next line. We indent the second line to make clear that it is not a separate command from the first line. For example:
You can type the entire command on a single line if it fits, without typing the backslash, or on two lines with the backslash as the final character of the first line. Do not leave a space before the backslash unless it is required in the command syntax, e.g., before an option, as in
Code listings in C++ are shown as:
  // This is a C++ file listing.
  float* pa = &a;
Code listings in FHiCL are shown as:
Other script or file content is denoted:
Computer output from a command is shown as:
Step-by-step procedures that the reader is asked to follow are denoted in the following way:
Occasionally, text will be called out to make sure that you don’t miss it. Important or tricky terms
and concepts will be marked with a “pointing finger” symbol in the margin, as shown at
right.
Items that are even trickier will be marked with a “bomb” symbol in the margin, as shown at
right. You really want to avoid the problems they describe.
In some places it will be necessary for a paragraph or two to be written for experts. Such
paragraphs will be marked with a “dangerous bends” symbol in the margin, as shown at right.
Less experienced users can skip these sections on first reading and come back to them at a
later time.
Text that refers in particular to Fermilab-specific information is marked with a Fermilab picture,
as shown at right.
Text that refers in particular to information about using
A site is a combination of experiment and location; the term refers to a set of computing
resources configured for use by a particular experiment at a particular location. Two examples
of sites are the Fermilab-supplied resources used by your experiment and the group
computing resources at an institution that collaborates on your experiment. If you have the
necessary software installed on your own laptop, it is also a site; the same holds for your own
desktop.
Experiment-specific information will be kept to an absolute minimum; wherever it
appears, it will be marked with an experiment-specific icon, e.g., the Mu2e icon at
right.
The initial clients of
The Fermilab SCD has also developed a related product named artdaq; see
https://cdcvs.fnal.gov/redmine/projects/artdaq/wiki.
A technical paper on
The design of
Experiments using
In all previous experiments at Fermilab, and in most previous experiments elsewhere, infrastructure software (i.e., the framework, broadly construed – mostly forms of bookkeeping) has been written in-house by each experiment, and each implementation has been tightly coupled to that experiment’s code. This tight coupling has made it difficult to share the framework among experiments, resulting in both great duplication of effort and mixed quality.
design of
In 2011, the International Standards Committee voted to approve a new standard for C++, called C++ 11.
Much of the existing user code was written prior to the adoption of the C++ 11 standard and has not yet been updated. As you work on your experiment, you are likely to encounter both code written the new way and code written the old way. Therefore, the Workbook will often illustrate both practices.
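As an illustration of what "the old way" and "the new way" can look like, the sketch below sums the elements of a vector twice: once with an explicit iterator loop, as was typical before C++ 11, and once with auto and a range-based for loop. The function names (sumOldStyle, sumNewStyle, checkSums) are hypothetical and chosen only for this example; they are not taken from the Workbook code.

```cpp
#include <cassert>
#include <vector>

// Pre-C++11 style: spell out the iterator type and the loop bounds by hand.
double sumOldStyle(std::vector<double> const& values) {
  double sum = 0.;
  for (std::vector<double>::const_iterator i = values.begin();
       i != values.end(); ++i) {
    sum += *i;
  }
  return sum;
}

// C++11 style: let the compiler deduce the element type and
// use a range-based for loop.
double sumNewStyle(std::vector<double> const& values) {
  double sum = 0.;
  for (auto const& v : values) {
    sum += v;
  }
  return sum;
}

// Usage example: both styles compute the same result.
void checkSums() {
  std::vector<double> const values{1.0, 2.5, 4.0};
  assert(sumOldStyle(values) == 7.5);
  assert(sumNewStyle(values) == 7.5);
}
```

Both functions compile under a C++ 11 compiler, which is why code in the two styles can coexist in one code base.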
A very useful compilation of what is new in C++ 11 can be found at
This reference material is written for advanced C++ users.
Please send your questions and comments to
When complete, this documentation suite will contain several principal components, or
This introductory volume is intended to set the stage for using
The Workbook is a series of standalone, self-paced exercises that will introduce the building
blocks of the
One of the Workbook’s primary functions is training readers how and where to find
more extensive documentation on both
The Workbook assumes some basic computing skills and some basic familiarity with the C++ programming language; Chapter 6 provides a tutorial/refresher for readers who need to improve their C++ skills.
The Workbook is written using recommended best practices that have become current since the adoption of C++ 11 (see Section 3.8).
Because
work through the exercises to translate the lessons learned there into the environment of their
own experiments.
The Users Guide is targeted at physicists who have reached an intermediate level of
competence with
The Reference Manual will be targeted at physicists who already understand the major ideas
underlying
The Technical Reference will be targeted at the experts who develop and maintain
The glossary will evolve as the documentation set grows. At the time of writing, it includes
definitions of
This section defines some language and some background material about the
In almost all HEP experiments, the core idea underlying all bookkeeping is the
In a typical HEP experiment, the trigger or DAQ system assigns an event identifier (event ID) to
each event; this ID uniquely identifies each event, satisfying a critical requirement imposed by
simulated events.
The simplest event ID is a monotonically increasing integer. A more common practice is to
define a multi-part ID and
There are two common methods of using this event ID scheme and
When an experiment takes data, events read from the DAQ are typically written to disk files,
with copies made on tape. The events in a single subRun may be spread over several
files; conversely, a single file may contain many runs, each of which contains many
subRuns.
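A multi-part event ID can be pictured as a small class that holds several integers and compares them in order. The sketch below assumes, for illustration only, a three-part ID made of run, subRun and event numbers. ToyEventID and checkToyEventID are hypothetical names; this is not the art::EventID class, whose interface is covered in the Workbook.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical three-part event identifier (illustration only, not art::EventID).
struct ToyEventID {
  std::uint32_t run;
  std::uint32_t subRun;
  std::uint32_t event;
};

// Two IDs are equal only if all three parts agree.
inline bool operator==(ToyEventID const& a, ToyEventID const& b) {
  return std::tie(a.run, a.subRun, a.event) ==
         std::tie(b.run, b.subRun, b.event);
}

// Order IDs by run number first, then subRun number, then event number.
inline bool operator<(ToyEventID const& a, ToyEventID const& b) {
  return std::tie(a.run, a.subRun, a.event) <
         std::tie(b.run, b.subRun, b.event);
}

// Usage example: an event in an earlier subRun of the same run sorts first,
// even though its event number is larger.
inline void checkToyEventID() {
  ToyEventID const first{1, 0, 25};
  ToyEventID const second{1, 1, 3};
  ToyEventID const copyOfFirst{1, 0, 25};
  assert(first < second);
  assert(!(second < first));
  assert(first == copyOfFirst);
}
```

Comparing the parts in a fixed order is one way to make every ID unique and sortable; other layouts are equally possible.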
Users provide executable code to
The simplest command to run
The argument to
even more simply as just the
When
These rules will be explained as you work through the Workbook and they are summarized in
The code base of a typical experiment will contain many C++ classes. Only a small fraction of these will be modules; most of the rest will be ordinary C++ classes that are used within modules.
A user can tell
Imagine the processing of each event as the assembly of a widget on an assembly
line and imagine each module as a worker that needs to perform a set task on each
widget. Each worker has a task that must be done on each widget that passes by; in
addition some workers may need to do some start-up or close-down jobs. Following this
metaphor,
For those of you who are familiar with it may override other virtual member
functions from the base class.
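The assembly-line picture can be sketched in plain C++. The classes and functions below (ToyEvent, ToyModule, CountingModule, runToyEventLoop, checkToyEventLoop) are hypothetical and do not use art; art's own base classes and member-function signatures differ. The sketch shows only the general pattern: a base class with virtual member functions that a framework calls at start-up, once per event and at close-down, and a derived class that overrides them.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an event; a real event holds data products.
struct ToyEvent {
  int id;
};

// Hypothetical base class: the "worker" interface that the toy framework knows.
class ToyModule {
public:
  virtual ~ToyModule() = default;
  virtual void beginJob() {}                        // optional start-up work
  virtual void analyze(ToyEvent const& event) = 0;  // required per-event work
  virtual void endJob() {}                          // optional close-down work
};

// A user-written module: counts events and records the start-up and
// close-down calls.
class CountingModule : public ToyModule {
public:
  void beginJob() override { started = true; }
  void analyze(ToyEvent const&) override { ++nEvents; }
  void endJob() override { finished = true; }

  bool started = false;
  bool finished = false;
  int nEvents = 0;
};

// Toy event loop: start-up calls, then every module sees every event in turn,
// then close-down calls.
void runToyEventLoop(std::vector<ToyEvent> const& events,
                     std::vector<ToyModule*> const& modules) {
  for (auto* m : modules) m->beginJob();
  for (auto const& e : events) {
    for (auto* m : modules) m->analyze(e);
  }
  for (auto* m : modules) m->endJob();
}

// Usage example: one module, three events.
void checkToyEventLoop() {
  CountingModule counter;
  std::vector<ToyEvent> const events{{1}, {2}, {3}};
  runToyEventLoop(events, {&counter});
  assert(counter.started && counter.finished);
  assert(counter.nEvents == 3);
}
```

The loop only knows the base-class interface, so user code can be added without changing the loop itself.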
After
The event loop
This entire set of steps comprises the event loop. One of
Every
Note that no module may change information that is already present in an event.
What does an analyzer do if it may neither alter information in an event nor add to it? Typically it
creates printout and it creates ROOT files containing histograms,
Most novice users will only write analyzer modules and filter modules; readers with a little more
experience may also write producer modules. The Workbook will provide examples of all three.
Few people other than
This section introduces more ideas and terms dealing with event information that you will need as you progress through the Workbook.
The term
Because these data products are intrinsically experiment-dependent, each experiment defines its
own data products. In the Workbook, you will learn about a set of data products designed for use
with the toy experiment. There are a small number of data products that are defined by
A data product is just a C++ different than the rules that must be followed for a class to be a
module; when the sections that describe these rules in detail have been prepared, we will add
references here. A data product can be a single integer, a large complex class hierarchy, or
anything in between.
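To make the range "from a single integer to anything in between" concrete, the sketch below shows one simple possibility: a small struct and a collection of such structs. ToyHit, ToyHitCollection, totalEnergy and checkToyHits are hypothetical names invented for this example. The sketch shows only the C++ shape of such a type and does not cover the rules that art imposes on data products.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type: one hit in a toy detector.
struct ToyHit {
  int channel;    // which detector channel fired
  double energy;  // deposited energy, arbitrary units
};

// Hypothetical collection type built from the element type.
using ToyHitCollection = std::vector<ToyHit>;

// Ordinary free function that reads a collection without modifying it.
double totalEnergy(ToyHitCollection const& hits) {
  double sum = 0.;
  for (auto const& hit : hits) sum += hit.energy;
  return sum;
}

// Usage example: a collection holding two hits.
void checkToyHits() {
  ToyHitCollection const hits{{3, 1.5}, {7, 2.0}};
  assert(hits.size() == 2);
  assert(totalEnergy(hits) == 3.5);
}
```

Taking the collection by const reference matches the rule that code reading an event may not change what is already there.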
Very often, a data product is a
Previous sections of this Introduction have introduced the concept of C++ classes that have
to obey a certain set of rules defined by
In a typical
To provide managed access to the second sort of information,
After
Once a service has been constructed, any code in any module can ask
to that service and use the features provided by that service. Because services are constructed
before modules, they are available for use by modules over the full life cycle of each
module.
It is also legal for one service to request information from another service as long as the dependency chain does not have any loops. That is, if Service A uses Service B, then Service B may not use Service A, either directly or indirectly.
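The no-loops rule can be illustrated with a short sketch. This is an illustration only, not how art itself stores or checks service dependencies: the `DependencyMap` type and the function `hasDependencyLoop` are hypothetical names introduced here. The sketch does a depth-first search over a table that records which services each service uses, and reports whether any service ends up depending on itself, directly or indirectly.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: each service name maps to the names of the
// services that it uses. This is not an art data structure.
using DependencyMap = std::map<std::string, std::vector<std::string>>;

namespace detail {
// Depth-first search; returns true if a loop is reachable from `name`.
inline bool visit(DependencyMap const& deps, std::string const& name,
                  std::set<std::string>& inProgress,
                  std::set<std::string>& done) {
  if (done.count(name) != 0) return false;       // already checked, no loop
  if (inProgress.count(name) != 0) return true;  // we came back to ourselves
  inProgress.insert(name);
  auto const it = deps.find(name);
  if (it != deps.end()) {
    for (auto const& used : it->second) {
      if (visit(deps, used, inProgress, done)) return true;
    }
  }
  inProgress.erase(name);
  done.insert(name);
  return false;
}
}  // namespace detail

// Returns true if any service depends on itself, directly or indirectly.
bool hasDependencyLoop(DependencyMap const& deps) {
  std::set<std::string> inProgress;
  std::set<std::string> done;
  for (auto const& entry : deps) {
    if (detail::visit(deps, entry.first, inProgress, done)) return true;
  }
  return false;
}
```

With this sketch, a table in which A uses B and B uses C is accepted, while one in which B also uses A is reported as a loop.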
For those of you familiar with the C++ Singleton Design Pattern, an Contrast this with the behavior of Singletons, for
which the order of initialization is undefined by the C++ standard and which is an accident of the
implementation details of the loader.
When code is executed within the
To make an experiment’s code available to
Experiments that use
The
The compiler options corresponding to the three levels are listed in Table 3.1.
As you progress through the Workbook, you will see that the exercises use some software
packages that are part of neither
These packages and tools are referred to as
An initial list of the external products you will need to become familiar with includes:
Any particular line of code in a Workbook exercise may use elements from, say, four or five of these packages. Knowing how to parse a line and identify which feature comes from which package is a critical skill. The Workbook will provide a tour of the above packages so that you will recognize elements when they are used and you will learn where to find the necessary documentation.
For the
products whichever way you prefer. UPS is, itself, just another external product. From
the point of view of your experiment,
Finally, it is important to recognize an overloaded word, If it is not clear from the context which is meant, please let us know (see
Section 3.4).
Section 3.6.4 introduced the idea of
While each experiment will define its own data product classes, there is a common set of
questions that
The answers to these questions form what is called the
A question that is closely related to the EDM is: what technologies are supported to write data
products from memory to a disk file and to read them from the disk file back into memory in a
separate
A few other related terms that you will encounter include:
When you read data from an experiment and write the data to a disk file, that disk file is usually
called a
When you simulate an experiment and write a disk file that holds the information produced by
the simulation, what should you call the file? The Particle Data Group has recommended that this
not be called a “data file” or a “simulated data file;” they prefer that the word “data” be strictly
reserved for information that comes from an actual experiment. They recommend that
we refer to these files as “files of simulated events” or “files of Monte Carlo events.” Note
the use of “events,” not “data.”
This leaves us with a need for a collective noun to describe both data files and files of simulated
events. The name in current use is
Many experiments do not have access to enough disk space to hold all of their event-data files, ROOT files and log files. The solution is to copy a subset of the disk files to tape and to read them back from tape as necessary.
At any given time, a snapshot of an experiment’s files will show some on tape only, some on tape
with copies on disk, and some on disk only. For any given file, there may be multiple copies on
disk and those copies may be distributed across many
Conceptually, two pieces of software are used to keep track of which files are where, a
https://cdcvs.fnal.gov/redmine/projects/sam-main/wiki.
The Workbook exercises are based around a made-up (
The software for the toy experiment is designed around a toy detector, which is shown in
Figure 3.3. The
The toy detector is a central detector made up of 15 concentric shells, with their axes centered on
the
Each shell is a detector that measures
All of the code in the toyExperiment product works in the set of units described in Table 3.2.
Because the code in the Workbook is built on toyExperiment, it uses the same units.
The first six units listed in Table 3.2 are the base units defined by the CLHEP SystemOfUnits
package. These are also the units used by Geant4.
The workflow of the toy experiment code includes five steps: three simulation steps, a reconstruction step and an analysis step:
For each event, the event generator creates some signal particles and some background particles. The first signal particle is generated with the following properties:
The event generator then decays this particle to
The background particles are generated by the following algorithm:
The above algorithm generates events with a total charge of zero but there is no concept of momentum or energy balance. About 47% of these events will not have any background tracks.
In the detector simulation step, particles neither scatter nor lose energy when they pass through
the detector cylinders; nor do they decay. Therefore, the charged particles follow a perfectly
helical trajectory. The simulation follows each charged particle until it either exits the detector or
until it completes the outward-going arc of the helix. When the simulated trajectory crosses one
of the detector shells, the simulation records the true point of intersection. All intersections are
recorded; at this stage in the simulation, there is no notion of inefficiency or resolution. The
simulation does not follow the trajectory of the
Figure 3.4 shows an event display of a simulated event that has no background tracks. In
this event the
Figure 3.5 shows an event display of another simulated event, one that has four background
tracks, all drawn in green. In the
The third step in the simulation chain (hit-making) is to inspect the intersections produced by the
detector simulation and turn them into data-like hits. In this step, a simple model of inefficiency
is applied and some intersections will not produce hits. Each hit represents a 2D measurement
The three simulation steps use tools provided by
The fourth step is the reconstruction step. The toyExperiment does not yet have properly working
reconstruction code; instead it mocks up credible looking results. The output of this code is a
data product that represents a fitted helix; it contains the fitted track parameters of the helix, their
covariance matrix and a collection of smart pointers that point to the hits that are on the
reconstructed track. When we write proper track finding and track fitting code for the
toyExperiment, the classes that describe the fitted helix will not change. Because the main
point of the Workbook exercises is to illustrate the bookkeeping features in
The fifth step in the workflow does a simulated analysis using the fitted helices from the
reconstruction step. It forms all distinct pairs of tracks and requires that they be oppositely charged.
It then computes the invariant mass of the pair, under the assumption that both fitted helices are
kaons.
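The invariant-mass calculation can be sketched as follows. This is not the toyExperiment code: the `Momentum` struct and the function names are hypothetical, and the sketch assumes momenta in MeV/c and masses in MeV/c^2. It applies the standard relations E = sqrt(m^2 + |p|^2) for each track and M^2 = (E1 + E2)^2 - |p1 + p2|^2 for the pair, with both tracks assigned the charged kaon mass.

```cpp
#include <cmath>

// Hypothetical three-vector for a fitted track momentum, assumed in MeV/c.
struct Momentum {
  double px;
  double py;
  double pz;
};

// Charged kaon mass in MeV/c^2 (PDG value, rounded).
constexpr double kaonMass = 493.677;

// Energy of a particle with mass m and momentum p: E = sqrt(m^2 + |p|^2).
double energy(Momentum const& p, double m) {
  return std::sqrt(m * m + p.px * p.px + p.py * p.py + p.pz * p.pz);
}

// Invariant mass of a pair of tracks, assuming that both are kaons:
// M^2 = (E1 + E2)^2 - |p1 + p2|^2.
double invariantMassKK(Momentum const& p1, Momentum const& p2) {
  double const e = energy(p1, kaonMass) + energy(p2, kaonMass);
  double const px = p1.px + p2.px;
  double const py = p1.py + p2.py;
  double const pz = p1.pz + p2.pz;
  double const m2 = e * e - (px * px + py * py + pz * pz);
  // Guard against a tiny negative value caused by floating-point round-off.
  return std::sqrt(m2 > 0. ? m2 : 0.);
}
```

For two kaons at rest the function returns twice the kaon mass, which is the smallest value the pair mass can take under this mass assumption.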
This module is an analyzer module and does not make any output data product. But it does make
some histograms, one of which is a histogram of the reconstructed invariant mass of all pairs of
oppositely charged tracks; this histogram is shown in Figure 3.6. When you run the Workbook
exercises, you will make this plot and can compare it to Figure 3.6. In the figure you can see a
clear peak that is created when the two reconstructed tracks are the two true daughters of the
generated
In many places, the Workbook will recommend that you write fragments of code in a particular way. The reason for any particular recommendation may be one of the following:
It is important to be able to distinguish between rules, best practices, conventions and styles: you must follow the rules; it is wise to use best practices and established conventions; but style suggestions are just that, suggestions. This documentation will distinguish among these options when discussing the recommendations that it makes.
If you follow the recommendations for best practices and common conventions, it will be easier
to verify that your code is correct and your code will be easier to understand, develop and
maintain.
You will work through the Workbook exercises on a computer that is running some version of the Unix operating system. This chapter describes where to find information about Unix and gives a list of Unix commands that you should understand before starting the Workbook exercises. This chapter also describes a few ideas that you will need immediately but which are usually not covered in the early chapters of standard Unix references.
If you are already familiar with Unix and the
In the Workbook exercises, most of the commands you will enter at the Unix prompt will be standard Unix commands, but some will be defined by the software tools that are used to support the Workbook. The non-standard commands will be explained as they are encountered. To understand the standard Unix commands, any standard Linux or Unix reference will do. Section 4.10 provides links to Unix references.
Most Unix commands are documented via the
In Unix, everything is case sensitive; so the command
or
Before starting the Workbook, make sure that you understand the basic usage of the following Unix commands:
You also need to be familiar with the following Unix concepts:
When you type a command at the prompt, a command-line interpreter called a
For those of you with accounts on a Fermilab machine, your login shell was initially set to the
bash shell.
If you are working on a non-Fermilab machine and your login shell is not bash, learn how to change your login shell to bash.
Some commands are executed internally by the shell but other commands are dispatched to an
appropriate program or script, and launch a child shell (of the same variety) called a
In order to automate repeated operations, you may write multiple Unix commands into a file and
tell
Throughout the Workbook exercises you will run many scripts. You should understand the big
picture of what they do, but you don’t need to understand the details of how they
work.
If you would like to learn more about
Very generally, a Unix
This constitutes a basic
Particular programs (e.g.,
In turn, the Workbook code, which must work for all experiments and at Fermilab as well as at
collaborating institutions, requires yet more environment configuration – a
Given the different experiments using unique combination of
When you finish the Workbook and start to run real code, you will set up your experiment-specific
environment on top of the more generic
the script appropriate for the environment you want. Because of potential naming
“collisions,” it is not guaranteed that these two environments can be overlain and always work
properly.
This concept of the environment hierarchy is illustrated in Figure 4.1.
One way to see the value of an environment variable is to use the
At any point in an interactive command or in a shell script, you can tell the shell that you want
the value of the environment variable by prefixing its name with the
Here,
By convention, environment variables are virtually always written in all capital letters.
There may be times when the Workbook instructions tell you to set an environment variable to some value. To do so, type the following at the command prompt:
If you read
In addition,
All of these path concepts are important to users of
When you source the scripts that setup your environment for
To make the output easier to read by replacing all of the colons with newline characters, enter:
In the above line, the vertical bar is referred to as a
There are two ways to run a bash script (actually three, but two of them are the same).
Suppose that you are given a bash script named
The first version,
The second and third versions are equivalent. They do not start a subshell; they execute the
commands from
Some shell scripts are designed so that they must be sourced and others are designed so that they must be executed. Many shell scripts will work either way.
If the purpose of a shell script is to modify your working environment then it must be sourced,
not executed. As you work through the Workbook exercises, pay careful attention to which
scripts it tells you to source and which to execute. In particular, the scripts to set up your
environment (the first scripts you will run) are bash scripts that must be sourced because
their purpose is to configure your environment so that it is ready to run the Workbook
exercises.
Some people adopt the convention that all bash scripts end in
If you would like to learn more about bash, some references are listed in Section 4.10.
The bash shell also has the notion of a
that you will sometimes need to issue, described in Chapter 7, is an example. A bash
function is similar to a bash script in that it is just a collection of bash commands that are
accessible via a name; the difference is that bash holds the definition of a function
as part of the environment while it must open a file every time that a bash script is
invoked.
You can see the names of all defined bash functions using:
The bash shell also supports the idea of
You can read more about bash shell functions and aliases in any standard bash reference.
When you type a command at the command prompt, bash will resolve the command using the following order:
To learn how bash will resolve a particular command, enter:
When you first log in to a computer running the Unix operating system, the system will
look for specially named files in your home directory that are scripts to set up your
working environment; if it finds these files it will source them before you first get a shell
prompt. As mentioned in Section 4.5, these scripts modify your
When your account on a Fermilab computer was first created, you were given
standard versions of the files You can read about login scripts in any standard bash reference. You may add to these files
but you should not remove anything that is present.
If you are working on a non-Fermilab computer, inspect the login scripts to understand
what they do.
It can be useful to inspect the login scripts of your colleagues to find useful customizations.
If you read generic Unix documentation, you will see that there are other login scripts with
names like,
The following cheat sheet provides some of the basics:
A more comprehensive summary is available from:
Information about writing bash scripts and using bash interactive features can be found in:
The first of these is a compact introduction and the second is more comprehensive.
The above guides were all found at the Linux Documentation Project: Workbook:
Books about Unix are numerous, of course. Examples include Mark Sobell’s
Section 4.5 discussed the notion of a working environment on a computer. This chapter answers the question: How do I make sure that my environment is configured so that I can run the Workbook exercises or my experiment’s code?
This chapter will explain how to do this in several different situations:
On every computer that hosts the Workbook, a procedure must be established that every user
is expected to follow once per login session. In most cases (NO
As a user of the Workbook, you will need to know what the procedure is and you must
remember to follow it each time that you log in.
For all of the Intensity Frontier experiments at Fermilab, the site-specific setup procedure defines
all of the environment variables that are necessary to create the working environment for either
the Workbook exercises or for the experiment’s own code.
Table 5.1 lists the site-specific setup procedure for each experiment. You will follow the procedure when you get to Section 9.6.
NO sure that
nothing in your login scripts, either directly or indirectly, executes the following line:
Once you have a clean login, follow the procedure given in Listing 5.1.
There are two goals for this chapter. The first is to provide an overview of the features of C++ that
will be important for users of
You will need to consult standard documentation to learn about any of the features that you are
not already familiar with. The examples and exercises in this chapter will in many cases only
skim the surface of C++ features that you will need to know how to manipulate as
you work through the Workbook exercises and then use C++ code with
The second goal is to explain the process of turning source code files into an
This chapter is designed around a handful of exercises, each of which you will first build and run, then “pick apart” to understand how the results were obtained.
A typical program consists of many source code files, each of which contains a human-readable
description of one or more components of the program. In the Workbook, you will see source
code files written in the C++ computer language; these files have names that end in
in
In the compilation step each source file is translated into
It is often convenient to collect related groups of object files and put them into
The job of the
After the linker has finished, you can run your executable program by typing the filename of the
program at the bash command prompt. If you do not have the current directory on the
A typical program links both to libraries that were built from the program’s source code and to
libraries from other sources. Some of these other libraries might have been developed by the
same programmer as general purpose tools to be used by his or her future programs;
other libraries are provided by third parties, such as
Now that you know about libraries, we can give a second reason why an object file, by itself, is not an executable program: until it is linked, it does not have access to the functions provided by any of the external libraries. Even the simplest program will need to be linked against some of the libraries supplied by the compiler vendor and by the operating system.
The names of all of the libraries and object files that you give to the linker is called the
To start these exercises for the first time, do the following:
After these steps, you are ready to begin the exercise in Section 6.4.
If you log out and log back in again, reestablish your environment by following these steps:
This section provides a program that illustrates the concepts in C++ that are assumed knowledge for the Workbook material. Brief explanations are provided, but in many cases you will need to consult other sources to gain the level of understanding that you will need. Several C++ references are listed in Section 6.9.
This sample program will introduce you to the following C++ concepts and features:
The above list explicitly does not include classes, objects and inheritance, which will be discussed in Section 6.7 and in a future section on inheritance.
In this section you will learn how to compile, link and run the small C++ program that illustrates the features of C++ that are considered prerequisites to the Workbook exercises.
Run the following procedure. The idea here is for you to get used to the steps and
see what results you get. Then in Section 6.4.3 you will examine the source file and
output.
To compile, link and run the sample C++ program, called
Just to see how the exercise was built, look at the script
This turned the source file
Look at the file some are listed in Section 6.9. Note that some questions may be answered in
Section 6.4.3.
In the source file, it is important to first point out the function called the
int main() { ... executable code ... }
Compare your output with the standard example:
There will almost certainly be a handful of differences, which we will discuss in Section 6.4.3.1.
The following sections correspond to sections of the code in
All variables, parameters, arguments, and so on in C++ need to have a
Now, about the handful of differences in the output of one run versus another. There are two main sources of the differences: (1) an uninitialized variable and (2) variation in object addresses from run to run.
In
This line is also the source of the warning message produced by the establish this good coding habit, the
remaining exercises in this series and in the Workbook include the compiler option
See Section 6.4.3.6 for other output that may vary between program runs.
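As a minimal sketch of the habit recommended here, the functions below give every variable a value at the point of declaration. The function names are hypothetical and are not part of the Workbook sample program.

```cpp
// A local variable of built-in type that is declared without an initial value
// holds an indeterminate value, so printing it can differ from run to run.
// Giving each variable a value when it is declared avoids this.

int initializedCount() {
  int k = 0;   // explicit initial value
  return k;
}

double valueInitialized() {
  double x{};  // empty braces value-initialize a built-in type to zero
  return x;
}
```

Both functions return zero on every run, because neither variable is ever read before it has been given a value.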
The next section of the example code introduces
While you might find use of arrays in existing code, we recommend avoiding them in new code
arrays, and using either See Section 6.4.3.5 for an
introduction to these types.
Two variables which refer to different objects that contain the same value (either by design or by
coincidence) are is a common mistake.
Another distinction to be made is that of two variables being
#include <algorithm>
#include <string>

std::size_t maxSize(std::string const& a, std::string const& b) {
  return std::max(a.size(), b.size());
}
If we consider the call:
std::string s("cow");
auto sz = maxSize(s, s);
then, in the body of the function
The primary conditional statements in C++ are
operator,
// Note: this is pseudocode, not C++
type variable-to-initialize (expression-to-evaluate) ? value-if-true : value-if-false;
An example is shown in the code.
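For readers without the sample program at hand, here is a small stand-alone sketch of the conditional operator. The function names `larger` and `parity` are hypothetical and do not appear in the Workbook code.

```cpp
#include <string>

// Returns the larger of two integers using the conditional operator.
int larger(int a, int b) {
  return (a > b) ? a : b;
}

// Initializes a variable from a conditional expression.
std::string parity(int n) {
  std::string const label = (n % 2 == 0) ? "even" : "odd";
  return label;
}
```

The conditional operator is most useful when it initializes a variable in one step, as in `parity`, because the variable can then be declared `const`.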
The C++ Standard Library is quite large, and contains many classes, functions, class templates,
and function templates. Our sample code introduces only three: the class
A
The
A pointer is a variable whose value is the memory address of another object. The type of a pointer must correspond to the type of the object to which it points.
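A minimal sketch of these ideas, using hypothetical function names that are not part of the sample program: a pointer is initialized with the address of a variable, and the pointed-to object is then read and modified through the pointer.

```cpp
// Doubles the value of the object that p points to.
void doubleInPlace(float* p) {
  *p = 2.0f * (*p);   // *p reads or writes the pointed-to object
}

float pointerDemo() {
  float a = 1.5f;
  float* pa = &a;     // pa holds the address of a; its type is pointer-to-float
  doubleInPlace(pa);  // modifies a through the pointer
  return a;           // a is now 3.0
}
```

Because `pa` has type `float*`, it can only be initialized with the address of a `float`; trying to initialize it with the address of an `int` is a compile-time error.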
In addition to the sources of difference in the program output between runs discussed in Section 6.4.3.1, another stems from the line:
float* pa = &a;
This line declares a variable
The variable
Note that this line could have been written with the asterisk next to
float *pa = &a;
This latter style is common in the C community. In the C++ community, the former style is
preferred, because it emphasizes the type of the variable
Since the address may change from run to run, so may the printout that starts
The next line,
std::cout << " *pa = " << *pa << std::endl;
shows how to access the value to which a pointer points. This is called
In Section 6.7 you will learn about
(*panimal).size()
panimal->size()
In the example code, the lines
std::cout << "The size of animal is: " << (*panimal).size() << std::endl;
std::cout << "The size of animal is: " << panimal->size() << std::endl;
do exactly the same thing. Note that the parentheses in the first line are necessary because the
precedence of
Note that in many situations, the compiler is free to convert an array-of-
A
float a;
float& ra = a;
float* p = &a;
float* q = &ra;
The values of
Loops, also called
In the previous exercise, the user code was found in a single file and the build script performed
compiling and linking in a single step. For all but the smallest programs, this is not practical. It
would mean, for example, that you would need to recompile and relink everything when you
made even the smallest change anywhere in the code; generally this would take much too long.
To address this, some computer languages, including C++, allow you to break up a large program
into many smaller files and rebuild only a small subset of files when you make changes in one.
There are two exercises in this section. In the first one the source code consists of three files. This example has enough richness to discuss the details of what happens during compiling and linking, without being overwhelming. The second exercise introduces the idea of libraries.
The source code for this exercise is found in
The file
function, either directly or indirectly. For more information, consult any standard C++
reference. The file
Look at
To be recognized as a main program, there is one more requirement: main must be declared
in the global namespace.
The body of the main program (between the braces), declares and defines a variable
You, as the programmer using that function, need to know what the function does but the C++
compiler doesn’t. It only needs to know the name, argument list and return type of the function
— information that is provided in the header file,
double times2(double);
This line is called the
The other three lines in
Finally, the file
double times2(double i) {
  return 2 * i;
}
It names its argument
We now have a rich enough example to discuss a case in which the same word is frequently used for two different things — instead of two words used for the same thing.
Sometimes people use the phrase “the source code of the function named
The phrase
Based on the above description, when this exercise is run, we expect it to print out:
To perform this exercise, first log in and
This matches the expected printout.
Look at the file
You should have noticed that the same command,
The full story is that when you run the command
If the
The third command (with no
As it is compiling the main program,
The compiler also makes a table that lists all functions defined by the file and all functions that
are called by code within the file. The name of each entry in the table is called a
find the definition of
When the compiler writes an object file, it writes out both the compiled code and the table of linker symbols.
The symbol table in the file
The job of the linker (also invoked by the command
Sometimes resolving one unresolved reference will generate new ones. The linker iterates until
(a) all references are resolved and no new unresolved references appear (success) or (b) the same
unresolved references continue to appear (error). In the former case, the linker writes the output
to the file specified by the
After the link completes, the files
The script
Look at the script
This script automatically does the same operations as
It takes a bit of experience to decipher the error messages issued by a C++ compiler. The three exercises in this section are intended to introduce you to them so that you (a) get used to looking at them and (b) understand these particular errors if/when you encounter them later.
Each of the following three exercises is independent of the others. Therefore, when you finish with each exercise, you will need to undo the changes you made in the source file(s) before beginning the next exercise.
The first homework exercise will issue the diagnostic:
When you see a message like this one, you can guess that either you have not included a required header file or you have misspelled the name of the function.
The second homework exercise will issue the diagnostic (second and last lines split into two here):
This error message says that the compiler has found two functions that have the same signature but different return types. The compiler does not know which of the two functions you want it to use.
The bottom line here is that you must ensure that the definition of a function is consistent with its declaration; and you must ensure that the use of a function is consistent with its declaration.
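The relationship between declaration, use and definition can be sketched in a single file. The function `quadruple` is a hypothetical addition for illustration; in the exercise itself the declaration lives in a header file and the definition in its own source file.

```cpp
// Declaration: tells the compiler the name, argument list and return type.
// In a multi-file program this line would live in a header file.
double times2(double);

// Use: this compiles because the declaration above is visible, even though
// the compiler has not yet seen the definition.
double quadruple(double x) {
  return times2(times2(x));
}

// Definition: must agree with the declaration in name, argument types and
// return type. Changing the return type here to float, for example, would
// be rejected by the compiler as conflicting with the declaration above.
double times2(double i) {
  return 2 * i;
}
```

Keeping the declaration in one header file that is included both where the function is defined and where it is used lets the compiler catch many such inconsistencies for you.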
The third homework exercise illustrates the C++ idea of
double tmp = a;
...
std::cout << " times2(a) " << times2(tmp) << std::endl;
Consult the standard C++ documentation to understand when implicit type conversions may occur; see Section 6.9.
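As a small sketch of the idea, assuming a `times2` function like the one in this exercise, the code below passes an `int` to a function that expects a `double`. The helper names `callWithInt` and `truncated` are hypothetical.

```cpp
double times2(double i) {
  return 2 * i;
}

double callWithInt() {
  int n = 3;
  return times2(n);   // n is implicitly converted from int to double
}

int truncated() {
  double d = 2.9;
  int k = static_cast<int>(d);  // explicit conversion; the fraction is discarded
  return k;
}
```

The conversion from `int` to `double` in `callWithInt` loses no information. The conversion in the other direction does, which is why `truncated` spells it out with `static_cast` rather than relying on an implicit conversion.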
Multiple object files can be grouped into a single file known as a
In this section you will repeat the example of Section 6.5 with a variation. You will create a
library from
To perform this exercise, do the following:
This matches the expected printout. Now let’s look at the script
Note that from this point on, in order to reduce the verbosity of some library descriptions, we
will use the Linux form of library names ( If you
are working on OS X, you will need to translate all these to the OS X form (
The two new features are in step 2, which creates the dynamic library, and step 4, in which
In the filename
The other parts of the name, the prefix
you should always follow it. The use of this convention is illustrated by the scripts
To perform the exercise using
The only difference between
while that from
In the script
In the above, the dot in
This syntax generalizes to multiple libraries in multiple directories as follows. Suppose that the
libraries
The
To perform the exercise using
The difference between
The comments in the sample program used in Section 6.4 emphasized that every variable has a
type:
The language features that allow users to define types are the
Classes and structures (types introduced by either
In general, a class is specified by both a
class) to contain only data or only functions. A class definition has the form shown in
listing 6.1.
class MyClassName {
  // required: declarations of all members of the class
  // optional: definitions of some members of the class
};
The string
A class
class MyClassName;
Class declarations are rarely used because a class definition also acts as a class declaration.
The remainder of Section 6.7 will give many examples of
In a class definition, the semi-colon after the closing brace is important.
The upcoming sections will illustrate some features of classes, with an emphasis on features that
will be important in the first few Workbook exercises. This is not intended to be a comprehensive
description of classes. To illustrate, we will show nine versions of a type named
This documentation will use technically correct language so that you will find it easier to read the standard reference materials. We will point out colloquial usage as necessary.
Note that the C++ Standard uses the phrase
Here you will see a very basic version of the class
To build and run this example:
The values printed out in the first line of the output may be different when you run the program
(remember initialization?). When you look at the code you will see that
Look at the header file
 1  #ifndef Point_h
 2  #define Point_h
 3
 4  class Point {
 5  public:
 6    double x;
 7    double y;
 8  };
 9
10  #endif /* Point_h */
The three lines starting with
Line 4 introduces the name
The body of the class definition begins on line 4, with the opening brace; the body of the class
definition ends on line 8, with the closing brace. The definition of the class is followed by a
semicolon. Line 5 states that the following members of the class are
In this exercise there is no file
Look at the function
#include "Point.h"
#include <iostream>

int main() {
  Point p0;
  std::cout << "p0: (" << p0.x << ", " << p0.y << ")"
            << std::endl;

  p0.x = 1.0;
  p0.y = 2.0;
  std::cout << "p0: (" << p0.x << ", " << p0.y << ")"
            << std::endl;

  Point p1;
  p1.x = 3.0;
  p1.y = 4.0;
  std::cout << "p1: (" << p1.x << ", " << p1.y << ")"
            << std::endl;

  Point p2 = p0;
  std::cout << "p2: (" << p2.x << ", " << p2.y << ")"
            << std::endl;

  std::cout << "Address of p0: " << &p0 << std::endl;
  std::cout << "Address of p1: " << &p1 << std::endl;
  std::cout << "Address of p2: " << &p2 << std::endl;

  return 0;
}
When the first line of code in the
Point p0;
is executed, the program will ensure that memory has been
allocated5
to hold the data members of
Some other standard pieces of C++ nomenclature can now be defined:
An important take-away from the above is that a
We have now seen multiple meanings for the word
Which is meant must be determined from context. In this Workbook, we will use “class instance” rather than “object” to distinguish between the second and third meanings in any place where such differentiation is necessary.
The last section of the main program (and of
Figure 6.1 shows a diagram of the computer memory at the end of running
Now, for a bit more terminology: each of the objects referred to by the variables
This exercise expands the class
To build and run this example:
When you run the code, all of the printout should match the above printout exactly.
Look at
Point();
The parentheses tell you that this new member is some sort of function. A C++ class may have several different kinds of functions.
A function that has the same name as the class itself has a special role and is called a
no arguments it is called a
In informal written material, the word constructor is sometimes written as
Look at the file
Point::Point() { x = 0.; y = 0.; }
Look at the program
Point p0;
When the program executes this line, the first step is the same as before: it ensures that memory
has been allocated for the data members of
The next block of the program assigns new values to the data members of
In the previous example,
This exercise introduces four new ideas:
To build and run this exercise,
Look at the file
Point(double ax, double ay);
This line declares a second constructor; we know it is a constructor because it is a function whose
name is the same as the name of the class. It is distinguishable from the default constructor
because its argument list is different than that of the default constructor. As before,
the file
Look at the file
Look at the file
Point p0(1.,2.);
This line declares the variable
The next line of code
Point p1(p0);
uses the
constructor, the compiler puts the generated code directly into the object file; it does not affect
the source file.
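The three ways of constructing an object discussed so far can be collected into one short sketch. This is an illustration written for this discussion, not a copy of the Workbook source files; the function `demoConstructors` is a hypothetical name. The class declares a default constructor and a two-argument constructor, and declares no copy constructor, so the compiler generates one.

```cpp
#include <cassert>

class Point {
public:
  Point();                      // default constructor: takes no arguments
  Point(double ax, double ay);  // second constructor: different argument list
  // No copy constructor is declared, so the compiler generates one.

  double x;
  double y;
};

Point::Point() { x = 0.; y = 0.; }
Point::Point(double ax, double ay) { x = ax; y = ay; }

// Hypothetical demonstration function.
void demoConstructors() {
  Point p0(1., 2.);  // uses the two-argument constructor
  Point p1(p0);      // uses the compiler-generated copy constructor
  Point p2;          // uses the default constructor

  assert(p1.x == 1. && p1.y == 2.);  // p1 holds a copy of p0's values
  assert(p2.x == 0. && p2.y == 0.);  // p2 was initialized to the origin
}
```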
We recommend that for any class whose data members are either built-in types, of which
If your class has data members that are pointers, or data members that manage some external
resource, such as a file that you are writing to, these pointers should be
The next line in the file prints the values of the data members of
Notice that in the previous version of
Point p0;
p0.x = 3.1;
p0.y = 2.7;
This is called
Point p0(1.,2.);
We strongly recommend using single-phase construction whenever possible. Obviously it
takes less real estate, but more importantly:
This version of the class
To build and run this exercise,
The file
Now look at the file
If you think about these rules carefully, you will see that in
On the other hand, when the compiler compiled the source code for the default constructor in
Therefore, the machine code for the
In practice
In some cases it does not matter which of these two ways you use to write a constructor; but on those occasions that it does matter, the right answer is always the colon-initializer syntax. So we strongly recommend that you always use the colon-initializer syntax. In the Workbook, all classes are written with colon-initializer syntax.
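As a minimal sketch of one occasion on which the choice does matter, consider a class with `const` data members. The class `FrozenPoint` and the function `demoInitializers` are hypothetical names invented for this illustration; they do not appear in the Workbook. A `const` data member can be initialized but never assigned to, so only the colon-initializer form compiles.

```cpp
#include <cassert>

class Point {
public:
  Point();
  Point(double ax, double ay);
  double x;
  double y;
};

// Colon-initializer syntax: the data members are initialized directly,
// rather than being assigned to inside the constructor body.
Point::Point() : x(0.), y(0.) {}
Point::Point(double ax, double ay) : x(ax), y(ay) {}

// Hypothetical class for illustration: its data members are const.
class FrozenPoint {
public:
  FrozenPoint(double ax, double ay);
  double const x;
  double const y;
};

// This is the only form that compiles for FrozenPoint.
FrozenPoint::FrozenPoint(double ax, double ay) : x(ax), y(ay) {}

// The assignment form would be rejected by the compiler, because
// a const data member cannot be assigned to:
// FrozenPoint::FrozenPoint(double ax, double ay) { x = ax; y = ay; }

// Hypothetical demonstration function.
void demoInitializers() {
  Point p(1., 2.);
  FrozenPoint f(3., 4.);
  assert(p.x == 1. && p.y == 2.);
  assert(f.x == 3. && f.y == 4.);
}
```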
Now look at the second constructor in
Look at
This section will introduce
To build and run this exercise,
Look at the file
double mag() const;
double phi() const;
void scale(double factor);
All three lines declare
The first of these member functions is named
Similarly, the member function named
The third member function,
If a member function does not modify any data members, you should always declare it const. Any negative consequences of not doing so might only become apparent later, at which point a lot of tedious editing will be required to make everything right.
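The following sketch illustrates the recommendation. It is not the Workbook's implementation file: the body of `phi` (using `std::atan2`) and the function `demoConst` are assumptions made for this example. The two member functions that only read the data members are declared `const`, so they can be called on a `const` object; `scale` modifies the data members, so it is not.

```cpp
#include <cassert>
#include <cmath>

class Point {
public:
  Point(double ax, double ay);

  double mag() const;         // does not modify the object
  double phi() const;         // does not modify the object
  void scale(double factor);  // modifies the object, so it is not const

  double x;
  double y;
};

Point::Point(double ax, double ay) : x(ax), y(ay) {}

double Point::mag() const { return std::sqrt(x * x + y * y); }

// Assumption: the azimuthal angle is computed with std::atan2.
double Point::phi() const { return std::atan2(y, x); }

void Point::scale(double factor) {
  x *= factor;
  y *= factor;
}

// Hypothetical demonstration function.
void demoConst() {
  Point const fixed(3., 4.);
  assert(fixed.mag() == 5.);  // allowed: mag() is a const member function
  // fixed.scale(2.);         // would not compile: scale() is not const

  Point p(3., 4.);
  p.scale(2.);                // allowed: p is not const
  assert(p.mag() == 10.);
}
```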
Look at
Later on in
function declared in the
The next part of
The next part of
The file
The next line calls the member function
One final comment is in order. Many other modern computer languages have ideas very similar
to C++ classes and C++ member functions; in some of those languages, the name
Here we suggest four activities as homework to help illustrate the meaning of
double Point::mag() const {
  x *= 2.;
  return std::sqrt(x*x + y*y);
}
Then build the code again; you should see the following diagnostic message:
Point const p0(1,2);
Then build the code again; you should see the following diagnostic message:
These first two homework exercises illustrate how the compiler enforces the contract defined by
the qualifier
accordingly.
In the first homework exercise, the value of a member datum is modified, thereby breaking the contract. The compiler detects it and issues a diagnostic message.
In the second homework exercise, the variable
double mag();
Then build the code again; you should see the following diagnostic message:
The third and fourth homework exercises illustrate that the compiler considers two member
functions that are identical except for the presence of the
This version of the class
A 2D point class, with member data in Cartesian coordinates, is not a good example of
To build and run this exercise,
Look at
Relative to version
Yes, there are two functions named
The two member functions named
If you want to see what mangled names are created for the class
You can understand the output of
In a class declaration, if any of the identifiers
Look at
Relative to version
Inspect the code in the implementation of each of the new member functions. The member function
The member functions in the overload set for the name
of the
There is no requirement that there be accessors and setters for every data member of a class; indeed, many classes provide no such member functions for many of their data members. If a data member is important for managing internal state but is of no direct interest to a user of the class, then you should certainly not provide an accessor or a setter.
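For data members that are of direct interest to users, the accessor and setter pattern might look like the sketch below. This is an illustration, not a copy of the Workbook's `Point` class; in particular, naming the setters with the same identifiers as the accessors (so that each name has an overload set of two) is an assumption, and `demoAccessors` is a hypothetical name.

```cpp
#include <cassert>

class Point {
public:
  Point();
  Point(double ax, double ay);

  // Accessors: read a data member. They do not modify the object,
  // so they are declared const.
  double x() const;
  double y() const;

  // Setters: overloads of the same names that take one argument.
  void x(double ax);
  void y(double ay);

private:
  double x_;
  double y_;
};

Point::Point() : x_(0.), y_(0.) {}
Point::Point(double ax, double ay) : x_(ax), y_(ay) {}

double Point::x() const { return x_; }
double Point::y() const { return y_; }

void Point::x(double ax) { x_ = ax; }
void Point::y(double ay) { y_ = ay; }

// Hypothetical demonstration function.
void demoAccessors() {
  Point p(1., 2.);
  p.x(3.);               // one argument: the compiler selects the setter
  assert(p.x() == 3.);   // no arguments: the compiler selects the accessor
  assert(p.y() == 2.);
  // p.x_ = 5.;          // would not compile: x_ is private
}
```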
Now that the data members of
Look at
Relative to version
Figure 6.2 shows a diagram of the computer memory at the end of running this version of
The key point in Figure 6.2 is that each object has its own member data but there is only one
copy of the code. Even if there are thousands of objects of type
Initially this sounds a little weird: the previous paragraph talks about passing an argument to the
function
For example, the accessor
double x() const { return this->x_; }
This version of the syntax makes it much clearer how there can be one copy of the code even
though there are many objects in memory; but it also makes the code harder to read once you
have understood how the magic works. There are not many places in which you need to explicitly
use the keyword
C++ will not permit you to use the same name for both a data member and its accessor.
Since the accessor is part of the public interface, it should get the simple, obvious,
easy-to-type name. Therefore the name of the data member needs to be decorated to make it
distinct.
The convention used in the Workbook exercises and in the toyExperiment UPS product is that the names of member data end in an underscore character. There are some other conventions that you may encounter:
_name ; m_name ; mName ; theName ;
You may also see the choice of a leading underscore followed by a capital letter, or
a double underscore. Never do this. Such names are reserved for use by C++
implementations; use of such names may produce accidental collisions with names used in
an implementation, and cause errors that might be very difficult to diagnose. While
this is a very small risk, it seems wise to adopt habits that guarantee that it can never
happen.
It is common to extend the pattern for decorating the names of member data to all member data, even those without accessors. One reason for doing so is just symmetry. A second reason has to do with writing member functions; the body of a member function will, in general, use both member data and variables that are local to the member function. If the member data are decorated differently than the local variables, it can make the member functions easier to understand.
This section describes a class for which it makes sense to have private data: a 2D point class that
has data members
If this class is implemented with private data manipulated by member functions, then the constructors and member functions can enforce the guarantees.
The language used in the software engineering texts is that a guaranteed relationship among the
data members is called an
data.
If a class has no invariant then one is free to choose public data. The Workbook and the toyExperiment never make this choice. One reason is that classes that begin life without an invariant sometimes acquire one as the design matures — we recommend that you plan for this unless you are 100% sure that the class will never have an invariant. A second reason is that many people who are just starting to learn C++ find it confusing to encounter some classes with private data and others with public data.
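To make the idea of an invariant concrete, here is a minimal sketch of a class that enforces one. Everything in it is an assumption made for illustration: the class name `PolarPoint`, its data members, and the particular guarantee chosen (a non-negative radius and an azimuth in the range 0 to 2π, excluding 2π) are not taken from the Workbook. Because the data members are private, every code path that changes them goes through `normalize`, so the guarantee cannot be broken from outside the class.

```cpp
#include <cassert>
#include <cmath>

const double kPi    = std::acos(-1.);
const double kTwoPi = 2. * kPi;

// Hypothetical class. Invariant: r_ >= 0 and 0 <= phi_ < 2*pi.
class PolarPoint {
public:
  PolarPoint(double r, double phi);

  double r() const { return r_; }
  double phi() const { return phi_; }

  void set(double r, double phi);

  // True if the invariant holds; used here only for checking.
  bool invariantHolds() const;

private:
  void normalize();  // restores the invariant after any change

  double r_;
  double phi_;
};

PolarPoint::PolarPoint(double r, double phi) : r_(r), phi_(phi) {
  normalize();
}

void PolarPoint::set(double r, double phi) {
  r_ = r;
  phi_ = phi;
  normalize();
}

bool PolarPoint::invariantHolds() const {
  return r_ >= 0. && phi_ >= 0. && phi_ < kTwoPi;
}

void PolarPoint::normalize() {
  if (r_ < 0.) {             // a negative radius is the same point
    r_ = -r_;                // with a positive radius and the
    phi_ += kPi;             // azimuth rotated by pi
  }
  phi_ = std::fmod(phi_, kTwoPi);          // now in (-2*pi, 2*pi)
  if (phi_ < 0.) phi_ += kTwoPi;           // now in [0, 2*pi]
  if (phi_ >= kTwoPi) phi_ = 0.;           // guard against round-off
}

// Hypothetical demonstration function.
void demoInvariant() {
  PolarPoint p(-2., 0.);     // the caller supplies "bad" values ...
  assert(p.invariantHolds());  // ... but the object is always valid
  assert(p.r() == 2.);
  assert(std::fabs(p.phi() - kPi) < 1e-12);
}
```

With public data members, nothing would stop user code from writing a negative radius directly into the object; that is the case in which private data earns its keep.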
This section introduces the
To build and run this exercise, cd to the directory
Look at
Comparing
The
The specifier does not force inlining on the compiler. Why the option? In some cases inlining is a
net positive thing, in other cases it’s a net negative; based on heuristics, the compiler will
determine which, and choose. For some functions, offering the option at all (i.e., including the
specifier
when to use it and when not
to.
Specifying a function as
In the “decline-to-inline” case, the compiler will write a copy of the
function once for each source file in which a definition of the function
appears.
During linking, the copy of the compiled function in the same object file will be used to satisfy
calls to the function. Result: a larger memory footprint, but no reduction in execution
time. Clearly, for a bigger or more complex function, use of the
C++ does not permit you to force inlining; an
The bottom line is that you should always declare simple accessors and simple setters
Look at the definition of the member function
The version of
To build and run this exercise,
This is the same output made by
Relative to
When you define a member function inside the class declaration, the function is implicitly
declared
When you define a member function within the class declaration, you must not prefix the function name with the class name and the scope resolution operator; that is,
double Point::x() const { return x_; }
would produce a compiler diagnostic.
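The two styles of writing inlined member functions can be placed side by side in a short sketch. The class names `PointA` and `PointB` and the function `demoInline` are hypothetical, used here only so that both styles fit in one file. `PointA` defines its member functions outside the class declaration with the `inline` specifier; `PointB` defines them inside the class declaration, where they are implicitly inline and carry no class-name prefix.

```cpp
#include <cassert>

// Style 1: declarations inside the class, definitions outside it,
// each definition marked with the inline specifier.
class PointA {
public:
  PointA(double ax, double ay);
  double x() const;
  double y() const;

private:
  double x_;
  double y_;
};

inline PointA::PointA(double ax, double ay) : x_(ax), y_(ay) {}
inline double PointA::x() const { return x_; }
inline double PointA::y() const { return y_; }

// Style 2: definitions inside the class declaration. These are
// implicitly inline, and the names are written without "PointB::".
class PointB {
public:
  PointB(double ax, double ay) : x_(ax), y_(ay) {}
  double x() const { return x_; }
  double y() const { return y_; }

private:
  double x_;
  double y_;
};

// Hypothetical demonstration function: both styles behave identically.
void demoInline() {
  PointA a(1., 2.);
  PointB b(1., 2.);
  assert(a.x() == b.x());
  assert(a.y() == b.y());
}
```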
In summary, there are two ways to write inlined definitions of member functions. In most cases,
the two are entirely equivalent and the choice is simply a matter of style. The one exception
occurs when you are writing a class that will become part of an
When writing an does the parsing has some limitations
and we need to work around them. The workarounds are easiest if any member function
definitions in the header file are placed outside of the class declarations. For details see
This section illustrates how to write a
Point p0(1,2);
std::cout << p0 << std::endl;
instead of
Point p0(1, 2);
std::cout << "p0: (" << p0.x() << ", " << p0.y() << ")" << std::endl;
To build and run this exercise,
This is the same output made by
Look at
Look at
Look at
In
std::ostream& operator<<(std::ostream& ost, Point const& p);
If the class whose type is used as second argument is declared in a namespace (which it is not,
in this case), then the stream insertion operator must be declared in the same
namespace.
When the compiler sees the use of a
We write
Point p0(1,2), p1(3,4);
std::cout << p0 << " " << p1 << std::endl;
The C++ compiler parses this left to right. First it recognizes the expression
Look at the implementation of the stream insertion operator in
std::ostream& operator<<(std::ostream& ost, Point const& p) {
  ost << "(" << p.x() << ", " << p.y() << ")";
  return ost;
}
The first argument,
In this example, the stream insertion operator does
If you wish to write a stream insertion operator for another class, just follow the pattern used here.
If you want to understand more about why the operator is written the way that it is, consult the standard C++ references; see Section 6.9.
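As a sketch of that pattern applied to a class that lives in a namespace, consider the following. The namespace `demo`, the exact output format, and the function `demoChaining` are assumptions made for this illustration. The operator is declared in the same namespace as the class, as required, and it returns its stream argument so that several insertions can be chained in one expression.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace demo {  // hypothetical namespace, for illustration only

class Point {
public:
  Point(double ax, double ay) : x_(ax), y_(ay) {}
  double x() const { return x_; }
  double y() const { return y_; }

private:
  double x_;
  double y_;
};

// Declared in the same namespace as Point so that the compiler can
// find it when it sees "stream << aPoint".
std::ostream& operator<<(std::ostream& ost, Point const& p) {
  ost << "(" << p.x() << ", " << p.y() << ")";
  return ost;  // returning the stream is what makes chaining work
}

}  // namespace demo

// Hypothetical demonstration function: writes two points to a string
// stream in a single chained expression and returns the result.
std::string demoChaining() {
  demo::Point p0(1, 2), p1(3, 4);
  std::ostringstream os;
  os << p0 << " " << p1;
  assert(os.good());
  return os.str();
}
```

Because the operator takes a `std::ostream&`, the same code works unchanged with `std::cout`, a file stream or, as here, a string stream.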
The stream insertion operator is a
The choice of whether to put the declaration of the stream insertion operator (or any other free
function) into (1) the header file containing a class declaration or (2) its own header file is a
tradeoff between the following two criteria:
The definition of this operator is typically put into the implementation file, rather than being inlined. Such functions are generally poor candidates for inlining.
Ultimately this is a judgment call and the code in this example follows the recommendations
made by the
With one exception, if including a function declaration in
The class
This section lists some recommended C++ references, both text books and online materials.
The following references describe the C++ core language,
The following references describe the C++ Standard Library,
The following contains an introductory tutorial. Many copies of this book are available at the Fermilab library. It is a very good introduction to the big ideas of C++ and Object Oriented Programming but it is not a fast entry point to the C++ skills needed for HEP. It also has not been updated for the current C++ standard.
The following contains a discussion of recommended best practices. It has not been updated for the current C++ standard.
Section 3.6.8 introduced the idea of external products. For the Intensity Frontier experiments (and for Fermilab-based experiments in general), access to external products is provided by a Fermilab-developed product-management package called Unix Product Support (UPS). An important UPS feature – demanded by most experiments as their code evolves – is its support for multiple versions of a product and multiple builds (e.g., for different platforms) per version.
Another notable feature is its capacity to handle multiple databases of products. So, for example,
on Fermilab computers, login scripts (see Section 4.9) set up the UPS system, providing
access to a database of products commonly used at Fermilab.
The
In this chapter you will learn how to see which products UPS makes available, how UPS handles variants of a given product, how you use UPS to initialize a product provided in one of its databases and about the environment variables that UPS defines.
The act of setting up UPS defines a number of environment variables (discussed in Section 7.5),
one of which is
The environment variable
When UPS looks for a product, it checks each directory in
If you are on a Fermilab machine, you can look at the value of PRODUCTS just after logging in,
before sourcing your site-specific setup script. Run
It should have a value of
This generic Fermilab UPS database contains a handful of software products commonly used at
Fermilab; most of these products are used by all of the Intensity Frontier Experiments.
This database does not contain any of the experiment-specific software nor does it
contain products such as
After you source your site-specific setup script, look at
You can see which products
Each directory name in these listings corresponds to the name of a UPS product. If you are on a
different experiment, the precise contents of your experiment’s product directory may be
slightly different. Among other things, both databases contain a subdirectory named
An important feature of UPS is its capacity to make multiple variants of a product available to
users. This of course includes different versions, but beyond that, a given version of a product
may be built more than one way, e.g., for use by different operating systems (what UPS
distinguishes as
The full identifier of a UPS product includes its product name, its version, its flavor and its full set of qualifiers. In Section 7.3, you will see how to fully identify a product when you set it up.
Any given UPS database contains several to many, many products. To select a product and make
it available for use, you use the
In most cases the correct flavor can be automatically detected by
Putting in real-looking values, it would look something like:
What does the
Setting up dependent products works recursively. In this way, a single
When you follow a given site-specific setup procedure, the
Running the right ‘setup’ should work
automatically as long as UPS is properly initialized. If it’s not,
If this happens, the simplest solution is to log out and log in again. Make sure that you carefully follow the instructions for doing the site specific setup procedure.
Few people will need to know more than the above about the UPS system. Those who do can consult the full UPS documentation at:
For some UPS products, but not all, the site administrator may define a particular fully-qualified
version of the product as the default version. In the language of UPS this notion of default is
called the
When you run this, the UPS system will automatically insert the version and qualifiers of the version that has been declared current.
Having a current version is a handy feature for products that add convenience features to your interactive environment; as improvements are added, you automatically get them.
However the notion of a current version is very dangerous if you want to ensure that software
built at one site will build in exactly the same way on all other sites. For this reason, the
Workbook fully specifies the version number and qualifiers of all products that it requires;
and in turn, the products used by the Workbook make fully qualified requests for the products on
which they depend.
When your login script or site-specific setup script initializes UPS, it defines many environment
variables in addition to
In discussing the other important variables, the toyExperiment product will be used as
an example product. For a different product, you would replace “toyExperiment” or
“TOYEXPERIMENT” in the following text by the product’s name. Once you have followed your
appropriate setup procedure (Table 5.1) you can issue the following command (this is
informational for the purposes of this section; you don’t need to do it until you start running the
first Workbook exercise):
The version and qualifiers shown here are the ones to use for the Workbook exercises. When the
Almost all UPS products that you will use in the Workbook define these three environment variables. Several, including toyExperiment, define many more. Once you’re running the exercises, you will be able to see all of the environment variables defined by the toyExperiment product by issuing the following command:
Many software products have version numbers that contain dot characters. UPS requires that
version numbers not contain any dot characters; by convention, version dots are replaced with
underscores. Therefore
The software for the Workbook depends on a large number of external products; the same is true,
on an even larger scale, for the software in your experiment. The preceding sections in this
chapter discussed how to establish a working environment in which all of these software products
are available for use.
When you are working with the code in the Workbook, and when you are working on your experiment, you will frequently encounter C++ classes and functions that come from these external products. An important skill is to be able to identify them when you see them and to be able to follow the clues back to their source and documentation. This section will describe how to do that.
An important aid to finding documentation is the use of
This subsection will use the example of the class
The class that holds the
If you look at code that uses
#include "art/Framework/Principal/Event.h"
The
When you setup the
You can follow this same pattern for any class or function that is part of
If you are new to C++, you will likely find this header file difficult to understand; you do not need to understand it when you first encounter it but, for future reference, you do need to know where to find it.
Earlier in this section, you read that if a C++ file uses
We can summarize this discussion as follows: if a C++ source file uses
Finally, from time to time, you will need to dig through several layers of header files to find the information you need.
There are two code browsing tools that you can use to help navigate the layering of header files
and to help find class declarations that are not in a file named for the class:
(In the above, both URLs are live links.)
Section 3.7 introduced the idea that the Workbook is built around a UPS product named
toyExperiment, which describes a made-up experiment. All classes and functions defined in this
UPS product are defined in the namespace
One of the classes from the toyExperiment UPS product is
#include "toyExperiment/MCDataProducts/GenParticle.h"
As for headers included from
With a few exceptions, discussed in Section 7.6.4, if a class or function from a UPS product is used in the Workbook code, it will obey the following pattern:
Using this information, the name of the header file will always be
This pattern holds for all of the UPS products listed in Table 7.1.
A table listing git- and LXR-based code browsers for many of these UPS products can be found
near the top of the web page:
https://cdcvs.fnal.gov/redmine/projects/art/wiki
There are three exceptions to the pattern described in Section 7.6.3:
The Workbook is so tightly coupled to the toyExperiment UPS product that all classes
in the Workbook are also in its namespace,
The ROOT package is a CERN-supplied software package that is used by
conventions:
The rule for writing an include directive for a header file from ROOT is to write its name without any leading path elements:
#include "TFile.h"
All of the ROOT header files are found in the directory that is pointed to by the environment
variable
Or you can learn about this class using the reference manual at the CERN web site: http://root.cern.ch/root/html534/ClassIndex.html
You will not see the Geant4 package in the Workbook but it will be used by the software for your experiment, so it is described here for completeness. Geant4 is a toolkit for modeling the propagation of particles in electromagnetic fields and for modeling the interactions of particles with matter; it is the core of all detector simulation codes in HEP and is also widely used in both the Medical Imaging community and the Particle Astrophysics community.
As with ROOT, Geant4 was designed before namespaces were a stable part of the C++ language. Therefore Geant4 adopted the following conventions.
Most of the header files defined by Geant4 are found in a single directory, which is pointed to by
the environment variable
The rule for writing an include directive for a header file from Geant4 is to write its name without any leading path elements:
The workbook does not set up a version of Geant4; therefore G4INCLUDE is not defined. If it were, you would look at this file by:
Both ROOT and Geant4 define many thousands of classes, functions and global variables. In
order to avoid collisions with these identifiers, do not define any identifiers that begin with
any of (case-sensitive):
The Workbook exercises can be run in several environments:
Many details of the working environment change from site to site, and these differences are parameterized so that (a) it is easy to establish the required environment, and (b) the Workbook exercises behave the same way at all sites. In this chapter you will learn how to find and log into the right machine remotely from your local machine (laptop or desktop), and make sure it can support your Workbook work.
In order to run the exercises in the Workbook, you will need an account on a machine that can
access your site’s installation of the Workbook code. The experiments provide instructions for
getting computer accounts on their machines (and various other information for new users) on
web pages that they maintain, as listed in Table 8.1. The URLs in the table are live
hyperlinks.
Currently, each of the experiments using
At time of writing, the new-user instructions for all LArSoft-based experiments are at the LArSoft site; there are no separate instructions for each experiment.
If you are planning to take the
If you would like a computer account on a Fermilab computer in order to evaluate
The experiment-specific machines confirmed to host the Workbook code are listed in Table 8.2.
In most cases the name given is not the name of an actual computer, but rather a round-robin alias
for a cluster. For example, if you log into
Each experiment’s web page has instructions on how to log in to its computers from your local machine.
Some of the Workbook exercises will launch an X window from the remote machine that opens
in your local machine. To test that this works, type
This should, without any messages, give you a new command prompt. After a few seconds, a
new shell window should appear on your laptop screen; if you are logging into a Fermilab
computer from a remote site, this may take up to 10 seconds. If the window does not
appear, or if the command issues an error message, contact a computing expert on your
experiment.
To close the new window, type
If you have a problem with Try logging in again with
As you work through the Workbook exercises you will need to edit files. Familiarize yourself
with one of the editors available on the computer that is hosting the Workbook. Most
Fermilab computers offer four reasonable choices: emacs, vi, vim and nedit. Of these,
nedit is probably the most intuitive and user-friendly. All are very powerful once you
have learned to use them. Most other sites offer at least the first three choices. You
can always contact your local system administrator to suggest that other editors be
installed.
In this first exercise of the Workbook, you will be introduced to the
Before running any of the exercises in this Workbook, you need to be familiar enough with the material discussed in Part I (Introduction) of this documentation set and with Chapter 8 to be able to find information as needed.
If you are following the instructions below on an older Mac computer (OSX 10.6, Snow Leopard,
or earlier), and if you are reading the instructions from a PDF file, be aware that if you use the
mouse or trackpad to cut and paste text from the PDF file into your terminal window, the
underscore characters will be turned into spaces. You will have to fix them before the
commands will work.
In this exercise you will learn:
This discussion is aimed to help you understand the process described in this chapter as a whole
and how the pieces fit together in the
At the center of the figure is a box labelled “
One remaining box in the figure (at right, second from bottom) is not encountered in the first
Workbook exercise but has been provided for completeness. In most
The arrows in Figure 9.1 show the direction in which information flows. Everything but the
output flows into the
Several event-data input files have been provided for use by the Workbook exercises. These input files are packaged as part of the toyExperiment UPS product. Table 9.1 lists the range of event IDs found in each file. You will need to refer back to this table as you proceed.
A run-time configuration (FHiCL) file has been provided for each exercise. For Exercise 1 it is
The intent of this section is for the reader to start from “zero” and execute an
Some steps are written as statements, others as commands to issue at the prompt. Notice that
Most readers: Follow the steps in Section 9.6.1.1, then proceed directly to Section 9.7.
If you wish to manage your working directory yourself, skip Section 9.6.1.1, follow the
steps in Section 9.6.1.2, then proceed to Section 9.7.
If you log out and wish to log back in to continue this exercise, follow the procedure outlined in Section 9.6.1.3.
Proceed to Section 9.7.
Proceed to Section 9.7.
If you log out and later wish to log in again to work on this exercise, you need to do the following:
Compare this with the list given in Section 9.6.1. You will see that three steps are missing because they only need to be done the first time.
You are now ready to run
From your working directory, execute
Compare the output you produced (in the file
Every time you run
The file
This file is written in the Fermilab Hierarchical Configuration Language (FHiCL, pronounced
“fickle”), a language that was developed at Fermilab to support run-time configuration for several
projects, including
The full details of the FHiCL language, plus the details of how it is used by Most people will find it much easier to follow the
discussion in the Workbook documentation than to digest the full documentation up
front.
In a FHiCL file, the start of a comment is marked either by the hash sign character (
The hash sign has one other use. If the first eight characters of a line are exactly
The basic element of FHiCL is the
A group of FHiCL definitions delimited by braces {} is called a table; this documentation set will often
refer to a FHiCL table as a parameter set.
The fragment of
The name it
will interpret that parameter set to be the description of the source of events for this run of
Within the
The string
In most cases the filenames in the sequence must be enclosed in quotes. FHiCL, like many other languages, has the following rule: if a string contains white space or any special characters, then quoting it is required; otherwise quotes are optional.
FHiCL has its own set of special characters; these include anything
It is implied in the foregoing discussion that a FHiCL
sequence.
The identifier
The fragment of
At the outermost scope of the FHiCL file,
For our current purposes, the module
where RR, SS and EE are substituted with the actual run, subRun and event number of each
event.
If you look back at Listing 9.1, you will see that this line appears ten times, once each for events
1 through 10 of run 1, subRun 0 (as expected, according to Table 9.1). The remainder of the
listing is standard output generated by
On line 20,
The remainder of the lines in
The
the bash prompt:
Note that some options have both a short form and a long form. This is a common convention for Unix programs; the short form is convenient for interactive use and the long form makes scripts more readable. It is also a common convention that the short form of an option begins with a single dash character, while the long form of an option begins with two dash characters, for example --help above.
By default
Run each of these commands and observe their output.
The second way is within the FHiCL file. Start by making a copy of
Edit
By convention this is added after the fileNames definition but it can go anywhere inside the
source parameter set because the order of parameters within a FHiCL table is not important. Run
You should see output from the
To configure the file to process all of the events in the input files, either leave off the maxEvents parameter or give it a value of -1.
If the maximum number of events is specified both on the command line and in the FHiCL file, then the command line takes precedence. Compare the outputs of the following commands:
For historical reasons, there are multiple ways to specify the input event-data file (or the list of
input files) to an
If input file names are provided both in the FHiCL file and on the command line, the command
line takes precedence.
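The same precedence rule applies to the maximum number of events and to the input file names: a value given on the command line overrides the one in the FHiCL file. The sketch below expresses that rule for the maximum number of events. It is not the framework's implementation; the type `OptionalCount`, the helpers `notSet` and `setTo`, and the function `effectiveMaxEvents` are all hypothetical names, and the use of -1 to mean "process all events" follows the convention described earlier in this section.

```cpp
#include <cassert>

// Hypothetical type for illustration: an integer setting that may or
// may not have been supplied by the user.
struct OptionalCount {
  bool isSet;
  int  value;
};

OptionalCount notSet()     { OptionalCount c = {false, 0}; return c; }
OptionalCount setTo(int n) { OptionalCount c = {true, n};  return c; }

// Convention assumed here: -1 means "process all events".
int const kAllEvents = -1;

// Returns the limit that would be in effect, given what was supplied
// on the command line and what was supplied in the FHiCL file.
int effectiveMaxEvents(OptionalCount commandLine, OptionalCount fhiclFile) {
  if (commandLine.isSet) return commandLine.value;  // command line wins
  if (fhiclFile.isSet)   return fhiclFile.value;    // else the FHiCL file
  return kAllEvents;                                // else no limit
}

// Hypothetical demonstration function.
void demoPrecedence() {
  // Both supplied: the command line takes precedence.
  assert(effectiveMaxEvents(setTo(4), setTo(5)) == 4);
  // Only the FHiCL file supplies a value.
  assert(effectiveMaxEvents(notSet(), setTo(5)) == 5);
  // Neither supplies a value: all events are processed.
  assert(effectiveMaxEvents(notSet(), notSet()) == kAllEvents);
}
```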
Let’s run a few examples.
We’ll start with the
To see what you should expect given the following input file, check Table 9.1, then run:
Notice that the ten events in this output are from run 2 subRun 0, in contrast to the previous
printout which showed events from run 1. Notice also that the command line specification
overrode that in the FHiCL file. The
This time, edit the source parameter set inside the
(Notice that you also added
Check Table 9.1 to see what you should expect, then rerun
You will see 20 lines from the
Back to the
This will read only
There are several ways to specify multiple files at the command line. One choice is to use the
(upper case) [
Now run
You should see the
Finally, you can list the input files at the end of the
(Remember the Unix convention about a trailing backslash marking a command that continues
on another line; see Chapter 2.) In this case you should see the
In summary, there are three ways to specify input files from the command line; all of them
override any input files specified in the FHiCL file. Do not try to use two or more of these
methods on a single
The
An equivalent operation can be done from the command line in two different ways. Try the following two commands and compare the output:
You can also specify the initial event to process relative to a given event ID (which, recall,
contains the run, subRun and event number). Edit
When you run this job,
Recall from Section 9.8.2 that the
Look back at the listing on page 207, which contains the
The identifier
Here
Module labels are fully described in Section 24.5.
In this example
Now look at the FHiCL fragment below that starts with
once with a tight mass cut on the same intermediate state. In
When
and lower-case letters and the numerals 0 to 9.
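A quick way to check a candidate module label is with a regular expression. The sketch below assumes that the permitted characters are exactly the upper- and lower-case letters and the numerals 0 to 9; the function name is_valid_label is hypothetical and art's own validation may apply additional rules.

```python
import re

# Assumption: a label is one or more ASCII letters or digits, nothing else.
_LABEL_PATTERN = re.compile(r"[A-Za-z0-9]+")


def is_valid_label(label):
    """Return True if label uses only letters and the numerals 0 to 9."""
    return _LABEL_PATTERN.fullmatch(label) is not None


for candidate in ["hello1", "tightCut", "tight_cut", "loose-cut", ""]:
    print(repr(candidate), is_valid_label(candidate))
```

Under this assumption a label containing an underscore or a dash is rejected.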
In the FHiCL files in this exercise, all of the modules are analyzer modules. Since analyzers do
not make data products, these module labels are nothing more than identifiers inside the FHiCL
file. For producer modules, however, which
Within
In the
run them.
In this exercise there is only one module to run (the analyzer
The FHiCL parameter
An
many groups are working on a common project, this helps to maximize the independence of each
work group.
Recall from Section
Note that any path listed in the A similar mechanism is used to specify the workflow of
producer and filter modules; that mechanism will be discussed when you encounter it. If you
need a reminder about the types of modules, see Section 3.6.3.
If the
both of which are allowable,
As is standard in FHiCL, if the definition of
The notion of
The above description is intended to be sufficient for completing the Workbook exercises. If you want to learn more, now or later, Section 9.8.8.1 provides more detail.
This section is optional; it contains more details about the material just described in
Section 9.8.8. It is not really a “dangerous bend” section for experts — just a side
trip.
Exercise 1 is not rich enough to illustrate how to specify an
Suppose that there are two groups of people working on a large collaborative project, and that the
project leaders are Anne and Rob. Each group has a workflow that requires running five or six module
instances; some of the module instances may be in the workflow for both groups. Recall that an
instance of a module refers to the name of a module plus its parameter set, and a module instance
is specified by giving its module label. For this example let’s have eight module instances with
the unimaginative names
That is, Anne defines the modules that her group needs to run and Rob defines the modules that
his group needs to run. Anne and Rob do not need to know anything about each other’s list. The
parameter definitions
The parameter named
The above machinery probably seems a little heavyweight for the example given. But
consider a workflow like that needed to design the trigger for the CMS experiment,
which requires about 200 paths and many hundreds of modules. Finding the set of
unique module labels is not a task that is best done by hand! By introducing the idea of
paths, the design allows each group to focus on its own work, unaffected by the other
groups.
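To make the bookkeeping concrete, the sketch below collects the unique module labels from several paths, so that each module instance named in more than one path appears only once. The path names and the single-letter labels are hypothetical, and the function is a model of the idea rather than art's implementation.

```python
def unique_module_labels(paths):
    """Return each module label once, in order of first appearance.

    paths -- dict mapping a path name to its list of module labels
    """
    seen = []
    for labels in paths.values():
        for label in labels:
            if label not in seen:  # skip labels already listed by another path
                seen.append(label)
    return seen


# Hypothetical paths: the two groups share modules c, d and e.
paths = {
    "pathForAnne": ["a", "b", "c", "d", "e"],
    "pathForRob": ["c", "d", "e", "f", "g", "h"],
}

print(unique_module_labels(paths))
# prints ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```

Each group edits only its own path; the combined list is derived automatically, which is what makes the scheme manageable with hundreds of paths.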
Actually, the above is only part of the story: the module labels given in the paths
To illustrate this parallel mechanism let’s continue the above example of two work groups led by
Rob and Anne. In this case let there be filter modules with labels given by,
Here the parameters
During
Now, what happens if you define a path with a mix of modules from the two groups? It might look like this:
In this case
Furthermore, if you put a module label into either
Now it’s time to define two really badly chosen
names:
Similarly the paths prefixed with
This documentation will try to avoid confusion between
If the
When more than one
The rules for order of execution of module instances named in an
analyzer and output modules may neither add new information to the event nor communicate
with each other except via the event, the processing order is not important. By definition, then,
appearance in the path,
but do not write code that depends on execution order because
The file
Output files are written by output modules; one module can write one file. An
If you wish to add an output module to an
If you wish to add more output modules, repeat steps 2 and 3 for each additional output file.
The parameter set
of the output file; this parameter is processed by the
In the example of
Before running the exercise, look at the source parameter set of
To run
The first command will write the output file; the second will check that the output file was
created and will tell you its size; the last one will read back the output file and print the event IDs
for all of the events in the file. You should see the
Section 9.6.1 contained a list of steps needed to run this exercise; this section will
describe each of those steps in detail. When you understand what is done in these
steps, you will understand
Steps 1 and 4 should be self-explanatory and will not be discussed further.
When reading this section, you do not need to run any of the commands given here; this is
commentary on commands that you have already run.
The site-specific setup procedure, described in Chapter 5, ensures that the UPS system is
properly initialized and that the UPS database (containing all of the UPS products
needed to run the Workbook exercises) is present in the
This procedure also defines two environment variables, whose values are chosen by your experiment, that allow you to run the Workbook exercises on its computer(s):
If these environment variables are not defined, ask a system admin on your experiment.
On the Fermilab computers the home disk areas are quite small so most experiments ask that their collaborators work in some other disk space. This is common to sites in general, so we recommend working in a separate space as a best practice. The Workbook is designed to require it.
This step, shown on two lines as:
creates a new directory to use as your working directory. It is defined relative to an environment variable described in Section 9.9.1. It only needs to be done the first time that you log in to work on Workbook exercises.
If you follow the rest of the naming scheme, you will guarantee that you have no conflicts with other parts of the Workbook.
As discussed in Section 9.6.1.2, you may of course choose your own working directory on any disk that has adequate disk space.
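The idea of a working directory defined relative to an environment variable can be sketched in Python as follows. The variable name DEMO_WORKING_BASE and the subdirectory names are hypothetical stand-ins, not the names used by the Workbook, and a temporary directory plays the role of the experiment's disk space so that the sketch can be run anywhere.

```python
import os
import tempfile


def make_working_dir(base_var, *subdirs, environ=os.environ):
    """Create, if needed, a directory below the one named by base_var."""
    base = environ[base_var]               # KeyError if the variable is unset
    path = os.path.join(base, *subdirs)
    os.makedirs(path, exist_ok=True)       # safe to repeat on later logins
    return path


with tempfile.TemporaryDirectory() as scratch:
    # Hypothetical variable name; the scratch area stands in for real disk space.
    env = {"DEMO_WORKING_BASE": scratch}
    workdir = make_working_dir("DEMO_WORKING_BASE", "username", "workbook",
                               environ=env)
    print(os.path.isdir(workdir))
```

Because the directory is created only if it does not already exist, repeating the step does no harm, although it is needed only once.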
This step is the main event in the eight-step process.
This command tells UPS to find a product named toyExperiment, with the specified version and
qualifiers, and to
The required qualifiers may change from one experiment to another and even from one site to
another within the same experiment. To deal with this, the site-specific setup procedure defines
the environment variable
The complete UPS qualifier for toyExperiment has two components, separated by a colon: the
string defined by
Each version of the toyExperiment product knows that it requires a particular version and
qualifier of the
If you are interested, you can inspect your environment before and after doing this setup. To do this, log out and log in again. Before doing the setup, run the following commands:
Then
The step:
needs to be done only the first time that you log in to work on the Workbook.
In this step you copied the files that you will use for the exercises into your current working directory. You should see these files:
This step:
needs to be done only the first time that you log in to work on the Workbook. It created
some symbolic links that
The FHiCL files used in the Workbook exercises look for their input files in the subdirectory
in which the necessary input files are found.
This script also ensures that there is an output directory that you can write into when you run the
exercises and adds a symbolic link from the current working directory to this output directory.
The output directory is made under the directory
Issuing the command:
runs the
When you ran
It looked at the environment variable
The output should look similar to that shown in Listing 9.3.