I Appendix A: Local metadatabase structure and principles
I.1 Metadatabase for easy data access
I.1.1 Principle and structure
From the user's point of view, the analysis environment should give simple
access to any vector or frame contained in a set of frame files, even remotely.
It should also be possible to access a vector without caring about frame
boundaries, and to make accesses conditioned by a trigger, a slow monitoring
value or a user-defined condition. Furthermore, one should be able to manage a
set of files that is as big as possible (of the order of 1 TB or more).
The option chosen in VEGA to solve this problem is to build a database that
contains metadata about the frames and indexes them. The structure of the
database is shown in the following figure:
The complete frame information is kept in the frame files, while the database
serves as an index to access them. The access is a multistep process. First,
one gives a frame channel the starting time of the vector one wants. This
starting time is passed to the metadatabase, which uses what is called a "time
hash table" to determine the frame files that possibly contain the desired
information. Then, only the metadata corresponding to these files are searched.
The matching metadata is sent back to the frame channel, which accesses the
desired frame in the relevant file, or reconstructs the desired vector, and
gives it to the user.
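The first step of this lookup can be illustrated with a minimal standalone C++
sketch. It is not VEGA's implementation: the names FileEntry, TimeHashTable and
candidateFiles are hypothetical, and the scheme used here (fixed-width time
buckets pointing to file indices) is only one plausible reading of what a "time
hash table" could be.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one indexed frame file (not a VEGA class).
struct FileEntry {
    std::string name;
    double start;  // start time of the first frame in the file (s)
    double end;    // end time of the last frame in the file (s)
};

// Hypothetical time hash table: maps fixed-width time buckets to the
// indices of the files whose time span touches that bucket.
class TimeHashTable {
public:
    explicit TimeHashTable(double bucketWidth) : width_(bucketWidth) {}

    // Register a file under every bucket that its time span touches.
    void add(const FileEntry& file) {
        files_.push_back(file);
        const std::size_t index = files_.size() - 1;
        for (long b = bucket(file.start); b <= bucket(file.end); ++b) {
            buckets_[b].push_back(index);
        }
    }

    // Files that possibly contain the requested time. Only the metadata
    // of these candidates would then need to be searched.
    std::vector<std::string> candidateFiles(double time) const {
        std::vector<std::string> result;
        const auto it = buckets_.find(bucket(time));
        if (it == buckets_.end()) return result;
        for (std::size_t index : it->second) {
            result.push_back(files_[index].name);
        }
        return result;
    }

private:
    long bucket(double time) const {
        return static_cast<long>(std::floor(time / width_));
    }

    double width_;
    std::vector<FileEntry> files_;
    std::map<long, std::vector<std::size_t>> buckets_;
};
```

The table only narrows the search: a bucket may list a file that does not
actually cover the requested time, which is why the metadata of the candidate
files still has to be searched afterwards.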
The metadata is kept in a container called Tree, introduced in ROOT and
specifically designed to handle large amounts of data and access them quickly.
The tree structure can also contain conditions for frame access.
To create a local database, we need to create a database object of class
VFrameMetaDB. The database can be built at the same time by searching for frame
files. For example, in
vega[] vd = new VFrameMetaDB("demoDB.root","CREATE","./")
we call the VFrameMetaDB constructor.
The first parameter is the name of the database file.
The second parameter is the mode in which the database is opened. Here it is
opened in "CREATE" mode, since we want to build it.
The third parameter is the path to the directory containing the frame files.
Here it is "./", meaning the current directory. Any path can be given, and
particular files or directories can be specified with wildcards.
The search is recursive by default: all subdirectories are searched as well. To
turn this option off, add a last parameter and specify "S".
That's it! The time needed to index the files depends on their number and size.
Indexing typically consists of a sequential read of all the files, so it should
not be much slower than the speed of the connection to the disk containing the
directories allows.
I.1.3 Access to metadata
I.1.3.1 Opening an existing database
An existing metadatabase is opened by using the constructor in "READ" mode,
which is the default:
vega[] vd = new VFrameMetaDB("./demoDB.root")
Replace "./demoDB.root" with the path and name of your local database.
I.1.3.2 Extracting metadata
Metadata are extracted from the database with
void GetMetaData(VMetaData* meta, Double_t time)
This method extracts the metadata at an absolute time, as in
vega[] vd->GetMetaData(meta, time)
where time is a double expressing a time that is contained in the frame and
meta is a pointer to a VMetaData metadata object, created by the user, to be
filled. The method fills the meta object with the right information. In this
way, the user always keeps control of the creation and deletion of the metadata
objects.
Once one metadata has been extracted, it is possible to extract the next or the
preceding one in the database:
GetNextMeta(VMetaData* meta)
For example:
vega[] vd->GetNextMeta(meta)
or
GetPreviousMeta(VMetaData* meta)
For example:
vega[] vd->GetPreviousMeta(meta)
where meta is a pointer to a VMetaData metadata object, created by the user, to
be filled. These methods fill the meta object with the right information. In
this way, the user always keeps control of the creation and deletion of the
metadata objects.
Be careful: while GetFrame checks that the requested time is really contained
in one of the frames of the database, GetNextMeta and GetPreviousMeta do not.
They simply return the metadata that is next (or previous) in time in the
database, even if it is years away.
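Because of this, code that walks through the database may want to check for
gaps itself. The following standalone C++ sketch shows one way to do so. It
assumes that the start time and duration of a frame can be read from its
metadata; the MetaSpan structure and the isContiguous function are hypothetical
stand-ins and not part of VEGA.

```cpp
#include <cmath>

// Hypothetical stand-in for the time span described by one metadata entry.
// Plain fields are used here because no VMetaData accessors are assumed.
struct MetaSpan {
    double start;     // start time of the frame (s)
    double duration;  // length of the frame (s)
};

// Returns true when `next` begins where `current` ends, within `tolerance`
// seconds. A false result means there is a gap (or an overlap) between them.
bool isContiguous(const MetaSpan& current, const MetaSpan& next,
                  double tolerance = 1e-6) {
    const double expectedStart = current.start + current.duration;
    return std::fabs(next.start - expectedStart) <= tolerance;
}
```

A loop over successive metadata could call such a check after each step and
stop, or start a new vector, whenever it returns false.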
I.1.4 Getting general information about the metadatabase and its contents
I.1.4.1 Getting the start time of the metadatabase
To get the start time of the first frame indexed in the metadatabase, use the
VFrameMetaDB method
double GetStart()
For example, if vd is a valid metadatabase,
vega[] vd->GetStart()
will output this start time, while
vega[] st = vd->GetStart()
will put it in a double that may be reused.
I.1.4.2 Printing information about the metadatabase
One can get some information about the metadatabase by using the VFrameMetaDB
method
double Print()
For example, if vd is a valid metadatabase,
vega[] vd->Print()
will show a rather extensive output of the metadatabase content.
I.2 Dealing with selected or triggered data
I.2.1 Condition information in the metadatabase
When the metadatabase is built by reading the frame files, the trigger
information contained in each frame is retrieved and arranged so as to allow
easy extraction of the frames or vectors satisfying some selection.
The triggers (structures of type FrTrigData in the FrameLib) are converted to
more general objects called conditions. This will make it possible, in the
future, to use other information as conditions, such as slow monitoring data or
quality information. Users may even add their own conditions without copying
the whole files just to add a FrTrigData structure. A simple scheme of the
database was given above. This scheme is now enhanced by the addition of
condition trees and an index for fast access:
I.2.2 Extracting metadata with a condition
There are two ways of extracting metadata that satisfy a given condition:
directly or sequentially. The direct method is used to extract a particular
frame whose approximate time is known. The sequential method is used to process
or view, one after the other, all the frames that correspond to a given
selection expression.
I.2.2.1 Direct methods
The method of VFrameMetaDB that allows a direct conditioned access to metadata
is:
GetNextFrame(VMetaData* meta, Double_t time, char* selection)
This method extracts a metadata at an absolute time. "selection" is a selection
expression referring to conditions existing in the database, such as
"Trig1.amp>50 && Trig2.amp>2".
The search begins from time "time". The metadata "meta" is filled with
information from the metadata containing the start time of the first interval
satisfying the selection expression. In this way, the user always keeps control
of the creation and deletion of the metadata objects.
I.2.2.2 Sequential methods
In order to access sequentially all the metadata of interest satisfying a
selection expression, one needs an object that points to the intervals of
interest. This is the role of condition sets. Once a condition set has been
defined, one can use it to extract the metadata recorded at the corresponding
times. The methods of VFrameMetaDB to do so are:
This method extracts a metadata at an absolute time. "condset" is a condition
set defined as explained in the paragraph "Condition Sets".
Example:
vega[] vd->GetNextMeta(meta, condset)
The search is governed by the condition set "condset". The metadata returned is
the one containing the start time of the next interval pointed to by "condset".
These methods have to be called sequentially in a loop.
This method fills the meta object with the right information. In this way, the
user always keeps control of the creation and deletion of the metadata objects.
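The sequential access pattern can be illustrated with a minimal standalone C++
sketch. It only models the idea and does not use the VEGA classes: FrameSpan,
ConditionSet and nextSelectedFrame are hypothetical names, and the condition
set is assumed to be a sorted list of interval start times with a cursor
remembering the position reached.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical time span of one indexed frame (stand-in for the metadata).
struct FrameSpan {
    double start;  // start time of the frame (s)
    double end;    // end time of the frame (s), exclusive
};

// Hypothetical condition set: sorted start times of the intervals that
// satisfy a selection, plus a cursor to the next interval to visit.
struct ConditionSet {
    std::vector<double> intervalStarts;
    std::size_t cursor = 0;
};

// Fills `meta` with the frame containing the start time of the next
// interval of `condset` and advances the cursor. Intervals that start
// outside every frame are skipped. Returns false when none is left.
bool nextSelectedFrame(const std::vector<FrameSpan>& frames,
                       ConditionSet& condset, FrameSpan& meta) {
    while (condset.cursor < condset.intervalStarts.size()) {
        const double t = condset.intervalStarts[condset.cursor++];
        for (const FrameSpan& frame : frames) {
            if (t >= frame.start && t < frame.end) {
                meta = frame;  // the caller owns `meta`; it is only filled
                return true;
            }
        }
    }
    return false;
}
```

In this sketch a typical use is a loop of the form
while (nextSelectedFrame(frames, condset, meta)) { ... }, where the caller
creates meta once and reuses it, in the same spirit as the user-owned metadata
objects described above.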