Data Management Services

Data Management is provided by the EDG Replica Management Service (RMS) and the LCG data management tools.

Grid files are known by various names. Arguably the most important is the Grid Unique IDentifier (GUID), which is based on the UUID standard and is guaranteed to be unique. All of the replicas of a file have the same GUID. The general form of the GUID is:

   guid:[unique_string]

Although a file can be located via its GUID, a method more intuitive to humans uses the Logical File Name (LFN), which has the form:

   lfn:[any_alias]

To locate where a file is physically stored, the RMS uses the Storage URL (SURL). This takes the general form:

   sfn:[SE_hostname]/[local_string]

Once the location of the file has been determined, the RMS uses the Transport URL (TURL) to retrieve the file. The TURL is composed of the hostname, path, protocol and port.

To achieve all of this, the RMS provides the Replica Location Service (RLS) and the Replica Metadata Catalog (RMC). The RLS maintains information about the physical location of the replicas, whereas the RMC stores the mapping between GUIDs and their aliases (LFNs). In addition, the RMC maintains metadata (e.g. sizes, dates, ownership).

The final component of the Data Management system, the Replica Manager, provides a single interface to the RMS for the user or for another grid service (e.g. the resource broker).

NB:
(a) The Replica Manager is integrated with the user interface component of the LCG middleware. Although some of its functionality has been replaced by (faster) LCG data management tools, the resource broker still requires the older EDG interface.
(b) "For the moment these catalogues are centralized and there is one RLS per VO. In the first phase, all RLSs are run at CERN" - verbatim from the manual.
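
The relationship between the names described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not the EDG or LCG API: the `classify` function, the `ToyCatalog` class, its method names and the `example.org` hostnames are all hypothetical. It assumes the three prefixes given above (`guid:`, `lfn:`, `sfn:`), with one dictionary standing in for the RMC (LFN to GUID) and one for the RLS (GUID to SURLs).

```python
import uuid
from dataclasses import dataclass, field

# Prefixes taken from the name forms given in the text.
PREFIXES = {"guid": "GUID", "lfn": "LFN", "sfn": "SURL"}


def classify(name: str) -> tuple[str, str]:
    """Split a grid file name into (kind, value) based on its prefix."""
    prefix, sep, value = name.partition(":")
    if not sep or not value or prefix not in PREFIXES:
        raise ValueError(f"not a recognised grid file name: {name!r}")
    return PREFIXES[prefix], value


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for the two catalogues (not a real client)."""
    rmc: dict[str, str] = field(default_factory=dict)        # LFN  -> GUID
    rls: dict[str, list[str]] = field(default_factory=dict)  # GUID -> SURLs

    def register(self, surl: str, lfn: str) -> str:
        """Record a new file: mint a GUID, store its first replica and alias."""
        if classify(surl)[0] != "SURL" or classify(lfn)[0] != "LFN":
            raise ValueError("expected an sfn: name and an lfn: name")
        guid = f"guid:{uuid.uuid4()}"
        self.rls[guid] = [surl]
        self.rmc[lfn] = guid
        return guid

    def add_replica(self, guid: str, surl: str) -> None:
        """Attach another physical copy to an existing GUID."""
        if classify(surl)[0] != "SURL":
            raise ValueError("expected an sfn: name")
        self.rls[guid].append(surl)  # KeyError if the GUID is unknown

    def locate(self, name: str) -> list[str]:
        """Resolve an LFN or GUID to the SURLs of all its replicas."""
        kind, _ = classify(name)
        if kind == "SURL":
            return [name]
        guid = self.rmc[name] if kind == "LFN" else name
        return list(self.rls[guid])


if __name__ == "__main__":
    catalog = ToyCatalog()
    guid = catalog.register("sfn:se1.example.org/data/run42.dat", "lfn:run42")
    catalog.add_replica(guid, "sfn:se2.example.org/store/run42.dat")
    # Both replicas share one GUID, so either name resolves to the same list.
    print(catalog.locate("lfn:run42"))
    print(catalog.locate(guid))
```

The sketch stops at the SURL: turning a SURL into a TURL depends on the storage element and the transfer protocol, which the model does not attempt to represent.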