CMS data ordering by Rucio
CMS uses Rucio for data management across sites. Our Tier-3 site has the Rucio name T3_CH_PSI; the Swiss Tier-2 is named T2_CH_CSCS accordingly. Learn more about Rucio in the CMS user documentation.
Users can submit requests through Rucio. Requests need to fit into the available space and policies of the Tier-3. Try to reduce the size of the data by not ordering full datasets.
We convert each approved request made under your personal Rucio account into a new request owned by the special Rucio account t3_ch_psi_local_users. We do this because only the requesting user or a central Rucio manager can change a rule once it is approved. For example, an approved rule without an expiry date would stay in the system forever if the user left without removing it, and we would need an escalation procedure to get rid of it. Converting your requests into local rules of this special site account lets our local data managers manage the existing Rucio rules when storage needs grow.
You will get a mail from us with the IDs of the converted rules. Your original rule request will be denied, and Rucio will send you a mail about it, but that mail will give the following reason:
"Request approved but converted to a local rule of t3_ch_psi_local_users. You will receive the new rule by mail."
Alternatively, you can send a list of datasets or blocks, together with the intended expiry date, to the admin mailing list cms-tier3@lists.psi.ch, and we will order them on your behalf.
Important for ordering Rucio data to the Tier-3
- You must specify an expiry date. Rules without an expiry date will not be accepted. Please pick a lifetime of at most three months.
- Transfers are fast: TBs can be transferred to the Tier-3 within hours if the data are on disk somewhere on the Grid (more waiting time is needed if the data first have to be staged from tape). Data can therefore be refetched quickly after the expiry time has been reached.
- The Tier-3 has limited storage (~1.5 PB), and most of it is consumed by user data. Please limit the volume that you bring to the Tier-3. Don't fetch full datasets unless you really need them.
- The CSCS Tier-2 is reasonably well connected (just 5 ms latency), so you can also bring data there and process it remotely from the Tier-3 or from other Grid sites, since the Tier-2 offers other sites better bandwidth than the Tier-3 does.
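
The expiry rule above can be checked before a request is submitted. The following is a minimal sketch, not an official tool: it converts an expiry date into a lifetime in seconds and assembles the arguments for a `rucio add-rule` call. It assumes that "three months" can be approximated as 90 days, that the Rucio command-line client accepts `--lifetime` (in seconds) and `--ask-approval` as described in its help text (please verify with `rucio add-rule --help`), and that CMS data live in the `cms` scope. The dataset name and the helper names `lifetime_seconds` and `add_rule_command` are hypothetical.

```python
import shlex
from datetime import date
from typing import List

# Assumption: the "three months" limit is approximated as 90 days.
MAX_LIFETIME_DAYS = 90


def lifetime_seconds(expiry: date, today: date) -> int:
    """Return the rule lifetime in seconds, rejecting dates outside the policy."""
    days = (expiry - today).days
    if days <= 0:
        raise ValueError("expiry date must be in the future")
    if days > MAX_LIFETIME_DAYS:
        raise ValueError(
            f"expiry is {days} days away; the limit is {MAX_LIFETIME_DAYS} days"
        )
    return days * 24 * 60 * 60


def add_rule_command(did: str, rse: str, expiry: date, today: date) -> List[str]:
    """Build the argument list for one replica of `did` at `rse`.

    Nothing is executed here; the list can be inspected and then run by hand
    or passed to subprocess.run().
    """
    return [
        "rucio", "add-rule",
        "--lifetime", str(lifetime_seconds(expiry, today)),
        "--ask-approval",
        did, "1", rse,
    ]


if __name__ == "__main__":
    # Hypothetical dataset name, for illustration only.
    command = add_rule_command(
        "cms:/HypotheticalPrimary/HypotheticalRun-v1/NANOAOD",
        "T3_CH_PSI",
        expiry=date(2024, 3, 1),
        today=date(2024, 1, 15),
    )
    print(shlex.join(command))
```

The sketch only prints the command, so the request can be reviewed before it is actually sent to Rucio.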