Difference between revisions of "BwDataArchiv FAQs"

From Lsdf
Line 25:
 
* We store an MD5 checksum for every file. When the file is read, the checksum is computed again and compared with the stored checksum. If they do not match, the file will not be delivered to the user. For detailed information see https://www.rda.kit.edu/img/FAQ-bwDataArchiv%20Data%20Protection%20%20-%20V2.pdf <br /> In addition, at a more basic level the data on disk and on tape is protected with checksums.
   
==Help==
+
==Assistance==
 
I have a question. Who do I ask?
 
 
* Support and help https://www.rda.kit.edu/english/65.php
 
Line 43:
 
==Preparations and usage==
 
 
Do you have recommendations regarding the size of the data
 
* The objects you store should be as large as possible for the following reasons
+
* The objects you store should be as large as possible. But what is large, you ask?
   
 
I want to routinely create and validate checksums of large amounts of files
 

Revision as of 17:11, 1 December 2016

For whom, for what?

Who are the designated users?

  • The service offers three service models. Your home organisation can apply for a contract with bwDataArchiv. It is up to your home organisation to decide who has access and is allowed to store data. Universities in Baden-Wuerttemberg typically have an IDP which forwards the entitlement for users entitled to use the archive (idp-service-model). Organisations outside the BWIDM federation designate an administrator who can invite users to register for the service using an advanced administration portal (admin-service-model). A third operational model is the sa-service-model, in which a single service account (sa) from an organisation or project is used to allow an application to store data to the archive. This is used, for example, by the RADAR and bwDataDiss projects.

What's up with the bw in bwDataArchiv?

  • bw stands for Baden-Wuerttemberg, the state that KIT is located in. The initial investments and the development of the service were funded by the Ministry of Arts and Science in Baden-Wuerttemberg. The name bwDataArchiv should be self-explanatory, except maybe for the missing 'e' in Archiv, which is the German word for archive. The English name of the bwDataArchiv brand is RDA, the research data archive (with an e :-)).

Technology

What storage technologies do you use?

What is HPSS?

  • HPSS is a data management application that is being developed at several computer centres that require long-term storage for large amounts of data. See the HPSS web site for more detailed information.

How is the data secured?

  • Data is stored on magnetic tape. We use the following tape drives and technologies: LTO5 (max. 1.5 TB per cartridge), STK 10kC (max. 4 TB per cartridge) and STK 10kD (max. 8 TB per cartridge), IBM TS1140 (max. 4.5 TB per cartridge)

Features

I have a suggestion for improvement. What is the award if it gets implemented?

  • You will be named on the bwDataArchiv Hall of Fame pages and are eligible for 10 years of 1 TB of free storage.

How many copies of the data are made and where are they stored?

  • All data in bwDataArchiv has at least 2 copies. Data is moved to disk and from there duplicated to tape storage. There are tape libraries in two data centres at KIT Campus North (CN) as well as at Campus South (CS).

How long will the data remain in the archive?

  • The regular retention time for files on bwDataArchiv is ten years. After this ten-year period bwDataArchiv will delete your data. A warning message is sent 6 months ahead to the registered email addresses. (This is probably the biggest reason to keep at least one of the two possible email addresses up to date.) Contact us at least three months in advance to have the retention time extended. If you want to terminate the cooperation with bwDataArchiv, or if your data is no longer needed, you can delete your data yourself.

How can I make sure my data did not change? Do you support checksums?
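
The dates mentioned above can be worked out with a little calendar arithmetic. The following is a minimal sketch, assuming the ten-year period is counted from the day a file was stored (the FAQ does not state the exact starting point). The function names `add_months` and `retention_timeline` are hypothetical and not part of any bwDataArchiv tool.

```python
import calendar
from datetime import date

# Values taken from the FAQ answer above.
RETENTION_YEARS = 10   # regular retention time
WARNING_MONTHS = 6     # warning mail is sent this long before expiry
CONTACT_MONTHS = 3     # latest point to ask for an extension


def add_months(d, months):
    """Shift a date by a (possibly negative) number of months.

    The day is clamped to the last day of the target month,
    e.g. 29 February plus ten years becomes 28 February.
    """
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def retention_timeline(stored_on):
    """Return the key dates for a file stored on `stored_on`.

    Assumption: the retention period starts on the storage date.
    """
    expiry = add_months(stored_on, 12 * RETENTION_YEARS)
    return {
        "expiry": expiry,
        "warning_mail": add_months(expiry, -WARNING_MONTHS),
        "contact_deadline": add_months(expiry, -CONTACT_MONTHS),
    }
```

For a file stored on 1 December 2016, `retention_timeline(date(2016, 12, 1))` gives an expiry of 1 December 2026, a warning mail around 1 June 2026 and a contact deadline of 1 September 2026.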

Assistance

I have a question. Who do I ask?

I did everything right. Still my client cannot access the archive. What could be wrong?

  • Please contact bwDataArchiv by email or, if you are a user from Baden-Württemberg, alternatively via the Baden-Württemberg support portal https://bw-support.scc.kit.edu/. Describe your problem and what you have done, and attach screenshots where helpful.

User Registration

Where do I register for the service?

I have registered but still cannot access the service. What is wrong?

  • Maybe the registration workflow did not finish completely. This can happen because of network errors or unexpected browser behaviour. Go to https://bwidm.scc.kit.edu/user/index.xhtml, log in with your credentials and unregister from the service. Then register again. You will receive an email after you have registered successfully.

I lost my password

Why do I need a different password for the archive? Can't I use the one I use at my home institution?

  • The data will stay in the archive for at least 10 years. By that time you may have left the organisation while your data is still there.

Preparations and usage

Do you have recommendations regarding the size of the data?

  • The objects you store should be as large as possible. But what is large, you ask?
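
One common way to get fewer, larger objects is to bundle many small files into a single archive file before uploading. The sketch below uses Python's standard `tarfile` module. It is an illustration only: the function name `bundle_directory` is hypothetical, and the FAQ does not prescribe a container format or a target size.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack all regular files below `source_dir` into one uncompressed tar file.

    Returns the number of files added. `archive_path` should lie outside
    `source_dir`, otherwise the archive would try to include itself.
    """
    source = Path(source_dir)
    count = 0
    with tarfile.open(archive_path, "w") as tar:
        # Sorted order keeps the member order stable between runs.
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the source directory.
                tar.add(path, arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

A call such as `bundle_directory("results/run42", "run42.tar")` (both paths are made up) would produce one large object to store instead of many small ones.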

I want to routinely create and validate checksums of large amounts of files
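
A possible local approach, sketched here under assumptions and not an official bwDataArchiv tool: compute a checksum for every file in a directory tree, keep the result as a manifest, and later recompute and compare. MD5 is used because the archive stores MD5 checksums; the function names `md5_of_file`, `create_manifest` and `validate_manifest` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_manifest(root):
    """Map each file's path (relative to `root`) to its MD5 checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): md5_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def validate_manifest(root, manifest):
    """Return the relative paths that are missing or whose checksum changed."""
    root = Path(root)
    mismatches = []
    for relative_path, expected in manifest.items():
        path = root / relative_path
        if not path.is_file() or md5_of_file(path) != expected:
            mismatches.append(relative_path)
    return mismatches
```

The manifest is a plain dictionary, so it can be saved with the `json` module before upload and loaded again after download. An empty list from `validate_manifest` means every listed file is present and unchanged.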


Transfer Data

What protocols do you support for uploading and downloading data?

Read Data

Accessing my data takes a long time. Why?

  • Long response times may be due to several reasons:
    • Was your data stored a long time ago? Then it is probably no longer on disk and has to be copied in from tape. This may take up to several hours, depending on the current archive data traffic.
    • Retrieval of lots of small files takes longer than of a few large files.
    • Something is broken (but we are fixing it).

Delete Data

I deleted [a file, some files, a directory, my files]. Can I recover the lost data?

  • Straight answer: No. Technical answer: maybe. Operational answer: it depends. Here is the deal: deleted data is kept around for some time in a trashcan. However, the current trashcan implementation has some limitations. Therefore, don't count on a full rescue. Contact us and we will do our best to help.