Latest revision as of 02:06, 15 May 2015

S3 on WOS

The REST S3 API [1] (http://docs.aws.amazon.com/AmazonS3/latest/API/APIRest.html) was introduced with the Amazon S3 online storage service and has since become the de facto standard for cloud storage.

WOS (Web Object Scaler) is a cloud-based storage technology offered by DDN (DataDirect Networks). DDN provides a REST API compatible with the Amazon S3 API.

Access to the cloud storage through the S3 API requires the endpoint, an access key and a secret key.

Data are stored as objects in different buckets. Every user holds a key pair consisting of an access key and a secret key.

Set-up

The S3 protocol is delivered via gateways to the Web Object Scaler (WOS) nodes. Currently there are 2 WOS nodes and 5 gateway nodes in production.

Each WOS node has 60 disks of 4 TB and two 10 GE interfaces. Currently only a single 10 GE connection per node is active. IP addresses and cabling have been reserved for connecting the second interfaces (link aggregation). The 2 WOS nodes are logically in 2 different zones, though physically they are in the same rack.
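As a rough worked example of the figures above, the raw disk capacity can be computed as follows. This is only a sketch: it assumes decimal terabytes and does not subtract any replication or erasure-coding overhead, which this page does not specify, so the usable capacity will be lower.

```python
# Raw capacity from the figures given above:
# 2 WOS nodes, 60 disks per node, 4 TB per disk.
# Assumption: no replication or erasure-coding overhead is subtracted.
WOS_NODES = 2
DISKS_PER_NODE = 60
TB_PER_DISK = 4


def raw_capacity_tb(nodes=WOS_NODES, disks=DISKS_PER_NODE, tb_per_disk=TB_PER_DISK):
    """Return the raw disk capacity in TB."""
    return nodes * disks * tb_per_disk


if __name__ == "__main__":
    print("per node: %d TB" % raw_capacity_tb(nodes=1))
    print("total:    %d TB" % raw_capacity_tb())
```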

The S3 gateway is installed on 5 servers: 4 of them serve the S3 requests using DNS round robin, while the fifth server is used for management and monitoring. The setup is described in the picture below.

[Figure: S3-wos-setup.png]
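With DNS round robin, the service name resolves to several gateway addresses and clients are spread across them. The following is a minimal sketch for listing the addresses behind the endpoint. The helper name gateway_addresses is illustrative, the endpoint name is taken from the client configuration further down this page, and how many addresses are returned depends on the current DNS configuration.

```python
import socket


def gateway_addresses(host="s3.data.kit.edu", port=80, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses that `host` resolves to.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    that the function can be exercised without network access.
    """
    infos = resolver(host, port, 0, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    return sorted({info[4][0] for info in infos})
```

Calling gateway_addresses() on a machine that can resolve the endpoint prints nothing by itself; iterate over the returned list to inspect the addresses.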

Operations Supported by DDN S3 API

  • Service Operation
    • GET Service
  • Bucket Operations
    • GET Bucket
    • GET Bucket ACL
    • PUT Bucket
    • PUT Bucket ACL
    • DELETE Bucket
    • HEAD Bucket
    • List Multipart Uploads
  • Object Operations
    • GET Object
    • GET Object ACL
    • PUT Object
    • PUT Object ACL
    • PUT Object Copy
    • DELETE Object
    • HEAD Object
    • POST Object
    • Initiate Multipart Upload
    • Upload Part Copy
    • Complete Multipart Upload
    • Abort Multipart Upload
    • List Parts
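
To illustrate what some of the operations above look like on the wire, here is a small sketch that maps operation names to an HTTP method and a path. The mapping follows the public Amazon S3 REST API with path-style addressing; that the DDN gateway accepts path-style requests is an assumption, and the function name request_line and the placeholder values are hypothetical. Bucket and key names are not URL-encoded here.

```python
def request_line(operation, bucket="BUCKET", key="OBJECT", upload_id="UPLOAD_ID"):
    """Return (HTTP method, path-style path) for some S3 REST operations."""
    bucket_path = "/%s" % bucket
    object_path = "/%s/%s" % (bucket, key)
    table = {
        "GET Service": ("GET", "/"),
        "GET Bucket": ("GET", bucket_path),
        "GET Bucket ACL": ("GET", bucket_path + "?acl"),
        "PUT Bucket": ("PUT", bucket_path),
        "DELETE Bucket": ("DELETE", bucket_path),
        "HEAD Bucket": ("HEAD", bucket_path),
        "List Multipart Uploads": ("GET", bucket_path + "?uploads"),
        "GET Object": ("GET", object_path),
        "PUT Object": ("PUT", object_path),
        "DELETE Object": ("DELETE", object_path),
        "HEAD Object": ("HEAD", object_path),
        "Initiate Multipart Upload": ("POST", object_path + "?uploads"),
        "Complete Multipart Upload": ("POST", object_path + "?uploadId=" + upload_id),
        "Abort Multipart Upload": ("DELETE", object_path + "?uploadId=" + upload_id),
        "List Parts": ("GET", object_path + "?uploadId=" + upload_id),
    }
    return table[operation]
```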

S3 clients

Several clients can be used to access the S3 storage. Below is a list of clients with instructions on how to set up and use them to access the (DDN) S3 API.

s3cmd

Command line tool for Linux and Mac. [2] The Windows version is S3Express. [3]

Download and install:

  • Debian & Ubuntu
wget -O- -q http://s3tools.org/repo/deb-all/stable/s3tools.key | sudo apt-key add -
sudo wget -O/etc/apt/sources.list.d/s3tools.list http://s3tools.org/repo/deb-all/stable/s3tools.list
sudo apt-get update && sudo apt-get install s3cmd
  • RedHat, CentOS & Fedora
cd /etc/yum.repos.d
# repository file for RHEL 6 and CentOS 6.x
wget http://s3tools.org/repo/RHEL_6/s3tools.repo
# answer yes when asked to accept the GPG key
yum install s3cmd

Configure s3cmd by running:

s3cmd --configure

You will be asked to input the two keys; all other questions are optional. When asked "Test access with supplied credentials? [Y/n]", answer no. Then edit the file ~/.s3cfg to set the endpoint and check the user keys:

access_key = <your_access_key>
secret_key = <your_secret_key>
host_base = s3.data.kit.edu
host_bucket = %(bucket)s.s3.data.kit.edu
website_endpoint = http://s3.data.kit.edu/
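
The host_bucket entry is a template in which %(bucket)s is replaced by the bucket name. The sketch below parses the settings above and expands the template. It assumes that the settings live in a [default] section of ~/.s3cfg, as written by s3cmd --configure; the helper names load_s3cfg and bucket_host are illustrative.

```python
import configparser

SAMPLE = """\
[default]
access_key = <your_access_key>
secret_key = <your_secret_key>
host_base = s3.data.kit.edu
host_bucket = %(bucket)s.s3.data.kit.edu
website_endpoint = http://s3.data.kit.edu/
"""


def load_s3cfg(text):
    """Parse s3cmd-style configuration text and return the [default] section."""
    # interpolation=None keeps the literal "%(bucket)s" placeholder intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["default"])


def bucket_host(cfg, bucket):
    """Expand the host_bucket template for one bucket name."""
    return cfg["host_bucket"] % {"bucket": bucket}


if __name__ == "__main__":
    cfg = load_s3cfg(SAMPLE)
    print(cfg["host_base"])
    print(bucket_host(cfg, "mybucket"))
```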

Available commands for managing objects and buckets in the S3 storage:

Make bucket
 s3cmd mb s3://BUCKET
Remove bucket
 s3cmd rb s3://BUCKET
List objects or buckets
 s3cmd ls [s3://BUCKET[/PREFIX]]
List all objects in all buckets
 s3cmd la 
Put file into bucket
 s3cmd put FILE [FILE...] s3://BUCKET[/PREFIX]
Get file from bucket
 s3cmd get s3://BUCKET/OBJECT LOCAL_FILE
Delete file from bucket
 s3cmd del s3://BUCKET/OBJECT
Synchronize a directory tree to S3
 s3cmd sync LOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR
Disk usage by buckets
 s3cmd du [s3://BUCKET[/PREFIX]]
Copy object
 s3cmd cp s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]
Move object
 s3cmd mv s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]
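
The commands above can also be driven from a script. The following sketch builds the argument list for s3cmd sync and leaves running it to the caller; the function name sync_command and the bucket and directory names in the usage note are hypothetical.

```python
import subprocess


def sync_command(local_dir, bucket, prefix=""):
    """Build the argument list for `s3cmd sync LOCAL_DIR s3://BUCKET[/PREFIX]`."""
    target = "s3://%s/%s" % (bucket, prefix.lstrip("/"))
    return ["s3cmd", "sync", local_dir, target]


def run(argv):
    """Run a prepared s3cmd command and raise if it exits with an error."""
    return subprocess.run(argv, check=True)
```

For example, run(sync_command("data/", "mybucket", "backup/")) would execute s3cmd sync data/ s3://mybucket/backup/, provided s3cmd is installed and configured as described above.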

DragonDisk

Cross-platform freeware software, GUI based, available on Windows, Linux and Mac OS. [4]

Download and install:

  • Ubuntu 32-bit
sudo apt-get install libssl0.9.8
wget http://download.dragondisk.com/dragondisk_1.0.5-0ubuntu_i386.deb
sudo apt-get install libqt4-dbus libqt4-network libqt4-xml libqtcore4 libqtgui4
sudo dpkg -i dragondisk_1.0.5-0ubuntu_i386.deb
  • Ubuntu 64-bit
sudo apt-get install libssl0.9.8
wget http://download.dragondisk.com/dragondisk_1.0.5-0ubuntu_amd64.deb
sudo apt-get install libqt4-dbus libqt4-network libqt4-xml libqtcore4 libqtgui4
sudo dpkg -i dragondisk_1.0.5-0ubuntu_amd64.deb
  • CentOS, RedHat, Fedora, Suse
wget http://download.dragondisk.com/dragondisk-1.0.5-1.i686.rpm
yum install qt
sudo rpm -Uvh dragondisk-1.0.5-1.i686.rpm

To get started, create a new account using the endpoint, access key and secret key. You can then use the GUI to browse the existing objects, create new buckets, add or remove objects, etc.

Check the quickstart guide at [5] for screenshots.

S3Anywhere

Available for Android [6].

S3FS-Fuse

A FUSE based solution to mount/unmount S3 buckets. [7]

Download and install on Ubuntu (similarly on other Linux distributions):

sudo apt-get install git build-essential libfuse-dev fuse libcurl4-openssl-dev libxml2-dev mime-support
git clone https://github.com/s3fs-fuse/s3fs-fuse
cd s3fs-fuse
./autogen.sh
./configure
make
sudo make install

Configure s3fs by adding this line to the file ~/.passwd-s3fs (or system-wide to /etc/passwd-s3fs):

[bucket_name:]your_access_key:your_secret_key

Alternatively, you can set the environment variables AWSACCESSKEYID and AWSSECRETACCESSKEY, or specify the password file with the command-line option -o passwd_file=<path>.
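
s3fs refuses a password file that other users can read, so the file should be created with owner-only permissions. A minimal sketch for writing the credentials line described above; the function name write_passwd_file is illustrative and the keys passed in are placeholders.

```python
import os


def write_passwd_file(path, access_key, secret_key, bucket=None):
    """Write an s3fs credentials line to `path`, readable by the owner only.

    The line has the form [bucket_name:]access_key:secret_key.
    """
    line = "%s:%s" % (access_key, secret_key)
    if bucket:
        line = "%s:%s" % (bucket, line)
    # Create the file with mode 0600 so the keys are never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(line + "\n")
    # Also tighten the mode if the file already existed with wider permissions.
    os.chmod(path, 0o600)
    return line
```

A typical call would be write_passwd_file(os.path.expanduser("~/.passwd-s3fs"), "<your_access_key>", "<your_secret_key>").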

Mounting a bucket:

mkdir ~/s3bucket
s3fs <bucket_name> ~/s3bucket -ourl=http://s3.data.kit.edu

Unmounting a bucket:

umount ~/s3bucket

Read more about options and limitations at [8] (https://github.com/s3fs-fuse/s3fs-fuse/wiki/Fuse-Over-Amazon).

AWS SDK

Java

AWS SDK 1.8.6 or higher is recommended for use with the DDN S3 API. [9]

Set-up:

AWSCredentials credentials = new BasicAWSCredentials(your_access_key, your_secret_key);
AmazonS3 s3client = new AmazonS3Client(credentials);
s3client.setEndpoint("http://s3.data.kit.edu:80");

Python

For Python clients, use the boto S3 API [10] (http://boto.readthedocs.org/en/latest/s3_tut.html).

Set-up and list buckets:

import boto
import boto.s3.connection

# Connect to the S3 gateway over plain HTTP
conn = boto.connect_s3(
    aws_access_key_id='<your_access_key>',
    aws_secret_access_key='<your_secret_key>',
    host='s3.data.kit.edu',
    port=80, is_secure=False)

# List all buckets with their creation dates
for bucket in conn.get_all_buckets():
    print("{name}\t{created}".format(
        name=bucket.name,
        created=bucket.creation_date,
    ))
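
Once a connection exists, storing and reading back an object takes only a few more calls. The sketch below assumes a boto 2 connection object such as conn from the set-up above; the function name roundtrip and the bucket and key names in the usage note are hypothetical.

```python
def roundtrip(conn, bucket_name, key_name, text):
    """Create a bucket, store `text` under `key_name`, and read it back.

    `conn` is expected to behave like a boto 2 S3 connection.
    """
    bucket = conn.create_bucket(bucket_name)   # PUT Bucket
    key = bucket.new_key(key_name)
    key.set_contents_from_string(text)         # PUT Object
    stored = bucket.get_key(key_name)          # HEAD Object
    return stored.get_contents_as_string()     # GET Object
```

For example, roundtrip(conn, "my-test-bucket", "hello.txt", "Hello WOS") should return the stored content; bucket names must be unique on the service, so pick your own.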

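References

  • S3 on WOS setup and administration (internal documentation): http://wiki.scc.kit.edu/lsdf/index.php/hidden:S3_on_WOS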