.. include:: system.rst

.. _object-storage:

Object Storage on JUDAC
=======================

Besides classical POSIX file systems, JUST offers an object storage interface to store, manage, and access large amounts of data via HTTP.
Data projects can request access to this resource via the `JARDS`_ portal.

Supported object storage access protocols are *OpenStack Swift* and *S3*.
The access endpoint for client authentication is an OpenStack Keystone instance at ``just-keystone.fz-juelich.de:5000``, which is connected to the `JuDoor`_ user database.

.. figure:: ../just/images/Object-Storage.png
   :name: JSC Storage Cloud
   :align: right

.. IMPORTANT::
   POSIX file systems and object storage data are stored separately. It is not possible to store data via POSIX and access it via the object storage interface, or vice versa.

User Environment Preparation
----------------------------

.. note::
   A user must first be granted access to a project's object storage resource.

- Create file ``_rc`` with content:

  .. code-block:: bash

     #!/usr/bin/env bash
     export OS_AUTH_URL=https://just-keystone.fz-juelich.de:5000
     export OS_PROJECT_NAME=
     export OS_USER_DOMAIN_NAME="JUDOOR"
     export OS_PROJECT_DOMAIN_ID="6d3a30736c864c5498d59a9e54b6e4b2"
     export OS_USERNAME=""
     echo "Please enter your OpenStack Password for project $OS_PROJECT_NAME as user $OS_USERNAME: "
     read -sr OS_PASSWORD_INPUT
     export OS_PASSWORD="$OS_PASSWORD_INPUT"
     export OS_REGION_NAME="JUST"
     export OS_INTERFACE=public
     export OS_IDENTITY_API_VERSION=3

  which you have to personalize with:

  **OS_PROJECT_NAME:** name of your project with the object storage resource

  **OS_USERNAME:** your user name (JuDoor account)

- Activate the OpenStack environment variables by sourcing the file in your favorite shell:

  .. code-block::

     $ source _rc

OpenStackClient CLI
-------------------

``openstack`` is a CLI for OpenStack that brings the command set for Compute, Identity, Image, Object Storage and Block Storage APIs together in a single shell with a uniform command structure.
Detailed information about usage of the OpenStack CLI can be found in the official `OpenStackClient Documentation`_.
The CLI is provided by the ``python-openstackclient`` package, which also provides Python bindings for programmatically accessing OpenStack APIs from scripts.

Useful Examples
^^^^^^^^^^^^^^^

- List projects:

  .. code-block::

     $ openstack project list

- Issue a token:

  .. code-block::

     $ openstack token issue

- Some ``openstack`` commands also require ``PROJECT_ID``, which can be displayed with the command:

  .. code-block::

     $ openstack project show "$OS_PROJECT_NAME"

  Then add the following line, completed with that ID, to your environment file:

  .. code-block::

     export OS_PROJECT_ID=

SWIFT Protocol - Manage objects and containers
----------------------------------------------

Full documentation can be found in the official `python-swiftclient Docs`_.

S3 Protocol - Manage objects and containers
-------------------------------------------

We run a Swift object storage with S3 emulation mode enabled.
The compatibility matrix can be found at `S3/Swift Docs`_.

.. note::
   The S3 emulation supports AWS Signature version 2 only. Please configure your client accordingly.

On the **JUDAC** login nodes we have installed the MinIO Client, an S3-compatible command-line client.

S3 Environment Preparation
^^^^^^^^^^^^^^^^^^^^^^^^^^

Generate an access/secret pair with the command:

.. code-block::

   $ openstack ec2 credentials create
   +------------+------------------------------------------------+
   | Field      | Value                                          |
   +------------+------------------------------------------------+
   | access     |                                                |
   | links      | {'self': ''}                                   |
   | project_id | 78..xxxxxxxxxxxxxxxxxxxxxxxxx..9               |
   | secret     |                                                |
   | trust_id   | None                                           |
   | user_id    | 6b95..xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx..6a |
   +------------+------------------------------------------------+

or list already generated credentials:

.. code-block::

   $ openstack ec2 credentials list
   +----------------------------------+----------------------------------+----------------------------------+------------------------------------------------------------------+
   | Access                           | Secret                           | Project ID                       | User ID                                                          |
   +----------------------------------+----------------------------------+----------------------------------+------------------------------------------------------------------+
   |                                  |                                  | 78..xxxxxxxxxxxxxxxxxxxxxxxxx..9 | 6b95..xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx..6a |
   +----------------------------------+----------------------------------+----------------------------------+------------------------------------------------------------------+

and use the ``s3cmd`` command with the following configuration file (``~/.s3cfg``):

.. code-block::

   [default]
   access_key=
   check_ssl_certificate = True
   check_ssl_hostname = True
   host_base = just-object.fz-juelich.de:8080
   host_bucket = just-object.fz-juelich.de:8080
   human_readable_sizes = True
   secret_key=
   signature_v2 = True

S3 Object Examples
^^^^^^^^^^^^^^^^^^

.. code-block::

   @judac$ s3cmd mb s3://my_container
   Bucket 's3://my_container/' created
   @judac$ s3cmd ls s3://my_container
   @judac$ s3cmd put /bin/bash s3://my_container
   upload: '/bin/bash' -> 's3://my_container/bash'  [1 of 1]
   @judac03$ cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 42 | head -n 42 > my_42.file
   @judac03$ s3cmd put my_42.file s3://my_container
   upload: 'my_42.file' -> 's3://my_container/my_42.file'  [1 of 1]
    1806 of 1806   100% in    0s    12.23 KB/s  done
   @judac03$ s3cmd ls s3://my_container
   2021-06-25 15:19        1123k  s3://my_container/bash
   2021-06-25 15:19         1806  s3://my_container/my_42.file
   @judac03$ s3cmd get s3://my_container/bash
   download: 's3://my_container/bash' -> './bash'  [1 of 1]
    1150576 of 1150576   100% in    0s    18.61 MB/s  done
   @judac03$ cmp bash /bin/bash && echo "same data"
   same data
   @judac03$ s3cmd rb --recursive s3://my_container
   WARNING: Bucket is not empty. Removing all the objects from it first. This may take some time...
   delete: 's3://my_container/my_42.file'
   Bucket 's3://my_container/' removed

S3 boto3 API
------------

Alternatively, you can use the ``boto3`` API to access your data from Python scripts:

.. code-block:: python

   import boto3
   from botocore.config import Config

   # boto3.set_stream_logger(name='botocore')  # uncomment to enable debug tracing

   session = boto3.session.Session()
   s3_client = session.client(
       service_name='s3',
       aws_access_key_id="",
       aws_secret_access_key="",
       endpoint_url="https://just-object.fz-juelich.de:8080",
       # Signature version 2 ('s3') is required because the S3 emulation only
       # offers that authentication protocol. Otherwise this would be 's3v4'
       # (the default, version 4).
       config=Config(signature_version='s3'),
   )

   buckets = s3_client.list_buckets()
   print('Existing buckets:')
   for bucket in buckets['Buckets']:
       print(f'Bucket: {bucket["Name"]}')
       objects = s3_client.list_objects(Bucket=bucket["Name"])
       # "Contents" is missing from the response when a bucket is empty
       for obj in objects.get("Contents", []):
           print(f'Object: {obj["Key"]}')

For more information see: `Boto3 Docs`_

.. _`JARDS`: https://application.fz-juelich.de/Antragsserver/dataprojects/WEB/application/login.php?appkind=dataprojects
.. _`JuDoor`: https://judoor.fz-juelich.de
.. _`OpenStackClient Documentation`: https://docs.openstack.org/python-openstackclient/latest/
.. _`python-swiftclient Docs`: https://docs.openstack.org/python-swiftclient/latest/
.. _`S3/Swift Docs`: https://docs.openstack.org/swift/latest/s3_compat.html
.. _`Boto3 Docs`: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html
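
The ``~/.s3cfg`` file described under *S3 Environment Preparation* can also be generated from a script once an access/secret pair exists. The following is a minimal sketch using only the Python standard library. The helper name ``render_s3cfg`` and the placeholder key values are hypothetical and not part of ``s3cmd``; the endpoint and option names are copied from the configuration file shown in that section.

```python
import configparser
import io


def render_s3cfg(access_key, secret_key,
                 endpoint="just-object.fz-juelich.de:8080"):
    """Return the text of an s3cmd configuration for one key pair.

    Hypothetical helper for illustration; it is not part of s3cmd.
    """
    # interpolation=None keeps any '%' characters in the keys untouched
    config = configparser.ConfigParser(interpolation=None)
    config["default"] = {
        "access_key": access_key,
        "check_ssl_certificate": "True",
        "check_ssl_hostname": "True",
        "host_base": endpoint,
        "host_bucket": endpoint,
        "human_readable_sizes": "True",
        "secret_key": secret_key,
        # the S3 emulation supports AWS Signature version 2 only
        "signature_v2": "True",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values; substitute the pair reported by the OpenStack CLI.
    print(render_s3cfg("ACCESS_PLACEHOLDER", "SECRET_PLACEHOLDER"))
```

Since the resulting file contains the secret key, it is advisable to store it with permissions restricted to the owner.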