Working With IBM Cloud Object Storage In Python
IBM Cloud Object Storage
IBM Watson Studio comes with a flexible storage option, IBM Cloud Object Storage. When you create a project in Watson Studio, an integration with Cloud Object Storage for storing project assets is set up for you automatically.
This blog focuses on how to use IBM Cloud Object Storage in Python, but you can also easily load data into Watson Studio through the UI.
Import Credentials
To access IBM Cloud Object Storage you need credentials. You can get these credentials using the insert to code option in the notebook. To insert credentials, you first need to upload some data to Watson Studio using the browse functionality.
# @hidden_cell
# The following code contains the credentials for a file in your IBM Cloud Object Storage.
# You might want to remove those credentials before you share your notebook.
credentials = {
'IBM_API_KEY_ID': '*******************************',
'IAM_SERVICE_ID': '*******************************',
'ENDPOINT': '*******************************',
'IBM_AUTH_ENDPOINT': '*******************************',
'BUCKET': '*******************************',
'FILE': '*******************************'
}
The ibm_boto3 library provides complete access to the IBM® Cloud Object Storage API. We need to create a low-level client using the above credentials.
from ibm_botocore.client import Config
import ibm_boto3

cos = ibm_boto3.client(service_name='s3',
                       ibm_api_key_id=credentials['IBM_API_KEY_ID'],
                       ibm_service_instance_id=credentials['IAM_SERVICE_ID'],
                       ibm_auth_endpoint=credentials['IBM_AUTH_ENDPOINT'],
                       config=Config(signature_version='oauth'),
                       endpoint_url=credentials['ENDPOINT'])
Upload Files
To upload a file to COS, we will use the upload_file function. It takes three parameters: the local file name (along with its path), the bucket name, and a key. The key can be different from the local file name; your file will be identified by this name within the bucket. When you create a project with the IBM Cloud Object Storage option, Watson Studio creates a bucket for your project. You can find the bucket corresponding to your project in the credentials.
cos.upload_file(Filename='wine/wine.csv', Bucket=credentials['BUCKET'], Key='wine_data.csv')
We can use this function to upload zip files or pickled objects. Here I have a Gradient Boosting Classifier saved as a pickle object.
# Upload zip file
cos.upload_file('wine.gz', credentials['BUCKET'], 'wine.gz')

# Upload pickle object
cos.upload_file('GB_Classification_model.pkl', credentials['BUCKET'], 'GB_Classification_model.pkl')
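In case you are wondering how the pickle object itself was created, here is a minimal sketch. It assumes scikit-learn is installed, and the two-sample training set is purely a placeholder for illustration:

import pickle
from sklearn.ensemble import GradientBoostingClassifier

# Train a toy model purely for illustration (placeholder data)
model = GradientBoostingClassifier().fit([[0], [1]], [0, 1])

# Serialize it to a local file, ready for cos.upload_file
with open('GB_Classification_model.pkl', 'wb') as f:
    pickle.dump(model, f)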
To help you upload files quickly, here is the upload_file_cos function. You need to pass your credentials, the local file name, and the key as parameters to this function.
from ibm_botocore.client import Config
import ibm_boto3

def upload_file_cos(credentials, local_file_name, key):
    cos = ibm_boto3.client(service_name='s3',
                           ibm_api_key_id=credentials['IBM_API_KEY_ID'],
                           ibm_service_instance_id=credentials['IAM_SERVICE_ID'],
                           ibm_auth_endpoint=credentials['IBM_AUTH_ENDPOINT'],
                           config=Config(signature_version='oauth'),
                           endpoint_url=credentials['ENDPOINT'])
    try:
        res = cos.upload_file(Filename=local_file_name, Bucket=credentials['BUCKET'], Key=key)
    except Exception as e:
        print(Exception, e)
    else:
        print('File Uploaded')
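For example, uploading the same wine data set with this helper looks like this:

upload_file_cos(credentials, 'wine/wine.csv', 'wine_data.csv')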
Download Files To Local Machine
Once your file is in IBM Cloud Object Storage, you can download it to your local machine.
- Click on your project.
- Click the find and add data icon in the upper right-hand panel.
- Select the file and click download.
Get Data From COS Into Notebook
To download a file from COS into the notebook, we will use the download_file function. It takes the same parameters as above. Here I am downloading the file wine.csv and saving it with the name wine1.csv.
cos.download_file(Bucket=credentials['BUCKET'], Key='wine.csv', Filename='wine1.csv')
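Once the file is downloaded, you can work with it like any local file. For example, assuming pandas is available in your notebook, you could load it into a DataFrame:

import pandas as pd

# Read the downloaded copy into a DataFrame (assumes pandas is installed)
df = pd.read_csv('wine1.csv')
print(df.head())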
Here is the download_file_cos function for quick use.
from ibm_botocore.client import Config
import ibm_boto3

def download_file_cos(credentials, local_file_name, key):
    cos = ibm_boto3.client(service_name='s3',
                           ibm_api_key_id=credentials['IBM_API_KEY_ID'],
                           ibm_service_instance_id=credentials['IAM_SERVICE_ID'],
                           ibm_auth_endpoint=credentials['IBM_AUTH_ENDPOINT'],
                           config=Config(signature_version='oauth'),
                           endpoint_url=credentials['ENDPOINT'])
    try:
        res = cos.download_file(Bucket=credentials['BUCKET'], Key=key, Filename=local_file_name)
    except Exception as e:
        print(Exception, e)
    else:
        print('File Downloaded')
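For example, downloading wine.csv and saving it as wine1.csv with this helper:

download_file_cos(credentials, 'wine1.csv', 'wine.csv')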
If you want to upload or download a file-like object instead of a file, you can use upload_fileobj and download_fileobj. For uploads, the object must implement a read method that returns bytes; for downloads, it must implement a write method that accepts bytes.
with open('wine.csv', 'rb') as data:
    cos.upload_fileobj(data, credentials['BUCKET'], 'wine_bytes')

with open('wine_copy.csv', 'wb') as data:
    cos.download_fileobj(credentials['BUCKET'], 'wine_bytes', data)
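Because these functions accept any file-like object, you can also skip the local file system entirely. Here is a sketch using an in-memory io.BytesIO buffer; the CSV snippet is just placeholder content:

import io

# Upload raw bytes from memory (no local file needed)
buffer = io.BytesIO(b'fixed acidity,volatile acidity\n7.4,0.7\n')
cos.upload_fileobj(buffer, credentials['BUCKET'], 'wine_bytes')

# Download straight back into memory
out = io.BytesIO()
cos.download_fileobj(credentials['BUCKET'], 'wine_bytes', out)
print(out.getvalue())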
The credentials you get in Watson Studio using the insert to code option are scoped to a single bucket, i.e. they only allow you to interact with your project’s bucket. If you want to interact with other buckets, you will have to create new credentials with the appropriate access permissions.
Create New Service Credentials
To create new credentials, you need to follow these steps.
- Go to IBM Cloud and click on the IBM Cloud Object Storage service under the services section.
- Select Service credentials from the left-hand panel and click the New credential button.
- Give it a name, choose an appropriate role based on your requirements, and hit the add button.
You should be able to see this credential under service credentials. Just copy the credentials from the view credentials option and create the cos client.
cos_credentials = {
    "apikey": "***********************",
    "endpoints": "***********************",
    "iam_apikey_description": "***********************",
    "iam_apikey_name": "***********************",
    "iam_role_crn": "***********************",
    "iam_serviceid_crn": "***********************",
    "resource_instance_id": "***********************"
}

auth_endpoint = 'https://iam.bluemix.net/oidc/token'
service_endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'

cos = ibm_boto3.client('s3',
                       ibm_api_key_id=cos_credentials['apikey'],
                       ibm_service_instance_id=cos_credentials['resource_instance_id'],
                       ibm_auth_endpoint=auth_endpoint,
                       config=Config(signature_version='oauth'),
                       endpoint_url=service_endpoint)
List Buckets
Using the list_buckets function, we can list all buckets.
for bucket in cos.list_buckets()['Buckets']:
print(bucket['Name'])
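If you also want to see the objects inside a bucket, the client exposes the standard S3 listing call. Here is a quick sketch using list_objects_v2 (note it returns at most 1,000 keys per call):

# List objects in the project bucket
response = cos.list_objects_v2(Bucket=credentials['BUCKET'])
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])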
Create & Delete Buckets
The create_bucket and delete_bucket functions will help you create and delete buckets.
cos.create_bucket(Bucket='bucket1-test')
cos.delete_bucket(Bucket='bucket1-test')
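One caveat worth knowing: as with the standard S3 API, delete_bucket only succeeds on an empty bucket. A sketch of clearing out the objects first:

# A bucket must be empty before it can be deleted
response = cos.list_objects_v2(Bucket='bucket1-test')
for obj in response.get('Contents', []):
    cos.delete_object(Bucket='bucket1-test', Key=obj['Key'])

cos.delete_bucket(Bucket='bucket1-test')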
There are many more functions you can use to manage your IBM Cloud Object Storage. In this blog, we have covered the basic functions to make your job easier while working with IBM Cloud Object Storage in Watson Studio using Python.
For more information, you can refer to this documentation.