How do I read a file from Azure Blob Storage in Python?

Can someone tell me whether it is possible to read a CSV file directly from Azure Blob Storage as a stream and process it using Python? I know it can be done using C#/.NET (shown below), but I wanted to know the equivalent library in Python.

CloudBlobClient client = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference("outfiles");
CloudBlob blob = container.GetBlobReference("Test.csv");

Jay Gong

asked Feb 20, 2018 at 8:57

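Yes, it is certainly possible to do so. Check out the Azure Storage SDK for Python.

# BlockBlobService belongs to the legacy SDK (azure-storage-blob before version 12)
from azure.storage.blob import BlockBlobService

block_blob_service = BlockBlobService(account_name='myaccount', account_key='mykey')

# Download the blob 'myblockblob' from 'mycontainer' to the local file 'out-sunset.png'
block_blob_service.get_blob_to_path('mycontainer', 'myblockblob', 'out-sunset.png')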

You can read the complete SDK documentation here: //azure-storage.readthedocs.io.

answered Feb 20, 2018 at 9:01

Gaurav Mantri

Here's a way to do it with the new version of the SDK (12.0.0):

from azure.storage.blob import BlobClient

# The placeholder values below are left empty, as in the original answer
blob = BlobClient(account_url="//.blob.core.windows.net",
                  container_name="",
                  blob_name="",
                  credential="")

# Download the blob and write its contents to a local file
with open("example.csv", "wb") as f:
    data = blob.download_blob()
    data.readinto(f)

See here for details.
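
Since the question is about processing the CSV rather than saving it, the downloaded bytes can also be parsed in memory. This is only a minimal sketch, assuming a v12 BlobClient like the one above and a CSV small enough to hold in memory; the helper names parse_csv_bytes and read_blob_csv are made up for illustration and are not part of the SDK.

```python
import csv
import io


def parse_csv_bytes(raw, encoding="utf-8-sig"):
    """Decode CSV bytes and return the rows as a list of dicts keyed by header."""
    # utf-8-sig behaves like utf-8 but also drops a leading byte order mark
    text = raw.decode(encoding)
    return list(csv.DictReader(io.StringIO(text, newline="")))


def read_blob_csv(blob_client):
    """Download a whole blob into memory and parse it as CSV.

    Assumption: blob_client is an azure.storage.blob.BlobClient (v12), whose
    download_blob().readall() returns the blob content as bytes.
    """
    raw = blob_client.download_blob().readall()
    return parse_csv_bytes(raw)
```

With the BlobClient from the answer, rows = read_blob_csv(blob) would give one dict per CSV row. The parsing is kept in its own function so it can be tried on plain bytes without any Azure connection.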

answered Nov 6, 2019 at 13:41

One can stream from a blob with Python like this:

from tempfile import NamedTemporaryFile
from azure.storage.blob.blockblobservice import BlockBlobService

# `conf` is the author's own configuration mapping (not shown in the answer)
entry_path = conf['entry_path']
container_name = conf['container_name']
blob_service = BlockBlobService(
    account_name=conf['account_name'],
    account_key=conf['account_key'])

def get_file(filename):
    # Download the blob into a temporary file, then rewind it for reading
    local_file = NamedTemporaryFile()
    blob_service.get_blob_to_stream(container_name, filename, stream=local_file,
                                    max_connections=2)

    local_file.seek(0)
    return local_file
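
The temporary-file approach above still stores the whole blob before any processing starts. With the v12 SDK, download_blob() returns a downloader whose chunks() method yields the content piece by piece, so lines can be handled as they arrive. The generator below is a rough sketch of that idea; iter_lines is a hypothetical helper, and it assumes records are separated by newlines with no line breaks inside quoted CSV fields.

```python
def iter_lines(chunks, encoding="utf-8"):
    """Yield decoded text lines from an iterable of byte chunks.

    Splitting happens on the raw newline byte before decoding, so a
    multi-byte character cut in half by a chunk boundary is still
    decoded correctly once its line is complete.
    """
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; keep the rest
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.rstrip(b"\r").decode(encoding)
    if pending:
        # Final line without a trailing newline
        yield pending.rstrip(b"\r").decode(encoding)


# Possible usage with a v12 BlobClient (assumption, not run here):
# for line in iter_lines(blob_client.download_blob().chunks()):
#     process(line)
```

Only the current chunk and one partial line are held in memory at a time, which is the point of streaming for large files.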

answered May 27, 2019 at 10:08

Daniel R

Provide your Azure storage account name, and your secret key as the account key, here:

block_blob_service = BlockBlobService(account_name='$$$$$$', account_key='$$$$$$')

This will get the blob and save it in the current location as 'output.jpg':

block_blob_service.get_blob_to_path('your-container-name', 'your-blob', 'output.jpg')

This will get the text/item from the blob:

blob_item = block_blob_service.get_blob_to_bytes('your-container-name', 'blob-name')

blob_item.content

answered Sep 5, 2019 at 17:47

I recommend using smart_open.

import os

from azure.storage.blob import BlobServiceClient
from smart_open import open

connect_str = os.environ['AZURE_STORAGE_CONNECTION_STRING']
transport_params = {
    'client': BlobServiceClient.from_connection_string(connect_str),
}

# stream from Azure Blob Storage
with open('azure://my_container/my_file.txt', transport_params=transport_params) as fin:
    for line in fin:
        print(line)

# stream content *into* Azure Blob Storage (write mode):
with open('azure://my_container/my_file.txt', 'wb', transport_params=transport_params) as fout:
    fout.write(b'hello world')

answered Jul 27, 2020 at 13:46

pistolpete

Here is a simple way to read a CSV from a blob using pandas:

import os

import pandas as pd
from azure.storage.blob import BlobServiceClient

service_client = BlobServiceClient.from_connection_string(os.environ['AZURE_STORAGE_CONNECTION_STRING'])
client = service_client.get_container_client("your_container")
bc = client.get_blob_client(blob="your_folder/yourfile.csv")
data = bc.download_blob()
# Save the blob locally, then load the local copy with pandas
with open("file.csv", "wb") as f:
    data.readinto(f)
df = pd.read_csv("file.csv")

answered Feb 18, 2021 at 10:31

Ilyas

Since I wasn't able to find what I needed on this thread, I wanted to follow up on @SebastianDziadzio's answer to retrieve the data without downloading it as a local file, which is what I was trying to find for myself.

Replace the with statement with the following:

from io import BytesIO
import pandas as pd

with BytesIO() as input_blob:
    blob_client_instance.download_blob().download_to_stream(input_blob)
    input_blob.seek(0)
    df = pd.read_csv(input_blob, compression='infer', index_col=0)

answered Aug 3 at 10:57

I know this is an old post, but in case someone wants to do the same: I was able to access the blobs with the code below.

Note: you need to set AZURE_STORAGE_CONNECTION_STRING, which you can obtain from the Azure Portal. Go to your storage account -> Settings -> Access keys, and the connection string is shown there.

For Windows: setx AZURE_STORAGE_CONNECTION_STRING ""

For Linux: export AZURE_STORAGE_CONNECTION_STRING=""

For macOS: export AZURE_STORAGE_CONNECTION_STRING=""

import os
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, __version__

connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')
# Note: this prints the secret connection string; remove it outside of debugging
print(connect_str)
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
# get_container_client expects the name of a container in the storage account
container_client = blob_service_client.get_container_client("Your Storage Name Here")
try:
    print("\nListing blobs...")

    # List the blobs in the container
    blob_list = container_client.list_blobs()
    for blob in blob_list:
        print("\t" + blob.name)

except Exception as ex:
    print('Exception:')
    print(ex)
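
To get from a listing to the CSV files the question is about, the names can be filtered before downloading anything. A small sketch, assuming a v12 ContainerClient like the one above; csv_blob_names is a hypothetical helper, and name_starts_with is the prefix filter that list_blobs accepts.

```python
def csv_blob_names(container_client, prefix=None):
    """Return the sorted names of blobs ending in .csv, optionally under a prefix.

    Assumption: container_client is an azure.storage.blob.ContainerClient (v12).
    """
    blobs = container_client.list_blobs(name_starts_with=prefix)
    # The suffix check is case-insensitive, so DATA.CSV is included too
    return sorted(b.name for b in blobs if b.name.lower().endswith(".csv"))
```

Each returned name can then be passed to container_client.get_blob_client(...) to download that blob.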

answered May 8, 2021 at 18:12

How do I read Azure blob files?

View a blob container's contents:
Open Storage Explorer.
In the left pane, expand the storage account containing the blob container you wish to view.
Expand the storage account's Blob Containers.
Right-click the blob container you wish to view and, from the context menu, select Open Blob Container Editor.

How do I connect to blob storage Azure using Python?

Setting up.
Install the package. ...
Set up the app framework. ...
Get the connection string for authentication. ...
Create a container. ...
Upload blobs to a container. ...
List the blobs in a container. ...
Download blobs. ...
Delete a container.
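
The steps above can be sketched as one short round trip with the v12 SDK. This is an illustration rather than the official quickstart code: blob_round_trip is a made-up helper, it assumes service_client is a BlobServiceClient (for example from BlobServiceClient.from_connection_string), and it deletes the container it creates, so it should only be tried with a throwaway container name.

```python
def blob_round_trip(service_client, container_name, blob_name, data):
    """Create a container, upload a blob, list and download it, then clean up.

    Assumption: service_client is an azure.storage.blob.BlobServiceClient (v12).
    Returns the list of blob names seen and the downloaded bytes.
    """
    # create_container returns a client bound to the new container
    container = service_client.create_container(container_name)
    try:
        container.upload_blob(name=blob_name, data=data, overwrite=True)
        names = [blob.name for blob in container.list_blobs()]
        downloaded = container.download_blob(blob_name).readall()
    finally:
        # Remove the container even if one of the steps above fails
        container.delete_container()
    return names, downloaded
```

Installing the package (pip install azure-storage-blob) and obtaining the connection string happen outside this function.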

How do I download files from Azure blob storage in Python?

Introduction.
Search for Storage Accounts in the Azure Portal.
Click New to create a new storage account.
Fill in all the details.
After successfully creating your storage account, open it and click Access Keys in the left navigation pane to get your storage account credentials.

How do I retrieve data from blob storage?

In this tutorial, you learn how to:
Prerequisites to export data from Azure Blob storage with Azure Import/Export.
Step 1: Create an export job.
Step 2: Ship the drives.
Step 3: Update the job with tracking information.
Step 4: Receive the disks.
Step 5: Unlock the disks.
