Read file from s3 python

I read the filenames in my S3 bucket by doing

import boto3

objs = boto3.client('s3').list_objects(Bucket='my_bucket')
if 'Contents' in objs:
    for obj in objs['Contents']:
        filename = obj['Key']

Now I need to get the actual content of each file, similar to open(filename).readlines(). What is the best way?

asked Mar 24, 2016 at 16:41

boto3 offers a resource model that makes tasks like iterating through objects easier. Unfortunately, StreamingBody doesn't provide readline or readlines.

s3 = boto3.resource('s3')
bucket = s3.Bucket('test-bucket')
# Iterates through all the objects, doing the pagination for you. Each obj
# is an ObjectSummary, so it doesn't contain the body. You'll need to call
# get to get the whole body.
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
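
Since the body comes back as raw bytes, one way to approximate readlines() is to decode it and split it into lines. This is a minimal sketch, assuming UTF-8 text that fits in memory. The helper name read_lines is hypothetical, and it accepts any object with a read() method, so io.BytesIO stands in for the S3 body here:

```python
import io


def read_lines(body, encoding="utf-8"):
    """Return the decoded lines of a file-like object that exposes read()."""
    # read() pulls the whole payload into memory as bytes.
    data = body.read()
    # splitlines() splits on \n and \r\n and drops the line terminators.
    return data.decode(encoding).splitlines()


# io.BytesIO stands in for obj.get()['Body'] so the sketch runs without AWS.
fake_body = io.BytesIO(b"first line\nsecond line\r\nthird line")
lines = read_lines(fake_body)
print(lines)
```

With a real object, the call would presumably look like read_lines(obj.get()['Body']).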

answered Mar 24, 2016 at 16:57

Jordon Phillips

You might consider the smart_open module, which supports iterators:

from smart_open import smart_open

# stream lines from an S3 object
for line in smart_open('s3://mybucket/mykey.txt', 'rb'):
    print(line.decode('utf8'))

and context managers:

with smart_open('s3://mybucket/mykey.txt', 'rb') as s3_source:
    for line in s3_source:
        print(line.decode('utf8'))

    s3_source.seek(0)  # seek to the beginning
    b1000 = s3_source.read(1000)  # read 1000 bytes

Find smart_open at https://pypi.org/project/smart_open/

answered Dec 14, 2018 at 18:30

caffreyd

Using the client instead of resource:

import boto3

s3 = boto3.client('s3')
bucket = 'bucket_name'
result = s3.list_objects(Bucket=bucket, Prefix='/something/')
# 'Contents' is missing when no key matches, so default to an empty list.
for o in result.get('Contents', []):
    data = s3.get_object(Bucket=bucket, Key=o['Key'])
    contents = data['Body'].read()
    print(contents.decode("utf-8"))
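
A single list_objects call returns at most 1000 keys, so larger prefixes need pagination. The sketch below uses the client's paginator for list_objects_v2. The helper names iter_keys and read_all are hypothetical, and the client is passed in as an argument so the logic can be exercised without AWS credentials:

```python
def iter_keys(client, bucket, prefix=""):
    """Yield every key under prefix, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent from a page when nothing matches.
        for entry in page.get("Contents", []):
            yield entry["Key"]


def read_all(client, bucket, prefix=""):
    """Map each key under prefix to its body bytes (all held in memory)."""
    return {
        key: client.get_object(Bucket=bucket, Key=key)["Body"].read()
        for key in iter_keys(client, bucket, prefix)
    }


# Usage, assuming credentials are configured:
#   import boto3
#   contents = read_all(boto3.client("s3"), "bucket_name", "something/")
```

Holding every body in a dict only suits small prefixes; for larger ones, iterating over iter_keys and processing one object at a time is probably the safer choice.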

Ryan M

answered Jan 27, 2021 at 18:08

To read a file with a configuration other than the default one (for example, a specific AWS profile), either call mpu.aws.s3_read(s3path) directly or copy the code below:

import boto3
import mpu.aws


def s3_read(source, profile_name=None):
    """
    Read a file from an S3 source.

    Parameters
    ----------
    source : str
        Path starting with s3://, e.g. 's3://bucket-name/key/foo.bar'
    profile_name : str, optional
        AWS profile

    Returns
    -------
    content : bytes

    Raises
    ------
    botocore.exceptions.NoCredentialsError
        Botocore is not able to find your credentials. Either specify
        profile_name or add the environment variables AWS_ACCESS_KEY_ID,
        AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN.
        See https://boto3.readthedocs.io/en/latest/guide/configuration.html
    """
    session = boto3.Session(profile_name=profile_name)
    s3 = session.client('s3')
    bucket_name, key = mpu.aws._s3_path_split(source)
    s3_object = s3.get_object(Bucket=bucket_name, Key=key)
    body = s3_object['Body']
    return body.read()
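
The function above relies on a private mpu helper to split the s3:// path. If you would rather not depend on it, the split can be sketched with the standard library. The name split_s3_path is hypothetical, and this is not mpu's implementation; it also assumes the key contains no '?' or '#', which urlparse would treat as query or fragment separators:

```python
from urllib.parse import urlparse


def split_s3_path(source):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(source)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// path: {source!r}")
    # The URL path keeps its leading slash; drop it to get the key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket_name, key = split_s3_path("s3://bucket-name/key/foo.bar")
print(bucket_name, key)
```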

answered Aug 23, 2018 at 19:36

Martin Thoma

If you already know the filename, you can use boto3's built-in download_fileobj:

import boto3

from io import BytesIO

session = boto3.Session()
s3_client = session.client("s3")

f = BytesIO()
s3_client.download_fileobj("bucket_name", "filename", f)
print(f.getvalue())
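
After a download the buffer's position sits at the end of the written data, so it has to be rewound before it can be read line by line. A minimal sketch, assuming UTF-8 text; the helper name iter_text_lines is hypothetical, and a hand-filled BytesIO stands in for the downloaded buffer:

```python
import io


def iter_text_lines(buffer, encoding="utf-8"):
    """Yield decoded lines from a bytes buffer, one at a time."""
    # Writing leaves the position at the end, so rewind before reading.
    buffer.seek(0)
    wrapper = io.TextIOWrapper(buffer, encoding=encoding)
    try:
        for line in wrapper:
            yield line.rstrip("\r\n")
    finally:
        # detach() keeps the wrapper from closing the caller's buffer.
        wrapper.detach()


# Stand-in for the buffer filled by download_fileobj.
f = io.BytesIO()
f.write(b"alpha\nbeta\n")
result = list(iter_text_lines(f))
print(result)
```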

answered Apr 6, 2020 at 2:33

reubano

import boto3

print("started")

s3 = boto3.resource(
    's3',
    region_name='region_name',
    aws_access_key_id='your_access_id',
    aws_secret_access_key='your access key',
)
obj = s3.Object('bucket_name', 'file_name')
data = obj.get()['Body'].read()
print(data)

answered Jun 16 at 8:33

The following code reads file contents from an S3 bucket with boto3. It was tested and working as of the posting date.

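import boto3
from botocore.handlers import disable_signing


def get_file_contents(bucket, prefix):
    s3 = boto3.resource('s3')
    # Send unsigned (anonymous) requests, which works for public buckets.
    s3.meta.client.meta.events.register('choose-signer.s3.*', disable_signing)
    bucket = s3.Bucket(bucket)
    for obj in bucket.objects.filter(Prefix=prefix):
        key = obj.key
        body = obj.get()['Body'].read()
        print(body)
        # Returns after the first matching object.
        return body


get_file_contents('coderbytechallengesandbox', '__cb__')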

answered Jul 4 at 1:08

bilalmohib

The best way for me is this:

import json
from decimal import Decimal

# Assumed: s3 is a boto3 S3 client and table is a DynamoDB Table resource,
# both created elsewhere.
result = s3.list_objects(Bucket=s3_bucket, Prefix=s3_key)
for file in result.get('Contents', []):
    data = s3.get_object(Bucket=s3_bucket, Key=file['Key'])
    contents = data['Body'].read()
    # DynamoDB does not support float types, so parse floats as Decimal.
    j = json.loads(contents, parse_float=Decimal)
    for item in j:
        timestamp = item['timestamp']
        table.put_item(
            Item={
                'timestamp': timestamp
            }
        )

Once you have the content, you can run it through another loop to write it to a DynamoDB table, for instance.
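
The parse_float=Decimal argument matters because boto3's DynamoDB resource rejects Python floats. A small sketch of what it does to the parsed data; the JSON payload here is made up for illustration and stands in for the bytes read from S3:

```python
import json
from decimal import Decimal

# Made-up payload standing in for data['Body'].read().
contents = b'[{"timestamp": 1638631860.25, "count": 3}]'

# Every JSON float is handed to Decimal as a string, so no precision is lost.
items = json.loads(contents, parse_float=Decimal)
print(items)
```

Integers are left as int, so only the fractional values change type.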

answered Dec 4, 2021 at 15:31

aerioeus

How does Python read S3 files?

How to Upload And Download Files From AWS S3 Using Python (2022)
Step 1: Set up an account. ...
Step 2: Create a user. ...
Step 3: Create a bucket. ...
Step 4: Create a policy and add it to your user. ...
Step 5: Download AWS CLI and configure your user. ...
Step 6: Upload your files. ...
Step 7: Check if authentication is working.

Can we read file from S3 without downloading?

Yes. If you only need to store and read back small pieces of text such as quotes, tweets, or news articles, you can write them with the S3 resource method put() and read them back with get(), without saving a file locally.

How do I view an S3 bucket file?

In the Amazon S3 console, choose your S3 bucket, choose the file that you want to open or download, choose Actions, and then choose Open or Download. If you are downloading an object, specify where you want to save it. The procedure for saving the object depends on the browser and operating system that you are using.
