I would do it the following way. Let the content of file.txt be:
{"eventVersion":"1.08","userIdentity":{"type":"AssumedRole","principalId":"AA:i-096379450e69ed082","arn":"arn:aws:sts::34502sdsdsd:assumed-role/RDSAccessRole/i-096379450e69ed082","accountId":"34502sdsdsd","accessKeyId":"ASIAVAVKXAXXXXXXXC","sessionContext":{"sessionIssuer":{"type":"Role","principalId":"AROAVAVKXAKDDDDD","arn":"arn:aws:iam::3450291sdsdsd:role/RDSAccessRole","accountId":"345029asasas","userName":"RDSAccessRole"},"webIdFederationData":{},"attributes":{"mfaAuthenticated":"false","creationDate":"2021-04-27T04:38:52Z"},"ec2RoleDelivery":"2.0"}},"eventTime":"2021-04-27T07:24:20Z","eventSource":"ssm.amazonaws.com","eventName":"ListInstanceAssociations","awsRegion":"us-east-1","sourceIPAddress":"188.208.227.188","userAgent":"aws-sdk-go/1.25.41 [go1.13.15; linux; amd64] amazon-ssm-agent/","requestParameters":{"instanceId":"i-096379450e69ed082","maxResults":20},"responseElements":null,"requestID":"a5c63b9d-aaed-4a3c-9b7d-a4f7c6b774ab","eventID":"70de51df-c6df-4a57-8c1e-0ffdeb5ac29d","readOnly":true,"resources":[{"accountId":"34502914asasas","ARN":"arn:aws:ec2:us-east-1:3450291asasas:instance/i-096379450e69ed082"}],"eventType":"AwsApiCall","managementEvent":true,"eventCategory":"Management","recipientAccountId":"345029149342"}
{"eventVersion":"1.08","userIdentity":{"type":"AssumedRole","principalId":"AROAVAVKXAKPKZ25XXXX:AmazonMWAA-airflow","arn":"arn:aws:sts::3450291asasas:assumed-role/dev-1xdcfd/AmazonMWAA-airflow","accountId":"34502asasas","accessKeyId":"ASIAVAVKXAXXXXXXX","sessionContext":{"sessionIssuer":{"type":"Role","principalId":"AROAVAVKXAKPKZXXXXX","arn":"arn:aws:iam::345029asasas:role/service-role/AmazonMWAA-dlp-dev-1xdcfd","accountId":"3450291asasas","userName":"dlp-dev-1xdcfd"},"webIdFederationData":{},"attributes":{"mfaAuthenticated":"false","creationDate":"2021-04-27T07:04:08Z"}},"invokedBy":"airflow.amazonaws.com"},"eventTime":"2021-04-27T07:23:46Z","eventSource":"logs.amazonaws.com","eventName":"CreateLogStream","awsRegion":"us-east-1","sourceIPAddress":"airflow.amazonaws.com","userAgent":"airflow.amazonaws.com","errorCode":"ResourceAlreadyExistsException","errorMessage":"The specified log stream already exists","requestParameters":{"logStreamName":"scheduler.py.log","logGroupName":"dlp-dev-DAGProcessing"},"responseElements":null,"requestID":"40b48ef9-fc4b-4d1a-8fd1-4f2584aff1e9","eventID":"ef608d43-4765-4a3a-9c92-14ef35104697","readOnly":false,"eventType":"AwsApiCall","apiVersion":"20140328","managementEvent":true,"eventCategory":"Management","recipientAccountId":"3450291asasas"}
then
with open('file.txt', 'r') as f:
    jsons = [i.strip() for i in f.readlines()]
with open('total.json', 'w') as f:
    f.write('{"Records":[')
    f.write(','.join(jsons))
    f.write(']}')
This will produce total.json with the desired shape, and the result is legal JSON provided every line inside file.txt is legal JSON.
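A self-contained version of the same approach (a minimal sketch; the two sample records are placeholders standing in for the CloudTrail events above), which also round-trips the result through json.load to confirm it parses:

```python
import json

# Write two one-object-per-line sample records (placeholders for the
# real CloudTrail events shown above).
with open('file.txt', 'w') as f:
    f.write('{"eventName": "ListInstanceAssociations"}\n')
    f.write('{"eventName": "CreateLogStream"}\n')

# Read each line, strip the newline, and join with commas inside a
# {"Records": [...]} wrapper.
with open('file.txt', 'r') as f:
    jsons = [line.strip() for line in f if line.strip()]

with open('total.json', 'w') as f:
    f.write('{"Records":[')
    f.write(','.join(jsons))
    f.write(']}')

# Sanity check: the result must itself be legal JSON.
with open('total.json') as f:
    data = json.load(f)
print(len(data["Records"]))  # 2
```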
I have a list which contains multiple JSON objects, and I would like to combine those JSON objects into a single JSON object. I tried using jsonmerge but with no luck: when I put the list in a for loop and tried the merge, I got the error "head missing expected 2 arguments got 1". My list is:
t = [{
'ComSMS': 'true'
}, {
'ComMail': 'true'
}, {
'PName': 'riyaas'
}, {
'phone': '1'
}]
The desired output is
t = [{ 'ComSMS': 'true', 'ComMail': 'true', 'PName': 'riyaas', 'phone': '1' }]
You may do it like this, but note that the keys are not guaranteed to stay in order:
>>> t = [{'ComSMS': 'true'}, {'ComMail': 'true'}, {'PName': 'riyaas'}, {'phone': '1'}]
>>> [{i: j for x in t for i, j in x.items()}]
[{'ComSMS': 'true', 'phone': '1', 'PName': 'riyaas', 'ComMail': 'true'}]
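The same combined dictionary can also be built with dict.update, shown here as a plain-Python sketch (on key collisions the later dict wins; on Python 3.7+ insertion order is preserved, so the keys stay in the original order):

```python
t = [{'ComSMS': 'true'}, {'ComMail': 'true'}, {'PName': 'riyaas'}, {'phone': '1'}]

# Fold every dict in the list into one; later keys overwrite earlier duplicates.
merged = {}
for d in t:
    merged.update(d)

result = [merged]
print(result)  # [{'ComSMS': 'true', 'ComMail': 'true', 'PName': 'riyaas', 'phone': '1'}]
```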
Suggestion : 2
The error shows that you ran out of RAM when executing the code. Fortunately, you can shrink the datasets by combining the movie JSON files into a single movies JSON file; a single JSON file also acts as a data warehouse for reproducible data analysis. In this article, I will show an example of JSON data processing: the goal is to take a folder of JSON files, each containing one observation, and combine them into a single JSON file. The advantage of this kind of processing is that you can significantly shrink your data size and simplify its form, so that it is easier for somebody else to use. Imagine you received a folder of movie datasets, all in JSON format. You need to run an ETL process on those datasets, i.e. clean them and store them in a data warehouse, but you find that the folder is quite large.
We can see that it is just one line of a dictionary, holding the data for one movie. Some of the fields are relevant for data analysis, some are not. We want to combine all movie observations into a single JSON file by picking the necessary fields of a movie and putting them on one line, then putting the next movie on the next line, and so on. Before we dump the data, we need to transform some fields such as genres and spoken_language. The whole extract, transform and load process can be done in less than 40 lines of code (it might be fewer; comment if you can find a shorter way!). The code is as follows.
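The full code is not reproduced in this excerpt; below is a minimal sketch of the loop it describes. Assumptions (not from the original): each file in a ./movies folder holds one JSON object, its "genres" and "spoken_language" fields are lists of {"name": ...} objects, and the sample data here is a placeholder.

```python
import json
import os

# Create a tiny sample input folder (placeholder data standing in for
# the real movie files described in the article).
os.makedirs('movies', exist_ok=True)
sample = {
    'title': 'Example Movie',
    'genres': [{'id': 18, 'name': 'Drama'}],
    'spoken_language': [{'iso_639_1': 'en', 'name': 'English'}],
}
with open('movies/example.json', 'w') as f:
    json.dump(sample, f)

# Walk the folder, flatten the nested "genres" and "spoken_language"
# fields into plain name lists, and collect everything in data_list.
data_list = []
for root, dirs, files in os.walk('movies'):
    for filename in files:
        if filename.endswith('.json'):
            with open(os.path.join(root, filename)) as f:
                movie = json.load(f)
            data_list.append({
                'title': movie.get('title'),
                'genres': [g['name'] for g in movie.get('genres', [])],
                'spoken_language': [l['name'] for l in movie.get('spoken_language', [])],
            })

# One observation per line, as described above. A single-document
# alternative would be json.dump(data_list, out).
with open('movies.json', 'w') as out:
    for record in data_list:
        out.write(json.dumps(record) + '\n')

print(data_list[0]['genres'])  # ['Drama']
```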
Here I use os.walk to collect the data files from the movies folder. Then, iterating over the stored filenames, the transformation runs on the genre and spoken_language columns. The transformed data is stored in a predefined empty list called data_list. After the iteration is done, json.dump puts data_list into a single JSON file, resulting in this output:
After movies.json is created, we can see that the file size decreased from 452 MB to around 145 MB: 67% of the storage is freed without losing any information! Besides, we now have a more centralized data source that is easier to process.
Suggestion : 3
Here we have 3 different files of .json type. In this code, we open the file whose content we want to add to the other file in 'read' mode (which is the default), and we open file 1 and file 3 in append mode ('a') respectively. Why didn't we use write mode ('w')? Because write mode would replace all the existing data in the file; if you don't want to erase the existing data, you should use append mode.
Without wasting time see the code given below:
f2data = ""
with open('C:\\Users\\lenovo\\Documents\\file2.json') as f2:
    f2data = '\n' + f2.read()
with open('C:\\Users\\lenovo\\Documents\\file1.json', 'a+') as f1:
    f1.write(f2data)
As you saw in the image at the top, we have three JSON files, and the third file, file3.json, is empty for now. Let's see what happens after executing the code!
f1data = f2data = ""
with open('C:\\Users\\lenovo\\Documents\\file1.json') as f1:
    f1data = f1.read()
with open('C:\\Users\\lenovo\\Documents\\file2.json') as f2:
    f2data = f2.read()
f1data += "\n"
f1data += f2data
with open('C:\\Users\\lenovo\\Documents\\file3.json', 'a') as f3:
    f3.write(f1data)
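Note that plain text concatenation only appends bytes, so the resulting file3.json is not itself a single valid JSON document. A JSON-aware variant of the same idea is sketched below, using short relative paths instead of the absolute Windows paths above, and placeholder inputs standing in for file1.json and file2.json:

```python
import json

# Placeholder inputs standing in for file1.json and file2.json.
with open('file1.json', 'w') as f:
    json.dump({'ComSMS': 'true'}, f)
with open('file2.json', 'w') as f:
    json.dump({'ComMail': 'true'}, f)

# Load both objects, then write them out together as one JSON array,
# so file3.json is itself a single valid JSON document.
with open('file1.json') as f1, open('file2.json') as f2:
    combined = [json.load(f1), json.load(f2)]

with open('file3.json', 'w') as f3:
    json.dump(combined, f3)

print(combined)  # [{'ComSMS': 'true'}, {'ComMail': 'true'}]
```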
Suggestion : 4
We will use this approach when all JSON files have the same keys (columns). In that case, we load each JSON file into a pandas dataframe, concatenate the dataframes into one, and finally write the concatenated dataframe to a CSV file. If only some of the keys/columns match between the JSON files, we merge the files into a single dataframe containing the union of the keys in both; pandas lets you merge dataframes using inner/outer/left/right joins.
Let us say you have the following 2 json files which have same columns.
# File1.json
{
    "ID": {"0": 23, "1": 43},
    "Name": {"0": "Ram", "1": "Deep"},
    "Marks": {"0": 89, "1": 97},
    "Grade": {"0": "B", "1": "A"}
}

# File2.json
{
    "ID": {"0": 90, "1": 56},
    "Name": {"0": "Akash", "1": "Chalsea"},
    "Marks": {"0": 81, "1": 87},
    "Grade": {"0": "B", "1": "B"}
}
Here is the Python code to combine the two JSON files and convert the result to CSV.
# importing packages
import pandas as pd

# load the first json file using pandas and view the data
df1 = pd.read_json('file1.json')
print(df1)

# load the second json file and view the data
df2 = pd.read_json('file2.json')
print(df2)

# concatenate the dataframes and view the result
df = pd.concat([df1, df2])
print(df)

# convert dataframe to csv file
df.to_csv("CSV.csv", index=False)

# load the resultant csv file and view the data
result = pd.read_csv("CSV.csv")
print(result)
If you run the above code, you will get the following output.
   ID     Name  Marks Grade
0  23      Ram     89     B
1  43     Deep     97     A

   ID     Name  Marks Grade
0  90    Akash     81     B
1  56  Chalsea     87     B

   ID     Name  Marks Grade
0  23      Ram     89     B
1  43     Deep     97     A
0  90    Akash     81     B
1  56  Chalsea     87     B

   ID     Name  Marks Grade
0  23      Ram     89     B
1  43     Deep     97     A
2  90    Akash     81     B
3  56  Chalsea     87     B
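For the partially matching case mentioned above, pd.concat already produces the union of the columns, filling cells a frame does not supply with NaN. A small sketch (the column values here are made up for illustration):

```python
import pandas as pd

# Two frames that share only the "ID" column (illustrative data).
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['Ram', 'Deep']})
df2 = pd.DataFrame({'ID': [3, 4], 'Marks': [81, 87]})

# The default join='outer' keeps the union of the columns; missing
# cells become NaN.
df = pd.concat([df1, df2], ignore_index=True)
print(sorted(df.columns))  # ['ID', 'Marks', 'Name']
```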
Suggestion : 5
This Python module allows you to merge a series of JSON documents into a single one. The Merger class lets you further customize how the JSON data is merged; another common example is when you need to keep a versioned list of the values that appeared in the series of documents.
Consider a trivial example with two documents:
>>> base = {
...     "foo": 1,
...     "bar": ["one"],
... }
>>> head = {
...     "bar": ["two"],
...     "baz": "Hello, world!"
... }
We call the document we are merging changes into base and the changed document head. To merge these two documents using jsonmerge:
>>> from pprint import pprint
>>> from jsonmerge import merge
>>> result = merge(base, head)
>>> pprint(result, width=40)
{'bar': ['two'],
 'baz': 'Hello, world!',
 'foo': 1}
Let’s say you want to specify that the merged bar field in the example document above should contain elements from all documents, not just the latest one. You can do this with a schema like this:
>>> schema = {
...     "properties": {
...         "bar": {
...             "mergeStrategy": "append"
...         }
...     }
... }
>>> from jsonmerge import Merger
>>> merger = Merger(schema)
>>> result = merger.merge(base, head)
>>> pprint(result, width=40)
{'bar': ['one', 'two'],
 'baz': 'Hello, world!',
 'foo': 1}
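The "versioned list of values" use case mentioned above can be emulated in plain Python when jsonmerge is not installed. This is a sketch of the idea only, not jsonmerge's actual API:

```python
# Keep every value that ever appeared, newest last, mimicking the
# versioned-list idea described above (not jsonmerge's actual API).
def merge_versioned(base, head):
    history = list(base) if base else []
    history.append({'value': head})
    return history

v = merge_versioned(None, 'rev 1')
v = merge_versioned(v, 'rev 2')
print(v)  # [{'value': 'rev 1'}, {'value': 'rev 2'}]
```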
To install the latest jsonmerge release from the Python Package Index:

pip install jsonmerge

To install from source, run the following from the top of the source distribution:

python setup.py install

jsonmerge uses Tox for testing. To run the test suite run:

tox