Merge all JSON files in a directory with Python

Suppose there are 3 files - data1.json, data2.json, data3.json.

Let's say data1.json contains -

{ 
   "players":[ 
      { 
         "name":"Alexis Sanchez",
         "club":"Manchester United"
      },
      { 
         "name":"Robin van Persie",
         "club":"Feyenoord"
      }
   ]
}

data2.json contains -

{ 
   "players":[ 
      { 
         "name":"Nicolas Pepe",
         "club":"Arsenal"
      }
   ]
}

data3.json contains -

{ 
   "players":[ 
      { 
         "name":"Gonzalo Higuain",
         "club":"Napoli"
      },
      { 
         "name":"Sunil Chettri",
         "club":"Bengaluru FC"
      }
   ]
}

Merging these 3 files should produce a file, result.json, with the following data -

{ 
   "players":[ 
      { 
         "name":"Alexis Sanchez",
         "club":"Manchester United"
      },
      { 
         "name":"Robin van Persie",
         "club":"Feyenoord"
      },
      { 
         "name":"Nicolas Pepe",
         "club":"Arsenal"
      },
      { 
         "name":"Gonzalo Higuain",
         "club":"Napoli"
      },
      { 
         "name":"Sunil Chettri",
         "club":"Bengaluru FC"
      }
   ]
}

How can I open multiple JSON files from a folder and merge them into a single JSON file in Python?

My approach:

import os
import json
import pandas as pd

# Placeholder: set this to the directory that holds the JSON files.
path_to_json = "path/to/json/files"
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]

jsons_data = pd.DataFrame(columns=['name', 'club'])

for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)

        # The key must match the top-level key used in the files.
        # Only the first player of each file is read here.
        name = json_text['players'][0]['name']
        club = json_text['players'][0]['club']

        jsons_data.loc[index] = [name, club]

print(jsons_data)
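One possible way to get the result.json shown above, using only the standard library, is sketched below. It is not a definitive answer: the function name `merge_player_files` is made up for illustration, and the sketch assumes every file holds a single top-level key spelled "players" in any letter case. The demonstration writes two small sample files to a temporary directory so the snippet can run on its own.

```python
import json
import tempfile
from pathlib import Path


def merge_player_files(directory, output_name="result.json"):
    """Merge the player lists of every *.json file in `directory`."""
    directory = Path(directory)
    merged = []
    for path in sorted(directory.glob("*.json")):
        if path.name == output_name:
            continue  # do not re-read the result of a previous merge
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: the list lives under "players", in any letter case.
        for key, value in data.items():
            if key.lower() == "players":
                merged.extend(value)
    result = {"players": merged}
    with (directory / output_name).open("w", encoding="utf-8") as handle:
        json.dump(result, handle, indent=3)
    return result


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "data1.json": {"Players": [{"name": "Alexis Sanchez", "club": "Manchester United"}]},
        "data2.json": {"players": [{"name": "Nicolas Pepe", "club": "Arsenal"}]},
    }
    for name, content in samples.items():
        (Path(tmp) / name).write_text(json.dumps(content), encoding="utf-8")
    demo_result = merge_player_files(tmp)
    print(demo_result)
```

The key point is `merged.extend(value)`: it splices each file's list into one flat list, whereas `append` would nest one list per file.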


# You have file1.json and file2.json files.
# Each file has this structure:
# [{"key1": "value1"}] - (in file1)
# [{"key2": "value2"}] - (in file2)
# Your goal is to merge them into the following:
# [{"key1": "value1"},
# {"key2": "value2"}]
import json
import glob

result = []
for f in glob.glob("*.json"):
    if f == "merged_file.json":
        continue  # skip the output of a previous run
    with open(f, "r") as infile:
        # extend (not append) so each file's items land in one flat list
        result.extend(json.load(infile))

# Text mode: json.dump writes str, so "wb" fails on Python 3
with open("merged_file.json", "w") as outfile:
    json.dump(result, outfile, indent=4)

Hi. All feedback is more than welcome. I created this solution because I couldn't find one anywhere. This small script reads a list of files selected by file name.

Let’s say that you have something like this:

09/03/2018 07:01 p. m.              173 file-20180309T200145.json
09/03/2018 11:01 p. m.              173 file-20180310T000129.json
10/03/2018 03:01 a. m.              173 file-20180310T040117.json
10/03/2018 07:01 a. m.              173 file-20180310T080111.json
10/03/2018 11:01 a. m.              173 file-20180310T120127.json
11/03/2018 03:01 p. m.              173 file-20180311T160118.json

And you need to consolidate them as quickly as possible before ingestion. This is where the script comes in handy: it also adds the filename to the data, so you can trace each row back to your raw data.

Assumptions:

  1. The Python script must be in the same directory as the JSON files.
  2. The Python script and any other files within the same folder MUST have names that differ from those of the files to be merged.

The code:

import json
import pandas as pd
from findtools.find_files import (find_files, Match)


# Name pattern of the files to merge
File_prefix = 'file-*'

dfs = []

# Recursively find all files matching File_prefix under the current directory
json_files_pattern = Match(filetype='f', name=File_prefix)
found_files = find_files(path='.', match=json_files_pattern)

for found_file in found_files:
    # Load one JSON file
    with open(found_file) as f:
        data = json.load(f)

    # Turn the JSON into a pandas DataFrame in table form
    df = pd.DataFrame.from_dict(data, orient='columns')
    # Add the filename column so every row can be traced to its source file
    df['filename'] = found_file
    dfs.append(df)  # append the data frame to the list
    print("Adding... " + found_file)

temp = pd.concat(dfs, ignore_index=True)  # combine the frames of all JSON files
print("Saving .... " + File_prefix + " File")
# "File_prefix" is a literal string here, so the output file is dataFile_prefix.csv
temp.to_csv("data" + "File_prefix" + ".csv")

Please share your feedback and thank you.
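For readers who would rather avoid the findtools and pandas dependencies, here is a rough standard-library sketch of the same idea. It is an alternative, not the script above: the function name `consolidate`, the output name consolidated.csv and the sample records are made up, and it assumes each file contains a list of flat JSON objects, which the post does not show.

```python
import csv
import glob
import json
import os
import tempfile


def consolidate(directory, pattern="file-*.json", output="consolidated.csv"):
    """Merge list-of-object JSON files into one CSV, tagging rows with their source file."""
    rows = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        with open(path, encoding="utf-8") as handle:
            records = json.load(handle)  # assumed: a list of flat dicts
        for record in records:
            rows.append({**record, "filename": os.path.basename(path)})
    # Collect every column name while preserving first-seen order.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    with open(os.path.join(directory, output), "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    # Made-up sample content; the structure of the real files is not shown in the post.
    for name, value in [("file-20180309T200145.json", 1), ("file-20180310T000129.json", 2)]:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as handle:
            json.dump([{"reading": value}], handle)
    demo_rows = consolidate(tmp)
    with open(os.path.join(tmp, "consolidated.csv"), encoding="utf-8") as handle:
        demo_csv = handle.read()
    print(demo_csv)
```

Because the pattern here ends in .json and the output is a .csv file, the script cannot pick up its own output on a second run.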

How do you merge JSON files in Python?

I am using the below code to merge the files:

import glob
import json

data = []
for f in glob.glob("*.json"):
    with open(f) as infile:
        data.append(json.load(infile))
with open("merged_file.json", 'w') as outfile:
    json.dump(data, outfile)

out: [[[a,b],[c,d],[e,f]],[[g,h],[i,f],[k,l]],[[m,n],[o,p],[q,r]]]
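The extra level of nesting in that output comes from `append`, which adds each file's whole list as a single element. A small sketch of the difference follows, with made-up in-memory strings standing in for the file contents so it runs without any files on disk:

```python
import json

# In-memory stand-ins for the text of three JSON files (made-up values).
file_contents = [
    '[["a", "b"], ["c", "d"]]',
    '[["g", "h"]]',
    '[["m", "n"], ["o", "p"]]',
]

nested = []
flat = []
for text in file_contents:
    parsed = json.loads(text)
    nested.append(parsed)  # keeps one sub-list per file
    flat.extend(parsed)    # splices the file's items into one list

print(nested)
print(flat)
```

If one flat list is what you want in merged_file.json, `extend` is probably the call to use. It assumes every file's top-level value is a list.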

How do I merge multiple JSON files in one json file in Python?

Step 1: Load the JSON files into pandas DataFrames. Step 2: Merge the DataFrames with the join type you need (inner, outer, left, or right). Step 3: Convert the merged DataFrame into a CSV file.
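The three steps could look roughly like this. It is only a sketch: the file names, the column names ("id", "city", "score") and the values are invented for illustration, and it assumes both files are lists of records that share an "id" column.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Made-up sample files so the example is self-contained.
    (tmp / "left.json").write_text(json.dumps([
        {"id": 1, "city": "A"},
        {"id": 2, "city": "B"},
    ]))
    (tmp / "right.json").write_text(json.dumps([
        {"id": 1, "score": 10},
    ]))

    # Step 1: load each JSON file into a DataFrame.
    left = pd.read_json(tmp / "left.json")
    right = pd.read_json(tmp / "right.json")

    # Step 2: merge on the shared column; how= may be inner, outer, left or right.
    merged = left.merge(right, on="id", how="left")

    # Step 3: write the merged frame to CSV.
    csv_path = tmp / "merged.csv"
    merged.to_csv(csv_path, index=False)
    csv_text = csv_path.read_text()

print(merged)
```

With `how="left"`, every row of the first file is kept and missing matches become NaN. Note that this produces a CSV file rather than a JSON file.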

How do I load multiple JSON files in Python?

To load and parse a JSON file that contains multiple JSON objects, follow these steps:

  1. Create an empty list called jsonList.
  2. Read the file line by line, because each line contains valid JSON. ...
  3. Convert each JSON object into a Python dict using a json. ...
  4. Save this dictionary into the list called jsonList.
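The steps above can be sketched as follows. The text is cut off where it names the conversion call, so the use of `json.loads` here is my assumption. The function name `load_json_lines` and the sample file name are made up as well.

```python
import json
import tempfile
from pathlib import Path


def load_json_lines(path):
    """Return a list with one dict per non-empty line of `path`."""
    jsonList = []  # step 1: start with an empty list
    with open(path, encoding="utf-8") as handle:
        for line in handle:  # step 2: read the file line by line
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # steps 3 and 4: parse the line (assumed: json.loads) and store it
            jsonList.append(json.loads(line))
    return jsonList


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "players.jsonl"  # hypothetical file name
    sample.write_text(
        '{"name": "Nicolas Pepe", "club": "Arsenal"}\n'
        '{"name": "Sunil Chettri", "club": "Bengaluru FC"}\n',
        encoding="utf-8",
    )
    records = load_json_lines(sample)
    print(records)
```

This only works when every line is a complete JSON document. A pretty-printed file that spreads one object over several lines needs `json.load` on the whole file instead.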
