Guide: deep merging JSON in Python

It depends on what you want to achieve.

Solution 1

If you simply want the top-level values of the new dictionary to replace those of the old one, you can use one of the following options:

from pprint import pprint

result = {**file_1, **file_2}

pprint(result)

This will result in:

{'toplevel': {'value': {'settings': [{'inner': {'name': 'Another Real Value'},
                                      'name': 'A Real Value'}]}}}

Alternatively you can use

file_1.update(file_2)

pprint(file_1)

This leads to the same outcome, but updates file_1 in place.
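
To see why this replaces nested values wholesale, here is a small sketch. The contents of file_1 and file_2 below are assumptions, chosen to be consistent with the outputs shown in this answer; the actual inputs are not reproduced here.

```python
# Hypothetical inputs, chosen to match the outputs shown in this answer.
file_1 = {"toplevel": {"value": {"settings": [
    {"name": "Old Value", "region": "US",
     "inner": {"name": "Old Inner Value", "setting": "help"}}
]}}}
file_2 = {"toplevel": {"value": {"settings": [
    {"name": "A Real Value", "inner": {"name": "Another Real Value"}}
]}}}

# Unpacking merges only the top-level keys, so 'toplevel' from file_2
# replaces the whole nested structure that file_1 had under that key.
result = {**file_1, **file_2}

settings = result["toplevel"]["value"]["settings"][0]
print(result == file_2)      # the merged result is just file_2
print("region" in settings)  # keys that only file_1 had are gone
```

Both options are shallow: anything nested below a shared top-level key is taken from file_2 as a whole.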

Solution 2

If you only want to update specific keys inside the nested structure and leave all other values intact, you can do this using recursion. Your example uses dict, list and str values, so I will build the recursion around those types.

def update_dict(original, update):
    for key, value in update.items():

        # Add new key values
        if key not in original:
            original[key] = update[key]
            continue

        # Update the old key values with the new key values
        if key in original:
            if isinstance(value, dict):
                update_dict(original[key], update[key])
            if isinstance(value, list):
                update_list(original[key], update[key])
            if isinstance(value, (str, int, float)):
                original[key] = update[key]
    return original


def update_list(original, update):
    # Make sure the order is equal, otherwise it is hard to compare the items.
    assert len(original) == len(update), "Can only handle equal length lists."

    for idx, (val_original, val_update) in enumerate(zip(original, update)):
        if not isinstance(val_original, type(val_update)):
            raise ValueError(f"Different types! {type(val_original)}, {type(val_update)}")
        if isinstance(val_original, dict):
            original[idx] = update_dict(original[idx], update[idx])
        if isinstance(val_original, (tuple, list)):
            original[idx] = update_list(original[idx], update[idx])
        if isinstance(val_original, (str, int, float)):
            original[idx] = val_update
    return original

The above might be a bit harder to understand, so I will explain it. There are two functions: one that merges two dictionaries and one that merges two lists.

Merging dictionaries

In order to merge the two dictionaries I go over all the keys and values of the update dictionary, because this will probably be the smaller of the two.

The first block adds keys that were not in the original dictionary, copying their values over from the update.

The second block is updating the nested values. There I distinguish three cases:

  1. If the value is another dict, run the dictionary merge again, but one level deeper.
  2. If the value is a list, run the list merge function.
  3. If the value is a str (or int, float), replace the original value with the updated value.
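
The key-by-key recursion can be illustrated on dictionaries alone. This is a minimal sketch, not the full solution: `merge_dicts` is a hypothetical name, and lists are simply replaced here instead of being merged.

```python
def merge_dicts(original, update):
    """Recursively merge `update` into `original` (dicts only), in place."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(original.get(key), dict):
            # Both sides are dicts: run the merge one level deeper.
            merge_dicts(original[key], value)
        else:
            # New key, or a non-dict value: take the value from the update.
            original[key] = value
    return original


original = {"a": 1, "nested": {"keep": "me", "change": "old"}}
update = {"b": 2, "nested": {"change": "new"}}
merged = merge_dicts(original, update)
print(merged)
```

Here 'keep' survives because only the keys present in the update are touched at each level.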

Merging lists

This is a bit trickier than dictionaries, because list elements have no keys that I can compare, only positions. Therefore I make a strong assumption: the update list always has the same number of elements, in the same order, as the original list. See Limitations for how to handle lists with more than one element.

Since the lists are of the same length, I can assume that their indices match. To merge the values at each index, I do the following:

  1. Make sure that the value types are the same; otherwise raise an error, since I am not sure how to handle that case.
  2. If the values are dictionaries, use the dictionary merge.
  3. If the values are lists (or tuples), use the list merge.
  4. If the values are str (or int, float), overwrite the original in place.
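
The index-wise pairing can be shown in isolation. A minimal sketch, assuming flat lists of scalars and dicts: `merge_lists` is a hypothetical helper, and dicts are merged with a shallow `dict.update` for brevity rather than with the full recursion.

```python
def merge_lists(original, update):
    """Merge two equal-length lists position by position, in place."""
    if len(original) != len(update):
        raise ValueError("Can only handle equal length lists.")
    for idx, (old, new) in enumerate(zip(original, update)):
        if not isinstance(old, type(new)):
            raise ValueError(f"Different types! {type(old)}, {type(new)}")
        if isinstance(old, dict):
            old.update(new)      # shallow here; the full version recurses
        else:
            original[idx] = new  # scalars are overwritten
    return original


merged = merge_lists([{"name": "a", "region": "US"}, 1], [{"name": "b"}, 2])
print(merged)
```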

Result

Using:

from pprint import pprint

pprint(update_dict(file_1, file_2))

The final result will be:

{'toplevel': {'value': {'settings': [{'inner': {'name': 'Another Real Value',
                                                'setting': 'help'},
                                      'name': 'A Real Value',
                                      'region': 'US'}]}}}

Note that, in contrast with the first solution, the values 'setting': 'help' and 'region': 'US' are still present in the result.
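
Because update_dict assigns into its first argument, file_1 itself is modified by the call. If the original should stay untouched, one option is to merge into a deep copy. A minimal sketch: `merged_copy` is a hypothetical helper, the small `merge` function only stands in for any in-place merge such as update_dict, and the two dictionaries are made-up sample data.

```python
import copy


def merge(original, update):
    """Stand-in for an in-place recursive merge (dicts only)."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(original.get(key), dict):
            merge(original[key], value)
        else:
            original[key] = value
    return original


def merged_copy(original, update):
    """Merge into a deep copy so that `original` is left unchanged."""
    return merge(copy.deepcopy(original), update)


file_1 = {"inner": {"name": "old", "setting": "help"}}
file_2 = {"inner": {"name": "new"}}
result = merged_copy(file_1, file_2)
print(result)
print(file_1)  # still holds the old name
```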

Limitations

Because of the equal-length constraint, if you do not want to update an element in the list, you have to pass an empty value of the same type in its place (for example, {} for a dictionary).

Example on how to ignore a list update:

... {'settings': [
          {},                     # do not update the first element.
          {'name': 'A new name'}  # update second element.
       ]
    }
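
The empty-placeholder trick can be checked with a compact stand-in for the merge. This is a sketch under assumptions: `deep_merge` is a hypothetical function that follows the same rules (dicts by key, equal-length lists by index, scalars overwritten), and the sample data is made up.

```python
def deep_merge(original, update):
    """Compact sketch: merge dicts by key and equal-length lists by index."""
    if isinstance(original, dict) and isinstance(update, dict):
        for key, value in update.items():
            if key in original:
                original[key] = deep_merge(original[key], value)
            else:
                original[key] = value
        return original
    if isinstance(original, list) and isinstance(update, list):
        assert len(original) == len(update), "Can only handle equal length lists."
        return [deep_merge(old, new) for old, new in zip(original, update)]
    return update  # scalars: the update wins


original = {"settings": [{"name": "first"}, {"name": "second"}]}
update = {"settings": [{}, {"name": "A new name"}]}
merged = deep_merge(original, update)
print(merged)
```

The empty dict has no keys to iterate over, so the first element comes through unchanged while the second one gets its new name.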
