How to manage nested objects in Elasticsearch documents

How to add, update and delete nested objects in Elasticsearch documents using the Update API and Painless scripts.

In this post we are going to manage nested objects of a document indexed with Elasticsearch.

"The nested type is a specialised version of the object datatype that allows arrays of objects to be indexed in a way that they can be queried independently of each other." (Nested datatype, official Elasticsearch reference)
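To make "queried independently of each other" more concrete, here is a minimal Python sketch. It is a simulation on plain dictionaries, not Elasticsearch code, and the function names are hypothetical. It assumes the default object type behaves as the reference describes, flattening an array of objects into one list of values per field, and it compares exact values rather than analyzed text.

```python
# Illustrative simulation only: plain Python, no Elasticsearch involved.
cats = [
    {"name": "Irida", "breed": "European Shorthair"},
    {"name": "Nino", "breed": "Aegean"},
]


def flatten_like_object_type(objects):
    """Imitate the default object type: each field becomes one flat list of
    values, so the link between a cat's name and its breed is lost."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def matches_flattened(objects, **wanted):
    """True if every wanted value appears somewhere in its flattened field."""
    flat = flatten_like_object_type(objects)
    return all(value in flat.get(key, []) for key, value in wanted.items())


def matches_nested(objects, **wanted):
    """True only if one single object satisfies all wanted values at once."""
    return any(
        all(obj.get(key) == value for key, value in wanted.items())
        for obj in objects
    )


# No cat is both named "Irida" and of breed "Aegean".
print(matches_flattened(cats, name="Irida", breed="Aegean"))  # True (false positive)
print(matches_nested(cats, name="Irida", breed="Aegean"))     # False
```

The flattened version reports a match because "Irida" and "Aegean" each exist somewhere in the array, while the nested version checks each cat on its own.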

Prerequisites

To follow this post you need:

  • an up and running Elasticsearch instance
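    • I use 6.7 here; the requests below rely on a mapping type (human), a feature that was deprecated in 7.0 and removed in 8.0, so the mapping and URLs need adjusting on newer versions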
  • an up and running Kibana instance to interact with Elasticsearch

Preparation

Each document in our index will represent a human, and its nested objects will be cats (no surprises).

Create the index

Open your Kibana dev console and type the following to create the index.

# Create the index
PUT iridakos_nested_objects
{
  "mappings": {
    "human": {
      "properties": {
        "name": {
          "type": "text"
        },
        "cats": {
          "type": "nested",
          "properties": {
            "colors": {
              "type": "integer"
            },
            "name": {
              "type": "text"
            },
            "breed": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}

Human has:

  • a name property of type text
  • a cats property of type nested

Each cat has:

  • a colors property of type integer
  • a name property of type text
  • a breed property of type text

Add a human

In the Kibana console, execute the following to add a human with three cats.

# Index a human
POST iridakos_nested_objects/human/1
{
  "name": "iridakos",
  "cats": [
    {
      "colors": 1,
      "name": "Irida",
      "breed": "European Shorthair"
    },
    {
      "colors": 2,
      "name": "Phoebe",
      "breed": "European"
    },
    {
      "colors": 3,
      "name": "Nino",
      "breed": "Aegean"
    }
  ]
}

Confirm the insertion with:

GET iridakos_nested_objects/human/1

You should see something like this:

{
  "_index": "iridakos_nested_objects",
  "_type": "human",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "iridakos",
    "cats": [
      {
        "colors": 1,
        "name": "Irida",
        "breed": "European Shorthair"
      },
      {
        "colors": 2,
        "name": "Phoebe",
        "breed": "European"
      },
      {
        "colors": 3,
        "name": "Nino",
        "breed": "Aegean"
      }
    ]
  }
}

Done, moving on.

Managing nested objects

Add a new nested object

Suppose that iridakos got a new Persian cat named Leon. To add it to iridakos’ collection of cats we will use the Update API.

In Kibana:

# Add a new cat
POST iridakos_nested_objects/human/1/_update
{
  "script": {
    "source": "ctx._source.cats.add(params.cat)",
    "params": {
      "cat": {
        "colors": 4,
        "name": "Leon",
        "breed": "Persian"
      }
    }
  }
}

Notes:

  • We accessed the nested cat objects of our human with ctx._source.cats. This gave us a collection.
  • We executed the add method on the collection to add a new cat.
  • The properties of the new cat were passed in the params attribute of the request under the key cat, which the script reads as params.cat.
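The notes above can be sketched in Python. This is only a rough stand-in: build_add_cat_body and apply_add are hypothetical helpers, and apply_add imitates on a plain dictionary what the Painless script does to ctx._source. It shows that the script source stays constant while only the params change from one cat to the next.

```python
import copy
import json

# The script text from the request above; it never changes between calls.
SCRIPT_SOURCE = "ctx._source.cats.add(params.cat)"


def build_add_cat_body(cat):
    """Build the JSON body of the update request for any new cat."""
    return {"script": {"source": SCRIPT_SOURCE, "params": {"cat": cat}}}


def apply_add(source, params):
    """Plain-Python stand-in for the script: append params['cat'] to cats.
    Works on a copy so the input document is left untouched."""
    updated = copy.deepcopy(source)
    updated["cats"].append(params["cat"])
    return updated


human = {
    "name": "iridakos",
    "cats": [{"colors": 1, "name": "Irida", "breed": "European Shorthair"}],
}
leon = {"colors": 4, "name": "Leon", "breed": "Persian"}

body = build_add_cat_body(leon)
print(json.dumps(body, indent=2))  # the request body to paste into Kibana

after = apply_add(human, body["script"]["params"])
print([cat["name"] for cat in after["cats"]])  # ['Irida', 'Leon']
```

Like the script, apply_add assumes the cats field already exists on the document.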

Confirm the addition with:

GET iridakos_nested_objects/human/1

Cat added:

{
  "_index": "iridakos_nested_objects",
  "_type": "human",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "name": "iridakos",
    "cats": [
      {
        "colors": 1,
        "name": "Irida",
        "breed": "European Shorthair"
      },
      {
        "colors": 2,
        "name": "Phoebe",
        "breed": "European"
      },
      {
        "colors": 3,
        "name": "Nino",
        "breed": "Aegean"
      },
      {
        "colors": 4,
        "name": "Leon",
        "breed": "Persian"
      }
    ]
  }
}

Remove a nested object

Suppose we want to remove Nino from the human’s cat collection.

In Kibana:

# Remove Nino
POST iridakos_nested_objects/human/1/_update
{
  "script": {
    "source": "ctx._source.cats.removeIf(cat -> cat.name == params.cat_name)",
    "params": {
      "cat_name": "Nino"
    }
  }
}

Notes:

  • We accessed the nested cat objects of our human with ctx._source.cats. This gave us a collection
  • We executed the removeIf method on the collection to conditionally remove an item
  • We provided a Predicate to the removeIf method in which we specify which items we want to remove. This predicate will be executed on each item of the collection and resolves to a Boolean value. If the resolution is true then the item will be removed. In our case, the condition is a simple equality check on the cat’s name attribute.
  • The cat_name was passed as a parameter (params.cat_name) instead of being hard-coded in the script source.
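For readers less familiar with removeIf, here is a minimal Python sketch of the same idea on a plain list of dictionaries. The remove_if helper is hypothetical. It is written to mimic the in-place behaviour of Java's Collection.removeIf, which Painless exposes, including its boolean return value.

```python
def remove_if(items, predicate):
    """Remove, in place, every item for which predicate(item) is true.
    Returns True if at least one item was removed."""
    kept = [item for item in items if not predicate(item)]
    changed = len(kept) != len(items)
    items[:] = kept  # mutate the original list instead of rebinding it
    return changed


cats = [
    {"colors": 1, "name": "Irida", "breed": "European Shorthair"},
    {"colors": 2, "name": "Phoebe", "breed": "European"},
    {"colors": 3, "name": "Nino", "breed": "Aegean"},
    {"colors": 4, "name": "Leon", "breed": "Persian"},
]
params = {"cat_name": "Nino"}

# Same predicate as the script: an equality check on the cat's name.
removed = remove_if(cats, lambda cat: cat["name"] == params["cat_name"])
print(removed)                        # True
print([cat["name"] for cat in cats])  # ['Irida', 'Phoebe', 'Leon']
```

Because the predicate reads the name from params, the same function removes any other cat without changing the code.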

Confirm the removal with:

GET iridakos_nested_objects/human/1

Cat removed:

{
  "_index": "iridakos_nested_objects",
  "_type": "human",
  "_id": "1",
  "_version": 3,
  "found": true,
  "_source": {
    "name": "iridakos",
    "cats": [
      {
        "colors": 1,
        "name": "Irida",
        "breed": "European Shorthair"
      },
      {
        "colors": 2,
        "name": "Phoebe",
        "breed": "European"
      },
      {
        "colors": 4,
        "name": "Leon",
        "breed": "Persian"
      }
    ]
  }
}

Update a nested object

Suppose we want to change all cat breeds from European to European Shorthair (Phoebe is the only one in our case).

# Update breed
POST iridakos_nested_objects/human/1/_update
{
  "script": {
    "source": "def targets = ctx._source.cats.findAll(cat -> cat.breed == params.current_breed); for(cat in targets) { cat.breed = params.breed }",
    "params": {
      "current_breed": "European",
      "breed": "European Shorthair"
    }
  }
}

Notes:

  • We accessed the nested cat objects of our human with ctx._source.cats. This gave us a collection
  • We executed the findAll method on the collection to select specific items
  • We provided a Predicate to the findAll method in which we specify which items we want to select. This predicate will be executed on each item of the collection and resolves to a Boolean value. If the resolution is true then the item will be selected. In our case, the condition is a simple equality check on the cat’s breed attribute.
  • The current_breed was passed as a parameter (params.current_breed) instead of being hard-coded in the script source.
  • We then loop over the selected cats (whose breed attribute has the value European) and change their breed to the new value, which we passed in another parameter, breed.
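The select-then-modify pattern in these notes can be sketched in Python on plain dictionaries. The find_all helper is a hypothetical stand-in for the findAll call. The sketch assumes, as the result of the script suggests, that the selected items are the same objects held in the original collection, so changing them changes the document.

```python
def find_all(items, predicate):
    """Return a new list holding the items for which predicate(item) is true.
    The list is new, but its elements are the original objects."""
    return [item for item in items if predicate(item)]


cats = [
    {"colors": 1, "name": "Irida", "breed": "European Shorthair"},
    {"colors": 2, "name": "Phoebe", "breed": "European"},
    {"colors": 4, "name": "Leon", "breed": "Persian"},
]
params = {"current_breed": "European", "breed": "European Shorthair"}

# Exact equality: "European Shorthair" does not equal "European",
# so Irida is not selected.
targets = find_all(cats, lambda cat: cat["breed"] == params["current_breed"])
for cat in targets:
    cat["breed"] = params["breed"]

print([cat["name"] for cat in targets])  # ['Phoebe']
print([cat["breed"] for cat in cats])
# ['European Shorthair', 'European Shorthair', 'Persian']
```

The comparison is on the stored value as a whole, not on analyzed text, which is why only Phoebe is updated.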

Confirm the change:

GET iridakos_nested_objects/human/1

Cat updated:

{
  "_index": "iridakos_nested_objects",
  "_type": "human",
  "_id": "1",
  "_version": 4,
  "found": true,
  "_source": {
    "name": "iridakos",
    "cats": [
      {
        "colors": 1,
        "name": "Irida",
        "breed": "European Shorthair"
      },
      {
        "colors": 2,
        "name": "Phoebe",
        "breed": "European Shorthair"
      },
      {
        "colors": 4,
        "name": "Leon",
        "breed": "Persian"
      }
    ]
  }
}

Update multiple attributes of nested objects fulfilling multiple conditions

Now, in a more advanced example, we are going to use a more flexible script to:

  • target objects based on multiple conditions (here colors and breed)
  • update more than one attribute (here colors and breed)

Suppose we want to find the cats that have 4 colors and whose breed is Persian, then change their breed to Aegean and their colors to 3.

The script we will use is the following:

# Update multiple attributes with multiple conditions
POST iridakos_nested_objects/human/1/_update
{
  "script": {
    "source": "def targets = ctx._source.cats.findAll(cat -> { for (condition in params.conditions.entrySet()) { if (cat[condition.getKey()] != condition.getValue()) { return false; } } return true; }); for (cat in targets) { for (change in params.changes.entrySet()) { cat[change.getKey()] = change.getValue() } }",
    "params": {
      "conditions": {
        "breed": "Persian",
        "colors": 4
      },
      "changes": {
        "breed": "Aegean",
        "colors": 3
      }
    }
  }
}

For convenience, here’s the script source with proper indentation.

def targets = ctx._source.cats.findAll(cat -> {
  for (condition in params.conditions.entrySet()) {
    if (cat[condition.getKey()] != condition.getValue()) {
      return false;
    }
  }
  return true;
});
for (cat in targets) {
  for (change in params.changes.entrySet()) {
    cat[change.getKey()] = change.getValue()
  }
}

Notes:

  • We select which cats we want to update by checking that their properties have the value specified in the params.conditions parameter.
  • For each selected cat, we change its attributes’ values as specified in the params.changes parameter.
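As a rough Python equivalent of the two steps in these notes, the sketch below works on plain dictionaries. The matches and apply_changes helpers are hypothetical names, not part of Elasticsearch or Painless. A cat is selected only when every condition holds, and each selected cat then receives every change.

```python
def matches(cat, conditions):
    """True only if the cat has every key in conditions with an equal value."""
    return all(cat.get(key) == value for key, value in conditions.items())


def apply_changes(cats, conditions, changes):
    """Update, in place, every cat that satisfies all conditions.
    Returns the number of cats that were changed."""
    targets = [cat for cat in cats if matches(cat, conditions)]
    for cat in targets:
        cat.update(changes)
    return len(targets)


cats = [
    {"colors": 1, "name": "Irida", "breed": "European Shorthair"},
    {"colors": 2, "name": "Phoebe", "breed": "European Shorthair"},
    {"colors": 4, "name": "Leon", "breed": "Persian"},
]
params = {
    "conditions": {"breed": "Persian", "colors": 4},
    "changes": {"breed": "Aegean", "colors": 3},
}

count = apply_changes(cats, params["conditions"], params["changes"])
print(count)     # 1
print(cats[2])   # {'colors': 3, 'name': 'Leon', 'breed': 'Aegean'}
```

Since both the conditions and the changes are ordinary maps, targeting other attributes only requires different params, not a different script.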

Confirm:

GET iridakos_nested_objects/human/1

Cat updated.

{
  "_index": "iridakos_nested_objects",
  "_type": "human",
  "_id": "1",
  "_version": 5,
  "found": true,
  "_source": {
    "name": "iridakos",
    "cats": [
      {
        "colors": 1,
        "name": "Irida",
        "breed": "European Shorthair"
      },
      {
        "colors": 2,
        "name": "Phoebe",
        "breed": "European Shorthair"
      },
      {
        "name": "Leon",
        "colors": 3,
        "breed": "Aegean"
      }
    ]
  }
}

That’s all!