Collections

Overview

A Collection is a grouping of Entities that share the same Partition Key, which lets you query efficiently across multiple entities. If your background is SQL, think of Partition Keys as Foreign Keys; a Collection then represents a View that joins multiple Entities.

ElectroDB Collections use a single DynamoDB query to retrieve results for all Entities in the collection (one of the benefits of single table design). Keep in mind, however, that DynamoDB returns all records in order of the Entity’s sort key. When your partition contains a large volume of items, a given page of results may contain no items for some entities. This can be mitigated through the use of Index Types.
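To make the pagination caveat concrete, here is a minimal sketch of draining every page of a collection query and merging the per-entity arrays. The helper name `collectAllPages` and its `fetchPage` parameter are hypothetical and not part of ElectroDB; the page shape mirrors the `{ data, cursor }` response format shown later in this document.

```typescript
// Shape of one page of collection results: one array per entity, plus a cursor.
type CollectionPage = {
  data: Record<string, unknown[]>;
  cursor: string | null;
};

// Hypothetical helper: `fetchPage` is any function that runs the query for a given cursor.
async function collectAllPages(
  fetchPage: (cursor: string | null) => Promise<CollectionPage>,
): Promise<Record<string, unknown[]>> {
  const merged: Record<string, unknown[]> = {};
  let cursor: string | null = null;
  do {
    const page: CollectionPage = await fetchPage(cursor);
    for (const [entityName, items] of Object.entries(page.data)) {
      // An entity can be empty on one page and populated on a later one.
      merged[entityName] = [...(merged[entityName] ?? []), ...items];
    }
    cursor = page.cursor;
  } while (cursor !== null);
  return merged;
}
```

With a Service in hand, `fetchPage` could plausibly wrap a collection query and pass the cursor through the query options, assuming your ElectroDB version accepts a `cursor` option on `go()`.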

Collections are defined on an Index, and the name of the collection should represent what the query would return as a pseudo Entity. Additionally, Collection names must be unique across a Service.

A collection name must be unique to a single common index across entities.

Example Setup

Table Definition

{
  "TableName": "projectmanagement",
  "KeySchema": [
    {
      "AttributeName": "pk",
      "KeyType": "HASH"
    },
    {
      "AttributeName": "sk",
      "KeyType": "RANGE"
    }
  ],
  "AttributeDefinitions": [
    {
      "AttributeName": "pk",
      "AttributeType": "S"
    },
    {
      "AttributeName": "sk",
      "AttributeType": "S"
    },
    {
      "AttributeName": "gsi2pk",
      "AttributeType": "S"
    },
    {
      "AttributeName": "gsi2sk",
      "AttributeType": "S"
    }
  ],
  "GlobalSecondaryIndexes": [
    {
      "IndexName": "gsi2",
      "KeySchema": [
        {
          "AttributeName": "gsi2pk",
          "KeyType": "HASH"
        },
        {
          "AttributeName": "gsi2sk",
          "KeyType": "RANGE"
        }
      ],
      "Projection": {
        "ProjectionType": "ALL"
      }
    }
  ],
  "BillingMode": "PAY_PER_REQUEST"
}

Entities

import { Entity } from "electrodb";

export const Employee = new Entity({
  model: {
    entity: "employee",
    version: "1",
    service: "taskapp",
  },
  attributes: {
    employeeId: {
      type: "string",
    },
    organizationId: {
      type: "string",
    },
    name: {
      type: "string",
    },
    team: {
      type: ["jupiter", "mercury", "saturn"] as const,
    },
  },
  indexes: {
    staff: {
      pk: {
        field: "pk",
        composite: ["organizationId"],
      },
      sk: {
        field: "sk",
        composite: ["employeeId"],
      },
    },
    employee: {
      collection: "assignments",
      index: "gsi2",
      pk: {
        field: "gsi2pk",
        composite: ["employeeId"],
      },
      sk: {
        field: "gsi2sk",
        composite: [],
      },
    },
  },
});

import { Entity } from "electrodb";

const Task = new Entity({
  model: {
    entity: "tasks",
    version: "1",
    service: "taskapp",
  },
  attributes: {
    taskId: {
      type: "string",
    },
    employeeId: {
      type: "string",
    },
    projectId: {
      type: "string",
    },
    title: {
      type: "string",
    },
    body: {
      type: "string",
    },
  },
  indexes: {
    project: {
      pk: {
        field: "pk",
        composite: ["projectId"],
      },
      sk: {
        field: "sk",
        composite: ["taskId"],
      },
    },
    assigned: {
      collection: "assignments",
      index: "gsi2",
      pk: {
        field: "gsi2pk",
        composite: ["employeeId"],
      },
      sk: {
        field: "gsi2sk",
        composite: ["projectId"],
      },
    },
  },
});

Example

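import DynamoDB from "aws-sdk/clients/dynamodb";
import { Service } from "electrodb";

const table = "projectmanagement";
const client = new DynamoDB.DocumentClient();

// Entities are keyed by alias; these aliases become the keys of collection results.
const TaskApp = new Service(
  {
    employee: Employee,
    task: Task,
  },
  { client, table },
);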

await TaskApp.collections.assignments({ employeeId: "JExotic" }).go();

Response Format

{
  data: {
    task: TaskItem[];
    employee: EmployeeItem[];
  };
  cursor: string | null;
}

Equivalent Parameters

{
  "TableName": "projectmanagement",
  "ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
  "ExpressionAttributeValues": {
    ":pk": "$taskapp#employeeid_jexotic",
    ":sk1": "$assignments"
  },
  "KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
  "IndexName": "gsi2"
}

Collection Queries vs Entity Queries

To query across entities, collection queries make use of ElectroDB’s Sort Key structure, which prefixes Sort Key fields with the collection name. Unlike an Entity Query, Collection queries for isolated indexes only leverage Composite Attributes from an access pattern’s Partition Key, while Collection queries for clustered indexes allow you to query on both Partition and Sort Keys.

To better explain how Collection Queries are formed, here is a juxtaposition of an Entity Query’s parameters vs a Collection Query’s parameters:

Entity Query

Example

await TaskApp.entities.task.query.assigned({ employeeId: "JExotic" }).go();

Equivalent Parameters

{
  "KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
  "TableName": "projectmanagement",
  "ExpressionAttributeNames": {
    "#pk": "gsi2pk",
    "#sk1": "gsi2sk"
  },
  "ExpressionAttributeValues": {
    ":pk": "$taskapp#employeeid_jexotic",
    ":sk1": "$assignments#tasks_1"
  },
  "IndexName": "gsi2"
}

Collection Query

Example

await TaskApp.collections.assignments({ employeeId: "JExotic" }).go();

Equivalent Parameters

{
  "KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
  "TableName": "projectmanagement",
  "ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
  "ExpressionAttributeValues": {
    ":pk": "$taskapp#employeeid_jexotic",
    ":sk1": "$assignments"
  },
  "IndexName": "gsi2"
}

The notable difference between the two is how much of the Sort Key is specified at query time.

Entity Query Params

{
  "ExpressionAttributeValues": { ":sk1": "$assignments#tasks_1" }
}

Collection Query Params

{
  "ExpressionAttributeValues": { ":sk1": "$assignments" }
}
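The difference between the two prefixes can be sketched as plain string construction. This is only an illustration that reproduces the `:sk1` values shown above; the function names `collectionPrefix` and `entityPrefix` are hypothetical and this is not ElectroDB's internal implementation.

```typescript
// Illustrative only: rebuilds the sort key prefixes shown in the parameters above.
function collectionPrefix(collection: string | readonly string[]): string {
  const names = typeof collection === "string" ? [collection] : collection;
  return "$" + names.join("#");
}

// An Entity Query narrows the collection prefix with the entity name and model version.
function entityPrefix(
  collection: string | readonly string[],
  entity: string,
  version: string,
): string {
  return `${collectionPrefix(collection)}#${entity}_${version}`;
}

console.log(collectionPrefix("assignments")); // $assignments
console.log(entityPrefix("assignments", "tasks", "1")); // $assignments#tasks_1
```

Because the collection prefix is itself a prefix of every entity prefix within that collection, a single begins_with condition on the shorter value matches items from all member entities.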

Collection Response Structure

Unlike Entity Queries, which return an array of items, Collection Queries return an object. This object has a key for every Entity name (or Entity Alias) associated with that Collection, and each key holds an array of the results that belong to that Entity.

For example, using the “TaskApp” models defined above, we would expect the following response from a query to the “assignments” collection:

Example

const results = await TaskApp.collections
  .assignments({ employeeId: "JExotic" })
  .go();

Response Format

{
  data: {
    task: [...],    // tasks for employeeId "JExotic"
    employee: [...] // employee record(s) with employeeId "JExotic"
  },
  cursor: null
}

Equivalent Parameters

{
  "TableName": "projectmanagement",
  "ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
  "ExpressionAttributeValues": {
    ":pk": "$taskapp#employeeid_jexotic",
    ":sk1": "$assignments"
  },
  "KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
  "IndexName": "gsi2"
}

Because the Task and Employee Entities both associate their index (gsi2) with the same collection name (assignments), ElectroDB is able to associate the two entities via a shared Partition Key. As noted in the Overview, querying across Entities by PK is comparable to querying across a foreign key in a traditional relational database.
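As a rough sketch of consuming this response shape, the function below summarizes the `data` object of an assignments query. It assumes the Service aliases `task` and `employee` from the Example Setup; the item types and the `summarizeAssignments` name are hypothetical and reduced to the fields used here.

```typescript
// Hypothetical item shapes, trimmed to the attributes this example reads.
type TaskItem = { taskId: string; employeeId: string; projectId: string };
type EmployeeItem = { employeeId: string; name: string };

// One array per entity alias, as in the collection response's `data` object.
type AssignmentsData = {
  task: TaskItem[];
  employee: EmployeeItem[];
};

// Summarize one employee's assignments from a collection response.
function summarizeAssignments(data: AssignmentsData): string {
  const [employee] = data.employee; // undefined when no employee record was returned
  const name = employee?.name ?? "unknown employee";
  const projects = new Set(data.task.map((task) => task.projectId));
  return `${name}: ${data.task.length} task(s) across ${projects.size} project(s)`;
}
```

Each entity's items arrive already separated, so no per-item type check is needed before reading entity-specific attributes.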

Sub-Collections

Sub-Collections are an extension of Collection functionality that allow you to model more advanced access patterns. Collections and Sub-Collections are defined on Indexes via a property called collection, as either a string or string array respectively.

Sub-Collections are only supported on “isolated” index types.

The following is an example of functionally identical collections, implemented as a string (referred to as a “collection”) and then as a string array (referred to as sub-collections):

As a string (collection)

{
  collection: "assignments",
  pk: {
    field: "pk",
    composite: ["employeeId"]
  },
  sk: {
    field: "sk",
    composite: ["projectId"]
  }
}

As a string array (sub-collections)

{
  collection: ["assignments"],
  pk: {
    field: "pk",
    composite: ["employeeId"]
  },
  sk: {
    field: "sk",
    composite: ["projectId"]
  }
}

Both implementations above will create a “collections” method called assignments when added to a Service.

const results = await TaskApp.collections
  .assignments({ employeeId: "JExotic" })
  .go();

The advantage to using a string array to define collections is the ability to express sub-collections. Below is an example of three entities using sub-collections, followed by an explanation of their sub-collection definitions:

import { Entity } from "electrodb";

const employees = new Entity({
  model: {
    entity: "employees",
    version: "1",
    service: "taskapp",
  },
  attributes: {
    employeeId: {
      type: "string",
    },
    organizationId: {
      type: "string",
    },
    name: {
      type: "string",
    },
    team: {
      type: ["jupiter", "mercury", "saturn"] as const,
    },
  },
  indexes: {
    staff: {
      pk: {
        field: "pk",
        composite: ["organizationId"],
      },
      sk: {
        field: "sk",
        composite: ["employeeId"],
      },
    },
    employee: {
      collection: "contributions",
      index: "gsi2",
      pk: {
        field: "gsi2pk",
        composite: ["employeeId"],
      },
      sk: {
        field: "gsi2sk",
        composite: [],
      },
    },
  },
});

import { Entity } from "electrodb";

const tasks = new Entity({
  model: {
    entity: "tasks",
    version: "1",
    service: "taskapp",
  },
  attributes: {
    taskId: {
      type: "string",
    },
    employeeId: {
      type: "string",
    },
    projectId: {
      type: "string",
    },
    title: {
      type: "string",
    },
    body: {
      type: "string",
    },
  },
  indexes: {
    project: {
      collection: "overview",
      pk: {
        field: "pk",
        composite: ["projectId"],
      },
      sk: {
        field: "sk",
        composite: ["taskId"],
      },
    },
    assigned: {
      collection: ["contributions", "assignments"] as const,
      index: "gsi2",
      pk: {
        field: "gsi2pk",
        composite: ["employeeId"],
      },
      sk: {
        field: "gsi2sk",
        composite: ["projectId"],
      },
    },
  },
});

import { Entity } from "electrodb";

const projectMembers = new Entity({
  model: {
    entity: "projectMembers",
    version: "1",
    service: "taskapp",
  },
  attributes: {
    employeeId: {
      type: "string",
    },
    projectId: {
      type: "string",
    },
    name: {
      type: "string",
    },
  },
  indexes: {
    members: {
      collection: "overview",
      pk: {
        field: "pk",
        composite: ["projectId"],
      },
      sk: {
        field: "sk",
        composite: ["employeeId"],
      },
    },
    projects: {
      collection: ["contributions", "assignments"] as const,
      index: "gsi2",
      pk: {
        field: "gsi2pk",
        composite: ["employeeId"],
      },
      sk: {
        field: "gsi2sk",
        composite: [],
      },
    },
  },
});

import DynamoDB from "aws-sdk/clients/dynamodb";
import { Service } from "electrodb";

const table = "projectmanagement";
const client = new DynamoDB.DocumentClient();

const TaskApp = new Service(
  {
    employees,
    tasks,
    projectMembers,
  },
  { client, table },
);

TypeScript Note: Use as const syntax when defining collection as a string array for improved type support.

The last code block above creates a Service called TaskApp using the Entity instances created above its declaration. By creating a Service, ElectroDB will identify and validate the sub-collections defined across all three models. The result in this case is three unique collections: “overview”, “contributions”, and “assignments”.
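To see how those three collections fall out of the index definitions, here is a small sketch that maps each collection name to the entities declaring it. The `IndexDef` and `EntityDef` types and the `collectionMembers` function are hypothetical simplifications for illustration; ElectroDB's own validation does considerably more than this.

```typescript
// Simplified stand-ins for the parts of an entity definition that matter here.
type IndexDef = { collection?: string | readonly string[] };
type EntityDef = { name: string; indexes: Record<string, IndexDef> };

// Map each collection name to the entities that declare it on any index.
function collectionMembers(entities: EntityDef[]): Map<string, string[]> {
  const members = new Map<string, string[]>();
  for (const entity of entities) {
    for (const index of Object.values(entity.indexes)) {
      const collection = index.collection;
      if (collection === undefined) continue; // index is not part of a collection
      const names = typeof collection === "string" ? [collection] : collection;
      for (const name of names) {
        members.set(name, [...(members.get(name) ?? []), entity.name]);
      }
    }
  }
  return members;
}

const members = collectionMembers([
  { name: "employees", indexes: { staff: {}, employee: { collection: "contributions" } } },
  {
    name: "tasks",
    indexes: {
      project: { collection: "overview" },
      assigned: { collection: ["contributions", "assignments"] },
    },
  },
  {
    name: "projectMembers",
    indexes: {
      members: { collection: "overview" },
      projects: { collection: ["contributions", "assignments"] },
    },
  },
]);
console.log(members);
```

The resulting map has three keys: overview (tasks, projectMembers), contributions (employees, tasks, projectMembers), and assignments (tasks, projectMembers).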

The simplest collection to understand is overview. This collection is defined on the table’s Primary Index, composed of a projectId in the Partition Key, and is currently implemented by two Entities: tasks and projectMembers. If another entity were to be added to our service, it could “join” this collection by implementing an identical Partition Key composite (projectId) and labeling itself as part of the overview collection. The following is an example of using the overview collection:

Simple Collections

Example

// overview
const results = await TaskApp.collections
  .overview({ projectId: "SD-204" })
  .go();

Response Format

{
  data: {
    tasks: [...],         // tasks associated with projectId "SD-204"
    projectMembers: [...] // employees of project "SD-204"
  },
  cursor: null,
}

Equivalent Parameters

{
  "KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
  "TableName": "projectmanagement",
  "ExpressionAttributeNames": { "#pk": "pk", "#sk1": "sk" },
  "ExpressionAttributeValues": {
    ":pk": "$taskapp#projectid_sd-204",
    ":sk1": "$overview"
  }
}

Complex Collections

Unlike overview, the collections contributions and assignments are more complex.

In the case of contributions, all three entities implement this collection on the gsi2 index, and compose their Partition Key with the employeeId attribute. The assignments collection, however, is only implemented by the tasks and projectMembers Entities. Below is an example of using these collections:

Collection values of collection: "contributions" and collection: ["contributions"] are interpreted by ElectroDB as being the same implementation.

Example

// contributions
const results = await TaskApp.collections
  .contributions({ employeeId: "JExotic" })
  .go();

Response Format

{
  data: {
    tasks: [...], // tasks assigned to employeeId "JExotic"
    projectMembers: [...], // projects with employeeId "JExotic"
    employees: [...] // employee record(s) with employeeId "JExotic"
  },
  cursor: null,
}

Equivalent Parameters

{
  "KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
  "TableName": "projectmanagement",
  "ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
  "ExpressionAttributeValues": {
    ":pk": "$taskapp#employeeid_jexotic",
    ":sk1": "$contributions"
  },
  "IndexName": "gsi2"
}

Complex Collections (Continued)

The assignments collection contains only the entities that list assignments as the second element of their collection array: tasks and projectMembers.

Example

const results = await TaskApp.collections
  .assignments({ employeeId: "JExotic" })
  .go();

Response Format

{
  data: {
    tasks: [...],          // tasks assigned to employeeId "JExotic"
    projectMembers: [...], // projects with employeeId "JExotic"
  },
  cursor: null,
}

Equivalent Parameters

{
  "KeyConditionExpression": "#pk = :pk and begins_with(#sk1, :sk1)",
  "TableName": "projectmanagement",
  "ExpressionAttributeNames": { "#pk": "gsi2pk", "#sk1": "gsi2sk" },
  "ExpressionAttributeValues": {
    ":pk": "$taskapp#employeeid_jexotic",
    ":sk1": "$contributions#assignments"
  },
  "IndexName": "gsi2"
}

Looking above we can see that the assignments collection is actually a subset of the results that could be queried with the contributions collection. The power behind having the assignments sub-collection is the flexibility to further slice and dice your cross-entity queries into more specific and performant queries.
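The subset relationship follows directly from the begins_with conditions in the two sets of parameters. The sketch below simulates it with plain string matching; the three sort key values are assumed for illustration and may not match the exact keys ElectroDB writes.

```typescript
// Assumed sort key values for three items in one employee's gsi2 partition.
// The exact key format is an illustration based on the prefixes shown above.
const sortKeys = [
  "$contributions#employees_1",
  "$contributions#assignments#tasks_1#projectid_sd-204",
  "$contributions#assignments#projectmembers_1",
];

// Stand-in for DynamoDB's begins_with key condition.
const beginsWith = (prefix: string): string[] =>
  sortKeys.filter((sortKey) => sortKey.startsWith(prefix));

const contributions = beginsWith("$contributions");
const assignments = beginsWith("$contributions#assignments");

console.log(contributions.length); // 3
console.log(assignments.length); // 2
console.log(assignments.every((sortKey) => contributions.includes(sortKey))); // true
```

The longer prefix matches fewer items, so the narrower sub-collection query matches a subset of the broader one.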

Index and Collection Naming Conventions

ElectroDB puts an emphasis on allowing users to define more domain specific naming. Instead of referring to indexes by their name on the table, ElectroDB allows users to define their indexes as Access Patterns.

Please refer to the Entities defined in the section Sub-Collections as the source of examples within this section.

Index Naming Conventions

The following is an access pattern on the “employees” entity defined in the Sub-Collections section:

staff: {
  pk: {
    field: "pk",
    composite: ["organizationId"]
  },
  sk: {
    field: "sk",
    composite: ["employeeId"]
  }
}

This Access Pattern is defined on the table’s Primary Index (note the lack of an index property), is given the name staff, and is composed of an organizationId and an employeeId.

When deciding on an Access Pattern name, ask yourself, “What would the array of items returned represent if I only supplied the Partition Key?” In this example, the entity defines an “Employee” by its organizationId and employeeId. If you performed a query against this index and only provided organizationId, you would expect to receive all Employees for that Organization. From there, the name staff was chosen because the focus becomes “What are these Employees to that Organization?”

This convention also becomes evident when you consider that the Access Pattern name becomes the name of the method you use to query that index.

await employees.query.staff({ organizationId: "nike" }).go();

Collection Naming Conventions

The following are access patterns on the entities defined in the Sub-Collections section:

// employees entity
employee: {
  collection: "contributions",
  index: "gsi2",
  pk: {
    field: "gsi2pk",
    composite: ["employeeId"],
  },
  sk: {
    field: "gsi2sk",
    composite: [],
  },
}

// tasks entity
assigned: {
  collection: ["contributions", "assignments"] as const,
  index: "gsi2",
  pk: {
    field: "gsi2pk",
    composite: ["employeeId"],
  },
  sk: {
    field: "gsi2sk",
    composite: ["projectId"],
  },
}

// projectMembers entity
projects: {
  collection: ["contributions", "assignments"] as const,
  index: "gsi2",
  pk: {
    field: "gsi2pk",
    composite: ["employeeId"],
  },
  sk: {
    field: "gsi2sk",
    composite: [],
  },
}

In the case of the entities above, we see an example of a sub-collection. ElectroDB will use the above definitions to generate two collections: contributions and assignments.

The considerations for naming a collection are nearly identical to the considerations for naming an index: What do the query results from supplying just the Partition Key represent? In the case of collections you must also consider what the results represent across all the involved entities, and the entities that may be added in the future.

For example, the contributions collection is so named because, when given an employeeId, we receive the employee’s details, the tasks assigned to that employee, and the projects where they are currently a member.

In the case of assignments, we receive a subset of contributions when supplying an employeeId: Only the tasks and projects they are “assigned” are returned.