luizmachado.dev

Session 031 — DynamoDB: GSIs and LSIs, hot partitions and write amplification

Estimated duration: 60 minutes
Prerequisites: session-030-dynamodb-single-table-adjacency


Objective

By the end, you will be able to design a GSI that solves a secondary access pattern without creating
a hot partition, calculate the write amplification cost when writing to a table with multiple GSIs,
and decide when an LSI is preferable to a GSI (trade-off of flexibility vs cost per item).


Context

[FACT] Secondary indexes are the primary tool for supporting access patterns that cannot be served
by the table's primary key. Without them, the only alternative would be Scan — a full table scan
that consumes throughput proportional to the total data size, regardless of how many items are
relevant to the query. A well-designed GSI transforms an O(n) query into O(log n) with cost
proportional only to the result.

[FACT] The cost of secondary indexes is real and measurable: each GSI replicates data (storage) and
consumes additional WCUs on each write to the base table (write amplification). Understanding these
costs is essential for designing schemas that scale without billing surprises or throttling.


Key concepts

1. GSI — anatomy and consistency model

[FACT] A Global Secondary Index (GSI) is an index with PK and SK that can be completely
different from those of the base table. It is "global" because it can index items from any partition
of the base table — a query on the GSI spans all partitions.

Base table: Orders
  PK = order_id (String)
  SK = customer_id (String)

GSI: OrdersByStatusDate
  PK = status (String)   ← different from the base table
  SK = order_date (String)

Fundamental GSI properties:

1. PK and SK are independent of the base table (any top-level String/Number/Binary attribute)
2. EVENTUAL consistency only (the GSI is updated asynchronously after a write to the base table)
3. No GetItem: only Query and Scan are supported on GSIs
4. Default limit of 20 GSIs per table (can be raised via quota request)
5. Provisioned throughput is SEPARATE from the base table (in provisioned mode)
6. The GSI inherits the table class of the base table
7. The GSI key does not need to be unique: multiple items can share the same PK+SK in the GSI
8. If an item lacks a GSI key attribute, it is NOT projected into the index (sparse index)

[FACT] The eventual consistency of the GSI has an important practical implication: right after a
write to the base table, a query on the GSI may not return the updated item. In scenarios where
immediate read-after-write is critical, the code should read from the base table (which supports
ConsistentRead=True), not from the GSI.
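
The fallback described above could look like the following minimal sketch. It assumes the Orders key schema from the example (order_id + customer_id); the function name is hypothetical, and `table` is expected to be a boto3 DynamoDB Table resource supplied by the caller.

```python
from typing import Any, Optional


def get_order_strongly_consistent(
    table: Any, order_id: str, customer_id: str
) -> Optional[dict]:
    """Read one order from the BASE TABLE with a strongly consistent GetItem.

    `table` is assumed to be a boto3 Table resource (or any object exposing a
    compatible get_item method). The GSI is deliberately not used here.
    """
    response = table.get_item(
        Key={"order_id": order_id, "customer_id": customer_id},
        ConsistentRead=True,  # honoured by the base table, rejected on a GSI
    )
    # "Item" is absent from the response when the key does not exist
    return response.get("Item")
```

The GSI remains the right tool for listing by status and date; this path is only for the read that must observe a write that just happened.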

Sparse index — when attribute absence is a feature:

[FACT] If the GSI key attribute does not exist on the item, the item is not projected into the
GSI. This allows creating indexes that cover only a subset of the table:

Table: Tasks
  PK = task_id
  SK = task_id
  status (optional attribute, present only while status = "OPEN")

GSI: OpenTasksIndex
  PK = status

Items with status="OPEN" → projected into the GSI
Completed items (no status attribute) → NOT projected into the GSI

Result: the GSI contains ONLY open tasks, with no storage cost for completed ones.
Query "list open tasks" → Query on the GSI: small and efficient.

This pattern is called a sparse index and is one of the most valuable in DynamoDB: the GSI only
stores items with a given state, reducing storage and read throughput.
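
To make the projection rule concrete, here is a small in-memory simulation (not a DynamoDB call). The helper name `project_sparse` and the sample tasks are illustrative assumptions.

```python
def project_sparse(items: list[dict], gsi_pk: str) -> list[dict]:
    """Return only the items that would appear in a GSI keyed on `gsi_pk`.

    Mirrors the sparse-index rule: an item without the key attribute is
    simply absent from the index.
    """
    return [item for item in items if gsi_pk in item]


tasks = [
    {"task_id": "t1", "status": "OPEN"},
    {"task_id": "t2"},                    # completed: status attribute removed
    {"task_id": "t3", "status": "OPEN"},
]

open_index = project_sparse(tasks, "status")
print([t["task_id"] for t in open_index])  # ['t1', 't3']
```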

2. Write amplification — calculating the real cost of multiple GSIs

[FACT] Each write to the base table (PutItem, UpdateItem, DeleteItem) can trigger additional writes
to each GSI. The exact number of writes per GSI depends on what changed:

Scenario                                              Additional writes to the GSI
──────────────────────────────────────────────────────────────────────────────────────────────
New item that defines the GSI key attribute           +1 write (insert into the GSI)
Update that changes the VALUE of a GSI key attribute  +2 writes (delete old + insert new)
Update that deletes a GSI key attribute               +1 write (delete from the GSI)
Item did not have the attribute and still does not    +0 writes
Update of a projected (non-key) attribute             +1 write (update the projection in the GSI)

Write amplification calculation with N GSIs:

Table with 3 GSIs, each item has all 3 GSI key attributes defined:

PutItem → 1 write (base table) + 3 writes (1 per GSI) = 4 total writes
UpdateItem changing all 3 GSI key attributes → 1 write + 3×2 writes = 7 total writes
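
The scenario table can be turned into a small counting function. This is a sketch that counts write operations only (not WCUs); the function names are hypothetical, and it assumes that an update touching neither the index key nor a projected attribute causes no index write.

```python
from typing import Optional


def gsi_writes(
    old: Optional[dict], new: Optional[dict],
    key_attrs: list[str], projected: list[str],
) -> int:
    """Writes to ONE GSI caused by a base-table write (old item → new item)."""
    def key_of(item: Optional[dict]):
        # An item is in the index only if it has every index key attribute
        if item is None or any(a not in item for a in key_attrs):
            return None
        return tuple(item[a] for a in key_attrs)

    old_key, new_key = key_of(old), key_of(new)
    if old_key is None and new_key is None:
        return 0                      # never in the index
    if old_key is None or new_key is None:
        return 1                      # insert into, or delete from, the index
    if old_key != new_key:
        return 2                      # delete old entry + insert new entry
    changed = any(old.get(a) != new.get(a) for a in projected)
    return 1 if changed else 0        # refresh projection, or nothing to do


def total_writes(old: Optional[dict], new: Optional[dict], gsis: dict) -> int:
    """1 base-table write plus the writes to every GSI."""
    return 1 + sum(gsi_writes(old, new, k, p) for k, p in gsis.values())


# index name -> (key attributes, projected non-key attributes)
gsis = {"gsi1": (["k1"], []), "gsi2": (["k2"], []), "gsi3": (["k3"], [])}
before = {"id": "x", "k1": "a", "k2": "b", "k3": "c"}
after = {"id": "x", "k1": "A", "k2": "B", "k3": "C"}

print(total_writes(None, before, gsis))    # 4 (PutItem of a new item)
print(total_writes(before, after, gsis))   # 7 (all three GSI keys change)
```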

[FACT] The WCU cost of each write to the GSI is calculated by the size of the item projected in
the GSI (rounded up to the next KB), not by the item size in the base table. This means:
KEYS_ONLY projection minimizes the WCU per GSI write; ALL projection maximizes it.

Numerical example:

On-demand table, 3 KB item, 3 GSIs:
  - GSI1: KEYS_ONLY projection (projected item = 200 bytes → 1 WCU)
  - GSI2: INCLUDE projection (projected item = 1.5 KB → 2 WCUs)
  - GSI3: ALL projection (projected item = 3 KB → 3 WCUs)

PutItem (new item):
  Base table: 3 KB → 3 WCUs
  GSI1: 200 bytes → 1 WCU
  GSI2: 1.5 KB → 2 WCUs
  GSI3: 3 KB → 3 WCUs
  ─────────────────────────
  Total: 9 WCUs per PutItem  (3x the cost without GSIs)

[FACT] In on-demand mode, there is no explicitly provisioned WCU — DynamoDB scales
automatically. But the cost per write request unit still multiplies by the number of affected GSIs.
Check the cost at aws.amazon.com/dynamodb/pricing for the region and table class.
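
The arithmetic of the numerical example can be reproduced with a short calculator. This is a sketch: the function names are hypothetical, and it covers only the insert of a new item with standard (non-transactional) writes.

```python
import math


def write_units(size_bytes: int) -> int:
    """Write units for one write: size rounded up to the next 1 KB, minimum 1."""
    return max(1, math.ceil(size_bytes / 1024))


def put_item_cost(base_bytes: int, projected_bytes: dict[str, int]) -> dict[str, int]:
    """Per-target write units for a PutItem of a new item, plus the total."""
    cost = {"base": write_units(base_bytes)}
    for index_name, size in projected_bytes.items():
        cost[index_name] = write_units(size)   # sized by the PROJECTED item
    cost["total"] = sum(cost.values())
    return cost


cost = put_item_cost(
    base_bytes=3 * 1024,
    projected_bytes={"GSI1": 200, "GSI2": 1536, "GSI3": 3 * 1024},
)
print(cost)  # {'base': 3, 'GSI1': 1, 'GSI2': 2, 'GSI3': 3, 'total': 9}
```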

3. GSI throttling and back-pressure — the most dangerous cascading effect

[FACT] The back-pressure mechanism is the most counterintuitive aspect of GSIs: if a GSI does
not have enough capacity to process updates, DynamoDB throttles WRITES TO THE BASE TABLE, not
just queries on the GSI. This means an undersized GSI can bring down your entire table's write
capacity.

Base table: 1,000 provisioned WCU (enough for the workload)
GSI1: 100 provisioned WCU (undersized)

Workload: 500 writes/s to the table, all of them affecting GSI1

Result:
  Base table: OK (500 WCU < 1,000)
  GSI1: THROTTLED (500 WCU > 100)
  → Back-pressure: writes to the base table start being throttled as well
  → ProvisionedThroughputExceededException on base-table writes
     (ResourceArn points to GSI1, not to the table, which makes diagnosis confusing)
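
A capacity review for this scenario can be sketched as a simple comparison. The helper below is hypothetical and makes a simplifying assumption: every base-table write reaches every GSI and costs the same number of WCUs there.

```python
def undersized_gsis(write_rate_wcu: float, gsi_wcu: dict[str, int]) -> list[str]:
    """Names of GSIs whose provisioned WCU is below the write rate reaching them.

    Any index returned here is a back-pressure risk for the base table.
    """
    return sorted(name for name, wcu in gsi_wcu.items() if wcu < write_rate_wcu)


# The scenario above: 500 writes/s, GSI1 provisioned with only 100 WCU
print(undersized_gsis(500, {"GSI1": 100}))                # ['GSI1']
print(undersized_gsis(500, {"GSI1": 1000, "GSI2": 600}))  # []
```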

[FACT] There are four distinct types of GSI throttling:

1. IndexWriteProvisionedThroughputExceeded
   Cause: insufficient provisioned WCU on the GSI (provisioned mode)
   Fix: increase the GSI's WCU

2. IndexWriteMaxOnDemandThroughputExceeded
   Cause: the maximum throughput configured on the on-demand GSI was exceeded
   Fix: raise the configured max throughput or remove the limit

3. IndexWriteKeyRangeThroughputExceeded
   Cause: HOT PARTITION in the GSI: a single GSI partition exceeds the per-partition
          limit (~1,000 WCU/s), even when the total WCU is sufficient
   Fix: redesign the GSI PK for better distribution (write sharding)

4. IndexWriteAccountLimitExceeded
   Cause: the table exceeded the account's regional throughput limit
   Fix: request a quota increase via Service Quotas
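
For runbooks or alert handlers, the list above can be kept as a lookup table. This sketch only restates the four reasons and fixes from this section; how the reason string is extracted from an exception or a metric is left to the caller and is not shown.

```python
GSI_THROTTLE_FIXES = {
    "IndexWriteProvisionedThroughputExceeded":
        "Increase the provisioned WCU of the GSI",
    "IndexWriteMaxOnDemandThroughputExceeded":
        "Raise or remove the max on-demand throughput configured on the GSI",
    "IndexWriteKeyRangeThroughputExceeded":
        "Hot partition: redesign the GSI PK (write sharding)",
    "IndexWriteAccountLimitExceeded":
        "Request a quota increase via Service Quotas",
}


def suggest_fix(reason: str) -> str:
    """Map a GSI throttling reason to the remediation listed in this section."""
    return GSI_THROTTLE_FIXES.get(reason, "Unknown reason: inspect the table and its indexes")


print(suggest_fix("IndexWriteKeyRangeThroughputExceeded"))
```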

4. Hot partition in GSI — diagnosis and solution with write sharding

[FACT] A hot partition in a GSI happens when the GSI PK has low cardinality (few distinct
values), concentrating all writes into a few physical partitions of the index. The classic case
is using a status attribute as the GSI PK:

GSI: OrdersByStatus
  PK = status   → values: "OPEN", "PROCESSING", "CLOSED"

If 80% of orders have status="PROCESSING":
  80% of the writes to the GSI go to the "PROCESSING" partition
  That partition can receive 4,000 writes/s → exceeds the ~1,000 WCU/s per-partition limit
  → ThrottlingException with IndexWriteKeyRangeThroughputExceeded
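
A quick way to spot this before it happens is to estimate the write rate per GSI key value. The sketch below is hypothetical; it assumes 1 WCU per index write and a per-partition write limit of 1,000 WCU/s, passed as a parameter so it can be adjusted.

```python
def hot_keys(
    total_writes_per_s: int,
    share_pct_by_key: dict[str, int],
    partition_limit: int = 1000,
) -> dict[str, float]:
    """Return the GSI key values whose estimated write rate exceeds the limit."""
    rates = {
        key: total_writes_per_s * pct / 100
        for key, pct in share_pct_by_key.items()
    }
    return {key: rate for key, rate in rates.items() if rate > partition_limit}


# 5,000 writes/s overall, 80% of them with status="PROCESSING"
print(hot_keys(5000, {"OPEN": 10, "PROCESSING": 80, "CLOSED": 10}))
# {'PROCESSING': 4000.0}
```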

Solution: write sharding on the GSI key

The standard technique is to add a random suffix (shard) to the GSI PK value:

Without sharding:     GSI PK = "PROCESSING"
With 10 shards:       GSI PK = "PROCESSING#" + str(random.randint(0, 9))
                      → values: "PROCESSING#0" to "PROCESSING#9"

When writing, you distribute randomly across shards. When reading, you make N parallel queries
(one per shard) and consolidate the results on the client:

import random
from concurrent.futures import ThreadPoolExecutor

import boto3
from boto3.dynamodb.conditions import Key

NUM_SHARDS = 10
table = boto3.resource("dynamodb").Table("orders")

# Write: pick a random shard
shard = random.randint(0, NUM_SHARDS - 1)
table.put_item(Item={
    "order_id": "ord123",
    "status_shard": f"PROCESSING#{shard}",  # GSI PK with shard suffix
    "order_date": "2024-04-01",
    # ... other attributes
})

# Read: query every shard in parallel
def query_shard(shard_num: int) -> list:
    """Return all items of one shard, following pagination (Query returns up to 1 MB per page)."""
    params = {
        "IndexName": "OrdersByStatusShard",
        "KeyConditionExpression": (
            Key("status_shard").eq(f"PROCESSING#{shard_num}") &
            Key("order_date").begins_with("2024-04")
        ),
    }
    items = []
    while True:
        response = table.query(**params)
        items.extend(response.get("Items", []))
        if "LastEvaluatedKey" not in response:
            return items
        params["ExclusiveStartKey"] = response["LastEvaluatedKey"]

# Parallelism via ThreadPoolExecutor (boto3 is not async-native).
# Note: boto3 documents resource objects as not thread-safe; in production,
# create one resource per thread or use a low-level client instead.
with ThreadPoolExecutor(max_workers=NUM_SHARDS) as executor:
    futures = [executor.submit(query_shard, i) for i in range(NUM_SHARDS)]
    all_items = []
    for future in futures:
        all_items.extend(future.result())

[FACT] The number of shards must be calibrated: too few shards won't relieve the hot partition; too
many shards raise read cost and tail latency, since every read fans out into one query per shard.
A rule of thumb: num_shards = ceil(peak_writes_per_second / 800), which keeps each shard about 20%
below the ~1,000 WCU/s per-partition limit and so avoids sporadic throttling.
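
The rule of thumb is a one-line calculation. A minimal sketch, assuming each write costs 1 WCU on the index (the function name is hypothetical):

```python
import math


def shards_needed(peak_writes_per_s: float, target_per_shard: float = 800) -> int:
    """num_shards = ceil(peak / target), with at least one shard."""
    return max(1, math.ceil(peak_writes_per_s / target_per_shard))


print(shards_needed(4000))  # 5
print(shards_needed(500))   # 1
```

If items are larger than 1 KB in the index, multiply the peak by the WCUs per write before applying the formula.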

5. LSI — Local Secondary Index: alternative for an alternate sort key

[FACT] A Local Secondary Index (LSI) keeps the same PK as the base table, but with a different
SK. It is "local" because each LSI partition is co-located with the base table partition that has
the same PK value. This co-location is what lets a query on the LSI return strongly consistent
results when ConsistentRead=True is set (reads remain eventually consistent by default).

Base table: Thread
  PK = ForumName (String)
  SK = Subject (String)

LSI: LastPostIndex
  PK = ForumName (String)  ← SAME as the base table (mandatory)
  SK = LastPostDateTime (String)  ← different from the base table

LSI restrictions:

1. Can ONLY be created when the table is created (CreateTable)
   It cannot be added later, unlike a GSI
2. PK must be identical to the base table's PK
3. SK must be a single scalar attribute (String, Number, Binary)
4. Maximum of 5 LSIs per table
5. Item collection limit: 10 GB per PK value (table + all LSIs)
   → If a partition key accumulates > 10 GB across table and LSIs: ItemCollectionSizeLimitExceededException

[FACT] The 10 GB per item collection limit is the most critical LSI restriction and the main reason
to prefer GSIs for entities that grow indefinitely. On the other hand, the LSI shares the base
table's provisioned throughput (no capacity is provisioned per index, although index updates still
consume WCUs from the table) and supports strongly consistent reads, which GSIs do not offer.
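
One way to watch the 10 GB limit is to inspect the ItemCollectionMetrics that a write returns when ReturnItemCollectionMetrics="SIZE" is requested. The helpers below are a sketch: the names and the 5 GB warning threshold are assumptions, and they operate on the response dictionary only.

```python
from typing import Optional


def collection_upper_bound_gb(response: dict) -> Optional[float]:
    """Upper bound of the item collection size estimate, in GB.

    Returns None when the response carries no ItemCollectionMetrics
    (for example, when the table has no LSI).
    """
    metrics = response.get("ItemCollectionMetrics")
    if not metrics:
        return None
    lower, upper = metrics["SizeEstimateRangeGB"]  # [lower, upper] estimate
    return float(upper)


def near_collection_limit(response: dict, warn_at_gb: float = 5.0) -> bool:
    """True when the estimated collection size reaches the warning threshold."""
    upper = collection_upper_bound_gb(response)
    return upper is not None and upper >= warn_at_gb


sample = {
    "ItemCollectionMetrics": {
        "ItemCollectionKey": {"ForumName": "EC2"},
        "SizeEstimateRangeGB": [6.0, 7.0],
    }
}
print(near_collection_limit(sample))  # True
```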

Fetching non-projected attributes in LSI:

[FACT] A unique property of LSIs (not available in GSIs): if a query on the LSI needs attributes
not projected in the index, DynamoDB performs an automatic fetch from the base table. This is
transparent to the code, but has a cost: each item that requires a fetch consumes additional RCUs —
calculated by the item size in the base table (rounded up to 4 KB), not by the item size in the
LSI. In GSIs, this is not possible — the GSI cannot access the base table.

# Query on the LSI requesting a non-projected attribute
# (assumes `table` refers to the Thread table and Key comes from boto3.dynamodb.conditions)
response = table.query(
    IndexName="LastPostIndex",
    ConsistentRead=True,   # supported on an LSI, not on a GSI
    KeyConditionExpression=(
        Key("ForumName").eq("EC2") &
        Key("LastPostDateTime").between("2024-01-01", "2024-12-31")
    ),
    ProjectionExpression="Subject, Replies, LastPostDateTime, Tags",
    # Tags is not projected into the index → DynamoDB fetches it from the base table
    # Cost: RCUs for the LSI + RCUs for the fetch (per item, rounded up to 4 KB)
)
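
The read cost of such a query can be estimated with a simplified model: the matching index entries are summed and rounded up to 4 KB, and each fetched base-table item is rounded up to 4 KB individually. The function name and the sample sizes are illustrative assumptions.

```python
import math


def lsi_query_rcus(
    index_entry_bytes: list[int],
    fetched_item_bytes: list[int],
    consistent: bool = True,
) -> float:
    """Estimated RCUs for an LSI query that fetches from the base table."""
    # Index entries: total size, rounded up to the next 4 KB
    index_units = math.ceil(sum(index_entry_bytes) / 4096)
    # Base-table fetches: each item rounded up to 4 KB on its own
    fetch_units = sum(math.ceil(size / 4096) for size in fetched_item_bytes)
    units = index_units + fetch_units
    # Eventually consistent reads cost half as much
    return float(units) if consistent else units / 2


# 4 matching entries of 300 bytes in the index, each backed by a 3,000-byte item
print(lsi_query_rcus([300] * 4, [3000] * 4, consistent=True))   # 5.0
print(lsi_query_rcus([300] * 4, [3000] * 4, consistent=False))  # 2.5
print(lsi_query_rcus([300] * 4, [], consistent=True))           # 1.0 (no fetch)
```

The gap between 1 and 5 RCUs for the same four results is the price of not projecting the attribute.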

6. GSI vs LSI — decision table

Criterion                           GSI                        LSI
──────────────────────────────────────────────────────────────────────────────────────────
Index PK                            Any scalar attribute       Same as the base table
Index SK                            Any scalar attribute       Any scalar attribute
When it can be created              At any time                Only at table creation
When it can be deleted              At any time                Only by deleting the table
Read consistency                    Eventual (only)            Eventual OR strong
Throughput                          Separate from the table    Shared with the table
Size limit per PK value             No limit                   10 GB (table + all LSIs)
Fetch of non-projected attributes   Not supported              Supported (at extra cost)
Sparse index                        Supported                  Supported
Maximum per table                   20 (can be raised)         5 (fixed)
Write and storage cost              Own WCUs + own storage     Table's WCUs + index storage

When to prefer LSI:
- Alternate sort key access within the same partition (e.g., sort posts by date instead of title, within the same forum)
- Need for strongly consistent reads on the index
- Item collection demonstrably small (< 5 GB) with no forecast of growing to 10 GB
- The table is still being designed (or was created with the LSI), since an LSI cannot be added later

When to prefer GSI:
- Access by an attribute from a different entity (e.g., look up orders by customer email)
- Item collection that can grow indefinitely
- Need to add the index after the table is already in production


Practical example

Scenario: e-commerce platform with an Orders table. Access patterns:
1. Look up order by ID (table PK)
2. List orders for a customer by date (GSI: customer_id + order_date)
3. List open orders by date (sparse GSI: status + order_date, only OPEN items)
4. List orders by sales rep in the last month (GSI: rep_id + order_date)

CDK Python — table with 3 GSIs

from aws_cdk import (
    Stack,
    aws_dynamodb as dynamodb,
    RemovalPolicy,
)
from constructs import Construct

class OrdersTableStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs):
        super().__init__(scope, id, **kwargs)

        table = dynamodb.Table(
            self, "OrdersTable",
            table_name="orders",
            partition_key=dynamodb.Attribute(
                name="order_id",
                type=dynamodb.AttributeType.STRING,
            ),
            billing_mode=dynamodb.BillingMode.PAY_PER_REQUEST,
            removal_policy=RemovalPolicy.DESTROY,
        )

        # GSI 1: orders by customer (AP2)
        # PK = customer_id, SK = order_date → good distribution (many customers)
        table.add_global_secondary_index(
            index_name="OrdersByCustomerDate",
            partition_key=dynamodb.Attribute(
                name="customer_id",
                type=dynamodb.AttributeType.STRING,
            ),
            sort_key=dynamodb.Attribute(
                name="order_date",
                type=dynamodb.AttributeType.STRING,
            ),
            projection_type=dynamodb.ProjectionType.INCLUDE,
            non_key_attributes=["status", "total_amount", "items_count"],
        )

        # GSI 2: open orders (AP3), sparse index
        # open_status_shard exists only while the order is open; closed orders are not projected
        # NOTE: a plain "OPEN" PK would have low cardinality, so the key carries a shard suffix
        table.add_global_secondary_index(
            index_name="OpenOrdersByDate",
            partition_key=dynamodb.Attribute(
                name="open_status_shard",  # "OPEN#0" to "OPEN#9"
                type=dynamodb.AttributeType.STRING,
            ),
            sort_key=dynamodb.Attribute(
                name="order_date",
                type=dynamodb.AttributeType.STRING,
            ),
            projection_type=dynamodb.ProjectionType.INCLUDE,
            non_key_attributes=["customer_id", "total_amount", "rep_id"],
        )

        # GSI 3: orders by sales rep (AP4)
        table.add_global_secondary_index(
            index_name="OrdersByRepDate",
            partition_key=dynamodb.Attribute(
                name="rep_id",
                type=dynamodb.AttributeType.STRING,
            ),
            sort_key=dynamodb.Attribute(
                name="order_date",
                type=dynamodb.AttributeType.STRING,
            ),
            projection_type=dynamodb.ProjectionType.INCLUDE,
            non_key_attributes=["customer_id", "status", "total_amount"],
        )

boto3 — write with sharding on sparse GSI and amplification calculation

import boto3
import random
from boto3.dynamodb.conditions import Key
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("orders")

NUM_SHARDS = 10  # calibrate against peak writes/s


def create_order(order_id: str, customer_id: str, rep_id: str, total: float) -> None:
    """
    Write amplification for this PutItem:
      - Base table: 1 write
      - GSI OrdersByCustomerDate: +1 write (customer_id and order_date defined)
      - GSI OpenOrdersByDate: +1 write (open_status_shard defined → open order)
      - GSI OrdersByRepDate: +1 write (rep_id and order_date defined)
    Total: 4 writes (4x amplification)
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    shard = random.randint(0, NUM_SHARDS - 1)

    table.put_item(Item={
        "order_id": order_id,
        "customer_id": customer_id,
        "rep_id": rep_id,
        "order_date": now,
        "status": "OPEN",
        "open_status_shard": f"OPEN#{shard}",  # sparse GSI key with shard suffix
        "total_amount": str(total),
        "items_count": 0,
    })


def close_order(order_id: str) -> None:
    """
    Update amplification when closing an order:
      - Base table: 1 write
      - GSI OpenOrdersByDate: +1 write to DELETE the entry
          (REMOVE open_status_shard on the base item takes it out of the sparse index)
      - GSIs OrdersByCustomerDate and OrdersByRepDate: +1 write each
          (their keys do not change, but status is a projected attribute in both)
    Total: 4 writes

    Note: once open_status_shard is removed, the item no longer exists in the sparse GSI.
    """
    table.update_item(
        Key={"order_id": order_id},
        UpdateExpression="SET #s = :closed REMOVE open_status_shard",
        ExpressionAttributeNames={"#s": "status"},
        ExpressionAttributeValues={":closed": "CLOSED"},
    )


# AP3: list open orders, querying the shards in parallel
def list_open_orders(date_prefix: str = "2024-04") -> list[dict]:
    def query_shard(shard_num: int) -> list:
        params = {
            "IndexName": "OpenOrdersByDate",
            "KeyConditionExpression": (
                Key("open_status_shard").eq(f"OPEN#{shard_num}") &
                Key("order_date").begins_with(date_prefix)
            ),
            "ScanIndexForward": False,  # newest first within each shard
        }
        items = []
        while True:  # follow pagination: Query returns up to 1 MB per page
            response = table.query(**params)
            items.extend(response.get("Items", []))
            if "LastEvaluatedKey" not in response:
                return items
            params["ExclusiveStartKey"] = response["LastEvaluatedKey"]

    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as executor:
        results = list(executor.map(query_shard, range(NUM_SHARDS)))

    # Consolidate and sort by date (each shard is sorted on its own; the global order is rebuilt here)
    all_items = [item for shard_items in results for item in shard_items]
    return sorted(all_items, key=lambda x: x["order_date"], reverse=True)
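
Because each shard already returns its items sorted, the final sort can be replaced by a k-way merge. The sketch below uses the standard library's heapq.merge as an alternative to sorted(); the function name is hypothetical, and it assumes every shard list is already in descending order (as with ScanIndexForward=False).

```python
import heapq


def merge_sorted_shards(
    shard_results: list[list[dict]],
    sort_attr: str = "order_date",
    descending: bool = True,
) -> list[dict]:
    """Merge per-shard lists that are already sorted on `sort_attr`."""
    return list(
        heapq.merge(
            *shard_results,
            key=lambda item: item[sort_attr],
            reverse=descending,  # inputs must be sorted in this same direction
        )
    )


shards = [
    [{"order_date": "2024-04-03"}, {"order_date": "2024-04-01"}],
    [{"order_date": "2024-04-02"}],
    [],
]
print([o["order_date"] for o in merge_sorted_shards(shards)])
# ['2024-04-03', '2024-04-02', '2024-04-01']
```

With a handful of shards the difference is negligible; the merge mostly pays off when the result is consumed lazily or truncated to the first N items.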

CLI — check throttling and item collection size

# Check the write throughput consumed by a GSI (CloudWatch)
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedWriteCapacityUnits \
  --dimensions Name=TableName,Value=orders Name=GlobalSecondaryIndexName,Value=OpenOrdersByDate \
  --start-time 2024-04-01T00:00:00Z \
  --end-time 2024-04-01T01:00:00Z \
  --period 60 \
  --statistics Sum

# Check throttling on a GSI
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name WriteThrottleEvents \
  --dimensions Name=TableName,Value=orders Name=GlobalSecondaryIndexName,Value=OpenOrdersByDate \
  --start-time 2024-04-01T00:00:00Z \
  --end-time 2024-04-01T01:00:00Z \
  --period 60 \
  --statistics Sum

# Monitor item collection size (for tables with LSIs)
# Use ReturnItemCollectionMetrics=SIZE on PutItem/UpdateItem
aws dynamodb put-item \
  --table-name forum-threads \
  --item '{"ForumName":{"S":"EC2"}, "Subject":{"S":"test"}, ...}' \
  --return-item-collection-metrics SIZE

# Describe the GSIs of a table
aws dynamodb describe-table \
  --table-name orders \
  --query 'Table.GlobalSecondaryIndexes[*].{Name:IndexName, Status:IndexStatus, WCU:ProvisionedThroughput.WriteCapacityUnits}'

Common pitfalls

Pitfall 1: Undersized GSI throttles the base table (silent back-pressure)

The most insidious pitfall: you provision 1,000 WCU on the base table and only 50 WCU on the GSI.
The workload increases to 200 writes/s, each one affecting the GSI. The GSI gets throttled
(200 > 50), and DynamoDB starts throttling writes to the base table to protect GSI consistency.
The ResourceArn in the exception points to the GSI, but the code that fails is writing to the
base table — the developer gets confused. The AWS-documented rule: the GSI's provisioned WCU should
be equal to or greater than the base table's WCU (because each write to the base table can
generate a write to the GSI). In on-demand mode, back-pressure still exists — the GSI has its own
configurable maximum throughput limit.

Pitfall 2: ALL projection on a GSI for a table with large items

ALL projection on the GSI means each base table item is fully replicated in the index. For a
table with 10 million items of 10 KB each, a GSI with ALL adds ~100 GB of storage and charges
every index write at the full item size (10 KB → 10 WCUs per GSI for each write to the base
table). The alternative is INCLUDE with only the attributes the query needs. If the code uses
ProjectionExpression on the GSI to fetch only 3 attributes, project those 3 attributes: it makes
no sense to use ALL and then select 3 fields.
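
The storage side of this trade-off can be estimated in a few lines. This is a rough sketch: it assumes 1 GB = 10^9 bytes and roughly 100 bytes of per-item index overhead, and the 300-byte INCLUDE projection is an illustrative figure.

```python
def gsi_storage_gb(item_count: int, projected_bytes: int, overhead_bytes: int = 100) -> float:
    """Approximate GSI storage: (projected size + per-item overhead) x item count."""
    return item_count * (projected_bytes + overhead_bytes) / 1e9


ITEMS = 10_000_000

# ALL projection: the whole 10 KB item is copied into the index
print(gsi_storage_gb(ITEMS, 10_000))  # 101.0

# INCLUDE projection: keys plus 3 small attributes, assumed ~300 bytes
print(gsi_storage_gb(ITEMS, 300))     # 4.0
```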

Pitfall 3: Trying to create an LSI after the table already exists in production

The LSI can only be created at CreateTable time. If you realize, 6 months after launch, that you
need an alternate sort key with strongly consistent reads on the same partition, it's not possible
to add an LSI — the only option is to create a new table, migrate the data, and update the code.
The available alternative is a GSI (which can be added at any time), but GSI does not support
strongly consistent reads. To avoid this situation, plan all necessary
LSIs before creating the table in production, even if the corresponding access patterns are not
yet critical. An LSI that goes unqueried still costs index storage plus extra write units drawn
from the table's capacity, but that is usually far cheaper than a table migration.


Reflection exercise

You are working with an Events table for a live events platform. The table has:
- PK = event_id
- Attributes: venue_id, start_time, status (UPCOMING/LIVE/ENDED), category, ticket_price

The access patterns are:
- AP1: look up event by ID → GetItem on the base table
- AP2: list all events for a venue by date → GSI
- AP3: list all LIVE events now (can have 5-500 simultaneous events, peak of 2,000 writes/s to the table) → sparse GSI with sharding?
- AP4: list events in a category by ascending price → GSI
- AP5: list events for a venue sorted by price → alternative to the default sort key from AP2

For AP3, calculate: with a peak of 2,000 writes/s, each write affecting the status GSI, how many
shards are needed to avoid a hot partition (assuming a limit of 800 WCU/s per partition as a safe
margin)? For AP5, discuss whether an LSI or a second GSI would be more appropriate, and what the
implications are of choosing the LSI given that the platform may have venues with thousands of
events over the years.


Resources for further study

1. Using Global Secondary Indexes in DynamoDB
URL: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html
What you'll find: complete GSI model (projections, eventual consistency, asynchronous
synchronization), exact WCU calculation per GSI operation (with scenario table), storage
considerations (100-byte overhead per item in the index).
Why it's the right source: primary documentation with the write cost scenario table by operation
type — exactly what's needed to calculate write amplification.

2. Understanding Global Secondary Index (GSI) write throttling and back pressure in DynamoDB
URL: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/gsi-throttling.html
What you'll find: the 4 types of GSI throttling with specific error codes
(IndexWriteProvisionedThroughputExceeded, IndexWriteKeyRangeThroughputExceeded, etc.) and the
back-pressure mechanism explained with the example of status as a GSI PK.
Why it's the right source: it's the only page in the official documentation that explains
back-pressure explicitly — the most counterintuitive concept of GSIs.

3. Local secondary indexes
URL: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html
What you'll find: creation-only-at-CreateTable limitation, 10 GB per item collection limit,
behavior of fetching non-projected attributes (with cost calculated by the full item in the base
table), ReturnItemCollectionMetrics to monitor collection size.
Why it's the right source: primary documentation with the exact cost calculation for fetching
non-projected attributes — one of the most overlooked LSI nuances.

4. General guidelines for secondary indexes in DynamoDB
URL: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-general.html
What you'll find: practical guidelines for projection (when to use KEYS_ONLY vs INCLUDE vs ALL),
recommendations for when to create indexes vs when to use FilterExpression, and when sparse indexes
are the correct solution.
Why it's the right source: it's the synthesis of indexing best practices — more practical than the
individual GSI/LSI reference pages.