
Records and iteration

First published by Atif Alam

This page builds on Data structures — especially list and dict — with patterns you use when each row is a small record (a dict) and you hold many rows in a list. Examples use a fictional service inventory so the shapes match ops-style scripts. Everything here is stdlib only.

List of dicts: create, read, update, delete

```python
services = [
    {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3},
    {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5},
    {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2},
]

# Create — append a dict
new_service = {"id": 4, "name": "geocoder", "status": "healthy", "replicas": 4}
services.append(new_service)
print("After create:", [s["name"] for s in services])

def find_by_id(data, service_id):
    return next((s for s in data if s["id"] == service_id), None)

# Read — one row or a subset
print("By id:", find_by_id(services, 2))
degraded = [s["name"] for s in services if s["status"] == "degraded"]
print("Degraded:", degraded)

# Update — mutate the dict in place (same object the list holds)
def update_status(data, service_id, new_status):
    row = find_by_id(data, service_id)
    if row:
        row["status"] = new_status
        return True
    return False

update_status(services, 2, "healthy")
print("After update:", find_by_id(services, 2))

# Delete — keep every row except the match (this rebinds services;
# assign to a new name instead if you also need the original list)
services = [s for s in services if s["id"] != 4]
print("After delete:", [s["name"] for s in services])

sorted_services = sorted(services, key=lambda s: s["replicas"], reverse=True)
print("By replicas:", [(s["name"], s["replicas"]) for s in sorted_services])
print("Total replicas:", sum(s["replicas"] for s in services))
```

find_by_id is a small, reusable pattern: scan until the first match and return that dict, or None if nothing matches. It is the same idea as “first row where …” in many APIs.
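The same scan generalizes to any condition, not just id equality. A minimal sketch (the find_first name and toy rows are illustrative, not from this page):

```python
def find_first(data, predicate):
    """Return the first dict for which predicate(row) is true, or None."""
    return next((row for row in data if predicate(row)), None)

services = [
    {"id": 1, "name": "auth-service", "status": "healthy"},
    {"id": 2, "name": "maps-tile-server", "status": "degraded"},
    {"id": 3, "name": "routing-engine", "status": "healthy"},
]

first_degraded = find_first(services, lambda s: s["status"] == "degraded")
print(first_degraded["name"])  # maps-tile-server
print(find_first(services, lambda s: s["id"] == 99))  # None
```

Passing a predicate instead of a hard-coded field keeps one helper usable for every “first row where …” question.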

filter and map return iterators. In day-to-day Python, list comprehensions and sum / max / min are often clearer; still, recognizing filter / map / reduce helps when reading older code or other languages.
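One practical consequence of the iterator behavior: a filter or map result can be consumed only once. A quick sketch with toy data:

```python
statuses = ["healthy", "degraded", "healthy"]

healthy_iter = filter(lambda s: s == "healthy", statuses)

print(list(healthy_iter))  # ['healthy', 'healthy']
print(list(healthy_iter))  # [] (the iterator is already exhausted)
```

Wrapping the result in list(...) up front, as the examples below do, sidesteps the surprise.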

```python
from functools import reduce

services = [
    {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3, "region": "us-west"},
    {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5, "region": "us-east"},
    {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2, "region": "us-west"},
    {"id": 4, "name": "geocoder", "status": "down", "replicas": 0, "region": "eu-west"},
    {"id": 5, "name": "search-index", "status": "healthy", "replicas": 4, "region": "us-east"},
]

healthy = list(filter(lambda s: s["status"] == "healthy", services))
healthy_lc = [s for s in services if s["status"] == "healthy"]
# Same members as filter() here; list comp is usually preferred for simple filters.
assert [s["id"] for s in healthy] == [s["id"] for s in healthy_lc]

us_healthy = list(
    filter(
        lambda s: s["status"] == "healthy" and s["region"].startswith("us"),
        services,
    )
)
print("US healthy:", [s["name"] for s in us_healthy])

names = list(map(lambda s: s["name"], services))
print("All names:", names)

def enrich_service(s):
    return {**s, "is_critical": s["replicas"] >= 4}

enriched = list(map(enrich_service, services))
print("Enriched sample:", enriched[1])

total_replicas = reduce(lambda acc, s: acc + s["replicas"], services, 0)
print("Total replicas (reduce):", total_replicas)

busiest = reduce(lambda a, b: a if a["replicas"] > b["replicas"] else b, services)
print("Busiest:", busiest["name"], busiest["replicas"])

id_to_name = reduce(lambda acc, s: {**acc, s["id"]: s["name"]}, services, {})
print("ID lookup:", id_to_name)

total_v2 = sum(
    s["replicas"]
    for s in services
    if s["status"] == "healthy" and "us" in s["region"]
)
print("Healthy US replicas (sum):", total_v2)
```

Pragmatic rule: prefer comprehensions and sum(...) / max(..., key=...) when they read naturally. Use reduce when you fold into a non-trivial accumulator (for example building a lookup dict with {**acc, k: v}).
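A fold into a multi-field accumulator is one case where reduce earns its keep: it computes several aggregates in a single pass. A hedged sketch with toy rows (not the inventory above):

```python
from functools import reduce

services = [
    {"name": "auth-service", "replicas": 3},
    {"name": "maps-tile-server", "replicas": 5},
    {"name": "routing-engine", "replicas": 2},
]

# Fold (count, total) through the list in one pass.
count, total = reduce(
    lambda acc, s: (acc[0] + 1, acc[1] + s["replicas"]),
    services,
    (0, 0),
)
print(f"{count} services, {total} replicas")  # 3 services, 10 replicas
```

For a single aggregate, len(services) and sum(...) stay clearer; the tuple accumulator only pays off when the aggregates really belong to one traversal.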

Grouping: itertools.groupby and a dict of lists


itertools.groupby only groups consecutive rows that share the same key. Sort by that key first, or you get repeated “groups” for the same key.
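A quick sketch of that pitfall on unsorted toy data (plain strings here, not the service inventory):

```python
from itertools import groupby

statuses = ["healthy", "degraded", "healthy"]

# Unsorted input: "healthy" appears as two separate groups.
unsorted_keys = [k for k, _ in groupby(statuses)]
print(unsorted_keys)  # ['healthy', 'degraded', 'healthy']

# Sorted first: one group per distinct key.
sorted_keys = [k for k, _ in groupby(sorted(statuses))]
print(sorted_keys)  # ['degraded', 'healthy']
```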

```python
from itertools import groupby
from operator import itemgetter

services = [
    {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3, "region": "us-west"},
    {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5, "region": "us-east"},
    {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2, "region": "us-west"},
    {"id": 4, "name": "geocoder", "status": "down", "replicas": 0, "region": "eu-west"},
    {"id": 5, "name": "search-index", "status": "healthy", "replicas": 4, "region": "us-east"},
    {"id": 6, "name": "cdn-edge", "status": "degraded", "replicas": 3, "region": "eu-west"},
]

sorted_by_status = sorted(services, key=itemgetter("status"))
for status, group in groupby(sorted_by_status, key=itemgetter("status")):
    members = list(group)
    print(status, [s["name"] for s in members])

def group_by(data, key):
    result = {}
    for item in data:
        k = item[key]
        result.setdefault(k, []).append(item)
    return result

for region, items in group_by(services, "region").items():
    total = sum(s["replicas"] for s in items)
    print(f"{region}: {len(items)} services, {total} replicas")
```

The group_by helper (dict of lists) does not require sorting and is easy to reuse. itemgetter("field") is a fast, readable key function for sorted and groupby.
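collections.defaultdict(list) is the common stdlib alternative to the setdefault call inside group_by; a minimal sketch with toy rows:

```python
from collections import defaultdict

services = [
    {"name": "auth-service", "region": "us-west"},
    {"name": "maps-tile-server", "region": "us-east"},
    {"name": "routing-engine", "region": "us-west"},
]

# Missing keys get a fresh empty list automatically.
by_region = defaultdict(list)
for s in services:
    by_region[s["region"]].append(s["name"])

print(dict(by_region))  # {'us-west': ['auth-service', 'routing-engine'], 'us-east': ['maps-tile-server']}
```

Either spelling gives a dict of lists; defaultdict just moves the "create the bucket if absent" step out of the loop body.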

The guide Incident records analyzer walks through a small list of incident dicts: filter, enrich, group, aggregate, and a short pipeline — stdlib only, with a full solution you can run.