Records and iteration
This page builds on Data structures — especially list and dict — with patterns you use when each row is a small record (a dict) and you hold many rows in a list.
Examples use a fictional service inventory so the shapes match ops-style scripts. Everything here is stdlib only.
List of Dicts: Create, Update, Delete
    services = [
        {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3},
        {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5},
        {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2},
    ]

Create: Append a Dict
    # Create: append a dict to the list
    new_service = {"id": 4, "name": "geocoder", "status": "healthy", "replicas": 4}
    services.append(new_service)
    print("After create:", [service["name"] for service in services])

Update: Mutate in Place
    # Update: mutate the dict in place (the same object the list holds)
    def find_by_id(data, service_id):
        return next((service for service in data if service["id"] == service_id), None)

    def update_status(data, service_id, new_status):
        row = find_by_id(data, service_id)
        # Compare against None so that an empty dict still counts as "found".
        if row is not None:
            row["status"] = new_status
            return True
        return False

    update_status(services, 2, "healthy")
    print("After update:", find_by_id(services, 2))

Delete: Build a New List
    # Delete: remove rows by keeping only those that do not match.
    # If you ran "Create" above, id=4 exists and this removes it.
    services = [service for service in services if service["id"] != 4]
    print("After delete:", [service["name"] for service in services])

List of Dicts: Search, Filter, Sort
    services = [
        {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3},
        {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5},
        {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2},
    ]

Search: First Dict by Id
    # Search: first dict with this id, or None
    def find_by_id(data, service_id):
        return next((service for service in data if service["id"] == service_id), None)

    print("By id:", find_by_id(services, 2))

Filter: Subset by Condition
    # Filter: names of the services whose status is "degraded"
    degraded = [service["name"] for service in services if service["status"] == "degraded"]
    print("Degraded:", degraded)

Sort: Highest Replicas First
    # Sort services by replicas.
    # key=... picks replicas; reverse=True makes highest come first.
    sorted_services = sorted(services, key=lambda service: service["replicas"], reverse=True)
    print("By replicas:", [(service["name"], service["replicas"]) for service in sorted_services])
    print("Total replicas:", sum(service["replicas"] for service in services))

find_by_id is a small reusable pattern: scan until match, return one dict or None. It is the same idea as “first row where …” in many APIs.
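The same scan-until-match idea extends to conditions other than id. The sketch below assumes a hypothetical helper named find_first (it is not part of the original examples) that takes any predicate and an optional default:

```python
def find_first(rows, predicate, default=None):
    """Return the first row for which predicate(row) is true, else default."""
    return next((row for row in rows if predicate(row)), default)


services = [
    {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3},
    {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5},
    {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2},
]

# First row matching an arbitrary condition.
first_degraded = find_first(services, lambda s: s["status"] == "degraded")
# No row is "down" in this data, so the default (None) comes back.
missing = find_first(services, lambda s: s["status"] == "down")

print(first_degraded["name"])  # maps-tile-server
print(missing)  # None
```

Because next() stops at the first match, the scan does not visit the remaining rows once a match is found.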
Grouping: itertools.groupby and a dict of lists
itertools.groupby only groups consecutive rows that share the same key. Sort by that key first, or you get repeated “groups” for the same key.
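A small illustration of that pitfall, using a plain list of status strings (made up for this sketch) instead of full records:

```python
from itertools import groupby

statuses = ["healthy", "degraded", "healthy", "healthy"]

# Without sorting, "healthy" shows up as two separate groups.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(statuses)]
# After sorting, every key appears exactly once.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(statuses))]

print(unsorted_groups)  # [('healthy', 1), ('degraded', 1), ('healthy', 2)]
print(sorted_groups)    # [('degraded', 1), ('healthy', 3)]
```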
    services = [
        {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3, "region": "us-west"},
        {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5, "region": "us-east"},
        {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2, "region": "us-west"},
        {"id": 4, "name": "geocoder", "status": "down", "replicas": 0, "region": "eu-west"},
        {"id": 5, "name": "search-index", "status": "healthy", "replicas": 4, "region": "us-east"},
        {"id": 6, "name": "cdn-edge", "status": "degraded", "replicas": 3, "region": "eu-west"},
    ]

Group by Region (Dict of Lists Helper)
group_by(data, key) is a reusable helper that builds a dict of lists, where each dict key is a field value (for example, a region).
Unlike itertools.groupby, this helper does not require sorting first.
    def group_by(data, key):
        result = {}
        for item in data:
            # Read the value for the supplied key (e.g., "region") from this item.
            group_key = item[key]
            # Initialize this group's list if missing, then append this item.
            result.setdefault(group_key, []).append(item)
        return result

    # Group by region and print service count and total replicas per region.
    for region, items in group_by(services, "region").items():
        total = sum(s["replicas"] for s in items)
        print(f"{region}: {len(items)} services, {total} replicas")

Group by Status (itertools.groupby)
itemgetter("status") is a short way to say “use each row’s status value” (same idea as lambda row: row["status"]).
groupby(...) groups consecutive rows with the same key, so we sort by status first and then group by that same key.
This status example uses groupby on purpose to show the stdlib alternative; you could also group status with the same dict-of-lists helper used for region.
This pattern is useful for stream-like processing: after sorting, you can process one group at a time instead of building all groups in memory first.
    from itertools import groupby
    from operator import itemgetter

    # Sort by status first so that rows with the same status are adjacent.
    sorted_by_status = sorted(services, key=itemgetter("status"))
    for status, group in groupby(sorted_by_status, key=itemgetter("status")):
        members = list(group)
        print(status, [s["name"] for s in members])

The group_by helper (dict of lists) does not require sorting and is easy to reuse. itemgetter("field") is a fast, readable key function for sorted and groupby.
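itemgetter also accepts several field names and then returns a tuple, which gives a multi-key sort without writing a lambda. A minimal sketch with a trimmed-down copy of the inventory (the reduced field set is just for brevity):

```python
from operator import itemgetter

services = [
    {"name": "routing-engine", "region": "us-west", "replicas": 2},
    {"name": "cdn-edge", "region": "eu-west", "replicas": 3},
    {"name": "auth-service", "region": "us-west", "replicas": 3},
    {"name": "search-index", "region": "us-east", "replicas": 4},
]

# itemgetter("region", "name") returns (row["region"], row["name"]),
# so rows sort by region first and by name within each region.
by_region_then_name = sorted(services, key=itemgetter("region", "name"))
ordered = [(s["region"], s["name"]) for s in by_region_then_name]
print(ordered)
```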
filter, map, and reduce
filter and map return iterators. In day-to-day Python, list comprehensions and sum / max / min are often clearer; still, recognizing filter / map / reduce helps when reading older code or other languages.
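One practical consequence of getting an iterator back: it can be consumed only once. A short sketch with made-up replica counts:

```python
replicas = [3, 5, 2]

# map() returns a lazy iterator; nothing is computed until it is consumed.
doubled = map(lambda n: n * 2, replicas)

first_pass = list(doubled)   # consumes the iterator
second_pass = list(doubled)  # the iterator is already exhausted

print(first_pass)   # [6, 10, 4]
print(second_pass)  # []
```

Wrapping the call in list(...) right away, as the examples here do, avoids this surprise when the result is needed more than once.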
    from functools import reduce

    services = [
        {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3, "region": "us-west"},
        {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5, "region": "us-east"},
        {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2, "region": "us-west"},
        {"id": 4, "name": "geocoder", "status": "down", "replicas": 0, "region": "eu-west"},
        {"id": 5, "name": "search-index", "status": "healthy", "replicas": 4, "region": "us-east"},
    ]

    # filter: keep rows that match a condition.
    healthy = list(filter(lambda s: s["status"] == "healthy", services))
    healthy_lc = [s for s in services if s["status"] == "healthy"]
    # Same members as filter() here; list comp is usually preferred for simple filters.
    assert [s["id"] for s in healthy] == [s["id"] for s in healthy_lc]

    us_healthy = list(
        filter(
            lambda s: s["status"] == "healthy" and s["region"].startswith("us"),
            services,
        )
    )
    print("US healthy:", [s["name"] for s in us_healthy])

    # map: transform every row.
    names = list(map(lambda s: s["name"], services))
    print("All names:", names)

    def enrich_service(s):
        # Return a new dict; the original row is left unchanged.
        return {**s, "is_critical": s["replicas"] >= 4}

    enriched = list(map(enrich_service, services))
    print("Enriched sample:", enriched[1])

    # reduce: fold all rows into a single value.
    total_replicas = reduce(lambda acc, s: acc + s["replicas"], services, 0)
    print("Total replicas (reduce):", total_replicas)

    busiest = reduce(lambda a, b: a if a["replicas"] > b["replicas"] else b, services)
    print("Busiest:", busiest["name"], busiest["replicas"])

    id_to_name = reduce(lambda acc, s: {**acc, s["id"]: s["name"]}, services, {})
    print("ID lookup:", id_to_name)

    # Same region test as us_healthy above (startswith), so both select the same rows.
    total_v2 = sum(
        s["replicas"]
        for s in services
        if s["status"] == "healthy" and s["region"].startswith("us")
    )
    print("Healthy US replicas (sum):", total_v2)

Pragmatic rule: prefer comprehensions and sum(...) / max(..., key=...) when they read naturally. Use reduce when you fold into a non-trivial accumulator (for example building a lookup dict with {**acc, k: v}).
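For comparison, here is a sketch of reduce-style folds (a total, a maximum, an id lookup) written with sum, max, and a dict comprehension. The trimmed-down rows are only for brevity, and which form reads better is a judgment call:

```python
services = [
    {"id": 1, "name": "auth-service", "replicas": 3},
    {"id": 2, "name": "maps-tile-server", "replicas": 5},
    {"id": 3, "name": "routing-engine", "replicas": 2},
    {"id": 4, "name": "geocoder", "replicas": 0},
    {"id": 5, "name": "search-index", "replicas": 4},
]

# Fold to a number: sum over a generator expression.
total_replicas = sum(s["replicas"] for s in services)

# Fold to "the largest row": max with a key function.
busiest = max(services, key=lambda s: s["replicas"])

# Fold to a lookup dict: a dict comprehension builds it in one pass.
id_to_name = {s["id"]: s["name"] for s in services}

print(total_replicas)   # 14
print(busiest["name"])  # maps-tile-server
print(id_to_name[3])    # routing-engine
```

One difference to keep in mind: on a tie, max returns the first row with the largest value, while a reduce lambda of the form a if a["replicas"] > b["replicas"] else b ends up with the last one.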
Try it: incident records
The guide Incident records analyzer walks through a small list of incident dicts: filter, enrich, group, aggregate, and a short pipeline. It is stdlib only and includes a full solution you can run.
Other Common Operations on Record Lists
- Aggregate: totals, averages, min/max, and counts.
- Transform / enrich: derive fields like is_critical or normalized names.
- Project: keep only selected fields for output.
- Deduplicate: remove repeated rows (for example by id).
- Join / merge: combine with another dataset by a shared key.
- Partition: split into two sets by a condition.
- Validate / clean: check required keys/types and fill defaults.
- Index: build fast lookups (for example id -> record).
- Top-k / sample: keep top N rows or random samples for inspection.
- Export: write JSON/CSV for downstream tools.
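Several of the operations listed above fit in a few lines each. The sketch below is one possible stdlib-only take; the owners mapping and the duplicate row are made up for illustration and are not part of the inventory used earlier:

```python
import heapq
import json

services = [
    {"id": 1, "name": "auth-service", "status": "healthy", "replicas": 3},
    {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5},
    {"id": 2, "name": "maps-tile-server", "status": "degraded", "replicas": 5},  # duplicate
    {"id": 3, "name": "routing-engine", "status": "healthy", "replicas": 2},
]
# Hypothetical second dataset keyed by service id.
owners = {1: "identity-team", 2: "maps-team"}

# Deduplicate: keep the first row seen for each id.
seen = set()
unique = []
for s in services:
    if s["id"] not in seen:
        seen.add(s["id"])
        unique.append(s)

# Index: id -> record for constant-time lookups.
by_id = {s["id"]: s for s in unique}

# Join / merge: attach the owner, with a default when no owner is known.
joined = [{**s, "owner": owners.get(s["id"], "unknown")} for s in unique]

# Partition: split into two lists by a condition.
healthy = [s for s in unique if s["status"] == "healthy"]
unhealthy = [s for s in unique if s["status"] != "healthy"]

# Project: keep only selected fields.
projected = [{"name": s["name"], "replicas": s["replicas"]} for s in unique]

# Top-k: the two rows with the most replicas.
top_two = heapq.nlargest(2, unique, key=lambda s: s["replicas"])

# Export: serialize the projected rows as JSON.
print(json.dumps(projected, indent=2))
```

For CSV output, csv.DictWriter from the standard library plays the same role as json.dumps does here.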