Dataclasses
What are dataclasses?
Dataclasses are a way to define classes that mainly hold data, with less boilerplate. You declare attributes with type hints, and by default Python generates `__init__`, `__repr__`, and `__eq__` for you. They live in the standard library (`from dataclasses import dataclass`) and require Python 3.7+.
Example with mixed types
Fields can use any type you would annotate elsewhere: `str`, `int`, `float`, `bool`, `datetime.date`, optional types, nested dataclasses, and so on:
```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ServiceRecord:
    vehicle_id: str
    service_type: str
    serviced_on: date
    odometer_km: int
    cost_usd: float
    warranty_active: bool


rec = ServiceRecord(
    vehicle_id="VH-204",
    service_type="Oil change",
    serviced_on=date(2026, 2, 15),
    odometer_km=45200,
    cost_usd=89.99,
    warranty_active=True,
)
print(rec)
# ServiceRecord(vehicle_id='VH-204', service_type='Oil change', serviced_on=datetime.date(2026, 2, 15), odometer_km=45200, cost_usd=89.99, warranty_active=True)
```

- The `@dataclass` decorator tells Python to generate `__init__`, `__repr__`, and `__eq__` from the attribute list.
- You can construct instances with positional or keyword arguments; positional arguments follow the field order in the class.
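As a quick check of the two points above, the sketch below builds the same record positionally and by keyword, then compares the results with the generated `__eq__`. The trimmed-down `Visit` class is a hypothetical stand-in used only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Visit:  # hypothetical, shortened stand-in for ServiceRecord
    vehicle_id: str
    serviced_on: date
    cost_usd: float


# Positional arguments follow the field order declared in the class.
by_position = Visit("VH-204", date(2026, 2, 15), 89.99)
# Keyword arguments can be given in any order.
by_keyword = Visit(cost_usd=89.99, vehicle_id="VH-204", serviced_on=date(2026, 2, 15))

# The generated __eq__ compares instances of the same class field by field.
same = by_position == by_keyword
different = by_position == Visit("VH-999", date(2026, 2, 15), 89.99)
print(same, different)  # True False
```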
Create, read, update, delete (CRUD-style)
Dataclass instances are mutable by default, so you often read and update attributes in place. When you want a new instance with some fields changed (without mutating the original), use `dataclasses.replace`. "Delete" usually means removing an object from a collection or dropping a reference, not deleting a column as in a database.
```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass
class ServiceRecord:
    vehicle_id: str
    service_type: str
    serviced_on: date
    odometer_km: int
    cost_usd: float
    warranty_active: bool


# CREATE: construct a new instance
rec = ServiceRecord("VH-204", "Oil change", date(2026, 2, 15), 45200, 89.99, True)

# READ: attribute access
print(rec.vehicle_id, rec.serviced_on, rec.warranty_active)

# UPDATE (in place): mutate attributes
rec.odometer_km = 46000
rec.warranty_active = False

# UPDATE (new instance): replace() returns a copy with the given fields changed; `rec` itself is not modified
rec_v2 = replace(rec, serviced_on=date(2026, 3, 1), cost_usd=120.50)

# DELETE: typically remove from a list or stop referencing the object
history = [rec, rec_v2]
history = [r for r in history if r.vehicle_id != "VH-204"]  # drop matching records (both, in this case)
# Alternative: del history[0]  # remove by index instead
```

Summary
| Operation | Typical approach |
|---|---|
| Create | Call the class like a constructor: `ServiceRecord(...)` |
| Read | Use dot access: `rec.odometer_km` |
| Update | Assign: `rec.odometer_km = …`, or `replace(rec, field=value)` for a copy with changes |
| Delete | Remove from a list/dict, or `del` the variable; `@dataclass(frozen=True)` is a separate option that forbids attribute assignment |
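The `frozen=True` option mentioned in the table can be sketched as follows. The `Odometer` class is a hypothetical example. The point is that assignment on a frozen instance raises `dataclasses.FrozenInstanceError`, while `replace` still works because it builds a new object.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Odometer:  # hypothetical example class
    vehicle_id: str
    km: int


reading = Odometer("VH-204", 45200)

try:
    reading.km = 46000  # assignment is rejected on a frozen instance
except FrozenInstanceError:
    blocked = True
else:
    blocked = False

# replace() still works: it builds a new instance rather than mutating.
updated = replace(reading, km=46000)
print(blocked, reading.km, updated.km)  # True 45200 46000
```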
For optional fields, use `Optional[...]` or `T | None` (3.10+) together with a default, e.g. `notes: str | None = None`.
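A minimal sketch of an optional field with a default, using a hypothetical `ServiceNote` class. It uses `Optional[str]` so that it also runs on Python versions before 3.10. Note that fields with defaults have to come after fields without them.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ServiceNote:  # hypothetical example class
    vehicle_id: str
    serviced_on: date
    # Fields with defaults must come after fields without defaults.
    notes: Optional[str] = None  # equivalent to `str | None` on Python 3.10+


bare = ServiceNote("VH-204", date(2026, 2, 15))
annotated = ServiceNote("VH-204", date(2026, 2, 15), notes="Replaced filter")
print(bare.notes, annotated.notes)  # None Replaced filter
```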
How they differ from …
Regular class
With a normal class you write `__init__`, `__repr__`, and `__eq__` by hand. A dataclass generates all of these for you.
Manual class: you implement the constructor (how to create an instance), the string representation (how it prints), and equality (when two instances are considered equal) yourself:
```python
from datetime import date


class ServiceRecord:
    def __init__(
        self,
        vehicle_id: str,
        service_type: str,
        serviced_on: date,
        odometer_km: int,
        cost_usd: float,
        warranty_active: bool,
    ):
        self.vehicle_id = vehicle_id
        self.service_type = service_type
        self.serviced_on = serviced_on
        self.odometer_km = odometer_km
        self.cost_usd = cost_usd
        self.warranty_active = warranty_active

    def __repr__(self):
        return (
            f"ServiceRecord(vehicle_id={self.vehicle_id!r}, service_type={self.service_type!r}, "
            f"serviced_on={self.serviced_on!r}, odometer_km={self.odometer_km}, "
            f"cost_usd={self.cost_usd}, warranty_active={self.warranty_active})"
        )

    def __eq__(self, other):
        if not isinstance(other, ServiceRecord):
            return NotImplemented
        return (
            self.vehicle_id == other.vehicle_id
            and self.service_type == other.service_type
            and self.serviced_on == other.serviced_on
            and self.odometer_km == other.odometer_km
            and self.cost_usd == other.cost_usd
            and self.warranty_active == other.warranty_active
        )
```

Dataclass: the same behavior from a short attribute list:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ServiceRecord:
    vehicle_id: str
    service_type: str
    serviced_on: date
    odometer_km: int
    cost_usd: float
    warranty_active: bool
```

Same idea, with less code and the field types visible at a glance.
namedtuple
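- Dataclasses are mutable by default, support default values and type hints naturally, can have methods, and are normal classes (so IDEs and type checkers understand them).
- `namedtuple` is immutable and has a very light syntax, but `collections.namedtuple` takes defaults only through its `defaults=` argument and carries no type hints (the class-based `typing.NamedTuple` adds both). Its instances are also tuples, so they unpack, index, and compare equal to plain tuples with the same values.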
Use dataclasses when you want a small data container with optional defaults and methods.
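The differences listed above can be seen side by side in a short sketch. Both point types here are hypothetical and exist only for the comparison.

```python
from collections import namedtuple
from dataclasses import dataclass

# Hypothetical point types used only for this comparison.
PointTuple = namedtuple("PointTuple", ["x", "y"], defaults=[0])  # default applies to y


@dataclass
class PointData:
    x: int
    y: int = 0

    def shifted(self, dx: int) -> "PointData":
        # A method living next to the data it works on.
        return PointData(self.x + dx, self.y)


pt = PointTuple(1)
dc = PointData(1)

# A namedtuple is a tuple: it unpacks and compares equal to a plain tuple.
x, y = pt
tuple_like = pt == (1, 0)

# Assigning to a namedtuple field raises AttributeError; a dataclass allows it.
try:
    pt.x = 5
except AttributeError:
    tuple_mutable = False
else:
    tuple_mutable = True

dc.x = 5
print(tuple_like, tuple_mutable, dc.shifted(2))
# True False PointData(x=7, y=0)
```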
When to use
Good for: config objects, parsing results, small data transfer objects (DTOs), and any place you'd otherwise write a class that's mostly "data + maybe a method or two." For a longer example, see Process cloud policies, which uses dataclasses for policy statements and documents.
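As one illustration of the "data + maybe a method or two" shape, here is a sketch of a small config object. `RetryConfig`, its fields, and both methods are hypothetical names invented for this example and are not taken from the linked page.

```python
from dataclasses import dataclass


@dataclass
class RetryConfig:  # hypothetical config object
    max_attempts: int = 3
    base_delay_s: float = 0.5

    @classmethod
    def from_dict(cls, raw: dict) -> "RetryConfig":
        # Convert loosely typed input (e.g. parsed JSON) into typed fields.
        return cls(
            max_attempts=int(raw.get("max_attempts", 3)),
            base_delay_s=float(raw.get("base_delay_s", 0.5)),
        )

    def delay_for(self, attempt: int) -> float:
        # Exponential backoff: the delay doubles with each attempt (attempts start at 1).
        return self.base_delay_s * 2 ** (attempt - 1)


cfg = RetryConfig.from_dict({"max_attempts": "5"})
print(cfg)               # RetryConfig(max_attempts=5, base_delay_s=0.5)
print(cfg.delay_for(3))  # 2.0
```

Keeping the parsing in a classmethod leaves the generated `__init__` untouched, so the class can still be constructed directly with typed values.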