Deepen PracticeOrdered learning track

Data Access: SQLAlchemy, Transactions, Repository, dan Unit of Work

Part 027 — Data Access: SQLAlchemy, Transactions, Repository, dan Unit of Work

Membahas data access Python secara production-grade: SQLAlchemy 2.x style, Engine, Session, ORM mapping, transactions, repository pattern, unit of work, migration, testing, dan boundary design.

14 min read2797 words
PrevNext
Lesson 2735 lesson track2029 Deepen Practice
#python#sqlalchemy#database#transactions+4 more

Part 027 — Data Access: SQLAlchemy, Transactions, Repository, dan Unit of Work

1. Tujuan Part Ini

Setelah JSON file storage, batas berikutnya adalah database.

Database bukan sekadar “tempat simpan data”. Database membawa konsep engineering yang kuat:

  • transaksi;
  • consistency;
  • isolation;
  • constraints;
  • indexes;
  • query planning;
  • migrations;
  • connection pooling;
  • locking;
  • concurrency;
  • durability;
  • failure recovery;
  • schema evolution;
  • multi-user access.

Di Python, SQLAlchemy adalah salah satu toolkit database paling penting. Ia menyediakan dua lapisan utama:

  • SQLAlchemy Core: SQL expression language, engine, connection, table, statement.
  • SQLAlchemy ORM: object-relational mapper, mapped classes, Session, identity map, unit of work.

Part ini membahas SQLAlchemy sebagai boundary data access, bukan sekadar template CRUD.

Target setelah part ini:

  1. Memahami masalah yang diselesaikan database dibanding JSON file.
  2. Memahami Engine, Connection, Session.
  3. Memahami ORM mapping SQLAlchemy 2.x style.
  4. Memahami transaction boundary.
  5. Memahami flush, commit, rollback.
  6. Memahami identity map dan unit of work.
  7. Mendesain repository yang tidak membocorkan ORM.
  8. Mendesain Unit of Work pattern.
  9. Menghindari session lifetime bug.
  10. Menulis tests data access.
  11. Memahami migration dan Alembic secara konseptual.
  12. Menghubungkan SQLAlchemy ke case-tracker.

2. Kenapa Berpindah dari JSON File ke Database?

JSON file cukup untuk:

  • learning project;
  • local CLI single-user;
  • small config/data;
  • simple export/import;
  • prototype;
  • deterministic tests.

JSON file mulai bermasalah jika:

  • banyak user menulis bersamaan;
  • data besar;
  • query/filter kompleks;
  • butuh index;
  • butuh transaksi;
  • butuh migration;
  • butuh constraint;
  • butuh partial update;
  • butuh audit log;
  • butuh referential integrity;
  • butuh backup/restore lebih serius;
  • butuh concurrent reader/writer.

Database memberi boundary yang lebih kuat.

Untuk case-tracker, database masuk akal ketika:

  • API multi-user;
  • cases banyak;
  • status query sering;
  • notes/audit trail perlu relasi;
  • transition harus transactional;
  • duplicate id harus dicegah di DB;
  • data tidak boleh corrupt karena write bersamaan.

3. SQLAlchemy Mental Model

Concepts:

ConceptMeaning
EngineDatabase connectivity factory + connection pool
ConnectionActual DBAPI connection/transaction channel
SessionORM unit of work + identity map + transaction interface
Mapped classPython class mapped to database table
StatementSQL expression such as select(...)
TransactionAtomic set of operations
FlushSend pending SQL to DB transaction
CommitPersist transaction
RollbackUndo transaction
Identity mapOne Python object per DB identity within a Session

Important:

In ORM code, Session is the main object for database interaction. It is not just a connection wrapper; it tracks object state and transaction work.


4. Engine

Create engine:

from sqlalchemy import create_engine

engine = create_engine("sqlite:///cases.db", echo=False)

For PostgreSQL:

engine = create_engine("postgresql+psycopg://user:password@localhost:5432/cases")

Engine:

  • knows database URL;
  • manages connection pool;
  • creates connections;
  • is usually long-lived;
  • is safe to share as application-level dependency;
  • is not a transaction itself.

Do not create engine per request.

Bad:

def get_case(...):
    engine = create_engine(...)
    ...

Better:

engine = create_engine(...)
SessionLocal = sessionmaker(bind=engine)

5. Session

Session is ORM workspace.

from sqlalchemy.orm import sessionmaker

SessionLocal = sessionmaker(bind=engine)

Use:

with SessionLocal() as session:
    ...

Session:

  • tracks ORM objects;
  • coordinates flush/commit/rollback;
  • has identity map;
  • represents a transaction scope in practical usage;
  • should be short-lived for request/use case;
  • should not be shared across threads/tasks.

Rule:

One Session per unit of work/request/job. Do not use one global Session for the whole app.


6. Session Lifetime

Good pattern:

def run_use_case() -> None:
    with SessionLocal() as session:
        with session.begin():
            ...

Or:

with SessionLocal.begin() as session:
    ...

The context commits if no exception and rolls back on exception.

Conceptual:

If exception:

rollback -> close -> propagate error

7. Declarative ORM Mapping

SQLAlchemy 2.x style mapping often uses DeclarativeBase, Mapped, and mapped_column.

from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class CaseRecord(Base):
    __tablename__ = "cases"

    id: Mapped[str] = mapped_column(primary_key=True)
    title: Mapped[str]
    status: Mapped[str]

Create tables for learning:

Base.metadata.create_all(engine)

For production, use migrations instead of calling create_all as deployment strategy.


8. Domain Model vs ORM Model

Domain model:

@dataclass
class Case:
    id: CaseId
    title: str
    status: CaseStatus
    notes: list[str] = field(default_factory=list)

ORM model:

class CaseRecord(Base):
    __tablename__ = "cases"

    id: Mapped[str] = mapped_column(primary_key=True)
    title: Mapped[str]
    status: Mapped[str]

Should they be the same class?

Options:

8.1 Active Record-ish: ORM Model as Domain

class Case(Base):
    __tablename__ = "cases"
    ...

Pros:

  • less mapping code;
  • faster to start;
  • common in many apps;
  • simple CRUD.

Cons:

  • domain tied to SQLAlchemy;
  • persistence concerns leak into domain;
  • tests may need DB/session;
  • hard to keep framework-agnostic;
  • domain lifecycle can be shaped by ORM.

8.2 Separate Domain and ORM Record

Pros:

  • clean domain;
  • persistence isolated;
  • easier pure unit tests;
  • supports multiple adapters;
  • aligns with ports/adapters.

Cons:

  • mapping code;
  • possible duplication;
  • more architecture ceremony;
  • must maintain translation.

For this learning series, we prefer separation because the user’s goal is top-tier engineering judgment, not fastest CRUD.


9. Mapping Functions

ORM to domain:

def case_record_to_domain(record: CaseRecord) -> Case:
    return Case(
        id=CaseId(record.id),
        title=record.title,
        status=CaseStatus(record.status),
        notes=[],
    )

Domain to ORM:

def update_case_record_from_domain(record: CaseRecord, case: Case) -> None:
    record.title = case.title
    record.status = case.status.value

Create record:

def case_to_record(case: Case) -> CaseRecord:
    return CaseRecord(
        id=case.id.value,
        title=case.title,
        status=case.status.value,
    )

Mapping is not wasted code. It is boundary control.


10. Basic Insert

with SessionLocal.begin() as session:
    record = CaseRecord(
        id="CASE-001",
        title="Late reporting",
        status="DRAFT",
    )
    session.add(record)

When transaction commits, SQL INSERT is persisted.

Important:

  • session.add marks object pending.
  • flush sends SQL.
  • commit commits transaction.
  • SQLAlchemy may flush before query/commit.

11. Basic Select

from sqlalchemy import select

with SessionLocal() as session:
    statement = select(CaseRecord).where(CaseRecord.id == "CASE-001")
    record = session.scalars(statement).one_or_none()

one_or_none() returns:

  • one object;
  • None;
  • raises if multiple rows.

For primary key, also:

record = session.get(CaseRecord, "CASE-001")

session.get is convenient for primary key lookup and can use identity map.


12. Basic Update

with SessionLocal.begin() as session:
    record = session.get(CaseRecord, "CASE-001")

    if record is None:
        raise CaseNotFoundError(CaseId("CASE-001"))

    record.status = "SUBMITTED"

No explicit session.add needed if record is already persistent in session. SQLAlchemy tracks changes and flushes.

This is unit-of-work behavior.


13. Basic Delete

with SessionLocal.begin() as session:
    record = session.get(CaseRecord, "CASE-001")

    if record is not None:
        session.delete(record)

Deletion occurs on flush/commit.

For domain systems, deletion may not be allowed. Often you use soft delete or status transition to preserve audit/history.


14. Flush vs Commit

Flush:

Send pending SQL statements to database transaction, but do not finalize transaction.

Commit:

Commit database transaction so changes become durable/visible according to isolation rules.

Example:

session.add(record)
session.flush()
print(record.id)
session.commit()

Flush can happen automatically before queries or commit.

Why care?

  • DB-generated ids may be available after flush;
  • constraints may fail on flush;
  • you may need SQL executed before another query;
  • rollback can still undo flushed changes before commit.

Rule:

Commit belongs at transaction boundary. Do not scatter commits deep inside repository methods unless that is explicitly the boundary.


15. Rollback

If exception occurs, rollback undoes transaction.

with SessionLocal() as session:
    try:
        ...
        session.commit()
    except Exception:
        session.rollback()
        raise

Context manager pattern reduces boilerplate:

with SessionLocal.begin() as session:
    ...

If exception, rollback automatically.


16. Identity Map

Within a Session, SQLAlchemy tracks one object per database identity.

with SessionLocal() as session:
    first = session.get(CaseRecord, "CASE-001")
    second = session.get(CaseRecord, "CASE-001")

    assert first is second

This can be useful, but also surprising.

Implications:

  • Session is stateful.
  • Long-lived Session can hold stale objects.
  • Session should not be global.
  • Different sessions can see different object instances/states.
  • Commit/expire behavior matters.

17. Expiration and Staleness

By default, Session may expire ORM objects on commit depending configuration. Accessing attributes after commit can trigger reload if session active.

This is one reason not to return ORM objects outside session boundary.

Bad:

def get_case_record() -> CaseRecord:
    with SessionLocal() as session:
        return session.get(CaseRecord, "CASE-001")

Returned object may be detached or stale.

Better:

def get_case() -> Case:
    with SessionLocal() as session:
        record = session.get(CaseRecord, "CASE-001")
        return case_record_to_domain(record)

Return domain object or DTO, not live ORM entity, unless your architecture explicitly uses ORM model as domain.


18. Repository Pattern

Repository hides persistence details behind domain-oriented methods.

Protocol:

from typing import Protocol


class CaseRepository(Protocol):
    def get(self, case_id: CaseId) -> Case:
        ...

    def list(self, status: CaseStatus | None = None) -> list[Case]:
        ...

    def add(self, case: Case) -> None:
        ...

    def save(self, case: Case) -> None:
        ...

Service depends on protocol:

class CaseService:
    def __init__(self, repository: CaseRepository) -> None:
        self._repository = repository

SQLAlchemy adapter implements protocol.


19. SQLAlchemy Repository

class SqlAlchemyCaseRepository:
    def __init__(self, session: Session) -> None:
        self._session = session

    def get(self, case_id: CaseId) -> Case:
        record = self._session.get(CaseRecord, case_id.value)

        if record is None:
            raise CaseNotFoundError(case_id)

        return case_record_to_domain(record)

    def add(self, case: Case) -> None:
        self._session.add(case_to_record(case))

    def save(self, case: Case) -> None:
        record = self._session.get(CaseRecord, case.id.value)

        if record is None:
            raise CaseNotFoundError(case.id)

        update_case_record_from_domain(record, case)

Note:

  • repository does not commit;
  • transaction boundary belongs to Unit of Work/service boundary;
  • repository uses session injected from UoW.

20. Why Repository Should Not Commit

Bad:

class SqlAlchemyCaseRepository:
    def save(self, case: Case) -> None:
        ...
        self._session.commit()

Problem:

def transition_and_add_audit(case_id):
    repository.save(case)
    audit_repository.add(event)

If save commits before audit event, you can persist case transition but fail audit write.

Better:

with uow:
    case = uow.cases.get(case_id)
    case.transition_to(target)
    uow.cases.save(case)
    uow.audit_events.add(event)
    uow.commit()

One transaction for all related changes.


21. Unit of Work Pattern

Unit of Work represents transaction boundary.

Protocol:

from typing import Protocol


class UnitOfWork(Protocol):
    cases: CaseRepository

    def __enter__(self) -> "UnitOfWork":
        ...

    def __exit__(self, exc_type, exc, tb) -> None:
        ...

    def commit(self) -> None:
        ...

    def rollback(self) -> None:
        ...

Usage:

class CaseService:
    def __init__(self, uow_factory: Callable[[], UnitOfWork]) -> None:
        self._uow_factory = uow_factory

    def transition_case(self, case_id: CaseId, target_status: CaseStatus) -> Case:
        with self._uow_factory() as uow:
            case = uow.cases.get(case_id)
            case.transition_to(target_status)
            uow.cases.save(case)
            uow.commit()
            return case

This makes transaction boundary explicit.


22. SQLAlchemy Unit of Work

class SqlAlchemyUnitOfWork:
    def __init__(self, session_factory: sessionmaker[Session]) -> None:
        self._session_factory = session_factory

    def __enter__(self) -> "SqlAlchemyUnitOfWork":
        self.session = self._session_factory()
        self.cases = SqlAlchemyCaseRepository(self.session)
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        if exc_type is not None:
            self.rollback()

        self.session.close()

    def commit(self) -> None:
        self.session.commit()

    def rollback(self) -> None:
        self.session.rollback()

Alternative: use session.begin() inside UoW. The exact implementation can vary. The design rule is stable:

one UoW represents one transaction boundary.


23. Commit/Rollback Discipline

Pattern:

with uow_factory() as uow:
    ...
    uow.commit()

If exception happens before commit, __exit__ rolls back.

Potential issue: if no exception but commit forgotten, changes may not persist.

Some teams prefer UoW that commits automatically on successful exit. Others require explicit commit.

23.1 Explicit Commit Style

Pros:

  • commit is visible;
  • no accidental commit;
  • aligns with “commit only when use case complete”.

Cons:

  • forgetting commit silently rolls back/does not persist depending implementation.

23.2 Auto Commit Style

Pros:

  • less boilerplate.

Cons:

  • accidental writes if code exits normally;
  • transaction intent less explicit.

For learning, explicit commit teaches boundary awareness.


24. Transaction Boundary in API

FastAPI dependency:

def get_uow() -> UnitOfWork:
    return SqlAlchemyUnitOfWork(SessionLocal)

Route:

def transition_case_endpoint(
    case_id: str,
    request: TransitionCaseRequest,
    service: CaseService = Depends(get_case_service),
):
    case = service.transition_case(CaseId(case_id), CaseStatus(request.target_status.value))
    return case_to_response(case)

Service owns UoW boundary. Route does not call commit.

Alternatively, per-request transaction dependency can wrap route, but then service transaction semantics become less explicit.

For complex domain, service-level UoW is often clearer.


25. Transaction Boundary in CLI

CLI:

def main(argv: list[str] | None = None) -> int:
    ...
    service = CaseService(lambda: SqlAlchemyUnitOfWork(SessionLocal))
    service.transition_case(...)

Same service as API.

This is the value of clean boundary: CLI and API share use cases.


26. Constraints

Database constraints protect integrity.

Example:

class CaseRecord(Base):
    __tablename__ = "cases"

    id: Mapped[str] = mapped_column(primary_key=True)
    title: Mapped[str] = mapped_column(nullable=False)
    status: Mapped[str] = mapped_column(nullable=False)

Constraints:

  • primary key;
  • unique;
  • not null;
  • foreign key;
  • check constraint;
  • index.

Domain validation is not enough. Database constraints protect against:

  • bugs;
  • concurrent writes;
  • manual DB changes;
  • different app versions;
  • migration mistakes;
  • batch imports.

27. Indexes

If frequently querying by status:

from sqlalchemy import Index

Index("ix_cases_status", CaseRecord.status)

Or in mapped column:

status: Mapped[str] = mapped_column(index=True)

Index improves read query speed but costs:

  • storage;
  • slower writes;
  • migration complexity;
  • maintenance overhead.

Add indexes based on query patterns.


28. Relationships: Notes

Relational model:

from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship


class CaseNoteRecord(Base):
    __tablename__ = "case_notes"

    id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
    case_id: Mapped[str] = mapped_column(ForeignKey("cases.id"))
    text: Mapped[str]


class CaseRecord(Base):
    __tablename__ = "cases"

    id: Mapped[str] = mapped_column(primary_key=True)
    title: Mapped[str]
    status: Mapped[str]

    notes: Mapped[list[CaseNoteRecord]] = relationship(
        cascade="all, delete-orphan",
    )

Relationship loading strategy matters. Lazy loading can cause N+1.


29. N+1 Query Problem

Bad pattern:

cases = session.scalars(select(CaseRecord)).all()

for case in cases:
    print(case.notes)

If notes lazy-load per case, this can issue one query for cases plus one query per case.

Fix with eager loading:

from sqlalchemy.orm import selectinload

statement = select(CaseRecord).options(selectinload(CaseRecord.notes))
cases = session.scalars(statement).all()

Choose loading strategy based on use case.


30. SQL Injection

Never interpolate user input into raw SQL.

Bad:

session.execute(text(f"SELECT * FROM cases WHERE id = '{case_id}'"))

Good:

from sqlalchemy import text

session.execute(
    text("SELECT * FROM cases WHERE id = :case_id"),
    {"case_id": case_id},
)

ORM expressions also parameterize values:

select(CaseRecord).where(CaseRecord.id == case_id)

SQL injection is a security issue, not style issue.


31. Raw SQL

SQLAlchemy ORM is powerful, but raw SQL can be appropriate for:

  • complex reporting;
  • performance-critical queries;
  • database-specific features;
  • migrations;
  • administrative scripts;
  • query tuning.

Use parameterization.

Keep raw SQL localized and tested.

Do not force every query through ORM if SQL is clearer.


32. Migrations

Base.metadata.create_all(engine) is fine for learning and tests.

Production needs migrations.

Alembic is commonly used with SQLAlchemy to:

  • create migration scripts;
  • upgrade/downgrade schema;
  • track migration history;
  • review schema changes;
  • deploy incremental DB changes.

Migration mindset:

  • schema changes are code;
  • migrations are reviewed;
  • data migrations need rollback/backup plan;
  • deployment order matters;
  • incompatible app/schema versions must be planned.

33. SQLite vs PostgreSQL

SQLite:

  • embedded;
  • zero server;
  • great for local apps/tests;
  • transactional;
  • enough for many tools;
  • limited concurrency compared to server DB;
  • behavior differs from PostgreSQL.

PostgreSQL:

  • server database;
  • strong concurrency;
  • rich SQL/features;
  • production-grade multi-user workloads;
  • operational overhead;
  • connection pooling matters.

For case-tracker:

  • SQLite is good next step after JSON.
  • PostgreSQL makes sense for multi-user API production.

Test against the database you deploy for critical behavior.


34. Testing Repository

Use SQLite temp file:

def create_test_session_factory(tmp_path: Path) -> sessionmaker[Session]:
    engine = create_engine(f"sqlite:///{tmp_path / 'test.db'}")
    Base.metadata.create_all(engine)
    return sessionmaker(bind=engine)

Test:

def test_sqlalchemy_repository_adds_and_gets_case(tmp_path) -> None:
    session_factory = create_test_session_factory(tmp_path)

    with session_factory.begin() as session:
        repository = SqlAlchemyCaseRepository(session)
        case = Case(id=CaseId("CASE-001"), title="Late reporting")
        repository.add(case)

    with session_factory() as session:
        repository = SqlAlchemyCaseRepository(session)
        loaded = repository.get(CaseId("CASE-001"))

    assert loaded.id == CaseId("CASE-001")

Use temp file rather than in-memory DB if multiple connections matter.


35. In-Memory SQLite Caveat

sqlite:///:memory: database is per connection by default.

If your test creates tables on one connection and uses another, tables may disappear.

For simple tests, use temp file SQLite:

sqlite:///tmp/test.db

or configure shared connection carefully.

Do not let SQLite test behavior hide production DB issues.


36. Contract Test for Repository

Re-use contract from fake repository.

def assert_case_repository_contract(repository: CaseRepository) -> None:
    case = Case(id=CaseId("CASE-001"), title="Late reporting")
    repository.add(case)

    loaded = repository.get(case.id)

    assert loaded.id == case.id
    assert loaded.title == case.title

For SQLAlchemy repository, wrap in transaction.

def test_sqlalchemy_case_repository_contract(tmp_path) -> None:
    session_factory = create_test_session_factory(tmp_path)

    with session_factory.begin() as session:
        repository = SqlAlchemyCaseRepository(session)
        assert_case_repository_contract(repository)

Contract tests ensure fake and real implementations align.


37. Testing Unit of Work

def test_uow_commits_changes(tmp_path) -> None:
    session_factory = create_test_session_factory(tmp_path)
    uow_factory = lambda: SqlAlchemyUnitOfWork(session_factory)

    with uow_factory() as uow:
        uow.cases.add(Case(id=CaseId("CASE-001"), title="Late reporting"))
        uow.commit()

    with session_factory() as session:
        record = session.get(CaseRecord, "CASE-001")

    assert record is not None

Rollback:

def test_uow_rolls_back_on_error(tmp_path) -> None:
    session_factory = create_test_session_factory(tmp_path)
    uow_factory = lambda: SqlAlchemyUnitOfWork(session_factory)

    with pytest.raises(RuntimeError):
        with uow_factory() as uow:
            uow.cases.add(Case(id=CaseId("CASE-001"), title="Late reporting"))
            raise RuntimeError("boom")

    with session_factory() as session:
        assert session.get(CaseRecord, "CASE-001") is None

38. Session Thread Safety

Do not share Session across concurrent threads/tasks.

Bad:

global_session = SessionLocal()

Then used by all requests.

Good:

def get_session():
    with SessionLocal() as session:
        yield session

For API, one session per request or one UoW per use case.

For background workers, one session per job/unit of work.


39. Async SQLAlchemy

SQLAlchemy supports async usage with async engine/session.

Conceptual:

from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker

engine = create_async_engine("postgresql+asyncpg://...")
SessionLocal = async_sessionmaker(engine)

Async repository methods become async def.

Important:

  • do not use sync Session in async route if it blocks event loop heavily;
  • do not mix async/sync APIs casually;
  • async DB driver required;
  • transaction boundary still matters;
  • Session is still stateful and must not be shared concurrently.

For current learning step, sync SQLAlchemy is simpler. Learn transaction semantics first.


40. Case Tracker SQL Schema

Tables:

cases
  id text primary key
  title text not null
  status text not null
  created_at timestamp not null
  updated_at timestamp not null

case_notes
  id integer primary key
  case_id text foreign key -> cases.id
  text text not null
  created_at timestamp not null

audit_events
  id text primary key
  case_id text not null
  event_type text not null
  occurred_at timestamp not null
  details_json text not null

This separates:

  • current state;
  • notes;
  • audit trail.

Do not store audit trail only in logs.


41. Case Tracker Transition Transaction

Use case:

def transition_case(self, case_id: CaseId, target_status: CaseStatus, actor: Actor) -> Case:
    with self._uow_factory() as uow:
        case = uow.cases.get(case_id)
        from_status = case.status

        case.transition_to(target_status)

        event = AuditEvent.case_transitioned(
            case_id=case.id,
            actor=actor,
            from_status=from_status,
            to_status=target_status,
        )

        uow.cases.save(case)
        uow.audit_events.add(event)
        uow.commit()

        return case

Important:

  • case update and audit event commit together;
  • invalid transition raises before save;
  • failure before commit rolls back all;
  • transaction boundary is explicit.

42. Case Tracker SQLAlchemy Mapping Sketch

class CaseRecord(Base):
    __tablename__ = "cases"

    id: Mapped[str] = mapped_column(primary_key=True)
    title: Mapped[str] = mapped_column(nullable=False)
    status: Mapped[str] = mapped_column(nullable=False)
    created_at: Mapped[datetime] = mapped_column(nullable=False)
    updated_at: Mapped[datetime] = mapped_column(nullable=False)


class CaseNoteRecord(Base):
    __tablename__ = "case_notes"

    id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
    case_id: Mapped[str] = mapped_column(ForeignKey("cases.id"), nullable=False)
    text: Mapped[str] = mapped_column(nullable=False)
    created_at: Mapped[datetime] = mapped_column(nullable=False)

Keep mapping explicit.


43. Case Tracker Repository Query Methods

def list(self, status: CaseStatus | None = None, *, limit: int = 100, offset: int = 0) -> list[Case]:
    statement = select(CaseRecord).order_by(CaseRecord.id).limit(limit).offset(offset)

    if status is not None:
        statement = statement.where(CaseRecord.status == status.value)

    records = self._session.scalars(statement).all()

    return [case_record_to_domain(record) for record in records]

Pagination should happen in database, not after loading all rows.


44. Count Query

from sqlalchemy import func, select


def count(self, status: CaseStatus | None = None) -> int:
    statement = select(func.count()).select_from(CaseRecord)

    if status is not None:
        statement = statement.where(CaseRecord.status == status.value)

    return self._session.scalar(statement) or 0

For API list response:

total = repository.count(status)
cases = repository.list(status, limit=limit, offset=offset)

Avoid loading all cases just to count.


45. Data Access Smell Checklist

Watch for:

  1. Global Session.
  2. Commit inside repository method.
  3. Domain imports SQLAlchemy.
  4. Route handler directly uses Session for business logic.
  5. ORM object returned outside session boundary unintentionally.
  6. No transaction around multi-write use case.
  7. Audit event saved separately from state change.
  8. No DB constraints.
  9. No indexes for frequent queries.
  10. N+1 queries.
  11. Raw SQL with string interpolation.
  12. create_all used as production migration.
  13. SQLite tests only while production is PostgreSQL with different behavior.
  14. Async route using blocking sync DB heavily.
  15. Session shared across threads/tasks.

46. Practice: Build SQLAlchemy Models

Create:

src/case_tracker_sqlalchemy/
  models.py
  repository.py
  unit_of_work.py

Implement:

  • Base;
  • CaseRecord;
  • CaseNoteRecord;
  • create_engine;
  • sessionmaker.

Use SQLite first.


47. Practice: Repository Implementation

Implement:

  • get;
  • list;
  • add;
  • save;
  • count.

Rules:

  • no commit in repository;
  • map ORM to domain;
  • raise CaseNotFoundError on missing get;
  • use select;
  • support pagination.

48. Practice: Unit of Work

Implement SqlAlchemyUnitOfWork.

Test:

  • commit persists;
  • exception rolls back;
  • missing commit does not persist if explicit commit style;
  • repository available inside UoW.

49. Practice: API Uses SQL UoW

Wire FastAPI dependency:

def get_uow() -> UnitOfWork:
    return SqlAlchemyUnitOfWork(SessionLocal)

Service uses UoW factory.

Test with SQLite temp file or fake UoW.


50. Practice: N+1 Detection

Create cases with notes. Write list endpoint. Log SQL statements with echo=True or test query count conceptually.

Then use selectinload or repository query that fetches notes efficiently.

Explain before/after.


51. Self-Check

Jawab tanpa melihat materi:

  1. Apa perbedaan Engine dan Session?
  2. Kenapa Engine long-lived?
  3. Kenapa Session short-lived?
  4. Apa itu transaction boundary?
  5. Apa beda flush dan commit?
  6. Apa itu rollback?
  7. Apa itu identity map?
  8. Kenapa ORM object tidak sebaiknya dikembalikan keluar session sembarangan?
  9. Apa itu repository pattern?
  10. Kenapa repository tidak commit?
  11. Apa itu Unit of Work pattern?
  12. Kenapa case transition dan audit event harus satu transaksi?
  13. Apa fungsi DB constraints?
  14. Kapan perlu index?
  15. Apa itu N+1 query?
  16. Bagaimana mencegah SQL injection?
  17. Kenapa migration penting?
  18. Apa caveat SQLite in-memory test?
  19. Kenapa Session tidak boleh shared antar threads/tasks?
  20. Kapan async SQLAlchemy layak?

52. Definition of Done Part 027

Kamu selesai part ini jika bisa:

  1. Menjelaskan Engine, Connection, Session.
  2. Membuat SQLAlchemy mapped class 2.x style.
  3. Membuat Session factory.
  4. Menulis insert/select/update/delete sederhana.
  5. Menjelaskan flush/commit/rollback.
  6. Menjelaskan identity map.
  7. Memisahkan domain model dan ORM record.
  8. Menulis mapping domain-record.
  9. Menulis repository tanpa commit.
  10. Menulis Unit of Work.
  11. Menulis transaction use case.
  12. Menambahkan constraints/index.
  13. Menjelaskan N+1 dan eager loading.
  14. Menulis repository tests dengan SQLite.
  15. Menjelaskan migration/Alembic secara konseptual.

53. Ringkasan

Data access adalah boundary reliability.

Inti part ini:

  • database menyelesaikan masalah transaksi, concurrency, query, constraint, dan durability yang JSON file tidak desain untuk;
  • SQLAlchemy Engine adalah konektivitas/pool yang long-lived;
  • Session adalah stateful ORM workspace dan transaction boundary praktis;
  • Session tidak boleh global atau shared antar concurrent tasks;
  • ORM mapping harus dipisahkan dari domain jika ingin architecture bersih;
  • repository menyembunyikan persistence detail;
  • repository tidak sebaiknya commit;
  • Unit of Work mengontrol transaction boundary;
  • flush mengirim SQL, commit menyelesaikan transaksi;
  • identity map berguna tetapi membuat Session stateful;
  • constraints dan indexes adalah bagian dari domain integrity di level database;
  • N+1 query harus diwaspadai;
  • raw SQL harus parameterized;
  • migrations adalah bagian dari delivery, bukan opsional untuk production;
  • API/CLI bisa berbagi service jika data access dibungkus port/adapters.

Part berikutnya akan membahas architecture in Python: layering, ports, adapters, domain modelling, use cases, and dependency direction.


54. Referensi

  • SQLAlchemy Documentation — ORM Quick Start.
  • SQLAlchemy Documentation — Session Basics.
  • SQLAlchemy Documentation — Transactions and Connection Management.
  • SQLAlchemy Documentation — Data Manipulation with the ORM.
  • SQLAlchemy Documentation — ORM Querying Guide.
  • Alembic Documentation — Database Migrations.
Lesson Recap

You just completed lesson 27 in deepen practice. Use the series map if you want to review the broader track, or continue directly into the next lesson while the context is still warm.

Continue The Track

Keep the momentum while the lesson is still fresh. Move backward for review or continue forward into the next concept.