Data Access: SQLAlchemy, Transactions, Repository, dan Unit of Work
Part 027 — Data Access: SQLAlchemy, Transactions, Repository, dan Unit of Work
Membahas data access Python secara production-grade: SQLAlchemy 2.x style, Engine, Session, ORM mapping, transactions, repository pattern, unit of work, migration, testing, dan boundary design.
Part 027 — Data Access: SQLAlchemy, Transactions, Repository, dan Unit of Work
1. Tujuan Part Ini
Setelah JSON file storage, batas berikutnya adalah database.
Database bukan sekadar “tempat simpan data”. Database membawa konsep engineering yang kuat:
- transaksi;
- consistency;
- isolation;
- constraints;
- indexes;
- query planning;
- migrations;
- connection pooling;
- locking;
- concurrency;
- durability;
- failure recovery;
- schema evolution;
- multi-user access.
Di Python, SQLAlchemy adalah salah satu toolkit database paling penting. Ia menyediakan dua lapisan utama:
- SQLAlchemy Core: SQL expression language, engine, connection, table, statement.
- SQLAlchemy ORM: object-relational mapper, mapped classes, Session, identity map, unit of work.
Part ini membahas SQLAlchemy sebagai boundary data access, bukan sekadar template CRUD.
Target setelah part ini:
- Memahami masalah yang diselesaikan database dibanding JSON file.
- Memahami Engine, Connection, Session.
- Memahami ORM mapping SQLAlchemy 2.x style.
- Memahami transaction boundary.
- Memahami flush, commit, rollback.
- Memahami identity map dan unit of work.
- Mendesain repository yang tidak membocorkan ORM.
- Mendesain Unit of Work pattern.
- Menghindari session lifetime bug.
- Menulis tests data access.
- Memahami migration dan Alembic secara konseptual.
- Menghubungkan SQLAlchemy ke
case-tracker.
2. Kenapa Berpindah dari JSON File ke Database?
JSON file cukup untuk:
- learning project;
- local CLI single-user;
- small config/data;
- simple export/import;
- prototype;
- deterministic tests.
JSON file mulai bermasalah jika:
- banyak user menulis bersamaan;
- data besar;
- query/filter kompleks;
- butuh index;
- butuh transaksi;
- butuh migration;
- butuh constraint;
- butuh partial update;
- butuh audit log;
- butuh referential integrity;
- butuh backup/restore lebih serius;
- butuh concurrent reader/writer.
Database memberi boundary yang lebih kuat.
Untuk case-tracker, database masuk akal ketika:
- API multi-user;
- cases banyak;
- status query sering;
- notes/audit trail perlu relasi;
- transition harus transactional;
- duplicate id harus dicegah di DB;
- data tidak boleh corrupt karena write bersamaan.
3. SQLAlchemy Mental Model
Concepts:
| Concept | Meaning |
|---|---|
| Engine | Database connectivity factory + connection pool |
| Connection | Actual DBAPI connection/transaction channel |
| Session | ORM unit of work + identity map + transaction interface |
| Mapped class | Python class mapped to database table |
| Statement | SQL expression such as select(...) |
| Transaction | Atomic set of operations |
| Flush | Send pending SQL to DB transaction |
| Commit | Persist transaction |
| Rollback | Undo transaction |
| Identity map | One Python object per DB identity within a Session |
Important:
In ORM code, Session is the main object for database interaction. It is not just a connection wrapper; it tracks object state and transaction work.
4. Engine
Create engine:
from sqlalchemy import create_engine
engine = create_engine("sqlite:///cases.db", echo=False)
For PostgreSQL:
engine = create_engine("postgresql+psycopg://user:password@localhost:5432/cases")
Engine:
- knows database URL;
- manages connection pool;
- creates connections;
- is usually long-lived;
- is safe to share as application-level dependency;
- is not a transaction itself.
Do not create engine per request.
Bad:
def get_case(...):
engine = create_engine(...)
...
Better:
engine = create_engine(...)
SessionLocal = sessionmaker(bind=engine)
5. Session
Session is ORM workspace.
from sqlalchemy.orm import sessionmaker
SessionLocal = sessionmaker(bind=engine)
Use:
with SessionLocal() as session:
...
Session:
- tracks ORM objects;
- coordinates flush/commit/rollback;
- has identity map;
- represents a transaction scope in practical usage;
- should be short-lived for request/use case;
- should not be shared across threads/tasks.
Rule:
One Session per unit of work/request/job. Do not use one global Session for the whole app.
6. Session Lifetime
Good pattern:
def run_use_case() -> None:
with SessionLocal() as session:
with session.begin():
...
Or:
with SessionLocal.begin() as session:
...
The context commits if no exception and rolls back on exception.
Conceptual:
If exception:
rollback -> close -> propagate error
7. Declarative ORM Mapping
SQLAlchemy 2.x style mapping often uses DeclarativeBase, Mapped, and mapped_column.
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
class Base(DeclarativeBase):
pass
class CaseRecord(Base):
__tablename__ = "cases"
id: Mapped[str] = mapped_column(primary_key=True)
title: Mapped[str]
status: Mapped[str]
Create tables for learning:
Base.metadata.create_all(engine)
For production, use migrations instead of calling create_all as deployment strategy.
8. Domain Model vs ORM Model
Domain model:
@dataclass
class Case:
id: CaseId
title: str
status: CaseStatus
notes: list[str] = field(default_factory=list)
ORM model:
class CaseRecord(Base):
__tablename__ = "cases"
id: Mapped[str] = mapped_column(primary_key=True)
title: Mapped[str]
status: Mapped[str]
Should they be the same class?
Options:
8.1 Active Record-ish: ORM Model as Domain
class Case(Base):
__tablename__ = "cases"
...
Pros:
- less mapping code;
- faster to start;
- common in many apps;
- simple CRUD.
Cons:
- domain tied to SQLAlchemy;
- persistence concerns leak into domain;
- tests may need DB/session;
- hard to keep framework-agnostic;
- domain lifecycle can be shaped by ORM.
8.2 Separate Domain and ORM Record
Pros:
- clean domain;
- persistence isolated;
- easier pure unit tests;
- supports multiple adapters;
- aligns with ports/adapters.
Cons:
- mapping code;
- possible duplication;
- more architecture ceremony;
- must maintain translation.
For this learning series, we prefer separation because the user’s goal is top-tier engineering judgment, not fastest CRUD.
9. Mapping Functions
ORM to domain:
def case_record_to_domain(record: CaseRecord) -> Case:
return Case(
id=CaseId(record.id),
title=record.title,
status=CaseStatus(record.status),
notes=[],
)
Domain to ORM:
def update_case_record_from_domain(record: CaseRecord, case: Case) -> None:
record.title = case.title
record.status = case.status.value
Create record:
def case_to_record(case: Case) -> CaseRecord:
return CaseRecord(
id=case.id.value,
title=case.title,
status=case.status.value,
)
Mapping is not wasted code. It is boundary control.
10. Basic Insert
with SessionLocal.begin() as session:
record = CaseRecord(
id="CASE-001",
title="Late reporting",
status="DRAFT",
)
session.add(record)
When transaction commits, SQL INSERT is persisted.
Important:
session.addmarks object pending.flushsends SQL.commitcommits transaction.- SQLAlchemy may flush before query/commit.
11. Basic Select
from sqlalchemy import select
with SessionLocal() as session:
statement = select(CaseRecord).where(CaseRecord.id == "CASE-001")
record = session.scalars(statement).one_or_none()
one_or_none() returns:
- one object;
None;- raises if multiple rows.
For primary key, also:
record = session.get(CaseRecord, "CASE-001")
session.get is convenient for primary key lookup and can use identity map.
12. Basic Update
with SessionLocal.begin() as session:
record = session.get(CaseRecord, "CASE-001")
if record is None:
raise CaseNotFoundError(CaseId("CASE-001"))
record.status = "SUBMITTED"
No explicit session.add needed if record is already persistent in session. SQLAlchemy tracks changes and flushes.
This is unit-of-work behavior.
13. Basic Delete
with SessionLocal.begin() as session:
record = session.get(CaseRecord, "CASE-001")
if record is not None:
session.delete(record)
Deletion occurs on flush/commit.
For domain systems, deletion may not be allowed. Often you use soft delete or status transition to preserve audit/history.
14. Flush vs Commit
Flush:
Send pending SQL statements to database transaction, but do not finalize transaction.
Commit:
Commit database transaction so changes become durable/visible according to isolation rules.
Example:
session.add(record)
session.flush()
print(record.id)
session.commit()
Flush can happen automatically before queries or commit.
Why care?
- DB-generated ids may be available after flush;
- constraints may fail on flush;
- you may need SQL executed before another query;
- rollback can still undo flushed changes before commit.
Rule:
Commit belongs at transaction boundary. Do not scatter commits deep inside repository methods unless that is explicitly the boundary.
15. Rollback
If exception occurs, rollback undoes transaction.
with SessionLocal() as session:
try:
...
session.commit()
except Exception:
session.rollback()
raise
Context manager pattern reduces boilerplate:
with SessionLocal.begin() as session:
...
If exception, rollback automatically.
16. Identity Map
Within a Session, SQLAlchemy tracks one object per database identity.
with SessionLocal() as session:
first = session.get(CaseRecord, "CASE-001")
second = session.get(CaseRecord, "CASE-001")
assert first is second
This can be useful, but also surprising.
Implications:
- Session is stateful.
- Long-lived Session can hold stale objects.
- Session should not be global.
- Different sessions can see different object instances/states.
- Commit/expire behavior matters.
17. Expiration and Staleness
By default, Session may expire ORM objects on commit depending configuration. Accessing attributes after commit can trigger reload if session active.
This is one reason not to return ORM objects outside session boundary.
Bad:
def get_case_record() -> CaseRecord:
with SessionLocal() as session:
return session.get(CaseRecord, "CASE-001")
Returned object may be detached or stale.
Better:
def get_case() -> Case:
with SessionLocal() as session:
record = session.get(CaseRecord, "CASE-001")
return case_record_to_domain(record)
Return domain object or DTO, not live ORM entity, unless your architecture explicitly uses ORM model as domain.
18. Repository Pattern
Repository hides persistence details behind domain-oriented methods.
Protocol:
from typing import Protocol
class CaseRepository(Protocol):
def get(self, case_id: CaseId) -> Case:
...
def list(self, status: CaseStatus | None = None) -> list[Case]:
...
def add(self, case: Case) -> None:
...
def save(self, case: Case) -> None:
...
Service depends on protocol:
class CaseService:
def __init__(self, repository: CaseRepository) -> None:
self._repository = repository
SQLAlchemy adapter implements protocol.
19. SQLAlchemy Repository
class SqlAlchemyCaseRepository:
def __init__(self, session: Session) -> None:
self._session = session
def get(self, case_id: CaseId) -> Case:
record = self._session.get(CaseRecord, case_id.value)
if record is None:
raise CaseNotFoundError(case_id)
return case_record_to_domain(record)
def add(self, case: Case) -> None:
self._session.add(case_to_record(case))
def save(self, case: Case) -> None:
record = self._session.get(CaseRecord, case.id.value)
if record is None:
raise CaseNotFoundError(case.id)
update_case_record_from_domain(record, case)
Note:
- repository does not commit;
- transaction boundary belongs to Unit of Work/service boundary;
- repository uses session injected from UoW.
20. Why Repository Should Not Commit
Bad:
class SqlAlchemyCaseRepository:
def save(self, case: Case) -> None:
...
self._session.commit()
Problem:
def transition_and_add_audit(case_id):
repository.save(case)
audit_repository.add(event)
If save commits before audit event, you can persist case transition but fail audit write.
Better:
with uow:
case = uow.cases.get(case_id)
case.transition_to(target)
uow.cases.save(case)
uow.audit_events.add(event)
uow.commit()
One transaction for all related changes.
21. Unit of Work Pattern
Unit of Work represents transaction boundary.
Protocol:
from typing import Protocol
class UnitOfWork(Protocol):
cases: CaseRepository
def __enter__(self) -> "UnitOfWork":
...
def __exit__(self, exc_type, exc, tb) -> None:
...
def commit(self) -> None:
...
def rollback(self) -> None:
...
Usage:
class CaseService:
def __init__(self, uow_factory: Callable[[], UnitOfWork]) -> None:
self._uow_factory = uow_factory
def transition_case(self, case_id: CaseId, target_status: CaseStatus) -> Case:
with self._uow_factory() as uow:
case = uow.cases.get(case_id)
case.transition_to(target_status)
uow.cases.save(case)
uow.commit()
return case
This makes transaction boundary explicit.
22. SQLAlchemy Unit of Work
class SqlAlchemyUnitOfWork:
def __init__(self, session_factory: sessionmaker[Session]) -> None:
self._session_factory = session_factory
def __enter__(self) -> "SqlAlchemyUnitOfWork":
self.session = self._session_factory()
self.cases = SqlAlchemyCaseRepository(self.session)
return self
def __exit__(self, exc_type, exc, tb) -> None:
if exc_type is not None:
self.rollback()
self.session.close()
def commit(self) -> None:
self.session.commit()
def rollback(self) -> None:
self.session.rollback()
Alternative: use session.begin() inside UoW. The exact implementation can vary. The design rule is stable:
one UoW represents one transaction boundary.
23. Commit/Rollback Discipline
Pattern:
with uow_factory() as uow:
...
uow.commit()
If exception happens before commit, __exit__ rolls back.
Potential issue: if no exception but commit forgotten, changes may not persist.
Some teams prefer UoW that commits automatically on successful exit. Others require explicit commit.
23.1 Explicit Commit Style
Pros:
- commit is visible;
- no accidental commit;
- aligns with “commit only when use case complete”.
Cons:
- forgetting commit silently rolls back/does not persist depending implementation.
23.2 Auto Commit Style
Pros:
- less boilerplate.
Cons:
- accidental writes if code exits normally;
- transaction intent less explicit.
For learning, explicit commit teaches boundary awareness.
24. Transaction Boundary in API
FastAPI dependency:
def get_uow() -> UnitOfWork:
return SqlAlchemyUnitOfWork(SessionLocal)
Route:
def transition_case_endpoint(
case_id: str,
request: TransitionCaseRequest,
service: CaseService = Depends(get_case_service),
):
case = service.transition_case(CaseId(case_id), CaseStatus(request.target_status.value))
return case_to_response(case)
Service owns UoW boundary. Route does not call commit.
Alternatively, per-request transaction dependency can wrap route, but then service transaction semantics become less explicit.
For complex domain, service-level UoW is often clearer.
25. Transaction Boundary in CLI
CLI:
def main(argv: list[str] | None = None) -> int:
...
service = CaseService(lambda: SqlAlchemyUnitOfWork(SessionLocal))
service.transition_case(...)
Same service as API.
This is the value of clean boundary: CLI and API share use cases.
26. Constraints
Database constraints protect integrity.
Example:
class CaseRecord(Base):
__tablename__ = "cases"
id: Mapped[str] = mapped_column(primary_key=True)
title: Mapped[str] = mapped_column(nullable=False)
status: Mapped[str] = mapped_column(nullable=False)
Constraints:
- primary key;
- unique;
- not null;
- foreign key;
- check constraint;
- index.
Domain validation is not enough. Database constraints protect against:
- bugs;
- concurrent writes;
- manual DB changes;
- different app versions;
- migration mistakes;
- batch imports.
27. Indexes
If frequently querying by status:
from sqlalchemy import Index
Index("ix_cases_status", CaseRecord.status)
Or in mapped column:
status: Mapped[str] = mapped_column(index=True)
Index improves read query speed but costs:
- storage;
- slower writes;
- migration complexity;
- maintenance overhead.
Add indexes based on query patterns.
28. Relationships: Notes
Relational model:
from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship
class CaseNoteRecord(Base):
__tablename__ = "case_notes"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
case_id: Mapped[str] = mapped_column(ForeignKey("cases.id"))
text: Mapped[str]
class CaseRecord(Base):
__tablename__ = "cases"
id: Mapped[str] = mapped_column(primary_key=True)
title: Mapped[str]
status: Mapped[str]
notes: Mapped[list[CaseNoteRecord]] = relationship(
cascade="all, delete-orphan",
)
Relationship loading strategy matters. Lazy loading can cause N+1.
29. N+1 Query Problem
Bad pattern:
cases = session.scalars(select(CaseRecord)).all()
for case in cases:
print(case.notes)
If notes lazy-load per case, this can issue one query for cases plus one query per case.
Fix with eager loading:
from sqlalchemy.orm import selectinload
statement = select(CaseRecord).options(selectinload(CaseRecord.notes))
cases = session.scalars(statement).all()
Choose loading strategy based on use case.
30. SQL Injection
Never interpolate user input into raw SQL.
Bad:
session.execute(text(f"SELECT * FROM cases WHERE id = '{case_id}'"))
Good:
from sqlalchemy import text
session.execute(
text("SELECT * FROM cases WHERE id = :case_id"),
{"case_id": case_id},
)
ORM expressions also parameterize values:
select(CaseRecord).where(CaseRecord.id == case_id)
SQL injection is a security issue, not style issue.
31. Raw SQL
SQLAlchemy ORM is powerful, but raw SQL can be appropriate for:
- complex reporting;
- performance-critical queries;
- database-specific features;
- migrations;
- administrative scripts;
- query tuning.
Use parameterization.
Keep raw SQL localized and tested.
Do not force every query through ORM if SQL is clearer.
32. Migrations
Base.metadata.create_all(engine) is fine for learning and tests.
Production needs migrations.
Alembic is commonly used with SQLAlchemy to:
- create migration scripts;
- upgrade/downgrade schema;
- track migration history;
- review schema changes;
- deploy incremental DB changes.
Migration mindset:
- schema changes are code;
- migrations are reviewed;
- data migrations need rollback/backup plan;
- deployment order matters;
- incompatible app/schema versions must be planned.
33. SQLite vs PostgreSQL
SQLite:
- embedded;
- zero server;
- great for local apps/tests;
- transactional;
- enough for many tools;
- limited concurrency compared to server DB;
- behavior differs from PostgreSQL.
PostgreSQL:
- server database;
- strong concurrency;
- rich SQL/features;
- production-grade multi-user workloads;
- operational overhead;
- connection pooling matters.
For case-tracker:
- SQLite is good next step after JSON.
- PostgreSQL makes sense for multi-user API production.
Test against the database you deploy for critical behavior.
34. Testing Repository
Use SQLite temp file:
def create_test_session_factory(tmp_path: Path) -> sessionmaker[Session]:
engine = create_engine(f"sqlite:///{tmp_path / 'test.db'}")
Base.metadata.create_all(engine)
return sessionmaker(bind=engine)
Test:
def test_sqlalchemy_repository_adds_and_gets_case(tmp_path) -> None:
session_factory = create_test_session_factory(tmp_path)
with session_factory.begin() as session:
repository = SqlAlchemyCaseRepository(session)
case = Case(id=CaseId("CASE-001"), title="Late reporting")
repository.add(case)
with session_factory() as session:
repository = SqlAlchemyCaseRepository(session)
loaded = repository.get(CaseId("CASE-001"))
assert loaded.id == CaseId("CASE-001")
Use temp file rather than in-memory DB if multiple connections matter.
35. In-Memory SQLite Caveat
sqlite:///:memory: database is per connection by default.
If your test creates tables on one connection and uses another, tables may disappear.
For simple tests, use temp file SQLite:
sqlite:///tmp/test.db
or configure shared connection carefully.
Do not let SQLite test behavior hide production DB issues.
36. Contract Test for Repository
Re-use contract from fake repository.
def assert_case_repository_contract(repository: CaseRepository) -> None:
case = Case(id=CaseId("CASE-001"), title="Late reporting")
repository.add(case)
loaded = repository.get(case.id)
assert loaded.id == case.id
assert loaded.title == case.title
For SQLAlchemy repository, wrap in transaction.
def test_sqlalchemy_case_repository_contract(tmp_path) -> None:
session_factory = create_test_session_factory(tmp_path)
with session_factory.begin() as session:
repository = SqlAlchemyCaseRepository(session)
assert_case_repository_contract(repository)
Contract tests ensure fake and real implementations align.
37. Testing Unit of Work
def test_uow_commits_changes(tmp_path) -> None:
session_factory = create_test_session_factory(tmp_path)
uow_factory = lambda: SqlAlchemyUnitOfWork(session_factory)
with uow_factory() as uow:
uow.cases.add(Case(id=CaseId("CASE-001"), title="Late reporting"))
uow.commit()
with session_factory() as session:
record = session.get(CaseRecord, "CASE-001")
assert record is not None
Rollback:
def test_uow_rolls_back_on_error(tmp_path) -> None:
session_factory = create_test_session_factory(tmp_path)
uow_factory = lambda: SqlAlchemyUnitOfWork(session_factory)
with pytest.raises(RuntimeError):
with uow_factory() as uow:
uow.cases.add(Case(id=CaseId("CASE-001"), title="Late reporting"))
raise RuntimeError("boom")
with session_factory() as session:
assert session.get(CaseRecord, "CASE-001") is None
38. Session Thread Safety
Do not share Session across concurrent threads/tasks.
Bad:
global_session = SessionLocal()
Then used by all requests.
Good:
def get_session():
with SessionLocal() as session:
yield session
For API, one session per request or one UoW per use case.
For background workers, one session per job/unit of work.
39. Async SQLAlchemy
SQLAlchemy supports async usage with async engine/session.
Conceptual:
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker
engine = create_async_engine("postgresql+asyncpg://...")
SessionLocal = async_sessionmaker(engine)
Async repository methods become async def.
Important:
- do not use sync Session in async route if it blocks event loop heavily;
- do not mix async/sync APIs casually;
- async DB driver required;
- transaction boundary still matters;
- Session is still stateful and must not be shared concurrently.
For current learning step, sync SQLAlchemy is simpler. Learn transaction semantics first.
40. Case Tracker SQL Schema
Tables:
cases
id text primary key
title text not null
status text not null
created_at timestamp not null
updated_at timestamp not null
case_notes
id integer primary key
case_id text foreign key -> cases.id
text text not null
created_at timestamp not null
audit_events
id text primary key
case_id text not null
event_type text not null
occurred_at timestamp not null
details_json text not null
This separates:
- current state;
- notes;
- audit trail.
Do not store audit trail only in logs.
41. Case Tracker Transition Transaction
Use case:
def transition_case(self, case_id: CaseId, target_status: CaseStatus, actor: Actor) -> Case:
with self._uow_factory() as uow:
case = uow.cases.get(case_id)
from_status = case.status
case.transition_to(target_status)
event = AuditEvent.case_transitioned(
case_id=case.id,
actor=actor,
from_status=from_status,
to_status=target_status,
)
uow.cases.save(case)
uow.audit_events.add(event)
uow.commit()
return case
Important:
- case update and audit event commit together;
- invalid transition raises before save;
- failure before commit rolls back all;
- transaction boundary is explicit.
42. Case Tracker SQLAlchemy Mapping Sketch
class CaseRecord(Base):
__tablename__ = "cases"
id: Mapped[str] = mapped_column(primary_key=True)
title: Mapped[str] = mapped_column(nullable=False)
status: Mapped[str] = mapped_column(nullable=False)
created_at: Mapped[datetime] = mapped_column(nullable=False)
updated_at: Mapped[datetime] = mapped_column(nullable=False)
class CaseNoteRecord(Base):
__tablename__ = "case_notes"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
case_id: Mapped[str] = mapped_column(ForeignKey("cases.id"), nullable=False)
text: Mapped[str] = mapped_column(nullable=False)
created_at: Mapped[datetime] = mapped_column(nullable=False)
Keep mapping explicit.
43. Case Tracker Repository Query Methods
def list(self, status: CaseStatus | None = None, *, limit: int = 100, offset: int = 0) -> list[Case]:
statement = select(CaseRecord).order_by(CaseRecord.id).limit(limit).offset(offset)
if status is not None:
statement = statement.where(CaseRecord.status == status.value)
records = self._session.scalars(statement).all()
return [case_record_to_domain(record) for record in records]
Pagination should happen in database, not after loading all rows.
44. Count Query
from sqlalchemy import func, select
def count(self, status: CaseStatus | None = None) -> int:
statement = select(func.count()).select_from(CaseRecord)
if status is not None:
statement = statement.where(CaseRecord.status == status.value)
return self._session.scalar(statement) or 0
For API list response:
total = repository.count(status)
cases = repository.list(status, limit=limit, offset=offset)
Avoid loading all cases just to count.
45. Data Access Smell Checklist
Watch for:
- Global Session.
- Commit inside repository method.
- Domain imports SQLAlchemy.
- Route handler directly uses Session for business logic.
- ORM object returned outside session boundary unintentionally.
- No transaction around multi-write use case.
- Audit event saved separately from state change.
- No DB constraints.
- No indexes for frequent queries.
- N+1 queries.
- Raw SQL with string interpolation.
create_allused as production migration.- SQLite tests only while production is PostgreSQL with different behavior.
- Async route using blocking sync DB heavily.
- Session shared across threads/tasks.
46. Practice: Build SQLAlchemy Models
Create:
src/case_tracker_sqlalchemy/
models.py
repository.py
unit_of_work.py
Implement:
Base;CaseRecord;CaseNoteRecord;create_engine;sessionmaker.
Use SQLite first.
47. Practice: Repository Implementation
Implement:
get;list;add;save;count.
Rules:
- no commit in repository;
- map ORM to domain;
- raise
CaseNotFoundErroron missingget; - use
select; - support pagination.
48. Practice: Unit of Work
Implement SqlAlchemyUnitOfWork.
Test:
- commit persists;
- exception rolls back;
- missing commit does not persist if explicit commit style;
- repository available inside UoW.
49. Practice: API Uses SQL UoW
Wire FastAPI dependency:
def get_uow() -> UnitOfWork:
return SqlAlchemyUnitOfWork(SessionLocal)
Service uses UoW factory.
Test with SQLite temp file or fake UoW.
50. Practice: N+1 Detection
Create cases with notes. Write list endpoint. Log SQL statements with echo=True or test query count conceptually.
Then use selectinload or repository query that fetches notes efficiently.
Explain before/after.
51. Self-Check
Jawab tanpa melihat materi:
- Apa perbedaan Engine dan Session?
- Kenapa Engine long-lived?
- Kenapa Session short-lived?
- Apa itu transaction boundary?
- Apa beda flush dan commit?
- Apa itu rollback?
- Apa itu identity map?
- Kenapa ORM object tidak sebaiknya dikembalikan keluar session sembarangan?
- Apa itu repository pattern?
- Kenapa repository tidak commit?
- Apa itu Unit of Work pattern?
- Kenapa case transition dan audit event harus satu transaksi?
- Apa fungsi DB constraints?
- Kapan perlu index?
- Apa itu N+1 query?
- Bagaimana mencegah SQL injection?
- Kenapa migration penting?
- Apa caveat SQLite in-memory test?
- Kenapa Session tidak boleh shared antar threads/tasks?
- Kapan async SQLAlchemy layak?
52. Definition of Done Part 027
Kamu selesai part ini jika bisa:
- Menjelaskan Engine, Connection, Session.
- Membuat SQLAlchemy mapped class 2.x style.
- Membuat Session factory.
- Menulis insert/select/update/delete sederhana.
- Menjelaskan flush/commit/rollback.
- Menjelaskan identity map.
- Memisahkan domain model dan ORM record.
- Menulis mapping domain-record.
- Menulis repository tanpa commit.
- Menulis Unit of Work.
- Menulis transaction use case.
- Menambahkan constraints/index.
- Menjelaskan N+1 dan eager loading.
- Menulis repository tests dengan SQLite.
- Menjelaskan migration/Alembic secara konseptual.
53. Ringkasan
Data access adalah boundary reliability.
Inti part ini:
- database menyelesaikan masalah transaksi, concurrency, query, constraint, dan durability yang JSON file tidak desain untuk;
- SQLAlchemy Engine adalah konektivitas/pool yang long-lived;
- Session adalah stateful ORM workspace dan transaction boundary praktis;
- Session tidak boleh global atau shared antar concurrent tasks;
- ORM mapping harus dipisahkan dari domain jika ingin architecture bersih;
- repository menyembunyikan persistence detail;
- repository tidak sebaiknya commit;
- Unit of Work mengontrol transaction boundary;
- flush mengirim SQL, commit menyelesaikan transaksi;
- identity map berguna tetapi membuat Session stateful;
- constraints dan indexes adalah bagian dari domain integrity di level database;
- N+1 query harus diwaspadai;
- raw SQL harus parameterized;
- migrations adalah bagian dari delivery, bukan opsional untuk production;
- API/CLI bisa berbagi service jika data access dibungkus port/adapters.
Part berikutnya akan membahas architecture in Python: layering, ports, adapters, domain modelling, use cases, and dependency direction.
54. Referensi
- SQLAlchemy Documentation — ORM Quick Start.
- SQLAlchemy Documentation — Session Basics.
- SQLAlchemy Documentation — Transactions and Connection Management.
- SQLAlchemy Documentation — Data Manipulation with the ORM.
- SQLAlchemy Documentation — ORM Querying Guide.
- Alembic Documentation — Database Migrations.
You just completed lesson 27 in deepen practice. Use the series map if you want to review the broader track, or continue directly into the next lesson while the context is still warm.
Keep the momentum while the lesson is still fresh. Move backward for review or continue forward into the next concept.