Code Generation

Overview

This project does not hand-write the 400+ AWS service enum values or their corresponding typed client properties. Instead, a code-generation script crawls the official boto3 documentation, collects every service name and client ID into a spec file, and then renders Python source files from Jinja2 templates.

Whenever AWS releases new services, re-running the script refreshes the spec file and regenerates both source files, so no manual edits are needed.

How It Works

The entire pipeline lives in scripts/codegen/ and consists of two steps:

Step 1 — Crawl (step1_crawl_spec_file_data):

- Fetches the boto3 services index page.
- Parses the sidebar to extract every service’s display name, href, documentation URL, and client ID (the string you pass to boto3.client(...)).
- Writes the result to scripts/codegen/spec-file.json.
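
The crawl step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses the standard library's html.parser in place of selectolax so that it runs without extra packages, and the sample HTML, the BASE_URL placeholder, and the names SidebarLinkParser and extract_services are all assumptions made for the example.

```python
import json
from html.parser import HTMLParser

# Hypothetical stand-in for the fetched index page; the real markup may differ.
SAMPLE_INDEX_HTML = """
<nav class="sidebar">
  <ul>
    <li><a href="accessanalyzer.html">AccessAnalyzer</a></li>
    <li><a href="ebs.html">EBS</a></li>
    <li><a href="../index.html">Back to index</a></li>
  </ul>
</nav>
"""

# Placeholder base URL (assumption), not the real boto3 docs address.
BASE_URL = "https://example.com/reference/services/"


class SidebarLinkParser(HTMLParser):
    """Collect (link text, href) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self._href = None
        self._text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_services(html, base_url=BASE_URL):
    """Turn sidebar links into records with the four fields described above."""
    parser = SidebarLinkParser()
    parser.feed(html)
    services = []
    for name, href in parser.links:
        # Keep only sibling pages such as "ebs.html"; skip navigation links.
        if "/" in href or not href.endswith(".html"):
            continue
        services.append(
            {
                "name": name,
                "href_name": href,
                "doc_url": base_url + href,
                "service_id": href[: -len(".html")],
            }
        )
    return services


services = extract_services(SAMPLE_INDEX_HTML)
# The real script writes this text to scripts/codegen/spec-file.json.
spec_json = json.dumps(services, indent=4)
```

Keeping the parsing in a pure function that takes HTML text and returns plain dictionaries makes it easy to test against a saved copy of the page without touching the network.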

HTTP responses are cached on disk (in scripts/codegen/.cache/, git-ignored) with a 24-hour expiry, so repeated runs during development don’t hammer the boto3 docs site.
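
The idea behind the 24-hour cache can be illustrated with a small standard-library sketch. The project itself lists diskcache for this purpose; the DiskTTLCache class, its get_or_fetch method, and the fake fetch function below are hypothetical and exist only to show the expiry logic.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


class DiskTTLCache:
    """Store response bodies as JSON files and reuse them until they expire."""

    def __init__(self, directory, ttl_seconds=24 * 3600):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.ttl_seconds = ttl_seconds

    def _path(self, url):
        # Hash the URL so it is always a safe file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.directory / (digest + ".json")

    def get_or_fetch(self, url, fetch, now=None):
        now = time.time() if now is None else now
        path = self._path(url)
        if path.exists():
            entry = json.loads(path.read_text(encoding="utf-8"))
            if now - entry["stored_at"] < self.ttl_seconds:
                return entry["body"]  # still fresh: no network call
        body = fetch(url)
        path.write_text(
            json.dumps({"stored_at": now, "body": body}), encoding="utf-8"
        )
        return body


calls = []


def fake_fetch(url):
    # Stand-in for a real HTTP request; records how often it is called.
    calls.append(url)
    return "<html>" + url + "</html>"


cache = DiskTTLCache(tempfile.mkdtemp())
url = "https://example.com/index.html"
first = cache.get_or_fetch(url, fake_fetch, now=0)            # miss: fetches
second = cache.get_or_fetch(url, fake_fetch, now=3600)        # hit: one hour old
third = cache.get_or_fetch(url, fake_fetch, now=25 * 3600)    # expired: fetches again
```

Passing the clock value in as a parameter is a testing convenience in this sketch; it lets the expiry behaviour be checked without waiting a day.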

Step 2 — Generate (step2_generate_code):

- Reads spec-file.json.
- Renders two Jinja2 templates against the service list:
  - services.jinja2 → boto_session_manager/services.py: the AwsServiceEnum class with one class attribute per service (e.g. AccessAnalyzer = "accessanalyzer").
  - clients.jinja2 → boto_session_manager/clients.py: the ClientMixin class with one @property per service (e.g. bsm.accessanalyzer_client) that returns a typed boto3 client, leveraging mypy-boto3-* stubs for IDE autocomplete.
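
To make the two outputs concrete, here is a hand-written approximation of what the generated code might look like for two services. It is a sketch under assumptions, not a copy of the real files: the get_client method, the DemoSessionManager host class, and FakeSession are invented for illustration, and the type annotations that the real properties take from the mypy-boto3-* stubs are omitted so the example runs without boto3 installed.

```python
class AwsServiceEnum:
    # One class attribute per service; the value is the boto3 client ID.
    AccessAnalyzer = "accessanalyzer"
    EBS = "ebs"


class ClientMixin:
    # The host class is assumed to provide get_client(service_id).

    @property
    def accessanalyzer_client(self):
        return self.get_client(AwsServiceEnum.AccessAnalyzer)

    @property
    def ebs_client(self):
        return self.get_client(AwsServiceEnum.EBS)


class FakeSession:
    """Stand-in for a boto3 session so the example needs no AWS setup."""

    def client(self, service_name):
        return "<client for " + service_name + ">"


class DemoSessionManager(ClientMixin):
    """Hypothetical host class that mixes in the generated properties."""

    def __init__(self, session):
        self._session = session

    def get_client(self, service_id):
        return self._session.client(service_id)


bsm = DemoSessionManager(FakeSession())
client = bsm.accessanalyzer_client
```

Because every property follows the same pattern, a template only needs to loop over the service list and substitute the attribute name and client ID.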

Both steps run in sequence when you execute:

python scripts/codegen/RUN_crawl_and_generate.py

File Layout

scripts/codegen/
├── RUN_crawl_and_generate.py   # entry point — run this to regenerate
├── services.jinja2             # template for AwsServiceEnum
├── clients.jinja2              # template for ClientMixin
├── spec-file.json              # crawled service metadata (checked in)
├── .cache/                     # HTTP response cache (git-ignored)
└── .gitignore                  # ignores .cache/

Key Data Model

Each AWS service is represented by the AWSService dataclass with these fields:

- name: sidebar display text (e.g. "EBS"), used as the enum attribute name.
- href_name: filename from the doc URL (e.g. "ebs.html").
- doc_url: full boto3 documentation URL.
- service_id: the string passed to boto3.client(service_id) (e.g. "ebs"), derived from href_name by stripping the .html suffix.

Derived properties convert the name to snake_case (for Python identifiers) and CamelCase as needed by the templates.
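
A minimal sketch of such a dataclass is shown below. The four field names come from the list above; the from_sidebar_link constructor, the property names name_snake_case and name_camel_case, the conversion rules, and the placeholder base URL are assumptions for illustration and may not match the project's actual implementation.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AWSService:
    name: str
    href_name: str
    doc_url: str
    service_id: str

    @classmethod
    def from_sidebar_link(cls, name, href_name,
                          base_url="https://example.com/reference/services/"):
        # service_id is href_name without its ".html" suffix.
        service_id = href_name
        if service_id.endswith(".html"):
            service_id = service_id[: -len(".html")]
        return cls(
            name=name,
            href_name=href_name,
            doc_url=base_url + href_name,
            service_id=service_id,
        )

    @property
    def name_snake_case(self):
        # Insert "_" wherever a lowercase letter or digit meets an uppercase
        # letter, then lowercase: "AccessAnalyzer" becomes "access_analyzer".
        return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", self.name).lower()

    @property
    def name_camel_case(self):
        # Rebuild CamelCase from the snake_case form.
        return "".join(
            part.capitalize() for part in self.name_snake_case.split("_")
        )


ebs = AWSService.from_sidebar_link("EBS", "ebs.html")
access_analyzer = AWSService.from_sidebar_link(
    "AccessAnalyzer", "accessanalyzer.html"
)
```

Making the dataclass frozen keeps each record immutable once it has been built from the crawled data, which suits values that are only read by the templates.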

When to Re-run

Re-run the code generation script when:

- AWS launches new services. Each new service appears on the boto3 docs index page.
- A service is renamed or removed. The spec file and generated code will update accordingly.
- You modify a Jinja2 template, e.g. to change the generated property signature or add new functionality.

After re-running, review the diff in services.py and clients.py to confirm the changes look correct before committing.
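
One way to make that review quicker is to compare the service IDs in the old and new spec files before reading the full diff. The helper below is a hypothetical sketch: it assumes spec-file.json is a JSON list of objects that each carry a "service_id" key, which may not match the real file layout, and the "newservice" entry is made up.

```python
import json


def diff_service_ids(old_spec_text, new_spec_text):
    """Return (added, removed) service IDs between two spec file contents."""
    old_ids = {item["service_id"] for item in json.loads(old_spec_text)}
    new_ids = {item["service_id"] for item in json.loads(new_spec_text)}
    return sorted(new_ids - old_ids), sorted(old_ids - new_ids)


# Made-up spec contents for illustration only.
old_spec = json.dumps([{"service_id": "accessanalyzer"}, {"service_id": "ebs"}])
new_spec = json.dumps([{"service_id": "ebs"}, {"service_id": "newservice"}])

added, removed = diff_service_ids(old_spec, new_spec)
```

A rename shows up here as one removal plus one addition, which is a useful prompt to check whether any calling code still refers to the old property name.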

Dependencies

The code generation script requires four packages beyond the project’s runtime dependencies:

- jinja2: template rendering
- httpx: HTTP client for fetching boto3 docs
- selectolax: fast HTML parsing
- diskcache: disk-based HTTP response caching

These are development-only dependencies and are not required at runtime.