Code Generation
==============================================================================


Overview
------------------------------------------------------------------------------
This project does **not** hand-write the 400+ AWS service enum values or their corresponding typed client properties. Instead, a code-generation script crawls the official boto3 documentation, collects every service name and client ID into a spec file, and then renders Python source files from Jinja2 templates. Whenever AWS releases new services, re-running the script brings everything up to date automatically.


How It Works
------------------------------------------------------------------------------
The entire pipeline lives in ``scripts/codegen/`` and consists of two steps:

**Step 1: Crawl** (``step1_crawl_spec_file_data``):

1. Fetches the boto3 services index page.
2. Parses the sidebar to extract every service's display name, href, documentation URL, and client ID (the string you pass to ``boto3.client(...)``).
3. Writes the result to ``scripts/codegen/spec-file.json``.

HTTP responses are cached on disk (in ``scripts/codegen/.cache/``, git-ignored) with a 24-hour expiry, so repeated runs during development don't hammer the boto3 docs site.

**Step 2: Generate** (``step2_generate_code``):

1. Reads ``spec-file.json``.
2. Renders two Jinja2 templates against the service list:

   - ``services.jinja2`` → ``boto_session_manager/services.py``: the ``AwsServiceEnum`` class with one class attribute per service (e.g. ``AccessAnalyzer = "accessanalyzer"``).
   - ``clients.jinja2`` → ``boto_session_manager/clients.py``: the ``ClientMixin`` class with one ``@property`` per service (e.g. ``bsm.accessanalyzer_client``) that returns a typed boto3 client, leveraging ``mypy-boto3-*`` stubs for IDE autocomplete.

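
The two steps above can be sketched end to end in a few lines. This is only an illustration, not the project's actual script: it swaps in the standard library's ``html.parser`` and ``string.Template`` for the HTML parser and Jinja2 templates the real pipeline uses, and the sidebar markup, the ``crawl`` and ``generate`` helpers, and the dictionary keys are all assumptions made for the example.

.. code-block:: python

    from html.parser import HTMLParser
    from string import Template

    # Hypothetical sidebar markup; the real boto3 index page may differ.
    SAMPLE_SIDEBAR = """
    <ul>
      <li><a href="services/accessanalyzer.html">AccessAnalyzer</a></li>
      <li><a href="services/ebs.html">EBS</a></li>
    </ul>
    """


    class SidebarParser(HTMLParser):
        """Collect (display name, href) pairs from every <a> tag."""

        def __init__(self):
            super().__init__()
            self.links = []
            self._href = None

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                self._href = dict(attrs).get("href")

        def handle_data(self, data):
            # The first non-blank text after an <a> tag is its display name.
            if self._href and data.strip():
                self.links.append((data.strip(), self._href))
                self._href = None


    def crawl(html):
        """Step 1 (sketch): turn sidebar HTML into a list of service records."""
        parser = SidebarParser()
        parser.feed(html)
        services = []
        for name, href in parser.links:
            href_name = href.rsplit("/", 1)[-1]  # e.g. "ebs.html"
            # Derive the client ID by stripping the ".html" suffix.
            if href_name.endswith(".html"):
                service_id = href_name[: -len(".html")]
            else:
                service_id = href_name
            services.append(
                {"name": name, "href_name": href_name, "service_id": service_id}
            )
        return services


    # Stand-in for services.jinja2: one class attribute per service.
    ENUM_LINE = Template('    $name = "$service_id"')


    def generate(services):
        """Step 2 (sketch): render Python source from the service records."""
        lines = ["class AwsServiceEnum:"]
        lines += [ENUM_LINE.substitute(service) for service in services]
        return "\n".join(lines) + "\n"


    print(generate(crawl(SAMPLE_SIDEBAR)))

Keeping the crawl output as plain records mirrors the role of the spec file: the rendering step only depends on that intermediate data, so a template can be changed and re-rendered without touching the parsing logic.
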
Both steps run in sequence when you execute::

    python scripts/codegen/RUN_crawl_and_generate.py


File Layout
------------------------------------------------------------------------------
::

    scripts/codegen/
    ├── RUN_crawl_and_generate.py   # entry point: run this to regenerate
    ├── services.jinja2             # template for AwsServiceEnum
    ├── clients.jinja2              # template for ClientMixin
    ├── spec-file.json              # crawled service metadata (checked in)
    ├── .cache/                     # HTTP response cache (git-ignored)
    └── .gitignore                  # ignores .cache/


Key Data Model
------------------------------------------------------------------------------
Each AWS service is represented by the ``AWSService`` dataclass with these fields:

- ``name``: sidebar display text (e.g. ``"EBS"``), used as the enum attribute name.
- ``href_name``: filename from the doc URL (e.g. ``"ebs.html"``).
- ``doc_url``: full boto3 documentation URL.
- ``service_id``: the string passed to ``boto3.client(service_id)`` (e.g. ``"ebs"``), derived from ``href_name`` by stripping the ``.html`` suffix.

Derived properties convert the name to ``snake_case`` (for Python identifiers) and ``CamelCase`` as needed by the templates.


When to Re-run
------------------------------------------------------------------------------
Re-run the code generation script when:

- **AWS launches new services**: the new service will appear on the boto3 docs index page.
- **A service is renamed or removed**: the spec file and generated code will update accordingly.
- **You modify a Jinja2 template**: e.g. to change the generated property signature or add new functionality.

After re-running, review the diff in ``services.py`` and ``clients.py`` to confirm the changes look correct before committing.

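
When reviewing a re-run, it can help to summarize which services were added or removed between two versions of the spec file. The sketch below models each entry with a dataclass that mirrors the four ``AWSService`` fields listed under Key Data Model. The JSON layout (a list of objects keyed by those field names), the ``load_services`` and ``diff_services`` helpers, the placeholder URLs, and the ``NewService`` entry are assumptions for illustration, not the project's actual format.

.. code-block:: python

    import json
    from dataclasses import dataclass


    @dataclass(frozen=True)
    class AWSService:
        """Mirror of the four fields described in the data model."""

        name: str
        href_name: str
        doc_url: str
        service_id: str


    def load_services(spec_text):
        """Parse spec JSON (assumed: a list of objects) into {service_id: AWSService}."""
        return {
            item["service_id"]: AWSService(**item) for item in json.loads(spec_text)
        }


    def diff_services(old, new):
        """Return (added, removed) service IDs, each sorted alphabetically."""
        added = sorted(new.keys() - old.keys())
        removed = sorted(old.keys() - new.keys())
        return added, removed


    def make_entry(name):
        """Build a sample spec entry; the doc_url is a placeholder, not a real URL."""
        href_name = name.lower() + ".html"
        return {
            "name": name,
            "href_name": href_name,
            "doc_url": "https://example.com/" + href_name,
            "service_id": name.lower(),
        }


    # Two hypothetical spec snapshots: before and after a re-run.
    old_spec = json.dumps([make_entry("EBS"), make_entry("AccessAnalyzer")])
    new_spec = json.dumps([make_entry("EBS"), make_entry("NewService")])

    added, removed = diff_services(load_services(old_spec), load_services(new_spec))
    print("added:", added)
    print("removed:", removed)

A renamed service shows up in such a summary as one removal plus one addition, which is a useful prompt to check the generated ``clients.py`` for a dropped property.
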

Dependencies
------------------------------------------------------------------------------
The code generation script requires several packages beyond the project's runtime dependencies:

- ``jinja2``: template rendering
- ``httpx``: HTTP client for fetching boto3 docs
- ``selectolax``: fast HTML parsing
- ``diskcache``: disk-based HTTP response caching

These are development-only dependencies and are not required at runtime.
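
Before running the script, a quick check that these packages are importable can save a failed run. The following is a minimal sketch using only the standard library; the ``missing_packages`` helper is hypothetical and not part of the project.

.. code-block:: python

    from importlib.util import find_spec

    # Import names of the development-only packages listed above.
    DEV_PACKAGES = ["jinja2", "httpx", "selectolax", "diskcache"]


    def missing_packages(names):
        """Return the names that cannot be found on the current import path."""
        return [name for name in names if find_spec(name) is None]


    missing = missing_packages(DEV_PACKAGES)
    if missing:
        print("Missing code-generation dependencies: " + ", ".join(missing))
    else:
        print("All code-generation dependencies are installed.")
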