There isn't an overall theme to this "blog". Each post can vary from the next based on what projects I'm working on, what I'm reading, listening to, or learning. This will be a vehicle to become a better writer and communicator simply by putting pen to paper so to speak.

Almost all of my written communication in the latter part of my life has been either text messages to friends or succint messages at work. Neither of these lend itself to long-form writing (not that I was ever good at that).

The projects I'll be working on and writing about will generally fall into one of two categories: solving a problem that I have or "I wonder if I could ___". I will attempt to convey my thought process around any project I have, not only for others but for myself when I inevitably think "why did I do it like this?"

Digitizing a Family Cookbook

cookbook
source code

I've been in posession of my grandma's cookbook for a few years with the intention of digitizing the cookbook. There are many recipes taken from magazines, back of packaging, and others where it's in a nice print that can easily be picked up from any OCR engine. The issue arises when reading in cursive recipes. I have created an OCR engine previously when red-tape would have taken months for a python library to be approved. This was a tedious process that required using open-cv and manually labeling each character that was scanned in order to train the model, and this was on printed text. Cursive text would be much harder because open cv relies on boundary boxes based on edges that do not exist as singular characters in cursive. This is where advancements in Model-as-a-service make it much easier (and less stressful).

Workflow

After taking a picture for each recipe to be digitized and placing it in an images folder, we would send the image to OpenAI with a prompt on what we're looking for. OpenAI would return the recipe and include any tags that will be used for searching on the site. The response form OpenAI is converted into a markdown file, and after all recipes have been generated we would generate html files by using Zensical, creators of the Material theme for MkDocs. The cookbook is pushed to Github pages, but any static site will do.

flowchart TD
    A[📁 images/ directory] --> B[find_images\nscans for .jpg .png .heic etc.]
    B --> C{skip_existing?\ncheck docs/recipes/}
    C -- already exists --> D[⏭️ Skip image]
    C -- not found --> E[encode_image\nbase64 encode]

    E --> F[OpenAI GPT-4o Vision API]

    subgraph prompt [Prompt]
        P1[SYSTEM_PROMPT\ncategory tag taxonomy\nJSON schema]
        P2[USER_PROMPT\n+ base64 image]
    end

    F <--> prompt

    F --> G[JSON response\ntitle, tags, ingredients\ninstructions, timings...]
    G --> H[recipe_to_markdown\nbuild YAML frontmatter\n+ Markdown body]
    H --> I[📄 docs/recipes/slug.md\n---\ntitle: ...\ntags: chicken, baking...\n---]

    I --> J[build_index.py\nreads all .md frontmatter]
    J --> K[docs/recipes/index.md\ncategorized recipe list]
    J --> L[docs/tags.md\nalphabetical tag listing]

    K --> M[zensical build\nMkDocs static site generator]
    L --> M
    N[docs/index.md\nhomepage] --> M
    O[mkdocs.yml\nsite config + nav] --> M

    M --> P[site/\nstatic HTML/CSS/JS]
    P --> Q[🚀 GitHub Pages\nvia Actions deploy.yml]

Future Enhancements

I'd like others and myself to be able to add recipes to this easily, but haven't decided on the best path. Some things to consider is ease of use for others to add to the cookbook, adding it securely, avoid spamming of requests to add. Perhaps I can setup an email address w/ some rules around who sent the email and it must include specific wording in the subject.

What I'm listening to week of June 2nd, 2025

The Real Python Podcast ep 252 - Python Training, itertools, and Idioms

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo.

Listen on: SpotifySpotify  Apple Podcasts  YouTube  Real Python Website

Migrating from Docusaurus to MkDocs

Recently, I migrated my personal site from Docusaurus to MkDocs. This decision was driven by several factors, but a significant one was that Docusaurus is owned and maintained by Meta. While Docusaurus is a powerful and flexible documentation framework, I wanted to move to a platform that is more community-driven and open, aligning better with my values.

MkDocs, especially with the Material theme, offers a clean, fast, and highly customizable experience for static site generation. The migration process involved restructuring my documentation, updating configuration files, and moving content from MDX to standard Markdown. I also appreciated MkDocs' straightforward integration with Python tooling and its active, independent community.

If you're considering a similar move, the process is very manageable and the benefits—especially around transparency and community ownership—are well worth it.

Learning curve

The learning curve for mkdocs is pretty shallow, especially with moving from docusaurus. Both build static sites off of markdown files, so the bulk of documentation doesn't need to change. There are some intricasies with updating the mkdocs.yml file that I need to dive in some more on, but I have zero complains for creating a basic site and publishing quickly.

Deploying website

For deployment, I use GitHub Actions to automate the build and deployment process. Whenever I push changes to the main branch, a GitHub Actions workflow is triggered that builds the MkDocs site and syncs the generated static files to an Amazon S3 bucket. The site is then served through Amazon CloudFront, which provides a fast and secure CDN for global access.

This setup allows me to focus on writing content and making improvements, while the deployment and hosting are handled automatically in the background. Using S3 and CloudFront together ensures that the site is highly available, scalable, and benefits from low-latency delivery to visitors around the world.

It's important to note that while you can host a static site directly from S3, that is limited to http connections. I'd prefer to let anyone who stumbles upon this site the comfort in knowing that it's using a secure connection.

flowchart TD
    J[Writer Creates Content] --> K[Git Commit Attempt]
    K --> L[Pre-commit Hook Runs]
    L --> M[mkdocs build]
    M --> N{Build Successful?}
    N -->|Yes| A[Push to Main Branch]
    N -->|No| O[Fix Build Errors]
    O --> K

    A --> B[GitHub Actions Triggered]
    B --> C[Build MkDocs Site]
    C --> D[Generate Static Files]
    D --> E[Sync to Amazon S3 Bucket]
    E --> F[Amazon CloudFront CDN]
    F --> P[Invalidate CloudFront Cache]
    G--> S[Verify Site Accessibility]
    P --> G[Global Content Delivery]
    G --> H[HTTPS Secure Connection]
    H --> I[Users Access Site]

    style A fill:#e1f5fe,color:#000
    style B fill:#f3e5f5,color:#000
    style C fill:#f3e5f5,color:#000
    style D fill:#f3e5f5,color:#000
    style E fill:#fff3e0,color:#000
    style F fill:#fff3e0,color:#000
    style P fill:#fff3e0,color:#000
    style S fill:#e8f5e8,color:#000
    style G fill:#e8f5e8,color:#000
    style H fill:#e8f5e8,color:#000
    style I fill:#e8f5e8,color:#000
    style J fill:#e1f5fe,color:#000
    style K fill:#ffecb3,color:#000
    style L fill:#ffecb3,color:#000
    style M fill:#ffecb3,color:#000
    style N fill:#ffecb3,color:#000
    style O fill:#ffcdd2,color:#000

    classDef automation fill:#f3e5f5,stroke:#9c27b0,color:#000
    classDef aws fill:#fff3e0,stroke:#ff9800,color:#000
    classDef seo fill:#e3f2fd,stroke:#1976d2,color:#000
    classDef delivery fill:#e8f5e8,stroke:#4caf50,color:#000
    classDef user fill:#e1f5fe,stroke:#2196f3,color:#000
    classDef precommit fill:#ffecb3,stroke:#f57c00,color:#000
    classDef error fill:#ffcdd2,stroke:#d32f2f,color:#000

    class B,C,D automation
    class E,F,P aws
    class Q,R seo
    class S,G,H,I delivery
    class A,J user
    class K,L,M,N precommit
    class O error