Automated Daily Briefing Generator Using Jupyter and Commodity APIs
A notebook-first, production-ready guide: fetch commodity blurbs, enrich them with open interest and cash prices, then render a PDF and post it to Slack.
If you manage data pipelines for a trading desk, you know the pain: juggling slow APIs, inconsistent CSVs, manual write-ups, and last-minute PDF edits. This notebook-first approach shows how to automate a daily commodity briefing that fetches market blurbs, enriches them with indicators (open interest, cash price), renders a polished PDF, and posts the result to Slack, reliably and repeatably.
Why build a Jupyter-based briefing in 2026?
In 2026 the market for commodity data integration has matured: normalized REST/GraphQL endpoints, cloud-hosted parquet downloads, and lower-latency feeds (Arrow Flight and streaming APIs) are common. Trading teams want:
- Machine-readable, auditable briefs that can be archived and surfaced to ML models
- Reproducible notebooks that codify business logic and can be parameterized for different desks
- Cloud-native automation so briefings run on schedule with observability and secure secrets
What you’ll get from this guide
- A Jupyter notebook pattern to fetch commodity blurbs and numeric indicators
- Data-enrichment examples: open interest, cash price, percent moves
- Template-driven PDF rendering and Slack delivery
- Operational best practices: retries, caching, scheduling, provenance
Architecture overview (practical)
High-level pipeline:
- Notebook orchestrates API calls to a commodity blurb API and numeric endpoints (open interest, cash price)
- Enrich raw blurbs with computed indicators and short automated commentary
- Render an HTML template and convert to PDF (WeasyPrint or wkhtmltopdf)
- Upload PDF to S3 (or internal file store) and post a notification with a short summary and PDF link to Slack
Prerequisites
- Python 3.10+ and JupyterLab (or Jupyter Notebook)
- Libraries: requests, pandas, jinja2, matplotlib, weasyprint (or pdfkit + wkhtmltopdf), slack_sdk, boto3; papermill for parameterized runs
- API keys for your commodity data provider(s) — set them as environment variables or use Secrets Manager
- Optional: AWS credentials for S3 uploads, CI runner (GitHub Actions, Airflow, or Prefect) for scheduling
Step 1 — Notebook layout and parameters
In the notebook, keep a top cell with parameters so it can be run manually or programmatically with papermill or nbconvert.
# parameters cell (for papermill)
import os

date = '2026-01-18'
commodities = ['corn', 'soybeans', 'wheat', 'cotton']
DATA_API_KEY = os.getenv('COMMODITY_API_KEY')
SLACK_WEBHOOK = os.getenv('SLACK_WEBHOOK')
OUTPUT_BUCKET = 'trading-briefings'
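When the notebook is driven programmatically, papermill injects fresh values for these parameters at run time (it looks for the cell tagged "parameters"). A minimal sketch of such a run via papermill's Python API, with hypothetical paths:

import papermill as pm

# a minimal sketch: execute the briefing notebook with injected parameters
# (paths below are hypothetical; parameter names must match the cell above)
pm.execute_notebook(
    'notebooks/daily_briefing.ipynb',
    'notebooks/out/daily_briefing_2026-01-18.ipynb',   # executed copy kept for provenance
    parameters={
        'date': '2026-01-18',
        'commodities': ['corn', 'soybeans', 'wheat', 'cotton'],
        'OUTPUT_BUCKET': 'trading-briefings',
    },
)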
Step 2 — Fetch blurbs and numeric indicators
Use small, resilient wrappers around your HTTP calls. Add retries, backoff, and caching to avoid hitting rate limits.
import os
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
session = requests.Session()
retries = Retry(total=3, backoff_factor=1, status_forcelist=[429, 500, 502, 503, 504])
session.mount('https://', HTTPAdapter(max_retries=retries))
BASE = 'https://api.example-commodity.com/v1' # replace with your provider
def get_blurb(commodity, date):
    url = f"{BASE}/market/blurbs/{commodity}?date={date}"
    r = session.get(url, headers={'Authorization': f'Bearer {DATA_API_KEY}'}, timeout=10)
    r.raise_for_status()
    return r.json()

def get_indicator(commodity, indicator, date):
    url = f"{BASE}/indicators/{commodity}/{indicator}?date={date}"
    r = session.get(url, headers={'Authorization': f'Bearer {DATA_API_KEY}'}, timeout=10)
    r.raise_for_status()
    return r.json()
Example JSON returned by get_indicator might include fields: value, unit, prev_value, source.
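The retry wrapper covers transient failures, but the caching mentioned above is not shown. For a daily batch run, an in-process memo keyed on (commodity, indicator, date) is often enough; a minimal sketch, assuming the get_indicator helper defined above:

from functools import lru_cache

# a minimal sketch: memoize indicator lookups for the lifetime of the notebook run,
# so repeated cells or re-executions within a session do not re-hit the provider
@lru_cache(maxsize=256)
def get_indicator_cached(commodity, indicator, date):
    # arguments are plain strings, so they hash cleanly; swap in a short-TTL Redis cache
    # if several workers need to share results intraday
    return get_indicator(commodity, indicator, date)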
Batch fetch and normalization
import pandas as pd

rows = []
for c in commodities:
    blurb = get_blurb(c, date)
    oi = get_indicator(c, 'open_interest', date)
    cash = get_indicator(c, 'cash_price', date)
    rows.append({
        'commodity': c,
        'blurb_text': blurb.get('text'),
        'open_interest': oi.get('value'),
        'open_interest_prev': oi.get('prev_value'),
        'cash_price': cash.get('value'),
        'cash_price_prev': cash.get('prev_value'),
        'source_blurb': blurb.get('source'),
        'source_oi': oi.get('source'),
        'source_cash': cash.get('source'),
    })

df = pd.DataFrame(rows)
Step 3 — Enrich and compute signals
Compute simple signals traders care about: percent change, OI delta and whether the move is significant.
def pct_change(new, old):
    try:
        return (new - old) / old * 100 if old and old != 0 else None
    except Exception:
        return None

df['cash_pct'] = df.apply(lambda r: pct_change(r['cash_price'], r['cash_price_prev']), axis=1)
df['oi_pct'] = df.apply(lambda r: pct_change(r['open_interest'], r['open_interest_prev']), axis=1)

def short_signal(row):
    parts = []
    if row['cash_pct'] is not None and abs(row['cash_pct']) > 1.0:
        parts.append(f"Cash price {'up' if row['cash_pct'] > 0 else 'down'} {abs(row['cash_pct']):.2f}%")
    if row['oi_pct'] is not None and abs(row['oi_pct']) > 2.0:
        parts.append(f"Open interest {'up' if row['oi_pct'] > 0 else 'down'} {abs(row['oi_pct']):.1f}%")
    return '; '.join(parts) if parts else 'No major signal'

df['signal'] = df.apply(short_signal, axis=1)
Step 4 — Generate textual summary using templates
Use Jinja2 to produce consistent blurbs. This keeps styling separate from logic and makes localization or edits simple.
from jinja2 import Environment, FileSystemLoader
env = Environment(loader=FileSystemLoader('templates'))
template = env.get_template('daily_briefing.html')
report_html = template.render(date=date, rows=df.to_dict(orient='records'))
Example template fragment (templates/daily_briefing.html):
<!doctype html>
<html><head><style>body{font-family:Arial,Helvetica,sans-serif} .commodity{margin-bottom:18px}</style></head><body>
<h1>Commodity Briefing — {{ date }}</h1>
{% for r in rows %}
<div class="commodity">
<h2>{{ r.commodity|capitalize }} — {{ r.signal }}</h2>
<p>{{ r.blurb_text }}</p>
<ul>
<li>Cash price: {{ r.cash_price }} ({{ r.cash_pct|round(2) if r.cash_pct is not none else 'n/a' }}%) — source: {{ r.source_cash }}</li>
<li>Open interest: {{ r.open_interest }} ({{ r.oi_pct|round(2) if r.oi_pct is not none else 'n/a' }}%) — source: {{ r.source_oi }}</li>
</ul>
</div>
{% endfor %}
</body></html>
Step 5 — Convert HTML to PDF
Two production-friendly options:
- WeasyPrint — pure Python and renders modern CSS well
- wkhtmltopdf + pdfkit — fast and battle-tested, but requires system dependency
# WeasyPrint example
from weasyprint import HTML

pdf_bytes = HTML(string=report_html).write_pdf()
with open(f'briefing_{date}.pdf', 'wb') as f:
    f.write(pdf_bytes)
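If your runners already ship the wkhtmltopdf binary, the pdfkit route is a drop-in alternative; a minimal sketch (the options shown are common wkhtmltopdf flags, not requirements):

# wkhtmltopdf + pdfkit alternative (requires the wkhtmltopdf system binary on the host)
import pdfkit

pdfkit.from_string(
    report_html,
    f'briefing_{date}.pdf',
    options={'page-size': 'A4', 'quiet': ''},
)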
Step 6 — Upload PDF and post to Slack
Upload to S3 (or your corporate file server) and then post a short summary + PDF URL to Slack. Use signed URLs when needed.
import boto3
s3 = boto3.client('s3')
key = f"briefings/{date}/briefing.pdf"
s3.put_object(Bucket=OUTPUT_BUCKET, Key=key, Body=pdf_bytes, ContentType='application/pdf')
# Create presigned URL valid for 24h
url = s3.generate_presigned_url('get_object', Params={'Bucket': OUTPUT_BUCKET, 'Key': key}, ExpiresIn=86400)
from slack_sdk import WebClient
client = WebClient(token=os.getenv('SLACK_BOT_TOKEN'))
# Short message with signal highlights
summary_lines = []
for _, r in df.iterrows():
    summary_lines.append(f"*{r['commodity'].upper()}*: {r['signal']}")
summary_text = "\n".join(summary_lines)
client.chat_postMessage(channel='#trading-briefs', text=f"Daily Briefing {date}\n{summary_text}\n{url}")
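If presigned links are awkward in your workspace (expiry, external-sharing policy), slack_sdk can instead upload the PDF as a native Slack file; a minimal sketch, assuming a recent slack_sdk and a bot with files:write that is a member of the target channel:

# alternative delivery: upload the PDF directly to Slack instead of linking to S3
client.files_upload_v2(
    channel='C0123456789',            # hypothetical channel ID; this API expects an ID, not a '#name'
    file=f'briefing_{date}.pdf',
    title=f'Daily Briefing {date}',
    initial_comment=summary_text,
)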
Step 7 — Run and schedule the notebook
Automation options that match different operational profiles:
- Local cron / dedicated VM — cheapest but limited observability
- GitHub Actions — run nbconvert or a Python script nightly; integrates with secrets
- Airflow / Prefect — best for dependency management, retries, and SLA-based monitoring
- Serverless (Cloud Run / Lambda) — lower maintenance for stateless runs
Example GitHub Actions workflow snippet (run a script that executes the notebook using papermill):
name: Daily Briefing
on:
  schedule:
    - cron: '0 12 * * *'   # 12:00 UTC daily
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - run: pip install -r requirements.txt
      - run: papermill notebooks/daily_briefing.ipynb notebooks/out.ipynb -p date "$(date -u +%F)"
Operational best practices (2026 tips)
- Use vectorized bulk endpoints — many providers now expose multi-symbol endpoints; batching requests reduces both latency and per-request billing.
- Cache aggressively — keep a short-lived cache (Redis) for intraday runs and a daily archive for provenance.
- Audit data lineage — store source IDs, timestamps and provider version in metadata fields for every brief.
- Respect rate limits — use exponential backoff plus a global leaky-bucket or token-bucket limiter in your session wrapper (see the sketch after this list).
- Secrets and credentials — use cloud secrets managers (AWS Secrets Manager, GCP Secret Manager) and short-lived tokens where possible.
- Monitor costs — if your provider charges per request, consolidate and batch to reduce cost.
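The rate-limiting point above deserves a concrete shape. Below is a minimal token-bucket sketch (a close variant of the leaky bucket) wrapped around session.get; the rate and capacity values are placeholders, not any provider's published limits:

import time
import threading

class TokenBucket:
    # a minimal client-side rate limiter; tune rate_per_sec and capacity to your provider's limits
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self):
        with self.lock:
            while True:
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                time.sleep((1 - self.tokens) / self.rate)

bucket = TokenBucket(rate_per_sec=5, capacity=10)

def rate_limited_get(url, **kwargs):
    # wrap session.get so every outgoing call waits for a token first
    bucket.acquire()
    return session.get(url, **kwargs)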
Governance and compliance
Traders need to present briefs to auditors and compliance teams. Keep an index (parquet or SQL) recording date, brief_id, provider versions, and a checksum of the PDF. This gives you an immutable record and a straightforward way to prove what each brief contained.
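A minimal provenance record can be one row per brief with a SHA-256 of the PDF; the sketch below appends to a local parquet index (a hypothetical file; requires pyarrow or fastparquet) and reuses pdf_bytes and key from the earlier cells:

import hashlib
from pathlib import Path
import pandas as pd

# a minimal sketch: one provenance row per generated brief
record = pd.DataFrame([{
    'date': date,
    'brief_id': f'briefing-{date}',
    'commodities': ','.join(commodities),
    'pdf_sha256': hashlib.sha256(pdf_bytes).hexdigest(),
    's3_key': key,
}])

index_path = Path('briefing_index.parquet')   # hypothetical local index; a SQL table works equally well
if index_path.exists():
    record = pd.concat([pd.read_parquet(index_path), record], ignore_index=True)
record.to_parquet(index_path, index=False)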
Advanced: Add generative summarization and alert prioritization
By 2026 many teams augment rules-based blurbs with conditional generative summaries. Use an LLM for short executive summaries but keep the original sources and the model prompt stored alongside the brief for explainability.
# Example: produce a 2-sentence executive summary with a small, explainable prompt
from openai import OpenAI

llm = OpenAI(api_key=os.getenv('OPENAI_KEY'))
prompt = (
    f"Given these rows: {df[['commodity', 'signal']].to_dict(orient='records')}, "
    "write a 2-sentence summary for a commodities trader highlighting major market moves."
)
resp = llm.responses.create(model='gpt-4o-mini', input=prompt)
summary = resp.output_text
Important: store the prompt and model metadata. For compliance, use smaller, closed models if required and log tokens used and costs.
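One way to keep the prompt and model metadata next to the brief is a small JSON sidecar stored with the PDF; a minimal sketch, reusing the s3 client and the prompt and summary from the cell above:

import json
from datetime import datetime, timezone

# a minimal sketch: persist prompt and model metadata alongside the brief for explainability
llm_meta = {
    'model': 'gpt-4o-mini',
    'prompt': prompt,
    'summary': summary,
    'generated_at': datetime.now(timezone.utc).isoformat(),
    # add token usage and cost here if your SDK version exposes them
}
s3.put_object(
    Bucket=OUTPUT_BUCKET,
    Key=f"briefings/{date}/llm_metadata.json",
    Body=json.dumps(llm_meta).encode('utf-8'),
    ContentType='application/json',
)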
Common pitfalls and how to avoid them
- Inconsistent field names — normalize keys from different providers with a single adapter layer (see the sketch after this list).
- Stale data — verify timestamps returned from the provider and enforce a freshness policy.
- Rate-limited mornings — schedule a staggered fetch for large symbol lists or use a pre-warmed cache.
- PDF layout issues — preview HTML locally; use CSS page-breaks to avoid splitting tables mid-row.
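The adapter layer from the first pitfall can be a thin per-provider mapping onto one internal schema; the field names below are illustrative, not any real provider's schema:

# a minimal sketch of a provider adapter layer
FIELD_MAP = {
    'providerA': {'val': 'value', 'prev': 'prev_value', 'src': 'source'},            # hypothetical schemas
    'providerB': {'latest': 'value', 'previous': 'prev_value', 'origin': 'source'},
}

def normalize_indicator(payload, provider):
    # map provider-specific keys onto the internal names used throughout the notebook
    mapping = FIELD_MAP[provider]
    return {internal: payload.get(external) for external, internal in mapping.items()}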
Sample troubleshooting checklist
- Confirm provider status page and check for outages
- Inspect notebook logs and saved response bodies for error payloads
- Validate API key scopes and expiry
- Check S3 permissions if upload fails (IAM role / bucket policy)
- For Slack failures, validate bot token scopes and channel membership
Pro tip: include a run id and a trace id in the headers of every API call so you can correlate provider logs with your run logs.
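A simple way to do this is to stamp the identifiers on the shared session early in the notebook so every request carries them; the header names here are assumptions to agree with your provider:

import uuid

# a minimal sketch: attach run/trace identifiers to every outgoing request on the shared session
session.headers.update({
    'X-Run-Id': f'briefing-{date}',     # hypothetical header names; align them with your provider
    'X-Trace-Id': str(uuid.uuid4()),
})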
Real-world example: Corn briefing (walkthrough)
Using the source-style blurbs, you may see statements like “Corn ticking higher on Friday morning” alongside numeric fields such as a cash price of 3.82 and an open-interest change of +14,050. The notebook converts that into:
- A structured record with cash_pct = -0.39% (if the previous cash price differs)
- An OI delta that is reported and flagged because +14,050 represents a >2% move vs. the prior day
- A short signal: “Open interest up 3.2%” (the cash move stays below the 1% threshold, so it is not flagged)
The final PDF shows the raw blurb, the computed indicators and a short trading note. The Slack notification contains the executive summary and an S3-presigned link.
Security and licensing considerations
Always confirm provider licensing before distributing briefs externally. For internal use, ensure contract terms permit redistribution to stakeholders. Store and access API keys using best practices:
- Never commit keys to git.
- Prefer ephemeral credentials and rotate regularly.
- Log access but redact secrets from logs.
Why a notebook-first approach wins for trading teams
Notebooks act both as documentation and executable logic. They accelerate iteration between quants, developers, and traders:
- Rapid prototyping: Build the brief, add indicators, tweak language templates.
- Reproducibility: Parameterize with papermill and rerun previous briefs for backtests.
- Auditability: Save executed notebooks (HTML or ipynb) alongside the PDF for provenance.
2026 trends that matter for your pipeline
- Normalized commodity data contracts: More providers offer consistent object schemas and multi-symbol batch endpoints.
- Arrow Flight and low-latency streams: For intraday desks, stream numeric indicators and use the notebook to pull final blurbs.
- Explainable summarization: Regulators expect logging of prompts and model outputs when LLMs are used for trading commentary.
- Event-driven automation: Cloud functions triggered by market events (e.g., 2% price move) can generate urgent mini-briefs.
Actionable takeaways
- Start with a small, reproducible Jupyter notebook that fetches 3–5 commodities and renders a PDF
- Enrich blurbs with open interest and cash price and compute simple percent-change signals
- Render HTML + CSS and convert to PDF (WeasyPrint for pure-Python stacks)
- Automate via GitHub Actions or Airflow; integrate secrets via a cloud secrets manager
- Store metadata for provenance and compliance; keep raw response payloads for audit
Further reading and tools
- Papermill — parameterize notebooks for scheduled runs
- WeasyPrint / wkhtmltopdf — PDF rendering options
- slack_sdk — Python Slack client
- AWS S3 + presigned URLs for secure distribution
- OpenAI / local LLMs — for conditional executive summaries (log prompts and responses)
Final notes
This pattern balances speed, auditability and trader usability. In 2026 the differentiator is not just data access but how you operationalize, secure, and explain the insights you push to desks.
Call to action
Ready to prototype? Download the companion notebook (ipynb) and template bundle from our Git repo, fork it, and run a nightly briefing in under an hour. If you want a turnkey integration with high-throughput commodity APIs and managed pipeline orchestration, contact our team for a pilot and a cost analysis tailored to your data volume.