Automated Daily Briefing Generator Using Jupyter and Commodity APIs
A notebook-first, production-ready guide: fetch commodity blurbs, enrich them with open interest and cash prices, then render a PDF and post it to Slack.
If you manage data pipelines for a trading desk, you know the pain: juggling slow APIs, inconsistent CSVs, manual write-ups, and last-minute PDF edits. This notebook-first approach shows how to automate a daily commodity briefing that fetches market blurbs, enriches them with indicators (open interest, cash price), renders a polished PDF, and posts the result to Slack, reliably and repeatably.
Why build a Jupyter-based briefing in 2026?
In 2026 the market for commodity data integration has matured: normalized REST/GraphQL endpoints, cloud-hosted parquet downloads, and lower-latency feeds (Arrow Flight and streaming APIs) are common. Trading teams want:
- Machine-readable, auditable briefs that can be archived and surfaced to ML models
- Reproducible notebooks that codify business logic and can be parameterized for different desks
- Cloud-native automation so briefings run on schedule with observability and secure secrets
What you’ll get from this guide
- A Jupyter notebook pattern to fetch commodity blurbs and numeric indicators
- Data-enrichment examples: open interest, cash price, percent moves
- Template-driven PDF rendering and Slack delivery
- Operational best practices: retries, caching, scheduling, provenance
Architecture overview (practical)
High-level pipeline:
- Notebook orchestrates API calls to a commodity blurb API and numeric endpoints (open interest, cash price)
- Enrich raw blurbs with computed indicators and short automated commentary
- Render an HTML template and convert to PDF (WeasyPrint or wkhtmltopdf)
- Upload PDF to S3 (or internal file store) and post a notification with a short summary and PDF link to Slack
Prerequisites
- Python 3.10+ and JupyterLab (or Jupyter Notebook)
- Libraries: requests, pandas, jinja2, matplotlib, weasyprint (or pdfkit + wkhtmltopdf), slack_sdk, boto3; papermill for parameterized runs
- API keys for your commodity data provider(s) — set them as environment variables or use Secrets Manager
- Optional: AWS credentials for S3 uploads, CI runner (GitHub Actions, Airflow, or Prefect) for scheduling
Step 1 — Notebook layout and parameters
In the notebook, keep a top cell with parameters so it can be run manually or programmatically with papermill or nbconvert.
# parameters cell (for papermill)
import os

date = '2026-01-18'
commodities = ['corn', 'soybeans', 'wheat', 'cotton']
DATA_API_KEY = os.getenv('COMMODITY_API_KEY')
SLACK_WEBHOOK = os.getenv('SLACK_WEBHOOK')
OUTPUT_BUCKET = 'trading-briefings'
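When the notebook is driven programmatically, papermill injects fresh values for these parameters at run time (it looks for the cell tagged "parameters"). A minimal sketch of such a run via papermill's Python API, with hypothetical paths:

import papermill as pm

# a minimal sketch: execute the briefing notebook with injected parameters
# (paths below are hypothetical; parameter names must match the cell above)
pm.execute_notebook(
    'notebooks/daily_briefing.ipynb',
    'notebooks/out/daily_briefing_2026-01-18.ipynb',   # executed copy kept for provenance
    parameters={
        'date': '2026-01-18',
        'commodities': ['corn', 'soybeans', 'wheat', 'cotton'],
        'OUTPUT_BUCKET': 'trading-briefings',
    },
)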
Step 2 — Fetch blurbs and numeric indicators
Use small, resilient wrappers around your HTTP calls. Add retries, backoff, and caching to avoid hitting rate limits.
import os
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
session = requests.Session()
retries = Retry(total=3, backoff_factor=1, status_forcelist=[429, 500, 502, 503, 504])
session.mount('https://', HTTPAdapter(max_retries=retries))
BASE = 'https://api.example-commodity.com/v1' # replace with your provider
def get_blurb(commodity, date):
    url = f"{BASE}/market/blurbs/{commodity}?date={date}"
    r = session.get(url, headers={'Authorization': f'Bearer {DATA_API_KEY}'}, timeout=10)
    r.raise_for_status()
    return r.json()

def get_indicator(commodity, indicator, date):
    url = f"{BASE}/indicators/{commodity}/{indicator}?date={date}"
    r = session.get(url, headers={'Authorization': f'Bearer {DATA_API_KEY}'}, timeout=10)
    r.raise_for_status()
    return r.json()
Example JSON returned by get_indicator might include fields: value, unit, prev_value, source.
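The retry wrapper covers transient failures, but the caching mentioned above is not shown. For a daily batch run, an in-process memo keyed on (commodity, indicator, date) is often enough; a minimal sketch, assuming the get_indicator helper defined above:

from functools import lru_cache

# a minimal sketch: memoize indicator lookups for the lifetime of the notebook run,
# so repeated cells or re-executions within a session do not re-hit the provider
@lru_cache(maxsize=256)
def get_indicator_cached(commodity, indicator, date):
    # arguments are plain strings, so they hash cleanly; swap in a short-TTL Redis cache
    # if several workers need to share results intraday
    return get_indicator(commodity, indicator, date)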
Batch fetch and normalization
import pandas as pd

rows = []
for c in commodities:
    blurb = get_blurb(c, date)
    oi = get_indicator(c, 'open_interest', date)
    cash = get_indicator(c, 'cash_price', date)
    rows.append({
        'commodity': c,
        'blurb_text': blurb.get('text'),
        'open_interest': oi.get('value'),
        'open_interest_prev': oi.get('prev_value'),
        'cash_price': cash.get('value'),
        'cash_price_prev': cash.get('prev_value'),
        'source_blurb': blurb.get('source'),
        'source_oi': oi.get('source'),
        'source_cash': cash.get('source'),
    })

df = pd.DataFrame(rows)
Step 3 — Enrich and compute signals
Compute simple signals traders care about: percent change, OI delta and whether the move is significant.
def pct_change(new, old):
    try:
        return (new - old) / old * 100 if old and old != 0 else None
    except Exception:
        return None

df['cash_pct'] = df.apply(lambda r: pct_change(r['cash_price'], r['cash_price_prev']), axis=1)
df['oi_pct'] = df.apply(lambda r: pct_change(r['open_interest'], r['open_interest_prev']), axis=1)

def short_signal(row):
    parts = []
    if row['cash_pct'] is not None and abs(row['cash_pct']) > 1.0:
        parts.append(f"Cash price {'up' if row['cash_pct'] > 0 else 'down'} {abs(row['cash_pct']):.2f}%")
    if row['oi_pct'] is not None and abs(row['oi_pct']) > 2.0:
        parts.append(f"Open interest {'up' if row['oi_pct'] > 0 else 'down'} {abs(row['oi_pct']):.1f}%")
    return '; '.join(parts) if parts else 'No major signal'

df['signal'] = df.apply(short_signal, axis=1)
Step 4 — Generate textual summary using templates
Use Jinja2 to produce consistent blurbs. This keeps styling separate from logic and makes localization or edits simple.
from jinja2 import Environment, FileSystemLoader
env = Environment(loader=FileSystemLoader('templates'))
template = env.get_template('daily_briefing.html')
report_html = template.render(date=date, rows=df.to_dict(orient='records'))
Example template fragment (templates/daily_briefing.html):
<!doctype html>
<html><head><style>body{font-family:Arial,Helvetica,sans-serif} .commodity{margin-bottom:18px}</style></head><body>
<h1>Commodity Briefing — {{ date }}</h1>
{% for r in rows %}
<div class="commodity">
<h2>{{ r.commodity|capitalize }} — {{ r.signal }}</h2>
<p>{{ r.blurb_text }}</p>
<ul>
<li>Cash price: {{ r.cash_price }} ({{ r.cash_pct|round(2) if r.cash_pct is not none else 'n/a' }}%) — source: {{ r.source_cash }}</li>
<li>Open interest: {{ r.open_interest }} ({{ r.oi_pct|round(2) if r.oi_pct is not none else 'n/a' }}%) — source: {{ r.source_oi }}</li>
</ul>
</div>
{% endfor %}
</body></html>
Step 5 — Convert HTML to PDF
Two production-friendly options:
- WeasyPrint — pure Python and renders modern CSS well
- wkhtmltopdf + pdfkit — fast and battle-tested, but requires system dependency
# WeasyPrint example
from weasyprint import HTML

pdf_bytes = HTML(string=report_html).write_pdf()
with open(f'briefing_{date}.pdf', 'wb') as f:
    f.write(pdf_bytes)
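If your runners already ship the wkhtmltopdf binary, the pdfkit route is a drop-in alternative; a minimal sketch (the options shown are common wkhtmltopdf flags, not requirements):

# wkhtmltopdf + pdfkit alternative (requires the wkhtmltopdf system binary on the host)
import pdfkit

pdfkit.from_string(
    report_html,
    f'briefing_{date}.pdf',
    options={'page-size': 'A4', 'quiet': ''},
)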
Step 6 — Upload PDF and post to Slack
Upload to S3 (or your corporate file server) and then post a short summary + PDF URL to Slack. Use signed URLs when needed.
import boto3
s3 = boto3.client('s3')
key = f"briefings/{date}/briefing.pdf"
s3.put_object(Bucket=OUTPUT_BUCKET, Key=key, Body=pdf_bytes, ContentType='application/pdf')
# Create presigned URL valid for 24h
url = s3.generate_presigned_url('get_object', Params={'Bucket': OUTPUT_BUCKET, 'Key': key}, ExpiresIn=86400)
from slack_sdk import WebClient
client = WebClient(token=os.getenv('SLACK_BOT_TOKEN'))
# Short message with signal highlights
summary_lines = []
for _, r in df.iterrows():
    summary_lines.append(f"*{r['commodity'].upper()}*: {r['signal']}")
summary_text = "\n".join(summary_lines)
client.chat_postMessage(channel='#trading-briefs', text=f"Daily Briefing {date}\n{summary_text}\n{url}")
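If presigned links are awkward in your workspace (expiry, external-sharing policy), slack_sdk can instead upload the PDF as a native Slack file; a minimal sketch, assuming a recent slack_sdk and a bot with files:write that is a member of the target channel:

# alternative delivery: upload the PDF directly to Slack instead of linking to S3
client.files_upload_v2(
    channel='C0123456789',            # hypothetical channel ID; this API expects an ID, not a '#name'
    file=f'briefing_{date}.pdf',
    title=f'Daily Briefing {date}',
    initial_comment=summary_text,
)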
Step 7 — Run and schedule the notebook
Automation options that match different operational profiles:
- Local cron / dedicated VM — cheapest but limited observability
- GitHub Actions — run nbconvert or a Python script nightly; integrates with secrets
- Airflow / Prefect — best for dependency management, retries, and SLA-based monitoring
- Serverless (Cloud Run / Lambda) — lower maintenance for stateless runs
Example GitHub Actions workflow snippet (run a script that executes the notebook using papermill):
name: Daily Briefing
on:
  schedule:
    - cron: '0 12 * * *'   # 12:00 UTC daily
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - run: pip install -r requirements.txt
      - run: papermill notebooks/daily_briefing.ipynb notebooks/out.ipynb -p date "$(date -u +%F)"
Operational best practices (2026 tips)
- Use vectorized bulk endpoints — many providers now expose multi-symbol endpoints; batching requests reduces both latency and per-request billing.
- Cache aggressively — keep a short-lived cache (Redis) for intraday runs and a daily archive for provenance.
- Audit data lineage — store source IDs, timestamps and provider version in metadata fields for every brief.
- Respect rate limits — use exponential backoff plus a global leaky-bucket or token-bucket limiter in your session wrapper (see the sketch after this list).
- Secrets and credentials — use cloud secrets managers (AWS Secrets Manager, GCP Secret Manager) and short-lived tokens where possible.
- Monitor costs — if your provider charges per request, consolidate and batch to reduce cost.
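The rate-limiting point above deserves a concrete shape. Below is a minimal token-bucket sketch (a close variant of the leaky bucket) wrapped around session.get; the rate and capacity values are placeholders, not any provider's published limits:

import time
import threading

class TokenBucket:
    # a minimal client-side rate limiter; tune rate_per_sec and capacity to your provider's limits
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self):
        with self.lock:
            while True:
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                time.sleep((1 - self.tokens) / self.rate)

bucket = TokenBucket(rate_per_sec=5, capacity=10)

def rate_limited_get(url, **kwargs):
    # wrap session.get so every outgoing call waits for a token first
    bucket.acquire()
    return session.get(url, **kwargs)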
Governance and compliance
Traders need to present briefs to auditors and compliance teams. Keep an index (parquet or SQL) recording date, brief_id, provider versions, and a checksum of the PDF. This gives you an immutable record and a straightforward way to prove what each brief contained.
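A minimal provenance record can be one row per brief with a SHA-256 of the PDF; the sketch below appends to a local parquet index (a hypothetical file; requires pyarrow or fastparquet) and reuses pdf_bytes and key from the earlier cells:

import hashlib
from pathlib import Path
import pandas as pd

# a minimal sketch: one provenance row per generated brief
record = pd.DataFrame([{
    'date': date,
    'brief_id': f'briefing-{date}',
    'commodities': ','.join(commodities),
    'pdf_sha256': hashlib.sha256(pdf_bytes).hexdigest(),
    's3_key': key,
}])

index_path = Path('briefing_index.parquet')   # hypothetical local index; a SQL table works equally well
if index_path.exists():
    record = pd.concat([pd.read_parquet(index_path), record], ignore_index=True)
record.to_parquet(index_path, index=False)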
Advanced: Add generative summarization and alert prioritization
By 2026 many teams augment rules-based blurbs with conditional generative summaries. Use an LLM for short executive summaries but keep the original sources and the model prompt stored alongside the brief for explainability.
# Example: produce a 2-sentence executive summary with a small, explainable prompt
from openai import OpenAI

llm = OpenAI(api_key=os.getenv('OPENAI_KEY'))
prompt = (
    f"Given these rows: {df[['commodity', 'signal']].to_dict(orient='records')}, "
    "write a 2-sentence summary for a commodities trader highlighting major market moves."
)
resp = llm.responses.create(model='gpt-4o-mini', input=prompt)
summary = resp.output_text
Important: store the prompt and model metadata. For compliance, use smaller, closed models if required and log tokens used and costs.
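One way to keep the prompt and model metadata next to the brief is a small JSON sidecar stored with the PDF; a minimal sketch, reusing the s3 client and the prompt and summary from the cell above:

import json
from datetime import datetime, timezone

# a minimal sketch: persist prompt and model metadata alongside the brief for explainability
llm_meta = {
    'model': 'gpt-4o-mini',
    'prompt': prompt,
    'summary': summary,
    'generated_at': datetime.now(timezone.utc).isoformat(),
    # add token usage and cost here if your SDK version exposes them
}
s3.put_object(
    Bucket=OUTPUT_BUCKET,
    Key=f"briefings/{date}/llm_metadata.json",
    Body=json.dumps(llm_meta).encode('utf-8'),
    ContentType='application/json',
)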
Common pitfalls and how to avoid them
- Inconsistent field names — normalize keys from different providers with a single adapter layer (see the sketch after this list).
- Stale data — verify timestamps returned from the provider and enforce a freshness policy.
- Rate-limited mornings — schedule a staggered fetch for large symbol lists or use a pre-warmed cache.
- PDF layout issues — preview HTML locally; use CSS page-breaks to avoid splitting tables mid-row.
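The adapter layer from the first pitfall can be a thin per-provider mapping onto one internal schema; the field names below are illustrative, not any real provider's schema:

# a minimal sketch of a provider adapter layer
FIELD_MAP = {
    'providerA': {'val': 'value', 'prev': 'prev_value', 'src': 'source'},            # hypothetical schemas
    'providerB': {'latest': 'value', 'previous': 'prev_value', 'origin': 'source'},
}

def normalize_indicator(payload, provider):
    # map provider-specific keys onto the internal names used throughout the notebook
    mapping = FIELD_MAP[provider]
    return {internal: payload.get(external) for external, internal in mapping.items()}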
Sample troubleshooting checklist
- Confirm provider status page and check for outages
- Inspect notebook logs and saved response bodies for error payloads
- Validate API key scopes and expiry
- Check S3 permissions if upload fails (IAM role / bucket policy)
- For Slack failures, validate bot token scopes and channel membership
Pro tip: include a run id and a trace id in the headers of every API call so you can correlate provider logs with your run logs.
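A simple way to do this is to stamp the identifiers on the shared session early in the notebook so every request carries them; the header names here are assumptions to agree with your provider:

import uuid

# a minimal sketch: attach run/trace identifiers to every outgoing request on the shared session
session.headers.update({
    'X-Run-Id': f'briefing-{date}',     # hypothetical header names; align them with your provider
    'X-Trace-Id': str(uuid.uuid4()),
})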
Real-world example: Corn briefing (walkthrough)
Using the source-style blurbs, you may see statements like “Corn ticking higher on Friday morning” alongside numeric fields such as a cash price of 3.82 and an open-interest change of +14,050. The notebook converts that into:
- A structured record with cash_pct = -0.39% (if the previous cash price differs)
- An OI delta that is reported and flagged because +14,050 represents a >2% move vs. the prior day
- A short signal: “Open interest up 3.2%” (the cash move stays below the 1% threshold, so it is not flagged)
The final PDF shows the raw blurb, the computed indicators and a short trading note. The Slack notification contains the executive summary and an S3-presigned link.
Security and licensing considerations
Always confirm provider licensing before distributing briefs externally. For internal use, ensure contract terms permit redistribution to stakeholders. Store and access API keys using best practices:
- Never commit keys to git.
- Prefer ephemeral credentials and rotate regularly.
- Log access but redact secrets from logs.
Why a notebook-first approach wins for trading teams
Notebooks act both as documentation and executable logic. They accelerate iteration between quants, developers, and traders:
- Rapid prototyping: Build the brief, add indicators, tweak language templates.
- Reproducibility: Parameterize with papermill and rerun previous briefs for backtests.
- Auditability: Save executed notebooks (HTML or ipynb) alongside the PDF for provenance.
2026 trends that matter for your pipeline
- Normalized commodity data contracts: More providers offer consistent object schemas and multi-symbol batch endpoints.
- Arrow Flight and low-latency streams: For intraday desks, stream numeric indicators and use the notebook to pull final blurbs.
- Explainable summarization: Regulators expect logging of prompts and model outputs when LLMs are used for trading commentary.
- Event-driven automation: Cloud functions triggered by market events (e.g., 2% price move) can generate urgent mini-briefs.
Actionable takeaways
- Start with a small, reproducible Jupyter notebook that fetches 3–5 commodities and renders a PDF
- Enrich blurbs with open interest and cash price and compute simple percent-change signals
- Render HTML + CSS and convert to PDF (WeasyPrint for pure-Python stacks)
- Automate via GitHub Actions or Airflow; integrate secrets via a cloud secrets manager
- Store metadata for provenance and compliance; keep raw response payloads for audit
Further reading and tools
- Papermill — parameterize notebooks for scheduled runs
- WeasyPrint / wkhtmltopdf — PDF rendering options
- slack_sdk — Python Slack client
- AWS S3 + presigned URLs for secure distribution
- OpenAI / local LLMs — for conditional executive summaries (log prompts and responses)
Final notes
This pattern balances speed, auditability and trader usability. In 2026 the differentiator is not just data access but how you operationalize, secure, and explain the insights you push to desks.
Call to action
Ready to prototype? Download the companion notebook (ipynb) and template bundle from our Git repo, fork it, and run a nightly briefing in under an hour. If you want a turnkey integration with high-throughput commodity APIs and managed pipeline orchestration, contact our team for a pilot and a cost analysis tailored to your data volume.