Recent Posts

Understanding the data (error) generating processes for data validation

A data consumer’s guide to validating data based on the failure modes data producers try to avoid

A Tale of Six States: Flexible data extraction with scraping and browser automation

Exploring how Playwright's headless browser automation (and its friends) can help unite the states’ data

Embedding column-name contracts in data pipelines with dbt

dbt supercharges SQL with Jinja templating, macros, and testing – all of which can be customized to enforce controlled vocabularies and their implied contracts on a data model

Causal design patterns for data analysts

An informal primer on causal analysis designs and data structures

Resource Round-Up: Causal Inference

Free books, lectures, blogs, papers, and more for a causal inference crash course


Column Names as Contracts

Exploring the benefits of using controlled vocabularies to encode metadata in column names, and demonstrations of implementing this approach with the convo R package or dbt extensions of SQL.

oRganization: Design patterns for internal packages

An overview of the unique design challenges and opportunities when building R packages for use inside a single organization rather than as open source. By using the jobs-to-be-done framework, this talk explores how internal packages can be better teammates by following specific design patterns for API design, testing, documentation, and more.

projmgr: Managing the human dependencies of your project

A lightning talk on key features of the projmgr package

RMarkdown Driven Development

How and why to refactor one-time analyses in RMarkdown into sustainable data products

tidycf: Turning analysis on its head by turning cashflows on their side

An overview of how the tidycf R package led to process and cultural change at Capital One




dbt package bringing dplyr semantics to SQL


R package for managing controlled vocabularies

satRday Chicago Conference Organizer

Speaker & Sponsor lead for 2019 and 2020


Hackathon-in-a-box templates for custom Rmd and ggplot2 themes


R package providing project management interface to GitHub


97 Things Every Data Engineer Should Know: Collective Wisdom from the Experts

Contributed six chapters on topics ranging from data design and development to validation and democratization

R Markdown Cookbook

This cookbook contains tips and tricks to help you get the most out of R Markdown. Topics include the automated generation of content (diagrams, text), customizing format (Pandoc, HTML, and LaTeX templates), workflow improvements (modularizing child documents, cross-referencing code chunks, chunk caching), modifying rendering behavior with hooks, and using alternative language engines.