Ananya Pathak

Blog

Thoughts on ML engineering, technology, and building things.

Reducing OOM in ML Pipelines with Parquet + PyArrow + Streaming Standardization

2026-05-05

A practical pattern for preventing OOMKilled training jobs using Parquet storage, PyArrow batch scanning, and batch-wise standardization.

MLOps, Parquet, PyArrow, Kubernetes, Machine Learning

Swapping Self-Attention with the Fourier Transform

10-07-2025

Key considerations when building machine learning systems that need to perform reliably in production environments.

transformers, fast inference, Fourier transform

Setting Up an MLOps Dev Environment with Dagster

10-07-2025

MLOps, Dagster, data engineering, scaling