Building Machine Learning Systems with a Feature Store
Jim Dowling
This book introduces fundamental principles and practices for developing, testing, and operating ML and AI systems at scale. It shows how an AI system can be decomposed into independent feature, training, and inference pipelines connected by a shared data layer. Working through example ML systems, readers tackle the hardest part of ML systems, the data: they learn how to transform data into features and embeddings and how to design a data model for AI.
The book is arranged into six parts, each consisting of a group of chapters. Part I (chapters 1–3) introduces the feature-training-inference (FTI) architecture and concludes with a case study. Part II (chapters 4 and 5) introduces feature stores for ML and a real-time credit card fraud example that is used throughout the book. Part III (chapters 6–9) covers data transformations for AI systems using frameworks such as Pandas, Polars, Apache Spark, Apache Flink, and Feldera. Part IV (chapter 10) covers training models on time-series and unstructured data and outlines the scalability challenges of distributed training. Part V (chapters 11 and 12) covers making predictions in batch, real-time, and agentic AI systems. Part VI (chapters 13–15) covers MLOps, from tests for AI systems to observability. Case studies from real-world applications are also included.













