Week 2

Data Foundations for Dairy Decisions

Learning Outcomes

By the end of Week 2, students will be able to:

  • Explain real-world data governance and ownership.
  • Organize dairy data according to FAIR principles.
  • Clean and structure datasets for downstream simulation modeling.
  • Develop and apply naming conventions and metadata standards.
  • Understand why decision-support begins with disciplined data management.

Lecture

1. The Reality of Dairy Farm Data

  • Diverse streams: milk meters, sensors, activity monitors, feeding systems, health logs.
  • Characteristics: heterogeneous, messy, operationally generated.
  • Importance: high-quality data improves diagnostic, predictive, and prescriptive decision support.

2. Data Ownership, Stewardship & Governance

  • Farmers as primary data owners; vendors often store and manage the data but do not own it.
  • Ethical considerations: privacy, competitive sensitivity, misuse risks.
  • Governance: clear data-use agreements, access permissions, audit trails.

3. Data Organization Essentials

  • Tidy data principles: each row = one observation; each column = one variable; each cell = one value.
  • Aligning time-series and event-based data across systems.
  • Metadata: sensor IDs, units, timestamps, sampling frequency.
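
The two organization steps above (reshaping to tidy form and aligning events with a time series) can be sketched in plain Python. This is a minimal illustration, not a prescribed workflow: the records, the session columns (am_kg, pm_kg), and the helper names to_tidy and latest_reading_before are all hypothetical. The alignment rule used here, "attach the most recent reading at or before the event", is one common choice among several.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical wide-format export: one row per cow, one column per milking session.
wide_rows = [
    {"cow_id": "C001", "am_kg": 14.2, "pm_kg": 12.8},
    {"cow_id": "C002", "am_kg": 16.0, "pm_kg": 15.1},
]


def to_tidy(rows, id_key, value_keys, var_name, value_name):
    """Reshape wide rows so each output row holds exactly one observation."""
    tidy = []
    for row in rows:
        for key in value_keys:
            tidy.append({id_key: row[id_key], var_name: key, value_name: row[key]})
    return tidy


tidy_rows = to_tidy(wide_rows, "cow_id", ["am_kg", "pm_kg"], "session", "milk_yield_kg")


def latest_reading_before(readings, event_time):
    """Return the most recent reading at or before event_time, or None.

    Assumes `readings` is already sorted by timestamp.
    """
    times = [r["timestamp"] for r in readings]
    idx = bisect_right(times, event_time)
    return readings[idx - 1] if idx > 0 else None


# Hypothetical milk-meter series for one cow, sorted by time.
readings = [
    {"timestamp": datetime(2026, 2, 1, 5, 30), "milk_yield_kg": 14.2},
    {"timestamp": datetime(2026, 2, 1, 17, 30), "milk_yield_kg": 12.8},
    {"timestamp": datetime(2026, 2, 2, 5, 30), "milk_yield_kg": 11.9},
]

# Hypothetical health-log event recorded between two milkings.
event = {"cow_id": "C001", "event": "health_check", "timestamp": datetime(2026, 2, 1, 20, 0)}
matched = latest_reading_before(readings, event["timestamp"])
```

Here the health-log event is matched to the evening milking that precedes it. An event that predates every reading returns None, which makes gaps between systems visible instead of silently filling them.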

4. FAIR Principles in Dairy Context

FAIR = Findable, Accessible, Interoperable, Reusable.

  • Findable: structured storage, logical filenames.
  • Accessible: secure but functional access controls.
  • Interoperable: open formats, consistent units.
  • Reusable: documentation and provenance tracking.
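
One possible way to support the Interoperable and Reusable points is a small metadata "sidecar" record stored next to each data file. The sketch below is an assumption-laden illustration: the field names, the build_sidecar helper, and the sample values (source label, 12-hour sampling interval) are hypothetical and do not represent a formal metadata standard. The SHA-256 checksum is included as a simple provenance aid, so a later user can confirm the file has not changed.

```python
import hashlib
import json


def build_sidecar(data_bytes, filename, source, units, sampling_interval_s):
    """Build a metadata record describing one data file (hypothetical schema)."""
    return {
        "filename": filename,                    # Findable: logical, dated name
        "source": source,                        # which system produced the data
        "format": "text/csv",                    # Interoperable: open format
        "units": units,                          # Interoperable: explicit units
        "sampling_interval_s": sampling_interval_s,
        "sha256": hashlib.sha256(data_bytes).hexdigest(),  # Reusable: provenance check
    }


# Hypothetical file contents; in practice these bytes would be read from disk.
csv_bytes = b"cow_id,timestamp,milk_yield_kg\nC001,2026-02-01T05:30:00,14.2\n"

sidecar = build_sidecar(
    csv_bytes,
    filename="2026-02-01_milkingparlor_data.csv",
    source="milk_meter",
    units={"milk_yield_kg": "kg"},
    sampling_interval_s=43200,  # assumed twice-daily milking
)

# Serialize with sorted keys so the record is stable under version control.
sidecar_json = json.dumps(sidecar, indent=2, sort_keys=True)
```

Writing sidecar_json to a file such as 2026-02-01_milkingparlor_data.json (a hypothetical name) keeps the documentation beside the data it describes.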

5. Naming Conventions as a Foundation for Reproducibility

  • Standardizing variable names (cow_id, milk_yield_kg).
  • File naming conventions (2026-02-01_milkingparlor_data.csv).
  • Folder hierarchy: raw/, cleaned/, analysis/, models/.
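
Conventions like these are easier to keep when they are checked automatically. The following is a minimal sketch, assuming a file name pattern of date, source, and a fixed "_data.csv" suffix inferred from the single example above. The helper names to_snake_case and is_valid_filename and the exact regular expressions are hypothetical choices, not a course standard.

```python
import re
from datetime import datetime


def to_snake_case(name):
    """Normalize a raw column header, e.g. 'Milk Yield (kg)' becomes 'milk_yield_kg'."""
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name.strip())  # split camelCase
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)  # runs of other characters -> "_"
    return name.strip("_").lower()


# Assumed pattern: YYYY-MM-DD_<source>_data.csv, with a lowercase alphanumeric source.
FILENAME_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})_([a-z0-9]+)_data\.csv$")


def is_valid_filename(filename):
    """Return True if the name matches the pattern and the date is a real calendar date."""
    match = FILENAME_RE.match(filename)
    if not match:
        return False
    try:
        datetime.strptime(match.group(1), "%Y-%m-%d")
    except ValueError:
        return False
    return True


raw_headers = ["CowID", "Milk Yield (kg)", " timestamp "]
clean_headers = [to_snake_case(h) for h in raw_headers]
```

A check like is_valid_filename could run before a file is copied from raw/ to cleaned/, so that misnamed files are caught at the point where they enter the hierarchy.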