lactationcurve.characteristics.method_test_interval

ICAR Test Interval Method for accumulated lactation yield.

Purpose

This module implements the ICAR Test Interval Method (Procedure 2, Section 2) to compute cumulative milk yield from test-day records.

Method Summary

Calculate total milk yield over a lactation by summing three parts:
  • First test-day milk yield multiplied by the number of days between calving and the first test day.
  • Trapezoidal integration between consecutive test days (the average of two consecutive yields multiplied by the interval length).
  • Last test-day milk yield multiplied by the number of days from the last test day to the end of the calculation window.

Formula:

MY = I0*M1 + I1*(M1 + M2)/2 + I2*(M2 + M3)/2 + ... + I(n-1)*(M(n-1) + Mn)/2 + In*Mn

Where:

  • MY: total milk yield over the lactation window.
  • M1, M2, ..., Mn: milk yield measured in the 24 hours of each test day.
  • I1, I2, ..., I(n-1): interval lengths (days) between consecutive test days.
  • I0: days from lactation start (calving) to first test day.
  • In: days from last test day to the end of the calculation window (e.g., DIM 305).
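
As a worked illustration of the formula above, here is a minimal pure-Python sketch. The function name `interval_method_yield` and the `window_end` parameter are hypothetical and not part of this module. The sketch takes In = window_end - (last test DIM), exactly as the formula is written; the module source computes the end segment as `max_dim + 1 - DIM`, so passing `window_end = max_dim + 1` should reproduce that convention.

```python
def interval_method_yield(dims, yields, window_end=305):
    """Sum the start, trapezoid, and end segments of the test interval formula.

    dims   : days in milk of each test day
    yields : 24-hour milk yield on each test day (same order as dims)
    """
    if len(dims) != len(yields) or len(dims) < 2:
        raise ValueError("need at least two (DIM, yield) pairs of equal length")

    # Order the test-day records by days in milk
    records = sorted(zip(dims, yields))
    first_dim, first_yield = records[0]
    last_dim, last_yield = records[-1]

    # I0 * M1: first yield applied from calving to the first test day
    start = first_dim * first_yield

    # Sum of I_k * (M_k + M_{k+1}) / 2 over consecutive test days
    middle = sum(
        (d2 - d1) * (m1 + m2) / 2
        for (d1, m1), (d2, m2) in zip(records, records[1:])
    )

    # In * Mn: last yield applied from the last test day to the window end
    end = (window_end - last_dim) * last_yield

    return start + middle + end


total = interval_method_yield([10, 40, 70], [30.0, 35.0, 28.0], window_end=305)
print(total)  # 8800.0 = 300 + 975 + 945 + 6580
```

With `window_end=306` the same records give 8828.0, one extra day of the last test-day yield.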

Key Entry Points

  • test_interval_method: Computes cumulative lactation milk yield per TestId using start, interval, and end segments.

Column Flexibility

The function accepts several case-insensitive column name aliases and can create a default TestId if one is missing. Recognized aliases:

  • Days in Milk: ["daysinmilk", "dim", "testday"]
  • Milk Yield: ["milkingyield", "testdaymilkyield", "milkyield", "yield"]
  • Test Id: ["animalid", "testid", "id"]

It is also possible to provide your own column names so the function can be applied to dataframes with different column naming conventions.
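
The alias matching described above could work roughly as in the sketch below. This is not the code of `standardize_lactation_columns`, which is not shown on this page; `resolve_columns` and its `overrides` argument are hypothetical, and matching by lowercasing the column name is an assumption. Only the alias lists and the standard names `DaysInMilk`, `MilkingYield`, and `TestId` come from this module.

```python
# Alias lists as documented above; the keys are the standardized names
ALIASES = {
    "DaysInMilk": ["daysinmilk", "dim", "testday"],
    "MilkingYield": ["milkingyield", "testdaymilkyield", "milkyield", "yield"],
    "TestId": ["animalid", "testid", "id"],
}


def resolve_columns(columns, overrides=None):
    """Map existing column names to standardized names (hypothetical helper).

    columns   : list of column names found in the input data
    overrides : optional dict {standard_name: explicit_column_name}
    """
    overrides = overrides or {}
    # Case-insensitive lookup: lowercase name -> original name
    lowered = {name.lower(): name for name in columns}
    mapping = {}
    for standard, aliases in ALIASES.items():
        explicit = overrides.get(standard)
        if explicit is not None:
            # An explicitly provided column name takes precedence over aliases
            if explicit not in columns:
                raise ValueError(f"column {explicit!r} not found")
            mapping[explicit] = standard
            continue
        match = next((lowered[a] for a in aliases if a in lowered), None)
        if match is not None:
            mapping[match] = standard
        elif standard != "TestId":
            # TestId is optional; the other two columns are required
            raise ValueError(f"required column {standard} not found")
    return mapping


print(resolve_columns(["DIM", "MilkYield"]))
# {'DIM': 'DaysInMilk', 'MilkYield': 'MilkingYield'}
```

In this sketch a missing `TestId` is simply left out of the mapping, on the assumption that the caller then fills it with a default value.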

Returns a DataFrame with columns: ["TestId", "LactationMilkYield"].

Notes

  • Units: DIM (days in milk) is measured in days, and milk yield can be in kg or lb. The output stays in the same unit as the input.
  • Records with DIM > max_dim are excluded before computation.
  • This method's main strength is its ease of use and simplicity.
  • Its main disadvantage is that it does not account for the shape of the lactation curve, which can lead to underestimation of total yield, especially for lactations with few test days or irregular patterns. Outliers in test-day records can also have a large influence on the final result.
  • When a full lactation is known (i.e., test days up to DIM 305), this method will give a higher cumulative milk yield than the sum of all test days, because the first and last test-day yields are also applied to the days before the first test day and after the last test day.

Reference

Sargent, F. D., V. H. Lytton, and O. G. Wall, Jr. 1968. Test interval method of calculating Dairy Herd Improvement Association records. J. Dairy Sci. 51:170.

Author: Meike van Leerdam
Date: 07-31-2025
Last update: 22-May-2026

"""
ICAR Test Interval Method for accumulated lactation yield.

Purpose
-------
This module implements the ICAR Test Interval Method (Procedure 2,
Section 2) to compute cumulative milk yield from test-day records.

Method Summary
--------------
Calculate total milk yield over a lactation by summing three parts:
- First test-day milk yield multiplied by the number of days between
    calving and the first test day.
- Trapezoidal integration between consecutive test days (the average of
    two consecutive yields multiplied by the interval length).
- Last test-day milk yield multiplied by the number of days from the last
    test day to the end of the calculation window.

Formula:

    MY = I0*M1
         + I1*(M1 + M2)/2
         + I2*(M2 + M3)/2
         + ...
         + I(n-1)*(M(n-1) + Mn)/2
         + In*Mn

Where:
- ``MY``: total milk yield over the lactation window.
- ``M1, M2, ..., Mn``: milk yield measured in the 24 hours of each test day.
- ``I1, I2, ..., I(n-1)``: interval lengths (days) between consecutive test days.
- ``I0``: days from lactation start (calving) to first test day.
- ``In``: days from last test day to the end of the calculation window (e.g., DIM 305).


Key Entry Points
----------------
- ``test_interval_method``: Computes cumulative lactation milk yield per
    ``TestId`` using start, interval, and end segments.


Column Flexibility
------------------
The function accepts several case-insensitive column name aliases and can
create a default ``TestId`` if one is missing. Recognized aliases:

- Days in Milk: `["daysinmilk", "dim", "testday"]`
- Milk Yield: `["milkingyield", "testdaymilkyield", "milkyield", "yield"]`
- Test Id: `["animalid", "testid", "id"]`

It is also possible to provide your own column names so the function
can be applied to dataframes with different column naming conventions.

Returns a DataFrame with columns: ``["TestId", "LactationMilkYield"]``.

Notes
-----
- Units: DIM (days in milk) is measured in days,
    and milk yield can be in kg or lb. The
    output stays in the same unit as the input.
- Records with ``DIM > max_dim`` are excluded before computation.
- This method's main strength is its ease of use and simplicity.
- Its main disadvantage is that it does not account for the shape of the
    lactation curve, which can lead to underestimation of total yield,
    especially for lactations with few test days or irregular patterns.
    Outliers in test-day records can also have a large influence on the final
    result.
- When a full lactation is known (i.e., test days up to DIM 305),
    this method will give a higher cumulative milk yield than the sum of
    all test days, because the first and last test-day yields are also
    applied to the days before the first and after the last test day.

Reference
---------
Sargent, F. D., V. H. Lytton, and O. G. Wall, Jr. 1968. Test interval method of
calculating Dairy Herd Improvement Association records. J. Dairy Sci. 51:170.

Author: Meike van Leerdam
Date: 07-31-2025
Last update: 22-May-2026
"""

import pandas as pd

from lactationcurve.preprocessing import standardize_lactation_columns


def test_interval_method(
    df: pd.DataFrame,
    days_in_milk_col: str | None = None,
    milking_yield_col: str | None = None,
    test_id_col: str | None = None,
    default_test_id: int = 0,
    max_dim: int = 305,
) -> pd.DataFrame:
    """Compute total lactation milk yield using the ICAR Test Interval Method.

    The method applies:
    - First test day milk yield from calving to the first test day,
    - Trapezoidal integration between consecutive test days,
    - Last test day milk yield from the last test day to DIM = max_dim (default = 305).

    Args:
        df (pd.DataFrame): Input DataFrame with at least DaysInMilk and
            MilkingYield columns, plus an optional TestId column. Column names
            can be provided explicitly or matched via known aliases.
        days_in_milk_col (str | None): Optional column name override for
            DaysInMilk.
        milking_yield_col (str | None): Optional column name override for
            MilkingYield.
        test_id_col (str | None): Optional column name override for TestId.
        default_test_id (int): Value used to create a default TestId column if
            one is missing.
        max_dim (int): Lactation length used to calculate cumulative
            production. The default is 305 days.
            Records with DIM > max_dim are excluded.

    Returns:
        pd.DataFrame: Two-column DataFrame with
            - "TestId": identifier per lactation,
            - "LactationMilkYield": computed total milk yield over the
              specified window.

    Raises:
        ValueError: If required columns (DaysInMilk or MilkingYield) cannot be found.

    Notes:
        - Records with DIM > max_dim are dropped before computation.
        - At least two data points per TestId are required for trapezoidal integration;
          otherwise the lactation is skipped.
    """

    # Standardize columns and filter DIM <= max_dim
    df = standardize_lactation_columns(
        df,
        days_in_milk_col=days_in_milk_col,
        milking_yield_col=milking_yield_col,
        test_id_col=test_id_col,
        default_test_id=default_test_id,
        max_dim=max_dim,
    )

    result = []

    # Iterate over each lactation
    for lactation in df["TestId"].unique():
        lactation_df = pd.DataFrame(df[df["TestId"] == lactation])

        # Sort by DaysInMilk ascending
        lactation_df.sort_values(by="DaysInMilk", ascending=True, inplace=True)

        if len(lactation_df) < 2:
            print(f"Skipping TestId {lactation}: not enough data points for interpolation.")
            continue

        # Start and end points (first and last test day)
        start = lactation_df.iloc[0]
        end = lactation_df.iloc[-1]

        # Start contribution: I0 * M1
        MY0 = start["DaysInMilk"] * start["MilkingYield"]

        # End contribution: In * Mn, with In counted as max_dim + 1 - last DIM
        MYend = (max_dim + 1 - end["DaysInMilk"]) * end["MilkingYield"]

        # Intermediate trapezoidal contributions: each row holds the interval to
        # the next test day and the mean of the two yields (NaN on the last row)
        lactation_df["width"] = lactation_df["DaysInMilk"].diff().shift(-1)
        lactation_df["avg_yield"] = (
            lactation_df["MilkingYield"] + lactation_df["MilkingYield"].shift(-1)
        ) / 2
        lactation_df["trapezoid_area"] = lactation_df["width"] * lactation_df["avg_yield"]

        # sum() skips the NaN of the last row
        total_intermediate = lactation_df["trapezoid_area"].sum()

        total_yield = MY0 + total_intermediate + MYend
        result.append((lactation, total_yield))

    return pd.DataFrame(result, columns=pd.Index(["TestId", "LactationMilkYield"]))


# to prevent pytest from trying to collect this function as a test
test_interval_method.__test__ = False
def test_interval_method(df: pandas.core.frame.DataFrame, days_in_milk_col: str | None = None, milking_yield_col: str | None = None, test_id_col: str | None = None, default_test_id: int = 0, max_dim: int = 305) -> pandas.core.frame.DataFrame:

Compute total lactation milk yield using the ICAR Test Interval Method.

The method applies:

  • First test day milk yield from calving to the first test day,
  • Trapezoidal integration between consecutive test days,
  • Last test day milk yield from the last test day to DIM = max_dim (default = 305).

Arguments:

  • df (pd.DataFrame): Input DataFrame with at least DaysInMilk and MilkingYield columns, plus an optional TestId column. Column names can be provided explicitly or matched via known aliases.
  • days_in_milk_col (str | None): Optional column name override for DaysInMilk.
  • milking_yield_col (str | None): Optional column name override for MilkingYield.
  • test_id_col (str | None): Optional column name override for TestId.
  • default_test_id (int): Value used to create a default TestId column if one is missing.
  • max_dim (int): Lactation length used to calculate cumulative production. The default is 305 days. Records with DIM > max_dim are excluded.

Returns:

pd.DataFrame: Two-column DataFrame with

  • "TestId": identifier per lactation,
  • "LactationMilkYield": computed total milk yield over the specified window.

Raises:

  • ValueError: If required columns (DaysInMilk or MilkingYield) cannot be found.

Notes:

  • Records with DIM > max_dim are dropped before computation.
  • At least two data points per TestId are required for trapezoidal integration; otherwise the lactation is skipped.
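
The two rules in the notes above (dropping records beyond `max_dim`, then skipping lactations with fewer than two remaining test days) can be illustrated without pandas. The sketch below is a simplified stand-in, not the module's code: `group_usable_records` is a hypothetical name, and records are assumed to be plain `(test_id, dim, milk_yield)` tuples rather than DataFrame rows.

```python
from collections import defaultdict


def group_usable_records(records, max_dim=305):
    """Group test-day records per lactation and apply the two rules above.

    records : iterable of (test_id, dim, milk_yield) tuples
    Returns (usable, skipped), where usable maps test_id to a DIM-sorted list
    of (dim, milk_yield) pairs and skipped lists test_ids with < 2 records left.
    """
    grouped = defaultdict(list)
    for test_id, dim, milk_yield in records:
        # Rule 1: records with DIM > max_dim are dropped before computation
        if dim <= max_dim:
            grouped[test_id].append((dim, milk_yield))

    usable, skipped = {}, []
    for test_id, points in grouped.items():
        # Rule 2: at least two points are needed for a trapezoid
        if len(points) < 2:
            skipped.append(test_id)
        else:
            usable[test_id] = sorted(points)
    return usable, skipped


records = [
    ("A", 40, 35.0),
    ("A", 10, 30.0),
    ("A", 320, 20.0),  # beyond max_dim, dropped
    ("B", 15, 25.0),
    ("B", 310, 18.0),  # beyond max_dim, leaves B with one record
]
usable, skipped = group_usable_records(records)
print(usable)   # {'A': [(10, 30.0), (40, 35.0)]}
print(skipped)  # ['B']
```

A lactation whose records all lie beyond `max_dim` appears in neither result in this sketch, since nothing is left to group after filtering.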