lactationcurve.characteristics.test_interval_method

ICAR 305-day yield calculation — Test Interval Method.

This module implements the Test Interval Method described in ICAR guidelines (Procedure 2, Section 2: Computing of Accumulated Lactation Yield) to compute total 305-day milk yield from test-day data.

Approach

  • Start segment: The first test-day yield is held constant from calving (DIM=0) to the first test day.
  • Intermediate segments: Trapezoidal rule between consecutive test days.
  • End segment: The last test-day yield is held constant from the last test day through DIM=305. The interval length is computed as 306 - DIM, so 306 acts as an exclusive upper bound for day counting.
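
The three segments can be checked by hand on a small lactation. The following is a minimal, pandas-free sketch rather than the module's own code: the function name `interval_method_total` and the test-day values are hypothetical, and it assumes the records are already sorted by DIM and all lie at or below day 305.

```python
def interval_method_total(points):
    """Sum the three segment contributions for one lactation.

    points: list of (dim, milk_kg) tuples, sorted by DIM, all with DIM <= 305.
    """
    if len(points) < 2:
        raise ValueError("need at least two test days")
    first_dim, first_milk = points[0]
    last_dim, last_milk = points[-1]

    # Start segment: first test-day yield applied to every day since calving.
    start = first_dim * first_milk

    # Intermediate segments: interval length times the mean of the two yields.
    middle = sum(
        (dim_b - dim_a) * (milk_a + milk_b) / 2
        for (dim_a, milk_a), (dim_b, milk_b) in zip(points, points[1:])
    )

    # End segment: last test-day yield applied up to the exclusive bound 306.
    end = (306 - last_dim) * last_milk

    return start + middle + end


# Hypothetical test days: (DIM, kg milk)
example = [(5, 20.0), (105, 30.0), (205, 20.0)]
total = interval_method_total(example)
print(total)  # 100 + 2500 + 2500 + 2020 = 7120.0
```

The end segment contributes 101 days here (306 - 205), which is how day 305 itself ends up included in the total.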

Column Flexibility

The function accepts several column-name aliases, matched case-insensitively. If no Test Id column is found, it creates a TestId column filled with a default value. Recognized aliases:

  • Days in Milk: ["daysinmilk", "dim", "testday"]
  • Milk Yield: ["milkingyield", "testdaymilkyield", "milkyield", "yield"]
  • Test Id: ["animalid", "testid", "id"]
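
To make the matching rule concrete, here is a small sketch of a case-insensitive alias lookup. The alias lists are the ones given above; the helper name `resolve_column` and the example column names are hypothetical, and the sketch works on a plain list of names rather than a DataFrame.

```python
ALIASES = {
    "DaysInMilk": ["daysinmilk", "dim", "testday"],
    "MilkingYield": ["milkingyield", "testdaymilkyield", "milkyield", "yield"],
    "TestId": ["animalid", "testid", "id"],
}


def resolve_column(columns, logical_name, override=None):
    """Return the actual column name for a logical column, or None."""
    # Map lowercase names back to the names as they appear in the data.
    lookup = {name.lower(): name for name in columns}
    if override:
        # An explicit override is looked up on its own; aliases are not tried.
        return lookup.get(override.lower())
    for alias in ALIASES[logical_name]:
        if alias in lookup:
            return lookup[alias]
    return None


columns = ["DIM", "MilkYield", "ID", "TestId"]
resolved = {name: resolve_column(columns, name) for name in ALIASES}
print(resolved)
# {'DaysInMilk': 'DIM', 'MilkingYield': 'MilkYield', 'TestId': 'TestId'}
```

Aliases are tried in list order, so when both "ID" and "TestId" are present, "TestId" wins because "testid" comes before "id" in the list.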

Returns a DataFrame with columns: ["TestId", "Total305Yield"].

Notes

  • Units: DIM in days, milk yield in kg.
  • Records with DIM > 305 are excluded prior to computation.

Author: Meike van Leerdam, Date: 07-31-2025

import pandas as pd


def test_interval_method(
    df, days_in_milk_col=None, milking_yield_col=None, test_id_col=None, default_test_id=1
):
    """Compute 305-day total milk yield using the ICAR Test Interval Method.

    The method applies:
    - Constant projection of the first test-day yield from calving to the first test day,
    - Trapezoidal integration between consecutive test days,
    - Constant projection of the last test-day yield from the last test day to DIM=305.

    Args:
        df (pd.DataFrame): Input DataFrame with at least DaysInMilk, MilkingYield,
            and (optionally) TestId columns (names can be provided via arguments
            or matched via known aliases, case-insensitive).
        days_in_milk_col (str | None): Optional column name override for DaysInMilk.
        milking_yield_col (str | None): Optional column name override for MilkingYield.
        test_id_col (str | None): Optional column name override for TestId.
        default_test_id (Any): If TestId is missing, a `TestId` column is created
            with this value (on a copy; the caller's DataFrame is not modified).

    Returns:
        pd.DataFrame: Two-column DataFrame with
            - "TestId": identifier per lactation,
            - "Total305Yield": computed total milk yield over 305 days.

    Raises:
        ValueError: If required columns (DaysInMilk or MilkingYield) cannot be found.

    Notes:
        - Records with DIM > 305 are dropped before computation.
        - At least two remaining data points per TestId are required for trapezoidal
          integration; otherwise the lactation is skipped and a message is printed.
    """
    result = []

    # Accepted aliases for each logical column (matched case-insensitively)
    aliases = {
        "DaysInMilk": ["daysinmilk", "dim", "testday"],
        "MilkingYield": ["milkingyield", "testdaymilkyield", "milkyield", "yield"],
        "TestId": ["animalid", "testid", "id"],
    }

    # Map lowercase column names to the actual column names in df
    col_lookup = {str(col).lower(): col for col in df.columns}

    def get_col_name(override, possible_names):
        """Return a matching actual column name from `df`, or `None` if not found.

        Args:
            override (str | None): Explicit column name provided by the user.
            possible_names (list[str]): List of acceptable aliases (lowercase).

        Returns:
            str | None: The actual column name present in `df`, or `None` if no match.
        """
        if override:
            return col_lookup.get(override.lower())
        for name in possible_names:
            if name in col_lookup:
                return col_lookup[name]
        return None

    # Resolve columns
    dim_col = get_col_name(days_in_milk_col, aliases["DaysInMilk"])
    if dim_col is None:
        raise ValueError("No DaysInMilk column found in DataFrame.")

    my_col = get_col_name(milking_yield_col, aliases["MilkingYield"])
    if my_col is None:
        raise ValueError("No MilkingYield column found in DataFrame.")

    id_col = get_col_name(test_id_col, aliases["TestId"])
    if id_col is None:
        # No TestId column: treat the input as a single lactation.
        # assign() returns a new DataFrame, so the caller's df is left untouched.
        id_col = "TestId"
        df = df.assign(**{id_col: default_test_id})

    # Drop records where DIM > 305
    df = df[df[dim_col] <= 305]

    # Iterate over each lactation
    for lactation in df[id_col].unique():
        # Select this lactation's records, sorted by DIM ascending
        lactation_df = df[df[id_col] == lactation].sort_values(by=dim_col)

        if len(lactation_df) < 2:
            print(f"Skipping TestId {lactation}: not enough data points for interpolation.")
            continue

        dim = lactation_df[dim_col]
        milk = lactation_df[my_col]

        # Start contribution: first test-day yield held from calving to the first test day
        start_yield = dim.iloc[0] * milk.iloc[0]

        # End contribution: last test-day yield held through day 305 (306 is exclusive)
        end_yield = (306 - dim.iloc[-1]) * milk.iloc[-1]

        # Intermediate contributions: one trapezoid per pair of consecutive test days.
        # diff() and shift() leave a NaN in the first position, which sum() skips.
        widths = dim.diff()
        mean_yields = (milk + milk.shift(1)) / 2
        total_intermediate = (widths * mean_yields).sum()

        total_yield = start_yield + total_intermediate + end_yield
        result.append((lactation, total_yield))

    return pd.DataFrame(result, columns=["TestId", "Total305Yield"])

def test_interval_method(df, days_in_milk_col=None, milking_yield_col=None, test_id_col=None, default_test_id=1):

Compute 305-day total milk yield using the ICAR Test Interval Method.

The method applies:

  • Constant projection of the first test-day yield from calving to the first test day,
  • Trapezoidal integration between consecutive test days,
  • Constant projection of the last test-day yield from the last test day to DIM=305.
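
Written as a single expression, the three steps add up as follows. This is a sketch in notation introduced here, not taken from the ICAR text: d_i is the DIM of the i-th test day, M_i its milk yield in kg, and n the number of test days with DIM at or below 305. It assumes the final interval is counted as 306 minus the last DIM, as this module's implementation does.

```latex
\[
MY_{305} \;=\; d_1 M_1
  \;+\; \sum_{i=1}^{n-1} (d_{i+1} - d_i)\,\frac{M_i + M_{i+1}}{2}
  \;+\; (306 - d_n)\, M_n
\]
```
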
Arguments:
  • df (pd.DataFrame): Input DataFrame with at least DaysInMilk, MilkingYield, and (optionally) TestId columns (names can be provided via arguments or matched via known aliases, case-insensitive).
  • days_in_milk_col (str | None): Optional column name override for DaysInMilk.
  • milking_yield_col (str | None): Optional column name override for MilkingYield.
  • test_id_col (str | None): Optional column name override for TestId.
  • default_test_id (Any): If TestId is missing, a new TestId column is created with this value.
Returns:

pd.DataFrame: Two-column DataFrame with

  • "TestId": identifier per lactation,
  • "Total305Yield": computed total milk yield over 305 days.

Raises:
  • ValueError: If required columns (DaysInMilk or MilkingYield) cannot be found.
Notes:
  • Records with DIM > 305 are dropped before computation.
  • At least two data points with DIM ≤ 305 per TestId are required for trapezoidal integration; otherwise the lactation is skipped and a message is printed.
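
The two rules in these notes interact: the DIM filter runs first, so a lactation can fall below the two-point minimum only because its later records were dropped. The sketch below illustrates this on plain tuples. The helper name `usable_lactations`, its parameters, and the records are hypothetical, and unlike the function documented here it silently omits skipped lactations.

```python
from collections import defaultdict


def usable_lactations(records, max_dim=305, min_points=2):
    """Group (test_id, dim, milk_kg) records and apply the two rules.

    Returns a dict mapping test_id to its (dim, milk_kg) pairs sorted by DIM.
    """
    grouped = defaultdict(list)
    for test_id, dim, milk_kg in records:
        if dim <= max_dim:  # rule 1: drop records past day 305
            grouped[test_id].append((dim, milk_kg))

    return {
        test_id: sorted(points)
        for test_id, points in grouped.items()
        if len(points) >= min_points  # rule 2: skip lactations with too few points
    }


records = [
    ("A", 20, 30.0), ("A", 320, 15.0), ("A", 150, 25.0),
    ("B", 40, 28.0), ("B", 310, 12.0),
]
kept = usable_lactations(records)
print(kept)  # {'A': [(20, 30.0), (150, 25.0)]}
```

Lactation "B" starts with two records but keeps only one after the day-305 filter, so it produces no row in the result.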