lactationcurve.characteristics.test_interval_method
ICAR 305-day yield calculation — Test Interval Method.
This module implements the Test Interval Method described in ICAR guidelines (Procedure 2, Section 2: Computing of Accumulated Lactation Yield) to compute total 305-day milk yield from test-day data.
Approach
- Start segment: Linear projection from calving (DIM=0) to the first test day.
- Intermediate segments: Trapezoidal rule between consecutive test days.
- End segment: Linear projection from the last test day to DIM=305 (exclusive upper bound 306 for day counting).
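The three segments can be traced by hand on a small lactation. The following is a minimal pure-Python sketch that mirrors the arithmetic in the module source; it is not the module's pandas implementation. The helper name `interval_method_total` and the sample values are illustrative assumptions. As in the source, each projection applies the nearest test-day yield to every day of the segment.

```python
def interval_method_total(records, end_day=306):
    """Accumulate yield from (dim, kg_per_day) pairs using the three segments."""
    # Sort by days in milk so consecutive pairs form the test intervals.
    records = sorted(records)
    first_dim, first_yield = records[0]
    last_dim, last_yield = records[-1]

    # Start segment: first test-day yield applied to every day since calving.
    start = first_dim * first_yield

    # Intermediate segments: trapezoid between each pair of consecutive tests.
    middle = sum(
        (d2 - d1) * (y1 + y2) / 2
        for (d1, y1), (d2, y2) in zip(records, records[1:])
    )

    # End segment: last test-day yield carried up to the exclusive bound 306.
    end = (end_day - last_dim) * last_yield

    return start + middle + end


total = interval_method_total([(10, 20.0), (40, 30.0), (70, 25.0)])
print(total)  # 200 + 750 + 825 + 5900 = 7675.0
```

With tests at DIM 10, 40 and 70, the end segment dominates the total, which shows how sensitive the result is to the last recorded yield when a lactation has few late test days.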
Column Flexibility
The function can accept various column name aliases (case-insensitive) and
optionally create a default TestId if missing. Recognized aliases:
- Days in Milk: ["daysinmilk", "dim", "testday"]
- Milk Yield: ["milkingyield", "testdaymilkyield", "milkyield", "yield"]
- Test Id: ["animalid", "testid", "id"]
Returns a DataFrame with columns: ["TestId", "Total305Yield"].
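The alias lookup can be illustrated without pandas. This sketch follows the resolution order visible in the module source: an explicit override is tried first, then the aliases in list order. The standalone helper `resolve_column` and the sample column names are assumptions for illustration; in the module the equivalent logic is a nested function.

```python
ALIASES = {
    "DaysInMilk": ["daysinmilk", "dim", "testday"],
    "MilkingYield": ["milkingyield", "testdaymilkyield", "milkyield", "yield"],
    "TestId": ["animalid", "testid", "id"],
}


def resolve_column(columns, logical_name, override=None):
    """Return the actual column name matching an override or alias, else None."""
    # Map lowercase names back to the names as they appear in the data.
    lookup = {name.lower(): name for name in columns}
    if override:
        return lookup.get(override.lower())
    for alias in ALIASES[logical_name]:
        if alias in lookup:
            return lookup[alias]
    return None


columns = ["AnimalID", "DIM", "MilkYield"]
resolved = {name: resolve_column(columns, name) for name in ALIASES}
print(resolved)
# {'DaysInMilk': 'DIM', 'MilkingYield': 'MilkYield', 'TestId': 'AnimalID'}
```

Note that an override which does not match any column yields `None` rather than falling back to the aliases, so a mistyped override surfaces as a missing column.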
Notes
- Units: DIM in days, milk yield in kg.
- Records with DIM > 305 are excluded prior to computation.
Author: Meike van Leerdam, Date: 07-31-2025
```python
"""
ICAR 305-day yield calculation — Test Interval Method.

This module implements the **Test Interval Method** described in ICAR guidelines
(Procedure 2, Section 2: Computing of Accumulated Lactation Yield) to compute
total **305-day milk yield** from test-day data.

Approach
--------
- **Start segment**: Linear projection from calving (DIM=0) to the first test day.
- **Intermediate segments**: **Trapezoidal rule** between consecutive test days.
- **End segment**: Linear projection from the last test day to DIM=305 (exclusive
  upper bound 306 for day counting).

Column Flexibility
------------------
The function can accept various column name aliases (case-insensitive) and
optionally create a default `TestId` if missing. Recognized aliases:

- Days in Milk: `["daysinmilk", "dim", "testday"]`
- Milk Yield: `["milkingyield", "testdaymilkyield", "milkyield", "yield"]`
- Test Id: `["animalid", "testid", "id"]`

Returns a DataFrame with columns: `["TestId", "Total305Yield"]`.

Notes
-----
- Units: DIM in days, milk yield in kg.
- Records with `DIM > 305` are excluded prior to computation.

Author: Meike van Leerdam, Date: 07-31-2025
"""

import pandas as pd


def test_interval_method(
    df, days_in_milk_col=None, milking_yield_col=None, test_id_col=None, default_test_id=1
):
    """Compute 305-day total milk yield using the ICAR Test Interval Method.

    The method applies:
    - Linear projection from calving to the first test day,
    - Trapezoidal integration between consecutive test days,
    - Linear projection from the last test day to DIM=305.

    Args:
        df (pd.DataFrame): Input DataFrame with at least DaysInMilk, MilkingYield,
            and (optionally) TestId columns (names can be provided via arguments
            or matched via known aliases, case-insensitive).
        days_in_milk_col (str | None): Optional column name override for DaysInMilk.
        milking_yield_col (str | None): Optional column name override for MilkingYield.
        test_id_col (str | None): Optional column name override for TestId.
        default_test_id (Any): If TestId is missing, a new `TestId` column is created
            with this value.

    Returns:
        pd.DataFrame: Two-column DataFrame with
            - "TestId": identifier per lactation,
            - "Total305Yield": computed total milk yield over 305 days.

    Raises:
        ValueError: If required columns (DaysInMilk or MilkingYield) cannot be found.

    Notes:
        - Records with DIM > 305 are dropped before computation.
        - At least two data points per TestId are required for trapezoidal integration;
          otherwise the lactation is skipped.
    """
    result = []

    # create a bit more flexibility in naming the columns and
    # when only one lactation is put in without a testid

    # Define accepted variations for each logical column
    # Accepted aliases (case-insensitive)
    aliases = {
        "DaysInMilk": ["daysinmilk", "dim", "testday"],
        "MilkingYield": ["milkingyield", "testdaymilkyield", "milkyield", "yield"],
        "TestId": ["animalid", "testid", "id"],
    }

    # Create a mapping from lowercase to actual column names
    col_lookup = {col.lower(): col for col in df.columns}

    def get_col_name(override, possible_names):
        """Return a matching actual column name from `df`, or `None` if not found.

        Args:
            override (str | None): Explicit column name provided by the user.
            possible_names (list[str]): List of acceptable aliases (lowercase).

        Returns:
            str | None: The actual column name present in `df`, or `None` if no match.
        """
        if override:
            return col_lookup.get(override.lower())
        for name in possible_names:
            if name in col_lookup:
                return col_lookup[name]
        return None

    # Resolve columns
    dim_col = get_col_name(days_in_milk_col, aliases["DaysInMilk"])
    if not dim_col:
        raise ValueError("No DaysInMilk column found in DataFrame.")

    my_col = get_col_name(milking_yield_col, aliases["MilkingYield"])
    if not my_col:
        raise ValueError("No MilkingYield column found in DataFrame.")

    id_col = get_col_name(test_id_col, aliases["TestId"])
    if not id_col:
        id_col = "TestId"
        df[id_col] = default_test_id

    # Filter out records where Day > 305
    df = df[df[dim_col] <= 305]

    # Iterate over each lactation
    for lactation in df[id_col].unique():
        lactation_df = df[df[id_col] == lactation].copy()

        # Sort by DaysInMilk ascending
        lactation_df.sort_values(by=dim_col, ascending=True, inplace=True)

        if len(lactation_df) < 2:
            print(f"Skipping TestId {lactation}: not enough data points for interpolation.")
            continue

        # Start and end points
        start = lactation_df.iloc[0]
        end = lactation_df.iloc[-1]

        # Start contribution
        MY0 = start[dim_col] * start[my_col]

        # End contribution
        MYend = (306 - end[dim_col]) * end[my_col]

        # Intermediate trapezoidal contributions
        lactation_df["width"] = lactation_df[dim_col].diff().shift(-1)
        lactation_df["avg_yield"] = (lactation_df[my_col] + lactation_df[my_col].shift(-1)) / 2
        lactation_df["trapezoid_area"] = lactation_df["width"] * lactation_df["avg_yield"]

        total_intermediate = lactation_df["trapezoid_area"].sum()

        total_yield = MY0 + total_intermediate + MYend
        result.append((lactation, total_yield))

    return pd.DataFrame(result, columns=["TestId", "Total305Yield"])
```
def test_interval_method(df, days_in_milk_col=None, milking_yield_col=None, test_id_col=None, default_test_id=1):
Compute 305-day total milk yield using the ICAR Test Interval Method.
The method applies:
- Linear projection from calving to the first test day,
- Trapezoidal integration between consecutive test days,
- Linear projection from the last test day to DIM=305.
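The three contributions can be written as a single expression. The notation is introduced here for illustration and is not taken from the ICAR text: $d_i$ and $y_i$ are the days in milk and milk yield (kg) of the $i$-th test day after sorting, and $n$ is the number of test days at or before day 305. The end bound of 306 follows the exclusive upper bound used in the module source.

```latex
Y_{305} = d_1\, y_1
        + \sum_{i=1}^{n-1} (d_{i+1} - d_i)\,\frac{y_i + y_{i+1}}{2}
        + (306 - d_n)\, y_n
```

The first and last terms are rectangles that hold the nearest test-day yield constant, while the sum is the trapezoidal rule over consecutive test days.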
Arguments:
- df (pd.DataFrame): Input DataFrame with at least DaysInMilk, MilkingYield, and (optionally) TestId columns (names can be provided via arguments or matched via known aliases, case-insensitive).
- days_in_milk_col (str | None): Optional column name override for DaysInMilk.
- milking_yield_col (str | None): Optional column name override for MilkingYield.
- test_id_col (str | None): Optional column name override for TestId.
- default_test_id (Any): If TestId is missing, a new TestId column is created with this value.
Returns:
pd.DataFrame: Two-column DataFrame with
- "TestId": identifier per lactation,
- "Total305Yield": computed total milk yield over 305 days.
Raises:
- ValueError: If required columns (DaysInMilk or MilkingYield) cannot be found.
Notes:
- Records with DIM > 305 are dropped before computation.
- At least two data points per TestId are required for trapezoidal integration; otherwise the lactation is skipped.
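The two rules in these notes interact: a lactation can fall below two data points only after its late records are dropped. This is a minimal pure-Python sketch of that ordering, assuming rows shaped as `(test_id, dim, kg)` tuples. The helper `eligible_lactations` is hypothetical and not part of the module.

```python
from collections import defaultdict


def eligible_lactations(rows, max_dim=305, min_points=2):
    """Group (test_id, dim, kg) rows per lactation, applying both rules."""
    grouped = defaultdict(list)
    for test_id, dim, kg in rows:
        if dim <= max_dim:  # records with DIM > 305 are dropped first
            grouped[test_id].append((dim, kg))
    # Keep only lactations that still have enough points for one trapezoid.
    return {
        test_id: sorted(points)
        for test_id, points in grouped.items()
        if len(points) >= min_points
    }


rows = [
    ("A", 12, 28.0), ("A", 45, 33.5), ("A", 320, 18.0),  # last record is past day 305
    ("B", 30, 25.0), ("B", 310, 15.0),                   # only one usable record remains
]
print(eligible_lactations(rows))
# {'A': [(12, 28.0), (45, 33.5)]}
```

Lactation "B" has two records in the input but is skipped, because only one of them lies within the first 305 days.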