lactationcurve.characteristics.method_test_interval
ICAR Test Interval Method for accumulated lactation yield.
Purpose
This module implements the ICAR Test Interval Method (Procedure 2, Section 2) to compute cumulative milk yield from test-day records.
Method Summary
Calculate total milk yield over a lactation by summing three parts:
- First test-day milk yield multiplied by the number of days between calving and the first test day.
- Trapezoidal integration between consecutive test days (the average of two consecutive yields multiplied by the interval length).
- Last test-day milk yield multiplied by the number of days from the last test day to the end of the calculation window.
Formula:
MY = I0*M1
   + I1*(M1 + M2)/2
   + I2*(M2 + M3)/2
   + ...
   + I(n-1)*(M(n-1) + Mn)/2
   + In*Mn
Where:
- MY: total milk yield over the lactation window.
- M1, M2, ..., Mn: milk yield measured in the 24 hours of each test day.
- I1, I2, ..., I(n-1): interval lengths (days) between consecutive test days.
- I0: days from lactation start (calving) to first test day.
- In: days from last test day to the end of the calculation window (e.g., DIM 305).
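To make the formula concrete, here is a minimal pure-Python sketch for a single lactation. The helper name `interval_yield` and the sample numbers are illustrative assumptions, not part of the package. The end segment mirrors the module source, which uses `max_dim + 1 - DIM_last` and so counts the last day of the window itself.

```python
def interval_yield(dims, yields, max_dim=305):
    """Sum the start, interval, and end segments for one lactation.

    Hypothetical helper for illustration; assumes `dims` is sorted
    ascending and aligned element by element with `yields`.
    """
    if len(dims) != len(yields) or len(dims) < 2:
        raise ValueError("need at least two aligned test-day records")

    # Start segment: I0 * M1, where I0 is the DIM of the first test day.
    start = dims[0] * yields[0]

    # Interval segments: I_k * (M_k + M_{k+1}) / 2 for consecutive test days.
    middle = sum(
        (dims[k + 1] - dims[k]) * (yields[k] + yields[k + 1]) / 2
        for k in range(len(dims) - 1)
    )

    # End segment: In * Mn. The "+ 1" follows the module source, which
    # includes day `max_dim` in the window.
    end = (max_dim + 1 - dims[-1]) * yields[-1]

    return start + middle + end


total = interval_yield([10, 40, 70], [30.0, 35.0, 28.0])
print(total)  # 300 + 975 + 945 + 6608 = 8828.0
```

With test days at DIM 10, 40, and 70, the end segment dominates the total (236 days at the last recorded yield), which shows how sensitive the result is to the final test-day record when the lactation is sparsely sampled.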
Key Entry Points
test_interval_method: Computes cumulative lactation milk yield per TestId using start, interval, and end segments.
Column Flexibility
The function accepts several case-insensitive column name aliases and can
create a default TestId if one is missing. Recognized aliases:
- Days in Milk: ["daysinmilk", "dim", "testday"]
- Milk Yield: ["milkingyield", "testdaymilkyield", "milkyield", "yield"]
- Test Id: ["animalid", "testid", "id"]
It is also possible to provide your own column names so the function can be applied to dataframes with different column naming conventions.
Returns a DataFrame with columns: ["TestId", "LactationMilkYield"].
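The alias matching described above could work roughly as follows. This is only a sketch: `resolve_column` is a hypothetical helper written for illustration, and the package's own `standardize_lactation_columns` may resolve names differently (for example in how it treats underscores or whitespace).

```python
DIM_ALIASES = ["daysinmilk", "dim", "testday"]
YIELD_ALIASES = ["milkingyield", "testdaymilkyield", "milkyield", "yield"]
ID_ALIASES = ["animalid", "testid", "id"]


def resolve_column(columns, aliases, override=None):
    """Return the actual column name for an override or a known alias.

    Hypothetical helper: returns None when no alias matches, so the
    caller can decide whether to raise or create a default column.
    """
    if override is not None:
        if override not in columns:
            raise ValueError(f"column {override!r} not found")
        return override

    # Map lower-cased names back to the original spelling.
    lowered = {name.lower(): name for name in columns}
    for alias in aliases:
        if alias in lowered:
            return lowered[alias]
    return None


columns = ["AnimalID", "DIM", "MilkYield"]
print(resolve_column(columns, DIM_ALIASES))    # DIM
print(resolve_column(columns, YIELD_ALIASES))  # MilkYield
print(resolve_column(columns, ID_ALIASES))     # AnimalID
```

An explicit override takes precedence over alias matching, which matches the documented option of passing your own column names.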
Notes
- Units: DIM (days in milk) is measured in days, and milk yield can be in kg or lb. The output stays in the same unit as the input.
- Records with DIM > max_dim are excluded before computation.
- This method's main strength is its ease of use and simplicity.
- Its main disadvantage is that it does not account for the shape of the lactation curve, which can lead to underestimation of total yield, especially for lactations with few test days or irregular patterns. Outliers in test-day records can also have a large influence on the final result.
- When a full lactation is known (i.e., test days up to DIM 305), this method will give a higher cumulative milk yield than the sum of all test days, because of the way the start and end contributions are calculated.
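The last note can be checked numerically. The sketch below assumes daily records from DIM 1 to DIM 305 and uses the same segment arithmetic as the module source; `interval_yield` is a hypothetical stand-in and the constant yield of 20 is made-up data.

```python
def interval_yield(dims, yields, max_dim=305):
    """Start + trapezoid + end segments, as in the module source."""
    start = dims[0] * yields[0]
    middle = sum(
        (dims[k + 1] - dims[k]) * (yields[k] + yields[k + 1]) / 2
        for k in range(len(dims) - 1)
    )
    end = (max_dim + 1 - dims[-1]) * yields[-1]
    return start + middle + end


# One record per day for a full 305-day lactation, constant yield.
dims = list(range(1, 306))
yields = [20.0] * len(dims)

plain_sum = sum(yields)                      # 6100.0
method_total = interval_yield(dims, yields)  # 6120.0
print(method_total - plain_sum)              # 20.0
```

Under these assumptions the excess equals (M1 + Mn)/2: the trapezoids count the first and last records at half weight, while the start and end segments each add the full first and last yield once more.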
Reference
Sargent, F. D., V. H. Lytton, and O. G. Wall, Jr. 1968. Test interval method of calculating Dairy Herd Improvement Association records. J. Dairy Sci. 51:170.
Author: Meike van Leerdam
Date: 07-31-2025
Last update: 22-May-2026
"""
ICAR Test Interval Method for accumulated lactation yield.

Purpose
-------
This module implements the ICAR Test Interval Method (Procedure 2,
Section 2) to compute cumulative milk yield from test-day records.

Method Summary
--------------
Calculate total milk yield over a lactation by summing three parts:
    - First test-day milk yield multiplied by the number of days between
      calving and the first test day.
    - Trapezoidal integration between consecutive test days (the average of
      two consecutive yields multiplied by the interval length).
    - Last test-day milk yield multiplied by the number of days from the last
      test day to the end of the calculation window.

Formula:

    MY = I0*M1
       + I1*(M1 + M2)/2
       + I2*(M2 + M3)/2
       + ...
       + I(n-1)*(M(n-1) + Mn)/2
       + In*Mn

Where:
- ``MY``: total milk yield over the lactation window.
- ``M1, M2, ..., Mn``: milk yield measured in the 24 hours of each test day.
- ``I1, I2, ..., I(n-1)``: interval lengths (days) between consecutive test days.
- ``I0``: days from lactation start (calving) to first test day.
- ``In``: days from last test day to the end of the calculation window (e.g., DIM 305).


Key Entry Points
----------------
- ``test_interval_method``: Computes cumulative lactation milk yield per
  ``TestId`` using start, interval, and end segments.


Column Flexibility
------------------
The function accepts several case-insensitive column name aliases and can
create a default ``TestId`` if one is missing. Recognized aliases:

- Days in Milk: `["daysinmilk", "dim", "testday"]`
- Milk Yield: `["milkingyield", "testdaymilkyield", "milkyield", "yield"]`
- Test Id: `["animalid", "testid", "id"]`

It is also possible to provide your own column names so the function
can be applied to dataframes with different column naming conventions.

Returns a DataFrame with columns: ``["TestId", "LactationMilkYield"]``.

Notes
-----
- Units: DIM (days in milk) is measured in days,
  and milk yield can be in kg or lb. The
  output stays in the same unit as the input.
- Records with ``DIM > max_dim`` are excluded before computation.
- This method's main strength is its ease of use and simplicity.
- Its main disadvantage is that it does not account for the shape of the
  lactation curve, which can lead to underestimation of total yield,
  especially for lactations with few test days or irregular patterns.
  Outliers in test-day records can also have a large influence on the final
  result.
- When a full lactation is known (i.e., test days up to DIM 305),
  this method will give a higher cumulative milk yield than the sum of all
  test days, because of the way the start and end contributions are
  calculated.

Reference
---------
Sargent, F. D., V. H. Lytton, and O. G. Wall, Jr. 1968. Test interval method of
calculating Dairy Herd Improvement Association records. J. Dairy Sci. 51:170.

Author: Meike van Leerdam
Date: 07-31-2025
Last update: 22-May-2026
"""

import pandas as pd

from lactationcurve.preprocessing import standardize_lactation_columns


def test_interval_method(
    df: pd.DataFrame,
    days_in_milk_col: str | None = None,
    milking_yield_col: str | None = None,
    test_id_col: str | None = None,
    default_test_id: int = 0,
    max_dim: int = 305,
) -> pd.DataFrame:
    """Compute total lactation milk yield using the ICAR Test Interval Method.

    The method applies:
      - First test day milk yield from calving to the first test day,
      - Trapezoidal integration between consecutive test days,
      - Last test day milk yield from the last test day to DIM = max_dim (default = 305).

    Args:
        df (pd.DataFrame): Input DataFrame with at least DaysInMilk and
            MilkingYield columns, plus an optional TestId column. Column names
            can be provided explicitly or matched via known aliases.
        days_in_milk_col (str | None): Optional column name override for
            DaysInMilk.
        milking_yield_col (str | None): Optional column name override for
            MilkingYield.
        test_id_col (str | None): Optional column name override for TestId.
        default_test_id (int): Value used to create a default TestId column if
            one is missing.
        max_dim (int): Lactation length used to calculate cumulative
            production. The default is 305 days.
            Records with DIM > max_dim are excluded.

    Returns:
        pd.DataFrame: Two-column DataFrame with
            - "TestId": identifier per lactation,
            - "LactationMilkYield": computed total milk yield over the
              specified window.

    Raises:
        ValueError: If required columns (DaysInMilk or MilkingYield) cannot be found.

    Notes:
        - Records with DIM > max_dim are dropped before computation.
        - At least two data points per TestId are required for trapezoidal integration;
          otherwise the lactation is skipped.
    """

    # Standardize columns and filter DIM <= max_dim
    df = standardize_lactation_columns(
        df,
        days_in_milk_col=days_in_milk_col,
        milking_yield_col=milking_yield_col,
        test_id_col=test_id_col,
        default_test_id=default_test_id,
        max_dim=max_dim,
    )

    result = []

    # Iterate over each lactation
    for lactation in df["TestId"].unique():
        lactation_df = pd.DataFrame(df[df["TestId"] == lactation])

        # Sort by DaysInMilk ascending
        lactation_df.sort_values(by="DaysInMilk", ascending=True, inplace=True)

        if len(lactation_df) < 2:
            print(f"Skipping TestId {lactation}: not enough data points for interpolation.")
            continue

        # Start and end points
        start = lactation_df.iloc[0]
        end = lactation_df.iloc[-1]

        # Start contribution
        MY0 = start["DaysInMilk"] * start["MilkingYield"]

        # End contribution
        MYend = (max_dim + 1 - end["DaysInMilk"]) * end["MilkingYield"]

        # Intermediate trapezoidal contributions
        lactation_df["width"] = lactation_df["DaysInMilk"].diff().shift(-1)
        lactation_df["avg_yield"] = (
            lactation_df["MilkingYield"] + lactation_df["MilkingYield"].shift(-1)
        ) / 2
        lactation_df["trapezoid_area"] = lactation_df["width"] * lactation_df["avg_yield"]

        total_intermediate = lactation_df["trapezoid_area"].sum()

        total_yield = MY0 + total_intermediate + MYend
        result.append((lactation, total_yield))

    return pd.DataFrame(result, columns=pd.Index(["TestId", "LactationMilkYield"]))


# to prevent pytest from trying to collect this function as a test
test_interval_method.__test__ = False
Compute total lactation milk yield using the ICAR Test Interval Method.
The method applies:
- First test day milk yield from calving to the first test day,
- Trapezoidal integration between consecutive test days,
- Last test day milk yield from the last test day to DIM = max_dim (default = 305).
Arguments:
- df (pd.DataFrame): Input DataFrame with at least DaysInMilk and MilkingYield columns, plus an optional TestId column. Column names can be provided explicitly or matched via known aliases.
- days_in_milk_col (str | None): Optional column name override for DaysInMilk.
- milking_yield_col (str | None): Optional column name override for MilkingYield.
- test_id_col (str | None): Optional column name override for TestId.
- default_test_id (int): Value used to create a default TestId column if one is missing.
- max_dim (int): Lactation length used to calculate cumulative production. The default is 305 days. Records with DIM > max_dim are excluded.
Returns:
pd.DataFrame: Two-column DataFrame with
- "TestId": identifier per lactation,
- "LactationMilkYield": computed total milk yield over the specified window.
Raises:
- ValueError: If required columns (DaysInMilk or MilkingYield) cannot be found.
Notes:
- Records with DIM > max_dim are dropped before computation.
- At least two data points per TestId are required for trapezoidal integration; otherwise the lactation is skipped.
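The two notes above (dropping records beyond max_dim and skipping lactations with fewer than two records) can be illustrated with a pure-Python stand-in for the pandas implementation. The function name `yields_per_test_id`, the tuple layout, and the sample records are assumptions made for this sketch only.

```python
from collections import defaultdict


def yields_per_test_id(records, max_dim=305):
    """Compute the interval-method total per TestId.

    `records` is an iterable of (test_id, dim, milk_yield) tuples.
    Hypothetical helper, not the package's API.
    """
    grouped = defaultdict(list)
    for test_id, dim, milk in records:
        if dim <= max_dim:  # records beyond the window are dropped
            grouped[test_id].append((dim, milk))

    totals = {}
    for test_id, rows in grouped.items():
        if len(rows) < 2:
            continue  # not enough points for a trapezoid: skip this lactation
        rows.sort()  # ascending by DIM
        dims = [d for d, _ in rows]
        milk = [m for _, m in rows]

        start = dims[0] * milk[0]
        middle = sum(
            (dims[k + 1] - dims[k]) * (milk[k] + milk[k + 1]) / 2
            for k in range(len(dims) - 1)
        )
        end = (max_dim + 1 - dims[-1]) * milk[-1]
        totals[test_id] = start + middle + end
    return totals


records = [
    ("A", 40, 35.0), ("A", 10, 30.0), ("A", 70, 28.0),
    ("A", 320, 15.0),  # beyond max_dim, dropped before computation
    ("B", 25, 32.0),   # single record, lactation skipped
]
print(yields_per_test_id(records))  # {'A': 8828.0}
```

Lactation B is absent from the result rather than reported as zero, which mirrors the documented behavior of skipping lactations with too few records.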