Multi-Year Averaging in Transfer Pricing: Methods, Formulas, and Best Practices
Borys Ulanenko
CEO of ArmsLength AI
TL;DR - Key Takeaways
Multi-year data can improve comparability analysis—OECD supports examining multiple years where useful, but doesn't prescribe a fixed number and notes that multi-year data ≠ multi-year averaging.
Three methods: simple average (equal weight), weighted average (by the relevant profit level indicator (PLI) base, a common practitioner technique), and period-weighted (emphasizes recent years).
Jurisdictional rules vary: India's Rule 10CA allows weighted multi-year treatment under specific conditions; Canada requires year-by-year testing in audits (no averaging to substantiate prices).
Use the same time period for tested party and all comparables—US regulations indicate data for the same years 'ordinarily must be considered' (subject to availability).
Don't automatically exclude loss years—analyze whether they reflect market conditions or indicate non-comparability. Multi-year context helps, but doesn't always require averaging.
Quick Answer: Which Multi-Year Method Should You Use?
Multi-year data can improve comparability analysis by providing context on cycles, trends, and anomalies. When averaging is appropriate, the three main approaches are:
Simple average — Equal weight to all years; use when years are similarly sized and relevant
Weighted average — Weight by relevant PLI base (revenue, costs, or assets); a common practitioner technique
Period-weighted average — Assigns higher weight to recent years; useful when conditions are changing
Key caution: Using multiple-year data does not automatically mean using multi-year averages. OECD and several jurisdictions emphasize that each year should be arm's length, and multi-year data serves primarily for comparability context. If averaging is appropriate in your jurisdiction and circumstances, a 3-year weighted average (using the relevant PLI base) is common practice. Always use the same period for the tested party and all comparables.
Why Multi-Year Analysis Matters
Single-year data can mislead. A company might have an unusually good or bad year due to factors unrelated to transfer pricing—product launches, one-time expenses, economic shocks, or simple volatility. Multi-year analysis addresses these issues by:
Smoothing volatility — Averaging out year-to-year fluctuations that don't reflect underlying profitability
Capturing business cycles — Ensuring your analysis covers both peaks and troughs
Identifying anomalies — Revealing whether a tested party's result is part of a trend or an outlier
Improving comparability — Showing whether comparables faced similar conditions over time
OECD Position (¶3.75-3.79): Examining multiple-year data is often useful for understanding comparability (e.g., business cycles, trends), but is not systematically required. Importantly, OECD notes that using multiple-year data does not necessarily mean using multi-year averages—context and comparability analysis are the primary goals.
When Multi-Year Data May Be Appropriate
The US Treasury Regulations §1.482-1(f)(2)(iii) contemplate—and in some applications ordinarily require—consideration of multiple-year data. Circumstances where multi-year analysis may be appropriate include:
Industry is cyclical — Capturing full cycles may require more than one year
Product life cycles affect profitability — Start-up losses, maturity, decline
Market share strategies — Initial losses expected to yield later gains
Applying CPM — Certain CPM-related provisions expect consistent year coverage
The regulations also indicate that if multiple-year data is used for comparables, data for the tested party for the same years "ordinarily must be considered" (subject to data availability).
The Three Averaging Methods
1. Simple Average (Arithmetic Mean)
A simple average treats all years equally—just add the values and divide by the number of years.
Formula: Simple Average = (Year1 PLI + Year2 PLI + Year3 PLI) ÷ 3
Example: Comparable's operating margins for 2022-2024 are 4%, 6%, 5%
Simple Average = (4% + 6% + 5%) / 3 = 5.0%
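As a minimal sketch, the same arithmetic can be written in Python. The function name `simple_average` is an illustrative choice, not part of any transfer pricing library, and the inputs are the three margins from the example above.

```python
from statistics import mean


def simple_average(yearly_plis):
    """Arithmetic mean of yearly PLI values; every year counts equally."""
    if not yearly_plis:
        raise ValueError("at least one yearly PLI is required")
    return mean(yearly_plis)


# Operating margins for 2022-2024 from the example above, in percent.
margins_pct = [4.0, 6.0, 5.0]
print(f"{simple_average(margins_pct):.1f}%")  # prints 5.0%
```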
When to use:
Stable industries without significant growth or decline
Years with similar operational scale
When regulations specify arithmetic mean
Small datasets where weighting adds complexity without benefit
Limitations:
Ignores differences in business size between years
A large year has the same influence as a small year
May not reflect trends (e.g., consistently improving margins)
2. Weighted Average (Common Practitioner Technique)
A weighted average gives more influence to larger years by weighting each year's result by the relevant PLI base—revenue for operating margin, costs for cost-plus, assets for ROA, etc. This is a commonly used technique in transfer pricing practice.
Formula: Weighted Average Margin = Sum of Operating Profits (all years) ÷ Sum of Revenues (all years)
Because profits and revenues are pooled before dividing, a high-revenue year has more influence on the result than the simple average would give it.
India's Approach (Rule 10CA): Under the range concept, comparables may be reflected on a weighted multi-year basis where comparable transactions exist across years. The weights depend on the method/PLI base used—sales for resale-price-type PLIs, costs for cost-plus, etc. This is effectively a weighted average computation using the relevant base.
When to use:
Common choice for many analyses
When years vary in size (growth, acquisitions, market changes)
When your PLI denominator varies significantly year-to-year
Ensure weights match your PLI base (revenue, costs, or assets)
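The pooled formula can be sketched in Python as follows. It uses CompA's figures from Example A later in this article (EUR millions), and the helper name `pooled_pli` is hypothetical. The sketch also shows that dividing total profit by total revenue is the same as taking a revenue-weighted mean of the yearly margins, and that it differs from the simple average when revenue grows.

```python
def pooled_pli(profits, bases):
    """Weighted average PLI: total profit divided by total PLI base."""
    if len(profits) != len(bases):
        raise ValueError("profits and bases must cover the same years")
    total_base = sum(bases)
    if total_base == 0:
        raise ValueError("total PLI base must be non-zero")
    return sum(profits) / total_base


# CompA from Example A, EUR millions, 2022-2024.
operating_profit = [2.5, 3.6, 4.3]
revenue = [50.0, 60.0, 65.0]

yearly_margins = [op / rev for op, rev in zip(operating_profit, revenue)]
simple = sum(yearly_margins) / len(yearly_margins)
weighted = pooled_pli(operating_profit, revenue)
# Same result expressed as a revenue-weighted mean of the yearly margins.
weighted_alt = sum(m * r for m, r in zip(yearly_margins, revenue)) / sum(revenue)

print(f"simple: {simple:.2%}, weighted: {weighted:.2%}")
# prints simple: 5.87%, weighted: 5.94%
```

For a cost-plus or return-on-assets PLI, the same function would take costs or assets as `bases` instead of revenue.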
3. Period-Weighted Average (Trend-Weighted)
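A period-weighted average assigns explicit weights to emphasize certain years, typically giving more weight to recent years when conditions are changing.
Formula: Period-Weighted Average = (Weight1 × Year1 PLI) + (Weight2 × Year2 PLI) + (Weight3 × Year3 PLI), where the weights sum to 100%
When to use:
Rapidly changing industries where recent performance is more relevant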
Post-restructuring periods where older data is less comparable
Recovery from economic shocks (e.g., post-pandemic)
When regulations allow and you can justify the weighting
Document carefully. Period-weighting is more subjective than the other methods, and tax authorities may challenge weights that look arbitrary. Schemes often seen for 3-year periods are 30%-30%-40% and 20%-30%-50% (oldest to most recent year); record why the chosen weights fit the facts.
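A short Python sketch of period-weighting, under the assumption that weights are listed oldest year first and must sum to 1. The function name `period_weighted_average` is illustrative. The margins are those of Example C later in this article, run through both weighting schemes mentioned above to show how sensitive the result is to the choice of weights.

```python
import math


def period_weighted_average(yearly_plis, weights):
    """Sum of each year's PLI times its explicit weight (weights sum to 1)."""
    if len(yearly_plis) != len(weights):
        raise ValueError("need exactly one weight per year")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    return sum(p * w for p, w in zip(yearly_plis, weights))


# Margins for 2022-2024 from Example C, in percent, oldest year first.
margins_pct = [2.0, 4.0, 6.0]
results = {
    "30-30-40": period_weighted_average(margins_pct, [0.30, 0.30, 0.40]),
    "20-30-50": period_weighted_average(margins_pct, [0.20, 0.30, 0.50]),
}
for scheme, value in results.items():
    print(f"{scheme}: {value:.1f}%")
```

With these inputs the two schemes give roughly 4.2% and 4.6%, against a simple average of 4.0%, which is why the choice of weights needs its own justification.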
Choosing Your Timeframe: 3 Years vs. 5 Years
| Factor | 3-Year Analysis | 5-Year Analysis |
| --- | --- | --- |
| Recency | More current; reflects recent market conditions | Includes older, potentially outdated data |
| Cycle coverage | May miss full business cycle | Better captures peaks and troughs |
| Data availability | Usually complete for most comparables | Some companies may lack 5-year history |
| Regulatory acceptance | Standard in most jurisdictions | Used for cyclical industries |
| Relevance | Better if business recently changed | Better for stable, mature operations |
Decision Framework
Use 3 years when:
Industry is relatively stable
Tested party's business model hasn't changed
Most comparables have complete 3-year data
Jurisdiction uses 3-year default (e.g., India)
Use 5 years when:
Industry is highly cyclical (commodities, semiconductors, shipping)
You need to capture a full economic cycle
Analysis period includes a major shock (e.g., COVID-19)
Longer view is needed to smooth volatility
OECD Guidance (¶3.75, ¶3.79): "It would not be appropriate to set prescriptive guidance as to the number of years." Importantly, OECD also notes that using multiple-year data "does not necessarily imply the use of multiple-year averages": the data serves comparability context, not automatic averaging.
Jurisdictional Requirements
| Jurisdiction | Multi-Year Approach | Key Rules |
| --- | --- | --- |
| OECD | No fixed requirement; multi-year data ≠ multi-year averaging | Use multiple years where helpful for comparability; no prescriptive number of years (¶3.75-3.79) |
| United States | May be appropriate; 3 years common in CPM practice | Same years for tested party and comparables "ordinarily must be considered" (subject to availability) |
| India | Rule 10CA: weighted multi-year under specific conditions | Up to current + prior 2 years; weights depend on method/base; 35th-65th percentile range when ≥6 entries |
| Germany | 3 years common in practice | Follows OECD principles; multi-year data may improve comparability (not a firm rule) |
| United Kingdom | 3 years common in practice | HMRC focuses on year-by-year arm's length results; multi-year data for context |
| Australia | 3-5 years common in practice | Longer periods may be used for cyclical industries (not a firm rule) |
| China | Arithmetic or weighted averages allowed | SAT allows year-by-year or multiple-year basis, but testing/adjustment is year-to-year |
| Canada | Year-by-year in audits; no averaging to substantiate | CRA: multi-year averaging may appear in APA analysis, but still verified annually |
| Switzerland | Year-by-year principle emphasized | Court trend (periodicity): each year must be arm's length; multi-year averaging doesn't excuse individual years |
Canada Alert: CRA's TPM-16 explicitly states that taxpayers "should not average results over multiple years for the purpose of substantiating their transfer prices." In audits, prices/margins must be determined year-by-year. Multi-year averaging may appear in APA analysis, but results are still verified annually. Use multi-year data to select comparables and understand trends—not to justify your pricing.
Excel Formulas for Multi-Year Calculations
Data Setup
| Column | Description |
| --- | --- |
| A | Comparable Name |
| B | Year |
| C | Revenue |
| D | Operating Profit |
| E | Operating Margin (D/C) |
Simple Average (3-Year)
```excel
=AVERAGE(E2:E4)
```
For margins in cells E2 through E4.
Weighted Average (Revenue-Weighted)
```excel
=SUM(D2:D4)/SUM(C2:C4)
```
Sum of operating profits divided by sum of revenues.
Alternatively, using SUMPRODUCT:
```excel
=SUMPRODUCT(C2:C4, E2:E4)/SUM(C2:C4)
```
Where E2:E4 contains margins and C2:C4 contains revenues.
Period-Weighted Average
For weights in cells G2:G4 (e.g., 0.30, 0.30, 0.40):
```excel
=SUMPRODUCT(E2:E4, G2:G4)
```
Or directly:
```excel
=0.30*E2 + 0.30*E3 + 0.40*E4
```
Handling Missing Data
To average only non-blank cells:
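```excel
=AVERAGE(E2:E4)
```
Excel's AVERAGE function ignores blank cells, so an empty margin cell is simply skipped. If column E is computed as D/C and the revenue cell is empty, however, the cell holds #DIV/0! rather than a blank and AVERAGE returns that error. Leave such margin cells empty, or compute them as IFERROR(D2/C2, "") so the missing year is skipped.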
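To exclude zeros (if zero isn't a valid result) while keeping negative margins:
```excel
=AVERAGEIF(E2:E4,"<>0")
```
A ">0" criterion would also drop loss years, which should not be excluded automatically.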
Handling Edge Cases
Missing Data Years
Options when a comparable lacks data for one year:
| Approach | When to Use | Risk |
| --- | --- | --- |
| Exclude comparable | Plenty of alternatives; consistency is critical | Reduces sample size |
| Use available years | Comparable is highly relevant; 2 of 3 years available | May distort average |
| Interpolate | Rarely appropriate | Speculation; hard to defend |
Note on India: The "6 data points" threshold in Rule 10CA determines whether the 35th-65th percentile arm's length range applies—it's not a permission rule for using partial data. Whether a comparable can be included with fewer years depends on whether it has comparable transactions across those years.
Best practice: Document your approach. If you include a comparable with partial data, note: "CompX lacked 2022 data; 2-year average used (2021, 2023)." Consider whether excluding the comparable entirely would be more defensible.
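One way to keep the calculation and the documentation note in step is to generate both together. The Python sketch below is a hypothetical helper, not a prescribed procedure: the name `average_available_years`, the `min_years` threshold, and CompX's margin values are all assumptions made for illustration.

```python
def average_available_years(name, margins_by_year, min_years=2):
    """Average the years that have data and return a note for the file.

    margins_by_year maps year -> margin, with None marking a missing year.
    min_years is an illustrative policy threshold, not a regulatory rule.
    """
    available = {y: m for y, m in margins_by_year.items() if m is not None}
    missing = sorted(y for y in margins_by_year if y not in available)
    if len(available) < min_years:
        return None, f"{name} excluded: only {len(available)} year(s) of data"
    average = sum(available.values()) / len(available)
    used = ", ".join(str(y) for y in sorted(available))
    note = f"{name}: {len(available)}-year average used ({used})"
    if missing:
        lacked = ", ".join(str(y) for y in missing)
        note = f"{name} lacked {lacked} data; {len(available)}-year average used ({used})"
    return average, note


# Hypothetical margins in percent; 2022 is missing for CompX.
avg, note = average_available_years("CompX", {2021: 4.0, 2022: None, 2023: 5.0})
print(avg, "|", note)
```

Returning `None` for a comparable with too few years makes the exclusion explicit instead of silently averaging a single data point.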
Loss Years
Don't automatically exclude—analyze the cause and decide how to use the data.
Multi-year context helps understand whether a loss reflects market conditions or indicates non-comparability. A company with margins of [-2%, 5%, 6%] averages to +3%—but remember that using multi-year data doesn't always require multi-year averaging.
When to investigate:
Persistent losses (3+ consecutive years) may indicate the comparable isn't viable
Loss coincides with extraordinary events affecting only that company
Loss magnitude is dramatically different from industry peers
When including loss years may be appropriate:
Industry-wide loss year (e.g., 2020 pandemic impact)
Loss explained by normal business cycle
One bad year among otherwise healthy results
OECD (¶3.76): Examining prior years—including loss years—helps determine whether a tested party's loss "was part of a history of losses" or due to prior economic conditions. The goal is understanding comparability, not necessarily averaging the numbers together.
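The "3+ consecutive years" screen above can be sketched as a simple check in Python. The function names and the default threshold are illustrative assumptions; a flag here only means the comparable deserves a closer look, not that it should be dropped.

```python
def longest_loss_streak(margins):
    """Length of the longest run of consecutive negative margins."""
    longest = current = 0
    for margin in margins:
        current = current + 1 if margin < 0 else 0
        longest = max(longest, current)
    return longest


def flag_persistent_losses(margins, threshold=3):
    """True when losses persist for `threshold` or more consecutive years."""
    return longest_loss_streak(margins) >= threshold


# Margins in percent, oldest year first. The first series is the one
# discussed above; the second is hypothetical.
one_bad_year = [-2.0, 5.0, 6.0]
persistent = [-1.0, -3.0, -2.0]

print(flag_persistent_losses(one_bad_year))  # prints False
print(flag_persistent_losses(persistent))    # prints True
```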
Business Discontinuities
If a comparable underwent a major change (merger, divestiture, restructuring):
Don't average across the discontinuity — Pre-merger and post-merger are different businesses
Use only comparable periods — If testing 2024, use only post-change data for that comparable
Consider exclusion — If the change makes multi-year data unreliable, drop the comparable
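A minimal Python sketch of restricting a comparable to post-change years. The helper name, the choice to drop the change year itself by default, and the sample margins are all hypothetical; whether the year of the change is usable depends on the facts.

```python
def post_change_years(margins_by_year, change_year, include_change_year=False):
    """Keep only the years after a discontinuity such as a merger.

    By default the change year itself is dropped, on the assumption that
    it mixes pre-change and post-change operations.
    """
    first_year = change_year if include_change_year else change_year + 1
    return {y: m for y, m in margins_by_year.items() if y >= first_year}


# Hypothetical comparable that completed a merger during 2022 (margins in %).
history = {2020: 3.0, 2021: 3.5, 2022: 1.0, 2023: 5.0, 2024: 5.5}
usable = post_change_years(history, change_year=2022)
print(sorted(usable))  # prints [2023, 2024]
```

If too few post-change years remain for the chosen analysis period, that is a signal to consider dropping the comparable.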
Practical Examples
Example A: 3-Year Weighted Average for Distributors
Scenario: European distributor benchmarked against 4 comparables
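| Comparable | 2022 OP | 2022 Rev | 2023 OP | 2023 Rev | 2024 OP | 2024 Rev | 3-Yr Weighted Avg |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CompA | €2.5M | €50M | €3.6M | €60M | €4.3M | €65M | 5.94% |
| CompB | €3.2M | €80M | €3.8M | €85M | €4.1M | €90M | 4.35% |
| CompC | €1.6M | €40M | €2.0M | €45M | €2.5M | €50M | 4.52% |
| CompD | €3.5M | €70M | €1.8M | €72M | €4.2M | €75M | 4.38% |

OP = operating profit; Rev = revenue.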
Calculation (CompA):
Total OP = 2.5 + 3.6 + 4.3 = €10.4M
Total Rev = 50 + 60 + 65 = €175M
Weighted Avg = 10.4 / 175 = 5.94%
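Interquartile range (results sorted): 4.35%, 4.38%, 4.52%, 5.94%
Q1 = ~4.37% (midpoint of the two lowest results, computed from unrounded values)
Median = ~4.45%
Q3 = ~5.23% (midpoint of the two highest results)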
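The whole of Example A can be reproduced with a short Python sketch. It assumes the quartiles above are taken as the medians of the lower and upper halves of the sorted results, which is what the Q3 figure of about 5.23% implies. Interpolating conventions, such as Excel's QUARTILE.INC or Python's `statistics.quantiles` with `method="inclusive"`, give a noticeably different Q3 on four data points, so the convention should be stated in the study. Q1 works out to about 4.365%, so it rounds to 4.36% or 4.37% depending on whether the inputs are rounded first.

```python
from statistics import median, quantiles

# (operating profits, revenues) for 2022-2024 in EUR millions, from the table.
comparables = {
    "CompA": ([2.5, 3.6, 4.3], [50, 60, 65]),
    "CompB": ([3.2, 3.8, 4.1], [80, 85, 90]),
    "CompC": ([1.6, 2.0, 2.5], [40, 45, 50]),
    "CompD": ([3.5, 1.8, 4.2], [70, 72, 75]),
}

# 3-year revenue-weighted margin per comparable, in percent.
weighted = {
    name: 100 * sum(op) / sum(rev) for name, (op, rev) in comparables.items()
}
values = sorted(weighted.values())


def half_medians(sorted_values):
    """Q1 and Q3 as medians of the lower and upper halves of the data."""
    half = len(sorted_values) // 2
    return median(sorted_values[:half]), median(sorted_values[-half:])


q1, q3 = half_medians(values)
q1_inc, _, q3_inc = quantiles(values, n=4, method="inclusive")

for name, value in weighted.items():
    print(f"{name}: {value:.2f}%")
print(f"half-median quartiles: Q1={q1:.2f}%  median={median(values):.2f}%  Q3={q3:.2f}%")
print(f"interpolated (inclusive): Q1={q1_inc:.2f}%  Q3={q3_inc:.2f}%")
```

On this data the interpolated Q3 is about 4.87% rather than 5.23%, so the upper end of the range moves by roughly a third of a percentage point purely because of the quartile convention.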
Example B: 5-Year Analysis for Cyclical Manufacturer
Scenario: Semiconductor manufacturer with volatile margins
Analysis: The 3-year average captures the recent "boom" period but misses the cycle trough. The 5-year average includes the full cycle, providing a more realistic long-term expectation.
Recommendation: For cyclical industries, the 5-year average better represents sustainable profitability—unless you can document why current conditions warrant the shorter view.
Example C: Period-Weighted for Post-Restructuring
Scenario: Company restructured in early 2022; margins are trending upward
| Year | Margin | Weight | Contribution |
| --- | --- | --- | --- |
| 2022 | 2% | 20% | 0.4% |
| 2023 | 4% | 30% | 1.2% |
| 2024 | 6% | 50% | 3.0% |
Period-Weighted Average: 0.4 + 1.2 + 3.0 = 4.6%
Simple Average: (2 + 4 + 6) / 3 = 4.0%
Justification: "Given the 2022 restructuring, earlier results are less representative of current operations. We applied period-weighting (20%-30%-50%) to reflect improved post-restructuring performance while maintaining multi-year perspective."
Frequently Asked Questions
Should I use 3 years or 5 years of data?
Start with 3 years as your default. It is the most common period and usually sufficient. Extend to 5 years if the industry is highly cyclical, you need to capture a full economic cycle, or you're smoothing a significant anomaly (like COVID-19 impacts). The OECD explicitly avoids prescribing a number, so use whatever period genuinely improves your analysis and document why.
What's the difference between simple and weighted averages?
A simple average treats all years equally (just average the percentages). A weighted average gives more influence to larger years by weighting each year's result by the relevant PLI base (revenue for margin, costs for cost-plus, etc.). For a comparable that grew significantly, weighted averaging better reflects its overall profitability. The weights should match your PLI denominator to maintain consistency.
Can I use different time periods for different comparables?
No. Consistency is critical. Use the same analysis period for all comparables and the tested party. Mixing periods (e.g., 3-year for some comparables, 5-year for others) creates an incoherent analysis and will draw audit scrutiny. The US regulations indicate that when multiple-year data is used for comparables, tested party data for the same years "ordinarily must be considered" (subject to availability).
How do I handle a comparable missing one year of data?
Options: (1) Exclude the comparable if you have sufficient alternatives; this is often the cleanest approach. (2) Use available years if the comparable is otherwise highly relevant; calculate its average from available data and document carefully. Avoid interpolating or estimating the missing year, which is rarely appropriate and hard to defend. Note that India's "6 data points" threshold relates to which range calculation method applies (35th-65th percentile), not a blanket permission for partial data.
Should loss years be excluded from the analysis?
Don't automatically exclude; analyze the cause. If multiple comparables had losses in the same year, that likely reflects market conditions and is relevant comparability data. However, OECD emphasizes that using multiple-year data doesn't necessarily mean averaging the numbers together; the goal is understanding comparability. Exclude only if the loss indicates deeper non-comparability (e.g., company going bankrupt, unique one-time event). Some tax authorities (such as Canada's CRA) explicitly note that OECD doesn't promote averaging multiple years of numerical data to establish comparability.
Does Canada allow multi-year averaging?
In audits, no. CRA's TPM-16 explicitly states that taxpayers "should not average results over multiple years" for substantiating transfer prices—prices/margins must be determined year-by-year. However, CRA notes that in an APA context, averaging historical outcomes may form part of the analysis—though results are still verified annually. You can use multi-year data to select comparables and understand trends, but don't use averaging to justify your pricing in an audit context.
How do I document my multi-year approach?
Include in your benchmarking study: (1) Period selected and rationale (e.g., "3 years to capture post-COVID recovery"); (2) Averaging method used (simple, weighted, or period-weighted) and why; (3) Edge cases handled (missing data, losses, discontinuities); (4) Jurisdictional compliance confirmation that your approach meets local requirements. Be prepared to explain why your method improves the analysis.