📌 Key Takeaways
Supplier brochures promise tight tolerances, but your corrugator needs proof that those claims hold at line speed across shifts and production lots.
- Capability is What Mills Actually Hold, Not What They Advertise: A hold window maps the measured min–max range a supplier has demonstrated over 30–60 days of recent production, proven through COAs, lab certificates, and control chart data rather than marketing specs.
- Method-Named Tolerances Eliminate Acceptance Disputes: Requiring suppliers to cite exact test standards (ISO 536 for basis weight, ISO 287 for moisture, ISO 2758 for burst) with units and recent evidence at the RFQ stage prevents post-award debates over measurement methods.
- Pilot-at-Speed is the Final Gate Before Volume: Acceptance requires stable runs at your target line speed with all critical properties within tolerance, retained samples that match supplier COAs, and no chronic stoppages—test conditions, not courtesy samples, reveal real capability.
- Combine Mill Capability with Exporter Reliability: An integration score weighted 60% on manufacturer evidence (quality systems, lab accreditation, hold windows) and 40% on operational predictors (on-time delivery, documentation accuracy, lane coverage) de-risks international awards before you scale.
- Normalize to Door Before Ranking, Then Stress-Test Freight: Converting mixed-Incoterm quotes to a common landed basis and modeling +30–50% freight surges identifies the flip point where your second choice becomes cheaper, protecting you from volatile ocean rates.
Evidence-backed qualification replaces promises with verifiable performance data.
This framework is written for procurement teams, quality assurance managers, and plant operations staff in the containerboard industry, and it sets the stage for the detailed implementation guidance that follows.
Supplier brochures arrive on your desk promising tight tolerances across basis weight, moisture, and burst strength. The quotes look competitive. The certifications appear legitimate. You award the contract, ramp to volume, and then the problems start: moisture swings trigger curl at the corrugator, edge crush values drift below spec, and your QA team spends more time investigating non-conformances than approving shipments.
The root cause isn’t dishonesty—it’s the gap between what a mill claims it can produce and what it can consistently hold under real operating conditions. Marketing specifications describe ideal-state capability. Your production line demands repeatable performance at speed, across shifts, and over months of supply.
This article provides a systematic framework for closing that gap. You’ll learn how to build a Mill Capability Matrix that transforms supplier promises into verifiable “hold windows,” use pilot runs to stress-test those claims at line speed, and combine manufacturer evidence with exporter reliability to de-risk international awards before you scale.
Capability ≠ Claims—What a “Hold Window” Really Means

Process capability describes what a mill can repeatedly deliver under normal operating conditions, not what it achieves during a carefully controlled sample run. The distinction matters because your corrugator doesn’t run under laboratory conditions—it runs at target speed, across three shifts, with material from different production lots.
A “hold window” is the measured range a supplier has demonstrated it can maintain for a specific property over recent production. For basis weight, a hold window might be 128–132 gsm against a nominal target of 130 gsm. For moisture content, it could be 7.0–8.0% when your specification calls for 7.5% ±0.5%. These windows emerge from analysis of certificates of analysis (COAs), lab test results, and control chart data—not from the mill’s technical data sheet.
The capability matrix makes hold windows explicit and comparable. Instead of evaluating suppliers based on whether they say “yes” to your specification, you evaluate them based on documented evidence of what they’ve actually held over the past 90 days. This shift from claims to evidence changes the qualification conversation entirely.
The Statistical Foundation: Stability Before Capability
Capability thinking borrows from statistical process control, and two principles deserve attention. First, stability must precede capability assessment. A property tracked on a control chart should show only common-cause variation—the normal, predictable fluctuation inherent to the process. Unstable processes that exhibit special-cause variation (unexpected shifts or spikes) cannot credibly “hold” any band, because the underlying process isn’t predictable.
Second, capability indices like Cp (process capability) and Cpk (capability adjusted for centering) offer a useful shorthand for how tightly a process fits within specified tolerances. Higher indices suggest less variation and better ability to stay within limits. A Cpk of 1.33 or above is generally considered capable in manufacturing. While you don’t need to calculate these indices yourself, asking suppliers for their process performance data—particularly for critical properties like moisture, basis weight, and strength parameters—gives you insight into their operational control. Treat these indices as supporting indicators, not as substitutes for a pilot run at your actual line speed. Mills that track and share this data signal a commitment to continuous improvement and process discipline.
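To make the Cp/Cpk shorthand concrete, here is a minimal sketch of how the two indices are computed from a series of readings against spec limits. The basis-weight readings and the 130 gsm ±2% spec are hypothetical illustration values, not a recommendation.

```python
import statistics

def process_capability(samples, lsl, usl):
    """Compute Cp and Cpk for measurements against lower/upper spec limits.

    Cp  = (USL - LSL) / (6 * sigma)               -- potential capability, ignores centering
    Cpk = min(USL - mean, mean - LSL) / (3 * sigma) -- penalizes off-center processes
    """
    mean = statistics.mean(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mean, mean - lsl) / (3 * sigma)
    return cp, cpk

# Hypothetical basis-weight readings (gsm) against a 130 gsm +/-2% spec (127.4-132.6)
readings = [129.8, 130.1, 130.4, 129.9, 130.2, 130.0, 130.3, 129.7]
cp, cpk = process_capability(readings, lsl=127.4, usl=132.6)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")  # Cpk >= 1.33 is the usual "capable" benchmark
```

When Cpk is well below Cp, the process is tight but off-center; when both are low, the variation itself is the problem. Either pattern is a useful prompt for the pilot-run conversation.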
Build the Mill Capability Matrix (Template + How-to)

The matrix is a structured comparison tool that forces evidence-based evaluation. Each row represents a critical property; each column captures the data needed to verify the mill’s hold window and assess risk.
Core Matrix Columns:
Property: The measurable characteristic you’re evaluating (e.g., basis weight, moisture content, Cobb value, short-span compression test).
Method ID (ISO/TAPPI): The specific test standard that defines how the property is measured. Always require method-named tolerances. ISO 536 for basis weight, ISO 287 for moisture, ISO 2758 for bursting strength, and ISO 9895 for short-span compression (SCT) are common examples. For board grades, you’ll add ISO 3037 for corrugated board edge crush test (ECT) and ISO 13821 for box compression test (BCT).
Target Band & Tolerance: Your required specification, expressed with both a nominal target and an acceptable range. For instance, “130 gsm ±2%” is clearer and more enforceable than “approximately 130 gsm.”
Hold Window (Min–Max from Recent Data): The actual range the mill has demonstrated over the past 30–60 days of production, extracted from COAs and lab certificates. This is the most revealing column—it shows you what the mill does hold, not what it says it can hold.
Evidence Type: The documentation that supports the hold window claim. Look for ISO/IEC 17025 accredited lab certificates, timestamped COAs with batch traceability, and where relevant, photos of calibrated test instruments. Generic statements or undated reports carry little weight.
Pilot Result at Speed: The measured performance during your qualification trial, conducted at your target line speed with production-representative material. This column remains empty until you run the pilot—it’s your final verification gate before award.
Control Signal: A brief note on the mill’s statistical process control status for this property. Are control charts maintained? Is the process “in control” (showing only common-cause variation), or should you “investigate trend” due to recent drift toward specification limits? This isn’t always available in initial discussions, but mills with mature quality systems will have this data readily accessible.
Last-90-Day OTIF / Doc Accuracy: On-time, in-full delivery percentage and documentation accuracy rate for recent shipments to other customers. This captures the exporter’s operational reliability, not just the mill’s production capability. Even perfect product quality becomes a liability if shipments consistently roll over or documentation triggers customs delays.
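The hold-window column can be populated mechanically once COA data is in a structured form. The sketch below shows one way to extract a min–max range from a COA series while discarding results older than the 30–60 day window; the record fields and dates are hypothetical.

```python
from datetime import date, timedelta

def hold_window(coa_records, prop, days=60, today=date(2024, 6, 30)):
    """Return the demonstrated (min, max) for a property from a COA series,
    keeping only results dated within the last `days` days."""
    cutoff = today - timedelta(days=days)
    values = [r["value"] for r in coa_records
              if r["property"] == prop and r["date"] >= cutoff]
    if not values:
        return None  # no recent evidence -> no credible hold window
    return min(values), max(values)

# Hypothetical COA series for basis weight (gsm)
coas = [
    {"property": "basis_weight", "value": 129.1, "date": date(2024, 5, 3)},
    {"property": "basis_weight", "value": 131.6, "date": date(2024, 5, 20)},
    {"property": "basis_weight", "value": 130.4, "date": date(2024, 6, 14)},
    {"property": "basis_weight", "value": 134.0, "date": date(2024, 1, 10)},  # stale, excluded
]
print(hold_window(coas, "basis_weight"))  # → (129.1, 131.6)
```

Note the stale January record is excluded by design: a hold window built on old data would report the mill's history, not its current capability.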
For a detailed example of how these columns work together in practice, refer to the kraft paper manufacturers capability matrix, which demonstrates the full framework applied to kraft paper grades.
Populate the Property Set Based on Your Application:
Containerboard procurement typically focuses on a core set of properties. For linerboard, prioritize basis weight (ISO 536), moisture content (ISO 287), bursting strength (ISO 2758), Cobb water absorption (ISO 535), and SCT (ISO 9895). If you’re evaluating corrugated board suppliers, add ECT (ISO 3037) and BCT (ISO 13821) to assess the structural performance of the finished board.
For fluting or medium grades, ring crush test (RCT per TAPPI T 822) and Concora medium test (CMT per TAPPI T 809) become more relevant than burst. The specific property set should mirror the critical-to-quality characteristics that drive performance on your converting line—not a generic checklist.
Illustrative Matrix Example
The table below demonstrates the structure and format. All numeric values are placeholders for illustration purposes only and do not represent actual specifications or performance endorsements.
| Property | Method ID | Target & Tolerance | Hold Window (Last 30–60d) | Evidence Type | Pilot @ Target Speed | Control Signal | Last-90d OTIF / Doc Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Basis weight (GSM) | ISO 536 / TAPPI T 410 | 170 ±2% | 168–172 | COA series + lab cert | PASS (stable) | In control | 97% / 99% |
| Moisture (%) | ISO 287 | 7.0 ±1.0 | 6.6–7.4 | COA + instrument photo | PASS | In control | 97% / 99% |
| Cobb (60 s, g/m²) | ISO 535 | 28 ±4 | 26–31 | 17025 lab cert | PASS | In control | 97% / 99% |
| Burst (kPa) | ISO 2758 | ≥2100 | 2150–2350 | COA run chart | PASS | In control | 97% / 99% |
| SCT (kN/m) | ISO 9895 | ≥2.2 | 2.3–2.6 | Lab cert (7–14 d) | PASS | In control | 97% / 99% |
| RCT (N) | TAPPI T 822 | ≥150 | 155–165 | COA match | PASS | In control | 97% / 99% |
Include control-chart snapshots or status notes where available so reviewers can judge stability over time, not just point values from a single batch.
Populate It with Evidence (RFQ Pack → COAs → Lab Certs)
The matrix is only useful if you populate it with real data. This happens in stages, starting at the RFQ phase and continuing through to pilot qualification.
Stage 1: RFQ Evidence Pack
At the quote stage, require suppliers to submit method-named results for each property in your specification. A valid response includes the test method (e.g., “ISO 536”), the measured result with units (e.g., “130.2 gsm”), the tolerance or uncertainty (e.g., “±1.2 gsm”), and the date of the test. Request both a 30–60 day COA series to show ongoing production consistency and recent lab certificates (ideally from tests conducted within the past 7–14 days) to demonstrate current capability rather than historical best-case performance.
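A completeness check on each submitted result can be automated. This sketch validates that an RFQ entry carries all four required elements (method, result with units, tolerance, test date); the field names and sample entries are hypothetical choices, not a prescribed schema.

```python
REQUIRED_FIELDS = {"method", "value", "units", "tolerance", "test_date"}

def validate_rfq_result(entry):
    """Return a list of problems with one supplier-submitted test result.
    An empty list means the entry is method-named and complete."""
    problems = [f"missing {f}" for f in sorted(REQUIRED_FIELDS - entry.keys())]
    if "method" in entry and not str(entry["method"]).strip():
        problems.append("method must name a standard, e.g. 'ISO 536'")
    return problems

# Hypothetical submissions: one complete, one that would trigger a follow-up
good = {"method": "ISO 536", "value": 130.2, "units": "gsm",
        "tolerance": "±1.2 gsm", "test_date": "2024-06-21"}
bad = {"method": "ISO 536", "value": 130.2}

print(validate_rfq_result(good))  # → []
print(validate_rfq_result(bad))   # flags the missing units, tolerance, and test date
```

Running every RFQ response through a check like this before shortlisting keeps the "missing evidence" conversation at the quote stage, where it belongs.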
For guidance on structuring these requirements, see build a passport for your material: what to include in a kraft paper RFQ evidence pack. The principles transfer directly to containerboard procurement.
The RFQ phase is also where you confirm the supplier’s lab is ISO/IEC 17025 accredited for the relevant test methods. Accreditation ensures the lab follows documented procedures, maintains calibrated equipment, and participates in proficiency testing. Where relevant, request photos of test instruments with visible calibration stickers. This level of transparency signals a supplier that understands evidence-based qualification.
Stage 2: Normalize Before Shortlisting
Different suppliers may report the same property using different units or reference conditions. One supplier quotes Cobb60 at 30 g/m², another reports Cobb180 at 45 g/m². You can’t directly compare these without normalizing to the same test duration. Similarly, basis weight might be reported at different moisture content assumptions (oven-dry vs. air-dry).
Before filling in the hold window column, convert all quoted values to a common basis. For moisture-dependent properties, pick a standard reference condition (typically 23°C and 50% relative humidity per ISO 187) and adjust accordingly. This step prevents the false precision of comparing incomparable numbers.
Stage 3: Verify Control Signals
For shortlisted suppliers, request evidence of statistical process control. This might be a control chart showing the past month’s basis weight or moisture readings, a process capability study (Cp/Cpk analysis), or a simple run chart demonstrating variation around the target. Mills that actively monitor and respond to process variation are far less likely to deliver off-spec material than those relying on periodic batch testing alone.
The evidence-first capability matrix framework explains how to interpret and weight this type of control evidence when making final sourcing decisions.
Pilot-First: What “PASS” Looks Like

The capability matrix tells you what the mill has historically held. A pilot run tells you whether they can hold it under your conditions—at your line speed, with your material handling practices, and in your facility’s ambient environment.
Design the Pilot as a Stress Test, Not a Courtesy Sample
A pilot run should replicate production conditions as closely as possible. Run the trial at your target line speed, not at a conservative “safe” speed. Process multiple reels from different production lots if available, to capture lot-to-lot variation. Measure critical properties at multiple points across the web and through the reel to detect profile variation or internal moisture gradients.
Set measurable pass/fail thresholds before the trial starts. For moisture content, you might define “PASS” as all test points falling within 7.0–8.0% with no individual reading exceeding 8.2%. For ECT or BCT, define both the minimum acceptable value and the maximum allowable coefficient of variation (CV%) across samples.
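A pre-agreed threshold rule like the moisture example above can be expressed as a small verdict function. This is a sketch under assumed rules (a target band, a hard individual-reading cap, and an allowed count of out-of-band readings); set your own thresholds before the trial.

```python
def pilot_verdict(readings, band, hard_limit, max_out_of_band=0):
    """PASS/RETEST verdict for one property from pilot-run readings.

    Hypothetical rules -- define yours before the trial starts:
      - any single reading beyond `hard_limit` fails outright
      - more than `max_out_of_band` readings outside `band` fails
    """
    lo, hi = band
    if any(r > hard_limit for r in readings):
        return "RETEST"
    out_of_band = sum(1 for r in readings if not (lo <= r <= hi))
    return "PASS" if out_of_band <= max_out_of_band else "RETEST"

# Hypothetical moisture readings (%) against a 7.0-8.0 band with an 8.2 hard cap
moisture = [7.3, 7.6, 7.8, 7.2, 7.9, 7.5]
print(pilot_verdict(moisture, band=(7.0, 8.0), hard_limit=8.2))  # → PASS
```

The value of encoding the rule is that the verdict is computed, not negotiated, after the trial.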
Acceptance Criteria Framework
A complete pilot acceptance uses three verification layers:
Measurable thresholds for critical properties. Moisture, profile variation (caliper across the width), and structural strength (ECT/BCT) are typically the highest-risk properties. Define both the acceptable range and the maximum number of out-of-spec readings you’ll tolerate before triggering a retest.
Sample retention and COA match. Retain physical samples from the pilot run with full chain of custody—tag each sample with reel number, test date, operator ID, and instrument identification. Compare the lab-tested values on the supplier’s COA against your own incoming inspection results within the method’s expected repeatability and reproducibility limits. A PASS requires close agreement—typically within the combined measurement uncertainty of both labs. Significant discrepancies indicate either poor lab-to-lab correlation or, worse, selective reporting by the supplier.
Stable run at target speed. The material should run cleanly through your process without excessive breaks, flagging, or adjustment stops. Track the number of unplanned interruptions and compare against your baseline for approved suppliers. A pilot that requires constant babysitting won’t scale to volume production.
For a more detailed breakdown of how to structure these criteria, the principles in QA acceptance without debate: set method-named tolerances and attach results at quote time apply directly to containerboard pilot qualification.
Tie Acceptance to AQL by Defect Class
Not all defects carry the same business risk. Adopt a tiered acceptance sampling approach that applies different acceptable quality limits (AQL) based on defect severity.
For critical defects—those that cause immediate line stops or product recalls, such as contamination, severe moisture imbalance, or structural failure—use an AQL of 0.0. This means zero tolerance; any occurrence triggers a RETEST or disqualification.
For major defects like moisture drift outside the acceptable window or strength values below spec but above a safety threshold, an AQL of 1.0–2.5 is standard. This allows for occasional process variation without automatic rejection, but requires corrective action.
For cosmetic or minor issues that don’t affect functionality—edge quality, minor surface marks, or slight caliper variation within tolerance—an AQL of 4.0–6.5 provides reasonable flexibility while still maintaining quality expectations.
This tiered approach prevents over-rejection of otherwise capable suppliers while maintaining strict control on defects that truly matter to your operation.
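The tiered AQL logic above can be sketched as a simple lot-disposition check. This is a deliberately simplified illustration: it compares observed percent defective per class against its AQL, whereas a real sampling plan would draw sample sizes and acceptance numbers from ANSI/ASQ Z1.4 (or ISO 2859-1) tables.

```python
# AQL by defect class (percent defective); critical defects are zero-tolerance.
AQL = {"critical": 0.0, "major": 2.5, "minor": 6.5}

def lot_disposition(sample_size, defects):
    """Simplified tiered check: any critical defect rejects the lot outright;
    major/minor classes reject only if they exceed their AQL percentage."""
    for cls, count in defects.items():
        if cls == "critical" and count > 0:
            return "REJECT (critical defect found)"
        if 100.0 * count / sample_size > AQL[cls]:
            return f"REJECT ({cls} defects exceed AQL {AQL[cls]}%)"
    return "ACCEPT"

# Hypothetical inspection of 200 samples: 3 major (1.5%) and 9 minor (4.5%) defects
print(lot_disposition(200, {"critical": 0, "major": 3, "minor": 9}))  # → ACCEPT
```

Note how the same nine minor defects that pass here would reject the lot under a flat 2.5% limit—exactly the over-rejection the tiered approach avoids.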
Score, Compare, and De-risk
The matrix now contains verified hold windows and pilot results for each shortlisted supplier. The final step is to combine this manufacturer capability data with exporter reliability metrics to create a holistic risk assessment.

The Integration Score: Combining Capability and Reliability
Even a mill with perfect process capability becomes a high-risk supplier if the exporter handling logistics consistently misses shipment dates, submits inaccurate documentation, or lacks coverage in your key shipping lanes. The Integration Score combines both dimensions into a single, comparable metric.
A practical framework weights manufacturer capability at 60% and exporter reliability at 40%, though you can adjust these proportions based on your specific risk profile.
On the manufacturer capability side (60% of total), score each supplier 0–5 on:
- Product capability: Can they hold all critical properties within your spec windows? Full points require meeting all thresholds during the pilot.
- Quality systems: Do they maintain ISO 9001 or equivalent? Is there evidence of active statistical process control?
- Lab proof: Is their lab ISO/IEC 17025 accredited for your required test methods?
On the exporter reliability side (40% of total), score 0–5 on:
- On-time delivery rate: What percentage of shipments arrive within the agreed window over the past 90 days?
- Documentation accuracy: How often do their bills of lading, certificates of origin, and fumigation certificates match the actual shipment without errors or omissions?
- Rollover history: How frequently do their bookings get bumped to a later sailing due to overbooking or space issues?
- Lane coverage: Do they have established service in your required trade corridors, or will you be their first shipment on this route?
Combine the weighted scores across both categories to calculate a total Integration Score out of 100 points. Set a minimum threshold—typically 70 or higher—for award readiness, with a critical gate: no individual quality or compliance category can score zero. A supplier might have excellent logistics but fail product capability, or vice versa. Neither scenario is acceptable. Scores below the threshold route to remediation plans or smaller trial orders before full qualification.
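The 60/40 weighting and zero-gate can be computed as follows. The criterion names and the example 0–5 scores are hypothetical; the arithmetic simply normalizes each side to 100 points and applies the weights.

```python
WEIGHTS = {"manufacturer": 0.6, "exporter": 0.4}

def integration_score(mfg_scores, exp_scores, threshold=70):
    """Weighted 60/40 Integration Score on a 100-point scale.
    Each criterion is scored 0-5; a zero on any criterion fails the gate."""
    def side(scores, weight):
        # normalize this side's 0-5 scores to 100 points, then apply its weight
        return weight * 100 * sum(scores.values()) / (5 * len(scores))
    total = (side(mfg_scores, WEIGHTS["manufacturer"])
             + side(exp_scores, WEIGHTS["exporter"]))
    gated = 0 in mfg_scores.values() or 0 in exp_scores.values()
    return round(total, 1), (total >= threshold and not gated)

# Hypothetical shortlisted supplier
mfg = {"product_capability": 4, "quality_systems": 5, "lab_proof": 4}
exp = {"otif": 4, "doc_accuracy": 5, "rollover": 3, "lane_coverage": 4}
score, award_ready = integration_score(mfg, exp)
print(score, award_ready)  # → 84.0 True
```

A supplier scoring 5s on logistics but 0 on product capability would post a respectable total yet still fail the gate, which is the behavior the framework intends.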
For a detailed operational scorecard that breaks down these exporter reliability predictors, refer to the kraft paper supplier reliability scorecard. The framework applies equally well to containerboard sourcing.
The integration model is detailed in Integration Playbook: how manufacturer evidence + exporter reliability de-risk international kraft paper supply, which demonstrates how to weight and combine these factors for final award decisions.
From Matrix to Award (Price in Context)
The capability matrix and integration score tell you who can reliably deliver. But price still matters—it just needs to be evaluated in context, not in isolation.
Normalize All Quotes to To-Door Comparability
You’ll receive quotes on mixed Incoterms: one supplier offers EXW (mill gate), another FOB (port of loading), a third CIF (cost, insurance, freight to your destination port). These aren’t directly comparable. The “cheapest” quote on paper might become the most expensive once you add freight, insurance, inland haulage, customs duties, and handling fees.
Convert every quote to the same delivery basis—ideally DDP (delivered duty paid) to your facility—before ranking suppliers. For quotes that don’t include freight, use recent freight rate data for the relevant lane and add a conservative buffer for potential surcharges. Factor in insurance (typically 0.3–0.5% of cargo value), customs duties (varies by HS code and origin), and any inland transport from the port to your warehouse.
This normalization process often changes the winner. A supplier quoting $50/ton more FOB might be $30/ton cheaper to-door once you account for the other supplier’s higher inland freight costs or less favorable shipping lane.
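The normalization arithmetic can be sketched as a small landed-cost function. This is a simplified model (duty applied to CIF value, flat insurance percentage) with hypothetical per-ton figures; plug in your own lane rates, duty percentages by HS code, and handling fees.

```python
def landed_cost_per_ton(q):
    """Approximate to-door (DDP-equivalent) cost per ton for one quote.
    Simplified: insurance ~0.3-0.5% of cargo value, duty applied to CIF."""
    insurance = q["base_price"] * q.get("insurance_pct", 0.004)
    cif = q["base_price"] + q.get("ocean_freight", 0.0) + insurance
    duty = cif * q.get("duty_pct", 0.0)
    return round(cif + duty + q.get("inland_freight", 0.0) + q.get("handling", 0.0), 2)

# Hypothetical quotes: A is pricier ex-works but lands cheaper to-door
supplier_a = {"base_price": 620.0, "ocean_freight": 55.0, "inland_freight": 40.0}
supplier_b = {"base_price": 585.0, "ocean_freight": 95.0, "inland_freight": 70.0}
print(landed_cost_per_ton(supplier_a))  # → 717.48
print(landed_cost_per_ton(supplier_b))  # → 752.34
```

In this hypothetical, the supplier quoting $35/ton more at the mill gate is roughly $35/ton cheaper to-door—the kind of reversal that only appears after normalization.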
Stress-Test Freight Scenarios That Can Flip Winners
Ocean freight rates are volatile. A quote that assumes $1,200 per container might face reality at $1,800 if carrier surcharges increase or you book during peak season. Before finalizing an award, stress-test your normalized landed costs at +30%, +40%, and +50% freight surges.
Identify the “flip point”—the freight rate increase that would make your second-choice supplier cheaper than your first choice. If that flip point is uncomfortably low (say, a 15% increase flips the ranking), you’re vulnerable to freight volatility. Consider dual-sourcing or negotiating a freight cost-sharing mechanism with the supplier to manage this risk.
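The flip point has a closed-form answer if you assume only the ocean freight component scales with a surge. A minimal sketch with hypothetical per-ton figures:

```python
def flip_point(primary, secondary):
    """Freight surge (as a fraction) at which the secondary supplier's landed
    cost drops below the primary's. Assumes only ocean freight scales:
    landed(s) = base + freight * (1 + s); solve for s where the two lines cross."""
    gap = (secondary["base"] + secondary["freight"]) - (primary["base"] + primary["freight"])
    freight_delta = primary["freight"] - secondary["freight"]
    if freight_delta <= 0:
        return None  # a freight surge alone can never flip the ranking
    return gap / freight_delta

# Hypothetical: primary is $20/ton cheaper today but carries $40/ton more freight exposure
primary = {"base": 700.0, "freight": 100.0}
secondary = {"base": 760.0, "freight": 60.0}
s = flip_point(primary, secondary)
print(f"Flip point: {s:.0%} freight surge")  # → Flip point: 50% freight surge
```

Here a 50% surge flips the ranking—comfortably above the +30–50% stress band. If the computed flip point landed at, say, 15%, the award would be exposed to routine rate volatility.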
The approach is detailed in freight scenarios that flip kraft paper supplier rankings: when ocean rates change the winner.
Governance—Version Control & Update Cadence
The capability matrix is not a one-time qualification artifact. It’s a living document that requires regular updates to remain useful.
Set a Monthly Data Refresh Cycle
At a minimum, update the “Hold Window” and “Last-90-Day OTIF / Doc Accuracy” columns monthly for all active and approved suppliers. This keeps your view of supplier performance current and allows you to detect degradation trends before they cause line disruptions.
For critical or high-volume suppliers, consider quarterly re-verification pilots—smaller-scale runs that confirm the mill is maintaining its qualified capability. These aren’t full re-qualifications, but they provide an early warning if something has changed in the mill’s process or material sourcing.
Note Calibration Dates and Archive Prior Windows
Every time you update the matrix, timestamp the data and archive the previous version. This creates a historical record that’s invaluable when investigating quality issues or defending sourcing decisions during audits.
Track and document calibration logs with specific dates, instrument IDs, and any corrective actions taken. If a disputed result arises, knowing that the supplier’s burst tester was last calibrated six months ago (outside the typical three-month interval) provides crucial context for resolving the disagreement.
Align with Audit and CAPA Loops
Integrate the matrix review into your broader supplier audit and corrective action (CAPA) processes. When a supplier fails to meet a hold window during routine shipments, the matrix becomes the baseline for defining the corrective action target. For example, if the matrix shows the mill historically held moisture at 7.2–7.8% but recent shipments trend toward 8.2–8.6%, the CAPA should address what changed and how the mill will return to its qualified window. Log the root cause analysis, corrective actions, and verification results, then update the matrix accordingly.
The mill-first rule for evaluating kraft paper vendors: why process capability predicts supply performance explains how to use capability data as the foundation for ongoing supplier management, not just initial qualification.
Frequently Asked Questions
How do I handle suppliers that refuse to share hold window data or recent COAs?
Treat missing evidence as a red flag, not a negotiable point. A supplier unwilling to share recent test data either lacks confidence in their process control or doesn’t maintain the documentation systems needed for reliable, scalable supply. In either case, they’re a poor candidate for qualification. Focus your efforts on suppliers who view transparency as a competitive advantage, not a burden.
Should I weigh manufacturer capability more heavily than exporter reliability, or vice versa?
Both are necessary; neither is sufficient alone. A mill with perfect capability paired with an unreliable exporter will cause delivery chaos and documentation delays that negate the quality advantage. Conversely, an excellent exporter can’t fix a mill that drifts off-spec. The 60/40 weighting (capability/reliability) provides a balanced starting point, but you can adjust based on your specific risk profile. If your production schedule has zero buffer for late deliveries, increase the weight on OTIF. If quality failures trigger costly downstream claims, weight capability more heavily.
How often should I re-run pilot trials for approved suppliers?
For new suppliers in their first year of supply, conduct a re-verification pilot at the six-month mark to confirm sustained capability. After the first year, quarterly small-batch verifications (not full pilots) are typically sufficient for stable suppliers. If you detect a trend toward spec limits in routine shipments or if the mill reports a major process change (new equipment, different fiber sourcing), schedule an immediate re-qualification pilot before accepting further volume.
Disclaimer: This article provides educational guidance on supplier qualification frameworks and is not a substitute for professional procurement, quality assurance, or legal advice. Specific qualification requirements, test methods, and acceptance criteria should be developed in consultation with your internal quality and procurement teams and adapted to your facility’s unique requirements.
Ready to put these qualification principles into action? Explore containerboard suppliers on PaperIndex—where you can post RFQs, review supplier profiles, and connect directly with mills and exporters who understand evidence-based qualification. The platform is free for buyers, with no transaction fees or commissions.
Our Editorial Process
Our expert team uses AI tools to help organize and structure our initial drafts. Every piece is then extensively rewritten, fact-checked, and enriched with first-hand insights and experiences by expert humans on our Insights Team to ensure accuracy and clarity.
About the PaperIndex Insights Team
The PaperIndex Insights Team is our dedicated engine for synthesizing complex topics into clear, helpful guides. While our content is thoroughly reviewed for clarity and accuracy, it is for informational purposes and should not replace professional advice.
