Third-party risk management programs have evolved over many years to address the risks that come with relying on external service providers. Mature programs include vendor classification by criticality, due diligence assessments, contractual controls, and ongoing monitoring. The frameworks supporting these programs have served well across most categories of vendor relationships.

AI vendors introduce considerations that traditional frameworks address only partially. The risk surface of an AI vendor extends beyond the categories that standard assessments cover, and several questions that materially affect the risk profile are not asked in typical questionnaires. This article outlines the additional considerations that warrant attention when third-party risk management is applied to AI vendors.

The model training boundary

One of the most consequential questions in an AI vendor relationship is whether the vendor uses the customer's data, prompts, or outputs to train, fine-tune, or evaluate models. This question is sometimes addressed in contracts but is often left ambiguous. Even where it is addressed, how the commitment is technically enforced may remain unclear.

Effective third-party risk assessment of AI vendors should establish the contractual position on training data, the technical controls that prevent inadvertent inclusion of customer data in training pipelines, the position regarding fine-tuning and model evaluation in addition to base model training, and the treatment of derivative data such as embeddings.

The default contractual position should be that customer data is not used for any model improvement activity unless explicitly opted in. Vendors that offer this position credibly typically have technical controls that make it reliable rather than depending on operational discipline alone.
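The four dimensions above can be tracked as a simple assessment record. The following is an illustrative sketch only; the field names and the gap wording are assumptions about what an assessor might capture, not a standard schema.

```python
from dataclasses import dataclass

# Hypothetical record of a vendor's training-data position, covering the
# dimensions discussed above: base training, fine-tuning, evaluation,
# derivative data such as embeddings, and technical enforcement.
@dataclass
class TrainingBoundary:
    contract_excludes_training: bool      # base model training excluded by contract
    contract_excludes_fine_tuning: bool   # fine-tuning explicitly excluded
    contract_excludes_evaluation: bool    # model evaluation explicitly excluded
    derivative_data_covered: bool         # embeddings and derived artifacts addressed
    technical_controls_documented: bool   # enforcement beyond operational discipline

def training_boundary_gaps(tb: TrainingBoundary) -> list[str]:
    """Return the open questions an assessor should raise with the vendor."""
    gaps = []
    if not tb.contract_excludes_training:
        gaps.append("no contractual exclusion of base model training")
    if not tb.contract_excludes_fine_tuning:
        gaps.append("fine-tuning not explicitly excluded")
    if not tb.contract_excludes_evaluation:
        gaps.append("model evaluation not explicitly excluded")
    if not tb.derivative_data_covered:
        gaps.append("derivative data (embeddings) not addressed")
    if not tb.technical_controls_documented:
        gaps.append("no documented technical enforcement")
    return gaps
```

A record with every field true corresponds to the default position described above: no model improvement activity without explicit opt-in, backed by technical controls.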

Output reliability and integration risk

Traditional vendors that produce incorrect output typically produce errors that are detectable and discrete. AI vendors that produce confidently incorrect output may produce errors that are not immediately distinguishable from correct output. This characteristic has implications for how the customer integrates the vendor's output into operations and for the controls that should accompany that integration.

Risk assessment should consider the customer's intended use of the AI vendor's output, the controls that validate or qualify outputs before they affect operational decisions, the monitoring that would detect output quality degradation, and the contractual provisions regarding output accuracy and remediation. Use cases where AI output drives consequential decisions warrant particular attention to these controls.
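A qualification control of the kind described above might route outputs before they reach operational decisions. This is a minimal sketch, assuming the vendor's API returns a confidence score with each output; the threshold and the notion of a "consequential" decision are illustrative parameters, not vendor features.

```python
# Hypothetical output-qualification gate: consequential, uncertain outputs
# are held for human review; low-confidence but low-stakes outputs pass
# with a flag for later quality monitoring.
def route_output(output: str, confidence: float, consequential: bool,
                 threshold: float = 0.9) -> str:
    """Decide whether an AI output may flow into operations directly."""
    if consequential and confidence < threshold:
        return "human_review"   # consequential + uncertain: hold for a person
    if confidence < threshold:
        return "flagged"        # low confidence, low stakes: log and pass flagged
    return "accepted"
```

Tracking the rate of "flagged" outputs over time is one way to implement the quality-degradation monitoring mentioned above.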

Subprocessor and foundation model dependencies

Many AI vendors depend on foundation models provided by other parties. The customer's data passes through, and the vendor's product is built on, these underlying providers. The provider's training practices, data handling, and operational controls become part of the customer's effective vendor risk surface.

Assessment should establish which foundation models are used, the contractual relationship between the AI vendor and the foundation model provider regarding training and data handling, the customer's visibility into changes in foundation model providers or terms, and the customer's recourse if the foundation model relationship changes in ways that affect the customer's risk position.

This area is sometimes treated as a fourth-party concern that does not require detailed examination. For AI vendors, the dependency is typically more material than in other vendor categories, and the assessment work should reflect that materiality.
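One way to give the dependency the materiality it deserves is to record it as a first-class inventory entry. The entry below is purely illustrative; the vendor and provider names are hypothetical, and the fields are assumptions about what the assessment should capture.

```python
# Hypothetical fourth-party inventory entry for a foundation model dependency.
foundation_model_dependency = {
    "ai_vendor": "ExampleVendor",              # hypothetical AI vendor
    "foundation_model_provider": "ExampleLab", # hypothetical model provider
    "training_exclusion_flows_down": True,     # vendor's commitment binds the provider
    "change_notice_days": 30,                  # notice before provider or terms change
    "customer_recourse": "termination right on provider change",
}
```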

Tenant isolation in shared model serving

Most AI vendors operate shared infrastructure where multiple customers' workloads run on common model serving systems. The tenant isolation guarantees in this environment differ from those in traditional multi-tenant SaaS architectures and warrant specific examination.

Assessment considerations include the technical controls that prevent data leakage between tenants in shared model serving, the controls that prevent model behaviour learned from one tenant from affecting outputs to another tenant, and the audit logging that supports investigation of suspected isolation failures.
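Where the vendor exposes audit logs, a customer-side check for suspected isolation failures can be very simple. The field names below are assumptions about what a vendor's audit log might expose, not a real API.

```python
# Illustrative scan over audit log records: flag any inference event whose
# serving context carried a tenant id different from the requesting tenant.
def isolation_violations(events: list[dict]) -> list[dict]:
    """Return events where the serving tenant does not match the requester."""
    return [
        e for e in events
        if e.get("request_tenant") != e.get("serving_tenant")
    ]
```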

Model lifecycle and behaviour change

AI vendors update their underlying models, sometimes substantially, during the period of a customer relationship. These updates can change system behaviour in ways that affect the customer's use case. Traditional vendor assessments give limited attention to this dimension because traditional vendors do not typically experience behaviour changes of comparable magnitude.

Effective assessment of AI vendors should establish the vendor's process for regression testing model updates, the customer's notice rights when material behaviour changes are planned, the customer's options if a model update degrades the use case, and the deprecation timeline for prior model versions.
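Customers can also run their own regression check when a model update lands: replay a fixed prompt set against the new version and compare against recorded baselines. This is a sketch under assumptions; `call_model` is a placeholder for the vendor's API, and exact-match comparison is the simplest possible diff criterion.

```python
# Hypothetical customer-side regression check for vendor model updates.
# `golden` maps prompts to baseline outputs recorded on the prior version.
def regression_report(golden: dict[str, str], call_model) -> dict:
    """Replay the golden prompt set and report outputs that changed."""
    changed = {
        prompt: new_out
        for prompt in golden
        if (new_out := call_model(prompt)) != golden[prompt]
    }
    return {"total": len(golden), "changed": len(changed), "diffs": changed}
```

A nonzero `changed` count does not prove degradation, only drift; the diffs are the starting point for evaluating whether the use case is affected.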

Audit trail availability

The customer's ability to investigate incidents involving the AI vendor depends on the availability of audit logs. Traditional vendor relationships typically include some form of audit trail, but the granularity and accessibility for AI vendors vary considerably.

Assessment should establish what audit logs the vendor maintains, what visibility the customer has into those logs, the retention periods for logs relevant to incident investigation, and the access provisions in the contract. This dimension is increasingly important as customers face their own obligations to investigate and respond to incidents involving systems they operate, including those backed by AI vendors.
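One concrete assessment step is to compare the vendor's stated retention per log category against the customer's own incident-investigation window. The categories and windows below are illustrative assumptions.

```python
# Illustrative check: which log categories fall short of the window the
# customer needs for incident investigation?
def retention_gaps(vendor_retention: dict[str, int],
                   required_days: int) -> list[str]:
    """Return log categories whose retention is below the required window."""
    return [cat for cat, days in vendor_retention.items()
            if days < required_days]
```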

Contractual completeness

The contractual position with an AI vendor often requires specific provisions that are not standard in SaaS templates. Provisions that warrant explicit attention include training data commitments, model update notification requirements, audit log access, indemnification language addressing model output, data deletion commitments that account for embeddings and derivative data, and subprocessor disclosure that includes foundation model providers.
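The provisions above can be tracked as a simple checklist during contract review. A gap report like the one below is an illustration for tracking purposes, not a substitute for legal review; the provision labels are shorthand.

```python
# Shorthand labels for the AI-specific provisions discussed above.
AI_VENDOR_PROVISIONS = [
    "training data commitments",
    "model update notification",
    "audit log access",
    "output indemnification",
    "deletion covering embeddings and derivative data",
    "subprocessor disclosure including foundation model providers",
]

def missing_provisions(contract_provisions: set[str]) -> list[str]:
    """Return the AI-specific provisions absent from a reviewed contract."""
    return [p for p in AI_VENDOR_PROVISIONS if p not in contract_provisions]
```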

Organizations encountering these requirements for the first time often find that template language is insufficient and that vendor pushback on specific provisions is common. The requirements are nonetheless reasonable and increasingly expected by enterprise customers.

Building these considerations into the program

The practical approach to integrating these considerations into existing third-party risk management is to maintain the existing program structure while extending the assessment templates and contractual requirements for AI vendors specifically. The classification, due diligence, and ongoing monitoring framework continues to apply. The substance of what is assessed and what is required contractually expands to address the additional dimensions.

Organizations completing this expansion before regulatory or major customer pressure forces it typically find the work more straightforward than organizations attempting it under deadline pressure.