Data Protection Measures

Last updated: September 15, 2025

ITAR

Cofactr is ITAR registered and compliant. The Cofactr platform can be safely used to store export-controlled data. For more information, please read our help desk article specifically on ITAR compliance.

General Data Security

Cofactr is SOC2 certified and undergoes regular, extensive penetration testing.

To access our full SOC2 report, please visit our information security trust center.

DocAI & WebAI Data Security

DocAI is our proprietary AI product for accurately and securely extracting data from emails, PDF documents, and other attachments.

Cofactr's DocAI product powers numerous sourcing and procurement workflows within the Cofactr platform.

We have engineered DocAI to uniquely combine the adaptability and benefits of the latest LLM-based document processing techniques with the security and ITAR compliance that our application demands.

To help you understand the security approach we have taken with DocAI, we have broken the process into its key steps:

Step 1 - Ingestion

Documents (PDFs, emails, etc.) enter DocAI in three ways:

  1. Manual Upload: Drop files directly into the Cofactr platform for DocAI processing

  2. Email Forwarding: Forward emails to your individual DocAI forwarding email address for processing

  3. Integrations: We offer numerous integrations with common file management products (Google Drive, Dropbox, OneDrive, SFTP, etc.) and email providers (Gmail, Outlook, etc.). These can be configured within the Integrations App in the Cofactr platform. Email integrations have additional filtering properties that we can help you configure to control which incoming emails are processed by DocAI and which are ignored.

Step 2 - Classification

In the Classification step, DocAI analyzes the document and/or email contents to determine what kind of document it is and matches it to the optimal extraction pipeline. This is performed with an LLM hosted within a FedRAMP High/ITAR-compliant cloud environment.
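To make the idea of matching a document to a pipeline concrete, here is a minimal sketch. It is not DocAI's implementation: the document types, the pipeline names, and the keyword_classifier function (a simple stand-in for the hosted LLM) are all hypothetical and used for illustration only.

```python
from typing import Callable

# Hypothetical document types and pipeline names, for illustration only.
PIPELINES = {
    "quote": "quote_extraction",
    "invoice": "invoice_extraction",
    "packing_slip": "packing_slip_extraction",
}
FALLBACK_PIPELINE = "manual_review"


def keyword_classifier(text: str) -> str:
    """Stand-in for the LLM classifier: returns a document-type label."""
    lowered = text.lower()
    for label in PIPELINES:
        if label.replace("_", " ") in lowered:
            return label
    return "unknown"


def route(text: str, classify: Callable[[str], str] = keyword_classifier) -> str:
    """Map the classifier's label to a pipeline, with a fallback for unknown labels."""
    return PIPELINES.get(classify(text), FALLBACK_PIPELINE)
```

Passing the classifier in as a parameter keeps the routing table separate from whichever model produces the label.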

Step 3 - OCR

In the OCR step, DocAI extracts text and semi-structured data, such as tables, from the source document and converts the source document into a format called Markdown. We perform this step using several different AI and non-AI approaches depending on the document. This step is performed entirely within our secure AWS GovCloud environment using models that we host. These AI models do not use documents or other customer data for training.

Step 4 - Chunking & Obfuscation

In this step, our code takes the Markdown extracted from the document and splits it into many small pieces. The result is a set of Markdown snippets that we can reassemble into the source material, but from which no other system could extract usable information.
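The general shape of this step can be sketched as follows. This article does not describe the actual obfuscation technique, so the sketch assumes a simple one for illustration: split the Markdown into small line-based chunks, label each chunk with a random ID, shuffle the chunks, and keep the original ordering private. The function names and the lines_per_chunk parameter are hypothetical, and shuffling alone should not be read as the full protection DocAI applies.

```python
import random
import secrets


def chunk_and_shuffle(markdown: str, lines_per_chunk: int = 2):
    """Split Markdown into small chunks and return (private order, shuffled chunks)."""
    lines = markdown.splitlines()
    pieces = [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]
    # Random IDs carry no information about position or source document.
    order = [secrets.token_hex(8) for _ in pieces]
    outbound = list(zip(order, pieces))
    random.shuffle(outbound)
    # `order` stays inside the trusted environment; only `outbound` leaves it.
    return order, outbound


def restore(order, outbound):
    """Rebuild the original Markdown from the shuffled chunks and the private order."""
    by_id = dict(outbound)
    return "\n".join(by_id[chunk_id] for chunk_id in order)
```

Only a party holding the private order list can put the chunks back in sequence, which is the property the paragraph above describes.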

Step 5 - Extraction

In the Extraction step, we send each obfuscated chunk to an LLM and prompt it to extract structured JSON in a specified schema. Depending on the document, we use LLMs from Anthropic and Google. We have zero data retention agreements with our LLM providers, and the models are hosted within a FedRAMP High/ITAR-compliant cloud environment.
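As a rough sketch of what "structured JSON in a specified schema" can look like in practice, the example below builds a prompt for one chunk and validates the model's reply. The schema, the field names, and the call_llm parameter are assumptions made for illustration. call_llm is any function that takes a prompt string and returns the model's text, so no particular provider SDK is implied.

```python
import json
from typing import Callable

# Hypothetical schema for one line item: field name -> accepted Python types.
LINE_ITEM_SCHEMA = {
    "part_number": (str,),
    "quantity": (int,),
    "unit_price": (int, float),
}


def build_prompt(chunk: str, schema: dict) -> str:
    """Ask for a single JSON object containing only the schema's fields."""
    fields = ", ".join(schema)
    return (
        f"Return one JSON object with exactly these keys: {fields}. "
        "Use null for any value that is not present.\n\n"
        f"{chunk}"
    )


def extract(chunk: str, call_llm: Callable[[str], str], schema: dict = LINE_ITEM_SCHEMA) -> dict:
    """Send one chunk to the model and validate the reply against the schema."""
    data = json.loads(call_llm(build_prompt(chunk, schema)))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    result = {}
    for name, accepted in schema.items():
        value = data.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if value is not None and (isinstance(value, bool) or not isinstance(value, accepted)):
            raise ValueError(f"field {name!r} has unexpected type {type(value).__name__}")
        result[name] = value
    return result
```

Keys outside the schema are dropped, and a reply that is not valid JSON or has a wrongly typed field raises ValueError, so malformed output does not flow into later steps.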

Step 6 - Reassembly

Back in our secure AWS GovCloud environment, we reassemble the chunks from the LLM responses. We then perform additional validation and data cleaning.
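A minimal sketch of reassembly follows, assuming each chunk was given an ID before it was sent out and that the original chunk order was kept privately. The function name, the row format, and the part_number field are hypothetical. The cleaning shown here (trimming whitespace and dropping rows without a part number) only illustrates what validation and data cleaning might involve.

```python
def reassemble(order, responses):
    """Merge per-chunk rows back into original document order and clean them.

    order:     chunk IDs in original document order (kept private)
    responses: chunk ID -> list of row dicts extracted from that chunk
    """
    missing = [chunk_id for chunk_id in order if chunk_id not in responses]
    if missing:
        raise KeyError(f"no response for chunks: {missing}")

    rows = []
    for chunk_id in order:
        for row in responses[chunk_id]:
            # Trim stray whitespace from string values.
            cleaned = {
                key: value.strip() if isinstance(value, str) else value
                for key, value in row.items()
            }
            # Drop rows that lack the (hypothetical) required field.
            if cleaned.get("part_number"):
                rows.append(cleaned)
    return rows
```

Failing on a missing chunk, rather than skipping it, avoids returning a document that silently lacks some of its rows.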