Data Protection Measures
Last updated: September 15, 2025
ITAR
Cofactr is ITAR registered and compliant. The Cofactr platform can be safely used to store export-controlled data. For more information, please read our help desk article specifically on ITAR compliance.
General Data Security
Cofactr is SOC 2 certified and undergoes regular, extensive penetration testing.
To access our full SOC 2 report, please visit our information security trust center.
DocAI & WebAI Data Security
DocAI is our proprietary AI product for accurately and securely extracting data from emails, PDF documents, and other attachments.
Cofactr's DocAI product powers numerous sourcing and procurement workflows within the Cofactr platform.
We have engineered DocAI to uniquely combine the adaptability and benefits of the latest LLM-based document processing techniques with the security and ITAR compliance that our application demands.
To help you understand the security approach we have taken with DocAI, we have broken the process into its key steps:
Step 1 - Ingestion
Documents (PDFs, emails, etc.) enter DocAI in three ways:
Manual Upload: Drop files directly into the Cofactr platform for DocAI processing
Email Forwarding: Forward emails to your individual DocAI forwarding email address for processing
Integrations: We offer numerous integrations with common file management products (Google Drive, Dropbox, OneDrive, SFTP, etc.) and email providers (Gmail, Outlook, etc.). These can be configured within the Integrations App in the Cofactr platform. Email integrations have additional filtering properties, which we can help you configure, that control which incoming emails are processed by DocAI and which are ignored.
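To make the idea of email filtering concrete, here is a minimal sketch of what such a rule could look like. It is not Cofactr's configuration format: the EmailFilter class and its fields (allowed_sender_domains, subject_keywords, require_attachment) are hypothetical names chosen for illustration, and only Python's standard email library is used.

```python
from dataclasses import dataclass, field
from email.message import EmailMessage
from email.utils import parseaddr


@dataclass
class EmailFilter:
    """Hypothetical filter rule; the field names are illustrative only."""
    allowed_sender_domains: set[str] = field(default_factory=set)
    subject_keywords: tuple[str, ...] = ()
    require_attachment: bool = False

    def should_process(self, msg: EmailMessage) -> bool:
        # An empty domain set means "accept any sender".
        _, address = parseaddr(str(msg.get("From", "")))
        domain = address.rpartition("@")[2].lower()
        if self.allowed_sender_domains and domain not in self.allowed_sender_domains:
            return False

        # An empty keyword tuple means "accept any subject".
        subject = str(msg.get("Subject", "")).lower()
        if self.subject_keywords and not any(
            keyword.lower() in subject for keyword in self.subject_keywords
        ):
            return False

        # Optionally ignore emails that carry no attachment at all.
        if self.require_attachment and not any(True for _ in msg.iter_attachments()):
            return False
        return True
```

A rule such as EmailFilter({"supplier.example"}, ("quote",), require_attachment=True) would pass a quote email with a PDF from that domain and ignore everything else.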
Step 2 - Classification
In the Classification step, DocAI analyzes the document and/or email contents to determine what kind of document it is and matches it to the optimal extraction pipeline. This is performed with an LLM hosted within a FedRAMP High/ITAR-compliant cloud environment.
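The classify-then-route pattern described above can be sketched as follows. This is an illustration under assumptions, not DocAI's implementation: the document labels, pipeline names, and fallback are hypothetical, and keyword_classifier is a simple stand-in for the hosted LLM call so that the example runs on its own.

```python
from typing import Callable

# Hypothetical document labels and pipeline names, for illustration only.
PIPELINES = {
    "purchase_order": "purchase_order_pipeline",
    "supplier_quote": "quote_pipeline",
    "invoice": "invoice_pipeline",
}
FALLBACK_PIPELINE = "manual_review"


def route(document_text: str, classify: Callable[[str], str]) -> str:
    """Return the pipeline name for a document, given a label classifier."""
    label = classify(document_text)
    # Unknown labels fall back to a safe default instead of raising.
    return PIPELINES.get(label, FALLBACK_PIPELINE)


def keyword_classifier(text: str) -> str:
    """Placeholder for the hosted LLM classifier; matches keywords only."""
    lowered = text.lower()
    if "purchase order" in lowered:
        return "purchase_order"
    if "quote" in lowered or "quotation" in lowered:
        return "supplier_quote"
    if "invoice" in lowered:
        return "invoice"
    return "unknown"
```

Passing the classifier in as a parameter keeps the routing table independent of whichever model produces the label.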
Step 3 - OCR
In the OCR step, DocAI extracts text and semi-structured data, such as tables, from the source document and converts the source document into a format called Markdown. We perform this step using several different AI and non-AI approaches depending on the document. This step is performed entirely within our secure AWS GovCloud environment using models that we host. These AI models do not use documents or other customer data for training.
Step 4 - Chunking & Obfuscation
In this step, our code takes the Markdown extracted from the document and splits it into many small chunks. The result is a set of Markdown snippets that we can reassemble into the source material, but from which no other system could extract usable information.
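The page does not describe the chunking or obfuscation method itself, so the following is only one possible reading, shown as a minimal sketch: split the Markdown into small line-based chunks, give each chunk a random opaque ID, shuffle the chunks, and keep the ID-to-position map locally so that only its holder can restore the original order. The function names and the lines_per_chunk parameter are assumptions, and this sketch illustrates only the splitting and the locally held ordering key, not any further obfuscation.

```python
import secrets


def chunk_and_shuffle(markdown: str, lines_per_chunk: int = 3):
    """Split Markdown into chunks and return (shuffled chunks, private order map)."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    lines = markdown.splitlines(keepends=True)
    pieces = [
        "".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]

    order: dict[str, int] = {}          # stays inside the trusted environment
    chunks: list[tuple[str, str]] = []  # (opaque id, text) pairs that may leave it
    for position, text in enumerate(pieces):
        chunk_id = secrets.token_hex(8)  # random ID that reveals nothing about position
        order[chunk_id] = position
        chunks.append((chunk_id, text))

    secrets.SystemRandom().shuffle(chunks)
    return chunks, order


def reassemble(chunks, order) -> str:
    """Restore the original Markdown using the private order map."""
    return "".join(text for _, text in sorted(chunks, key=lambda c: order[c[0]]))
```

Without the order map, a recipient sees only isolated fragments tagged with random IDs; with it, reassemble returns the original Markdown exactly.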
Step 5 - Extraction
In the Extraction step, we send each obfuscated chunk to an LLM and prompt it to extract structured JSON in a specified schema. Depending on the document, we use LLMs from Anthropic and Google. We have zero-data-retention agreements with our LLM providers, and the models are hosted within a FedRAMP High/ITAR-compliant cloud environment.
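As a rough illustration of prompting for JSON in a specified schema, the sketch below builds a prompt for one chunk and checks the model's reply against the expected fields. Everything specific here is an assumption: the LINE_ITEM_SCHEMA fields, the prompt wording, and the complete callable, which stands in for whichever provider client is used (no real provider API is shown).

```python
import json
from typing import Callable

# Hypothetical schema: field name -> expected Python type.
LINE_ITEM_SCHEMA = {"part_number": str, "quantity": int}


def build_prompt(chunk: str, schema: dict) -> str:
    """Describe the expected JSON shape and append the chunk text."""
    fields = ", ".join(f'"{name}" ({typ.__name__})' for name, typ in schema.items())
    return (
        "Extract every line item from the text below. Respond with a JSON array "
        f"of objects with exactly these fields: {fields}.\n\n{chunk}"
    )


def extract(chunk: str, complete: Callable[[str], str], schema: dict = LINE_ITEM_SCHEMA):
    """Call the model via `complete` and validate its JSON reply against the schema."""
    data = json.loads(complete(build_prompt(chunk, schema)))
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("expected each element to be a JSON object")
        for name, typ in schema.items():
            if not isinstance(item.get(name), typ):
                raise ValueError(f"field {name!r} is missing or not {typ.__name__}")
    return data
```

Validating every reply before it is used means a malformed or incomplete response fails loudly instead of flowing into later steps.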
Step 6 - Reassembly
Back in our secure AWS GovCloud environment, we reassemble the chunks from the LLM responses. We then perform additional validation and data cleaning.
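A minimal sketch of reassembly followed by validation and cleaning, under stated assumptions: responses arrive keyed by an opaque chunk ID, a locally held order map gives each chunk's original position, and each response is a list of line-item records. The record fields and the cleaning rules (trim and uppercase part numbers, reject non-positive quantities) are hypothetical examples, not Cofactr's actual checks.

```python
def reassemble_results(responses: dict[str, list[dict]], order: dict[str, int]) -> list[dict]:
    """Concatenate per-chunk results in the original document order."""
    missing = set(order) - set(responses)
    if missing:
        raise ValueError(f"missing responses for chunks: {sorted(missing)}")
    rows: list[dict] = []
    for chunk_id in sorted(order, key=order.get):
        rows.extend(responses[chunk_id])
    return rows


def clean(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Normalize rows and separate valid ones from rejected ones."""
    valid, rejected = [], []
    for row in rows:
        part = str(row.get("part_number", "")).strip().upper()
        quantity = row.get("quantity")
        if part and isinstance(quantity, int) and quantity > 0:
            valid.append({"part_number": part, "quantity": quantity})
        else:
            rejected.append(row)  # kept for review rather than silently dropped
    return valid, rejected
```

Returning rejected rows alongside valid ones keeps questionable extractions visible for follow-up.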