Data Ingestion

Standard vs. Direct Load

OneStream offers two Workflow types for data ingestion: the Standard Workflow and the Direct Load Workflow. They achieve the same end result — data in the Cube — but take very different paths to get there. Choosing the right one depends on your data complexity, performance requirements, and auditability needs.

How They Differ

The fundamental difference is whether data passes through Stage tables on disk:
  • Standard Workflow — Data is parsed, transformed, and stored in Stage tables. The user walks through Import → Validate → Load as discrete steps. Source and target records persist in the database.
  • Direct Load Workflow — Data is parsed, transformed, validated, and loaded in a single in-memory operation. Source and target records are not written to Stage tables. Everything happens in one step called Load And Transform.
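The two paths can be sketched conceptually. This is a hypothetical Python model of the flow, not OneStream's actual implementation; the `stage` dictionary merely stands in for the database Stage tables:

```python
def standard_load(source_rows, transform, stage):
    """Standard path: persist source and transformed rows in Stage,
    then load the transformed rows to the Cube as a separate step."""
    stage["source"] = list(source_rows)                         # Import: source rows persisted
    stage["target"] = [transform(r) for r in stage["source"]]   # Transform/Validate: target rows persisted
    return stage["target"]                                      # Load: target rows reach the Cube

def direct_load(source_rows, transform):
    """Direct Load path: transform in memory and load in one step.
    Nothing is written to Stage, so nothing is left to retransform
    or drill back to later."""
    return [transform(r) for r in source_rows]

# Both paths deliver the same data to the Cube...
stage = {}
rows = [("Sales", 100), ("COGS", 60)]
to_target = lambda r: {"account": r[0], "amount": r[1]}
assert standard_load(rows, to_target, stage) == direct_load(rows, to_target)
# ...but only the Standard path retains Stage records for audit and Retransform.
assert stage["source"] == rows
```

The in-memory variant skips two rounds of database I/O, which is the source of both its speed advantage and its loss of auditability.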
[Diagram: Standard vs. Direct Load Data Paths]

Feature Comparison

| Feature | Standard Workflow | Direct Load Workflow |
| --- | --- | --- |
| Data path | File → Stage tables → Cube | File → memory → Cube |
| Source/target records | Stored in Stage; available for auditing and analysis | Not stored; no audit trail in Stage |
| Retransform | Supported; fix rules and re-apply without re-importing | Not available; must re-import the file |
| Drill-Back | Supported from Finance Engine to Stage | Not available |
| Validation error limit | No limit; pageable error list | 1,000 records per data load |
| Total error storage | Unlimited (persisted in tables) | 50,000 records (in-memory limit) |
| Workflow steps | Import → Validate → Load (separate tasks) | Single combined Load And Transform step |
| Performance | Slower; Stage table I/O adds overhead | Faster; no disk writes for staging |
| Stage persistence | Full source and target history retained | No Stage history |
| Data management | Source file import history available for reference | No historical file records |
| Time/Scenario constraints | Standard loading rules apply | Data must match the Workflow Scenario and Time; cannot load beyond the current Workflow year |

When to Use Each

Standard Workflow — Best For

  • Development and initial implementation — When you are building and refining Transformation Rules, Retransform and Drill-Back are invaluable
  • Complex mappings — When the source-to-target relationship is not straightforward and requires iterative debugging
  • Audit requirements — When you need to retain source file history and transformation history for compliance
  • Low-to-moderate volume data — Where Stage I/O overhead is negligible
  • First-time integrations — When you are connecting a new source system and expect mapping issues

Direct Load — Best For

  • Production loads with stable mappings — When Transformation Rules are proven and rarely change
  • High-volume nightly batch loads — Where optimal performance is critical and data is disposable (frequently deleted and reloaded)
  • Tightly coupled metadata — When source system metadata mirrors OneStream metadata, allowing simple * to * pass-through rules
  • Extended application data moves — Where data from a detailed application feeds a summarized target application
  • Non-durable data — Data that has a high frequency of changes and may only be valid for a short time
ℹ️ Info
The Direct Load Workflow stores summarized records in one of two formats: Row (stored in Stage Summary Target Data table) or Blob (serialized byte array stored in Stage Direct Load Information). Row is the default. The choice affects storage size but not functionality.
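As a rough illustration of the Row-versus-Blob trade-off, the sketch below models the two shapes with standard-library serializers. This is a hypothetical Python analogy; OneStream's actual storage formats are internal to the product:

```python
import json
import pickle

records = [
    {"entity": "E100", "account": "Sales", "amount": 1234.56},
    {"entity": "E100", "account": "COGS",  "amount": 789.01},
]

# Row format analogue: each summarized record is an individual table row,
# so the set remains queryable record by record.
row_storage = [json.dumps(r) for r in records]

# Blob format analogue: the whole record set is serialized into a single
# byte array and stored as one value.
blob_storage = pickle.dumps(records)

# Both shapes round-trip to the same data; only the storage layout differs.
assert [json.loads(r) for r in row_storage] == records
assert pickle.loads(blob_storage) == records
```

The point of the analogy is that the choice changes how the bytes are laid out and how much space they take, not what data comes back out.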
The best practice is to use both Workflow types sequentially:
  1. Start with a Standard Workflow during development. Build your Transformation Rules with full Retransform and Drill-Back support. Use the pageable validation to identify and fix every mapping error.
  2. Once the core Transformation Rules are stable, create a production Direct Load Workflow. The stable mappings will pass through with minimal errors, and you get the performance benefit of in-memory processing.
  3. Keep the Standard Workflow available as a debugging tool. When new source values appear or mappings break, switch back to the Standard Workflow to investigate, fix the rules, and then return to Direct Load.
⚠️ Warning
Do not start with a Direct Load Workflow for a new integration. The 1,000-record validation limit makes it impractical for initial mapping development — you may need many re-imports just to see all the errors. Use Standard first, then switch.
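The re-import arithmetic behind this warning is simple to quantify. The helper below is illustrative only; the 1,000-errors-per-load figure is the Direct Load limit from the comparison table:

```python
import math

def direct_load_passes_needed(total_errors, per_load_limit=1000):
    """Direct Load surfaces at most `per_load_limit` validation errors per
    import, so discovering every error in a badly mapped file can take
    repeated re-imports. At least one pass is always needed."""
    return max(1, math.ceil(total_errors / per_load_limit))

# A new integration with 4,500 bad records needs 5 re-imports under
# Direct Load; the Standard Workflow shows all of them in one pageable list.
assert direct_load_passes_needed(4500) == 5
assert direct_load_passes_needed(0) == 1
```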

Direct Load Limitations Summary

For quick reference, here are the key limitations of the Direct Load:
  • No source/target record persistence — No Stage audit trail
  • No Retransform — Source records are not stored, so there is nothing to retransform. Must re-import the file.
  • No Drill-Back — Cannot drill from Finance Engine to Stage records
  • 1,000 validation errors per load — If more than 1,000 errors exist, you must re-import to see the next batch
  • 50,000 total error storage limit — After 50,000 cumulative errors, older errors are discarded
  • Time/Scenario locked — Data records' Time and Scenario must match the Workflow Scenario and Time exactly
  • No import history — No record of which files were loaded historically
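Because of the Time/Scenario lock, a defensive pre-load check can save a wasted import. The guard below is a hypothetical helper sketch; the function and field names (`check_direct_load_rows`, `scenario`, `time`) are illustrative, not OneStream API:

```python
def check_direct_load_rows(rows, wf_scenario, wf_time):
    """Reject any row whose Scenario or Time does not exactly match the
    Workflow unit, since Direct Load cannot load data outside the current
    Workflow Scenario and Time."""
    bad = [r for r in rows
           if r["scenario"] != wf_scenario or r["time"] != wf_time]
    if bad:
        raise ValueError(
            f"{len(bad)} row(s) fall outside {wf_scenario}/{wf_time}")
    return rows

rows = [
    {"scenario": "Actual", "time": "2024M3", "amount": 10.0},
    {"scenario": "Actual", "time": "2024M3", "amount": 20.0},
]
# Matching rows pass through unchanged.
assert check_direct_load_rows(rows, "Actual", "2024M3") == rows
```

Running this kind of check in the extract step of the source system keeps mismatched periods from ever reaching the single-step Load And Transform.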