Data Ingestion

Akurion turns project content into a searchable knowledge base. After content is added through direct upload or an enabled workspace ingestion path, Akurion prepares it for governed search, cited answers, workflows, graph context, REST APIs, and MCP tools.

Availability

Source connector availability is controlled in the Akurion app and may vary by workspace. Disabled connector types are not documented publicly until they are enabled for customer use.

Direct file upload is the default documented ingestion path for pilots, ad hoc analysis, and curated knowledge bases.

Sync Lifecycle

Each ingestion path follows this general lifecycle:

1. Add content to a project.
2. Store the file or source configuration.
3. Prepare content for processing.
4. Discover or receive files and records.
5. Track readiness and source status.
6. Enrich content with project metadata.
7. Make approved content available for search, answers, workflows, and APIs.
8. Add graph context when Graph RAG is enabled.

Source Status

File and source status tells you whether content is ready for retrieval.

Common states:

State	Meaning
`processing`	The source or file is actively being processed.
`processed` or `completed`	Content is ready for search and answer generation.
`failed`	Processing failed and may need admin action.
`retrying`	Akurion is retrying a job, often after a transient failure or memory escalation.
`no_credits`	The subscription or project has insufficient credits for processing.
`permission_error`	The configured ingestion path cannot access the file or record.
`format_not_supported`	The file type could not be processed.
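
Client code often needs to collapse these states into a few coarse buckets before deciding what to do next. The following is a minimal sketch of that idea. The state names come from the table above, but the `Readiness` enum, the `classify_status` helper, and the grouping of states into buckets are illustrative assumptions, not part of the Akurion API.

```python
from enum import Enum


class Readiness(Enum):
    """Coarse buckets for a status value (hypothetical, for illustration)."""
    READY = "ready"
    IN_PROGRESS = "in_progress"
    NEEDS_ACTION = "needs_action"
    UNKNOWN = "unknown"


# State names are taken from the table above; the grouping is an assumption.
_READY = {"processed", "completed"}
_IN_PROGRESS = {"processing", "retrying"}
_NEEDS_ACTION = {"failed", "no_credits", "permission_error", "format_not_supported"}


def classify_status(state: str) -> Readiness:
    """Map a raw status string to a readiness bucket."""
    normalized = state.strip().lower()
    if normalized in _READY:
        return Readiness.READY
    if normalized in _IN_PROGRESS:
        return Readiness.IN_PROGRESS
    if normalized in _NEEDS_ACTION:
        return Readiness.NEEDS_ACTION
    # Any state not listed in the table is reported as unknown rather than guessed.
    return Readiness.UNKNOWN
```

Treating unlisted states as `UNKNOWN` keeps the helper safe if a workspace reports a state that this page does not describe.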

Direct Upload

Use file upload for pilots, ad hoc analysis, and small curated knowledge bases.

For developer uploads, request an upload URL for the file from the REST API:

curl -X POST "https://api.structhub.io/api/v1/project/files/upload-url" \
  -H "Content-Type: application/json" \
  -H "API-KEY: YOUR_API_KEY" \
  -H "X-Project-ID: YOUR_PROJECT_ID" \
  -d '{
    "name": "policy.pdf",
    "file_type": "application/pdf"
  }'
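
The same request can be built from Python using only the standard library. This is a sketch that mirrors the curl call above (same endpoint, headers, and JSON body). The function name `build_upload_url_request` is hypothetical, and because the response format is not described here, the example only builds the request and leaves sending it commented out.

```python
import json
import urllib.request

UPLOAD_URL_ENDPOINT = "https://api.structhub.io/api/v1/project/files/upload-url"


def build_upload_url_request(api_key, project_id, name, file_type):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"name": name, "file_type": file_type}).encode("utf-8")
    # urllib normalizes header names (for example "API-KEY" is stored as
    # "Api-key"); HTTP header names are case-insensitive, so the request
    # carries the same headers as the curl call.
    return urllib.request.Request(
        UPLOAD_URL_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "API-KEY": api_key,
            "X-Project-ID": project_id,
        },
    )


if __name__ == "__main__":
    request = build_upload_url_request(
        "YOUR_API_KEY", "YOUR_PROJECT_ID", "policy.pdf", "application/pdf"
    )
    print(request.get_method(), request.full_url)
    # To send the request, uncomment the lines below. The response is printed
    # as raw text because its structure is not documented on this page.
    # with urllib.request.urlopen(request, timeout=30) as response:
    #     print(response.read().decode("utf-8"))
```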

Metadata for Better Retrieval

Metadata makes retrieval more precise because queries can be narrowed to documents with matching attributes. Useful metadata keys include:

Department
Region
Customer
Product
Source system
Document type
Effective date
Owner
Confidentiality
Workflow stage

After metadata is generated, users can filter by it in chat, and developers can pass metadata filters to the REST APIs or MCP tools.
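
To illustrate what a metadata filter does conceptually, here is a small local sketch. It does not call Akurion. The filter shape (a dict mapping a key to one value or a list of allowed values), the `matches_filters` helper, and the sample documents are all assumptions made for illustration, so check the API reference for the filter syntax the REST and MCP tools actually accept.

```python
def matches_filters(metadata: dict, filters: dict) -> bool:
    """Return True when every filter key has an allowed value in metadata."""
    for key, allowed in filters.items():
        # Accept either a single value or a collection of allowed values.
        if isinstance(allowed, (list, tuple, set)):
            allowed_values = list(allowed)
        else:
            allowed_values = [allowed]
        if metadata.get(key) not in allowed_values:
            return False
    return True


# Hypothetical documents with a few of the metadata keys listed above.
documents = [
    {"name": "policy.pdf", "metadata": {"department": "Legal", "region": "EMEA"}},
    {"name": "pricing.pdf", "metadata": {"department": "Sales", "region": "EMEA"}},
    {"name": "nda.pdf", "metadata": {"department": "Legal", "region": "AMER"}},
]

filters = {"department": "Legal", "region": ["EMEA", "APAC"]}
selected = [d["name"] for d in documents if matches_filters(d["metadata"], filters)]
print(selected)  # ['policy.pdf']
```

All filter keys must match, which is why consistent key names and values across documents matter more than the number of keys.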

Operational Tips

Start pilots with a small set of high-value documents rather than all company content.
Define metadata keys before indexing large document sets.
Use file status and source health, where available, to validate readiness.
Enable Graph RAG when entity relationships are important.
Use resync after changing source scope or metadata settings.
Use project instructions to steer answer behavior for each knowledge domain.
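
The tip about validating readiness with file status can be automated with a simple polling loop. The sketch below assumes a caller-supplied `fetch_status` function that returns one of the states from the Source Status table. How that function obtains the status is not shown, and `wait_until_ready` and its parameters are hypothetical names.

```python
import time

READY_STATES = {"processed", "completed"}
PENDING_STATES = {"processing", "retrying"}


def wait_until_ready(fetch_status, max_attempts=30, interval_seconds=10.0,
                     sleep=time.sleep):
    """Poll fetch_status() until content is ready, fails, or attempts run out."""
    last = ""
    for attempt in range(max_attempts):
        last = fetch_status()
        if last in READY_STATES:
            return last
        if last not in PENDING_STATES:
            # States such as failed, no_credits, or permission_error will not
            # resolve by waiting, so stop and surface them to the caller.
            raise RuntimeError(f"content is not ready: status={last!r}")
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"status still {last!r} after {max_attempts} checks")
```

Passing `sleep` as a parameter makes the loop easy to test without real delays, for example `wait_until_ready(get_status, sleep=lambda seconds: None)`.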