Job
A Job in Alfred represents a single unit of work that performs a sequence of operations on one or more files for the purpose of document classification, extraction, and indexing. It is an asynchronous entity, orchestrated by a state machine that manages its progress through various stages.
For example, a Job could be responsible for ingesting a batch of scanned invoices, classifying them, extracting relevant fields, and then indexing them in a searchable database.
Responsibilities
Executes pre-defined workflows for document processing.
Manages state transitions and retries.
Orchestrates the processing of multiple files if applicable.
Emits events to signal significant changes in its lifecycle.
List all Jobs associated with the authorized user's company.
GET
https://<env>.tagshelf.com/api/job/all
This endpoint provides a list of Jobs that are specific to the company to which the authorized user belongs. It does not list all Jobs in the entire system but rather filters them based on company affiliation. This ensures that users only access and manage Jobs relevant to their organizational context.
Query Parameters
currentPage
Integer
The currentPage parameter specifies the page number in the paginated list of Jobs. It is used to navigate to a specific page in the list, allowing users to access a particular subset of the Jobs data.
pageSize
Integer
The pageSize parameter determines the number of Jobs displayed on each page of the paginated response. It allows users to customize the volume of data returned in a single API call, facilitating easier data handling and viewing, especially with large datasets. The API has a predefined default value for the number of items per page, typically 20; if the user does not specify a pageSize, this default is used. The minimum allowed value for this parameter is 10 and the maximum allowed value is 40.
Headers
X-TagshelfAPI-Key
String
Application API Key
Authorization
String
Bearer <access_token> or amx <hmac_token>
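As a minimal sketch of calling this endpoint (assuming Python with the requests library; the environment, API key, and bearer token below are placeholders you must replace with your own values):

```python
import requests

BASE_URL = "https://<env>.tagshelf.com"  # replace <env> with your environment

headers = {
    "X-TagshelfAPI-Key": "<your_application_api_key>",  # placeholder
    "Authorization": "Bearer <access_token>",            # or "amx <hmac_token>"
}

# Request the second page of Jobs, 20 per page (pageSize must be between 10 and 40).
params = {"currentPage": 2, "pageSize": 20}

response = requests.get(f"{BASE_URL}/api/job/all", headers=headers, params=params)
response.raise_for_status()
jobs = response.json()  # paginated list of Jobs for the authorized user's company
```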
Retrieve detailed information about a specific Job.
GET
https://<env>.tagshelf.com/api/job/detail/:id
This endpoint provides comprehensive details about a Job identified by its unique id. It is used to fetch the current status, progress, results, and any other relevant information about a specific Job. This is particularly useful for monitoring the ongoing process of a Job, understanding its current state, and for diagnostic purposes. The endpoint is essential for users or systems needing to track the progress and outcome of document processing tasks.
Path Parameters
id*
UUID
The unique identifier of the Job.
Headers
X-TagshelfAPI-Key
String
Application API Key
Authorization
String
Bearer <access_token> or amx <hmac_token>
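For instance, the detail endpoint can be polled to track a Job's progress. The sketch below assumes Python with the requests library, placeholder credentials, and an illustrative response field ("stage") with assumed terminal values; the actual response shape is defined by the API:

```python
import time
import requests

BASE_URL = "https://<env>.tagshelf.com"
headers = {
    "X-TagshelfAPI-Key": "<your_application_api_key>",
    "Authorization": "Bearer <access_token>",
}

job_id = "00000000-0000-0000-0000-000000000000"  # the Job's UUID

# Poll the Job detail endpoint until processing finishes (field names are illustrative).
while True:
    response = requests.get(f"{BASE_URL}/api/job/detail/{job_id}", headers=headers)
    response.raise_for_status()
    detail = response.json()          # current status, progress, results, etc.
    if detail.get("stage") in ("finished", "failed"):  # assumed terminal states
        break
    time.sleep(5)
```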
Create a new Job and close the deferred upload session.
POST
https://<env>.tagshelf.com/api/job/create
This endpoint is used for creating a new Job in Alfred. It serves a dual purpose: it finalizes the deferred upload session, ensuring that all files needed for the Job are uploaded, and initiates the Job itself. This process includes transitioning from the file upload phase to the document processing phase, where the files undergo classification, extraction, and indexing based on predefined workflows. The endpoint is crucial for starting the document processing task and is invoked once all necessary files are in place.
Headers
X-TagshelfAPI-Key
String
Application API Key
Authorization
String
Bearer <access_token> or amx <hmac_token>
Request Body
session_id*
UUID
The unique identifier of the deferred upload session to be closed and associated with this Job.
metadata
Object or array of Objects
This parameter accepts a JSON object that encapsulates various metadata fields for the Job.
The metadata provided here serves as a set of descriptors or attributes that apply to the Job as a whole. Once defined, this metadata is automatically propagated to each file that is part of the Job. This means that every file within this Job will inherit the specified metadata, ensuring consistency and contextual relevance across all files associated with the Job.
This feature is particularly useful for maintaining uniformity in file attributes, aiding in categorization, and enhancing the searchability and traceability of files within a Job.
merge
boolean
The merge parameter is a directive that instructs the file processor on how to handle multiple files within a Job. When set, and provided all files in the Job are either images or PDFs, this parameter signals that these files should be treated as a single unit of work. This means that instead of processing each file independently, the system combines them into a single file for the purpose of processing.
This approach is particularly beneficial when the files are parts of a larger document or dataset that need to be handled cohesively.
For example, if multiple scanned images of a document or several PDFs are part of a Job, setting merge ensures they are processed together, maintaining the continuity and integrity of the document. This results in the Job having a singular output file, despite originating from multiple input files.
decompose
boolean
The decompose parameter plays a critical role in how the file processor handles the input files. When enabled, this parameter triggers the decomposition of file inputs into multiple, distinct units of work. This is applicable in scenarios where a single file contains multiple separable components.
For example, in the case of a multi-page PDF, each page is treated as an individual file. Similarly, for an image or a PDF page containing multiple documents, each document is separated and processed independently.
This parameter is vital for tasks that require individual attention to each component of a file, such as detailed analysis, classification, or extraction of data from each part of a larger document. Decomposition enhances the granularity of processing, enabling more precise and targeted handling of each segment within a file.
propagate_metadata
boolean
This parameter enables the specification of a single metadata object to be applied across multiple files from remote URLs or remote sources. When used, propagate_metadata ensures that the defined metadata is consistently attached to all the specified files during their upload and processing. This feature is particularly useful for maintaining uniform metadata across a batch of files, streamlining data organization and retrieval.
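Putting the body parameters together, a sketch of a create call might look like the following (Python with the requests library; the session ID, API key, token, and metadata values are placeholders):

```python
import requests

BASE_URL = "https://<env>.tagshelf.com"
headers = {
    "X-TagshelfAPI-Key": "<your_application_api_key>",
    "Authorization": "Bearer <access_token>",
}

payload = {
    "session_id": "00000000-0000-0000-0000-000000000000",  # deferred upload session to close
    # Metadata applied to the Job and inherited by every file in it.
    "metadata": {"department": "accounts-payable", "batch": "2024-06"},
    # Treat all uploaded images/PDFs as a single unit of work instead of separate files.
    "merge": True,
    # Leave decomposition off so the merged file is not split back into individual pages.
    "decompose": False,
}

response = requests.post(f"{BASE_URL}/api/job/create", headers=headers, json=payload)
response.raise_for_status()
job = response.json()  # the response describes the newly created Job
```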