Training

The Training screen contains the necessary tools to train and compare different custom classification models.

Types of Models

Alfred allows users to train two distinct types of classification models: binary classification and multi-class classification.

Model Type
Behavior Description

Binary Classification

These models analyze each uploaded file to determine if it matches their single assigned Tag. If a file does not match this Tag, it is assigned a “No Tag” status, indicating that it does not fit the model’s predefined category.

Multi-class Classification

These models analyze each uploaded file to determine which of their assigned Tags the file belongs to. If the file fails to meet the expected confidence threshold for the predicted Tag, it is assigned the "No Tag" status.
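
The multi-class behavior described above can be pictured as a simple decision rule. The sketch below is illustrative only and does not reflect Alfred's internal implementation: the function name assign_tag, the Tag names, the per-Tag score dictionary, and the threshold value are all assumptions made for the example.

```python
from typing import Dict

NO_TAG = "No Tag"


def assign_tag(scores: Dict[str, float], threshold: float) -> str:
    """Return the highest-scoring Tag, or "No Tag" if it falls below the threshold.

    `scores` maps each Tag name to a confidence value (hypothetical format).
    """
    predicted_tag = max(scores, key=scores.get)
    if scores[predicted_tag] < threshold:
        return NO_TAG
    return predicted_tag


# Hypothetical Tags and threshold, for illustration only.
confident = assign_tag({"Invoice": 0.91, "Contract": 0.06}, threshold=0.8)
uncertain = assign_tag({"Invoice": 0.45, "Contract": 0.40}, threshold=0.8)
```

In this sketch, the first call yields the predicted Tag because its score clears the threshold, while the second falls back to "No Tag".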

Training a New Model

To train a new custom classification model, the user must click the New Training + button. This opens the Training pop-up screen.

Data Preparation

The custom model training process requires two JSONL files: one containing the training data and one containing the test data. Each line of a JSONL file must be a JSON object in the format shown below:

{"Text": "Sample Text for Label 0", "Label": "0"}
{"Text": "Sample Text for Label 1", "Label": "1"}
{"Text": "Sample Text for Label 2", "Label": "2"}
{"Text": "Sample Text for Label X", "Label": "X"}

Custom models do not classify using Tags! They will assign files to a Label represented by an integer.
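
As a convenience, the datasets can be generated and checked with a short script before uploading. The following is a minimal sketch, not an official Alfred tool: the helper names write_jsonl and validate_jsonl are hypothetical, and the checks assume that every record needs a string "Text" field and a "Label" field that parses as an integer, as the sample format above suggests.

```python
import json
from typing import Iterable, List, Tuple


def write_jsonl(path: str, rows: Iterable[Tuple[str, int]]) -> None:
    """Write (text, label) pairs as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for text, label in rows:
            # Labels are written as strings to mirror the sample format.
            record = {"Text": text, "Label": str(label)}
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def validate_jsonl(path: str) -> List[str]:
    """Return a list of problems found in the file (empty if none)."""
    problems: List[str] = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {line_number}: invalid JSON ({exc.msg})")
                continue
            if not isinstance(record, dict):
                problems.append(f"line {line_number}: expected a JSON object")
                continue
            if not isinstance(record.get("Text"), str):
                problems.append(f"line {line_number}: missing or non-string Text")
            try:
                int(str(record.get("Label")))
            except ValueError:
                problems.append(f"line {line_number}: Label is not an integer")
    return problems
```

Running validate_jsonl on both the training and the test file before uploading catches common mistakes such as a missing comma between fields or a non-integer Label.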

Mapping Tags

Custom models classify files by assigning an integer Label to each analyzed file. Therefore, every Label must be mapped to a Tag to ensure proper classification within the Alfred processing pipeline.

To map the Label-Tag pairs, the user must click the Add Mapping button in the Training pop-up screen.

After clicking the button, a row with two required fields is shown. The user must enter the integer Label and the Tag it corresponds to, adding one row for each Tag that the custom model will classify.
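
Conceptually, the mapping is a lookup table from integer Labels to Tags. The sketch below illustrates that idea and shows one way to check, before training, that every Label used in a dataset has a mapping. The Tag names and the helper unmapped_labels are hypothetical; in Alfred the mapping itself is entered through the Add Mapping rows.

```python
from typing import Dict, Iterable, Set

# Hypothetical Label-Tag pairs, mirroring the rows entered via Add Mapping.
label_to_tag: Dict[int, str] = {
    0: "Invoice",
    1: "Contract",
    2: "Receipt",
}


def unmapped_labels(labels: Iterable[int], mapping: Dict[int, str]) -> Set[int]:
    """Return the Labels that appear in the data but have no Tag mapped."""
    return {label for label in labels if label not in mapping}


# Labels collected from a dataset (illustrative values).
missing = unmapped_labels([0, 1, 2, 3], label_to_tag)
```

Here `missing` contains the Label 3, which signals that one more mapping row would be needed.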

Uploading Data

Once the user has prepared the data and created the Label-Tag pairs, the Training and Test datasets must be uploaded to complete the training pipeline. The user can drag and drop each JSONL file onto the corresponding Training or Test dataset field.
