Preparing Data for Classifier Training and Batch Classification

Batch Classification

To prepare your data for batch classification:

  1. Ensure your input file is in one of the following formats:

    • .xlsx (Excel)
    • .csv (Comma-Separated Values)
    • .jsonl (JSON Lines)
  2. Your file must contain a "text" field, which will be used for classification.

  3. Before uploading, decide on the following settings:

    • The classifier to use for your task
    • The number of labels you want per text (default is 1)
    • The threshold value (default is 0.5) used to filter classification results
  4. Once your file is prepared, drag and drop it into the designated area or click to select it from your file system.

  5. After uploading, you'll be able to select the input columns for classification.
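Before uploading, it can help to confirm locally that the file really contains a usable "text" field. The following is a minimal sketch using only the Python standard library. The function name `has_text_field` is hypothetical and not part of the product, and the sketch covers only .csv and .jsonl files (checking .xlsx would need a third-party library).

```python
import csv
import json
from pathlib import Path


def has_text_field(path, field="text"):
    """Return True if every record in a .csv or .jsonl file has a non-empty `field`.

    Hypothetical helper for a local pre-upload check; a file with no
    records at all trivially passes.
    """
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            # The header row must contain the field name.
            if field not in (reader.fieldnames or []):
                return False
            # Short rows yield None for missing cells, hence the `or ""`.
            return all((row.get(field) or "").strip() for row in reader)

    if suffix == ".jsonl":
        with path.open(encoding="utf-8") as f:
            # One JSON object per non-blank line.
            records = [json.loads(line) for line in f if line.strip()]
        return all(
            isinstance(rec, dict) and bool(str(rec.get(field, "")).strip())
            for rec in records
        )

    raise ValueError(f"unsupported format for this sketch: {suffix}")
```

For example, `has_text_field("reviews.jsonl")` (an illustrative file name) returns False if any line lacks a "text" value, which signals that the file needs cleaning before upload.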

Classifier Training

To prepare data for training a new classifier:

  1. Create a CSV or XLSX file with two columns:

    • 'label': The category or class name
    • 'label_description': A detailed description of what the label represents (optional but strongly recommended)
  2. Structure your file similar to this example:

    label                  | label_description
    Finance                | Involves managing money, including investments, budgeting, forecasting, and financial planning.
    Accounting             | Focuses on recording, summarizing, and reporting financial transactions for businesses and organizations.
    Sales                  | Responsible for selling products or services, maintaining customer relationships, and meeting sales targets.
    Marketing              | Involves promoting and selling products or services, including market research, advertising, and public relations.
    Human Resources        | Handles recruitment, training, employee relations, and benefits administration.
    Information Technology | Manages computer systems, networks, software development, and IT support.
    Operations             | Oversees the production and delivery of goods or services, ensuring efficiency and quality.
  3. Save your file in either .csv or .xlsx format.

  4. When creating a new classifier:

    • Choose a descriptive name for your classifier (e.g., "user_sentiment_classifier_v2")
    • Upload your prepared file by dragging and dropping it into the designated area or clicking to select it
  5. After uploading, you can create your custom classifier, which will be trained to recognize and apply these labels to new text data.

Remember, providing detailed and accurate label descriptions will significantly improve the performance of your custom classifier.
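
The label file described in the training steps can also be generated with a script instead of a spreadsheet. The sketch below uses Python's standard `csv` module, which quotes descriptions that contain commas so they stay in a single column. The function name `write_label_file` and the output file name are illustrative assumptions; only the two column names come from this page.

```python
import csv

# Two of the example labels from this page, as (label, label_description) pairs.
LABELS = [
    ("Finance",
     "Involves managing money, including investments, budgeting, "
     "forecasting, and financial planning."),
    ("Sales",
     "Responsible for selling products or services, maintaining "
     "customer relationships, and meeting sales targets."),
]


def write_label_file(path, labels):
    """Write (label, label_description) pairs to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)  # default quoting wraps fields containing commas
        writer.writerow(["label", "label_description"])
        writer.writerows(labels)
```

Calling `write_label_file("labels.csv", LABELS)` produces a two-column file that can then be uploaded when creating a new classifier.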