Preparing Data for Classifier Training and Batch Classification
Batch Classification
To prepare your data for batch classification:
- Ensure your input file is in one of the following formats:
  - .xlsx (Excel)
  - .csv (Comma-Separated Values)
  - .jsonl (JSON Lines)
- Your file must contain a "text" field, which will be used for classification.
- Before uploading, consider the following parameters:
  - Select an appropriate classifier for your task
  - Determine the number of labels you want per text (default is 1)
  - Set a threshold value (default is 0.5) to filter classification results
- Once your file is prepared, you can drag and drop it into the designated area or click to select it from your file system.
- After uploading, you'll be able to select the input columns for classification.
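Before uploading, it can help to confirm locally that every record has a usable "text" field. The following is a minimal sketch using only the Python standard library, not part of the product itself. The function name `missing_text_rows` is hypothetical, and treating a blank or non-string value as missing is an assumption. It covers .csv and .jsonl only, since reading .xlsx would need a third-party package.

```python
import csv
import json
from pathlib import Path

REQUIRED_FIELD = "text"  # field name required for batch classification


def missing_text_rows(path):
    """Return 1-based record numbers whose "text" field is absent or blank.

    Hypothetical helper: handles .csv and .jsonl only.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            records = list(csv.DictReader(handle))
    elif suffix == ".jsonl":
        with path.open(encoding="utf-8") as handle:
            # One JSON object per non-empty line.
            records = [json.loads(line) for line in handle if line.strip()]
    else:
        raise ValueError(f"unsupported format for this sketch: {suffix}")

    bad = []
    for number, record in enumerate(records, start=1):
        value = record.get(REQUIRED_FIELD) if isinstance(record, dict) else None
        # Assumption: a blank or non-string value counts as missing.
        if not isinstance(value, str) or not value.strip():
            bad.append(number)
    return bad
```

Calling `missing_text_rows("reviews.csv")` (the file name is only an example) returns an empty list when every record is ready for upload.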
Classifier Training
To prepare data for training a new classifier:
- Create a CSV or XLSX file with two columns:
  - 'label': The category or class name
  - 'label_description': A detailed description of what the label represents (optional but strongly recommended)
- Structure your file like this example:

  | label | label_description |
  | --- | --- |
  | Finance | Involves managing money, including investments, budgeting, forecasting, and financial planning. |
  | Accounting | Focuses on recording, summarizing, and reporting financial transactions for businesses and organizations. |
  | Sales | Responsible for selling products or services, maintaining customer relationships, and meeting sales targets. |
  | Marketing | Involves promoting and selling products or services, including market research, advertising, and public relations. |
  | Human Resources | Handles recruitment, training, employee relations, and benefits administration. |
  | Information Technology | Manages computer systems, networks, software development, and IT support. |
  | Operations | Oversees the production and delivery of goods or services, ensuring efficiency and quality. |

- Save your file in either .csv or .xlsx format.
- When creating a new classifier:
  - Choose a descriptive name for your classifier (e.g., "user_sentiment_classifier_v2")
  - Upload your prepared file by dragging and dropping it into the designated area or clicking to select it
- After uploading, you can create your custom classifier, which will be trained to recognize and apply these labels to new text data.
Remember, providing detailed and accurate label descriptions will significantly improve the performance of your custom classifier.
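If you build the training file programmatically, the main thing to get right is quoting, because the label descriptions usually contain commas. The sketch below writes the two required columns with Python's standard `csv` module, which quotes such values automatically. The function name `write_label_file` is hypothetical, the two rows are taken from the example above, and rejecting empty or duplicate labels is an assumption rather than a documented requirement.

```python
import csv

# Two rows from the example above; extend with your own labels.
LABELS = [
    ("Finance", "Involves managing money, including investments, "
                "budgeting, forecasting, and financial planning."),
    ("Accounting", "Focuses on recording, summarizing, and reporting "
                   "financial transactions for businesses and organizations."),
]


def write_label_file(path, labels):
    """Write (label, label_description) pairs to a CSV training file."""
    seen = set()
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["label", "label_description"])  # required header
        for label, description in labels:
            name = label.strip()
            # Assumption: empty and duplicate labels are treated as errors.
            if not name:
                raise ValueError("label must not be empty")
            if name in seen:
                raise ValueError(f"duplicate label: {name}")
            seen.add(name)
            # The description is optional, so fall back to an empty string.
            writer.writerow([name, (description or "").strip()])
```

Calling `write_label_file("labels.csv", LABELS)` produces a file that can then be uploaded when creating the classifier.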