Verifying a classifier is important because it ensures that data is not classified incorrectly. To make sure that your classifier is doing what it is supposed to, follow the steps outlined here.
To verify our trainable classifier, we go back to the list in the previous section and look at steps 8-12. Here, we have to give our take on what the classifier has identified. Does it behave as we configured it to?
1. Go to your classifier and test it on new content.
Figure 4.20 – Does the classifier identify relevant data?
2. Make sure to add your input for whether the data is relevant.
Figure 4.21 – Click on Yes if the data is relevant, No if not
3. Simply put, we let our classifier scour through even more data and make our decisions based on the classifier's predictions. If there are errors or data is classified inaccurately, we need to look deeper into how the classifier operates and how it identifies data.
We need to remember that the classifier always bases its decisions on the seed content it was trained on. If data gets classified the wrong way, we might need to retrain our classifier using new seed content, which we will cover in the next topic.
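To make the review step concrete, here is a minimal sketch of how the Yes/No answers from the verification steps could be tallied outside the compliance center. It is not part of any Microsoft tooling: the `ReviewedItem` type, the file names, and the 0.8 threshold are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReviewedItem:
    """One item the classifier flagged, plus the reviewer's verdict."""
    name: str
    relevant: bool  # True = reviewer clicked Yes, False = reviewer clicked No


def precision(items):
    """Share of flagged items that the reviewer confirmed as relevant."""
    if not items:
        raise ValueError("no reviewed items to score")
    confirmed = sum(1 for item in items if item.relevant)
    return confirmed / len(items)


def needs_retraining(items, threshold=0.8):
    """Hypothetical rule: retrain when precision falls below the threshold."""
    return precision(items) < threshold


# Made-up review results, for illustration only.
reviewed = [
    ReviewedItem("contract_a.docx", True),
    ReviewedItem("contract_b.docx", True),
    ReviewedItem("lunch_menu.docx", False),
    ReviewedItem("contract_c.docx", True),
]
print(f"precision: {precision(reviewed):.2f}")
print("retrain?", needs_retraining(reviewed))
```

Tracking a simple number like this across review rounds can make it easier to judge whether the classifier is improving or whether new seed content is needed.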
Retraining a classifier
Say that a classifier is misbehaving, or that we need it to classify data other than what it was originally intended for. The following figure shows the process involved in retraining a classifier:
Figure 4.22 – Process of retraining a classifier
Either situation calls for retraining. The steps involved are as follows:
1. To start the retraining process, head over to Content explorer in the Microsoft 365 compliance center. You’ll find the Content explorer tab under Data classification:
Figure 4.23 – Screenshot showing the location of Content explorer in the compliance center
2. Here, we will go to Filter on labels | Info types or categories, expand Trainable Classifiers, and select the classifier we wish to retrain.
3. Choose an item the classifier has processed and select Provide feedback.
4. In the Detailed feedback pane, we can indicate whether an item is a true positive (Match) or a false positive (Not a Match).
5. Once we have provided our feedback to the classifier, it will automatically start retraining. Retraining takes 1-4 hours, and any classifier can be retrained a maximum of two times per day.
When retraining finishes, we get an overview of the classifier that looks like this:
Figure 4.24 – Overview of our classifier after retraining
6. Here, we can choose to republish the classifier if the results are satisfactory, start the retraining process once more, or simply do nothing, in which case the classifier remains published as it was before the first retraining process started.
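The twice-per-day retraining limit from step 5 is easy to lose track of when several people provide feedback. The following is a minimal sketch of a local check for it. It does not call any Microsoft API, the `can_retrain` helper is hypothetical, and it assumes that "per day" means the same calendar date, which the text does not specify.

```python
from datetime import datetime

MAX_RETRAINS_PER_DAY = 2  # limit stated in step 5


def can_retrain(previous_runs, now):
    """Return True if fewer than the daily maximum of retrains started on now's date.

    Assumption: "per day" is interpreted as the same calendar date.
    """
    runs_today = [run for run in previous_runs if run.date() == now.date()]
    return len(runs_today) < MAX_RETRAINS_PER_DAY


# Made-up retraining start times, for illustration only.
runs = [datetime(2024, 5, 6, 9, 0), datetime(2024, 5, 6, 14, 30)]
print(can_retrain(runs, datetime(2024, 5, 6, 18, 0)))  # limit reached for that date
print(can_retrain(runs, datetime(2024, 5, 7, 8, 0)))   # new date, so allowed again
```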
In this section, we have covered how to retrain our classifier so that it identifies our data more accurately.
Summary
This chapter covered trainable classifiers: what they are, how they can be used, how to create them, and how to manage them to make sure they work as intended.
In the next chapter, we will take a deep dive into how to create and manage sensitivity labels.