Building a keyword dictionary using the Security & Compliance Center - Microsoft SC-400 Certification

You need to perform the following steps:

  1. Ensure you are logged into the Microsoft 365 Security & Compliance Center (https://protection.office.com/) using an account that has Global Admin privileges.
  2. Next, go to Classification | Sensitive info types, as shown in the following screenshot:

Figure 3.14 – Sensitive info types

3. Click on Create sensitive info type and then enter a name and a description for the sensitive info type you want, as shown in the following screenshot. Then, click Next:

Figure 3.15 – Name and Description

4. Select Create pattern | Add primary element and choose Keyword dictionary:

Figure 3.16 – Add primary element

5. You now have a few different options, as follows:

  • Enter a name and type a list of keywords directly.
  • Click on Choose from existing dictionaries to select a built-in list of predefined words.
  • Click on Upload a dictionary to upload either a .csv or .txt file that has been created in advance:

Figure 3.17 – Add keyword dictionary

6. Now, click on Done and then Create. Review your configuration and click on Next and then Create to complete the configuration.

Next, we will learn about creating a keyword dictionary from a file using PowerShell.

Creating a keyword dictionary from a file using PowerShell

When creating a large dictionary, you will often use keywords from a file or a list that has been exported from some other source. To complete this exercise, you will need to connect to the Security & Compliance Center in PowerShell, as shown in the Defining the schema for your database of sensitive information section, earlier in this chapter:

  1. Place the keywords into a text file and ensure that each one is on a separate line.
  2. Save the file with Unicode encoding.
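  3. Run the following cmdlet in PowerShell to read the file into a variable as raw bytes. Replace <filename> with the full path to the saved keywords file. Note that -Encoding Byte is only available in Windows PowerShell 5.1; in PowerShell 7, use -AsByteStream instead: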

$fileData = Get-Content <filename> -Encoding Byte -ReadCount 0

4. Run the following cmdlet to create the dictionary:

New-DlpKeywordDictionary -Name <name> -Description <description> -FileData $fileData

Replace <name> and <description> with whatever name and description you wish to give the keyword dictionary. You can use keyword dictionaries in custom sensitive information types and DLP policies. For further information on this, you can refer to the following Microsoft Docs link: https://docs.microsoft.com/en-us/microsoft-365/compliance/create-a-keyword-dictionary?view=o365-worldwide.
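The four steps above can be sketched as a single script. This is a minimal illustration rather than a definitive implementation: the file location, keyword values, dictionary name, and description are all hypothetical placeholders, and it assumes you are already connected to Security & Compliance PowerShell. It reads the file with [System.IO.File]::ReadAllBytes, which behaves the same in Windows PowerShell 5.1 and PowerShell 7, in place of Get-Content -Encoding Byte.

```powershell
# Hypothetical example values; substitute your own path, keywords, and names.
$keywordFile = Join-Path ([System.IO.Path]::GetTempPath()) 'keywords.txt'
$keywords    = 'Project Falcon', 'Confidential', 'Internal only'

# Steps 1-2: write one keyword per line and save the file as Unicode (UTF-16 LE).
Set-Content -Path $keywordFile -Value $keywords -Encoding Unicode

# Step 3: read the whole file as a byte array.
$fileData = [System.IO.File]::ReadAllBytes($keywordFile)

# Step 4: create the dictionary. This cmdlet is only available inside an
# active Security & Compliance PowerShell session.
New-DlpKeywordDictionary -Name 'Example Keywords' -Description 'Sample keyword dictionary' -FileData $fileData

# Optional check: retrieve the dictionary that was just created.
Get-DlpKeywordDictionary -Name 'Example Keywords'
```

Writing the file from the script keeps the encoding explicit, which avoids the common mistake of saving the list in a text editor with a different default encoding.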

Summary

In this chapter, we covered several topics: selecting a sensitive information type based on your organization’s requirements, creating and managing a custom sensitive information type via the Microsoft 365 compliance center and PowerShell, creating custom sensitive information types with an exact data match, what document fingerprinting is and why you should implement it, and creating a keyword dictionary via the Microsoft Security & Compliance Center and PowerShell.

We have run through multiple exercises. If you have not yet completed them, I strongly recommend that you do so before moving on to the next chapter.

The next chapter will cover creating and managing trainable classifiers.
