User Guide
Glossary of Terms

An Agent is a worker process, installed locally or on a remote VM, which scans and evaluates locations (cloud or local) for sensitive data such as social security numbers, credit card numbers, etc.
- For speed and efficiency, Agents are typically deployed in a collection, called a Discovery Team. See Discovery Team, below.
- See also What is an Agent?

See Spirion Agent, below.

An Agent Policy is a set of rules for the agent to follow.

An Asset or Data Asset is a location, local or remote (such as cloud-based), that contains Targets (a Target is any data location inside an Asset that SDP can scan).
- For example, an SQL server (Asset) with multiple SQL Databases hosted on it (Targets).
A location can be both an Asset and a Target.
- For example, a workstation (Asset and single Target).

The action of applying a label to a location via the file system, directly within the file metadata, or within the SDP database.

For both macOS and Windows, see Select Files by Extension.

Compensating Controls are actions applied to sensitive data discovered in your environment.
- These actions include: restricted access, script execution, quarantine, ignore, and Playbook user actions.
In the SPIglass™ Dashboard, Compensating Controls represents the total cost of all sensitive data matches with compensating controls in place. All costs are taken from the dollar value assigned to each data type in the global data types settings.
In the Organizational Data Risk semi-circle chart, Compensating Controls are shown against Inherent and Residual data. For more information, see SPIglass™ Dashboard.

Defined data structures that represent different types of sensitive data, such as a credit card number, password, or social security number.

When using scan playbooks, including adding a new scan playbook, you add or edit data types. During this process you are prompted with the option to specify additional valid delimiters when searching. In this box enter any additional valid delimiters Sensitive Data Platform may encounter when scanning for the specified data type in your environment. A delimiter is a character or sequence of characters that marks the boundary between different parts of data, like a comma in CSV files, a tab, or a space. Delimiters help separate individual items or fields in text or data streams, enabling applications such as Sensitive Data Platform to parse and understand the data's structure.
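As a rough illustration of what a delimiter does, the following Python sketch splits a line of text into fields on a set of delimiters. The function name `split_on_delimiters` and the sample delimiters are hypothetical; this does not show how Sensitive Data Platform parses data internally.

```python
import re

def split_on_delimiters(text, delimiters):
    """Split text into fields wherever any of the given delimiters occurs."""
    # Try longer delimiters first so that "::" is preferred over ":".
    ordered = sorted(delimiters, key=len, reverse=True)
    # Escape each delimiter so characters such as "|" are treated literally.
    pattern = "|".join(re.escape(d) for d in ordered)
    # Drop empty strings produced by adjacent delimiters.
    return [field for field in re.split(pattern, text) if field]

# A comma and a tab both act as field boundaries in this sample line.
fields = split_on_delimiters("4111-1111-1111-1111,Jane Doe\t2027", [",", "\t"])
```

Adding a delimiter to the list changes where field boundaries fall, which in turn changes which pieces of text are evaluated as candidate matches.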

An end-user-provided list of terms that the Sensitive Data Engine (SDE) can search for.

The action of scanning a file system to find files and folders OR databases / blob stores to identify data locations.

Discovery Teams are a collection of Spirion Agents, installed on physical and/or virtual machines (local or remote), which are used to scan servers and cloud sources (SharePoint, Box, Google Workspace, etc.) for sensitive data. See What is an Agent?

A group of agents configured to scan targets and collectively work to complete that scan.

See Discovery Team, above.

A simple data type that is an exact case-sensitive match.

For users of Sensitive Data Platform who want to exclude specific results from one or more searches, the Pattern-Based Global Ignore List was introduced in version 13.6. This feature broadly excludes results that are exact matches or that match regex patterns. It is accessible via the UI and does not require a separate SDD or Search API.
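The idea of excluding results by exact value or by regex pattern can be sketched in Python as follows. The function `is_ignored` and the sample values are hypothetical, and the use of a full-string match is an assumption; the product may apply ignore patterns differently.

```python
import re

def is_ignored(match_text, exact_values, regex_patterns):
    """Return True if a match should be excluded from results."""
    # Exact entries must equal the match text character for character.
    if match_text in exact_values:
        return True
    # Assumption: a regex entry must match the whole match text.
    return any(re.fullmatch(p, match_text) for p in regex_patterns)

exact = {"000-00-0000"}          # hypothetical exact-match ignore entry
patterns = [r"123-45-\d{4}"]     # hypothetical pattern-based ignore entry
matches = ["000-00-0000", "123-45-6789", "078-05-1120"]
kept = [m for m in matches if not is_ignored(m, exact, patterns)]
```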

A simple data type that is an exact case-sensitive match.

The amount of time elapsed since the agent sent a signal indicating it was active/ready.

An end-user who is directly logged into a given computer (that is, "At the keyboard" and not through Remote Desktop/RDP).

A scan match result such as "c:\temp\chat.docx".
A Location can have one to many matches.

Unmanaged data refers to data that is stored and managed by the data owner or organization, who is responsible for all aspects of data management:
- Infrastructure
- Security
- Maintenance
This contrasts with Managed data, where a third-party provider, such as Microsoft Azure or AWS, handles these responsibilities.
- Unmanaged data example: Running a database on your own servers, where you manage the hardware, software, and security yourself.
- Managed data example: Using a managed database service where a cloud provider handles the underlying infrastructure, software, and database management.

An instance of Sensitive Data, such as a single credit card number, found in a Location.
Each individual match is unique.

Spirion’s password syntax rules are as follows:
The password must be at least 10 characters long and contain a minimum of:
- 1 alpha character
- 1 uppercase
- 1 lowercase
- 1 number
- 1 special character
Use only passwords which conform to these rules.
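To check a candidate password against these rules before submitting it, a minimal Python sketch might look like the following. The function name `meets_password_rules` is hypothetical, and treating "special character" as ASCII punctuation is an assumption; the exact set of accepted special characters is not stated here.

```python
import string

def meets_password_rules(password):
    """Check the length and character-class rules listed above."""
    return (
        len(password) >= 10
        and any(c.isalpha() for c in password)   # at least 1 alpha character
        and any(c.isupper() for c in password)   # at least 1 uppercase
        and any(c.islower() for c in password)   # at least 1 lowercase
        and any(c.isdigit() for c in password)   # at least 1 number
        # Assumption: "special character" means ASCII punctuation.
        and any(c in string.punctuation for c in password)
    )
```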

Payment cardholder data includes information such as credit card numbers, expiration dates, and cardholder names, and is protected by the Payment Card Industry Data Security Standard (PCI DSS).
This standard is a set of security rules created by major credit card brands to ensure that any entity processing, transmitting, or storing this sensitive data maintains a secure environment.
Businesses must be PCI compliant to protect this data, prevent fraud, and avoid penalties from payment processors and card networks.

Personally Identifiable Information.
Any information that can identify a person.
Examples include: name, address, social security number, telephone number, email address, gender, race, birth date, and medical, educational, financial, and employment information.

A sequential set of rules which define the action(s) to be taken when performing a scan.
For example, when a scan discovers sensitive data matches, take the action of referring those specific matches to a specific department for review and remediation.

The administrative view for creating and defining a playbook.

The end user view for investigation and remediation of matches.

Settings that determine how an agent operates at its base state.

Spirion agents parse all known files and generate a list of locations with sensitive data, which is put into the PostgreSQL database. Additional Spirion agents consume the list provided by the PostgreSQL queue and send their results back to PostgreSQL. Those results are batched and sent back to the Spirion console.
- Spirion Agents version 13.6 and later use PostgreSQL.
PostgreSQL, or Postgres, is a powerful, free, and open-source object-relational database management system (ORDBMS) known for its reliability, feature robustness, and extensibility. It uses the SQL language for querying and transactions. PostgreSQL supports a wide range of data types, complex queries, and transactional integrity (ACID-compliant), making it a popular choice for enterprise-level applications, web services, and data analytics.
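The produce-and-consume flow described for the agents can be pictured with the Python sketch below. It is only a conceptual model: an in-memory `queue.Queue` stands in for the PostgreSQL-backed queue, everything runs in one process, and the names `enqueue_locations`, `consume`, and `scan` are hypothetical.

```python
import queue

def enqueue_locations(locations):
    """Producer step: place each location to be scanned on the work queue."""
    work = queue.Queue()
    for location in locations:
        work.put(location)
    return work

def consume(work, scan):
    """Consumer step: take locations off the queue and record scan results."""
    results = []
    while True:
        try:
            location = work.get_nowait()
        except queue.Empty:
            break  # nothing left to scan
        results.append((location, scan(location)))
    return results

work = enqueue_locations(["a.txt", "b.txt"])
# The lambda is a placeholder for a real scan of the location.
results = consume(work, lambda location: len(location))
```

In a real deployment the producers and consumers are separate agents, so the shared queue is what allows the work to be divided among them.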

Isolating vulnerable or sensitive data by moving the data to a secure location. For example, sensitive data, such as credit card numbers, are discovered on a user's laptop and moved to a secure OneDrive folder. See How to Quarantine to OneDrive.

Spirion agents parse all known files and generate a list of locations with sensitive data, which are placed into a queue managed by RabbitMQ. Additional Spirion agents consume the list built up in the queue. Agents perform scans with queue information and send their results back to RabbitMQ. Those results are batched and sent to the Spirion console. Note: RabbitMQ requires the Erlang programming language.
- RabbitMQ is used by Spirion Agents v13-13.5. Spirion Agents v13.6 and later use PostgreSQL.
Note: Sensitive Data Manager (SDM), Spirion Mac Agents, and Spirion Linux Agents do not use RabbitMQ.

Data redaction permanently removes or obscures sensitive information within documents or records, which prevents unauthorized individuals from viewing or recovering it.
For example, sensitive data, such as passwords or social security numbers, is discovered in a text file and replaced with characters such as 'X' or '#'.
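The character-replacement idea can be sketched in Python as follows. The pattern `SSN_PATTERN` and the function `redact` are hypothetical and handle only the dashed `NNN-NN-NNNN` layout; this is an illustration of masking a string, not the product's redaction routine.

```python
import re

# Assumption: only the dashed NNN-NN-NNNN layout is handled in this sketch.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact(text, mask_char="X"):
    """Replace each digit of an SSN-like match with mask_char, keeping dashes."""
    def mask(match):
        return re.sub(r"\d", mask_char, match.group(0))
    return SSN_PATTERN.sub(mask, text)

redacted = redact("SSN on file: 078-05-1120.")
```

Masking a string in memory is only part of redaction; for the change to be permanent, the stored file or record must be overwritten with the masked content.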

A common method of finding patterns within blocks of text. RegEx is used in Sensitive Data Platform to create custom search criteria for locating specific patterns in data. This includes identifying sensitive information such as personal data, financial details, and other confidential information by defining patterns that match various formats of data. RegEx enables precise and flexible searching, which is essential for data discovery and compliance with privacy regulations. Regular expressions can be run directly from the Spirion Client interface or via a Console policy. Additionally, you can test regular expressions using online RegEx testers like regex101.com, ensuring that they align with the Spirion implementation.
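As a small example of a custom pattern, the Python sketch below searches for a made-up employee ID format. The `EMP-` prefix and six-digit layout are hypothetical, and Python's `re` module may differ in places from the regular expression implementation Spirion uses, so any pattern should still be tested in the product itself.

```python
import re

# Hypothetical custom data type: "EMP-" followed by exactly six digits.
EMPLOYEE_ID = re.compile(r"\bEMP-\d{6}\b")

sample = "Badge EMP-004217 was issued; EMP-42 is not a valid ID."
found = EMPLOYEE_ID.findall(sample)
```

The word boundaries (`\b`) keep the pattern from matching inside longer strings, which is one way to reduce false positives.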

Remediation is a proactive approach to addressing vulnerabilities and ensuring data is accurate, complete, and consistent, thereby mitigating risks and adhering to regulations.
Quarantine, redaction, and deletion of vulnerable data are all examples of remediation actions.

Scans are the searches that agents perform on endpoints (targets) either to find file locations (Discovery Scan) or to find specific data types (Sensitive Data Scan) within files and folders.

The action of scanning a file system to find files and folders OR databases / blob stores to identify data locations.

See Sensitive Data Scan below.

Settings that determine what is scanned, where scans occur, which agents perform the scan, and what configuration options are used during that scan.
For Sensitive Data Scans this includes a Playbook.

The action of scanning within a file, folder, database, or blob store for specific data type matches.

This type of scan enables you to search for sensitive data, such as a credit card number, password, or social security number, within defined Targets and take actions on them based on the playbook rules defined for them.

Search engine logic created by end-users to find custom data types with accuracy.

Search engine used for classification comprised of various modules (for example, RegEx, Dictionary, Keyword, and so on).

A Social Insurance Number (SIN) is Canada's unique 9-digit identifier used for working, accessing government programs, and filing income tax returns.
It is confidential and must be protected to prevent fraud.
You are required to provide your SIN to your employer and other financial institutions for income-related matters.
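A 9-digit number is not necessarily a plausible SIN. The Python sketch below shows one way to screen candidates, assuming the commonly described Luhn checksum applies. The function name `looks_like_sin` is hypothetical, and this is not a description of how Sensitive Data Platform validates SIN matches.

```python
def looks_like_sin(value):
    """Return True if value has 9 digits and passes a Luhn checksum."""
    # Allow only digits, spaces, and dashes (for example "046 454 286").
    if any(not (c.isdigit() or c in " -") for c in value):
        return False
    digits = [int(c) for c in value if c.isdigit()]
    if len(digits) != 9:
        return False
    total = 0
    for index, digit in enumerate(digits):
        if index % 2 == 1:      # every second digit, counting from the left
            digit *= 2
            if digit > 9:
                digit -= 9      # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```

Passing the checksum only means the number is well formed; it does not confirm that the number was ever issued.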

Spirion Agent is the name used for the Spirion application installed separately from Spirion Sensitive Data Platform. The Spirion Agent is installed on an endpoint such as your local PC (Windows and Mac are supported. Linux is not supported).
The Spirion Agent provides a user-friendly interface for testing, configuring scans, viewing results, and managing sensitive data policies. For example, you can test the connection string to a database such as PostgreSQL or MSSQL. See How to Configure and Test a Database for Searching.
Note: The Spirion Client is required to configure database scanning.
• The client offers faster responses and a better view of how scans are actually progressing.
The Spirion Client does not enable you to do the following:
• Leverage playbooks in Sensitive Data Platform.
• Configure certain Targets (Amazon S3, Exchange Online, others)

Settings that are required but not configurable by the user.

Settings that are used until changed by a user.

A Tag is a kind of container.
A Tag is a manual or dynamic group of Targets (such as Marketing Laptops or HR Databases).
There are three types of Tags:
- IP Range
- Manual
- Conditional
You can select the Targets for your Tag manually, or you can define the conditions that determine which Targets are placed into your Tag.
See Tag Management.
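To illustrate how an IP Range Tag could group Targets, the Python sketch below selects the Targets whose address falls within a range. The function `targets_in_ip_range`, the Target names, and the addresses are hypothetical; in the product, Tags are defined in Tag Management rather than in code.

```python
import ipaddress

def targets_in_ip_range(targets, first_ip, last_ip):
    """Return names of targets whose address lies within [first_ip, last_ip]."""
    low = ipaddress.ip_address(first_ip)
    high = ipaddress.ip_address(last_ip)
    return [
        name
        for name, address in targets.items()
        if low <= ipaddress.ip_address(address) <= high
    ]

# Hypothetical inventory of Targets and their IP addresses.
targets = {"hr-db-01": "10.0.5.12", "mkt-laptop-07": "10.0.9.40"}
hr_tag = targets_in_ip_range(targets, "10.0.5.1", "10.0.5.254")
```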

Any data location within an Asset that SDP can scan.
Targets can be in a “physical” box that can be scanned or they can be in a cloud asset.
Examples:
- Targets in Local Assets: SQL Databases on a local SQL server
- Targets in Cloud Assets:
  - Databases on Amazon S3, Azure Blob, Bitbucket, Google Drive
  - File Directories in SharePoint
- Targets in Email:
  - Exchange On-Prem email which is housed on a local server
  - Exchange Online email which is housed in the cloud
- Targets in Virtual Machines: databases on an Oracle VM, Amazon EC2, etc.

User Level Remediation.
Empowers the end user to address sensitive data policy violations, issues or risks and resolve them.
For example, a physical machine such as a local laptop or desktop or a cloud asset with data such as Amazon S3 or SharePoint.

Unmanaged data refers to data that is stored and managed by the data owner or organization, who is responsible for all aspects of data management:
- Infrastructure
- Security
- Maintenance
Unmanaged data is often stored in various locations without a clear management structure.
Unmanaged data often lacks proper access controls, encryption, and regular security audits.
Without proper oversight, unmanaged data is more vulnerable to breaches, malware, and other security threats.
This contrasts with Managed data, where a third-party provider, such as Microsoft Azure or AWS, handles these responsibilities.
- Unmanaged data example: Running a database on your own servers, where you manage the hardware, software, and security yourself.
- Managed data example: Using a managed database service where a cloud provider handles the underlying infrastructure, software, and database management.

Job for the agent to do (for example, Discovery, Classification, Remediation).

The logic and actions to be performed automatically when matches are validated.