Connectors/Learn: Difference between revisions

From BigID Developer Portal
Revision as of 17:27, 2 January 2025

Introduction to Connectors

Connectors vs BigID Applications

BigID offers two main ways to extend your system’s functionality: connectors and applications. Both aim to give you more value from BigID, but they do so in very different ways.

Applications

Applications allow you to add new screens and business logic using the BigID application framework. While these applications can contact external systems, an application cannot be a data source or entity source within BigID. As a result, applications do not directly take part in the scanning process.

The main use case for applications is to add screens or business logic to your BigID system. For example, an application could add a screen that shows a report with additional statistics specific to your industry, or it could synchronize findings from BigID to an external DLP tool.

More information about applications and how to build your own is available in our BigID AppDev Pro class.

Connectors

For BigID to scan a system, a connector for that system must exist. The BigID product ships with 50+ connectors out of the box for common enterprise systems. Connectors serve as the translator between your data source’s language and the language of BigID. They allow BigID to support any data source, whether it is a file share or a NoSQL database.

For a full list of supported data sources and their compatibility, visit https://docs.bigid.com/docs/data-source-feature-summary.

BigID Scanning Process

Before we create our connector, we need to understand more about the scanning process. Scanning is how BigID gets insights about your data. In a scan, BigID does the following:

  • Scan data repositories for PI, correlating records to entities across multiple data sources using advanced multi-prong data intelligence techniques.
  • Scan metadata to identify data objects containing PI that allow Open Access.
  • Classify data and files using advanced data analysis techniques.

A high-level view of this scanning process is useful when thinking about the connector you wish to create.

Steps in Scanning

  1. **User executes a scan**: This can be done via the UI, a scheduled scan, or an API call.
  2. **BigID sends a message to the scanner**: The scanner is a separate server located close to your data source.
  3. **BigID scanner contacts the data source**: The scanner uses the appropriate connector to access the data source.
  4. **Data source returns information to the scanner**: The data source sends the requested data back to the scanner.
  5. **Scanner builds a search tree and returns it to BigID**: The search tree is used to correlate data across all sources.
  6. **User sees details in Scan Result Details**: The results are displayed in a report where users can view and adjust connections.
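
The steps above can be pictured with a small sketch. Everything here is hypothetical: `DemoConnector`, `DemoScanner`, and the sample data are illustrative names, and a plain dictionary stands in for the search tree. It only shows the direction of the calls, not how BigID implements them.

```python
class DemoConnector:
    """Hypothetical connector: translates scanner requests into data source reads."""

    def __init__(self, source):
        self.source = source  # stand-in for a real data source

    def fetch(self, object_name):
        return self.source.get(object_name, [])


class DemoScanner:
    """Hypothetical scanner: reaches the data source only through a connector."""

    def __init__(self, connector):
        self.connector = connector

    def handle_scan_message(self, object_names):
        # Steps 3-5: contact the source, collect what comes back, index it.
        index = {}  # simplified stand-in for the search tree
        for name in object_names:
            for record in self.connector.fetch(name):
                for field_name, value in record.items():
                    index.setdefault(value, []).append((name, field_name))
        return index


source = {
    "Customers": [{"email": "a@example.com"}],
    "Orders": [{"contact": "a@example.com"}],
}
scanner = DemoScanner(DemoConnector(source))
results = scanner.handle_scan_message(["Customers", "Orders"])
```

In this toy run the same value is found in two objects, which is the kind of cross-source correlation the search tree exists to support.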

Connector Types

BigID supports two types of connectors:

  • **Internal (Java-based Connectors)**: These connectors are written in Java and distributed as JAR files. They offer customization options but are complex to develop and deploy.
  • **External (Generic REST API Connectors)**: These allow developers to create connectors using their preferred programming language. External connectors can be hosted on any server accessible to both the scanner and the data source.
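
To make the external option concrete, here is a minimal sketch of a connector exposed as a small HTTP service, written with the Python standard library only. The `/objects` route, the response shape, and the sample data are assumptions made for illustration; they are not BigID's documented API, so check the product documentation for the real contract.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Hypothetical sample data; a real connector would query its data source.
OBJECTS = ["Customers", "Orders"]


class ConnectorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "/objects" is an illustrative route, not a documented BigID endpoint.
        if self.path == "/objects":
            body = json.dumps({"objects": OBJECTS}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_connector(host="127.0.0.1", port=0):
    """Start the demo connector in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), ConnectorHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def list_objects(server):
    """Play the scanner's role: call the connector over HTTP."""
    host, port = server.server_address[:2]
    with urlopen(f"http://{host}:{port}/objects") as response:
        return json.load(response)["objects"]
```

Because the scanner and the connector only share an HTTP interface, the connector could be rewritten in any language without changing the caller.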

Deciding on a Connector Type

Unless your use case requires specific features of internal connectors, external connectors are recommended due to their ease of development and maintenance.

Developing a Structured Connector

The Connector Supermarket

The hierarchy of BigID connectors mimics that of a supermarket:

  • **Departments are Objects**: Departments represent similar types of items, akin to how objects in connectors represent a specific type of data.
  • **Items are Records**: Items within departments are like records within objects.
  • **Information is Fields**: Each record contains fields with names, types, and values.
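
One way to picture this hierarchy in code is with three small data classes. The class and attribute names below are illustrative choices, not names taken from BigID.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Field:
    """A single piece of information: a name, a type, and a value."""
    name: str
    type: str
    value: Any


@dataclass
class Record:
    """An item on the shelf: one record made of fields."""
    id: str
    fields: list[Field] = field(default_factory=list)


@dataclass
class DataObject:
    """A department: a named collection of similar records."""
    name: str
    records: list[Record] = field(default_factory=list)


customers = DataObject(
    "Customers",
    [Record("1", [Field("email", "string", "ada@example.com")])],
)
```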

Connector Services

Structured connectors must fulfill the following services:

  • **List what fields all items in an object have.**
  • **List what objects exist.**
  • **List what records are inside an object.**
  • **Return the fields and values for a given record.**
  • **Search for records with specific fields.**
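
The five services can be sketched against an in-memory data set. This is a hedged illustration: the class name, method names, and return shapes are assumptions, and `list_fields` here reports the union of field names across records, which is only one possible reading of the first service.

```python
class InMemoryStructuredConnector:
    """Hypothetical connector over {object name: {record id: {field: value}}}."""

    def __init__(self, data):
        self.data = data

    def list_objects(self):
        return sorted(self.data)

    def list_fields(self, object_name):
        # Union of field names across every record in the object.
        names = set()
        for record in self.data[object_name].values():
            names.update(record)
        return sorted(names)

    def list_records(self, object_name):
        return sorted(self.data[object_name])

    def get_record(self, object_name, record_id):
        # Return a copy so callers cannot mutate the underlying data.
        return dict(self.data[object_name][record_id])

    def search(self, object_name, **criteria):
        # Ids of records whose fields match every given value.
        return sorted(
            record_id
            for record_id, record in self.data[object_name].items()
            if all(record.get(key) == value for key, value in criteria.items())
        )


connector = InMemoryStructuredConnector({
    "Customers": {
        "1": {"name": "Ada", "country": "UK"},
        "2": {"name": "Linus", "country": "FI"},
    }
})
finnish_customers = connector.search("Customers", country="FI")
```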

Developing an Unstructured Connector

Unstructured Connectors

Unstructured connectors handle systems that store unstructured data, like files. The hierarchy includes:

  • **Containers**: Storage mechanisms, such as folders or buckets.
  • **Objects**: Individual files within containers.
  • **Metadata**: Information about files, such as creation date and size.

Implementing Services

Unstructured connectors must provide services for listing containers, retrieving metadata, returning file contents, and searching for data.
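
As a rough sketch of these services, the class below treats the sub-folders of a local directory as containers and the files inside them as objects. The class and method names are hypothetical. It also reports the modification time rather than the creation date, because creation time is not available portably through `stat`.

```python
from datetime import datetime, timezone
from pathlib import Path


class LocalFolderConnector:
    """Hypothetical connector treating sub-folders of `root` as containers."""

    def __init__(self, root):
        self.root = Path(root)

    def list_containers(self):
        return sorted(p.name for p in self.root.iterdir() if p.is_dir())

    def list_objects(self, container):
        folder = self.root / container
        return sorted(p.name for p in folder.iterdir() if p.is_file())

    def get_metadata(self, container, name):
        stat = (self.root / container / name).stat()
        modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
        return {"size": stat.st_size, "modified": modified.isoformat()}

    def get_content(self, container, name):
        return (self.root / container / name).read_bytes()

    def search(self, text):
        # Naive byte search across every file; fine for a demo, slow at scale.
        needle = text.encode("utf-8")
        return sorted(
            (container, name)
            for container in self.list_containers()
            for name in self.list_objects(container)
            if needle in self.get_content(container, name)
        )
```

Pointing `LocalFolderConnector` at a directory and calling `search("some text")` returns the `(container, object)` pairs whose contents include that text.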

Distributing a Connector

Private Distribution

Private distribution allows connectors to be shared internally or with specific customers.

Public Distribution

Public distribution makes connectors available to the community via platforms like:

  • **BigID Community**: Open-source connectors shared freely.
  • **BigID Marketplace**: Paid or free connectors for BigID customers.

Packaging a Connector

Connectors are packaged as Docker containers. The process involves creating a Dockerfile, building the image, and compressing it for distribution.
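
The packaging steps might look like the sketch below, which only assembles the Dockerfile text and the command lines without running them. The base image, the `connector.py` entry point, and the image and archive names are assumptions chosen for illustration; `docker build -t`, `docker save -o`, and `gzip` are standard commands.

```python
import shlex

# Hypothetical Dockerfile for a Python-based connector.
DOCKERFILE = """\
FROM python:3.12-slim
WORKDIR /app
COPY . .
CMD ["python", "connector.py"]
"""


def packaging_commands(image, archive):
    """Return the commands to build an image, save it to a tar file, and compress it."""
    return [
        ["docker", "build", "-t", image, "."],
        ["docker", "save", "-o", archive, image],
        ["gzip", archive],  # produces archive + ".gz"
    ]


for command in packaging_commands("my-connector:1.0", "my-connector.tar"):
    print(shlex.join(command))
```

Each command list can be passed to `subprocess.run(command, check=True)` once Docker is installed on the build machine.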

Deploying a Connector

To deploy a connector in BigID:

  1. Enable external connectors in the environment.
  2. Use Docker to run the connector.
  3. Add the connector as a data source in BigID.
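
For the second step, a small helper can assemble the `docker run` command and confirm that the connector's port accepts connections before you add it as a data source. The image name and port numbers are placeholders, and how external connectors are enabled in your environment is not covered by this sketch.

```python
import socket


def run_command(image, host_port, container_port):
    """Return a docker CLI call that runs the image detached with one published port."""
    return ["docker", "run", "-d", "-p", f"{host_port}:{container_port}", image]


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A reachable port only shows that something is listening; the connection test in the BigID data source setup remains the real check.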

Connector Setup and Testing

Connector Setup

Adding a connector as a data source involves:

  1. Configuring data source parameters.
  2. Testing the connection.
  3. Saving the configuration.

Testing Your Connector

Test the connector by running scans and viewing results in the BigID UI.