Connectors/Learn
Revision as of 21:18, 8 January 2025
What is a BigID Connector
BigID Connectors allow your BigID system to provide insights about new types of data. Whether that's a known data type like CSV from a new type of data source or something completely new to the BigID ecosystem, a connector will allow you to bring BigID's data discovery capabilities to that system.
Why do we need connectors?
Every data source has its own way of communicating with third parties. Some data sources return information nicely organized; others return it as a jumbled mess. For BigID to give you the insights you expect, data needs to be fed to BigID in a consistent way. Connectors work as translators between the multitude of formats that data sources have adopted and the standard format BigID expects. Note that even if the data format is the same (REST JSON, REST XML, GraphQL, etc.), small differences make it difficult to reuse connectors. Think of a connector as a way to interface with a single system.
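The translator idea can be sketched in a few lines of Python. This is only an illustration: the two source payloads, the function names, and the `name`/`content` record shape are all hypothetical and are not BigID's actual format. The point is that each source needs its own small adapter, even when both sources describe the same kind of data.

```python
import json
import xml.etree.ElementTree as ET


def from_json_source(payload: str) -> list:
    """Hypothetical source A: a JSON array of objects with 'id' and 'body' keys."""
    return [
        {"name": str(item["id"]), "content": item["body"]}
        for item in json.loads(payload)
    ]


def from_xml_source(payload: str) -> list:
    """Hypothetical source B: <docs><doc key="...">text</doc></docs>."""
    root = ET.fromstring(payload)
    return [
        {"name": doc.get("key"), "content": doc.text or ""}
        for doc in root.findall("doc")
    ]
```

Both functions return the same record shape, so whatever consumes the records does not need to know which source they came from.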
How are connectors implemented?
Connectors can be implemented either as a REST API or as a Java JAR file. REST connectors are bound by the limitations of HTTP connections, including timeouts, size limitations, and more. Java connectors are well suited to complex use cases, especially those involving data sources that stream data.
BigID Scanning Process
While BigID has different scanning methods (snapshots, metadata scans, Hyperscan), they all depend on scanners. Scanners allow BigID to contact data sources and create the search maps that are used to power the BigID system. Depending on your deployment model, you may have scanners located in the BigID cloud, on-premises, or in your organization's cloud provider accounts. Scanners take the form of a Docker container and require only outbound network access.
In a scan, the scanner will do the following:
- If correlation is enabled, load all correlation records in order to find them within data sources.
- Scan table and file metadata to determine access permissions and ownership.
- Classify data streams.
After a user starts a scan, the scanner will use the data in the scan request to determine what type of connection to make. In the case of REST API scans, the scanner will reach out to your connector. This means your REST connector must allow inbound network access from your scanner, and your data source must allow inbound access from the connector. For Java connectors, the scanner will directly communicate with the data source.
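The network consequences of each connection type can be written down as a small lookup. The sketch below is illustrative only; the type labels `"rest"` and `"java"` and both function names are assumptions made for this example, not BigID identifiers.

```python
def connection_path(connector_type: str) -> list:
    """Return the hops a scan request travels, in order (hypothetical labels)."""
    if connector_type == "rest":
        # The scanner calls the connector, which then calls the data source.
        return ["scanner", "connector", "data source"]
    if connector_type == "java":
        # The connector code runs inside the scanner, so there is no extra hop.
        return ["scanner", "data source"]
    raise ValueError(f"unknown connector type: {connector_type!r}")


def required_inbound_rules(connector_type: str) -> list:
    """Return (source, destination) pairs; each destination must accept
    inbound traffic from its source."""
    path = connection_path(connector_type)
    return list(zip(path, path[1:]))
```

For a REST connector this yields two rules (scanner to connector, connector to data source), while a Java connector needs only one (scanner to data source).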
Connector Types
BigID supports two types of connectors. The type you choose to connect to your data source has broad implications for setup, network security settings, and connector installation.
Internal (Java-based Connectors)
Most of the connectors you are familiar with are Java-based connectors.
These connectors are written in the Java programming language and distributed as JAR files. To install a new Java-based connector, an administrator must manually load the connector JAR file into the scanner using the command line. Thankfully, the 50+ internal connectors written by BigID are bundled in the scanner by default. The scanner directly uses these connectors’ code to connect to your data sources.
These connectors allow large amounts of customization in the scanning process and the connection to your data source. Due to the customization options, they are more complicated to create and are not the recommended connector development method for BigID customers.
External (Generic REST API Connectors)
External connectors allow you to create a connector in your favorite programming language. The scanner communicates with your connector over HTTPS, so as long as your programming language of choice can respond to web requests, it can be used to create an external connector.
External connectors can be hosted on any server that has a network connection to both your scanner and your data source.
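To give a sense of how little "respond to web requests" requires, here is a minimal sketch using only Python's standard library. It is not a BigID connector: the `/health` path and the JSON bodies are hypothetical, and the server speaks plain HTTP on localhost for brevity. A real deployment would need HTTPS, for example by terminating TLS at a reverse proxy in front of the process.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class ConnectorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "/health" is a made-up endpoint used only for this illustration.
        if self.path == "/health":
            status, payload = 200, {"status": "ok"}
        else:
            status, payload = 404, {"error": "not found"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def start_server(port: int = 0) -> HTTPServer:
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ConnectorHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_server()` returns the server object; its chosen port is available as `server.server_address[1]`, and `server.shutdown()` stops it.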
There are two different types of external connectors that you can create: unstructured and structured.
Unstructured External Connector
Unstructured connectors allow BigID to scan files from a given data source. An example of an unstructured data source is Google Drive.
Structured External Connector
Structured connectors allow BigID to scan databases. An example of a structured connector would be our MySQL connector.
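The difference between the two kinds of data can be pictured with two small record types. Everything in this sketch is hypothetical (the class names, fields, and `text_fragments` helper are invented for illustration and do not reflect BigID's data model). It only shows that a file is a single blob of content, while a table yields one value per column per row.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple, Union


@dataclass
class FileObject:
    """Unstructured data: a path plus free-form content."""
    path: str
    content: str


@dataclass
class Table:
    """Structured data: named columns and rows of values."""
    name: str
    columns: List[str]
    rows: List[Tuple]


def text_fragments(item: Union[FileObject, Table]) -> Iterator[Tuple[str, str]]:
    """Yield (location, text) pairs that a classifier could inspect."""
    if isinstance(item, FileObject):
        yield item.path, item.content
    else:
        for row in item.rows:
            for column, value in zip(item.columns, row):
                yield f"{item.name}.{column}", str(value)
```

With a table, each fragment carries its column name, so a finding can be traced back to a specific field; with a file, the only location available is the path.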