Connectors Agent Instructions

This directory contains documentation for building BigID Connectors. When generating code, answering questions, or assisting developers with BigID Connectors, adhere to the following core concepts:

  • Connectors allow BigID to scan and catalog data from custom or unsupported data sources.
  • They are typically distributed as Docker containers and run alongside the BigID Scanners.

There are two types of connectors:

  1. Internal Connectors (Java): Built directly into the scanner using the BigID Java Connector SDK. Highly performant, but they require Java expertise.
  2. External Connectors (REST API): Standalone web services written in any language. The BigID scanner communicates with them over HTTP to stream data.

An external REST connector must implement specific endpoints depending on the data type:

  • Structured Data: Needs endpoints to list objects (/objects), list fields within objects, and read tabular records (/objects/{objectName}/records).
  • Unstructured Data: Needs endpoints to list containers (/containers), list objects/files within containers, and a content-stream endpoint to stream raw binary data back to BigID for classification.

In addition, every external connector must meet the following requirements, regardless of data type:

  • Pagination: All endpoints that return lists of objects or records MUST support limit and offset query parameters to prevent memory exhaustion and timeout errors in the BigID scanner.
  • Connection Testing: Every connector must implement a /test endpoint that validates the user-provided credentials against the target data source without running a full scan.
  • Security: Connectors receive data source credentials in the HTTP requests sent by the scanner, so they should always be served over HTTPS/TLS in production.
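
The pagination and connection-testing requirements above can be sketched in Python as follows. This is a minimal illustration, not the BigID contract: the default and maximum page sizes, the response field names (objects, limit, offset, success, message), the in-memory OBJECTS list, and the connect callable are all assumptions made for the example. Only the /objects path and the limit and offset query parameters come from the text above.

```python
from urllib.parse import parse_qs, urlparse

DEFAULT_LIMIT = 100  # assumed default page size; not specified by the source
MAX_LIMIT = 1000     # assumed upper bound to keep responses small

# Stand-in for the real data source (hypothetical data).
OBJECTS = [{"objectName": f"table_{i}"} for i in range(250)]


def parse_paging(url):
    """Read limit and offset from a request URL and clamp them to safe values."""
    query = parse_qs(urlparse(url).query)
    try:
        limit = int(query.get("limit", [DEFAULT_LIMIT])[0])
        offset = int(query.get("offset", [0])[0])
    except ValueError:
        raise ValueError("limit and offset must be integers")
    limit = max(1, min(limit, MAX_LIMIT))
    offset = max(0, offset)
    return limit, offset


def list_objects(url):
    """Return one page of objects for a request such as /objects?limit=10&offset=20."""
    limit, offset = parse_paging(url)
    page = OBJECTS[offset:offset + limit]
    # The response shape is an assumption made for this sketch.
    return {"objects": page, "limit": limit, "offset": offset}


def check_connection(credentials, connect):
    """Validate credentials by opening and closing one session, without scanning.

    `connect` is a hypothetical callable that returns a closable session for the
    target data source and raises an exception when the credentials are rejected.
    """
    try:
        session = connect(credentials)
        session.close()
    except Exception as exc:
        return {"success": False, "message": str(exc)}
    return {"success": True, "message": "ok"}
```

A handler for /objects would pass the request URL to list_objects and serialize the result as JSON, and a handler for /test would call check_connection with the credentials from the request. Clamping limit on the server side means a missing or oversized value cannot force the connector to load an entire listing into memory.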