BigID API/Duplicate Data Tutorial: Difference between revisions

From BigID Developer Portal
Revision as of 21:24, 4 November 2021

In this article, you'll learn:

  • What the BigID data catalog can be used for
  • How to retrieve object data from the catalog via the API
  • How to retrieve column data from the catalog via the API


Scenario: You're seeing increasingly high storage costs in your cloud data sources. Judging by their names, these data sources don't seem to be storing anything particularly large, but you suspect that they're storing similar data, which is driving up your storage costs. Use the BigID catalog to get a list of duplicated data.

The BigID Catalog


The response contains a lot of information about the logged-in user. For our purposes, we only care about line 4, the auth_token. This token is what we'll use to authenticate with the other BigID APIs. We've placed a sample below with the auth token highlighted. Copy the auth token from the response to the request you made above. We'll need it in just a second.

{
    "success": true,
    "message": "Enjoy your token!",
    "auth_token": "eyJhbGciOiJ<don't copy me! I'm just an example!>...",
    "username": "bigid",
    "firstName": "BigID Admin",
    "permissions": [
        "admin",
        "permission.tasks.edit",
        "permission.tasks.read_task_list",
    ...
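
If you'd rather pull the token out programmatically than copy it by hand, the step above can be sketched in Python as follows. This is a minimal sketch, not part of the BigID tooling: the helper name extract_auth_token and the shortened sample body are made up for illustration, and the only fields it relies on are the ones shown in the sample response (success, message, auth_token).

```python
import json


def extract_auth_token(response_body: str) -> str:
    """Return the auth_token field from a login response body.

    Hypothetical helper: it only assumes the fields shown in the
    sample response above (success, message, auth_token).
    """
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise ValueError(f"login failed: {payload.get('message')}")
    token = payload.get("auth_token")
    if not token:
        raise ValueError("response has no auth_token field")
    return token


# Shortened, made-up response body with a placeholder token.
sample = '{"success": true, "message": "Enjoy your token!", "auth_token": "example-token"}'
token = extract_auth_token(sample)
print(token)
```

Raising an error when success is false keeps a failed login from silently producing an empty token that would only fail later, on the next API call.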

Calling an API

Now that you have a session token, we can call BigID APIs directly. Documentation for these APIs is available at https://www.docs.bigid.com/bigid/reference/api-getting-started. Since we're only performing a simple task, we don't need the docs here; we only need to know that GET /ds-connections is the endpoint that retrieves a list of data source connections.

To authenticate yourself, add a new header named "Authorization" and paste in the session token you got from the previous request as its value.
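
Outside the interactive request tool, the same call could be built with Python's standard library along these lines. Treat it as a sketch under assumptions: the base URL https://bigid.example.com/api/v1 is a placeholder for your own BigID instance, the helper name is hypothetical, and the token is sent as the bare header value because that is what the step above describes.

```python
import urllib.request


def build_ds_connections_request(base_url: str, auth_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET /ds-connections request.

    base_url is an assumption: replace it with the API root of your
    own BigID instance. The token goes into the Authorization header
    as-is, following the tutorial text.
    """
    url = base_url.rstrip("/") + "/ds-connections"
    return urllib.request.Request(
        url,
        headers={"Authorization": auth_token},
        method="GET",
    )


# Placeholder host and token, for illustration only.
req = build_ds_connections_request("https://bigid.example.com/api/v1", "example-token")
print(req.get_method(), req.full_url)
```

To actually send the request, pass req to urllib.request.urlopen and read the response body; that part is left out here so the sketch runs without a live BigID instance.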

The response to that API call contains a list of data sources, along with all the information for each data source.

{
    "status": "success",
    "statusCode": 200,
    "data": {
        "ds_connections": [
            "<data source info here>"
        ]
    }
}
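
A response of this shape can be unpacked with a few lines of Python. The sketch below assumes only the structure shown in the sample (status, statusCode, data.ds_connections); the helper name list_ds_connections is hypothetical, and the entry in the sample body is the same placeholder used above, since the fields of a real data source entry aren't shown here.

```python
import json


def list_ds_connections(response_body: str) -> list:
    """Return the ds_connections list from a GET /ds-connections response.

    Hypothetical helper: it relies only on the structure shown in the
    sample response (status, data.ds_connections).
    """
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"request failed with status {payload.get('statusCode')}")
    return payload.get("data", {}).get("ds_connections", [])


# Same placeholder body as the sample response above.
sample_response = """
{
    "status": "success",
    "statusCode": 200,
    "data": {
        "ds_connections": [
            "<data source info here>"
        ]
    }
}
"""
connections = list_ds_connections(sample_response)
print(f"{len(connections)} data source connection(s) returned")
```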