
How to search in the Google Data Catalog by tags in Java


What is Data Catalog?

Data Catalog is a fully managed, scalable metadata management service within Google Cloud's data analytics products.

Data Catalog Search scope

In Data Catalog, the search scope depends on the user: search results may differ for users with different permissions.

For example, if a user has BigQuery metadata read access to an object, that object will appear in their Data Catalog search results. To search for a table, you need the bigquery.tables.get permission for that table. To search for a dataset, you need the bigquery.datasets.get permission for that dataset.

Date-sharded tables

Data Catalog aggregates date-sharded tables into a single logical entry. This entry has the same schema as the table shard with the most recent date, and contains aggregate information about the total number of shards. The entry derives its access level from the dataset it belongs to.
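To make the aggregation concrete, here is a minimal sketch of how shards could be folded into one logical entry. It is a toy model, not Data Catalog's actual implementation: the ShardAggregator and LogicalEntry names are hypothetical, and the code assumes the usual BigQuery shard naming of a prefix followed by _YYYYMMDD.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy model of shard aggregation; names here are illustrative only. */
final class ShardAggregator {

    /** One logical entry per shard prefix. */
    static final class LogicalEntry {
        final String prefix;
        final String latestShard; // shard whose schema the entry would reflect
        final int shardCount;     // total number of shards seen

        LogicalEntry(String prefix, String latestShard, int shardCount) {
            this.prefix = prefix;
            this.latestShard = latestShard;
            this.shardCount = shardCount;
        }
    }

    // Assumption: shards are named <prefix>_YYYYMMDD.
    private static final Pattern SHARD = Pattern.compile("^(.+)_(\\d{8})$");

    /** Groups shard names by prefix, tracking the newest shard and the count. */
    static Map<String, LogicalEntry> aggregate(Iterable<String> tableNames) {
        Map<String, LogicalEntry> entries = new TreeMap<>();
        for (String name : tableNames) {
            Matcher m = SHARD.matcher(name);
            if (!m.matches()) {
                continue; // not a date shard, so it is not aggregated here
            }
            String prefix = m.group(1);
            LogicalEntry current = entries.get(prefix);
            if (current == null) {
                entries.put(prefix, new LogicalEntry(prefix, name, 1));
            } else {
                // Same prefix means same length, so string order is date order.
                String latest = name.compareTo(current.latestShard) > 0
                        ? name
                        : current.latestShard;
                entries.put(prefix,
                        new LogicalEntry(prefix, latest, current.shardCount + 1));
            }
        }
        return entries;
    }
}
```

Given the table names events_20210801, events_20210802 and users, this model yields a single logical entry for the prefix events, with two shards and events_20210802 as the most recent one. The users table is left out because it is not a shard.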

How to Search for data assets

The search method needs only two inputs: a projectId and a query.

In our case, the query is:

String query = "tag:RandomTag";

How to Search for data assets by tags in Java


  • First, we set the search scope by providing the projectId. The scope can include one or many project IDs, depending on what the search requires.
  • Next, we initialize the DataCatalogClient that we will use to send requests to Data Catalog. This client only needs to be created once and can be reused for multiple requests.
  • After that, we create the SearchCatalogRequest object by passing the query and the scope to its builder.
  • We then pass the SearchCatalogRequest to the DataCatalogClient method searchCatalog and store the result in a SearchCatalogPagedResponse variable.
  • Finally, we iterate over the response and fetch the column names of the tables that have been tagged with RandomTag.


  • To use the search results, we first have to look up each table, for which we use LookupEntryRequest.
  • To construct the LookupEntryRequest, we provide the resource path of the table we want to look up, which we get from the search result's getLinkedResource() method.
  • Next, we obtain the Entry object by calling the lookupEntry method.
  • Once we have the Entry object, we have full access to the table's columns in Data Catalog, so all that remains is to find the columns that have tags attached.
  • For this, DataCatalogClient has a built-in method, listTags, to which we pass entry.getName(), the resource name of the entry.
  • This returns a ListTagsPagedResponse containing the tags attached to the table, including the column each tag applies to.
  • Finally, we iterate over the tags and print the columns.
  • The tagged column name is read in the line String location = tag.getColumn();
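The two lists of steps above can be sketched end to end as follows. So that the flow runs without Google Cloud credentials, the sketch hides the three client calls behind a hypothetical CatalogGateway interface. Neither that interface nor TaggedColumnFinder is part of the Data Catalog library; a real implementation would delegate to the DataCatalogClient methods searchCatalog, lookupEntry and listTags named in the steps.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for the Data Catalog calls described above.
 * It is not a library type; a real implementation would wrap DataCatalogClient.
 */
interface CatalogGateway {
    /** Mirrors searchCatalog: the linked resource of every matching result. */
    List<String> search(List<String> projectIds, String query);

    /** Mirrors lookupEntry followed by entry.getName(). */
    String lookupEntryName(String linkedResource);

    /** Mirrors listTags: the value of tag.getColumn() for each tag on the entry. */
    List<String> tagColumns(String entryName);
}

/** Walks search results, looks up each entry, and collects tagged columns. */
final class TaggedColumnFinder {
    private final CatalogGateway gateway;

    TaggedColumnFinder(CatalogGateway gateway) {
        this.gateway = gateway;
    }

    /**
     * Returns the names of the tagged columns across all entries that match
     * the query within the given projects (the search scope).
     */
    List<String> find(List<String> projectIds, String query) {
        List<String> columns = new ArrayList<>();
        for (String linkedResource : gateway.search(projectIds, query)) {
            String entryName = gateway.lookupEntryName(linkedResource);
            for (String column : gateway.tagColumns(entryName)) {
                // An empty column means the tag sits on the entry as a whole,
                // not on a specific column, so it is skipped here.
                if (!column.isEmpty()) {
                    columns.add(column);
                }
            }
        }
        return columns;
    }
}
```

Calling new TaggedColumnFinder(gateway).find(projectIds, "tag:RandomTag") returns the tagged column names in the order the results arrive. Keeping the client behind an interface is a design choice made for this sketch: it lets the loop be tested with an in-memory fake, while the production adapter stays a thin wrapper around a single, reused DataCatalogClient.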

