Loading Metadata and Documentation

When configuring and loading information into Engrafo, it is highly recommended to automate the creation of data catalogs and data usage documentation.

The goal is to minimize the manual effort needed to document data and its usage.

One of the key elements in Engrafo is an API that allows you to extract all kinds of information, place it in a folder, and simply let Engrafo import and store the information. In that way, you can schedule your extracts and let Engrafo fetch, structure, and store all the information.

The extracts you make and store must, however, match the structure expected by the API. The following sections explain how this is done.

On this page of the Engrafo Guide you will find information about loading metadata and documentation via CSV files and connection strings.

If you have questions about alternative methods to automatically load metadata and information into Engrafo, please reach out to us at contact@engrafo.dk.

The Overall Process for Uploading Metadata

Metadata loading is the core of how Engrafo automates documentation and data catalogs.

The overall process is: upload files → validate files → load files

Overall Structure.png

Metadata files can be uploaded in three ways:

  1. Manual: You will have to select the file from your computer and upload it manually

  2. Semi-automated: All metadata files placed in wwwroot\uploads_csv\,\ are automatically imported and made ready for validation. Note that the CSV delimiter used is given by the name of the subfolder; here a comma (“,”) should be used.

  3. Fully automated: All metadata files placed in wwwroot\uploads_CSVAuto\;\ are automatically imported, validated, and loaded into Engrafo. Note that the CSV delimiter used is given by the name of the subfolder; here a semicolon (“;”) should be used. Engrafo checks for new files to import, validate, and load every 5 seconds. (If validation fails, error messages for that metadata file are produced and the metadata is not imported.)

FIle view CSV upload.png

Important: when placing metadata in a folder, it is the subfolder name which identifies the delimiter that is used:

For instance, if metadata is delimited with a “¤”, then place metadata in uploads_csv\¤\

Generating Data Catalog via .CSV Load

The following describes how to create and upload a metadata file for generating a Data Catalog in Engrafo. It is very important to use the correct syntax.

Mandatory columns

Load a csv file that has the following required columns:

  • Category

  • Database

  • Schema

  • Table

  • Column

Additionally, the csv file may have the following optional columns:

If the columns are to be sorted differently than alphabetically

  • ColumnSort

If custom fields have to be imported

  • Database_[fieldname]

  • Schema_[fieldname]

  • Table_[fieldname]

  • Column_[fieldname]

IMPORTANT
The [fieldname] fields must exist as Custom fields before it is possible to load content into them.

The following is an example of what a CSV file might look like where the character ¤ is selected as the separator:

Category¤Database¤Schema¤Table¤Column¤Database_Description¤Schema_Description¤Table_TableOwner¤Column_Description¤Column_Owner
Stage¤DB01¤Schema_A¤Table01¤Column01¤Important database¤Schema-Description¤Knus¤The 1st Column-Description¤Peter Poulsen
Stage¤DB01¤Schema_A¤Table01¤Column02¤Important database¤Schema-Description¤Jens¤The 2nd Column-Description¤Maria Hansen
Stage¤DB01¤Schema_A¤Table01¤Column03¤Important database¤Schema-Description¤Mads¤My 3rd Column-Description¤Ole Jensen
Stage¤DB01¤Schema_A¤Table01¤Column04¤Important database¤Schema-Description¤Ebbe¤My 4th Column-Description¤Inge Madsen
Stage¤DB01¤Schema_A¤Table01¤Column05¤Important database¤Schema-Description¤Gorm¤My 5th Column-Description¤Jim Jensen
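A file with this shape can be generated by a small script. The following is a minimal Python sketch; the file name, the example rows, and the Column_Description custom field are illustrative assumptions (the custom field must already exist in Engrafo):

```python
import csv

# Mandatory Data Catalog columns, plus one assumed custom field
# (Column_Description must already exist as a Custom field in Engrafo).
HEADER = ["Category", "Database", "Schema", "Table", "Column", "Column_Description"]

rows = [
    ["Stage", "DB01", "Schema_A", "Table01", "Column01", "First column"],
    ["Stage", "DB01", "Schema_A", "Table01", "Column02", "Second column"],
]

with open("catalog.csv", "w", newline="", encoding="utf-8") as f:
    # The "¤" delimiter must match the name of the upload subfolder.
    writer = csv.writer(f, delimiter="¤")
    writer.writerow(HEADER)
    writer.writerows(rows)
```

In practice the rows would come from your own metadata source, e.g. a query against the database's system catalog, rather than being hard-coded.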

Updating Data Catalog via .CSV Load

If you wish to update your Data Catalog, the CSV file must contain the following:

Mandatory columns

Load a csv file that has the following required columns:

  • CategoryID

  • DatabaseID

  • SchemaID

  • TableID

  • ColumnID

Additionally, the csv file may have the following optional columns:

If names need to be updated

  • Category

  • Database

  • Schema

  • Table

  • Column

If the columns are to be sorted differently than alphabetically

  • ColumnSort

If custom fields need to be updated

  • Database_id_[fieldname]

  • Database_[fieldname]

  • Schema_id_[fieldname]

  • Schema_[fieldname]

  • Table_id_[fieldname]

  • Table_[fieldname]

  • Column_id_[fieldname]

  • Column_[fieldname]

IMPORTANT
The [fieldname] fields must exist as Custom fields before it is possible to load content into them.

The following is an example of what a CSV file might look like where the character ¤ is selected as the separator:

CategoryID¤DatabaseID¤SchemaID¤TableID¤ColumnID¤Column
5¤3¤9¤7¤6¤CVR_NUMMER
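An update file like the one above can also be scripted. This is a minimal sketch; the ID values and file name are illustrative (use the IDs from your own catalog):

```python
import csv

# Rename column 6 (in category 5 / database 3 / schema 9 / table 7)
# to CVR_NUMMER. The IDs here are examples, not real catalog IDs.
HEADER = ["CategoryID", "DatabaseID", "SchemaID", "TableID", "ColumnID", "Column"]
update = [5, 3, 9, 7, 6, "CVR_NUMMER"]

with open("catalog_update.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="¤")
    writer.writerow(HEADER)
    writer.writerow(update)
```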

Generating Data Usage Documentations via .CSV Load

If you wish to upload your Data Usage Documentations, the CSV file must contain the following:

Mandatory columns

Load a csv file that has the following required columns:

  • Categoryhierarchy

  • TemplateName

  • UnitName

  • Owner

  • DocumentationName

Additionally, the csv file may have the following optional columns for loading text fields:

  • CustomField_text_name

  • CustomField_text_value

Additionally, the csv file may have the following optional columns for loading links:

  • CustomField_link_name

  • CustomField_link_href

  • CustomField_link_description

Additionally, the csv file may have the following optional columns for loading input data:

  • Inputdata_CustomField_name

  • Inputdata_category

  • Inputdata_database

  • Inputdata_Schema

  • Inputdata_table

  • Inputdata_Column (* can be used if ALL columns are to be imported)

Additionally, the csv file may have the following optional columns for loading output data:

  • Outputdata_CustomField_name

  • Outputdata_Category

  • Outputdata_database

  • Outputdata_Schema

  • Outputdata_Table

  • Outputdata_column (* can be used if ALL columns are to be imported)

IMPORTANT

  • The category hierarchy must exist before it is possible to load it

  • The category elements must be specified with ¤ as separator, for example: Level1¤Level2¤Level3

  • The templates specified in [TemplateName] must exist before it is possible to load them

  • Units specified in [UnitName] must exist before it is possible to load them

  • The fields specified in [CustomField_text_name] must exist as Custom fields before it is possible to load content into them

  • The fields specified in [CustomField_link_name] must exist as Custom fields before it is possible to load content into them

  • The fields specified in [Inputdata_CustomField_name] must exist as Custom fields before it is possible to load content into them

  • The fields specified in [Outputdata_CustomField_name] must exist as Custom fields before it is possible to load content into them

  • The column specified in [Inputdata_column] must exist before it can be loaded as input data

  • The column specified in [Outputdata_column] must exist before it is possible to load it as output data

Below is an example of what a CSV file might look like for loading both text fields and links, where the character “;” is selected as the separator:

Categoryhierarchy;TemplateName;UnitName;Owner;DocumentationName;CustomField_text_name;CustomField_text_value;CustomField_link_name;CustomField_link_href;CustomField_link_description
Level1;ETL;Heading;Ole;Dataclean;Description;This is a short description...;Link to info;https://tv2.dk/;TV2
Level1;ETL;Details;Peter;Dataclean;Important fact;Here is a description of the facts;;;
Level1¤Level2;ETL;Overview;Ole;Solvency;Description;This is a description of solvency calculations...;Link for info;https://www.dr.dk/;Danmarks Radio
Level1¤Level2;ETL;Details;Peter;Solvency;Important fact;Information goes here...;;;

Below is an example of what a CSV file might look like for loading input data, where the character “;” is selected as the separator:

Categoryhierarchy;TemplateName;UnitName;Owner;DocumentationName;Inputdata_CustomField_name;Inputdata_Category;Inputdata_database;Inputdata_Schema;Inputdata_Table;Inputdata_column
Level1¤Level2;ETL;Overview;Ole;Datacleaning;Inputdata;Category 1;Database A;Schema A;Table A;column A
Level1¤Level2;ETL;Overview;Ole;Datacleaning;Inputdata;Category 1;Database A;Schema A;Table A;column B
Level1¤Level2;ETL;Overview;Ole;Datacleaning;Inputdata;Category 1;Database A;Schema A;Table A;column C
Level1¤Level2;ETL;Overview;Ole;Datacleaning;Inputdata;Category 1;Database A;Schema A;Table B;*
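Input-data rows like these repeat the same documentation columns for every source column, so they are a natural fit for a generator script. A minimal sketch in Python follows; the file name and all example values are illustrative:

```python
import csv

# Fixed part of every input-data row for one documentation unit.
PREFIX = ["Level1¤Level2", "ETL", "Overview", "Ole", "Datacleaning", "Inputdata"]
HEADER = ["Categoryhierarchy", "TemplateName", "UnitName", "Owner",
          "DocumentationName", "Inputdata_CustomField_name",
          "Inputdata_Category", "Inputdata_database", "Inputdata_Schema",
          "Inputdata_Table", "Inputdata_column"]

# Map each source table to the columns used as input; "*" means all columns.
inputs = {
    ("Category 1", "Database A", "Schema A", "Table A"): ["column A", "column B"],
    ("Category 1", "Database A", "Schema A", "Table B"): ["*"],
}

with open("usage_inputdata.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter=";")
    writer.writerow(HEADER)
    for (cat, db, schema, table), cols in inputs.items():
        for col in cols:
            writer.writerow(PREFIX + [cat, db, schema, table, col])
```

Note that the category hierarchy still uses “¤” between levels even though the file itself is semicolon-delimited, matching the example above.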

Generating Sample Space via .CSV Load

If you wish to load a sample space into your Data Catalog, the CSV file must contain the following:

Mandatory columns

Load a csv file that has the following required columns:

  • SampleSpace

  • Value

  • Description

The following is an example of what a CSV file might look like where the character ¤ is selected as the separator:

SampleSpace¤Value¤Description
Postnr¤1301¤København K
Postnr¤2000¤Frederiksberg
Postnr¤2100¤København Ø
Postnr¤2200¤København N
Postnr¤2300¤København S
Postnr¤2400¤København NV
Postnr¤2450¤København SV
Postnr¤2450¤Valby
Postnr¤2600¤Glostrup
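A sample space file is typically generated from a reference list. This is a minimal sketch; the file name and the (shortened) postal-code mapping are illustrative:

```python
import csv

# A sample space is the set of valid values for a column,
# e.g. Danish postal codes and the towns they belong to.
postal_codes = {
    "1301": "København K",
    "2000": "Frederiksberg",
    "2100": "København Ø",
}

with open("samplespace.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="¤")
    writer.writerow(["SampleSpace", "Value", "Description"])
    for value, description in postal_codes.items():
        writer.writerow(["Postnr", value, description])
```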

Generating Data Glossary via .CSV Load

To generate a Data Glossary automatically via .CSV load, your file must contain the following:

Load a csv file that has the following required columns:

  • Path

  • Description

The path dictates how the glossary will be structured, automatically creating folders for your glossary items. Below is an example of what the structure might look like with the character ¤ selected as the separator:

Path¤Description
folder1/folder2/item1¤This is a description for item 1
folder1/item2¤This is a description for item 2
item3¤This is a description for item 3
item4¤This is a description for item 4

The description is simply the explanation of the respective item.
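A glossary file can likewise be produced from a simple term dictionary. This is a minimal sketch; the file name and terms are illustrative:

```python
import csv

# Glossary terms keyed by their folder path; "/" separates folder levels,
# so "folder1/folder2/item1" nests item1 two folders deep.
terms = {
    "folder1/folder2/item1": "This is a description for item 1",
    "folder1/item2": "This is a description for item 2",
    "item3": "This is a description for item 3",
}

with open("glossary.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="¤")
    writer.writerow(["Path", "Description"])
    for path, description in terms.items():
        writer.writerow([path, description])
```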

Generating Data Catalog via Server Connection String

Connection string load.png

Another way to automatically load data into the Engrafo Data Catalog is via a connection string. To do this, first make sure you have a category created in the Data Catalog. Then head to “Load via Connection String” under “Load Metadata” and create a new connection string. Here, select the database type and category, add a description if wanted, and of course enter your connection string. There is also an option to schedule the import in cron format (e.g. 15 14 * * *, meaning every day at 2:15 PM). After creating it, simply load the connection string and your Data Catalog will be populated.
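As a reminder of the standard five-field cron format used when scheduling the import (a generic illustration, not Engrafo-specific code), the fields read minute, hour, day of month, month, and day of week:

```python
# Standard five-field cron format: minute hour day-of-month month day-of-week.
FIELDS = ["minute", "hour", "day of month", "month", "day of week"]

def describe_cron(expression):
    """Pair each cron field with its meaning ("*" means 'every')."""
    values = expression.split()
    if len(values) != len(FIELDS):
        raise ValueError("expected a five-field cron expression")
    return dict(zip(FIELDS, values))

# "15 14 * * *" -> minute 15 of hour 14, i.e. every day at 2:15 PM.
print(describe_cron("15 14 * * *"))
```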

SQL Connection String

Below is an example video of how to generate a Data Catalog from a SQL Server database using a connection string: