Configure metadata documentation layout for Engrafo Data Usage

A vital part of getting value from your data discovery platform is to have the right information at your fingertips. This might sound obvious, but it means it must be easy to configure your solution continuously, without the need for specialist developers who are often in short supply.

Luckily, everyone with the right access can do this in Engrafo.

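On this page of the Engrafo Guide you will find information about:

  • Creating categories and hierarchies in Engrafo Data Usage

  • Creating templates, units and fields in Engrafo Data Usage

  • Types of fields and properties for fields in Engrafo Data Usage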

Creating categories and hierarchies in Engrafo Data Usage 

One way to organize your various data usage entities, such as ETL jobs, ML algorithms, Analytics, Data Visualizations, etc., is to create a category hierarchy in Engrafo.

This is done in the “Configuration of elements in data usage” section, simply by clicking Categories…

… where you can then easily create a new category and assign a parent category if applicable…

… and the change is reflected immediately in the Data usage section of Engrafo.

NB! It is important to note that when automatically generating data usage documentation via import, these hierarchies must be created first. Unlike with the data catalog, the hierarchies cannot be created by the import itself.
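To make the "hierarchies first" rule concrete, here is a minimal sketch of a pre-import check. It is not Engrafo's API or import format: the `Category` class, the `category` key on each import row and both helper functions are hypothetical names made up for illustration. The idea is simply to list the categories an import refers to and flag the ones that have not been created yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Category:
    """Hypothetical model of a category; not an Engrafo class."""
    name: str
    parent: Optional[str] = None  # None marks a top-level category


def category_path(name: str, categories: dict[str, Category]) -> list[str]:
    """Return the chain of category names from the top level down to `name`."""
    path = []
    current: Optional[str] = name
    while current is not None:
        path.append(current)
        current = categories[current].parent
    return list(reversed(path))


def missing_categories(import_rows: list[dict], categories: dict[str, Category]) -> set[str]:
    """Return category names used by the import rows that do not exist yet."""
    return {row["category"] for row in import_rows} - set(categories)


# Categories already created by hand in the configuration section (assumed data).
categories = {
    c.name: c
    for c in [
        Category("ETL jobs"),
        Category("ETL jobs in SAS", parent="ETL jobs"),
        Category("Analytics"),
    ]
}

# Rows of a hypothetical import file; the "category" key is an assumption.
import_rows = [
    {"title": "Nightly load", "category": "ETL jobs in SAS"},
    {"title": "Churn model", "category": "ML algorithms"},
]

print(category_path("ETL jobs in SAS", categories))
print(missing_categories(import_rows, categories))
```

Here the second row points at a category that does not exist, so it would have to be created under Categories before running the import.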

Creating templates, units and fields in Engrafo Data Usage 

The information pages for the data usage part of Engrafo consist of templates with several units, which in turn contain fields to organize the information. It is all configured in the section “Configuration of elements in data usage” in config.

Once you have understood what a template, a unit and a field are, as illustrated below, you are good to start configuring. In broad terms they can also be described as follows:

  • A template is the overall documentation structure for a given area. It can be “ETL jobs in SAS”, “Finance analytics”, “Finance analytics in Python”, “Power BI VA in Norway”. Pretty much anything where a specific, unique structure is needed.

  • Units are the building blocks of your templates. You can have one unit or several; that is up to you. Three typical units within a single template could be “Business background”, “Technical documentation” and “Data”.

  • Fields are the specific entities within a unit. Fields within the unit “Data” could be “Source data” and “Output data”, and within “Technical documentation” they could be “Special calculations”, “Schedule” and “Lead developer”.
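The nesting of templates, units and fields can also be sketched as a small data model. This is only an illustration of the structure described above, using the example names from the list. The `Template`, `Unit` and `Field` classes and the `outline` method are hypothetical and do not reflect how Engrafo stores its configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """A single entity within a unit, e.g. "Source data"."""
    name: str


@dataclass
class Unit:
    """A building block of a template that groups related fields."""
    name: str
    fields: list[Field] = field(default_factory=list)


@dataclass
class Template:
    """The overall documentation structure for a given area."""
    name: str
    units: list[Unit] = field(default_factory=list)

    def outline(self) -> list[str]:
        """Return an indented outline: template, then units, then fields."""
        lines = [self.name]
        for unit in self.units:
            lines.append(f"  {unit.name}")
            for fld in unit.fields:
                lines.append(f"    {fld.name}")
        return lines


template = Template(
    "ETL jobs in SAS",
    units=[
        Unit("Business background"),
        Unit(
            "Technical documentation",
            fields=[Field("Special calculations"), Field("Schedule"), Field("Lead developer")],
        ),
        Unit("Data", fields=[Field("Source data"), Field("Output data")]),
    ],
)

print("\n".join(template.outline()))
```

Writing the structure down like this before a workshop can be a quick way to agree on which units and fields are really needed.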

Although this can very easily be changed at a later stage, it is highly advisable to hold a workshop with hands-on stakeholders to make sure that everyone's needs are met AND that you, for starters, follow the golden principle “less is more”.

Creating templates, units and fields pretty much all follow the same procedure. Simply click what you want to create in the overview (e.g. Custom fields), select which template unit it should be a part of, and press “Create new”.

After that, assign the type of field, a name for it and how it should be sorted relative to other fields in the unit, give an explanation, and assign properties as needed.
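As a rough sketch of what you end up specifying for each field, the snippet below collects the same attributes (type, name, sort position, explanation, properties) in one record and orders the fields of a unit by their sort position. The `FieldDefinition` class, its attribute names and the numeric `sort_order` are assumptions for illustration, and the explanation texts are made up. Engrafo itself handles all of this through the configuration screens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """Hypothetical record of the choices made when creating a field."""
    field_type: str                            # e.g. "inputdata" or "outputdata"
    name: str                                  # label shown in the unit
    sort_order: int                            # position relative to other fields (assumed numeric)
    explanation: str = ""                      # help text for whoever fills in the field
    properties: frozenset[str] = frozenset()   # e.g. {"Multiple", "Show at search"}


def fields_in_display_order(definitions: list[FieldDefinition]) -> list[FieldDefinition]:
    """Sort fields by their sort position, using the name to break ties."""
    return sorted(definitions, key=lambda d: (d.sort_order, d.name))


# Fields of the example unit "Data", deliberately listed out of order.
data_unit_fields = [
    FieldDefinition("outputdata", "Output data", sort_order=2,
                    explanation="Data produced by this data usage."),
    FieldDefinition("inputdata", "Source data", sort_order=1,
                    explanation="Data read by this data usage.",
                    properties=frozenset({"Show at search"})),
]

for definition in fields_in_display_order(data_unit_fields):
    print(definition.sort_order, definition.name, definition.field_type)
```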

In the Engrafo free trial version for download there are several examples of data usage templates to play around with.

Types of fields and properties for fields in Engrafo Data Usage 

Below is a pretty boring list of the types of fields you can make as metadata attributes and the properties you can give them. Elaboration is only provided where they aren’t mostly self-explanatory.

  • “inputdata” is a special field where input/source data for the specific data usage is mapped towards the data catalog. What goes into the field after it has been created preferably flows automatically through integration, but can quite easily be selected manually via the integrated data catalog browser.

  • “outputdata” is a special field where output data for the specific data usage is mapped towards the data catalog. What goes into the field after it has been created preferably flows automatically through integration, but can quite easily be selected manually via the integrated data catalog browser.

  • “link i Modal View” means you enter a URL in the field, but the target website (or other resource) opens in a frame inside Engrafo.

  • “Metadata-load SAS (scaproc)” is a special field for automatically generating code visualization and analytics around SAS code.

  • The property “Multiple” for a field or unit means you can create multiple instances of the same field under a given unit.

  • The property “Show at search” means this particular field will be included in the overview search results in the “Search” section of Engrafo.
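The two properties can be pictured as simple flags on a field. The sketch below is a hypothetical model, not Engrafo code: `FieldSpec`, `search_columns` and `check_instance_count` are invented names. It shows the effect described above, where only fields with “Show at search” appear in the search overview and only fields with “Multiple” may occur more than once in a unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical field with the two properties as boolean flags."""
    name: str
    multiple: bool = False        # the "Multiple" property
    show_at_search: bool = False  # the "Show at search" property


def search_columns(specs: list[FieldSpec]) -> list[str]:
    """Return the names of the fields that would appear in the search overview."""
    return [spec.name for spec in specs if spec.show_at_search]


def check_instance_count(spec: FieldSpec, count: int) -> None:
    """Raise ValueError if a field without "Multiple" occurs more than once."""
    if count > 1 and not spec.multiple:
        raise ValueError(f"Field '{spec.name}' does not allow multiple instances")


specs = [
    FieldSpec("Lead developer", show_at_search=True),
    FieldSpec("Source data", multiple=True, show_at_search=True),
    FieldSpec("Special calculations"),
]

print(search_columns(specs))
check_instance_count(specs[1], count=3)  # allowed: "Source data" has "Multiple"
```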