This section will provide basic guidelines for you to prepare your Regular (aka KNox) dataset for upload. The file for a Regular dataset must include a specific set of worksheets and columns (download sample file here), as follows:
Required Sheets
- Sheet with dataset metadata. Name this sheet Dataset.
- Sheet for dimension/s with lists of dimension members (one sheet per dimension).
- Sheets with data. You may organize your data in one data sheet or across multiple data sheets.
Dataset sheet
This sheet must contain two required columns: Name and Value.
Name column must include at least the following rows:
- DatasetName
- Dimensions
- Data
In the corresponding Value column, specify the parameters of your dataset:
- DatasetName. The title of the dataset.
- Dimensions. The names of sheets that contain lists of dimension members (names must be delimited by semicolons without spaces).
- Data. Names of sheets with data, each delimited by semicolons without spaces.
Also the following rows can be added:
- Dataset. Past dataset ID. Can be added if the dataset should have a specific ID.
Note: You should have specific Data Team or manager role to use this feature. - Dataset Description
- Provider
- Original URL
- Publication Date
- Accessed On
- Next Release Date
- StartDate
- End Date
- Frequencies
Dimension member sheet(s)
Sheets with lists of dimension members must include four required columns: Code, Name, Order, and Parent, as follows:
- Code. (ID) Any unique numerical, text or combination of symbols identifying a member of a dimension. Indicator’s code should not contain spaces and length should be < 116 chars.
Note: Use ISO 2 codes for countries to support parametrization by country, maps, ranking tables with flags etc in the visualizations and dashboards.
- Name. The title of a dimension member.
- Order. The numerical sequence in which dimension members should appear in the dataset.
- Parent. The code (ID) of a dimension member that is higher by one level in the hierarchy. It's possible to have multiple hierarchical levels.
Optional: Indicator notes/description
Use notes if a specific indicator (country, measure, id, etc.) has a definition, methodology, or other associated information that you will want to specify for users to view via the blue icons within a dataset next to an indicator name.
To add notes, add a column named Notes (same as it's done with Parent) in the applicable dimension-specific worksheet. In the example above, that would mean a new column E named Notes.
Optional: Translations of indicator names
Within the Indicator worksheet, add a column for each additional language after the required columns. Each column should be labeled with the language name and contain the translation for each item within the dimensions that appear in the original language Name column.
Data sheet(s)
Sheets with data must contain the following columns:
- Columns with names of each dimension. This will contain codes of the dimension members.
- Frequency, specified in one of the following supported time frequencies:
- A - annual
- H - semi-annual
- Q - quarterly
- M - monthly
- D - daily
- Columns with dates. Use the statistical format of dates for different frequencies:
- Years: 2009, 2010, 2011
- Half years: 2009H1, 2013H2
- Quarters: 2009Q1, 2010Q3, 2012Q4
- Months: 2009M2, 2011M7
- Days: MM/DD/YYYY, DD.MM.YYYY
A few additional tips:
- Instead of a single columns with dates, you may use two columns, Date and Value, with the former containing dates and the latter values.
- In addition, you may use columns with additional metadata, such as Unit.
- Your data worksheet cannot contain any error values (Inf or NaN, for example) or the file will fail to upload.
- If there are duplicates in your data, the file will also fail to upload.
NOTE: Columns like 'CountryName' and 'IndicatorName' in the example are not required. They are typically included to make the worksheets more user friendly.