About DQ-Kit

DQ-Kit aims to provide thorough data quality checks for scientific data publication and reuse, at the BonaRes Repository at ZALF and beyond. The initial release emphasizes exploratory statistics. Future versions will focus on generating metadata and conducting plausibility checks for agricultural and soil science data. We adhere to the "Fit for Use" principle of data quality, so we do not rate or categorize datasets by quality. Instead, DQ-Kit enables data authors to cross-check their data and allows re-users to select the datasets that best suit their specific needs.

The initial release of DQ-Kit, version 1.0, integrates exploratory statistics and data quality alerts as implemented in YData Profiling. When data is uploaded, DQ-Kit generates a report that includes an overview of the dataset, summary statistics for each variable, illustrations of variable associations, patterns of missing data, and alerts for potential quality issues with variables, such as skewness, missing data, and collinearity.
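To give a feel for what such alerts mean, here is a minimal sketch in plain Python that flags columns with a high share of missing values or strong skewness. It is not DQ-Kit's or YData Profiling's implementation. The function names, the thresholds, and the example columns are assumptions made up for illustration.

```python
# Hypothetical thresholds, chosen for illustration only.
MISSING_THRESHOLD = 0.05  # flag if more than 5% of values are missing
SKEW_THRESHOLD = 1.0      # flag if the absolute skewness exceeds 1


def skewness(values):
    """Population skewness (third standardized moment) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant column: no asymmetry to report
    return m3 / m2 ** 1.5


def column_alerts(columns):
    """Map each column name to a list of alerts.

    `columns` is a dict of column name -> list of numbers, with None
    marking a missing value.
    """
    alerts = {}
    for name, values in columns.items():
        found = []
        present = [v for v in values if v is not None]
        missing_fraction = 1 - len(present) / len(values) if values else 0.0
        if missing_fraction > MISSING_THRESHOLD:
            found.append("missing")
        if len(present) >= 3 and abs(skewness(present)) > SKEW_THRESHOLD:
            found.append("skewed")
        alerts[name] = found
    return alerts


# Made-up example data, not taken from any real dataset.
example = {
    "soil_ph": [6.1, 6.4, None, 6.3, 6.2, 6.5],
    "yield_t_ha": [1.0, 1.0, 1.0, 1.0, 1.0, 20.0],
}

for column, found in column_alerts(example).items():
    print(column, found)
```

In line with the "Fit for Use" principle, alerts like these only point to properties worth a second look. Whether a skewed or incomplete variable is a problem depends on the intended use.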

For additional details, including the formatting requirements, please refer to the FAQ.

We will update DQ-Kit's functionality regularly. Please check back often for new releases.

If you have questions or comments, encounter errors while running DQ-Kit, or want to contribute to its development, please don't hesitate to contact us.

Contact: support-data@bonares.de

Version

Version 1.0
Published: 17.05.2024



