Best Practices of Data Publication

Version: Draft 0.2/September 2017

Please cite as: DataWiz (2017): Best Practices of Data Publication. Version Draft 0.2. Available: https://datawiz.zpid.de/index.php/tools-and-resources/checklists-and-guidance/

Section Guidance
A. Legal Aspects 
Anonymization: Share your data only if sensitive information and personal data have been removed or informed consent explicitly includes their publication.
Informed Consent: Re-read your informed consent form. Did it address data sharing? Does it impose any restrictions on data sharing? (Such restrictions may also apply to anonymized data.)

  1. If restrictions were imposed, choose a repository that allows restricting access or customizing terms of use for data re-use.
  2. If data sharing was precluded, do not publish the data.
Third Party Rights: Consider whether any third-party rights apply to your data and prevent you from sharing it.

  1. If reusing data: Did the agency or researcher who provided the data allow you to publish it?
  2. If using copyright-protected materials: Do your data include any copyright-protected elements (e.g. exact item wordings of a psychometric test)?
Jurisdiction: When publishing data in a repository that is not subject to your jurisdiction, consider the consequences.

  1. If personal data is included, check whether the repository complies with the data protection laws that apply in your jurisdiction.
  2. If you are not sure that the repository complies with the laws that apply in your jurisdiction, search for a repository that is subject to your jurisdiction.
B. Communicate your Data
Consider Focus of Data Publication

(What purpose does the data publication serve? Who is the target audience of the data publication?)

  1. If you only want to publish your data as a supplement to a text publication: Less documentation on your data collection may suffice, since researchers who use the data can be expected to have read the corresponding article.
  2. If you want to make your data publicly available to a broader audience and understandable on its own, more extensive documentation of the data collection is required. You should therefore choose a repository that provides an extensive description framework.
Document your Variables: Variable-level documentation by means of codebooks is always necessary, because the information included in codebooks (naming conventions, missing value coding, etc.) generally cannot be retrieved from articles. A repository that offers a standardized description framework for variable documentation may assist you in compiling such a codebook.
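As an illustration of what a variable-level codebook can contain, the following minimal sketch derives one codebook row per variable from a small table. The variable names, the labels, and the missing-value code -99 are hypothetical examples, not conventions prescribed by this guidance; a repository's own description framework may require additional fields.

```python
import csv
import io


def build_codebook(rows, labels, missing_codes):
    """Return one codebook entry per variable found in `rows`.

    rows          -- list of dicts, one per record (e.g. from csv.DictReader)
    labels        -- dict mapping variable name to a human-readable label
    missing_codes -- set of string values that denote missing data
    """
    if not rows:
        return []
    codebook = []
    for name in rows[0].keys():
        values = [row[name] for row in rows]
        valid = [v for v in values if v not in missing_codes]
        codebook.append({
            "variable": name,
            "label": labels.get(name, ""),
            "n_valid": len(valid),
            "n_missing": len(values) - len(valid),
            "distinct_values": len(set(valid)),
        })
    return codebook


def codebook_to_csv(codebook):
    """Serialize the codebook as CSV text so it can be deposited with the data."""
    fields = ["variable", "label", "n_valid", "n_missing", "distinct_values"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(codebook)
    return out.getvalue()


# Hypothetical example data; "-99" is assumed to be the missing-value code.
raw = "id,age,item_01\n1,23,2\n2,-99,0\n3,31,-99\n"
rows = list(csv.DictReader(io.StringIO(raw)))
labels = {
    "id": "Participant ID",
    "age": "Age in years",
    "item_01": "Item 1 (score 0-3)",
}
codebook = build_codebook(rows, labels, missing_codes={"-99", ""})
print(codebook_to_csv(codebook))
```

Listing the missing-value codes explicitly, rather than leaving them implicit in the analysis scripts, is the kind of information that is typically not retrievable from the article itself.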
Test your Documentation: Be aware that your own data will always seem “self-explanatory” while you are analyzing it. A good way to test whether your data is communicated effectively is to send it to a colleague who is not directly involved in your project. Ask this colleague to use the data as you expect your target audience to use it (e.g. to replicate reported analyses), and obtain your colleague’s feedback on any problems that occurred.
C. File Formats
  The format in which you currently work with your data is not necessarily the format you should use for depositing the data.

  1. To avoid issues related to the choice of file formats, you can deposit your data with a repository that offers automated file conversions.
  2. If you choose a repository that does not offer automated file conversions, consider depositing your data in more than one format. At least one of these formats should be a non-proprietary format suitable for long-term archiving.
  3. Formats for data deposit should be chosen based on the expected needs of your target audience.
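
If your repository does not convert files for you, depositing in more than one format can be scripted. The sketch below writes the same records as CSV and as JSON, two openly documented plain-text formats. The function name `deposit_formats`, the file stem, and the example records are hypothetical; which formats are appropriate depends on your data and your target audience.

```python
import csv
import json
import tempfile
from pathlib import Path


def deposit_formats(records, out_dir, stem):
    """Write `records` (a list of dicts with identical keys) as CSV and JSON.

    Returns the paths of the two files that were written.
    """
    if not records:
        raise ValueError("no records to write")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / f"{stem}.csv"
    json_path = out_dir / f"{stem}.json"

    # CSV: one header row followed by one row per record.
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)

    # JSON: keeps numbers as numbers, which CSV cannot express.
    with json_path.open("w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)

    return [csv_path, json_path]


# Hypothetical example records, written to a temporary directory.
records = [
    {"id": 1, "condition": "control", "score": 12},
    {"id": 2, "condition": "treatment", "score": 17},
]
with tempfile.TemporaryDirectory() as tmp:
    paths = deposit_formats(records, tmp, "study01")
    names = sorted(p.name for p in paths)
    with (Path(tmp) / "study01.json").open(encoding="utf-8") as fh:
        round_trip = json.load(fh)
print(names)
```

Reading the files back, as the JSON round trip does here, is a simple check that nothing was lost in the conversion before you deposit.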

 

If your research data incorporates special types of data, such as fMRI records or video data, you may want to consider repositories specialized in archiving those data types.

D. Long-term Preservation
Storage Guarantee: Check if your repository offers long-term storage.
File Formats and Documentation
  1. Consider special requirements that may arise from your file formats. Do you require a repository that applies format migration?
  2. Consider special requirements arising for your documentation. Do you require a repository that helps you to provide extensive documentation?
E. Searchability
Cite your Data: Cite your data in your print publications and indicate where it can be retrieved.

  1. We strongly encourage data depositors to cite their data in text publications since infrastructure for data publication (including search engines for research data) is not (yet) well developed.
  2. Moreover, data publications should be included in publication lists on personal or institutional websites.
Choose Repositories that Assign DOIs: Choose a repository that assigns DOIs to its research data, so that your data is listed in catalogues that are based on the registration agencies’ information.
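Once a DOI has been assigned, it can be included in the data citation as a resolvable link. The following sketch assembles a citation string whose layout loosely mirrors the "Please cite as" line at the top of this document; the function name, the citation layout, and the DOI `10.1234/example` are illustrative assumptions, and your repository or journal may prescribe a different citation style.

```python
def format_data_citation(creator, year, title, version, doi):
    """Build a data citation that ends with a resolvable DOI link."""
    doi = doi.strip()
    # Accept DOIs given with a "doi:" prefix or as a full resolver URL.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI name starts with the directory indicator "10.".
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"{creator} ({year}): {title}. Version {version}. https://doi.org/{doi}"


# "10.1234/example" is a made-up placeholder, not a registered DOI.
citation = format_data_citation(
    "Doe, J.", 2017, "Example Study Data", "1.0", "doi:10.1234/example"
)
print(citation)
```

Writing the DOI as an `https://doi.org/` link lets readers of a text publication or a publication list reach the data set directly.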
F. Guidance
Self-Deposit Guidance: If you are not sure how to prepare your data, choose a repository that offers guidance.
Further Assistance: If you need further assistance with depositing or preparing your data, choose a repository that offers support in the data submission process and/or actively curates its data.