Publishing Research Data
CIP’s Dataverse (https://data.cipotato.org) is available to CIP researchers to store and make available final versions of the data that they create or compile. The repository accepts data in all disciplines and formats.
CIP’s Dataverse has three sections, which house the following datasets:
- Potato Biological Science
- Sweetpotato Biological Science
- Social and Health Science
Before You Publish Data:
Researchers or data authors are responsible for ensuring that their data is ready for publishing. Before publishing, data authors should ensure that:
- The dataset has been cleaned and verified for correctness, is suitable for its intended use, and is well structured.
- The dataset is well documented (so that other researchers can find, understand and use the data). This can be accomplished through the study description (see the Metadata Template), a codebook or variable descriptions (see the Data Dictionary template), and the uploading of additional documentation that describes the methodology, data collection and the contents of the data file(s) where applicable.
- The dataset uses reusable file formats. These include open standard file formats such as CSV, or proprietary formats that have become a de facto standard.
- Research subject confidentiality is safeguarded (e.g., only de-identified data, without direct or indirect identifiers, is submitted).
Once the dataset is ready, submit it together with its documentation files to CIP-RIU@cgiar.org. If the files are too big to be sent by email, send us a link to a shared folder. RIU will upload the dataset to CIP’s Dataverse. The system currently accepts files of up to 2 GB each. If you need to upload individual files larger than 2 GB, contact RIU to discuss whether larger file sizes are feasible for your project.
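As an optional pre-submission check, the 2 GB per-file limit mentioned above can be tested locally before contacting RIU. The following is a minimal sketch, not an official CIP tool: the function name `oversized_files` and the reading of "2 GB" as 2 × 1024³ bytes are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: "2 GB" is read as 2 * 1024**3 bytes. The guidance does not say
# whether binary or decimal gigabytes are meant.
SIZE_LIMIT_BYTES = 2 * 1024**3


def oversized_files(folder, limit=SIZE_LIMIT_BYTES):
    """Return (path, size_in_bytes) pairs for files under `folder` above `limit`."""
    results = []
    # rglob("*") walks the folder and all of its subfolders.
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                results.append((path, size))
    return results
```

Calling `oversized_files("my_dataset")` on a hypothetical dataset folder returns an empty list when every file is within the limit. Any file it does list is a candidate for the discussion with RIU about larger uploads.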
When to Publish Data
According to CIP’s Open Data and Data Management Policy, data and datasets should be published within 12 months of completion of data collection or an appropriate project milestone, or within 6 months of publication of the information products underpinned by that data.
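The two time limits can be turned into concrete dates for a project plan. The sketch below is illustrative only: the helper names `add_months` and `publishing_deadlines` are hypothetical, and counting "months" as calendar months (clamped to the last day of shorter months) is an assumption. It reports both candidate dates and does not decide which one governs a given dataset; that is a question for the policy itself.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return `start` moved forward by `months` calendar months.

    If the target month is shorter, the day is clamped to its last day.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def publishing_deadlines(collection_end=None, product_published=None):
    """Return the candidate deadline for each trigger date that is known."""
    deadlines = {}
    if collection_end is not None:
        # 12 months after data collection ends (or after a project milestone).
        deadlines["data_collection"] = add_months(collection_end, 12)
    if product_published is not None:
        # 6 months after the related information product is published.
        deadlines["information_product"] = add_months(product_published, 6)
    return deadlines
```

For example, `publishing_deadlines(collection_end=date(2023, 3, 15))` yields a single entry dated 15 March 2024.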
Workflows for publishing research data
A common question from researchers is whether publishing data, and receiving a citation and a permanent identifier such as a DOI, can be considered “prior publication” by the journals to which they may submit a research paper that uses the data. Many journals, including those from Nature, Science, Elsevier, PLOS and SAGE, allow work based on previously published datasets. However, we advise researchers to always verify this with the target journal in their publication plan before publishing the datasets. If the target journal does consider published data as prior work, we help researchers publish the replication data after the article has been published.
Restrictions to Publishing Data
Not all data is suitable for publishing, and exceptions to CIP’s Open Data and Data Management Policy exist. Data is usually unsuitable for publishing because of one of the following issues:
- Privacy – Information that identifies an individual.
- Confidentiality – Information that should not be shared.
- Security – Release of the data would pose a threat to someone or something.