FSBS Privacy & Data Management

Reusing and Publishing Data

Reusing data

Sometimes you will not collect a new dataset from participants, but rather use an existing dataset. If this dataset contains personal data, reuse is not automatically permitted. Below is an extensive flow chart that aims to help you establish whether reuse is permitted. It does not matter whether you collected the data yourself or whether a colleague at another institute collected it. Of key importance for FSBS researchers is what participants were initially told their personal data would be used for, and whether reuse is compatible with those purposes. For example, if your participants were initially told their data would not be shared with anyone, even sharing their data in anonymized form may be unethical.

This guideline differs from what is stated under ‘Reuse of data’ on the Intranet page ‘Research exceptions in privacy legislation’.

 

Publishing data

You may want to make your dataset publicly available, either accompanying an article or on an open repository. However, especially when dealing with personal data, this is not always permitted. Questions you will need to ask yourself:

  1. Are there restrictions on data reuse stemming from agreements, intellectual property or copyright? While the concept of ‘data ownership’ does not exist under Dutch law, signed agreements with partners, intellectual property or copyright may prevent the publishing of the dataset. Read more about this at auteursrechten.nl.
  2. Does the information presented to the participant during the initial collection contain any restrictions on sharing? If participants were initially told that data would not be shared with other researchers, would be kept confidential, or would not be used for any purpose other than the current research, then sharing is not permitted. The Ethics Review Board may decide whether publication after anonymization is still permissible in that case.
  3. Does the data contain personally identifiable information? If so, anonymize the data. Remember, anonymization may not always be possible. In addition, there may be ethical restrictions on reusing the anonymous data when participants were explicitly told during the initial data collection that nothing would be shared.

If the dataset is fully anonymous, publishing it as an open dataset is permitted (barring any restrictions stemming from points 1 and 2). When in doubt, contact the privacy officer.

If the dataset does contain personally identifiable information, making it publicly available is problematic. Publishing personal information, even with consent, can conflict with article 5(1)(c) of the GDPR, which provides that personal data must be “adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed”. Furthermore, under article 6(1)(a) of the GDPR, consent must be given for one or more specific purposes. Can the publication of a dataset be seen as ‘specific enough’? Publishing data containing personal information can also make it difficult (or impossible) for participants to exercise their ‘right to be forgotten’ (articles 17 & 19), as further reuse may be the intended goal of the publication. Finally, participants are allowed to withdraw their consent. Participants need to be made fully aware of these issues. For some types of data, publishing may be a trivial decision, but more sensitive categories or complex types of data require a serious weighing of pros and cons.

On the one hand, individuals have the autonomy to determine how much of their data they wish to disclose, and to accept the risk of losing control over data once it becomes publicly available, recognizing how difficult it is to effectively remove such data once released. On the other hand, researchers should perhaps protect participants from themselves and be cautious with sharing their data. It is therefore advisable to consult with the ethics committee to determine whether disclosure is truly permissible. Furthermore, when personally identifiable information is made publicly available, those reusing the data will also need their own legal ground under the GDPR to do so.

In conclusion, it remains advisable to aim for fully anonymous datasets when publishing data. When this is not possible, publish with restricted access to better safeguard the rights of the participants.
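
The three questions and the conclusion above can be read as a rough decision procedure. The sketch below is only an illustration of that reading, not official policy: the names `Dataset`, `Advice` and `publication_advice` are hypothetical, and each yes/no field stands in for a judgement that in practice may need the privacy officer or the Ethics Review Board.

```python
from dataclasses import dataclass
from enum import Enum


class Advice(Enum):
    # Hypothetical outcome labels, paraphrasing the guidance above.
    RESOLVE_RESTRICTIONS = "check agreements, IP and copyright before publishing"
    CONSULT_ERB = "sharing not permitted as promised; ask the Ethics Review Board"
    PUBLISH_OPEN = "publishing as an open dataset is permitted"
    PUBLISH_RESTRICTED = "publish with restricted access"


@dataclass
class Dataset:
    # Question 1: agreements, intellectual property or copyright restrict publishing.
    has_legal_restrictions: bool
    # Question 2: participants were told their data would not be shared or reused.
    participants_told_no_sharing: bool
    # Question 3: the dataset has been fully anonymized.
    is_fully_anonymous: bool


def publication_advice(ds: Dataset) -> Advice:
    """Walk through questions 1 to 3 in order and return the first advice that applies."""
    if ds.has_legal_restrictions:
        return Advice.RESOLVE_RESTRICTIONS
    if ds.participants_told_no_sharing:
        # Even anonymized publication needs an ethics decision in this case.
        return Advice.CONSULT_ERB
    if ds.is_fully_anonymous:
        return Advice.PUBLISH_OPEN
    # Personal data remains: restricted access better safeguards participants.
    return Advice.PUBLISH_RESTRICTED


if __name__ == "__main__":
    survey = Dataset(
        has_legal_restrictions=False,
        participants_told_no_sharing=False,
        is_fully_anonymous=False,
    )
    print(publication_advice(survey).value)
```

The order of the checks mirrors the order of the questions: restrictions from points 1 and 2 take precedence over anonymity, so a fully anonymous dataset is still not cleared for open publication when participants were told nothing would be shared.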