9.3 Data protection
Data protection rights must be observed when collecting, storing, processing, and passing on research data relating to individuals. If you work as a researcher at a Hessian university with such data, it is advisable to know the main features of the following legal texts in particular:
- General Data Protection Regulation of the European Union (GDPR)
- German Federal Data Protection Act (BDSG)
- Hessian Data Protection and Freedom of Information Act (HDSIG)
The following video briefly introduces the data protection laws that are particularly relevant to scientific research and explains how they relate to each other:
Source: Excerpt from MLS LEGAL - Data Protection in Research (YouTube) [Creative Commons licence with attribution (reuse permitted)]
Data without personal reference or anonymised information, on the other hand, do not fall under data protection law and can usually be processed freely, taking into account other rights (e.g. copyrights).
What exactly distinguishes personal data from other (anonymous) research data is explained in detail in the following section. In case of doubt, you should assume a personal reference to avoid liability risks.
9.3.1 Personal data and special categories of personal data
According to Art. 4 (1) of the GDPR, personal data is any information relating to an identified or identifiable natural person, i.e. a living individual. Examples of personal research data include survey data in the social sciences or health data in medical research.
An identifiable person is one who can be identified, directly or indirectly, in particular by reference to:
- an identifier such as a name, an identification number, location data, or an online identifier, or
- one or more particular characteristics that express the physical, physiological, genetic, mental, economic, cultural, or social identity of that natural person.
Recent case law has classified the following, in particular, as personal data:
- Images, film, and sound recordings if there is a reference to a person
- IP addresses
- Written answers of a candidate in a vocational examination
- Examiner's comments on the assessment of these answers
In determining whether a person is identifiable, the GDPR requires that account be taken of all the means reasonably likely to be used by the controller or by any other person to identify the person, considering objective factors such as the cost and time required (Recital 26 GDPR).
Source: Excerpt from MLS LEGAL - Data protection in research (YouTube) [Creative Commons licence with attribution (re-use permitted)].
In addition, the law treats certain categories of data as particularly sensitive. These include, for example, data on a person's state of health, sexual orientation, and political or religious views. A list of these special categories of personal data can be found in Article 9 of the GDPR.
This data is subject to special protection and special due diligence obligations during processing. This means, for example, that participants in scientific studies must explicitly consent to the processing of these special categories of personal data before the data is collected. Further aspects are explained in the following video:
Source: Excerpt from MLS LEGAL - Data protection in research (YouTube) [Creative Commons licence with attribution (re-use permitted)].
When processing personal data, the principles relating to the processing of personal data (Art. 5 GDPR) must be observed:
- Personal research data may only be collected if they are necessary to achieve the research purpose.
- Collection and processing must be transparent and fair towards the data subjects.
- Data subjects must at all times be able to understand how their personal data are processed and must not be misled by false or omitted information.
- Protecting privacy by safeguarding personal data should be central to all collection and processing considerations.
- The data must correctly reflect the circumstances of the person concerned, i.e. they must not misrepresent them.
- The data must be protected against misuse (e.g. removal, alteration, damage) by technical and organisational measures, within the bounds of what is reasonable.
9.3.2 Informed consent and legal permission standards
Personal research data may only be collected and processed with the informed consent of the person concerned or on the basis of a statutory permission (the so-called principle of prohibition subject to permission).
According to Recital 32, sentence 2 GDPR, the following requirements can be stated for informed consent:
- Consent must be freely given (i.e. without physical or psychological pressure).
- Especially when processing sensitive personal data (according to Art. 9 or 10 GDPR), it is advisable to obtain the consent in writing.
- The persons giving consent must be able to understand in advance which of their personal data will be used, how, for what purpose, by whom, and for how long. In other words, they should be put in a position to assess the consequences of their own consent.
Statutory permissions, by contrast, allow processing without the consent of the data subject. Particular importance is attached to the exceptions for scientific research purposes contained in § 27 BDSG, but also in many state data protection laws (e.g. § 13 LDSG-BW, § 17 DSG-NRW, § 13 NDSG).
According to this, the processing of personal data is permitted if the interests pursued with the research project outweigh those of the persons concerned (cf. forschungsdaten.info). However, since this rarely applies, you should always obtain consent in case of doubt.
Consent does not require any special form. However, it must be verifiable – e.g. in the event of a review by the data protection supervisory authority – so that written or electronic documentation is strongly recommended. The declaration of consent should contain at least the following information:
- Person responsible for data collection (legal entity) who is also the addressee of the declaration of consent;
- Project title;
- Specific information on the type of data collected;
- Data processing procedures, data protection officer;
- Reference to voluntariness and to the right of withdrawal, and to the consequences (or absence of consequences) of refusal or withdrawal;
- Particularly important: intended use(s).
Above all, the data subject must be informed that their consent is completely voluntary, that they may therefore refuse to consent, and that, if they do consent, they can withdraw the consent at any time with effect for the future, although processing that has already taken place cannot be reversed (cf. https://www.forschungsdaten-bildung.de/einwilligung).
The declaration of consent must be supplemented with information on the processing of the data. This includes the legal basis and purposes of the processing (insofar as these go beyond the processing), any data transfer to countries outside the EU, the storage or deletion periods of the personal data and the right of appeal to a data protection supervisory authority (cf. Watteler/Ebel 2019: 60).
Consent can also be given in the abstract for scientific purposes that are not yet known at the time of collection (so-called broad consent). However, the more specifically the possible uses are described, the more likely it is that the consent will validly cover uses that go beyond the primary purpose.
If the data are to be published as part of research data management (RDM), the consent should explicitly include the storage and publication of the data. A practicable compromise between abstract (broad) and concrete consent can, for example, be a graded consent.
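Graded consent can be thought of as an ordered set of options from which participants choose. The following Python sketch shows one possible way to model this. The three grades, their names, and the assumption that each grade includes the narrower ones are purely illustrative and are not taken from any specific consent form.

```python
from enum import IntEnum


class ConsentLevel(IntEnum):
    """Hypothetical consent grades, ordered from narrowest to broadest."""
    PROJECT_ONLY = 1   # processing within the original research project
    STORAGE = 2        # additionally: long-term storage of the data
    PUBLICATION = 3    # additionally: publication of the data


def is_covered(granted: ConsentLevel, required: ConsentLevel) -> bool:
    """Check whether a planned use is covered by the grade a participant chose.

    Assumes the grades are cumulative, i.e. a broader grade includes
    all narrower ones.
    """
    return granted >= required


# A participant who agreed to storage but not to publication:
granted = ConsentLevel.STORAGE
may_store = is_covered(granted, ConsentLevel.STORAGE)
may_publish = is_covered(granted, ConsentLevel.PUBLICATION)
```

Recording the chosen grade per participant makes it possible to filter the dataset before archiving or publication, so that only records with sufficient consent are passed on.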

Fig. 9.2: Example of informed consent in "broad consent format" (source: Baumann/Krahn 2020).
The following video summarises all aspects of informed consent and statutory permissions once more:
Further information
Some disciplines offer assistance and examples of wording for written informed consent (cf. e.g. VerbundFDB, RatSWD).
9.3.3 Means of removing identifying features
In general, personal research data must be anonymised as soon after collection as the research purpose permits (at the latest when the research project is completed).
Anonymisation
A change in the data to such an extent that the individual data on personal or factual circumstances can no longer be attributed to a specific or identifiable natural person (so-called absolute anonymisation) or can only be attributed to a specific or identifiable natural person with a disproportionate effort in terms of time, costs, and manpower (so-called de facto anonymisation).
The first step is to remove direct identification features (name, address, telephone number, etc.). Often, however, this is not sufficient to eliminate a reference to a person. In this case, reducing the accuracy of the information (aggregation) can be an effective measure that also allows certain parts of the information to be retained.
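The first step described above can be sketched in a few lines of Python. The field names and the sample record are hypothetical. Which fields count as direct identification features has to be decided for each dataset.

```python
# Hypothetical field names; adapt the set to the actual dataset.
DIRECT_IDENTIFIERS = {"name", "address", "telephone"}


def strip_direct_identifiers(record, identifiers=DIRECT_IDENTIFIERS):
    """Return a copy of the record without direct identification features."""
    return {key: value for key, value in record.items() if key not in identifiers}


# Fictional example record.
respondent = {
    "name": "Erika Mustermann",
    "address": "Example Street 1, Example Town",
    "telephone": "000 000000",
    "year_of_birth": 1984,
    "occupation": "nurse",
}

cleaned = strip_direct_identifiers(respondent)
```

The remaining fields (here year of birth and occupation) may still allow identification in combination, which is why further measures are often needed.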
Aggregation
Combining several individual values of the same kind to reduce the granularity of information. From the combined information, it is no longer possible to draw conclusions about the individual pieces of information.
Here, detailed individual information (e.g. last month's salary) is grouped into classes (e.g. lower, middle, and upper income class). The degree of aggregation necessary to exclude a personal reference can vary. It essentially depends on which other potential identification features are available in the data or can be obtained from external sources.
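As a minimal sketch of such a grouping, the following Python snippet maps exact salaries to three classes. The class boundaries (2000 and 4000) are arbitrary illustrative values, not recommended thresholds.

```python
import bisect

# Illustrative class boundaries; real thresholds depend on the study.
# Below 2000 -> "lower", 2000 up to below 4000 -> "middle", 4000 and above -> "upper".
BOUNDARIES = [2000, 4000]
LABELS = ["lower", "middle", "upper"]


def salary_class(monthly_salary):
    """Map an exact salary to a coarse class label."""
    return LABELS[bisect.bisect_right(BOUNDARIES, monthly_salary)]


salaries = [1850, 2000, 3999, 7200]
classes = [salary_class(s) for s in salaries]
# classes is ["lower", "middle", "middle", "upper"]
```

Only the class labels would be kept in the dataset; the exact values are discarded. Fewer, wider classes reduce the risk of re-identification but also reduce the analytical value of the data.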
Example of gradual aggregation
Address → City → State → East/West → Country → Continent
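The chain above can be represented as an ordered list of levels, so that the degree of aggregation becomes a single parameter. The following Python sketch assumes that each record already contains a value for every level. The function name and the sample values are illustrative.

```python
# Levels ordered from most to least precise, following the chain above.
LEVELS = ["address", "city", "state", "east_west", "country", "continent"]


def generalise(location, level):
    """Keep only the chosen level and all coarser levels; drop finer ones."""
    start = LEVELS.index(level)
    return {name: location[name] for name in LEVELS[start:]}


# Fictional example record.
location = {
    "address": "Example Street 1",
    "city": "Marburg",
    "state": "Hesse",
    "east_west": "West",
    "country": "Germany",
    "continent": "Europe",
}

at_state_level = generalise(location, "state")
# Keeps state, east_west, country, and continent; address and city are removed.
```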
In each case, it must be carefully considered which of the available means are the most suitable and proportionate for removing the identifying characteristics. The aim is that de-anonymisation is impossible or possible only to a very limited extent, even for someone with additional knowledge and extensive capacities for data research and aggregation.
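One common way to support this consideration is to count how many records share the same combination of remaining attributes. This is the idea behind k-anonymity, a technique that the text above does not itself name and that is used here only as an illustration. The attribute names and records in the following Python sketch are fictional.

```python
from collections import Counter


def smallest_group_size(records, attributes):
    """Return the size of the smallest group of records that share
    the same combination of values for the given attributes."""
    groups = Counter(tuple(record[a] for a in attributes) for record in records)
    return min(groups.values())


# Fictional, already aggregated records.
records = [
    {"state": "Hesse", "age_group": "30-39", "salary_class": "middle"},
    {"state": "Hesse", "age_group": "30-39", "salary_class": "upper"},
    {"state": "Hesse", "age_group": "40-49", "salary_class": "middle"},
    {"state": "Hesse", "age_group": "30-39", "salary_class": "lower"},
]

k_fine = smallest_group_size(records, ["state", "age_group"])   # 1: one person is unique
k_coarse = smallest_group_size(records, ["state"])              # 4: nobody stands out
```

A smallest group size of 1 means that at least one person is unique with respect to the chosen attributes and could be singled out by anyone who knows these attributes. Such a count is only an aid: it does not account for external sources and does not replace a case-by-case assessment.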
Postponement of anonymisation is only possible if characteristics that reveal a personal reference are needed to achieve the research purpose or individual research steps. This is the case, for example, during an ongoing research project that uses biometric data.
In this case, however, the personal characteristics must be securely and separately stored immediately after collection. This can be done, for example, by pseudonymising the personal research data.
Pseudonymisation
The separation of personal characteristics from the rest of the data immediately after collection, so that the data can no longer be assigned to a specific person without additional information.
One example is the use of a key table that assigns corresponding ID codes to the plain names of persons. In this way, the personal reference can only be established if one is in possession of the key table. If necessary, this can also be held by an independent trustee.
However, data processed in this way remain personal data, and are therefore subject to the requirements of data protection law, until the separately stored personal characteristics are deleted.
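The key-table approach can be sketched as follows in Python. The function name, the field names, and the format of the ID codes are assumptions made for illustration. In practice, the key table would be stored separately from the research data and protected accordingly, or handed to a trustee.

```python
import secrets


def pseudonymise(records, name_field="name"):
    """Split records into pseudonymised data and a separate key table.

    Returns (pseudonymised_records, key_table), where key_table maps
    plain names to randomly generated ID codes.
    """
    key_table = {}
    pseudonymised = []
    for record in records:
        name = record[name_field]
        if name not in key_table:
            # Random code that cannot be derived from the name itself;
            # with 64 random bits, collisions are extremely unlikely.
            key_table[name] = "P-" + secrets.token_hex(8)
        row = {k: v for k, v in record.items() if k != name_field}
        row["id_code"] = key_table[name]
        pseudonymised.append(row)
    return pseudonymised, key_table


# Fictional example: two measurements for one person, one for another.
measurements = [
    {"name": "Erika Mustermann", "wave": 1, "score": 12},
    {"name": "Max Mustermann", "wave": 1, "score": 17},
    {"name": "Erika Mustermann", "wave": 2, "score": 15},
]

data, key_table = pseudonymise(measurements)
```

Records belonging to the same person receive the same ID code, so longitudinal analysis remains possible without the plain names. Deleting the key table removes the direct link to names, but whether the data then count as anonymous still depends on the remaining attributes.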
Source: Excerpt from MLS LEGAL - Data protection in research (YouTube) [Creative Commons licence with attribution (re-use permitted)].