Research Data Policy of the Technische Universität Berlin

adopted by the Academic Senate on 23 October 2019; updated due to the decision of the executive board on 15 March 2023

1. Preamble

The goal of TU Berlin is to further develop science and technology for the benefit of our society. The members of the university are wholly committed to the principle of sustainable development. TU Berlin carries out basic and application-oriented research at a top international level and promotes cross-faculty research activities and networks with external actors as well as knowledge and technology transfer between the university and practical applications. To this end, TU Berlin forms strategic alliances with companies as well as university and non-university research facilities.

Research data is a valuable resource and a basis for scientific knowledge and has long-term value for research and science, with the potential for widespread use in society. Research data is all information (regardless of form or representation) that arises during a research process or is its result, including the information necessary to verify and reproduce the results. Research data includes measurement data, lab values, audiovisual information, texts, objects from collections or samples, surveys, and interviews, but also notes, time histories/recordings, calculations, software and code. TU Berlin is aware of the fundamental significance of research data for maintaining the quality of research and scientific integrity and is committed to following recognized standards that meet the highest requirements. TU Berlin acknowledges that correct and easily retrievable research data is the foundation and integral part of every research activity as it is necessary for the verification and reproducibility of research processes and results.

With this policy, TU Berlin wants to provide its current and future researchers with an orientation for handling research data. The policy refers in particular to the recommendations of the German Council of Science and the Humanities (2012) and the German Rectors’ Conference (2014) as well as to the “Guidelines on the Handling of Research Data” (2015) of the German Research Foundation. TU Berlin formulates this policy according to the “Statute on the Safeguarding of Good Academic Practice at TU Berlin” and the “Open Access Policy of TU Berlin”.

2. Scope of Application

This policy on the management of research data principally applies to all researchers active at TU Berlin. Generally, it must also be adhered to when research is conducted with third parties. In exceptional and justified cases, there may be deviations from the policy to enable the research, especially when it comes to agreements with third parties regarding intellectual property rights, access rights, and the storage of research data, provided that the "Statute on the Safeguarding of Good Academic Practice at TU Berlin" and its implementing regulations remain unaffected and that the deviations are not contrary to the principles of the "TU Berlin Transfer Strategy", the "Recommendations for Action Regarding Knowledge and Technology Transfer within the Context of Open Science" and the "Code of Conduct for Research Involving Commercial Enterprises".

3. Legal Aspects

Intellectual property rights and rights to research data shall be defined through specific agreements (e.g. research contracts such as grant or consortium agreements and contract research agreements). In cases where the rights to the research data belong to TU Berlin, TU Berlin decides how the research data will be published, shared, and reused. If research data belongs to a researcher, the researcher will decide how to proceed with the data.

In research data management, TU Berlin and its researchers observe ethical and legal issues such as data protection and patent law. Priority shall be given to rules on confidentiality.

4. Handling Research Data

Maintaining the integrity of research data is essential: Research data must be stored in a correct, complete, unadulterated, and reliable manner. Furthermore, according to the FAIR principles, research data must also be identifiable, accessible, verifiable, interoperable, and, whenever possible, available for subsequent use.

In accordance with its Open Access Policy, TU Berlin supports open access to research data. In compliance with intellectual property rights and if no third-party rights, data protection rights, and legal or contractual provisions prohibit it, TU Berlin recommends that research data should be assigned a license for open use, such as Creative Commons, to enable the reuse of the research data.

Research data shall be stored in a suitable repository or archive system, shall be assigned a persistent identifier and metadata, and if possible, made openly accessible. The applicable research ethics and data privacy regulations are to be considered.

The minimum storage period for research data is ten years after either the assignment of a persistent identifier or the publication of the related work following research project completion, whichever is later.
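The "whichever is later" rule above can be illustrated with a short calculation. The following is only a sketch in Python: the function names are hypothetical, and treating a missing date as "not applicable" is an assumption of this example, not a rule stated in the policy.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def earliest_deletion_date(
    pid_assigned: Optional[date],
    published: Optional[date],
    min_years: int = 10,
) -> date:
    """Earliest date on which deletion could even be considered.

    The retention period starts at the later of the two reference dates.
    Assumption: a date that does not exist yet is passed as None and ignored.
    """
    known = [d for d in (pid_assigned, published) if d is not None]
    if not known:
        raise ValueError("at least one reference date is required")
    return add_years(max(known), min_years)


# Hypothetical dataset: identifier assigned in 2020, related work published in 2021.
print(earliest_deletion_date(date(2020, 3, 1), date(2021, 6, 15)))
```

The result is only a lower bound; the remaining conditions on deletion (legal, ethical, contractual) still apply.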

Adherence to citation rules and requirements regarding publication and future use of the research data must be ensured. When research data is reused, the origin of the data must clearly be verifiable and the original sources must be named.

If research data is to be deleted or destroyed after the storage period has expired, this will be carried out only after considering all legal and ethical aspects. When making a decision about the retention and destruction of data, the interests and contractual stipulations of third-party funders and other stakeholders, employees, and partners, as well as the aspects regarding confidentiality and safety must be taken into consideration. Any action must be documented.

5. Responsibilities

The responsibility for research data management during and after a research project lies with TU Berlin and its researchers and should be compliant with the codes of TU Berlin for responsible conduct of research. The researchers are in particular responsible for:

  • a) Management of research data in adherence with the principles and requirements laid out in this policy. This includes specifications made by the principal investigator regarding the handling of research data in the framework of projects;
  • b) Collection, documentation, access to, and storage or proper destruction of research data and research-related records;
  • c) Planning to enable the continued use of data even after the completion of a research project. This includes defining usage rights and assigning appropriate licenses. This also includes the clarification of data storage and archiving if the researcher leaves TU Berlin;
  • d) Creation and updating of Data Management Plans that explicitly define the collection, administration, integrity, confidentiality, storage, use, and publication of the research data.

TU Berlin commits itself to creating the conditions which enable the fulfillment of the principles laid out in this policy.

6. Validity

The Research Data Policy of TU Berlin was adopted by the Academic Senate on 23 October 2019 and will be continued by resolution of the Academic Senate of 15 March 2023. The policy will be reviewed by the university library within the framework of the Service Center Research Data Management every three years or when needed and will be introduced to the Academic Senate for resolution by the member of the Executive Board responsible for research.

Glossary for the Research Data Policy

Contract research

Contract research is scientific research carried out on behalf of a private or public funding organization. The task is predefined and the rights to the research results generally belong to the principal, whereby the University is granted a non-exclusive right of use for the purpose of research and teaching.

Cooperative research

Cooperative research is cooperation on equal terms between partners pursuing a common goal. In principle, the rights to the results belong to the partner who developed them.

Data management plan

A data management plan (DMP) is a structured guideline on how to handle research data both during and after a research project. It documents how the research data is generated and how it is stored appropriately so that it can be interpreted and verified, and remains available, authentic, citable, and reusable in later years. To this end, clearly defined legal parameters and appropriate security measures (such as contracts and licenses) shall also be included in the DMP. To optimize research data management and as a basis for institutional support, a DMP should be created before the start of a research project and updated in the course of the project.

FAIR principles

The FAIR principles are international principles for sustainably reusable research data. The main objective of the principles is to ensure that research data is optimally prepared so that it is findable, accessible, interoperable, and reusable. The FAIR principles were created and first published in 2016 by a broad-based interest group consisting of representatives from science, industry, funding organizations, and scientific publishers.

License for free use

A license for free use (or free license) is a standardized license agreement which allows research data to be reused. It regulates which rights of use the author/originator grants to the general public that go beyond the applicable copyright law. The license is selected depending on the data type. Established free licenses in software are the GNU General Public License (GPL), the MIT license, or the Apache license. Creative Commons licenses (CC) apply to texts, images, music and videos.

Metadata

Metadata is data about data. It is data which provides descriptive or contextual information about other data, can be indexed, and facilitates archiving and retrieval. There are different types or categories of metadata. Metadata in a repository, for example, can be classified into four content types: bibliographic metadata (e.g. title, author, abstract), structural metadata (relationships between and within objects, e.g. links, references), administrative metadata (authorizations/status, e.g. access rights, embargo), and technical metadata (data evaluated by the system, e.g. size of files, checksums, modification date).
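As an illustration of the technical metadata mentioned above (file size, checksum, modification date), the following Python sketch derives these values for a single file. It is not how any particular repository computes them; the function name, the dictionary keys, and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def technical_metadata(path: Path) -> dict:
    """Collect size, checksum, and modification date of one file."""
    stat = path.stat()
    digest = hashlib.sha256()
    # Read in chunks so that large data files do not have to fit in memory.
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "file_name": path.name,          # hypothetical key names
        "size_bytes": stat.st_size,
        "sha256": digest.hexdigest(),
        "modified": modified.isoformat(),
    }
```

Recomputing the checksum later and comparing it with the stored value is a simple way to detect whether a file has been altered or corrupted.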

Persistent identifier

A persistent identifier is a constant Internet address for digital objects. It guarantees that a dataset remains permanently findable, accessible, and citable, even if its physical location changes. Well-known examples are DOI (digital object identifier) or URN (uniform resource name).

Repository

A repository is a storage platform for the archiving and worldwide publication of scientific publications, research data, or cultural heritage data. Storage of research results in a repository ensures that they are sustainably accessible, verifiable, citable, and reusable.

Research Data Management

Research Data Management (RDM) encompasses all measures for the quality assurance of research data with regard to storage, access, and preservation of research data in order to make research results sustainably reproducible and available for reuse. The activities cover the entire research data lifecycle, starting with the planning and execution of the research project, through the creation and storing of data to the long-term storage of the results after the research project is completed. Specific tasks in RDM include quality control and quality assurance, documentation, metadata creation, archiving, data exchange and reuse as well as measures for data integrity and data security.

Researchers

Researchers include all members of TU Berlin active in research including employees and doctoral candidates. People who are not directly connected with TU Berlin but use its infrastructure or are physically present at the University for research purposes are included in this term. Visiting researchers and external cooperative partners are expected to comply with the Research Data Policy.

Guidelines on the implementation of the Research Data Policy

On 23 October 2019, the Academic Senate of TU Berlin adopted the Research Data Policy of the Technische Universität Berlin. With this policy, TU Berlin wants to provide its current and future researchers with an orientation for handling research data. The Guidelines on the implementation of the Research Data Policy of the Technische Universität Berlin provide practical advice for Research Data Management (RDM) in the different phases of a research project. The Academic Senate agreed to the Guidelines on 23 October 2019. They are maintained and continuously updated by the Service Center Research Data Management.

Abbreviations

Abbreviation   Meaning
DFG            German Research Foundation
DMP            Data management plan
Dept. V        Research and Technology Transfer Department
FAIR           Findable, Accessible, Interoperable, Reusable
RDM            Research data management
UB             University Library
SZF            Service Center Research Data Management
ZECM           Center for Campusmanagement

Central Point of Contact at TU Berlin

The Service Center Research Data Management is the central point of contact for all issues related to RDM at TU Berlin.

Within the Service Center Research Data Management (SZF), the University Library (UB), the Center for Campusmanagement (ZECM), and the Research and Technology Transfer Department (Dept. V) cooperate and bundle their competences to support the University’s researchers in handling research data. SZF is managed and coordinated by the University Library. SZF operates the research data infrastructure at TU Berlin, which is integrated into the University’s IT infrastructure. The central technical infrastructure services include DepositOnce, the repository for research data and publications of TU Berlin, and TUB-DMP, a web tool for creating data management plans (DMPs). A wide range of training and advisory services complements the technical services.

RDM and the research data infrastructure of TU Berlin are aligned with the FAIR principles, international guidelines for the optimal management of research data so that the data is findable, accessible, interoperable, and reusable.

The SZF website serves as a central platform for RDM at TU Berlin. Here, you will find comprehensive information on the RDM services at TU Berlin, including the respective contact persons.

In line with the division of tasks within SZF, responsibilities are distributed along the research data life cycle (see figure).

Guidelines on the Handling of Research Data

Early planning of your research data management not only has the advantage of creating a transparent framework for the uniform handling of data in a project. It is also important in terms of possible costs for RDM. You can apply for funding for staff (e.g. for the development and administration of research infrastructure) or even the infrastructure itself (e.g. INF project in Collaborative Research Centres).

In advance: Ten questions about RDM during the proposal phase

  1. Are there any guidelines for the handling of research data (e.g., from the funding organization, the research institution, the project partner or the discipline)?
  2. Have responsibilities for research data management been assigned in the project?
  3. Is there a data management plan for the project?
  4. Will you use the IT infrastructure of TU Berlin for storage? Have you already contacted ZECM about this?
  5. Will you use third-party research data? Will you need funding to access this data?
  6. Do you plan to use external repositories for archiving and/or publication of research data? If so, which ones?
  7. Is there a publication strategy for research data and publications in your project (e.g., open access)?
  8. Will you need funding for archiving and/or publishing the research data?
  9. Will you apply for staff to manage the research data and/or to administer the technical infrastructure?
  10. Is it ensured that the staff is trained in the handling of research data?

I. Planning and proposal phase (before the research project)

In the case of a third-party funded project, inform yourself in advance of any applicable funding guidelines regarding research data. Costs arising for RDM – e.g., human resources for data processing or for the development of project-internal workflows, publication costs or resources for long-term archiving of the data that extend beyond the University's basic infrastructure – can and should be part of the funds applied for. Use the advisory services of Dept. V to learn about the different requirements of funding organizations and funding possibilities and to determine your individual needs.

In all research projects, and also in the publication of the research results, legal frameworks must be observed and should preferably be settled in advance. Certain research data, for example in the social or life sciences, are subject to strict requirements, e.g. on data protection or ethics. Copyright and the protection of third-party interests must also be ensured.

In order to receive adequate support for your new research project, it is recommended to announce the project to Dept. V during the proposal phase. This is done using the electronic project notification (ePA).

Research data management

For any project in which data is collected or forms the foundation of your research, it is strongly recommended that you examine the requirements and possibilities of efficient and sustainable research data management at an early stage. A strategy for the sustainable archiving and availability of the research data should be determined as early as the proposal phase. The legal status of the data and suitable security measures for data use during and after the research project should also be defined.

These definitions should be included in a DMP. A DMP documents how the research data is generated and how it is stored appropriately so that in later years it can be interpreted and verified and remains available, authentic, citable, and reusable. To optimize RDM and as a basis for institutional assistance, a DMP should be created before the start of the research project and updated over the course of the project (living document).

Almost all funding organizations require a DMP to be submitted as part of the project proposal. As support for your DMP, you can use the web tool TUB-DMP. It contains templates with relevant questions, which you can answer in a step-by-step workflow as they pertain to your project.

  • Advice on RDM and TUB-DMP: SZF

II. Implementation phase (during the research project)

State-of-the-art processes are to be used for the storage and processing of research data as well as for collaboration based on this data. This particularly includes compliance with data security as regards the availability, integrity, and authenticity of data. This requires, for instance, the use of data backup and secure data exchange platforms.

In the implementation phase of a project, datasets may evolve through several stages (e.g., by selection, aggregation, integration). It is good practice to label, document and keep the different versions at least for the duration of the project. Especially in the case of text-based data, the use of versioning tools, as commonly used in software development (e.g., GitLab, SVN), helps with the management of different versions.

ZECM provides the following services for research work at the University; further information can be found on the ZECM website.

  • Use of network file systems (incl. data backup)
  • Archive storage services on tape drives
  • Provision of virtual root servers (server hosting)
  • Accommodation of real servers (server housing)
  • Block storage services for servers (virtual hard disks over a dedicated storage network)
  • Data exchange services
  • Versioning services

These services are either included as basic equipment and are free of charge or offered at cost price. The same applies to the collaborative working tools provided by ZECM.

Different scientific disciplines and their research domains apply different methods in handling research data. This makes comprehensive recommendations for the concrete procedures hard to define. Therefore, it is generally recommended to become acquainted in advance with the established data formats, software, and standards that are used in your scientific community for the documentation and annotation of research data (e.g., ontologies, controlled vocabulary, or metadata schemes). Using open, nonproprietary file formats supports the access to and long-term availability of research data.

Describing research data with metadata is fundamental for the reusability of research data. Metadata is data about data and describes the context in which the data was created. As a rule of thumb, metadata should answer the classic six questions: Who? What? Why? How? When? Where? Metadata is a prerequisite for enabling potential subsequent users to find data and assess its suitability for the intended use. Ideally, the description is structured and machine-readable. For this purpose, metadata standards and standardized terminologies exist in most disciplines. If these do not exist, generic standards, such as Dublin Core, should be used to describe the data. They are developed and promoted by worldwide initiatives and help to make research results easier to find and more interoperable.
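To make "structured and machine-readable" more concrete, the following Python sketch writes a small record using elements of the Dublin Core element set. It is only an illustration: the mapping of the six questions to Dublin Core elements, the wrapper element name "metadata", and the sample values are assumptions of this example, not a format prescribed by TU Berlin or DepositOnce.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> str:
    """Serialize {element name: [values]} as a simple Dublin Core XML record."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset description; one possible mapping of the six questions.
example = {
    "creator": ["Doe, Jane"],                          # Who?
    "title": ["Soil moisture measurements, site A"],   # What?
    "description": [
        "Collected to calibrate a runoff model.",      # Why?
        "Capacitive sensors, 10-minute intervals.",    # How?
    ],
    "date": ["2023-05-01"],                            # When?
    "coverage": ["Berlin, Germany"],                   # Where?
}

print(dublin_core_record(example))
```

Where a discipline-specific metadata standard exists, it would normally take the place of the generic elements used here.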

In collaborative projects or projects with large amounts of data, the use of dedicated work environments and portals for data management is advisable. Operating these infrastructures usually requires additional resources, but they provide the advantage of a uniform and central management for research data. Finding and sharing research data is thereby facilitated, but should be governed within the project consortium by a project-specific data policy.

  • Advice on technical infrastructure and IT services: ZECM
  • Advice on metadata and metadata standards: SZF

III. Final phase (after the research project)

According to good academic practice, by the end of the project research data is to be stored and, if possible, made accessible, if there are no contradictory contractual, ethical, or legal regulations. Many funding organizations now place particular importance on accessibility, in order to enable the verification of the research results and the reuse of the research data. In keeping with its Research Data Policy, TU Berlin supports open access to research data. When publishing research data, TU Berlin recommends following the principle “Accessible if possible, restricted if necessary”.

The following basic principles regarding the publication of research data should be observed:

  • Individual websites (e.g., of projects, working groups, academic chairs, employees) are generally not a suitable location for the publication of research data. The long-term availability of such websites is often not ensured and the unique identification (keyword: persistent identifier) is usually not possible.
  • When selecting data to be published, the DFG recommends: "Research data should be made accessible at a stage of processing that allows it to be usefully reused by third parties (raw data or structured data)." In particular, data which forms the basis of a scientific article should be made accessible, if there are no contradicting data protection, legal or research ethics regulations.
  • As is the case for scientific articles, research data should be assigned a unique persistent identifier (PID) upon publication. In this way, research data can be found and cited independently of a publication. Well-known examples are DOI (digital object identifier) or URN (uniform resource name).
  • To regulate the rights of use and utilization of research data, data should always be published with an appropriate license. The choice of a license should at least allow open access for scientific purposes. Any special requirements of the funding organization or of repositories are to be observed. Established free licenses in software are the GNU General Public License (GPL), MIT License or the Apache License. Creative Commons licenses are standard for texts, images, music and videos.

As a member of TU Berlin, you and your cooperation partners can use the interdisciplinary repository of TU Berlin, DepositOnce, to publish your research results (research data and publications). Research results, meaning consolidated data and all information needed to reproduce these results (such as notes, time histories/recordings, calculations, etc.), are stored in DepositOnce. Pursuant to the Statute on the Safeguarding of Good Academic Practice of TU Berlin, research data is stored for at least 10 years.

  • All data in DepositOnce is provided with metadata (standard format Extended Dublin Core).
  • All datasets automatically receive a persistent Internet address (DOI).
  • Various free licenses can be assigned to the datasets.
  • Via the DOI, related research data and publications in DepositOnce can be linked to each other and then refer to each other.

In accordance with the rules of good scientific practice, published research data can no longer be modified in DepositOnce. This maintains the data's citability and verifiability. DepositOnce uses versioning, in which new versions can be published while previous versions remain available. Every new version receives a new DOI; previous and current versions are automatically linked to one another and refer to each other.

DepositOnce is committed to Open Access. The metadata are publicly accessible on the Internet and are broadly distributed and made searchable via standard interfaces (Google Scholar, etc.). An embargo can be placed on the research data itself.

Discipline-specific research data infrastructures have been built, and are still being built, for many disciplines as part of various initiatives and projects in which TU Berlin researchers are also involved. As a result, there is now a large number of discipline-specific repositories. These may offer advantages compared to DepositOnce, such as discipline-specific metadata schemes and specific search options. If you already use a repository in your scientific community, you should continue to do so. The same applies if it seems reasonable to publish the research data in a discipline-specific repository and such a repository exists for your discipline. In some disciplines, it is common to publish data as a supplement to the respective article. However, this form of data publication has the disadvantage that the data can only be found via the article and does not form an independent, citable publication object.

When choosing a discipline-specific repository, the following criteria should be observed: long-term availability (at least 10 years), allocation of persistent identifiers (e.g. DOI, URN), licenses and usage rights of the data, reputation and visibility, costs. The portal re3data.org offers a helpful overview with comprehensive search and filter functions when you are searching for a suitable discipline-specific repository for your research data.

Research data is to be published as soon as possible. If there are valid reasons, an embargo can be placed on data in DepositOnce. In this case, only the metadata are published; the data itself is stored in the repository and only becomes visible after the embargo expires. Interested persons can request the data via email during the embargo. The embargo is determined by the responsible researchers, whereby the requirements and guidelines of research funding agencies and repositories must be observed. Embargo periods should not exceed 5 years after the end of the project. An embargo must be justified, for example in a file in the repository which also includes the expiration date of the embargo.
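The embargo conditions above (a stated justification and an end date no more than 5 years after the project end) can be checked mechanically when a deposit is prepared. The Python sketch below shows one way to do so; the function names are hypothetical, and the 5-year limit is treated here as a hard check although the guidelines phrase it as a recommendation. Requirements of funders or other repositories are not modelled.

```python
from datetime import date

MAX_EMBARGO_YEARS = 5  # taken from the guideline text above


def latest_embargo_end(project_end: date) -> date:
    """Latest recommended embargo end: the project end date plus 5 years."""
    target_year = project_end.year + MAX_EMBARGO_YEARS
    try:
        return project_end.replace(year=target_year)
    except ValueError:
        # 29 February in a non-leap target year falls back to 28 February.
        return project_end.replace(year=target_year, day=28)


def embargo_is_acceptable(project_end: date, embargo_end: date, justification: str) -> bool:
    """True if a justification is given and the embargo ends within the limit."""
    has_reason = bool(justification.strip())
    return has_reason and embargo_end <= latest_embargo_end(project_end)


# Hypothetical project ending on 31 December 2024.
print(embargo_is_acceptable(date(2024, 12, 31), date(2027, 1, 1), "Patent application pending"))
```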

  • Advice on the publication of research data, licenses, DepositOnce/repositories: SZF