Research Data Policy of the Technische Universität Berlin

Adopted by the Academic Senate on 23 October 2019

1. Preamble

The goal of TU Berlin is to further develop science and technology for the benefit of our society. The members of the university are wholly committed to the principle of sustainable development. TU Berlin carries out basic and application-oriented research at a top international level and promotes cross-faculty research activities and networks with external actors as well as knowledge and technology transfer between the university and practical applications. To this end, TU Berlin forms strategic alliances with companies as well as university and non-university research facilities.

Research data is a valuable resource and a basis for scientific knowledge and has a long-term value for research and science, with the potential for widespread use in society. Research data is all information (regardless of form or representation) that arises during a research process or is its result, including the information necessary to verify and reproduce the results. Research data includes measurement data, lab values, audiovisual information, texts, objects from collections or samples, surveys, and interviews, but also notes, time histories/recordings, calculations, software and code. TU Berlin is aware of the fundamental significance of research data for maintaining the quality of research and scientific integrity and is committed to following recognized standards that meet the highest requirements. TU Berlin acknowledges that correct and easily retrievable research data is the foundation and integral part of every research activity as it is necessary for the verification and reproducibility of research processes and results.

With this policy, TU Berlin wants to provide its current and future researchers with an orientation for handling research data. The policy refers in particular to the recommendations of the German Council of Science and the Humanities (2012) and the German Rectors’ Conference (2014) as well as to the “Guidelines on the Handling of Research Data” (2015) of the German Research Foundation. TU Berlin formulates this policy according to the “Statute on the Safeguarding of Good Academic Practice at TU Berlin” and the “Open Access Policy of TU Berlin”.

2. Scope of Application

This policy on the management of research data principally applies to all researchers active at TU Berlin. In cases where research is carried out with third parties, arrangements with the third party regarding intellectual property rights, access rights, and the storage of research data take precedence over this policy as long as they do not contradict the “TU Berlin Transfer Strategy”, the “Recommendations for Action Regarding Knowledge and Technology Transfer within the Context of Open Science” and the “Code of Conduct for Research Involving Commercial Enterprises”. As a rule, contractual regulations deviating from the Research Data Policy are given priority.

3. Legal Aspects

Intellectual property rights and rights to research data shall be defined through specific agreements (e.g. research contracts such as grant or consortium agreements and contract research agreements). In cases where the rights to the research data belong to TU Berlin, TU Berlin decides how the research data will be published, shared, and reused. If research data belongs to a researcher, the researcher will decide how to proceed with the data.

In research data management, TU Berlin and its researchers observe ethical and legal issues such as data protection and patent law. Priority shall be given to rules on confidentiality.

4. Handling Research Data

Maintaining the integrity of research data is essential: Research data must be stored in a correct, complete, unadulterated, and reliable manner. Furthermore, according to the FAIR principles, research data must also be identifiable, accessible, verifiable, interoperable, and, whenever possible, available for subsequent use.

In accordance with its Open Access Policy, TU Berlin supports open access to research data. In compliance with intellectual property rights and if no third-party rights, data protection rights, and legal or contractual provisions prohibit it, TU Berlin recommends that research data should be assigned a license for open use, such as Creative Commons, to enable the reuse of the research data.

Research data shall be stored in a suitable repository or archive system, shall be assigned a persistent identifier and metadata, and if possible, made openly accessible. The applicable research ethics and data privacy regulations are to be considered.

The minimum storage period for research data is ten years after either the assignment of a persistent identifier or the publication of the related work following research project completion, whichever is later.
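For illustration only, the retention rule above can be written as a small date calculation. The following Python sketch is not part of the policy; the function names `add_years` and `earliest_deletion_date` are hypothetical, and it assumes that both the date of persistent identifier assignment and the publication date are known.

```python
from datetime import date

RETENTION_YEARS = 10  # minimum storage period stated in this policy


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a target year that is not a leap year
        return d.replace(year=d.year + years, day=28)


def earliest_deletion_date(pid_assigned: date, work_published: date) -> date:
    """Ten years after the later of PID assignment and publication."""
    return add_years(max(pid_assigned, work_published), RETENTION_YEARS)
```

For a dataset that received its identifier on 1 March 2020 and whose related work was published on 15 June 2021, the sketch yields 15 June 2031 as the earliest date on which deletion could be considered.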

Adherence to citation rules and requirements regarding publication and future use of the research data must be ensured. When research data is reused, the origin of the data must clearly be verifiable and the original sources must be named.

If research data is to be deleted or destroyed after the expiration period of the storage, this will be carried out only after considering all legal and ethical aspects. When making a decision about the retention and destruction of data, the interests and contractual stipulations of third-party funders and other stakeholders, employees, and partners, as well as the aspects regarding confidentiality and safety must be taken into consideration. Any action must be documented.

5. Responsibilities

The responsibility for research data management during and after a research project lies with TU Berlin and its researchers and should be compliant with the codes of TU Berlin for responsible conduct of research. The researchers are in particular responsible for:

  • a) Management of research data in adherence with the principles and requirements laid out in this policy. This includes specifications made by the principal investigator regarding the handling of research data in the framework of projects;
  • b) Collection, documentation, access to, and storage or proper destruction of research data and research-related records;
  • c) Planning to enable the continued use of data even after the completion of a research project. This includes defining usage rights, with the assignation of appropriate licenses. This also includes the clarification of data storage and archiving if the researcher leaves TU Berlin;
  • d) Creation and updating of Data Management Plans that explicitly define the collection, administration, integrity, confidentiality, storage, use, and publication of the research data.

TU Berlin commits itself to creating the conditions which enable the fulfillment of the principles laid out in this policy.

6. Validity

The Research Data Policy of TU Berlin was adopted by the Academic Senate on 23 October 2019. This policy will be reviewed every three years by the Academic Senate and updated if necessary.

Glossary for the Research Data Policy

Contract research

Contract research is scientific research carried out on behalf of a private or public funding organization. The task is predefined and the rights to the research results generally belong to the principal, whereby the University is granted a non-exclusive right of use for the purpose of research and teaching.

Cooperative research

Cooperative research is cooperation on equal terms between partners pursuing a common goal. In principle, the rights to the results belong to the partner who developed them.

Data management plan

A data management plan (DMP) is a structured guideline on how to handle research data both during and after a research project. It documents how the research data is generated and how it is stored appropriately so that it can be interpreted and verified, and remains available, authentic, citable, and reusable in later years. To this end, clearly defined legal parameters and appropriate security measures (such as contracts and licenses) shall also be included in the DMP. To optimize research data management and as a basis for institutional support, a DMP should be created before the start of a research project and updated in the course of the project.

FAIR principles

The FAIR principles are international principles for sustainably reusable research data. The main objective of the principles is to ensure that research data is optimally prepared so that it is findable, accessible, interoperable, and reusable. The FAIR principles were created and first published in 2016 by a broad-based interest group consisting of representatives from science, industry, funding organizations, and scientific publishers.

License for free use

A license for free use (or free license) is a standardized license agreement which allows research data to be reused. It regulates which rights of use the author/originator grants to the general public that go beyond the applicable copyright law. The license is selected according to the data type. Established free licenses in software are the GNU General Public License (GPL), the MIT license, or the Apache license. Creative Commons (CC) licenses apply to texts, images, music and videos.

Metadata

Metadata is data about data. It is data which provides descriptive or contextual information about other data, can be indexed, and facilitates archiving and retrieval. There are different types or categories of metadata. Metadata in a repository, for example, can be classified in four types of content: bibliographic metadata (e.g. title, author, abstract), structural metadata (relationships between and within objects, e.g. links, references), administrative metadata (authorizations/status, e.g. access rights, embargo), technical metadata (data evaluated by the system, e.g. size of files, checksums, modification date).

Persistent identifier

A persistent identifier is a constant Internet address for digital objects. It guarantees that a dataset remains permanently findable, accessible, and citable, even if its physical location changes. Well known examples are DOI (digital object identifier) or URN (uniform resource name).

Repository

A repository is a storage platform for the archiving and worldwide publication of scientific publications, research data, or cultural heritage data. Storage of research results in a repository ensures that they are sustainably accessible, verifiable, citable, and reusable.

Research Data Management

Research Data Management (RDM) encompasses all measures for the quality assurance of research data with regard to storage, access, and preservation of research data in order to make research results sustainably reproducible and available for reuse. The activities cover the entire research data lifecycle, starting with the planning and execution of the research project, through the creation and storing of data to the long-term storage of the results after the research project is completed. Specific tasks in RDM include quality control and quality assurance, documentation, metadata creation, archiving, data exchange and reuse as well as measures for data integrity and data security.

Researchers

Researchers include all members of TU Berlin active in research including employees and doctoral candidates. People who are not directly connected with TU Berlin but use its infrastructure or are physically present at the University for research purposes are included in this term. Visiting researchers and external cooperative partners are expected to comply with the Research Data Policy.

Guidelines on the implementation of the Research Data Policy

Preface

The Technische Universität Berlin considers research data as a valuable resource and an essential basis for scientific knowledge. On 23 October 2019, the Academic Senate of TU Berlin adopted the “Research Data Policy of the Technische Universität Berlin” pursuant to the “Statute on the Safeguarding of Good Academic Practice at TU Berlin” and the “TU Berlin’s Open Access Policy” and in accordance with the “TU Berlin Transfer Strategy”, the “Recommendations for Action for Knowledge and Technology Transfer in the Context of Open Science” and the “Code of Conduct for Research Involving Commercial Enterprises”. In the same meeting, the Academic Senate approved the “Guidelines on implementing the Research Data Policy of TU Berlin” drafted by the Service Center Research Data Management. TU Berlin supports its researchers with suitable offers for research data management according to financial possibilities. The guidelines* supplement the principles described in the Research Data Policy of TU Berlin and provide practical advice for their implementation.

* These guidelines are based on the guidelines for research data policies of the Humboldt-Universität zu Berlin (2014) and the Friedrich Schiller University Jena (2016).

Point of Contact

The Service Center Research Data Management (SZF) is the central point of contact for all issues related to research data management at TU Berlin. Within SZF, the University Library, the Center for Campusmanagement (ZECM, formerly tubIT), and the Department V Research work together and bundle their competences to support the University’s researchers in handling research data. SZF is managed and coordinated by the University Library. SZF operates the research data infrastructure at TU Berlin, which is integrated into the University’s IT infrastructure, and develops further services as needed. The central technical infrastructure services include DepositOnce, the repository for research data and publications of TU Berlin, and TUB-DMP, a web tool for creating data management plans. Advisory services and a help desk complement the technical services. The research data management and the research data infrastructure of TU Berlin are aligned with the FAIR principles, international guidelines for the optimal management of research data so that data is findable, accessible, interoperable, and reusable.

The SZF webpages serve as a central platform providing comprehensive information on the handling of research data and the services for research data management as well as advisory offers and contact information for the respective contact persons. Corresponding to the task sharing of SZF, the teams of Department V Research (see below) advise you on the requirements of funding organizations and project proposals. The SZF team of the University Library advises you on the handling of research data and research data management. ZECM advises you on the acquisition of infrastructure and the use of its services.

Guidelines on the Handling of Research Data

Below are guidelines on the handling of research data for the different phases of a research project.

I. Planning phase: before the research project

In case of a third-party funded project, inform yourself in advance of any applicable guidelines regarding the reuse of research data originating in the project. Costs arising for the long-term archiving of data that extend beyond the University’s basic infrastructure can and should be part of the funds applied for. Use the Research Department’s advisory offers to learn about the different requirements of funding organizations and funding possibilities and to determine your individual needs.

In order to receive adequate support for your new research project, it is recommended to announce the project during the proposal phase to the Research Department. This project announcement is made using the electronic project notification (ePA). If you have any questions concerning the announcement, the Research Promotion Section will assist you.

Furthermore, for any project in which data is collected or forms the foundation of your research it is strongly recommended that you make an early examination of the requirements and possibilities of efficient and sustainable research data management. Contact the SZF team to find out about the handling of research data and about services for research data management and to develop a suitable strategy for your project.

The creation of a Data Management Plan (DMP) is recommended and increasingly required by funding agencies. A data management plan is a structured guideline outlining how to handle research data both during the project and after its completion. It documents how the research data is generated and how it is stored appropriately so that in later years it can be interpreted and verified and remains available, authentic, citable, and reusable.

During the proposal phase, a strategy for the sustainable archiving and availability of the research data should be determined and included in the data management plan. Clearly defined legal parameters and suitable security measures (such as contracts and licenses) for later use are also to be defined in the data management plan. To optimize research data management and as a basis for institutional assistance, a DMP should be created before the start of the research project and updated over the course of the project (keyword: living document).

For individual as well as for joint projects, you can use the web tool TUB-DMP that will assist you with the creation, versioning, and long-term storage of your data management plan. It contains templates in the form of checklists with relevant questions, which you can answer in a step-by-step workflow as they pertain to your project. The tool also includes a template for creating a data management plan that is compliant with Horizon 2020. The SZF team will assist you with questions about your data management plans.

In all research projects and also in the publication of the research results legal frameworks must be observed. Certain research data, for example in the social or life sciences as well as in medicine, are subject to strict requirements, such as data privacy or previous review by an ethics committee. Copyright and the protection of third-party interests must also be ensured. For this reason, fundamental legal questions are to be settled in advance when planning a research project. The section “Research Contracts, Patents, and Corporate Investments” in Department V can assist you with this. To learn about data protection requirements, contact the Data Protection Officer of TU Berlin early on.

II. Implementation phase: during the research project

State of the art processes are to be used for the storage and processing of research data as well as for collaboration based on this data. This particularly includes compliance with data security as regards the availability, integrity, and authenticity of data. This requires, for instance, the use of data backup and protection, secure data exchange platforms, and versioning tools.
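One common way to check the integrity of stored files is to record a checksum when the data is saved and to compare it again later. The following Python sketch illustrates this under stated assumptions: the function names `sha256_of_file` and `is_unchanged` are hypothetical, SHA-256 is chosen only as an example algorithm, and the sketch does not describe a specific ZECM or SZF service.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so that large data files fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_checksum: str) -> bool:
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of_file(path) == recorded_checksum
```

The recorded checksum can be kept alongside the data, for example in the project documentation, so that later copies or backups can be verified against it.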

In the implementation phase of a project, datasets may evolve through several stages (e.g. by selection, aggregation, integration). It is good practice to label, document, and keep the different versions at least for the duration of the project. Especially in the case of text-based data, the use of versioning tools, as commonly used in software development (e.g. git, SVN), helps with the management of different versions.

ZECM provides the following services for research work at the University; further information can be found on the ZECM website:

  • Use of network file systems (incl. data backup)
  • Archive storage services on tape drives
  • Provision of virtual root servers (server hosting)
  • Accommodation of real servers (server housing)
  • Block storage services for servers (virtual hard disks over a dedicated storage network)
  • Data exchange services
  • Versioning services

These services are either included as basic equipment and are free of charge or offered at cost price.

The same applies to the collaborative working tools provided by ZECM.

Different scientific disciplines and their research domains apply different methods in handling research data. This makes comprehensive recommendations for the concrete procedures hard to define. Therefore it is generally recommended to become acquainted in advance with the established data formats, software, and standards that are used in your scientific community for the documentation and annotation of research data (e.g. ontologies, controlled vocabulary, or metadata schemes). Using open, nonproprietary file formats supports the access to and long-term availability of research data.

Describing research data with metadata is fundamental for the reusability of research data. Metadata is data about data and describes the context in which the data was created. As a rule of thumb, metadata should answer the classic six questions: Who? What? Why? How? When? Where? Metadata are a prerequisite for enabling potential subsequent users to find data and assess its suitability for the intended use. Ideally, the description is structured and machine-readable. For this purpose, metadata standards and standardized terminologies exist in most disciplines. If these do not exist, generic standards, such as Dublin Core, should be used to describe the data. They are developed and promoted by worldwide initiatives and help to make research results better findable and interoperable.
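As an illustration of a structured, machine-readable description, the following Python sketch builds a minimal record from elements of the Dublin Core element set and serializes it as XML. It is a sketch under assumptions, not a prescribed format: all field values are invented placeholders, the wrapper element `metadata` and the function name `to_dublin_core_xml` are hypothetical, and the mapping of the six questions to Dublin Core elements is only one possible reading.

```python
import xml.etree.ElementTree as ET
from typing import Mapping

# Namespace of the Dublin Core Metadata Element Set, version 1.1
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Placeholder values; the question each element answers is an assumed mapping
record = {
    "creator": "Doe, Jane",                          # Who?
    "title": "Example measurement series",           # What?
    "description": "Purpose and method of the measurements",  # Why? How?
    "date": "2019-10-23",                            # When?
    "coverage": "Berlin, Germany",                   # Where?
    "type": "Dataset",
    "format": "text/csv",
    "rights": "CC BY 4.0",
}


def to_dublin_core_xml(fields: Mapping[str, str]) -> str:
    """Serialize a flat mapping of Dublin Core elements to an XML string."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dublin_core_xml(record)
```

A repository or discipline-specific standard will usually define its own required fields and serialization, so the record above should be adapted to the scheme actually in use.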

In collaborative projects or projects with large amounts of data, the use of dedicated work environments and portals for data management is advisable. Operating these infrastructures usually requires additional resources, but they provide the advantage of a uniform and central management for research data. Finding and sharing research data is thereby facilitated, but should be governed within the project consortium by a project-specific data policy.

III. Final phase: after completion of the research project

According to good academic practice, by the end of the project research data is to be stored and, if possible, made accessible, if there are no contradictory contractual, ethical, or legal regulations. Many funding organizations now place particular importance on accessibility, in order to enable the verification of the research results and the reuse of the research data. In keeping with its Open Access Policy, TU Berlin supports open access to research data. When publishing research data, TU Berlin recommends following the principle "Accessible if possible, restricted if necessary".

The following basic principles regarding the publication of research data should be observed:

  • Individual websites (e.g. of projects, working groups, academic chairs, employees) are generally not a suitable location for the publication of research data. The long-term availability of such websites is often not ensured and the unique identification (keyword: persistent identifier) is usually not possible.
  • When selecting data to be published, the DFG recommends (2015): "Data should be made accessible at a stage of processing that allows it to be usefully reused by third parties (raw data or structured data)." In particular, data which forms the basis of a scientific article should be made accessible, if there are no contradicting data protection, legal or research ethics regulations.
  • As is the case for scientific articles, research data should be assigned a unique persistent identifier (PID) upon publication. In this way, research data can be found and cited independently of a publication. Well known examples are DOI (digital object identifier) or URN (uniform resource name).
  • To regulate the rights of use and utilization of research data, data should always be published with an appropriate license. The choice of a license should at least allow open access for scientific purposes. Any special requirements of the funding organization or of repositories are to be observed. Established free licenses in software are the GNU General Public License (GPL), the MIT license or the Apache license. Creative Commons licenses are standard for texts, images, music and videos.

As a member of TU Berlin, you and your cooperation partners can use the interdisciplinary repository of TU Berlin, DepositOnce, to publish your research results (research data and publications). Research results, meaning consolidated data and all information needed to reproduce these results (such as notes, time histories/recordings, calculations, etc.) are stored in DepositOnce. Pursuant to the Statute on the Safeguarding of Good Academic Practice of TU Berlin, research data is stored for at least 10 years.

  • All data in DepositOnce is provided with metadata (standard format Extended Dublin Core).
  • All datasets automatically receive a persistent Internet address (DOI).
  • Various free licenses can be assigned to the datasets.
  • Via the DOI, related research data and publications in DepositOnce can be linked to each other and then refer to each other.
  • In accordance with the rules of good scientific practice, published research data can no longer be modified in DepositOnce. This maintains the data's citability and verifiability. DepositOnce utilizes a versioning in which new versions can be published while previous versions remain available. Every new version receives a new DOI; previous and current versions are automatically linked to one another and refer to each other.
  • DepositOnce is committed to Open Access. The metadata are publicly accessible on the Internet and are broadly distributed and made searchable via standard interfaces (Google Scholar, etc.). An embargo can be placed on the research data itself.

In recent years, scientific communities worldwide have built discipline-specific research data infrastructures in which TU Berlin researchers are also involved. Meanwhile there is a large number of discipline-specific repositories. These may offer advantages compared to DepositOnce, such as discipline-specific metadata schemes and specific search options. If you already use a repository in your scientific community, you should continue to do so. The same applies if it seems reasonable to publish the research data in a discipline-specific repository and such a repository exists for your discipline. It is also possible to publish the data in special data journals. In some disciplines, it is common to publish data as a supplement to the respective article. However, this form of data publication has the disadvantage that the data can only be found via the article and does not form an independent, citable publication object.

When choosing a discipline-specific repository, the following criteria should be observed: long-term availability (at least 10 years), allocation of a persistent identifier (e.g. DOI, URN), licenses and usage rights of the data, reputation and visibility, costs. The portal re3data.org offers a helpful overview with comprehensive search and filter functions when you are searching for a suitable discipline-specific repository for your research data. The SZF team will assist you with questions regarding DepositOnce and repositories.

Research data is to be published as soon as possible. If applicable reasons exist, an embargo can be placed on data in DepositOnce. In this case only the metadata are published; the data itself is stored in the repository and is only visible after expiry of the embargo. Interested persons can request the data via email during the embargo. The embargo is determined by the responsible researchers whereby the requirements and guidelines of research funding agencies and repositories must be observed. Embargo periods should not exceed a maximum of 5 years after the project end. An embargo must be justified, for example in a file in the repository which also includes the expiration date of the embargo.
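For illustration only, the recommended upper limit for embargo periods can be checked with a short date calculation. The following Python sketch is not a feature of DepositOnce; the function names `add_years`, `latest_embargo_end`, and `embargo_is_within_guideline` are hypothetical, and the sketch assumes that the project end date is known.

```python
from datetime import date

MAX_EMBARGO_YEARS = 5  # recommended maximum after the project end


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a target year that is not a leap year
        return d.replace(year=d.year + years, day=28)


def latest_embargo_end(project_end: date) -> date:
    """Latest embargo expiry that still follows the five-year guideline."""
    return add_years(project_end, MAX_EMBARGO_YEARS)


def embargo_is_within_guideline(project_end: date, embargo_end: date) -> bool:
    """Return True if the embargo ends no later than five years after project end."""
    return embargo_end <= latest_embargo_end(project_end)
```

Requirements of research funding agencies or repositories may set a shorter limit, in which case that limit would replace the constant used here.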