Linking Possibilities

Short Description

In addition to data from our main studies, SOEP-Core and SOEP-IS, the SOEP Research Data Center (SOEP-RDC) offers a number of other datasets. These open up diverse possibilities for data linkage, for example in spatial or regional analysis.


The Microm-SOEP dataset enables users to link SOEP data with small-scale indicators from the micro-marketing provider microm. The Microm indicators have been matched with SOEP data at the housing block level. To protect the confidentiality of respondents’ data in accordance with data protection law, the data linkage was carried out on site at Kantar Public, the survey institute responsible for the SOEP fieldwork and the only institution that knows respondents’ addresses.

All survey households remain completely anonymous. For security reasons—due to the small-scale nature of the data—analysis is only possible on specially protected SOEP computers on site at DIW Berlin.

Jan Goebel, C. Katharina Spieß, Nils R. J. Witte, Susanne Gerstenberg
Die Verknüpfung des SOEP mit MICROM-Indikatoren: Der MICROM-SOEP-Datensatz (PDF, 0.75 MB)

Contact Person

Neighbourhood Effects

The project "neighbourhood effects" aims to combine existing individual-level datasets, at a small-scale regional level, with information on the respective neighbourhood generated from different data sources. In a second step, the role of neighbourhood effects for various outcome variables is analyzed in a social context. Possible approaches include investigating the importance of neighbourhood effects for individual labor market success or for the individual likelihood of receiving welfare benefits.

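Cooperation partners at the FDZ der Bundesagentur für Arbeit im Institut für Arbeitsmarkt- und Berufsforschung (Research Data Center of the German Federal Employment Agency at the Institute for Employment Research):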

Stefan Bender (Project Head)
Theresa Scholz (Project Liaison)

Cooperation partners at the Rheinisch-Westfälisches Institut für Wirtschaftsforschung e. V.:

Matthias Vorell (Project Liaison)
Thomas K. Bauer (Project Liaison)

The data are available at the RDC Ruhr as RWI-GEO-LAB data. Metadata: DOI 10.7807/DIWIABRWI:V1


Longitudinal survey of migrants from the social insurance statistics

There is currently a lack of reliable empirical data in two areas of growing importance for future migration and integration research: a) data on the integration of German-born children and grandchildren of immigrants, and b) data on the integration of immigrants from countries that have joined the EU since 2004. In cooperation with the Institute for Employment Research (IAB) in Nuremberg, the SOEP is creating a sample of immigrants to Germany based on administrative data from the Federal Employment Agency; this sample will then be continued as a longitudinal household survey within the SOEP study framework. The initial linkage of survey data on migrants with administrative data, including an experiment on consent to the linkage of register data, opens up new analytical potential for research and policy advice and is of major importance for the research infrastructure in Germany.

Survey data from the IAB-SOEP Migration Sample can be linked with administrative labor market and income data if the respondents in question have given explicit consent to record linkage. Because the linked dataset contains weakly anonymized social data, it is accessible only through the Research Data Center of the German Federal Employment Agency at the IAB (FDZ IAB). Researchers can use FDZ IAB data during a guest visit to the IAB or through remote data processing, which can also be arranged with the IAB. Requests for data access should be directed to FDZ IAB, since a data use contract with the IAB is required (further information).

Philipp Simon Eisnecker, Klaudia Erhardt, Martin Kroh, Parvati Trübswetter
The Request for Record Linkage in the IAB-SOEP Migration Sample (PDF, 433.04 KB)

Philipp Eisnecker, Martin Kroh
The Informed Consent to Record Linkage in Panel Studies: Optimal Starting Wave, Consent Refusals, and Subsequent Panel Attrition

Armin Falk, Fabian Kosse
Early Childhood Environment, Breastfeeding and the Formation of Preferences (PDF, 0.86 MB)

Fabian Kosse, Thomas Deckers, Hannah Schildberg-Hörisch, Armin Falk
The Formation of Prosociality: Causal Evidence on the Role of Social Environment (PDF, 0.58 MB)


In cooperation with the Research Data Center of the German Pension Insurance (FDZ-RV), we link SOEP household survey data with individual administrative employment and retirement biographies, which are available on a monthly basis for employees from age 14 onward.

The 2018 SOEP-RV RTBN Scientific Use File became available on September 1, 2021, as the special sample “SOEP-RV.RTBN2018.” The sample includes 2,120 SOEP respondents who agreed to have their survey data linked to anonymized data from the Research Data Center of the German Pension Insurance Association (FDZ-RV). The file merges detailed account information on pension entitlements and pension payment amounts with SOEP sociodemographic data, such as data on material well-being, at both the individual and household levels. The codebook for this dataset and information on how to order it are available from the Forschungsdatenzentrum der Rentenversicherung (FDZ-RV).


The SOEP provides various linked employer-employee datasets. Some of them stem from the two SOEP-LEE studies, both of which included data collection from establishments that employ SOEP-Core participants. The first SOEP-LEE study collected one wave of data in 2013, while SOEP-LEE2 added two more waves in 2022 and 2024 (scheduled). SOEP-LEE2 also comprises a business-related survey of self-employed SOEP-Core participants, which was fielded in 2022 and 2024 (scheduled), extending the 2020 wave contributed by the INNOMSME study. Within SOEP-LEE2, additional data were collected from establishments that are not linkable to SOEP-Core, but that received a similar questionnaire, resulting in a larger dataset for company-level analyses.

The two SOEP-LEE studies collected data on different topics. The first SOEP-LEE study focused on organization and management, human resources policies, wages and inequality, and the financial situation of the establishments. SOEP-LEE2 kept some of these topics, but its main focus was on workplace digitalization, the organization of work, personnel management and development, and the COVID-19 pandemic. In the 2020 wave, the self-employed survey asked about innovation and productivity, R&D, (intangible) capital, and perceptions of one's own entrepreneurial activity. The 2022 wave continued with these themes but adopted some questions from the SOEP-LEE2 establishment questionnaire for greater coherence.

Data access
Researchers who wish to access the data can do so by visiting the Research Data Center of the SOEP (RDC SOEP onsite). For more information, please check the website of the SOEP-in-Residence program.

Data Structure and Linkage
We provide the data of the two SOEP-LEE studies in separate datasets. Data from the first SOEP-LEE study are contained in the datasets slee_estab and slee_sample: slee_estab includes the data collected in the establishment survey, while slee_sample is the linkage file that contains SOEP-Core person identifiers (pid) and establishment identifiers (eid), allowing for linkage with SOEP-Core.

Data from the SOEP-LEE2 employer survey are distributed in the datasets lee2estab, lee2brutto, and lee2person. lee2estab contains the survey data themselves, while lee2brutto provides additional fieldwork information. lee2person is the linkage file that contains SOEP-Core person identifiers (pid) and establishment identifiers (eid), allowing for linkage with SOEP-Core.

Data for the self-employed are provided in the selfempl dataset. Each individual's business is identified by the SOEP-Core person identifier (pid), so no further linkage file is required.

Note that it is not possible to combine the 2012 wave of SOEP-LEE with the subsequent waves into a single panel dataset, because the 2012 wave uses different establishment identifiers, even if the same establishment happened to be surveyed again.
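The linkage via pid and eid described above can be illustrated with a minimal Python sketch. It uses invented toy records rather than real SOEP data, which are accessible only on site. Only the identifier names pid and eid come from the dataset description; the fields gross_wage and n_employees, the function name, and the assumption that each person appears at most once in the linkage file are hypothetical and chosen purely for illustration.

```python
# Toy records only; real SOEP-LEE data must be analysed on site at the RDC SOEP.
# Only `pid` and `eid` are taken from the dataset description; every other
# field name below is a made-up placeholder.

def link_persons_to_establishments(persons, linkage, establishments):
    """Attach establishment records to person records via a pid-eid linkage file.

    Assumes (hypothetically) at most one establishment per person.
    """
    estab_by_eid = {row["eid"]: row for row in establishments}
    eid_by_pid = {row["pid"]: row["eid"] for row in linkage}

    linked = []
    for person in persons:
        eid = eid_by_pid.get(person["pid"])
        estab = estab_by_eid.get(eid)
        if estab is None:
            # Inner join: persons without a surveyed establishment are dropped.
            continue
        linked.append({**person, **estab})
    return linked


# Stand-in for SOEP-Core person data.
persons = [
    {"pid": 101, "gross_wage": 3200},
    {"pid": 102, "gross_wage": 2800},
    {"pid": 103, "gross_wage": 4100},
]
# Stand-in for a linkage file such as slee_sample or lee2person.
linkage = [
    {"pid": 101, "eid": 9001},
    {"pid": 102, "eid": 9001},
]
# Stand-in for an establishment file such as slee_estab or lee2estab.
establishments = [
    {"eid": 9001, "n_employees": 250},
]

linked = link_persons_to_establishments(persons, linkage, establishments)
for row in linked:
    print(row)
```

In this sketch, person 103 has no entry in the linkage file and is therefore left out of the result, mirroring the fact that only some SOEP-Core participants have an establishment that took part in the survey. For the selfempl data, the same idea would reduce to a direct match on pid.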

For the first SOEP-LEE study, please cite: Weinhardt, M.; Meyermann, A.; Liebig, S.; Schupp, J. (2017). The Linked Employer-Employee Study of the Socio-Economic Panel (SOEP-LEE): Content, Design and Research Potential. Jahrbücher für Nationalökonomie und Statistik 237(5), 457–467.

For SOEP-LEE2, please cite: Matiaske, W., Schmidt, T. D., Halbmeier, C., Maas, M., Holtmann, D., Schröder, C., Böhm, T., Liebig, S., and Kritikos, A. S. (2023). SOEP-LEE2: Linking Surveys on Employees to Employers in Germany. Jahrbücher für Nationalökonomie und Statistik, Data Observer, 1–14.

Questions and variables are documented as part of SOEP-Core. Moreover, the following documentation is currently available:

slee_estab, 2011:

  • Questionnaire, pseudo-PAPI, PDF (de (PDF, 2.76 MB))
  • Codebook (de (PDF, 335.4 KB))
  • Methodological report (de (PDF, 3.1 MB))
  • Project Report (en (PDF, 143.92 KB))

lee2estab, 2021:

  • Questionnaire, pseudo-PAPI, PDF (de)
  • Questionnaire with variable names, PDF (de)

selfempl, 2020:

  • Questionnaire with variable names, PDF (de)


Title: K²ID-SOEP extension study

DOI: 10.5684/k2id-soep-2013-15/v1
Collection Period: 2013-2015
Publication date: 2017-09-08
Principal investigators: Pia S. Schober, C. Katharina Spieß
Further Researchers: Juliane F. Stahl, Georg F. Camehl

K2ID is short for “Kinder und Kitas in Deutschland” (children and daycare centers in Germany), the German name of the surveys carried out as part of a project entitled “Early childhood education and care quality in the Socio-Economic Panel” (K²ID-SOEP).

The project investigates the effects of the quality of early childhood education and care (ECEC) institutions on children’s development and on parents’ employment and wellbeing. It also examines socio-economic differences in parental choices of ECEC quality and whether these are linked to information asymmetries between mothers and ECEC providers.

The K2ID data collection is based on participants of the Socio-Economic Panel (SOEP). In addition, participants of the “Families in Germany” (FID) study (which is in the process of being integrated with the SOEP) were included in the sampling frame. From this group, those with one or more children below school age at the date of the survey were given an additional questionnaire on their child care arrangements, with a focus on quality. If they used an ECEC institution, they were also asked to provide the address of this institution and, if applicable, to identify the specific group that their children attend. In a second step, the ECEC institution directors and group educators were also given a questionnaire to collect additional information on quality in the respective setting.

More information on the study and data collection is available on the project homepage.

Since October 2017, the data from the K2ID-SOEP survey have been distributed through the Research Data Center of the SOEP.

Documentation and Questionnaires

Questionnaires are currently available only in German; English versions will follow.

Questionnaire for parents:

Questionnaire for child-minders:

Questionnaire for daycare managers:



Philipp Kaminsky and Janine Napieraj
User support and contract management for the Research Data Center of the SOEP