Impact of an Experiment-Based Intervention on Pre-Service Primary School Teachers’ Experiment-Related and Science Teaching-Related Self-Concepts

Teachers’ academic self-concept is considered an important factor influencing their professional competence. Regarding primary science education, positive science (teaching) related self-concepts might encourage teachers to plan and teach ‘minds on’ experiment-based science lessons leading to deep learning processes. However, research on pre- and in-service primary teachers’ self-concepts and influencing factors, such as previous experimental experience, is scarce. Thus, this study investigates the impact of an experiment-based intervention on pre-service primary school teachers’ experiment-related self-concept and self-concepts on planning and teaching experiment-based lessons. The evaluation followed a quasi-experimental, longitudinal (pre-post) design with an experimental group of N = 158 pre-service primary teachers and a baseline group (N = 44), not attending the course. According to the results, pre-service teachers gained little to moderate experimental experience in school and studying at university. Besides, the pre-service teachers with a science major gained significantly more experimental experience than those with other majors during their time at the university. Significant, positive correlations were found between previous experimental experiences and the self-concepts examined in this study. While self-concepts did not change in the baseline group, they increased significantly in the experimental group. One reason for this could be the perception of competence, as the findings reveal positive correlations between changes in self-concepts and perceived experimental competence during the intervention. Regarding the impact of the variable ‘course format’ on reinforcing the self-concepts, participants of the intensive block format seem to have a slight advantage compared to pre-service teachers attending the traditional, weekly course format. 
Furthermore, the results indicate that the course is equally beneficial for pre-service teachers with and without a science major.


INTRODUCTION
Primary school teachers need specific professional competencies to plan, teach, and reflect on experiment-based science lessons that encourage pupils to engage in 'minds-on' scientific investigations (Minner et al., 2010; National Academies of Sciences, Engineering, and Medicine, 2015). Competent science teachers depend on syntactic knowledge, e.g. knowledge of the control-of-variables strategy (Haslbeck, 2019; Schwichow et al., 2016) and knowledge of the steps of experimentation. Pedagogical content knowledge, including knowledge about appropriate instructional strategies for experiment-based lessons and (mis)conceptions as well as possible learning difficulties of pupils, is also required (Baumert & Kunter, 2013; Franken et al., 2020; KMK, 2008/2019). However, professional knowledge alone is not sufficient for planning and teaching inquiry-based science lessons. Primary school teachers should also be able to generate and test hypotheses themselves (KMK, 2008/2019). According to Capps et al. (2012), inquiry professional development benefits from experimental experiences. Experiences (of competence) can, in turn, influence teachers' academic self-concept (Bong & Skaalvik, 2003; Dickhäuser, 2006; Shavelson et al., 1976), which is an essential predictor of teaching effectiveness and teaching behavior (Guskey, 1988; Yeung et al., 2014). It is defined as "self-evaluated perception of [teachers'] professional knowledge" (Paulick et al., 2016, p. 174), abilities and performance (Dickhäuser, 2006) and is considered a multidimensional, hierarchical construct (Marsh et al., 1988; Paulick et al., 2016; Shavelson et al., 1976).
Perceptions of competencies regarding scientific inquiry and planning/teaching inquiry-based science lessons can be located on lower hierarchical levels of teachers' academic self-concept (Atzert et al., 2020; Franken et al., 2020; Paulick et al., 2016). The experiment-related self-concept is a person's perception of her/his abilities and skills in experimentation (Buse et al., 2018; Damerau, 2012; Franken et al., 2020). Based on the model of experimentation competency by Schreiber et al. (2009), the latter authors differentiate between the dimensions of planning (hypothesizing, developing an experimental setup), conducting (handling equipment, measuring, recording results), and interpreting (analyzing data, drawing conclusions, verifying/falsifying hypotheses) experiments. However, other studies use the two subscales 'planning & interpreting' and 'conducting' experiments (Atzert et al., under review; Rautenstrauch & Busker, 2020) or an overall scale (Peschel, 2018). Within the field of competence in teaching, a distinction can be made between self-concepts on planning, teaching, and reflecting on experiment-based lessons, referring to the work of Bosch (2006) and Merkens (2010). The phase of planning lessons includes, among other things, determining learning objectives and activities, whereas teaching comprises interacting with the pupils or dealing with deviations from the plan. The phase of reflecting covers activities such as drawing consequences from the evaluation of the lesson (Bosch, 2006; Merkens, 2010).
According to Yeung et al. (2014), teachers with a positive self-concept are more likely to motivate their pupils and initiate deep learning processes than teachers with a negative self-concept. Low domain-specific self-concepts can lead teachers to avoid certain topics or activities in class (Appleton, 2007; Damerau, 2012). Besides, teachers' self-concept was found to predict the usage of new instructional practices (Guskey, 1988) and indicate teachers' performance in professional knowledge (Paulick et al., 2016). Given these findings, it can be assumed that primary school teachers with a high experiment-related self-concept and self-concept on planning/teaching experiment-based lessons are more willing and able to arrange challenging experiment-based lessons than teachers with low self-concepts. Furthermore, these self-concepts can be used as an indicator for factual experimental competencies and syntactic knowledge (Paulick et al., 2016; Schreiber et al., 2016).
To support primary school teachers in planning and teaching 'minds on' experiment-based science lessons, teacher education programs should aim at enhancing pre-service teachers' corresponding self-concepts as early as possible. However, there is a lack of research on (the development of) pre- and in-service teachers' academic self-concepts and influencing factors (Franken, 2020; Paulick et al., 2016; Sorge et al., 2019). Regarding the experiment-related self-concept, it was found that both secondary school students (Buse, 2017; Damerau, 2012; Rautenstrauch & Busker, 2020) and pre-service biology, chemistry (secondary school), and primary science and social studies teachers within the master program assess their skills in 'conducting experiments' higher than in the dimensions of planning and interpreting experiments. This could be due to few opportunities for experimentation in school, especially regarding the steps of planning and interpreting, and to dominating 'recipe-type' activities, where learners follow prescribed procedures (Seidel et al., 2007; Yip, 2001). Thus, many pupils and university students show deficits in planning and evaluating experiments (Fleischer et al., 2020; Hilfert-Rüppell et al., 2013). Among other things, this applies to formulating usable research questions (Hofstein et al., 2005), understanding and adopting the control-of-variables strategy (Emereole, 2009; Hilfert-Rüppell et al., 2013), recording data, and attending to the hypothesis to draw conclusions (Germann & Aram, 1996). So far, however, it has not been explicitly investigated whether there is a positive correlation between teachers' past experimental experiences and their experiment-related self-concepts and self-concepts on planning and teaching experiment-based lessons.
Longitudinal studies concerning these self-concepts are also scarce. It has been shown that experiment-based educational settings can reinforce secondary school students' (Buse et al., 2018; Damerau, 2012) and pre-service primary science and social studies teachers' (Peschel, 2018) experiment-related self-concept. A similar development was found for pre-service primary teachers' self-concept on the control-of-variables strategy in the study by Haslbeck (2019). In contrast, the experiment-related self-concept of pre-service biology, chemistry, and primary science and social studies teachers did not change significantly during their practical semester in the master program (Franken, 2020).
To date, it has rarely been explored exactly which variables and factors in the course of study play a role in forming teachers' self-concepts. In addition to experiences of competence (Dickhäuser, 2006; Shavelson et al., 1976), individual feedback on performance and a supportive learning environment could be of relevance (Bong & Skaalvik, 2003; Lüdtke et al., 2005; Möller & Trautwein, 2015). According to the results of Paulick et al. (2017) and Sorge et al. (2019), pre-service teachers' self-concepts are also formed using external and internal frames of reference (Marsh et al., 1988). Based on the findings of Franken et al. (2020) and Nadelson et al. (2013), the major field of study can also be assumed to be an important variable influencing science (teaching) related self-concepts. This factor is of great relevance as many primary school teachers have to teach science without appropriate training in this area (Porsch & Wendt, 2016). However, it has not yet been investigated whether pre-service primary teachers who do not have a science major have lower experiment-related and science teaching-related self-concepts than those with this major. Besides, it is not known whether the academic self-concepts of these two groups of students change differently as a result of university courses focusing on experimentation. Especially with a view to pre-service primary teachers without a science major, who might be interested in further training in the area of science (education), intensive block course formats could be attractive. Compared to 'traditional' courses taking place once or twice a week, their "main characteristic […] appears to be that an equal number of class hours is delivered in more concentrated bursts" (Burton & Nesbit, 2008, p. 5). The question arises as to whether the course format influences the development of pre-service teachers' academic self-concepts. Here, too, research is minimal. Schaal & Randler (2004) and Hilkenmeier & Sommer (2014) revealed higher scores in perceived competence for pre-service teachers attending a block course format. Since experiencing competence is described as a determinant forming the self-concept (Bong & Skaalvik, 2003; Shavelson et al., 1976), these findings could suggest that block courses are more likely to reinforce pre-service teachers' self-concepts than weekly course formats.

Aim of the Study and Research Questions
The following research questions were derived from the lack of empirical research on the status quo and the impact of university courses on pre-service primary teachers' experiment-related self-concepts and self-concepts on planning and teaching experiment-based lessons described above.
Since academic self-concepts can be fed by experiences with the environment and experienced competence in the past (Bong & Skaalvik, 2003; Shavelson et al., 1976), the first research questions are:

RQ1 (Experimental experience)
RQ1.1: How much experimental experience have pre-service primary teachers gained in primary school, in secondary school, and while studying at university? Are there any differences between students with a science major and those with a different major?
RQ1.2: Is there a positive correlation between previous experimental experiences and the self-concepts examined in this study before starting with the intervention?
The next step was to investigate whether a university course (intervention) that focuses on experimentation impacts pre-service primary teachers' experiment-related self-concept and self-concepts on planning and teaching experiment-based lessons.

RQ2 (Effects of the intervention)
RQ2.1: Does participation in the intervention lead to a change in self-concepts compared to non-participation?
RQ2.2: Do participants of the intervention show differences in developing these self-concepts depending on the course format (weekly/traditional or block/intensive)?
RQ2.3: Do participants of the intervention show differences in developing these self-concepts depending on their major field of study (science major or not)?

Assuming that changes in these self-concepts result from the experience of competence during the intervention (Bong & Skaalvik, 2003; Dickhäuser, 2006), the third research question is:

RQ3: Is there a positive correlation between perceived competence in planning, conducting, and interpreting experiments during the intervention and a change of self-concepts from pre- to posttest?

METHOD

Study Design and Data Collection
The present study examined two different university course formats (intervention) with a quasi-experimental longitudinal design with two points of data collection and two main study groups of pre-service primary teachers (see Figure 1). The experimental group (EG) took part in the intervention, offered either in a 'traditional' weekly course format during the lecture period or as an intensive, four-day block course during semester break. The control group, hereinafter referred to as the baseline group (BG), did not attend the course. Its purpose was to determine the effects of other factors on the dependent variables studied (e.g. the impact of participating in other courses or the practice semester). To ensure comparability of the results, the BG was also divided into two subgroups: Like participants in the weekly course format, one subgroup took the survey (see section Measurement Instrument) at the beginning and end of the lecture period. The other BG subgroup filled in the questionnaire as if they were taking a block course. All participants were asked to take two online questionnaires (tool: SoSci Survey; Leiner, 2006) via the news forum of a digital learning room. The pretest took place within ten days prior to the intervention and the posttest within ten days after completing the last course session. As many factors influence an intervention, it was always held in the same classroom and taught by the same lecturer and student assistant. For this reason, the influence of these factors is considered negligible for the evaluation.

Educational Concept and its Curricular Framework
This is a summary of the curricular framework and educational concept, which focuses on reinforcing pre-service primary teachers' competencies in experimentation as well as in planning and teaching experiment-based lessons. For a detailed description of this newly developed intervention, refer to Beudels, Schilling, and Preisfeld (under review).
The intervention can be attended by pre-service primary school teachers in the bachelor's and master's programs. Due to the examination regulations, most of these students can only take the course voluntarily, i.e. without earning any credit points. This applies, for example, to students of science and social studies in their master's program and students focusing on special education, music, or theology. The reason for making this intervention available to all pre-service primary teachers is that many of them have to teach science (within the subject of science and social studies) without an adequate university education (Porsch & Wendt, 2016). As part of a pilot project, bachelor students having a major field of study in science (biology, chemistry, physics) and technology were able to attend the course as a replacement for a mandatory module component.
The intervention consists of twelve 100-minute sessions in each of the two course formats (see Figure 1). To guarantee a good and similar supervisory relationship, there was a maximum of 30 students per course. Based on previous research findings (Kleickmann et al., 2006; Schwichow et al., 2016), a moderate-constructivist learning environment with tutorial support was chosen: Accompanied by the lecturer and phases of reflection, many opportunities are given to check previous knowledge, to further develop competencies and to exchange ideas with fellow students. Framed by an introductory and a closing session, two five-session-long contexts from children's living environment, 'the pond and its surrounding' and 'the human being and its physical performance', form the framework of the activities described below. Each context block was divided into station learning and planning an experiment for a science lesson (see Figure 1).

Figure 1. Overview of the study design (top) and the sequence as well as the contents of the course sessions (bottom)
1. Experimentation at stations (three sessions each): Questions such as 'How can water striders walk on water?' or 'Can I recognize food by taste if I cannot see and smell it?' form the starting point of station learning in partner work. There are two to four stations per session with several sets of experimental materials. In the following, a 'pedagogical biplane' (Tolsdorf & Markic, 2018; Wahl, 2013) is applied: Being in the role of learners, the pre-service teachers can learn science content, learn about scientific inquiry (e.g. principles of systematic observations), and learn to do inquiry (e.g. skills to handle measuring devices; Gyllenpalm & Wickman, 2011) through experimentation. From a teacher's perspective, experimentation at stations can simultaneously be used as a model for a possible learning setting in science lessons (learning to teach science, Gyllenpalm & Wickman, 2011).
Worksheets located at the stations are designed so that the participants undergo all three phases of experiments: They are asked to formulate hypotheses on a research question, plan a suitable experimental setup, interpret the results of the experiments and draw conclusions. This design intends to strengthen the experimental competencies, especially regarding the phases of planning and interpreting experiments, in response to the shortcomings in these phases described above. To avoid excessive demands on the participants (e.g. due to a lack of routine; Girwidz, 2020), strongly guided experimentation (prescribed procedures) is applied at the beginning of the course. Later, more open, 'minds-on' experimental formats, e.g. including the phase of planning an experiment autonomously, are also used.

2. Debriefing and reflection regarding professional competencies and teaching practice: After each experimental phase, the findings are collected and reflected on in the plenum. Guided by the lecturer, the participants are asked to briefly summarize the process of gaining scientific knowledge: What was the assumption? What was done? What was observed? What can be concluded from this? Difficulties encountered in experimentation are collected to reflect on possible solutions to similar problems in the classroom (e.g. uncertainties regarding the handling of experimental equipment; Kurth & Wodzinski, 2020). The lecturer also uses this phase to discuss different forms of classroom experiments or provide theoretical input regarding syntactic knowledge (e.g. control-of-variables strategy; Schwichow et al., 2016).
3. Planning an experiment for a science lesson (two sessions each): Once in each context, the teams plan a classroom experiment. They are faced with the challenge of applying their previous experimental experiences and the findings from the reflection phases in a situation typical of their profession (Kirsch, 2020). Since planning an experimental lesson is a complex process (Nerdel, 2017), only particular aspects of planning are taken up in the course. The task is to design a worksheet or researcher's protocol for an experiment to be used in a science lesson at primary school. Grade level and the type of experiment can be freely selected; typical steps of gaining scientific knowledge via experimentation (see above) should be integrated. In parallel, teaching-related considerations, e.g. learning objectives and dealing with emerging difficulties, are recorded on a poster. Planning aids include a list of operators for formulating learning objectives and excerpts from teaching materials.
4. Reflection and feedback: These planning phases are again reflected on in the plenum. In addition, all groups receive written feedback from the lecturer on their teaching-related considerations, the experiment, and the worksheet. After finishing the planning activities within context two, each group also conducts the experiment of another team and gives short written feedback using predetermined feedback rules.

Sample
The study was carried out from winter semester 2017/2018 until the end of summer semester 2019 at a university in North Rhine-Westphalia (Germany). 238 pre-service primary school teachers participated. 191 of them formed the EG, participating in the intervention, whereas 47 persons belonged to the BG. The course took place four times weekly and four times in a block format during the survey period. Data records of participants who answered the survey at none or one of the two measurement times were excluded. Hence, data analysis (see below) was performed with a total sample size of Ntotal = 202 (NEG = 158; NBG = 44). 79.2% of the participants were enrolled in bachelor's degree programs and 20.8% were master students. When the posttest started, the mean age was 22.65 years (SD = 2.93 years). The gender distribution of 90.1% female participants represents the high percentage of female primary school teachers in Germany (89.4% female teachers in the school year 2019/2020; Federal Statistical Office, 2020). 55.7% (NEG, weekly = 88) of the participants in the EG completed the weekly course format, 44.3% (NEG, block = 70) attended the block format. To ensure comparability, BG participants were also divided in a 'weekly' (NBG,weekly = 21; i.e. 47.7%) and 'block' subgroup (NBG,block = 23; i.e. 52.3%). While 66.5% of the participants in the EG stated that they study primary school teaching with a science major (SciMaj), the other 33.5% were enrolled with other majors (non-SciMaj), e.g. social or religious studies. In the BG, 61.4% had a science major, 38.6% studied other majors. Almost half of the participants (e.g. all students without a science major) could only take part in the intervention voluntarily (NEG, voluntary = 72; i.e. 45.6%). As a pilot project, the other half (NEG, mandatory = 86; i.e. 54.4%) participated as part of a mandatory module (see above).

Measurement Instrument
Experimental experience, experiment-related self-concepts, self-concepts on planning and teaching experiment-based lessons, and perceived experimental competencies during the intervention were investigated using five-point Likert-type scales from 1 = strongly disagree to 5 = strongly agree.
Examples of items associated with these scales are shown in Table A1 (Appendix). As the test language was German, the items are listed in their original form and in translated versions. Since the items were self-constructed or adapted from scales used in a survey with another study group (Damerau, 2012), exploratory factor analyses (principal axis factor analysis with varimax rotation; Bühner, 2021) were performed to examine the construct validity of the subscales presented below (see also Table A1).
Beforehand, the adequacy of the data for factor analysis was checked via the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy (Kaiser & Rice, 1974) and Bartlett's test of sphericity (Bartlett, 1951). The number of extracted factors was determined based on theoretical considerations (Bühner, 2021) and the total variance of the items explained by the factors (Janssen & Laatz, 2017). Then the items were assigned to the factor on which they load the most, whereby factor loading values λ ≤ .50 were not considered (Backhaus et al., 2018). Items with cross-loadings ≥ .40 (Noorman, 2017) were also excluded. Factor loadings λ < .30 are not reported (Fromm, 2012).
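The item assignment rule described above can be sketched as follows. This is a minimal illustration, not the procedure run in SPSS: the function name `assign_items`, the use of absolute loadings, and the example loadings are assumptions made here and are not taken from Tables A2 to A5.

```python
def assign_items(loadings, min_loading=0.50, max_cross=0.40):
    """Assign each item to its highest-loading factor, or drop it.

    loadings: dict mapping an item name to its list of factor loadings.
    An item is dropped if its highest absolute loading is <= min_loading
    or if any other absolute loading (cross-loading) is >= max_cross.
    """
    assigned, dropped = {}, []
    for item, lams in loadings.items():
        abs_lams = [abs(lam) for lam in lams]
        best = max(range(len(abs_lams)), key=abs_lams.__getitem__)
        cross = [v for i, v in enumerate(abs_lams) if i != best]
        if abs_lams[best] <= min_loading or any(v >= max_cross for v in cross):
            dropped.append(item)
        else:
            assigned[item] = best
    return assigned, dropped


# Hypothetical loadings for illustration only.
example = {
    "item_a": [0.82, 0.12],
    "item_b": [0.45, 0.20],  # highest loading too low
    "item_c": [0.61, 0.43],  # cross-loading too high
    "item_d": [0.15, 0.77],
}
assigned, dropped = assign_items(example)
print(assigned)  # factor index per retained item
print(dropped)   # items excluded by either criterion
```

With these made-up values, items a and d are retained on the first and second factor respectively, while items b and c are excluded.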
In the pretest, experimental experience in primary school, in secondary school, and while studying at university (RQ1) was measured using two items each (see Table A1; self-construction). The factor analysis results support this division into three subscales (see Table A2; three-factor-solution explaining 79.36 % of total variance). As shown in Table A1, the reliability of the constructs in the form of internal consistency is good (Cronbach's α ≥ .8) or excellent (α ≥ .9; George & Mallery, 2003) respectively.
At both test times, experiment-related self-concepts were quantified using two subscales with items adapted from Damerau (2012). The original scale, consisting of three subscales ('planning', 'conducting' and 'interpreting'), could not be confirmed by the factor analysis, as only two factors were extracted (see Table A3). Based on the factor analysis results and in accordance with Atzert et al. (under review), the subscales 'planning' and 'interpreting' were combined into one subscale, the self-concept on planning & interpreting experiments (six items; see Table A1). It focuses on self-assessed abilities associated with theoretical considerations such as making predictions, developing suitable setups for experiments, and analyzing experimental observations (Rautenstrauch & Busker, 2020; Schreiber et al., 2009). In the case of the self-concept on conducting experiments (three items), the focus is on psycho-motor skills required in the phase of conducting an experiment (Schreiber et al., 2009). With Cronbach's α values ≥ .8 in pre- and posttest, the internal consistency for both subscales is satisfactory.
Self-concepts on planning (three items; self-constructed) and teaching (three items; self-constructed) experiment-based lessons were also queried at both measurement times. A subdivision into the dimensions of planning and teaching is based on the work of Bosch (2006), Merkens (2010), and the standards of educational sciences determined by the Standing Conference of the Ministers of Education and Cultural Affairs in Germany (KMK, 2004). As shown in Table A4, the factor analysis results also support this division into two subscales (two-factor solution that explains 78.59 % of total variance). The reliability of both subscales is good (Cronbach's α ≥ .8) or excellent (α ≥ .9; see Table A1) respectively.
To answer RQ3, the EG was also asked in the posttest for perceived experimental competencies during the intervention. All items of this scale were adapted from Damerau (2012), who operationalized the sub-competencies 'planning', 'conducting' and 'interpreting' experiments in the model of experimental competence by Schreiber et al. (2009). As with the construct of experiment-related self-concept, a two-factor solution was chosen based on the factor analytical results (see Table A5), resulting in the two subscales perceived competence in planning & interpreting experiments (six items) and perceived competence in conducting experiments (three items). Three items had to be removed because the factor loading was too low (λ ≤ .50). The reliability of the first subscale is good (α = .816), that of the second is acceptable (α = .707; George & Mallery, 2003).

Data Analysis
All data were analyzed using SPSS Statistics (version 27). Since the survey tool was set to remind participants to submit an answer for each item, no missing values had to be replaced in the records. With tight constructs, as is the case here, item discriminability values rit between .3 and .5 are classified as medium, values of rit > .5 as high (Döring & Bortz, 2016). Therefore, items that had a rit < .3 in the pre- or posttest were removed before further analysis steps were carried out. Internal consistency was measured with Cronbach's α (Döring & Bortz, 2016). In all subscales, Cronbach's α-value is satisfactory with α ≥ .7 (see above; George & Mallery, 2003). Since all total scores (normed to a maximum of 5) show a Gaussian distribution, parametric methods could be used for the cross-sectional and longitudinal comparisons described below. Pretest mean values of the scales examined were first considered to be able to exclude floor and ceiling effects (Döring & Bortz, 2016).
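For readers who want to retrace the two item statistics named above, a minimal sketch is given here. It assumes that rit denotes the corrected item-total correlation (the item correlated with the sum of the remaining items), which the text does not state explicitly. The function names and the Likert responses are hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of item columns (one list per item)."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(items, index):
    """Correlation of one item with the sum of all other items of the scale."""
    rest = [sum(v for j, v in enumerate(row) if j != index) for row in zip(*items)]
    return pearson_r(items[index], rest)


# Hypothetical five-point Likert responses of six respondents to three items.
example = [
    [4, 3, 5, 2, 4, 3],
    [4, 2, 5, 2, 3, 3],
    [5, 3, 4, 1, 4, 2],
]
alpha = cronbach_alpha(example)
r_it = [corrected_item_total(example, i) for i in range(len(example))]
keep = [i for i, r in enumerate(r_it) if r >= 0.3]  # items with r_it < .3 are removed
print(round(alpha, 3), [round(r, 3) for r in r_it], keep)
```

The threshold of .3 mirrors the exclusion rule in the text; everything else in the example is illustrative.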
Based on Damerau (2012), the following evaluation procedure was carried out for longitudinal comparisons between two groups: First, independent samples t-tests (Janssen & Laatz, 2017) were used to check the data of the two groups for pretest differences. In case of no significant pretest differences, two-way (RQ2.1 and 2.3) or three-way (RQ2.2) repeated measures analyses of variance (ANOVAs; Rasch et al., 2014b) were conducted to detect a significant interaction group/treatment*reference time (factor 1 (within-subject): reference time; factor 2: between-subject factor defining the main comparison group/treatment; factor 3 (RQ2.2): further between-subject factor the impact of which should be factored out). The influence of the latter factor ('type of participation' (voluntary/mandatory) or 'major' (SciMaj/non-SciMaj)) was factored out to be able to trace back possible changes in self-concepts to the impact of the course format. Those between-subject factors are nominally scaled and could therefore not be used as covariates, since metrically scaled covariates are required for ANCOVAs (Backhaus et al., 2018). In addition, it was impossible to include both factored-out factors simultaneously, as students without a science major were not obliged to participate, resulting in a subgroup sample size of Nnon-SciMaj + mandatory participation = 0. The interaction effects, which represent the combined effects of factors on the dependent variable (Rasch et al., 2014b), show whether the self-concepts of the observed groups develop significantly differently over time.
In the case of pretest differences, analyses of covariance (ANCOVAs; Backhaus et al., 2018) with the posttest sum scores as the dependent variable, the pretest sum scores as the covariate, and the independent variable as fixed (between-subject) factor were conducted (one fixed factor to answer RQ2.1 (treatment) and 2.3 (major); two fixed factors (course format and factored-out variable: 'type of participation' or 'major', RQ2.2)). By integrating the covariate and the second fixed factor, their impact on the dependent variable was factored out. The so-called main effect of the first fixed factor (RQ2.1: treatment, RQ2.2: course format, RQ2.3: major) was examined to detect differences in the development of the self-concepts between the groups.
Paired samples t-tests were then used to determine whether pre- and posttest means differ significantly within a group (Janssen & Laatz, 2017). To be able to trace back possible changes in self-concepts to the impact of the course format (RQ2.2), the influence of the variables 'type of participation' and 'major' was factored out as follows: Instead of paired samples t-tests, two-way repeated measures ANOVAs with 'time' as within-subject factor and 'type of participation' or 'major' as between-subject factor were employed. The main effect was examined to assess the impact of the factor 'time' on the dependent variable (Rasch et al., 2014b).
Before carrying out the independent samples t-tests and ANCOVAs, Levene's tests were performed to check the equality of variances of the two populations. In case of inequality of variances, Welch's tests were conducted (Janssen & Laatz, 2017). Partial eta squared (ηp²) is given as the measure of effect size for paired samples t-tests and the analyses of variance (Rasch et al., 2014a, b). Values of .01 ≤ ηp² < .06 can be interpreted as small effects, .06 ≤ ηp² < .14 as medium, and ηp² ≥ .14 as large effects (Cohen, 1988). For independent samples t-tests, omega squared (ω²) was calculated to assess the effect size (.01 ≤ ω² < .06: small effect; .06 ≤ ω² < .15: medium effect; ω² ≥ .15: large effect; Albert & Koster, 2002). Effect sizes are only reported in the text if the results are significant.
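The text does not state which formulas were used for the effect sizes. As a hedged sketch, the common estimator ω² = (t² − 1) / (t² + N − 1) for independent samples t-tests (N = total sample size) reproduces the ω² values reported in the Results, and ηp² = t² / (t² + df) is the usual conversion for a paired samples t-test. The function names and the label 'negligible' for values below .01 are choices made for this illustration.

```python
def omega_squared(t, n_total):
    """Omega squared for an independent-samples t-test.

    Assumed estimator: (t^2 - 1) / (t^2 + N - 1), with N = n1 + n2.
    """
    return (t ** 2 - 1) / (t ** 2 + n_total - 1)


def partial_eta_squared(t, df):
    """Partial eta squared for a paired-samples t-test, via F = t^2."""
    return t ** 2 / (t ** 2 + df)


def label_effect(value, medium, large, small=0.01):
    """Map an effect size onto the verbal labels used in the text."""
    if value >= large:
        return "large"
    if value >= medium:
        return "medium"
    if value >= small:
        return "small"
    return "negligible"


# t values for the SciMaj vs. non-SciMaj comparison as reported in the
# Results (df = 200, i.e. N = 202).
reported = {"secondary school": 2.594, "university": 8.799}
for period, t in reported.items():
    w2 = omega_squared(t, n_total=202)
    print(period, round(w2, 3), label_effect(w2, medium=0.06, large=0.15))
```

Run as written, this prints .028 (small) for secondary school and .274 (large) for university, matching the reported values. For ηp², the thresholds would be passed as `medium=0.06, large=0.14`.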
To answer RQ3, self-concept changes were first calculated by determining the differences between pre- and posttest for each participant. Pearson r correlation was then used to examine the strength and direction of the relationship (Janssen & Laatz, 2017) between perceived competence in planning, conducting, and interpreting experiments during the intervention (posttest) and change in self-concepts from pre- to posttest. A Pearson correlation coefficient of |r| ≥ .1 is regarded as small/weak, of |r| ≥ .3 as medium/moderate, and of |r| ≥ .5 as large/strong correlation (Cohen, 1988). If 0 < r ≤ 1, the linear relationship is positive, if -1 ≤ r < 0, the linear relationship is negative (Kuckartz et al., 2013).
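The two steps for RQ3, computing change scores and correlating them with perceived competence, can be sketched as follows. The function names and all data are hypothetical and serve only to illustrate the procedure; the change score is taken as posttest minus pretest, and the label "negligible" for |r| < .1 is an addition.

```python
from math import sqrt


def change_scores(pre, post):
    """Per-participant change in self-concept (posttest minus pretest)."""
    return [b - a for a, b in zip(pre, post)]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_label(r):
    """Classify |r| using the thresholds quoted in the text (Cohen, 1988)."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # label not used in the text; added here


# Hypothetical data for six participants (illustration only).
pre = [3.0, 3.5, 2.5, 4.0, 3.0, 3.5]
post = [4.0, 4.0, 3.5, 4.5, 3.5, 4.5]
perceived_competence = [4.5, 3.5, 5.0, 3.0, 4.0, 4.5]

r = pearson_r(perceived_competence, change_scores(pre, post))
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.3f} ({correlation_label(r)}, {direction})")
```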

RESULTS
For the total sample (N = 202), the pre-service primary school teachers reported little to moderate experimental experience in their past (RQ1.1). The highest, but still only moderate, level of experimental experience was reported for the period of secondary school (see Table 1). Less experience was gained in primary school and while studying at university (see Table 1). While the SciMaj pre-service teachers' experimental experience in primary school does not differ from that of the non-SciMaj group (t(170.272) = .688, p = .492), differences can be observed for the secondary school and university periods. The SciMaj group gained significantly more experimental experience than the non-SciMaj group both in secondary school (t(200) = 2.594, p = .010, ω² = .028) and at university (t(200) = 8.799, p ≤ .001, ω² = .274; see Table 1).
Moderate correlations (RQ1.2) were found between experimental experiences in primary school [...] (Table 2). Moderate correlations were also found between the experimental experiences in primary school (r(200) [...]
The pretest values of both the self-concept on planning & interpreting experiments and the self-concept on planning experiment-based lessons are moderate (see Table 3). The self-concepts on conducting experiments and on teaching experiment-based lessons are only slightly higher. Therefore, floor and ceiling effects (Döring & Bortz, 2016) can be largely neglected. The independent samples t-test revealed no differences between the EG and BG at pretest in any scale except for the self-concept on planning experiment-based lessons (t(200) = -5.07, p ≤ .001, ω² = .109).
Significance levels used in the tables: p ≤ .05 significant (*), p ≤ .01 very significant (**), p ≤ .001 highly significant (***) (Bühl, 2019).
When comparing the effect of the intervention on EG and BG (RQ2.1) [...] increase significantly from pre- to posttest in the EG. Except for a slight increase in the self-concept on planning experiment-based lessons (t = -2.791, p = .008, ηp² = .153, n = 44), there are no significant pretest-posttest changes in the BG (see Table 3).
To analyze the effect of the course format on the EG's self-concepts (RQ2.2), the effects of the type of participation (voluntary/mandatory) and major (SciMaj/non-SciMaj) were factored out as described above. Since there were no pretest differences in any construct, three-way repeated measures ANOVAs were conducted (factor 1: time; factor 2: course format; factor 3: the factored-out variable, that is, the type of participation or the major). As a result, two statistical values have to be reported per scale and group. For readability, these are reported only in Table 4.
The academic self-concept increased significantly in each subscale, regardless of the course format and the type of factored-out variable (Table 4). There are no significant interactions between course format and time in the self-concepts on conducting experiments and teaching experiment-based lessons. A slightly but significantly higher increase in the self-concepts on planning & interpreting experiments and planning experiment-based lessons is observed in the block group, albeit with small effect sizes (see Table 4). There are no significant interactions between time, course format, and type of participation (factored-out variable 1) or between time, course format, and major (factored-out variable 2). Therefore, no relevant distortion of the results by the type of participation or the major is to be expected.
Despite the non-significant interactions between time, course format, and the factored-out variable 'major' (see results for RQ2.2), two-way repeated measures ANOVAs (in case of pretest equivalence) and ANCOVAs with the pretest values as covariate and the posttest values as the dependent variable (in case of missing pretest equivalence) were conducted to test the impact of the major separately for each course format (RQ2.3; see Table 5). The effect of the type of participation (voluntary/mandatory) could not be factored out, as, due to examination regulations, there are no cases of non-SciMaj participants attending the intervention mandatorily. Particularly striking are the pretest differences between the two subgroups.
Significantly higher pretest values in the SciMaj group compared to the non-SciMaj group were found in the self-concept on planning & interpreting experiments (weekly course format: t(86) = 3.000, p = .004, ω² = .083), as well as in the self-concepts on conducting experiments (weekly: t(86) = 3.645, p ≤ .001, ω² = .123; block: t(33.699) = 2.367, p = .024, ω² = .064), on planning experiment-based lessons (weekly: t(86) = 4.650, p ≤ .001, ω² = .190), and on teaching experiment-based lessons (weekly: t(37.982) = 5.380, p ≤ .001, ω² = .241; block: t(68) = 3.350, p ≤ .001, ω² = .132). The SciMaj and the non-SciMaj group showed a significant increase in all self-concepts regardless of the course format. Only for planning experiment-based lessons did the changes in self-concept between pretest and posttest differ between the two groups. The two-way repeated measures ANOVAs showed a significant interaction between time and major for the students attending the block format (F(1,68) = 7.242, p = .009, ηp² = .096). Nevertheless, both the SciMaj and the non-SciMaj group benefited from the course.
To investigate the relationship between the changes in self-concepts (posttest minus pretest) and perceived experimental competencies related to the course (RQ3), bivariate correlations including all participants of the EG were calculated (Table 6). The more the participants experienced themselves as competent in planning & interpreting experiments, the more their self-concept on planning & interpreting experiments increased (r(156) = .228; p = .004). Another small but significant correlation was found between the perceived competence in conducting experiments and the increase in self-concept on conducting experiments (r(156) = .215; p = .007). Furthermore, a small but significant negative correlation between the perceived competence in planning & interpreting experiments and the self-concept on teaching experiment-based lessons can be observed (r(156) = -.158; p = .047).

Table 4. Mean values (M) and standard deviations (SD) of experiment-related self-concepts and self-concepts on planning and teaching experiment-based lessons at both measurement times in comparison of the group of weekly (N = 88) and block format (N = 70) participants. p-values and effect sizes ηp² given for inner- and intergroup comparison. Round brackets = ANOVA with further between-subject factor 'type of participation'; square brackets = ANOVA with further between-subject factor 'major'.

DISCUSSION

Experimental Experiences and Correlations with Academic Self-Concepts
This study contributes to filling the research gap regarding the factors that influence primary teachers' experiment-related and science teaching-related self-concepts.
Similar to previous reports on experimentation in school (Seidel et al., 2007), the pretest revealed that the participants gained relatively little experimental experience in primary school and moderate experience in secondary school. There have also been few opportunities for experimentation whilst studying at university, especially for pre-service teachers without a science major. Given the following results, this fact is worrying: While no significant correlations were found between experimental experience in primary and secondary school and the self-concept on planning experiment-based lessons, there was a highly significant correlation between this self-concept and experimental experience while studying at university. These findings emphasize the relevance of experimental experience at university for developing professional competence, as mentioned in the literature (Capps et al., 2012). The other moderate correlations between experimental experiences at school and university and experiment-related as well as teaching-related self-concepts indicate that experiences with experimentation dating far back in the past can influence these self-concepts, too (Bong & Skaalvik, 2003; Dickhäuser, 2006; Shavelson et al., 1976).

Table 5. Mean values (M) and standard deviations (SD) of experiment-related self-concepts and self-concepts on planning and teaching experiment-based lessons at both measurement times in comparison of weekly participants with a science major (SciMaj, N = 60) and with other majors (non-SciMaj, N = 28) and participants of the block format (SciMaj, N = 25 [...]
[Table body not recoverable; surviving row fragment: non-SciMaj pre 3.07 1.14 .000*** .684 post 4.59 .64]
As suggested by Franken et al. (2020), experiences in the dimensions of planning, conducting, and interpreting experiments in each of the three educational institutions could be correlated with the self-concept subscales. This needs to be investigated in future studies to gain deeper insight into possible factors influencing these domain-specific self-concepts. As subjective interpretations of previous experiences also impact self-concepts (Wigfield & Eccles, 1992), it might be necessary to record not only the level of experimental experience but also how it was individually perceived.

The Dimensionality of the Experiment-Related Self-Concept
While Damerau (2012) and Franken et al. (2020) distinguish between three subscales (planning, conducting, and interpreting experiments) based on factor analytical findings, participants of this study differentiate between their abilities in planning and interpreting experiments and their abilities in conducting experiments according to our factor analytical results. As with Rautenstrauch & Busker (2020) and Atzert et al. (under review), the first subscale includes phases of theoretical considerations in experimentation, whilst the second comprises the phase of hands-on activities (Schreiber et al., 2009). For the context of school, it was shown that pupils differentiate between highly interrelated sub-competencies even if no corresponding grading occurs (e.g. Arens & Jansen, 2016). In the case of this study, the relatively limited experimental experience (see results on RQ1.1) and a lack of opportunities to independently plan and interpret experiments at school and university (Seidel et al., 2007; Schulz, Wirtz, & Starauschek, 2012; Tesch & Duit, 2004) could have made it difficult for the pre-service teachers to differentiate between their abilities in planning and interpreting experiments. Subsequent studies should take a closer look at the structural stability (Möller & Trautwein, 2015) of experiment-related self-concept subscales. It needs to be investigated whether participating in several experiment-based interventions leads to a differentiation between the perception of abilities in planning, conducting, and interpreting experiments.
Looking at the level of the experiment-related self-concept, values in the subscale 'conducting' are higher than those in the subscale 'planning & interpreting' at both measurement times and in all (sub-)groups. This finding is consistent with the results of Buse (2017), Damerau (2012), and Rautenstrauch & Busker (2020) for secondary school students and master's pre-service biology, chemistry, and science and social studies teachers (Franken et al., 2020). Due to the previously discussed difficulties and deficits concerning planning and interpreting experiments of school students (Germann & Aram, 1996; Hofstein et al., 2005) and pre-/in-service teachers (Emereole, 2009; Fleischer et al., 2020; Hilfert-Rüppell et al., 2013), such a result was to be expected. Similarly, Schulz et al. (2012) postulate that both school and university students are most experienced in conducting experiments, while they lack experience in the phases of theoretical considerations. These differences in the perception of experiment-related abilities emphasize the relevance of courses like the one presented here, in which planning and interpreting experiments can also be practiced.

Effects of the Intervention and the Role of Perceived Experimental Competences
The present study revealed a positive impact of the intervention on pre-service primary teachers' experiment-related self-concepts and self-concepts on planning and teaching experiment-based lessons (RQ2.1). Regarding experiment-related self-concepts, the results of Peschel (2018), Buse et al. (2018), and Damerau (2012), which show that experiment-based educational settings can lead to a significant increase in university and school students' self-concepts on planning, conducting, and interpreting experiments, are confirmed. The finding that the BG's self-concepts do not change significantly over time is consistent with the results on RQ1.1: There seem to be only a few possibilities at university to gain experience in experimentation and/or teaching experiment-based lessons. Only the BG's self-concept on planning experiment-based lessons increased significantly from pre- to posttest. Those experiences in planning experiment-based lessons could have been gained in advanced courses (bachelor program) or during the practical semester.

Table 6. Correlations between perceived experimental competences (posttest) and change of self-concepts (pre-post difference): Pearson correlation coefficient r (each in the top line) and p-value (each in the lower line) (N = 158)
[Table body not recoverable; surviving column headings: Self-concept on conducting experiments; Self-concept on planning experiment-based lessons]
Several possible factors could have led to a significant increase in self-concepts: The EG's responses to the open-ended questions presented in Beudels et al. (under review) point to positive experiences regarding experimentation and planning experiment-based lessons. The course activities offer many opportunities to test one's own abilities, make mistakes, and reflect on them (see Educational Concept). Regarding RQ3, partial results are consistent with the thesis that self-concept is influenced by competence experiences (Bong & Skaalvik, 2003; Dickhäuser, 2006; Shavelson et al., 1976). The more the participants experienced themselves as competent in planning, conducting, and interpreting experiments during the intervention, the more their experiment-related self-concepts increased. The small but significant negative correlation between the perceived competence in planning & interpreting experiments and the self-concept on teaching experiment-based lessons indicates that other factors also played a role in the positive development of the teaching-related self-concepts. The EG also lists individualized, positive feedback on performance and a supportive learning environment as reasons for recommending the course (Beudels et al., under review). These two factors have already been mentioned in the literature as determinants influencing the self-concept (Bong & Skaalvik, 2003; Lüdtke et al., 2005; Möller & Trautwein, 2015).
As shown by Paulick et al. (2017) and Sorge et al. (2019), pre-service teachers' self-concepts are also formed with internal and external frames of reference (Marsh et al., 1988). Participants may have compared their performance at the end of the course to their previous performances in school or at the beginning of the intervention (temporal comparisons; Marsh et al., 2015). In some open answers (see Beudels et al., under review), students stated that planning an experiment for a science lesson was easier for them the second time than the first time. Dimensional comparisons (Wolff et al., 2018), e.g. comparing one's abilities in the competence areas of experimentation, may also have played a role. Successfully handling criterial requirements (criterial reference norms; Möller & Trautwein, 2015) could have led to the self-concepts being assessed more positively in the posttest. According to the results of Atzert et al. (2020), criterial reference norms, in particular, positively influence pupils' experiment-related self-concept. In our study, such criterial reference norms could have been the learning objectives, the verbal and written tasks for experimentation at the stations and for planning experiment-based lessons, and the competence expectations for primary school teachers (KMK, 2008/2019) that were discussed. An educational setting with a very heterogeneous learning group, where partner work predominates and other teams can easily be observed in their actions, has the potential to provoke external, social comparisons (Marsh et al., 1988; Möller & Trautwein, 2015). The following elements were used to avoid adverse effects of social comparisons: Students did not receive grades for their planning products during the course; written feedback was given based on criteria; and written feedback was anonymized so that the participants did not know who was behind which planning product.

Impact of the Course Format
There is a lack of research on the effect of the course format on the development of teachers' self-concepts (RQ2.2). Controlling for the variables 'type of participation' and 'major', it was shown that the examined self-concepts increased significantly and with large effect sizes in both the weekly and the block format. However, block course participants seem to have a slight advantage in terms of the reinforcement of self-concepts on planning & interpreting experiments and planning experiment-based lessons. These findings support the results of Schaal & Randler (2004) and Hilkenmeier & Sommer (2014), which indicate that pre-service teachers' self-concepts develop more positively through block scheduling than through traditional scheduling, as the students showed higher scores for perceived competence, a factor influencing the self-concept (Bong & Skaalvik, 2003; Dickhäuser, 2006; Shavelson et al., 1976). According to Dixon & O'Gorman (2019), block formats give students a faster sense of achievement as tasks are completed in a short time. Without being interrupted by other courses and thus having a continuous learning experience (Daniel, 2000), they can get an overview of (the development of) their skills and performance. Such 'concentrated' competence experiences and feedback combined with a positive learning environment, which is found more often with block formats than with weekly formats (Samarawickrema & Cleary, 2021), could lead to a significantly higher increase in self-concepts. Weekly participants may also compare their achievements in the course with those in other courses they attend in parallel during the semester. This dimensional frame of reference (Wolff et al., 2018) does not apply to students of the block course. All these factors could have led to the block course participants overestimating their abilities. However, the small effect sizes indicate that the course format did not play a major role in changing self-concepts in this study.

Impact of the Major Field of Study
Due to the facts mentioned in the introduction, one main objective of the intervention was to positively develop the self-concepts of those pre-service teachers without a science major. With one exception (block format, self-concept on planning experiment-based lessons, medium effect), self-concepts of pre-service teachers with and without a science major developed equally positively. Despite the finding that SciMaj students' self-concepts were almost consistently significantly higher in the pretest than those of the non-SciMajs, both groups seem to have encountered factors that led to a positive development of the self-concepts (see discussion on RQ2.1). Concerning the social reference norm (Marsh et al., 1988), non-SciMaj participants could have either benefited from the attempt not to focus on social comparison processes or, using social comparisons, realized that their skills in terms of experimentation and planning experiment-based lessons are equal to those of SciMajs. At the same time, all participants may have perceived gains in skills and performance through temporal comparisons (Möller & Trautwein, 2015). To substantiate these assumptions empirically, the use of a questionnaire similar to that of Atzert et al. (2020), which captures students' self-concepts as influenced by different reference norms, is recommended. The findings on RQ2.3 are also important for designing university curricula insofar as the course concept seems to be suitable for a very heterogeneous audience, which can be an advantage given the limited capacity of lecturers.

Study Limitations and Implications for Future Studies
Regarding the sample of this study, a positive selection can be assumed, as many of the pre-service teachers attended the course voluntarily (see Curricular Framework). The same applies to the choice of the course format: Students could choose the format they preferred. Thus, a participant who thinks that the block format better suits her/his learning style will likely choose this format and vice versa (Burton & Nesbit, 2008). Daniel (2000) suggested that this situation could be countered with a random assignment of the pre-service teachers to the (sub-)groups. However, a suitable curricular framework is necessary for this. As long as course participation counts towards a study module in some degree programs but not in others, it will be difficult to justify a randomized allocation of participants (Burton & Nesbit, 2008).
The existing study regulations also made it impossible to include the two factored-out variables in the calculations simultaneously. Integrating this in the study design would require students without a science major to participate mandatorily in the intervention. Since self-concept was found to be changeable through interventions (e.g. Damerau, 2012; Peschel, 2018), but is also described as relatively stable (Shavelson et al., 1976), future studies could investigate the long-term effects of the intervention.
Operationalizing the constructs with a larger number of more heterogeneous items would be appropriate to capture the self-concepts in more detail. For example, items for the scale 'self-concept on planning an experiment-based lesson' could be developed based on the six sub-areas of planning primary science and social studies lessons mentioned by Tänzer (2010). However, the questionnaire was also designed to record other course effects (see Beudels et al., under review). Therefore, the number of items per subscale was minimized to maintain test efficiency and for motivational reasons.
In this study, the subjective perception of abilities was assessed. Even though a medium, positive correlation between self-concepts and factual skills and competencies is to be expected (Marsh, 1992; Paulick et al., 2016), the self-concept alone provides limited information about actual competencies (Festner et al., 2018). As mentioned above, it is questionable whether all pre-service teachers can realistically assess their experimental competencies (Schreiber et al., 2016). Similar to Paulick et al. (2016) and Sorge et al. (2019) and as suggested by Festner et al. (2018), in addition to self-concepts, professional knowledge (e.g. syntactic knowledge concerning experimentation (Haslbeck, 2019); pedagogical content knowledge for teaching experiment-based lessons) might need to be investigated to ensure the validity of the scales. It might also be an option to evaluate individual experimentation skills and abilities for planning experimental lessons through a category-based evaluation of video recordings, as in Tesch & Duit (2004), or through the analysis of planning products. However, a more complex study design and assessment strategy might limit the sample size (Schreiber et al., 2009).

CONCLUSIONS
This study contributes to closing the gap regarding research into pre-service primary teachers' experiment-related and science teaching-related self-concepts. The results illustrate the relevance of experimental experiences at university for developing prospective teachers' professional competence. In addition, this study demonstrated that the course format influences the development of self-concepts. The impact of this variable should be considered not only in future studies but also in designing study curricula or training programs for in-service teachers. The intervention has the potential to strengthen the self-concepts of pre-service teachers with very different experimental experiences and motivations equally. The positive impact on the self-concepts of pre-service primary teachers without a science major is especially desirable, given that the academic self-concept is a predictor of teachers' performance in primary science lessons.

Table A1. Operationalization of the constructs in the questionnaire. Scale names including the number of items, translations of the original items, item abbreviations, and internal consistency (Cronbach's α)
It was easy for me to use the devices and materials provided for the experiments.