How teachers' perceptions affect the academic and language assessment of immigrant children


Recent research shows inconsistencies in teachers' practice regarding the skills assessment of L2 students. Evidence suggests that less experienced teachers are less oriented toward multiple-task tests for non-native students. Research question: Do school teachers with different teacher training and unequal teaching experience with non-native students perceive a four-skills scale differently? Purpose of the study: This study analyses the degree of importance attributed to the four skills/tasks (reading, writing, speaking and listening) from the perspective of school teachers. Method: 77 teachers, aged 32-62, with and without experience in teaching and adapting materials for immigrant students, were divided into six groups according to their scientific domain. Assessment tools included a scale for judging four academic tasks, adapted from the original “Inventory of Undergraduate and Graduate Level: Reading, Writing, Speaking and Listening Tasks” (Rosenfeld, Leung & Ottman, 2001). Main findings: 1) teachers attribute different degrees of importance to the tasks that should be included in academic and language tests for immigrant students; 2) teachers' perceptions are determined by predictors in this order: scientific domain and experience with multicultural classes, with lower predictive value for length of teaching service and age; 3) results differ between the American and Portuguese samples answering the same questionnaire.

Keywords: Teachers’ experience, academic language skills, immigrant students, evaluation perceptions


The non-native school population has increased in the classrooms, especially considering

immigration in the North American and European continents (Callahan & Obenchain, 2013; Guo, 2013;

Huddleston, Niessen & Tjaden, 2013; Kolano & King, 2015). However, teachers and education and

psychology professionals in specific European and other contexts are not yet fully aware of the

satisfactory academic profile expected of a native student according to the education level he is in

(D’hondt, Eccles, Houtte et al., 2016; Sudkamp, Kaiser & Moller, 2012; Pérez-Sabater, 2012). Nor do

they have well-grounded knowledge of the (expected) satisfactory profile that teachers who

evaluate foreign students are supposed to have (Kolano & King, 2015; Rogers-Sirin, Ryce & Sirin,

2013; Rosenfeld, Leung & Ottman, 2001). When referring to the "knowledge" of the profiles of native

teachers and of non-native students, we specifically focus on the perceptions and experience of teachers

regarding the competence, the pace and the assessment of immigrant students (Levine-Rasky, 2001;

Tejada, Pino, Tatar et al., 2012), in particular focusing on pupils newly arrived to schools, therefore

with less exposure to the language of the host country.

Past studies indicate that some linguistic minorities can be affected by the expectations they feel on

the part of their teachers and even peers (Creemers, 1994; Derwing, DeCorby, Ichikawa et al., 1999;

Driessen & Withagen, 1999; Schneider & Yongsook, 1990). Recently, other authors (Brok, Tartwijk,

Wubbels et al., 2010; Tatar & Horenczyk, 2003) have been studying this influence on minority groups

in schools and conclude that specific groups of students, according to their nationality, value and need

more solid interpersonal relationships with their teachers, especially the second generation of

immigrants (Brok et al., 2010; Callahan & Obenchain, 2013). Native students do not seem to benefit from, or

depend so much on, the relationships established with the expectations of teachers. On the other hand,

different results (D’hondt, Eccles, Houtte et al., 2016; Schaedel, Freung, Azaiza et al., 2015) suggest

that the perceptions and expectations of teachers can affect only some minority groups because they

rely heavily on other factors, such as parental investment.

When valuing teachers’ support capacity (Tejada, Pino, Tatar et al., 2012), the

assessment relates to the well-established definition of measurements of different skills - reading,

writing, speaking and listening skills – of the immigrant student population and these skills are then

judged according to well-organized tasks that span across multiple cognitive levels estimated within the

context of each of these skills (Bialystok, 2002; Cummins, 1980; Hinkel, 2012). These cognitive levels or

dimensions make it difficult to define tests clearly for school and higher

education teachers (Hazenberg & Hulstijn, 1996; Nation, 2001), since the evaluation of a Second

Language (L2) is a demonstrably complex process, explained by a dynamic set of multiple

variables: the rate of processing, the students’ exposure time to the language, the cultural

profile of the nationality groups (Hinkel, 2012; Jia, Aaronson & Wu, 2002), and the scientific field of

teachers, which informs the orientation of their perceptions.

The scientific field of the teachers has been examined as an important predictor of differentiation of

representations and practice in the classroom regarding the evaluation and teaching of linguistic

minorities (Richards & Rodgers, 2001, cited by Hinkel, 2012). Teachers in scientific areas more

closely related to the natural sciences pay less attention to the comprehensive teaching of the

language, and place more value on learning the syllabus content of their subjects (Hinkel, 2012). The

multiple language skills method has been a reality for teachers in the last two decades with regard to

differentiated immigrant pupils in the classroom (Hinkel, 2012). The teaching of L2, which is different

from Foreign Language (FL) teaching, doubled teachers’ efforts to present it in a more cognitive

perspective and not just foster the development of communication skills (Bialystok, 2002; Hinkel,

2012). According to studies conducted in the 1990s (Lightbown & Spada, 1990; Schmidt, 1993, cited

by Hinkel, 2012) analysing the four academic skills in the context of non-native students, the speaking

(and listening) skill depended on exposure to L2 and valued, therefore, intensive immersion

programmes. However, recent studies (indicated in Hinkel’s meta-analysis, 2012) have shown that this

is one among many variables that explain, for example, access to vocabulary and grammar awareness

(Jia, Aaronson & Wu, 2002).

Knowing what is more or less important, and the teaching methods (Hinkel, 2012) to use to develop

the academic and language skills of immigrant students also depend on the teachers’ experience

regarding two aspects: school teaching experience (1) and teaching experience of non-native students

(2). This study analyses Portuguese teachers and students of Portuguese as a Non-Mother Tongue

(PLNM). Studies in the 1980s identified the main concerns of teachers starting their careers, therefore

less experienced, and the experience of dealing with non-native students was not among those concerns

(Veenman, 1984). On the contrary, recent studies indicate that assessing teachers’ perceptions and

competences involves the challenge of coping with diversity, which may lead to situations of

“diversity-related burnout” (Taylor & Sobel, 2001, cited by Tatar & Horenczyk, 2003). Teachers'

experience of multicultural groups has been studied more recently, highlighting the binomial of the "typical

teacher of non-mother tongue" and the "teacher supporting L2 learning" (Tejada, Pino, Tatar et al.,

2012; Yoon, 2004).

This idea of support includes the expectations perceived by teachers and students (Tatar &

Horenczyk, 2003), which are not always facilitated by the type of social language that teachers use to

identify and differentiate groups of students in the classroom (Devine, 2006; Marshall, 1996). The issue

of teachers' expectations regarding the evaluation and learning of immigrant students arose mainly from the

concern to design valid tests for this population and to capture the perceptions of evaluators (Rosenfeld, Leung

& Ottman, 2001), moving away from the older perspective of studies focused on the school’s

organizational culture (Marshall, 1996). As regards Europe, more scientific analysis of

teachers’ perceptions is required, based on predictive testing, to understand the inconsistencies believed

to occur in practice with non-native students in the classroom.

As regards the instruments to assess foreign population, the literature in the field of preparing

Reference Frameworks and tests to evaluate cognitive and academic performance of immigrant pupils

has shown inconsistencies in teachers’ evaluation and teaching practices regarding the assessment of L2

learners’ performance (Papageorgiou, 2014; Pérez-Sabater, 2012). The Anglo-Saxon model is the one

with more validated instruments in the L2 area, with a background of test review procedures that points

to the constant concern with the type and validity of tasks to be used with linguistic minorities and in

different school levels (Bachman, 2000; Bailey & Huang, 2011). In the European context, the

frameworks of reference are different and there is no consistent literature in the specific case of

Portuguese immigration (Pires, 2009).

This study aims to identify the type and level of importance that teachers of educational levels

ranging from pre-school to high education assign to tasks to specifically assess the academic

competence and performance of non-native students. The ultimate goal is to match the type of tasks to

effectively measured academic performance. The study examines the relevant areas for the creation of

tasks in proficiency and performance evaluation tests as well as replicating the previous study by

Rosenfeld, Leung and Ottman (2001) in the TOEFL review by using a scale of 42 items that identifies

tasks and responsibilities in the areas of reading, writing, speaking and listening skills, at the university

level. In this study, the target population consists of teachers of basic education and high school, and the

aims are to: 1) identify the most revealing tasks for the establishment of items in proficiency and

competency tests; 2) assess the knowledge and representations that these teachers have about evaluation

tests, according to (a) age, (b) length of service, and (c) experience with multicultural classes.



The study involved 77 teachers aged between 32 and 62 years (M=47 years, SD=7.4), of whom 11

(14.3%) were male and 60 (77.9%) were female, with an average of 22 years teaching experience

(SD=6.7). Teachers teach at nine schools/groupings in the district of Lisbon, with 9 being teachers of

Portuguese language (11.7%), 12 of FLs (15.6%), 26 of Pre-school and Basic Education (33.8%), 8 of

Hard Sciences (10.4%), 16 of History/Geography (20.8%) and 3 of Visual Arts (3.9%), distributed by

the various levels of education (excluding higher education). 58 (75.3%) have experience of

multicultural classes and 16 (20.8%) have never had non-native students in their classes. 46 (59.7%)

used Non-Mother Tongue Portuguese Language (PLNM) measures and 19 (24.7%) admitted to never

having used them.

ANOVA tests were carried out to compare results according to the participants' scientific

domain and in relation to several variables: age, grade level, teaching experience, teaching experience

with non-native students and experience with measures for second language learners. The results were:

F(5,66) = 3.518, p = .001 for the age variable; F(5,67) = 16.161, p < .001 for the grade level; F(5,68) =

3.198, p = .012 for the teaching experience. No significant difference was found in the experience with

non-native students (and measures used in their evaluation and learning).


The Inventory of undergraduate and graduate level – reading, writing, speaking and listening tasks

questionnaire by Rosenfeld, Leung and Ottman (2001) was used and adapted to the sample of

Portuguese teachers. This questionnaire was originally developed by four scientific committees

(framework teams) under the TOEFL and the Educational Testing Service in order to measure the

importance, from the viewpoint of American university professors and students in education training

courses, of the reading, writing, oral and listening tasks to be included in a test capable of assessing the

academic competence and proficiency of non-native students. The original test has 42 items, of which

we have adapted 40 distributed by the four scientific areas: reading (10 items), writing (10 items),

speaking (10 items) and listening (10 items). The original test in the English version has no information

about its reliability properties, but the Portuguese version has a high Cronbach's coefficient (.94). The

exploratory factor analysis was then conducted with all items retained, as in the original study,

which did not exclude items from the scale even when they scored below 3.5. We

used the cut-off point established by the authors of the original version: 3.5. A mean score

below 3.5 on an item indicates that teachers do not consider the task important enough to be included

in an assessment test for non-native students.
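
As an illustration of the two item-level checks described above (internal consistency and the 3.5 cut-off), the following is a minimal sketch in Python. It assumes the usual formula for Cronbach's coefficient based on sample variances; the function names and the ratings are hypothetical and are not the study's data or its SPSS procedure.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table of ratings."""
    n_items = len(responses[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def items_below_cutoff(responses, cutoff=3.5):
    """Return the 1-based numbers of items whose mean rating is below the cut-off."""
    n_items = len(responses[0])
    item_means = [mean(row[j] for row in responses) for j in range(n_items)]
    return [j + 1 for j, item_mean in enumerate(item_means) if item_mean < cutoff]


# Hypothetical ratings: 5 respondents x 3 items (illustrative only).
ratings = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 4],
]
alpha = cronbach_alpha(ratings)
flagged = items_below_cutoff(ratings)
```

In this sketch an item is flagged only when its mean is strictly below the cut-off, mirroring the reading that items scoring below 3.5 are judged less important while still remaining in the scale.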

The test was submitted to an exploratory factor analysis (EFA) to assess participants' answers and

the factor structure of the test that was hypothesized as a four factor structure. The items with a factor

loading of .40 or higher were used to define each factor. The Kaiser–Meyer–Olkin test showed that the

sample size was adequate (.70) and the Bartlett test showed there was a good correlation index among

the variables. As such, our acceptability rate allowed us to test our hypothesis (p < .001). By excluding

no items, 11 factors were found and the first factor received the highest loadings of almost all items from

the “reading”, “writing”, “speaking” and “listening” scales. Eleven eigenvalues were higher than 1,

explaining 75% of total variance. This was not expected considering that the EFA was hypothesized as

a four-factor dimension structure. The original study did not produce an EFA. Fourteen items were


The principal component analysis showed that almost all 40 items loaded on the first factor, which was

not expected considering that the first factor should correspond to the first scale (of four scales),

Reading. Items in this factor include items from the four academic skills. Three items (9. “Distinguish

factual information from opinions”; 10. “Compare and contrast ideas in a single text and/or across

texts”; 11. “Synthesize ideas in a single text and/or across texts”) from the Reading scale loaded strongly

only on the second factor.
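
The retention rule referred to in the factor analysis (eigenvalues higher than 1 and the share of total variance they explain) can be sketched as follows. This is only an illustration: the eigenvalues are hypothetical, the function name is an assumption, and the sketch takes the eigenvalues as given rather than extracting them from a correlation matrix.

```python
def kaiser_summary(eigenvalues):
    """Count factors with eigenvalue > 1 and the share of variance they explain."""
    # For a correlation matrix the eigenvalues sum to the number of items.
    total_variance = sum(eigenvalues)
    retained = [value for value in eigenvalues if value > 1.0]
    return len(retained), sum(retained) / total_variance


# Hypothetical eigenvalues for a 5-item correlation matrix (they sum to 5).
example_eigenvalues = [2.5, 1.2, 0.6, 0.4, 0.3]
n_factors, explained = kaiser_summary(example_eigenvalues)
```

With these illustrative values, two factors are retained and together account for 74% of the total variance.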


The data was collected in 2013 and 2015 in basic and high schools in the district of Lisbon.

Contact was established with schools of the Lisbon district network to propose the study and

disseminate the research aims. Communication with schools made it possible to identify a large group of

teachers, of whom 77 fully completed the questionnaire. Following the informed

consent and the demographic record of the selected school population, the questionnaire was answered

and assessed (using points) according to the original test. Teachers responded to the questionnaire’s

forty questions on paper and returned it to the class Board and Department which, in turn, ensured it

was returned to us. The procedure took place in the same way and in different academic periods in all


The socio-demographic information was provided by the schools following informed consent after

the beginning of each school year. The questionnaire was part of the empirical context of using

linguistic and cognitive tests simultaneously with 108 immigrant students from different linguistic

minorities who attended the same schools. Data were analysed using SPSS, version 21.


We carried out two statistical analyses in SPSS: (1) univariate analysis of variance, to identify

significant differences between groups and effect sizes; and (2) regression analysis using the stepwise

method.

3.1. Univariate analysis of variance (ANOVA): effect size and post-hoc analyses

Age groups

No item showed a substantial effect size. Cohen’s benchmarks for the value of η2

were used as the norm (Cohen, 1988).
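
For readers who wish to see how an F ratio and η2 are obtained from group ratings, the following is a minimal sketch of a one-way ANOVA in Python. It is not the SPSS procedure used in the study; the data are hypothetical, and the labels rely on the commonly cited thresholds attributed to Cohen (.01 small, .06 medium, .14 large), which is an assumption about the benchmarks applied here.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (group_mean - grand_mean) ** 2
        for group, group_mean in zip(groups, group_means)
    )
    # Variation of individual ratings around their own group mean.
    ss_within = sum(
        (value - group_mean) ** 2
        for group, group_mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_ratio, df_between, df_within, eta_squared


def cohen_label(eta_squared):
    """Label an eta squared value using commonly cited thresholds (assumed)."""
    if eta_squared >= 0.14:
        return "large"
    if eta_squared >= 0.06:
        return "medium"
    if eta_squared >= 0.01:
        return "small"
    return "negligible"


# Hypothetical ratings for three groups of teachers (illustrative only).
example_groups = [[4, 5, 4, 5], [3, 4, 3, 4], [4, 5, 5, 4]]
f_ratio, df_b, df_w, eta_sq = one_way_anova(example_groups)
```

Here η2 is the between-group sum of squares divided by the total sum of squares, so it can be read as the share of variation in the ratings associated with group membership.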

Teaching Experience

Reading: Regarding the ability “locate by skimming and scanning”, the groups differed

significantly (p=.017; η2=.113). The group with the highest mean in this task was the one with most

teaching experience (up to 35 years - M=4.36) and there was a statistically significant difference

between this group and the group with average experience (up to 25 years of teaching experience -

M=3.68). The group with less experience had a mean close to that of the most experienced group.

As for the item "Underline the important ideas in the text" the groups also behaved significantly

differently (p=.020; η2=.108) with regard to the value they placed on the item. Similarly, as in the

previous task, more experienced teachers had the highest mean (M=4.73) compared to the group who

had up to 25 years of experience (M=4.19).

A post-hoc (Tukey) test revealed significant differences among the groups of teachers with different

experience levels (p<.05), with regard to differences in the specific Reading items: skimming and

scanning strategy (F(2, 68)=.725; p=.017), and outlining important ideas strategy

(F(2, 68)=3.164; p=.049). The differences were between teachers with more experience (up to 35 years

of teaching experience) and teachers with up to 25 years of experience. Again, there was no statistically

significant difference between these groups and teachers with less teaching experience.

Writing: Regarding the ability “time constraints”, the groups differed significantly (p=.010;

η2=.128). The group with the highest mean in this task was the group who had more teaching experience

(up to 35 years - M=4.09) and there was a statistically significant difference between this group and the

group with average experience (up to 25 years of teaching experience - M=3.51). The group with less

experience stood close to the mean of the most experienced group, just like in the Reading items,

which showed differences in the perception behaviour of the groups.

A post-hoc (Tukey) test revealed significant differences among the groups of teachers with different

experience levels (p<.05), considering differences only for the time constraints writing item

(F(2, 68)=4.396; p=.022). The differences were between more experienced teachers (up to 35 years of

teaching) and teachers with 25 years of experience. The groups’ mean responses have been presented

above. Again, the two groups had the same perception of items, just as in the Reading skill.

Listening: Only in the "recognition of the communicative intention" task, the groups differed

significantly (considering the results for the p value and effect size, η2) in the results obtained in the

“teaching experience” variable ( p =.036; η2 =.093). Again, and supported by the Tukey tests (post-hoc

analyses), significant differences ( p <.05) only occurred among groups with up to 25 (3.78) and 35

years (4.32) teaching experience. The younger group’s mean remained high and close to that of the

older group, but there are no significant differences in the comparison between groups for this variable

as well (see table 1).

Teaching Experience with foreign students

There was a significant effect size (η2 ranging between .060 and .120), not in all the items but only in

some items in the reading and speaking skills. The groups differ depending on whether they have/have

had PLNM classes or not.

Reading: Regarding the “outline important ideas’’ ability, the groups differed significantly ( p =.044;

η2 =.060). A post-hoc (Tukey) test confirmed significant differences among the groups of teachers

with different experience levels with foreign students (p<.05), regarding differences in the outlining

important ideas strategy (F(1, 66)=4.224; p=.044). The group with the highest mean in this task was the

group that never had PLNM classes (M=4.73) and the group with experience of non-native students is

the one that values the said item (M=4.30) the least. There were also differences between groups in this

same item/task regarding the "teaching experience" variable, which suggests highly consistent results.

Speaking: Regarding the items "give instructions clearly" and "arguing", the groups also showed

significantly different behaviour (p=.014; η2=.120, p=.026; η2=.072, respectively per item). Similarly,

as in the previous task, teachers with experience of foreign students (migrants and L2 context) have

higher mean (M=4.47 and 4.27, respectively) compared to the group without experience (M=3.75 and

3.79, respectively).

A post-hoc (Tukey) test confirmed the significant differences among the groups of teachers with

different experience levels with non-native students in classroom (p<.05), regarding the differences for

the only two Speaking items observed: giving directions and instructions (F(1, 66)=1.937; p=.014) and

structuring hypotheses (F(1, 66)=1.365; p=.026, see table 1).

Still in the context of experience with non-native students’ classes, in a context of learning

Portuguese as L2, the same univariate analyses were conducted (effect size, comparison and means and

Tukey test) to check the variable related to the experience factor: use or not of PLNM measures in the

classroom (considering the group that has/had multicultural classes).

Differences were noted only regarding the task "reading based on synthesis of ideas", where the η2

value was substantial (p=.010; η2=.106). Teachers who admitted not applying measures in their

classes present the lower mean (M=3.22), below the cut-off point (M=3.5), unlike the teachers who

state having used measures to support PLNM (M=3.86). All data, checked with a statistically

significant difference, on the group behaviour analysis for each item and factor are presented in table 1.

Figure 1: Comparison among groups (means, Pearson and effect sizes): teachers' perceptions according to teaching service, experience with multicultural classes and L2 measures application.

3.2. Linear Multiple Regression Analysis

Given the previous results, and having noted in particular how the groups of subjects

defined by the different independent variables behaved in valuing the tasks of the four

skill areas, we used linear regression analysis with the stepwise method to ascertain the main predictors

among the independent variables under study, and to examine how the model

establishes the importance of specific tasks to be given to non-native students by teachers in

Portuguese schools (Lisbon district). Only the tasks that showed significant differences between the groups

and significant effect sizes were considered, according to the results of the previous tests.
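
The logic of selecting predictors step by step can be sketched as follows. This is a simplified forward-selection variant, not the SPSS stepwise procedure used in the study: it has no removal step, and it replaces probability-based entry criteria with a fixed partial-F threshold (`f_to_enter=4.0`, an assumption chosen only for illustration). The predictor names and data are hypothetical.

```python
def residual_sum_of_squares(columns, y):
    """RSS of a least squares fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] * n] + [[float(v) for v in col] for col in columns]
    k = len(design)
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    aug = [
        [sum(a * b for a, b in zip(design[i], design[j])) for j in range(k)]
        + [sum(a * b for a, b in zip(design[i], y))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for r in range(i + 1, k):
            factor = aug[r][i] / aug[i][i]
            for c in range(i, k + 1):
                aug[r][c] -= factor * aug[i][c]
    # Back substitution for the coefficients.
    coefs = [0.0] * k
    for i in reversed(range(k)):
        known = sum(aug[i][j] * coefs[j] for j in range(i + 1, k))
        coefs[i] = (aug[i][k] - known) / aug[i][i]
    fitted = [sum(coefs[j] * design[j][t] for j in range(k)) for t in range(n)]
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))


def forward_selection(predictors, y, f_to_enter=4.0):
    """Add predictors while the best candidate's partial F reaches f_to_enter."""
    selected = []
    rss_current = residual_sum_of_squares([], y)
    while len(selected) < len(predictors):
        candidates = []
        for name in predictors:
            if name in selected:
                continue
            cols = [predictors[s] for s in selected + [name]]
            rss_new = residual_sum_of_squares(cols, y)
            df_resid = len(y) - len(cols) - 1  # assumes more cases than predictors
            partial_f = (rss_current - rss_new) / (rss_new / df_resid)
            candidates.append((partial_f, name, rss_new))
        best_f, best_name, best_rss = max(candidates)
        if best_f < f_to_enter:
            break
        selected.append(best_name)
        rss_current = best_rss
    return selected


# Hypothetical data: the outcome depends on predictor_a only.
example_predictors = {
    "predictor_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "predictor_b": [2, 1, 2, 1, 2, 1, 2, 1],
}
example_outcome = [2.1, 4.1, 5.9, 7.9, 10.1, 12.1, 13.9, 15.9]
chosen = forward_selection(example_predictors, example_outcome)
```

In this toy example only `predictor_a` enters the model, which parallels the pattern reported below, where a single variable often emerges as the only predictor of a task's importance.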


For the task “compare and contrast ideas in a single text and/or across texts” regression results also

showed that the teachers’ scientific domain variable has predictive value ( b =-.333, p =.009) but also the

“experience with measures applied to foreign students” ( b =-.249, p =.041), as opposed to the other 3

factors – teaching experience, age, experience with foreign students in classes - where no significant

predictive power was shown. The importance of that specific reading item is affected by the

perceptions of the different teachers (by scientific domain). In order to clarify this result, and having

examined the overall result (through a previous frequency analysis) of the answers to all the items of the

scale (14 factors), it was found that Portuguese language and FLs teachers are the ones who had more

positive perceptions regarding all the tasks listed in the questionnaire. Teachers of other scientific

fields value different items and attach less importance to items. The teaching areas are important

factors in predicting tasks and enforcing them among linguistic minorities in schools. As for the other

variable that the model shows as being the second and final model predictor, it appears that the

negative experience of teachers (absence of experience) predicts lower application of measures for

students inside the classroom. There were 2 reading tasks (theme identification and facts/opinions

distinction) that regression analysis revealed as having no significant predictive value for any

independent variables. The results are summarized in Table 2.


Regarding both tasks “awareness of audience needs and write to a particular audience or reader” (1)

and “time constraints” (2), results showed that only the teachers’ scientific domain variable has

predictive value (task 1: b=-.261, p=.044; task 2: b=-.613, p<.001) as opposed to the other 4 factors –

teaching experience, age, experience with foreign students in classes, measures applied to foreign

students in classroom - where no significant predictive power was found. Importance for that specific

writing item is affected by the scientific domain of teachers, meaning that there are perceptions of

teachers, according to the teaching area, that produce differences on the importance attributed to that

task, for foreign/immigrant students. The results are summarized in Table 2.


For the tasks “questioning teacher” (1), “participation toward other students” (2) and “presentation

toward other students” (3), results showed that only the teachers’ scientific domain variable has

maintained predictive value (task 1: b=-.421, p=.001; task 2: b=-.335, p=.009; task 3: b=-.365, p=.004).

For the task “giving instructions/directions” (4) results showed that experience with multicultural

classes is the only predictor ( b =.270, p =.037). For the task “structuring hypotheses” (5), results showed

that the teachers’ scientific domain variable has maintained its predictive value (b=-.389, p=.002), but

another variable also emerged from the model as a predictor: age (b=-.287, p=.017). In the

ANOVAs the age variable did not display significant differences among the groups for any items.


For the task “recognize the speaker’s attitudinal signals”, results showed that the experience with

measures for foreign students is the only predictor (b=-.272, p=.034). The importance of that item is

affected by teachers’ experience in dealing with pedagogical measures (considering the descriptive

statistics of ANOVA, we determined that the teachers with experience of pedagogical measures

attribute greater importance to this task than the group with no experience or knowledge of those

measures). The results are summarized in Table 2.

Figure 2: Table 2. Linear regression analysis of tasks relevance (*dependent variables appeared in the prediction model).


The results respond to the questions of this study. With regard to the first question, it turns out that

Portuguese teachers differentiate tasks as being more or less relevant throughout the four specific

academic areas in the creation of items in proficiency and competence tests of learners of Portuguese as

L2. There are differences in assigning importance to items within each academic area (reading, writing,

speaking and listening comprehension) but there are also differences in the groups regarding perception

of specific items, as the previous results show. It was found that the less valued items lie in the writing

area (item associated with the ability to write according to the type of audience) and in speaking (item

related to the ability of structuring hypotheses during speech - arguments). These data support previous

results (Gebhardt, Chen, Graham et al., 2013) and show that one of the greatest writing difficulties of

L2 learners is to produce texts related to the specific contents of the subjects.

This difficulty is related to the teachers’ instruction method, which focuses on grammar and

underestimates the sociolinguistic sense, in this case the skill to write according to a specific audience.

If the difficulties lie in this aspect of writing (audience awareness), then the explanation is the type of

instruction and the teachers who underestimate this item (one of the weakest observed in this study)

indicate that they maintain an inadequate teaching method. According to Gebhardt et al. (2013), the way

teachers perceive the teaching of language rules creates problems for the proper teaching of what the

writing skill requires, such as knowing how to write appropriately for different audiences.

It should be noted that in a previous study (Shanahan, 1992), it has been highlighted that writing

depends, for its proper development, on a great awareness of the audience for which one is writing.

This study stresses a problem related to the perception of teachers that directly affects the teaching

done. This study demonstrates that teachers effectively value this type of skill less (writing skills

according to the audience) and for this reason it appears as one of the items with a lower mean. It

seems that Foreign Language teachers value this kind of task less than Portuguese language

teachers, and that the same holds for the value they place on grammar rules.

FLs teachers are the ones who undervalue these two aspects of writing the most, when they should

be the most sensitive to the grammar question and sociolinguistics of the audience. However, these

data are in line with a meta-analysis study conducted by Graham and Perin (2007) who identified the

priorities of teachers in the writing teaching tasks as being teaching strategies and 'peer assistance' to

the detriment of grammar, pre-writing activities and processing aspects in texts written by the students.

According to studies conducted in the eighties (Scarcella, 1984), L2 writers have major problems in

writing for specific audiences, which is related to "attention engaging" (p. 671) during the writing task.

One cause may be related to the subject’s incomplete proficiency in Mother Tongue (Khuwaileh &

Shoumali, 2000) and to difference in cognitive processing (and therefore different strategies) in the

contexts of Mother Tongue and L2 (Silva, 1993).

Over the last decade, L2 studies started to focus more on issues related to the writing skill as

regards planning and processing ability, common to writing and speaking (Khuwaileh & Shoumali, 2000),

therefore more focused on specific language issues and not on aspects of social and professional

communication as in studies in the nineties (Scott, 1996): the principles and priorities of evaluating the

writing of FLs and L2 learners by FLs teachers were not informed by the "language-specific" aspect (p.

1). In this study, these two dimensions common to the two skills are the most undervalued by the

Portuguese teachers, which probably justifies the shift in focus in L2 research. In addition, there is the

analysis of grammatical and textual cohesion mistakes that authors like Khuwaileh and Shoumali found

in texts written by L2 learners.

As regards the relevance of tasks indicating what kind of items an L2 evaluation test must

include, one finds that the strongest items are in reading (understanding teachers’ instructions and

identifying the topic of the text, for example) and listening comprehension (understanding teachers’

instructions, for example). These results are consistent with the conclusions of Brown (2009), who found

that teachers have perceptions of the tasks more focused on understanding the errors (language use)

and on correcting them, therefore on the more "communicative classroom" aspect, which explains the

valuing we found of items that focus on understanding teachers' directions in class. The

"communicative classroom" aspect is productive if expanded in the direction of "communicative

performance" (Bygate, Swain & Skehan, 2013), i.e. a teaching context in the classroom that extends

the transfer of cognitive strategies between skills (DeKeyser, 2007). Understanding (listening)

instructions and reading them (reading comprehension) are valued equivalently in this study because

there is a tested cognitive relationship between these two tasks/skills (Hinkel, 2006). This reflects

Hinkel’s (2006; 2012) multiple language skills approach, which several authors addressed earlier,

although in terms of the common transfer of skills during the learning process in an L2

context (DeKeyser, 2007; Ellis, 2003; Van Patten, 2007).

Cognitive strategies such as organization of speech, fluency, and inferences can be common to

reading and listening comprehension, which promotes more effective learning due to the transfer of

cognitive skills across the four foundational skills than actually focusing on the exclusive use of a top-

down or bottom-up teaching approach (Hinkel, 2006).

The positive correlation between reading and listening comprehension tasks is supported by

studies by Zeeland and Schmitt (2013) and by Harding, Alderson and Brunfaut (2015) who, in the

context of English as L2, identify common cognitive processes of native and non-native students in

tasks to evaluate listening and reading, but with greater significance for L2 learners. Also, Chien, Palau

and Sun (2014) conclude there is a close relationship between cognitive strategies (interdependence) of

writing and reading.

On the other hand, teachers value these skills (reading and listening) as they are skills that involve

vocabulary retention (not exactly production), which is the area where the biggest problems for L2

learners are to be found (August, Carlo, Dressler et al., 2005; Lesaux & Kieffer, 2010) and which

allows students to know the language of the text (note how the item "identifying the topic of the text" is

considered to be relevant) and access the reading scheme (Eskey, 2002). Still, with particular reference

to listening comprehension in L2, the results are consistent with studies by Hasan (2010), Siegel (2013)

and Carrier (2003), who analyse the teachers’ explicit signs/instructions during listening

comprehension exercises as being prominent. The relevance of tasks within each of the four fields

relates primarily to the acquisition of content and language schema of the text, no longer focused (in

the bottom-up method) on isolated aspects of grammar devoid of connection with the explicit

techniques of instruction and authenticity of the input (Bygate & Skehan, 2010; Siegel, 2013; Van

Patten, 2007; Vandergrift, 2003).

Also within question 1 of the study, with regard to the correlation between the four core skills, the

poor correlation between "integration" (compare and contrast ideas within a text or through various

texts; and synthesize ideas within a text or through various texts) and all items from the same factor to

which it belongs - reading - suggests a lack of consistency in the range of items related to reading.

Teachers seem to be divided regarding the importance attributed to the tasks assigned to the reading

aspect, as the original questionnaire proposes. The ability to compare and contrast texts should be

related to good writing indexes, which implies an inherent correlation between reading and writing

(Grabe, 2003), but should also be an important predictor of the acquisition of described schema in

different texts, since it is a skill potentiated by comparison and contrast along different texts (Grabe).

A negative correlation was also found between writing and reading items, as well as low

correlation, although not negative, between specific reading tasks (location and basic understanding)

and writing (linguistic rules) in the relationship with listening comprehension (understanding the main

ideas and recognition of the communicative functions of the speaker).

On the one hand, this result is also partially in conflict with the study by Cho, Rijmen and Novak

(2013) who, having applied the TOEFL iBT, like this study, found consistency in the items and

respective skills through the difficulty recognized by teachers evaluating the tasks belonging to the four

areas (reading, writing, listening and speaking). On the other hand, the same study detected specific

problems highlighted in the relationship between reading and listening skills (Cho, Rijmen & Novak, 2013).


With regard to the negative correlation between specific items of writing and reading, which

would not be expected given the known interdependence relationship between these skills (Chien,

Palau & Sun, 2014), it is curious we have evidence that there is inconsistency in teachers' practices to

promote non-native students' learning, as the relationship between reading and writing skills is very

strong and proven (Shanahan, 2008). However, this easy correlation assumption goes back to the

premise of the eighties when it was believed that the cognitive processes underlying reading and

writing skills in Mother Tongue were the same as in the L2 context. Grabe (2003) focuses on this

premise as the foundation of the idea of a direct correlation advocated by the authors of the following decade

(Carson, 1990; Flahive & Bailey, 1993). The author poses the question to assess the relationship, in L2,

between those two skills, but not as a direct correlation. Shanahan (2008) also analyses the

relationship between the four skills, aligning writing and reading and underlining the late

acquisition of the former. Above all, that study presents writing as dependent on, and benefiting from,

acquisitions and processes learned with other skills, highlighting the role of reading and speaking. It reveals

that there are, however, clear intersections among all skills and that there are subareas (dimensions in

each skill) that may have higher correlation and prediction than others between writing and reading

(i.e. word decoding).

We started from the same principle in this study because our results explain how some items

contradict the transversal relationship equation between skills. However, the same study (Shanahan)

found that progress in the speaking and writing skills is parallel, particularly with implications for the

acquisition of grammar rules that cut across both skills in the L2 context.

In fact, the author states that the most recent studies have confirmed the role of speaking for

reading and not only of writing for reading and vice versa, in a different proportion from the one observed

in the Mother Tongue context. Also, Ball (2003) concluded in his study that younger children (grades

3-6) are the ones who benefit most in reading using their speaking skills, dismissing the explanation of

the largest prediction of writing for success in reading. The data seem to confirm this relationship

(overlapping) in that the correlations are high and positive between such writing items (linguistic rules)

and speaking skills (fluency and grammar rules).

Age is the factor that differs least between groups in univariate analysis of variance, except for the

regression analysis test that shows the predictive influence of the age of teachers only to explain their

choice of the task referring to the ability to reason within the speaking skill. In fact, there are few

studies that examine this correlation (age factor) and usually authors (Kanno & Stuart, 2011; Tsui,

2003) focus on the differentiation of experiences about teaching L2 between teachers at an early stage

in their careers and more experienced ones. They conclude that younger teachers are at a stage of

learning teaching concepts and processes, before actually implementing the tasks and processes (Kanno

& Stuart).

It was only in the last two decades that analysis of the cognition of teachers on the actual teaching

started, and very recently in the specific area of teaching content to non-native students (Horwitz,

1985; 2014; Johnson, 1994; Liu & Fisher, 2006; Tsui, 2007), and without the longitudinal dimension of

studies that are able to assess the modification of knowledge and practices of those teachers.

Regardless of the area, but dependent on age, teachers begin by being more focused on solving

classroom management problems, and only the more experienced attend to language tasks in an L2

context (Nunan, 1992). In this study, age was not a significant variable in the comparative analysis of

groups of teachers in the evaluation of the tasks for the four major skills of non-native students.

However, compared to the study by Rosenfeld, Leung and Ottman (2001), younger teachers in this

study are the ones who perceive fewer items in the tasks, especially in the speaking skill. Younger

Portuguese teachers value more tasks, which attests to their greater awareness compared to that of

American teachers in the same assessed context of choosing relevant tasks for non-native students. On

the other hand, younger teachers (and students in teacher training courses) in the previous study, like

the younger teachers in the Portuguese study, devalue the writing tasks in a similar way, specifically in the

audience item (writing for different audiences).

Isaacs and Thomson (2013) corroborate the specific situation of the preparation of the most

inexperienced teachers, concluding that they are the teachers who find it most difficult to distinguish

items within the test ranges.

Regarding the other variable - experience with multicultural groups - it was also found that it has

predictive value (linear regression analysis) for the implementation of action measures for non-native

students, that is, less experience is associated with lower capacity and initiative to apply measures in

PLNM. These results corroborate studies of the past two decades that reveal that inexperience in an L2

teaching context creates serious ambiguities and conceptual errors about how to teach and what

materials to use in class with immigrant students (Horwitz, 1988; 2014; Kern, 1995; Mantle-Bromley,

1995; Peacock, 2001; Samimy & Lee, 1997). Indeed, studies insist on distinguishing teachers in a

pre-service situation and practising teachers in an L2 context, and the authors (Bree, Hird, Milton et

al., 2001; Peacock, 2001) concluded that younger teachers at the start of their careers have more

mistaken beliefs about teaching techniques and the learning priorities of non-native students. The same

authors found that less experienced teachers with little contact with multicultural classes (or without

any contact) are the teachers who characterize L2 teaching as focused exclusively on teaching grammar

and vocabulary at the expense of other skills.

This study’s results are partially in accordance with the data of international studies (Bree, Hird,

Milton et al., 2001; Horwitz, 1985; Peacock, 2001) from the nineties onward, in that Portuguese

teachers devalue tasks related to the assessment and teaching of language rules, while agreeing with

items that focus on vocabulary. Reading and listening comprehension, especially of test questions and

written instructions, are the priorities, as noted, of the Portuguese teacher sample. On the other hand, in

the groups within the Portuguese sample, teachers with less experience (the younger ones) do not differ

from the more experienced ones when determining the reading tasks, which may denote less capacity

to differentiate assessment and teaching tasks in general. Also, on the power of the variables referred to

in Question 2 of this study, other authors value the analysis of the experience with multicultural groups

(Flores & Smith, 2009; Lee & Oxelson, 2006; Pettit, 2011) and other factors, such as length of service

and training for L2 teaching (Flores & Smith, 2009), for professionals to develop positive attitudes

towards L2 education and the maintenance of students' Mother Tongue as something important for the

L2 development process (Bialystok, 2002; Cummins, 1980; Hinkel, 2012; Lee & Oxelson).

The experience of implementing measures in non-native students was an important variable in this

study to explain tasks valuation differences, but as a predicting variable. These results contradict

previous studies that assert the importance of experience with linguistically diverse classes to build

teachers’ favourable attitudes towards the teaching of L2 learners (Karabenick & Noda, 2004; Reeves,

2006; García-Nevarez, Stafford & Arias, 2005). Teachers with no experience in this area may develop

representation problems on the needs of these students and the need to differentiate groups of learners

(immigrants, refugees and bilingual), for which reason research in this topic is important to understand

the lack of perception of teachers from any scientific area (Freeman, 1975; Karabenick & Noda;

Reeves, 2006). Since the diversity of the teachers’ scientific areas is found mainly in high school, we

find studies that corroborate the problem of the poor preparation of those teachers, which has serious

implications for the development of L2 in minorities within the classroom (Hansen-Thomas &

Cavagnetto, 2010; Rubinstein-Avila & Lee, 2014).

Freeman (1975) confirms that prior knowledge of teachers is crucial in attitudes and practices

within the classroom with non-native students because this prior knowledge encompasses

predispositions and socio-cultural ideas. The diversity of linguistic minorities is increasingly broad and

challenging. However, it was confirmed that teachers with no experience with measures/materials are

those with lower means in assessing the relevance of the tasks. And regardless of length of service and

experience with multicultural classes, experience with implementing L2 teaching measures/materials is

a predictor variable only for listening comprehension with regard to the item "recognize attitudinal

signals in the interlocutor" (i.e. communicative functions). Again, listening tasks appear to be highly

valued by Portuguese teachers, confirming the research trend on L2 teaching methods that recognize

the interdependencies of cognitive processes between listening and reading, as mentioned above

(Brown, 2009; Bygate, Swain & Skehan, 2013; DeKeyser, 2007; Harding, Alderson & Brunfaut, 2015;

Zeeland & Schmitt, 2013).

The results presented in this study are an important contribution in two respects. First, the

analysis of teachers’ perception of relevant tasks in L2 is pioneering in Portugal. Second, the study

presents a corpus of results that corroborate and contrast with those of previous international studies, with

implications for education and concepts of practices that teachers from various scientific fields reveal

about L2 teaching and the type of tasks to consider in tests and in the classroom.

The data suggest that teachers may be developing inadequate practices and concepts, especially

considering the differences according to scientific field and high school level; that they undervalue the

grammar component of all skills to be developed by the students; that they overemphasize listening

comprehension and its relationship with reading; that they closely follow an L2 teaching model

(originally of American design; Horwitz, 1985), but only in the case of basic education teachers (for students

aged 4-11 years); and that they have poor notions regarding L2 tasks and evaluation tests, in general.


This work was supported by the Foundation for Science and Technology (FCT) under Grant n.º SFRH/BPD/86618/2012, and by the Center of Psychology Research of Universidade Autónoma de Lisboa, Lisbon, Portugal.


