Since Messick (1989) introduced validity as an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores, the issue has gained unprecedented significance. Weir and Shaw (2007), in their socio-cognitive framework, which views language testing and validation within a contemporary evidence-based paradigm, likewise consider scoring validity an important element in providing theoretical, logical, and empirical evidence to support validity claims and arguments about the quality and usefulness of writing tests. The results of this study showed that raters based their impressionistic and subjective ratings on their own criteria, developed and internalized over the course of their rating practice. Their elaborations in the interviews showed that they firmly believed in their ratings even though they used no explicit rating scale.
None of the interviewees reported using an explicit rating scale; rather, they developed a kind of ad hoc rating scale based on their own experience while engaged in the rating task. In sum, the current impressionistic rating situation in Iranian EFL writing assessment can be explained on several main grounds. The current state of rating practice may be a by-product of general educational policies in the EFL context. According to Fraizer (200 ), writing assessment can no longer be something we pretend to do on our own within the realm of academic institutions; rather, power is an element of the assessment process that cannot be ignored (Huot and Williamson, 1997). Writing and its assessment are not attended to seriously in the EFL curriculum. In fact, pragmatic concerns, including a growing student population, a lack of rater training courses, a lack of time, and, more seriously, the lack of an ordered validation program, have led raters to feel safe with their own scoring; as a result, a vague rating situation combining elements of criterion- and norm-referenced approaches prevails (Barkaoui, 2007). The majority of the raters in this study have a positive view of the use of a rating scale.
They would like to achieve reliable and valid assessment through the use of a rating scale. Several researchers have reported that raters' assessments are more reliable when a scale is used (Jonsson & Svingby, 2001; Silvestri & Oescher, 2006). However, all of the raters in this study believe that any such rating scale should be suitable for our context.