Simulation of intellectual system for evaluation of multilevel test tasks on the basis of fuzzy logic

The article describes the stages of modeling an intelligent system for evaluating multilevel test tasks based on fuzzy logic in the MATLAB application package, specifically the Fuzzy Logic Toolbox. Existing approaches to fuzzy assessment of test results are analyzed, together with their advantages and disadvantages. In general, the methods considered for assessing students fall into two groups: methods that use fuzzy sets and the corresponding membership functions, and the fuzzy estimation method together with the generalized fuzzy estimation method. In the present work, the Sugeno production model is used as the closest to natural language. This closeness allows closer interaction with a subject area expert and makes it possible to build well-understood, easily interpreted inference systems. The structure of the fuzzy system and the functions and mechanisms of model building are described. The system is presented as a block diagram of fuzzy logical nodes and consists of four input variables, corresponding to the levels of knowledge assimilation, and one output variable. The response surface of the fuzzy system reflects the dependence of the final grade on the level of difficulty of the task and the degree of correctness of the task.
The intelligent system modeled in this way makes it possible to take into account the fuzzy characteristics of the test that are included in the assessment: the level of difficulty of the task, which can be assessed as "easy", "average", "above average", "difficult"; the degree of correctness of the task, which can be assessed as "correct", "partially correct", "rather correct", "incorrect"; the time allotted for the execution of a test task or test, which can be assessed as "short", "medium", "long", "very long"; the percentage of correctly completed tasks, which can be assessed as "small", "medium", "large", "very large"; and the final mark for the test, which can be assessed as "poor", "satisfactory", "good", "excellent". This approach ensures the maximum consideration of answers to questions of all levels of complexity by formulating a base of inference rules and selecting weighting coefficients when deriving the final grade. The robustness of the system is achieved by using Gaussian membership functions. Testing the controller on the test sample confirms the functional suitability of the developed model.


Introduction
Test control is increasingly becoming an integral part of the educational process for all types and levels of educational institutions. Having become widespread in Western European countries and the United States, it is gradually gaining ground in domestic higher education. There are many practical implementations of automated testing systems, both for individual disciplines and as universal knowledge assessment systems that are fully or partially invariant to specific disciplines and allow teachers to edit their information content. Analysis of the effectiveness of automated testing in educational institutions shows that the most significant disadvantages of modern approaches to automated testing include [1, p. 4]:
• the need to formulate the answer options for test items on the principle "one is absolutely correct, the other N are absolutely wrong";
• the primitiveness and inflexibility of the procedures for calculating the final grade, which reduce either to the ratio of the number of correct answers to the number of questions asked, or to the summation of points assigned for each correct answer;
• the impossibility of automating various methods of knowledge control that are widely used in pedagogical practice;
• the significant laboriousness of manually forming a set of test tasks and answer options large enough to exclude or minimize the likelihood of presenting the same task to different people whose knowledge is checked simultaneously.
From this it follows that it is necessary to develop an automated knowledge control system, which requires the use of fundamentally different approaches to the presentation and processing of information based on methods and models developed within the framework of the theory of intelligent computing and knowledge engineering.
Many studies in pedagogy are devoted to the assessment of knowledge, in particular: monitoring the quality of education (Cherednichenko and Yangolenko [2], He and He [3], Igbape and Idogho [4], Leontev et al. [5], Li et al. [6], Muhd Nor et al. [7], Qin et al. [8], Sorour et al. [9], Wei [10], Zhi and Nan [11] and others); development of modern innovative technologies that are included in the knowledge assessment system (Anohina-Naumeca et al. [12], Anohina-Naumeca and Grundspenkis [13], Gierłowski and Nowicki [14], Grundspenkis [15], Schmuck et al. [16], Szöllosi et al. [17] and others); the use of a multi-point scale for assessing knowledge, abilities, and skills (Bespalko [18], Linn [19] and others); theoretical approaches to the assessment of students' knowledge, their development and improvement (Clotfelter et al. [20], Falchikov and Boud [21], Falchikov and Goldfinch [22], Host et al. [23], Hwang and Chang [24], Newble and Jaeger [25], Osadchyi et al. [26], Rust et al. [27], Scouller [28], Topping [29], Wiliam et al. [30] and others); evaluation of test results in an adaptive automated testing system, taking into account the ambiguity of the formulations of answers (Barker [31], Phankokkruad and Woraratpanya [32], Rudinskiy [1] and others).
In [33] we substantiated the structural model of a neuro-fuzzy system for the professional selection of students for training in IT specialties. The model studies the psychological characteristics, personal qualities and factual knowledge, skills and abilities of students as a unity of the fuzzy and stochastic data base of the intelligent system. In [34] we covered the use of fuzzy logic to describe the indicators of expert competence assessment with linguistic variables, instead of numerical ones or in addition to them, and the development of a Sugeno-type intelligent system for determining expert competence.
The process of modeling intelligent systems based on fuzzy logic in various fields, and the analysis of the effectiveness of systems implemented in MATLAB, is disclosed in the works of: Taylor [35] (fuzzy logic methodology, which is widely used in research, engineering practice and education); Lutsyk et al. [36] (use of parametric identification and adaptive neuro-fuzzy technologies to determine energy-efficient modes of production equipment); Shtovba [37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50] (the theory of fuzzy identification, methods of fuzzy clustering and their application to fuzzy rule extraction, the method of decision-making under fuzzy conditions based on merging goals and constraints, and the author's software packages for designing fuzzy classifiers, building hierarchical fuzzy systems, training fuzzy knowledge bases of the Mamdani type, and performing inference with fuzzy input data).

Materials and methods
Models based on fuzzy logic are more flexible, since they largely make it possible to take into account the experience and intuition of a specialist in a particular field. They are more adequate to the simulated reality and make it possible to obtain a solution commensurate in accuracy with the initial data [51].
As a rule, the following are treated as fuzzy characteristics of a test:
1) the level of difficulty of the task, which can be assessed as "easy", "average", "above average", "difficult";
2) the degree of correctness of the task, which can be assessed as "correct", "partially correct", "rather correct", "incorrect";
3) the time allotted for the execution of a test task or test, which can be assessed as "short", "average", "long", "very long";
4) the percentage of correctly completed tasks, which can be assessed as "small", "medium", "large", "very large";
5) the final mark for the test, which can be assessed as "bad", "satisfactory", "good", "excellent".
Among the fuzzy models for evaluating test results, adaptive models are of particular interest. Rudinskiy [1, p. 49] describes an adaptive model for evaluating the results of a "fuzzy" test. The idea is that the set of reference answers for each test item has a fuzzy grading scale. This fuzzy scale corresponds to the normalized numerical scale $(1, \alpha_1, \alpha_2, \alpha_3, 0)$, where $\alpha_i \in (0, 1)$, $i = 1, \dots, 3$. Every answer except the correct one is assigned a subsequent question with its own subset of answers. If an inaccurate answer is given to a question at some step of testing, a clarifying question is asked next, and its subset of answers contains both "more correct" ("correct", "incomplete") and "less correct" ("uncertain", "wrong") answers. If this clarifying question is also answered differently from the correct answer, no further additional questions are asked (otherwise compiling such a structure of questions with subsets of answers would be far too laborious), and testing moves on to the next step (question). Thus, the testing process can be represented as movement along a directed graph, where the vertices are questions and the arcs are transitions from the previous question to the next.
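The movement along the question graph described above can be illustrated with a minimal Python sketch. It is not Rudinskiy's implementation: the class name Question, the function run_test, and the concrete scale values 0.75, 0.5 and 0.25 are assumptions made only for illustration (the source requires only that the intermediate values lie strictly between 0 and 1).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical values for the normalized scale (1, a1, a2, a3, 0);
# only the end points 1 and 0 come from the text.
SCALE = (1.0, 0.75, 0.5, 0.25, 0.0)


@dataclass
class Question:
    text: str
    # Clarifying question asked after an inexact answer (at most one level).
    clarifying: Optional["Question"] = None


def run_test(questions: List[Question],
             answer_fn: Callable[[Question], float]) -> List[Tuple[str, float]]:
    """Walk the question graph and record (question, degree of correctness).

    answer_fn returns the scale value assigned to the answer that the
    tested person gave to the question.
    """
    path = []
    for question in questions:
        degree = answer_fn(question)
        path.append((question.text, degree))
        # One clarifying question per step; whatever its answer is,
        # testing then moves on to the next main question.
        if degree != 1.0 and question.clarifying is not None:
            follow_up = question.clarifying
            path.append((follow_up.text, answer_fn(follow_up)))
    return path
```

The returned path is the sequence of visited vertices with the degree recorded at each one; how these degrees are combined into a final grade is left out, because the formula is not given in the text.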
An adaptive testing model using the apparatus of fuzzy logic is considered by Duplik [52, p. 60]. The 12-point scale proposed by Bespalko [18] is used as the scale for evaluating test results. The author also proposes a correspondence between the percentage of the student's correct answers and grades on the 12-point and 5-point scales, which, in turn, correspond to fuzzy concepts.
Danilova [53, p. 17] developed an adaptive fuzzy model for evaluating the results of automated testing, with tasks divided according to the levels of assimilation proposed by Bespalko [18]. The paper presents models for evaluating test results: question-answer relations in test tasks are formalized according to the levels of assimilation in order to recognize the answers of the tested person and to present test results formally; the value estimates of the test items are scaled; rule bases of fuzzy productions are developed for evaluating test items of closed and open types; to ensure the adaptability of testing, a rule base of fuzzy productions is developed for ranking tasks in the test; and the integral assessment of test performance is calculated from the assessment results of each test task. The fuzzy inference for evaluating the test results, based on the Mamdani method, is described.
Belov [54] considers the problem of building an automated testing system (ATS) that analyzes the respondent's answers in natural language (NL). To recognize the respondent's answer and the reference answer in the automated testing system, a linguistic analyzer module has been developed, which processes text in NL. The surface-syntactic analysis of the phrases of the reference and user answers yields syntactic dependency trees. These trees include the word forms of the phrase, each with its morphological descriptors and the syntactic properties that combine words into syntactic fragments and groups.
A limitation of the presented comparison model is that it requires well-formed sentences. A sentence that is not well-formed is discarded by the linguistic analyzer, and the respondent is asked to reformulate the answer. Each type of response is associated with a so-called syntactic template (SynT), which determines a set of typical syntactic constructions of a sentence and their significance. The obtained result, the degree of correspondence (relevance of phrases), is taken as the degree of "correctness" of the respondent's answer on the scale [0; 1].

Results
Thus, all the methods for evaluating test results that we have considered have both advantages and disadvantages, which are summarized in table 1 for clarity.

Table 1
Advantages and disadvantages of test assessment methods

| Author | Advantages | Disadvantages |
|---|---|---|
| Rudinskiy [1] | The introduction of fuzziness into the organization of the adaptive test, which allows the compilers of the test, at the stage of its creation, to build for each test task a hierarchical structure of questions in the form of a directed graph. | When evaluating test tasks and the test, the apparatus of fuzzy logic is not used, and the obtained linguistic values are simply projected onto a normalized numerical scale. The values obtained on this scale determine the degree of correctness of the answers, which are substituted into a specially designed formula to obtain the final grade. |
| Duplik [52] | The use of a fuzzy logic apparatus to obtain an integral assessment of test results. The integral assessment is influenced by such fuzzy characteristics of the test as the current level of training, the percentage of correct answers, the complexity of the task, and the time it takes to complete the task. | The 12-point assessment scale proposed by V. P. Bespalko is used only to distribute the traditional 5-point scale evenly over it and is not tied to the levels of assimilation of knowledge. |
| Danilova [53] | The sophistication of the models for assessing test tasks, adaptive testing, and integral assessment of test results. | The set of fuzzy production rules for evaluating test tasks with an open-ended question is applicable only to test tasks of the "Substitution" type. |
| Belov [54] | A classification of question types and the corresponding types of answers in natural language is identified. | The graph comparison method is very labor intensive and complex. Firstly, the syntactic templates of all reference answers must be built in advance, and secondly, the proximity of two phrases is determined on the scale [0; 1] by means of a complex algorithm, which would be easier to do using the apparatus of fuzzy logic. |

In general, the methods considered for assessing students fall into two groups: methods that use fuzzy sets and the corresponding membership functions, and the fuzzy estimation method together with the generalized fuzzy estimation method. The assessment system should be regularly reviewed and improved so that it continues to assess students impartially and fairly.
It makes sense to use a fuzzy model to describe an object when we do not have its analytical description, or it is too cumbersome to use, but at the same time there is a sufficiently large amount of experimental data on the behavior of an object and/or heuristic rules for its functioning.
In this work, the Sugeno production model is used as the closest to natural language. This closeness allows closer interaction with a subject area expert and makes it possible to build well-understood, easily interpreted inference systems.
It is important for us to develop an assessment strategy based on fuzzy sets, which requires careful consideration of the factors included in the assessment. These include the level of difficulty of the task, the degree of correctness of the task, and the final mark for the test, which can be assessed as "bad", "satisfactory", "good", and "excellent". The system is presented as a block diagram of fuzzy logical nodes in figure 1 and consists of four input variables, corresponding to the levels of knowledge assimilation, and one output variable. With this method, the system contains two kinds of nodes. The first node takes into account the level of complexity of the task and the degree of correctness of the task, depending on the task types supported by the automated system used for testing, for example Moodle [55].
The next three nodes behave like a fuzzy logic controller with two inputs with corresponding weights and one output, as in figure 2.

Fuzzy system implementation
From the subject expert, we obtain the matrix values and the dimensions that describe the degree of importance of each question in the fuzzy domain, that is, the set of all allowed atomic values of the matrix column. The crisp values are given as a vector. In the first node, the source data are the experimental data, while the subsequent nodes work as fuzzy controllers whose input is the output of the previous node (corresponding to the levels). The output of each node can take the form of fuzzy values or of linguistic variables. Each node has weighting coefficients, which can all be set to one when every input parameter has equal influence. The output is computed by the inference mechanism of the Sugeno fuzzy system. The system is described as follows.
System name: Correctness.
Input variables: Level 1, Level 2, Level 3, Level 4.
Output variable: Final grade.
Terms of the input variables: correct, wrong.
Terms of the output variable: correct, almost correct, partly correct, rather correct, probably wrong, wrong, zero.
The fuzzy membership functions of the system are defined on the interval [0; 100] (see figure 3); the parameters of the input and output variables are given in tables 2 and 3, respectively.
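As a minimal illustration of membership functions on [0; 100], the following Python sketch evaluates Gaussian curves for the two input terms. The centers (0 for "wrong", 100 for "correct") and the width 42.47 are assumptions chosen only for the example and do not reproduce the parameters from the tables; the names gauss_mf and fuzzify are hypothetical.

```python
import math


def gauss_mf(x: float, sigma: float, center: float) -> float:
    """Gaussian membership degree of x for a term with the given width and center."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Assumed (illustrative) parameters of the input terms: (sigma, center).
INPUT_TERMS = {"wrong": (42.47, 0.0), "correct": (42.47, 100.0)}


def fuzzify(x: float) -> dict:
    """Return the membership degree of a crisp score x in every input term."""
    return {name: gauss_mf(x, sigma, center)
            for name, (sigma, center) in INPUT_TERMS.items()}


if __name__ == "__main__":
    for score in (0, 50, 100):
        print(score, fuzzify(score))
```

Because a Gaussian curve never reaches zero, every score in the interval belongs to both terms to some degree, which is one reason such functions give smooth transitions between terms.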
Set of "If ... then" rules:
1. If (level1 is wrong) and (level2 is wrong) and (level3 is wrong) and (level4 is wrong) then (final grade is zero) (1)
2. If (level1 is wrong) and (level2 is wrong) and (level3 is wrong) and (level4 is correct) then (final grade is probably wrong) (1)
3. If (level1 is wrong) and (level2 is wrong) and (level3 is correct) and (level4 is wrong) then (final grade is probably wrong) (1)
4. If (level1 is wrong) and (level2 is wrong) and (level3 is correct) and (level4 is correct) then (final grade is partly correct) (1)
. . .
14. If (level1 is correct) and (level2 is correct) and (level3 is wrong) and (level4 is correct) then (final grade is almost correct) (1)
15. If (level1 is correct) and (level2 is correct) and (level3 is correct) and (level4 is wrong) then (final grade is partly correct) (1)
16. If (level1 is correct) and (level2 is correct) and (level3 is correct) and (level4 is correct) then (final grade is correct) (1)
As a result of modeling this system in the MATLAB application package, in particular the Fuzzy Logic Toolbox, we obtained the response surfaces of the system at constant values of the input variables level3 and level4 equal to 50: figure 4a shows the system manually configured by an expert, and figure 4b shows the system configured according to the ANFIS algorithm. Analysis of the response surface of the manually tuned system shows incorrect operation at intervals corresponding to intermediate values of the membership functions, such as the constants of the output variable. To eliminate these discrepancies, the fuzzy system was trained using the ANFIS algorithm on the training sample.
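How the listed rules produce a crisp final grade can be sketched in Python as a zero-order Sugeno inference. This is a hedged illustration, not the authors' MATLAB model: the membership parameters, the output constants, the use of the product as the "and" operator, and the name final_grade are all assumptions. Only the rules printed above are encoded; the elided rules are deliberately left out.

```python
import math


def gauss_mf(x, sigma, center):
    """Gaussian membership degree of x."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Assumed input terms on [0, 100]: (sigma, center).
TERMS = {"wrong": (42.47, 0.0), "correct": (42.47, 100.0)}

# Assumed constants for the output terms that the listed rules use.
OUTPUT = {"zero": 0.0, "probably wrong": 25.0, "partly correct": 50.0,
          "almost correct": 75.0, "correct": 100.0}

W, C = "wrong", "correct"
# Rules 1-4 and 14-16 from the text; rules 5-13 are elided in the source
# and are not reconstructed here. All rule weights equal 1.
RULES = [
    ((W, W, W, W), "zero"),
    ((W, W, W, C), "probably wrong"),
    ((W, W, C, W), "probably wrong"),
    ((W, W, C, C), "partly correct"),
    ((C, C, W, C), "almost correct"),
    ((C, C, C, W), "partly correct"),
    ((C, C, C, C), "correct"),
]


def final_grade(levels):
    """Weighted average of rule constants, weighted by rule firing strength."""
    numerator = denominator = 0.0
    for antecedent, consequent in RULES:
        strength = 1.0
        for x, term in zip(levels, antecedent):
            strength *= gauss_mf(x, *TERMS[term])  # "and" taken as product
        numerator += strength * OUTPUT[consequent]
        denominator += strength
    return numerator / denominator
```

Calling final_grade((level1, level2, 50, 50)) over a grid of level1 and level2 values would give a response surface of the same kind as the one in figure 4, although with these assumed parameters and the incomplete rule base it will not match the published surface.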
Training program:
[learn, error] = anfis(tr_data, initfis, 10)
where the output arguments are: learn, the tuned system of the Sugeno type, whose parameters minimize the error on the training set; error, the system error at each training iteration. The input arguments are: tr_data, the training sample; initfis, the initial fuzzy inference system; the number 10 sets the number of training iterations.
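As a rough illustration of what iterative tuning on a training sample does, the Python sketch below adjusts only the output constants of a small zero-order Sugeno system by gradient descent and records the error at each iteration. It is a simplified stand-in and not the ANFIS hybrid algorithm used in MATLAB: the membership parameters stay fixed, only two inputs are used, and the training sample, the learning rate, the initial constants and all function names are hypothetical.

```python
import math


def gauss_mf(x, sigma, center):
    """Gaussian membership degree of x."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def firing_strengths(x, sigma=42.47):
    """Normalized strengths of the four rules over two inputs (wrong/correct)."""
    mu = [(gauss_mf(v, sigma, 0.0), gauss_mf(v, sigma, 100.0)) for v in x]
    raw = [mu[0][i] * mu[1][j] for i in (0, 1) for j in (0, 1)]
    total = sum(raw)
    return [w / total for w in raw]


def train_consequents(tr_data, init_consts, epochs=10, lr=0.5):
    """Tune rule output constants; return (tuned constants, error per epoch)."""
    consts = list(init_consts)
    errors = []
    for _ in range(epochs):
        squared = 0.0
        for *x, target in tr_data:
            w = firing_strengths(x)
            y = sum(wi * ci for wi, ci in zip(w, consts))
            e = y - target
            squared += e * e
            # Gradient step on the squared error with respect to each constant.
            consts = [ci - lr * e * wi for ci, wi in zip(consts, w)]
        errors.append(math.sqrt(squared / len(tr_data)))  # RMSE of this epoch
    return consts, errors


if __name__ == "__main__":
    # Hypothetical training sample: (level1, level2, expert grade).
    sample = [(0, 0, 0), (0, 100, 40), (100, 0, 60), (100, 100, 100)]
    tuned, history = train_consequents(sample, [0.0, 50.0, 50.0, 100.0])
    print(tuned, history)
```

The two returned values play the same roles as learn and error in the description above: a tuned system and the error recorded at each of the ten iterations.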
As can be seen from figure 4b, the system trained with the ANFIS algorithm reproduces the expert's opinion as accurately as possible, which makes it possible to formulate the final assessment more accurately, taking into account the level of the tasks done correctly. The results of testing the fuzzy system are shown in table 4.

Conclusion
The intelligent system for assessing multilevel test tasks based on fuzzy logic, modeled in this way, makes it possible to take into account all of the above factors that are included in the assessment. This approach ensures the maximum consideration of answers to questions of all levels of complexity by formulating a base of inference rules and selecting weighting coefficients when deriving the final grade. The stability of the system is achieved by using Gaussian membership functions, as discussed in [56, p. 14]. We see the prospect of further research in processing the information received from the fuzzy system and in formulating appropriate recommendations for specialists in different fields of knowledge on interpreting the final grade obtained with multilevel test tasks.

Figure 1: Block diagram of a fuzzy estimation system.

Figure 2: Node of presentation in the form of a fuzzy logical controller.

Figure 3: Membership functions of input linguistic variables of a fuzzy system.

Figure 4: The surface of the system response at constant values of the input variables level3 and level4 equal to 50: (a) manually configured by an expert; (b) configured according to the ANFIS algorithm.

Table 2
Parameters of membership functions of output variables

Table 3
Fuzzy system testing results
Difficulty level 1 | Difficulty level 2 | Difficulty level 3 | Difficulty level 4 | Final Grade