Evaluation of Intelligent Grouping Based on Learners’ Collaboration Competence Level in Online Collaborative Learning Environment

In this paper we explore the impact of an intelligent grouping algorithm based on learners’ collaborative competency, compared with (a) an instructor-based Grade Point Average (GPA) method and (b) a random method, on group outcomes and group collaboration problems in an online collaborative learning environment. An intelligent grouping algorithm capable of forming heterogeneous groups based on learners’ collaborative competency level was added to a Learning Management System (LMS). A true experimental design was used to examine whether there is any association between group formation method and group scores, learning experiences, and group problems. The findings show that all groups had similar mean scores in all group tests and shared many similar group collaboration problems and learning experiences. However, the GPA group formation method involves the instructor and may not be dynamic, and the random method does not guarantee heterogeneity based on learners’ collaboration competence level. Instructors are therefore more likely to adopt our intelligent grouping method, since the findings show that it produces similar results. It also offers an added advantage in supporting group formation: it guarantees heterogeneity, is dynamic, and requires less instructor involvement.


Introduction
How groups are formed has a big impact on group performance. Depending on how a group is formed, it can result in homogeneity in student characteristics and ineffective peer learning. Thus, there is a need to form heterogeneous groups in collaborative learning that bring together students with different collaborative competencies and knowledge levels. However, without empirical study it is difficult to conclude which characteristics are desirable in group heterogeneity, as different learning needs may require different group orientations. Previous research has focused on various group orientation techniques and their impact on group performance, such as grouping by learning style (Alfonseca, Carro, Martin, Ortigosa, & Paredes, 2006; Deibel, 2005; Grigoriadou, Papanikolaou, & Gouli, 2006). However, there is a need to investigate the impact of other group orientation techniques on group performance, such as grouping students based on their Collaboration Competence Levels (CCL). Furthermore, most previous research on group formation lacks the true experimental design methodology that is recommended when investigating learning outcomes with different instructional methods.
This research sought to investigate the impact of different group orientation techniques (GPA, Intelligent Grouping, and Random) on group outcomes in an online collaborative learning environment. Hence, the research questions we intended to answer in this respect are:
1. Which group of learners amongst the intelligently grouped, randomly grouped and instructor grouped methods using GPA collaborates more effectively and performs better in an online group task?
2. What is the association between grouping method and group outcomes in terms of (a) students' learning experiences (b) perceived problems (c) group leadership satisfaction, and (d) group task satisfaction?
3. What are the students' perceived benefits of online collaborative learning?

Literature Review

Group Formation
Group formation is the process of identifying students and assigning them to a specific group so that they belong to one specific group when doing a group task (Wessner & Pfister, 2001). In a group task, a group can be either homogeneous or heterogeneous. In homogeneous group formation, a student joins a group with other members who have similar characteristics such as course interests, work schedules, and residential proximity. For instance, grouping students with interests in the same academic major or with similar course interests may be an effective procedure for promoting bonding, productivity, and synergy among group members, while grouping students with similar class and work schedules can facilitate out-of-class collaboration among teammates. Also, grouping students with respect to residential proximity may be an effective strategy for enabling group members to get together conveniently outside of class to complete group tasks. On the other hand, in heterogeneous group formation, a student joins a group with other members who have different or diverse characteristics such as academic achievement, learning styles, personality profiles, and demographic information, which could include age, gender, and racial, ethnic, or cultural background.
Heterogeneous groups are usually preferred because they are believed to produce constructive controversy (de Faria, Adan-Coello, & Yamanaka, 2006). However, even though heterogeneous groups are preferred, there is always a dilemma as to how far the heterogeneity should extend in terms of academic achievement, gender, age, social group, and personality. Consequently, numerous studies have been conducted to establish the effect of group formation method on group performance. Two methods (random selection and self-selection) tend to dominate in the literature, probably because they require little involvement from the instructor. Of these two methods, researchers have posited that self-selection offers the best advantages for students in classroom work groups (Connerley & Mael, 2001; Koppenhaver & Shrader, 2003; Strong & Anderson, 1990).
The criteria for selecting members of a group can also affect the members' commitment. Group members who choose with whom to work are more relationally satisfied with their group and more committed to working together than members who are randomly assigned to work with each other (Scott, 2001). The random selection method is widely used by instructors because of its ease of implementation and "fair" distribution, which gives each student an equal chance of being a member of any group; hence, both social and academic heterogeneity can be achieved to some extent. However, it can also lead to a lack of diversity in skills within the group (Bacon, Stewart, & Anderson, 2001). Randomly selected groups have also been shown to use their time during group meetings more effectively, and their members are more task oriented, probably because members are less familiar with one another, which makes the group's social network weaker than in self-selected groups (Chapman, Meuter, Toy, & Wright, 2006). Despite these advantages, random selection has proved to (a) be less effective in improving group performance, (b) be inferior in group dynamic ratings, and (c) increase conflicts (Chapman et al., 2006). Hence, there is a need to explore other intelligent techniques which are more dynamic and are capable of considering collaboration competences among learners.
The use of intelligent systems for group formation in online collaborative learning environments has also been reported in recent research (Liu, Joy, & Griffiths, 2009; Messeguer, Medina, Royo, Navarro, & Juarez, 2010). Although computer-based random selection methods have been preferred in large classes, intelligent techniques are better because they incorporate learner characteristics such as learning style (Liu et al., 2009), learner profile and context (Muehlenbrock, 2006), and contextual information (Messeguer et al., 2010). They can also change the group allocation. The ability to change group composition in real time enables the leveling up of learning results and improvements in the participants' social relationships. Some intelligent techniques apply machine learning methods such as instance-based learning and Bayesian networks, which can use contextual information to learn user behavior and predict an appropriate group for the learner. Liu et al. (2009) and Messeguer et al. (2010) developed an intelligent grouping algorithm based on learning style and integrated it into an LMS to group students with different learning styles together. They also demonstrated its use in a realistic online collaborative learning environment by comparing it with group assignments based on similar learning styles. However, their study did not address the impact of the algorithm when compared to other methods such as random selection and self-selection (popular in LMS). In addition, there are no true experimental studies of these intelligent systems that establish their effect on group performance when compared with instructor-based methods.

Intelligent Grouping based on Collaboration Competence Level
Forum data in the Moodle database have many attributes, such as a new post (an original idea), a reply to a post (a response to an existing idea), and the average rating of the posts (assigned by an instructor to indicate how relevant a post is to the issues under discussion).
Once processed into an appropriate form, these data can be fed to machine learning tools such as the Weka clustering algorithms (Aher & Lobo, 2011) to create clusters based on forum data (Aher & Lobo, 2011; Muuro, Wagacha, & Oboko, 2014). In this study we extracted the following data from the Moodle database: (a) user id (taken from the mdl_role_assignments table by checking the role and enrollment conditions), (b) number of posts (taken from the mdl_forum_posts table), (c) number of replies (taken from the mdl_forum_posts table), and (d) forum ratings (taken from the mdl_rating table). These data were stored in a .csv text file and entered into the Weka.PHP program, which uses the clustering algorithms to create three clusters representing three different collaborative competence levels (High, Medium, and Low), as discussed later in Table 1.
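The feature extraction step described above can be sketched as follows. This is only an illustration under stated assumptions, not the extraction code used in the study: the function name forum_features is hypothetical, each post record is assumed to carry the author's user id and a parent field where 0 marks a new post (any other value marks a reply), and ratings are assumed to have already been joined to the author of the rated post.

```python
from collections import defaultdict

def forum_features(posts, ratings):
    """Build one (userid, posts, replies, mean_rating) row per student.

    posts:   iterable of (userid, parent) pairs; parent == 0 is ASSUMED
             to mark a new post, any other value a reply.
    ratings: iterable of (userid, rating) pairs, where userid is the
             author of the rated post (an assumed, pre-joined form).
    """
    feats = defaultdict(lambda: {"posts": 0, "replies": 0,
                                 "rating_sum": 0.0, "rating_n": 0})
    for userid, parent in posts:
        key = "posts" if parent == 0 else "replies"
        feats[userid][key] += 1
    for userid, rating in ratings:
        feats[userid]["rating_sum"] += rating
        feats[userid]["rating_n"] += 1

    rows = []
    for userid in sorted(feats):
        f = feats[userid]
        # Students with no rated posts get a mean rating of 0.0
        mean_rating = f["rating_sum"] / f["rating_n"] if f["rating_n"] else 0.0
        rows.append((userid, f["posts"], f["replies"], mean_rating))
    return rows
```

Rows in this form could then be written to a .csv file and passed to a clustering tool to obtain the three clusters.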
Data stored in these clusters were used to form heterogeneous groups using an intelligent grouping algorithm (Muuro, Wagacha, & Oboko, 2014). This grouping algorithm selects students from different clusters to form a group whose membership represents diverse collaboration competencies. To create heterogeneous groups, the data stored in the three collaborative competence levels (Cluster 0, Cluster 1, and Cluster 2) are first converted to an array of userid values. Second, the userids are ranked from Cluster 0 (most collaborative) to Cluster 2 (least collaborative), and the result is stored in an array called rankedArray. From the rankedArray, the algorithm picks students from different collaborative levels according to rank and assigns them to one group of the specified group size. This process is repeated until all students are assigned to a group. The most collaborative students are assigned a mentor role in their group.
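A minimal sketch of this grouping step is given below. The text does not specify the exact rule for picking students from the ranked array, so the round-robin dealing used here is an assumption, as are the function name form_groups and the dictionary layout of the result. It illustrates the idea of spreading ranks across groups rather than reproducing the published algorithm.

```python
from math import ceil

def form_groups(clusters, group_size):
    """Form heterogeneous groups from ranked clusters of user ids.

    clusters: list of user-id lists ordered from most collaborative
              (Cluster 0) to least collaborative (Cluster 2).
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # Flatten the clusters in rank order (the rankedArray of the text)
    ranked_array = [uid for cluster in clusters for uid in cluster]
    n_groups = ceil(len(ranked_array) / group_size)
    groups = [[] for _ in range(n_groups)]
    # ASSUMED dealing rule: the i-th ranked student joins group i mod n_groups,
    # so every group draws members from the top, middle, and bottom ranks
    for i, uid in enumerate(ranked_array):
        groups[i % n_groups].append(uid)
    # The first member of each group is its highest-ranked student (mentor)
    return [{"mentor": g[0], "members": g} for g in groups]
```

With three students in Cluster 0, three in Cluster 1, six in Cluster 2, and a group size of four, this rule yields three groups, each mentored by one Cluster 0 student.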

Conceptual Framework
The conceptual framework is defined in terms of (a) definition of conceptual elements; (b) relationship between independent, intervening, and dependent variables; and (c) operationalization of the variables.

Definitions of Conceptual Elements
Independent variables. The independent variables in this study are derived from group formation techniques. Three different group formation techniques are studied, which include: random assignment, Grade Point Average (GPA) and intelligent grouping. These three different group formation techniques are used to construct our independent variables. In random assignment, group members are assigned at random and therefore, random numbers are used as indicators. In GPA method, students' performance in a given period of time is used as an indicator. In intelligent grouping, collaboration competence level is used as an indicator whereby data mined from a discussion forum are used to cluster students based on their collaboration competence level.
Dependent variables. Our dependent variables are derived from the group outcomes. The group outcomes include the group performance, learning experiences, perceived group problems, group task satisfaction, and group leader satisfaction. These five different group outcomes are used to construct our dependent variables. Performance in group work can be characterized by three characteristics: interdependence, synthesis and independence. Indicators for these group outcomes include: number of new posts or replies in a discussion forum, forum rating scores assigned by an instructor, and scores obtained from a written test or quiz related to the discussion forum. Figure 1 illustrates the relationship between independent and dependent variables.

Operationalization of Variables
In order to measure collaboration competence level we introduce three collaboration characteristics: interdependence, independence and synthesis. Interdependence requires active participation by each member; participation can be measured by counting the number of messages and statements submitted by each individual and the group to the other participants. This allows both groups and individuals to be compared in their level of participation. Independence, on the other hand, can be analyzed by measuring the extent of influence by the instructor or other participants in individual participation and interaction.
Individuals who post new ideas rather than just replies are more independent, hence, more collaborative.
Synthesis can be measured in two ways. First, by the interaction pattern of the discussion that occurs
In the light of the above arguments, we apply the three attributes to define three collaboration competence levels (High, Medium and Low), which are characterized by different levels of interdependence, synthesis and independence as described in Table 1. The operationalization of the variables indicated in the conceptual framework is shown in Table 2.

Table 1 Collaboration Competence Levels

| Level | Description |
| --- | --- |
| High | Student logs in often, participates and interacts actively, and indicates a high level of interdependence, synthesis, and independence. At this level the learner can be ranked into a higher level of collaboration competence. |
| Medium | Student logs in often, participates and interacts moderately, and indicates moderate interdependence, synthesis, and independence. At this level the learner needs assistance to improve to the high level. |
| Low | Student logs in and participates rarely, and there is no indication of interdependence, synthesis, or independence. At this level the learner needs immediate attention to improve to the medium level. |

Population and Sampling
The students who participated in this study were first-year students pursuing a Bachelor of Science in Computer Science or a Bachelor of Science in Mathematics and Computer Science at Kenyatta University, Kenya. First-year students were targeted because senior students have interacted socially for longer and prefer to work through social groups, which can skew the experiment results. These students were studying a first-year course called Foundations of Artificial Intelligence. This is a computer science course with a number of topics, such as problem solving in a state space, which have the potential to elicit discussion, making it a good course for collaborative learning. The entire population of the first-year class was 108 students, who had registered for the course by the time the research was being conducted. All the students were picked to participate in the study. Therefore, the sample size was the same as the population.
The 108 students were randomly assigned into three classes with equal numbers (36 students per class).
The randomization was done by generating random numbers in an Excel worksheet. Randomization was preferred to ensure that participants had an equal probability of being assigned to any class. This also reduces the effect of extraneous variables such as subject characteristics, which are a major threat to internal validity (Fraenkel, Wallen, & Hyun, 2012).
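The random assignment into three equal classes can be illustrated with a short sketch. The study used random numbers in a worksheet; the shuffle-and-deal approach below is an equivalent stand-in, and the function name assign_classes and the seed parameter are assumptions introduced only for this example.

```python
import random

def assign_classes(student_ids, n_classes=3, seed=None):
    """Randomly split students into n_classes classes of near-equal size."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)      # every ordering is equally likely
    # Deal the shuffled students out in turn, one class at a time
    return [shuffled[i::n_classes] for i in range(n_classes)]

# 108 hypothetical student ids give three classes of 36 students each
classes = assign_classes(range(1, 109), n_classes=3, seed=2016)
```

Because each student's position after shuffling is uniformly random, each student has the same probability of landing in any of the three classes.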

Research Design
A true experimental design was adopted where an experimental group and two control groups were used.
The control groups played the role of comparison groups, as they also received different treatments in terms of group orientation. An experimental design was adopted because it could help to identify the effect of the independent variable (group formation) on the dependent variable (group outcome). The three classes formed through randomization, as discussed earlier, were used in the group design, where one class served as the experimental group and the other two classes as the control groups. Each class was then assigned an instructor who was responsible for teaching the course and overseeing the discussions throughout the experimental period. The instructional design and teaching materials were prepared by the three instructors before the start of the course. This was to ensure that the same course materials and instructional design were used throughout in the three classes.
During the third and fourth week, students were given some discussion questions, such that for every week there was a group task to be solved. Self-selected groups were used in all three classes during this period of four weeks. The purpose of this discussion was to orient the students to forums in Moodle and at the same time to generate discussion data to be used in the intelligent grouping. The self-selected grouping method was used (a) because of its known advantages, such as allowing students to communicate better, have a positive attitude towards group work, and feel more excited to work together (Chapman et al., 2006); and (b) to ensure internal validity, as this grouping method was not included in the research questions under study. At the end of four weeks of discussion, the students did a pretest, which was taken as the first Continuous Assessment Test (CAT). The pretest was also used to confirm whether the classes created through randomization were heterogeneous in terms of learning capability.
During the sixth week, students were placed into groups of four using different methods for each class.
A group size of four was preferred because it is an average size that is small enough to represent heterogeneous learning characteristics and to benefit from the advantages realized when students discuss in small groups (Schellenberg, 1959). Students were expected to collaborate online at different times in the same location (same computer lab) using asynchronous communication tools. Each group had a group leader who was expected to initiate the discussion, moderate the discussion, and summarize the main points. The following procedures were adopted to assign students to groups and to assign a group leader to each group:
1. In class one, the instructor used Grade Point Averages (GPA) calculated from the results of the last semester. This class served as a comparison group.
2. In class two, the instructor used the intelligent grouping algorithm to cluster students and group them based on learners' collaboration competence level. These collaboration competence levels were created using clustering algorithms and the first four weeks' discussion data. This class served as the experimental group.
3. In class three, the instructor used the random grouping method available in Moodle, which automatically assigned students to groups of four. This class served as a comparison group.
After the grouping exercise was over, students were informed of their group assignments, how the rest of the discussion was to be carried out, and how evaluation would be done during the experimental period. Table 3 describes how internal validity was enhanced.
Table 3 Summary of Internal Validity Threats and Measures Taken

Instruments
The instruments used in this study include a pretest, posttests, and a post-study questionnaire. The next section discusses how the instruments were constructed and the measures taken to ensure validity.
Pretest. Thirty multiple choice questions were constructed, with the question items drawn from Artificial Intelligence (AI) books. The topic covered in the pretest was introduction to AI. To ensure the test required thorough comprehension and critical thinking by the students, the answer options for every item were closely related to the right answer. The thirty questions were then added to Moodle as a quiz, and each question was assigned 1 mark. The multiple choice questions were reshuffled dynamically by the system to avoid copying of answers among students.

| Type of Threat to Internal Validity | Measures taken |
| --- | --- |
| Subject characteristics | Participants were randomly assigned to groups, and a test (pretest) was done to measure the effectiveness of the randomization. |
| Location | The same learning environment was used (i.e., the whole experiment was conducted at Kenyatta University [KU]). |
| Instrumentation | Each instrument was validated as described in the respective sections, and all tests were conducted at the same time for all the groups. Different groups, rather than the participants, were used to pretest the instruments. Successfully approved assessment tools in Moodle were used to assess the forums. |
| Testing | Pretest and posttest were different. The pretest was only meant to measure the effectiveness of randomization. |
| Attitude of subjects | Students were informed about the purpose of the study at the start of the course, and the tests were to be part of the CAT for the course. |
| Implementation | Three different instructors who are experts in the course facilitated teaching of the course in the three classes, but the same instructional materials were used throughout. |

Posttest. The posttest was made up of three tests which were designed differently but whose contents were drawn from the same topic. That way, different levels of knowledge construction were examined, as recommended in Bloom's taxonomy (Bloom, 1956). The first test was a discussion forum which required the students to solve state space search problems. State space search problems were preferred because they generate a lot of discussion, since there can be multiple solutions depending on the description of the state space and the heuristic function used to generate a solution. It is also possible to set many questions of the same weight by simply examining: (a) the description of the state space, (b) the rules and operators for moving from one state to another, (c) the possible solutions, and (d) the optimal solution and related heuristic function.
For each class there were nine groups, where group size ranged between 3 and 4. To minimize crossover problems during discussion, nine questions of similar weight were constructed such that each group had its own question, but the same nine questions were used in all three classes. This replication had no effect among the classes, since each class was assigned a separate lab and the discussion forum was conducted at the same time in all three classes. A discussion forum was preferred because forums are a powerful tool in Moodle which allow course participants to post messages and reply to each other online.
The following assessment tools were used to mark the discussion forum:
1. Rating tool in Moodle. This is an assessment tool in Moodle which allows an instructor or a student to award a mark to a post (a new post or a reply) in a discussion forum in the form of a rating. Moodle offers several aggregation types: (a) average rating, (b) count of ratings, (c) maximum rating, (d) minimum rating, and (e) sum of ratings. The ratings are aggregated using the selected type to produce the final individual grade for that activity. The sum of ratings aggregation type was used, in which all ratings are added to calculate the activity grade, which cannot exceed the maximum scale for the forum. It was selected because it can assess the quality and quantity of posts at the same time.
2. Learning Analytics Enriched Rubric (LAe-R). This is an assessment rubric tool which contains enriched criteria and grading levels that are associated with data extracted from the analysis of learners' interaction and learning behavior in an online discussion forum. LAe-R has been developed as a plugin for the Moodle learning management system and has been tested and shown to be a very usable tool that is highly appreciated by teachers and students in evaluating online collaborative learning tasks (Dimopoulos, Petropoulou, Boloudakis, & Retalis, 2013). In forums, the tool analyzes and visualizes data such as forum posts (new or reply messages) and the number of files attached to a forum post.
This tool was used to assess the quantity of posts sent by an individual in terms of logins, new posts, replies, and file attachments, thereby providing assessment scores on how active a student was during the discussion period. The tool was preferred because it required minimal involvement of the instructor and included a number of parameters for assessing the individual participation level in the forum. The tool was downloaded and installed in Moodle as a plugin and then integrated as an advanced assessment tool for the forum. The scaling of marks on each parameter was discussed among the instructors, and the final score was agreed as 10 marks. Table 4 describes the marking criteria adopted for the rubric analytic tool.
The second test was given in the form of a quiz consisting of 10 multiple choice questions constructed to examine the expected solutions in the discussion forum. This test was meant to measure an individual's knowledge comprehension and knowledge construction during the discussion forum. The quiz was made available online immediately after the discussion forum session was closed. Each student was given a single attempt for each item and was required to finish the 10 questions in the quiz within 30 minutes. The marking and scoring of this quiz was automated, but students were not informed of their scores at this juncture, as they had to do another test. This was to avoid poorly scoring students being less motivated in the third test.
The third test was a written test which was constructed to test individual knowledge comprehension through short answers and easy questions. The test had a weight of 20 marks, and the tested items were based on the discussion forum. The test was administered immediately after the quiz, and students were allocated one hour to do it. Since the test was not meant to test memorization, students were allowed to refer to the short notes they had prepared during the discussion session. This ensured that students who had discussed a lot and arrived at the right solutions had a higher chance of scoring high if they prepared good notes from the discussion. The test was marked later using a marking scheme constructed by the three instructors, and the allocation of marks to each item was also agreed among the three instructors.

Posttest Validation
Before the posttest was given to the participants the following measures were taken to enhance validity:
1. All the three instructors were involved to provide different expertise when setting the questions and checking content validity.
2. Quiz multiple choice questions were reshuffled dynamically by the system to avoid copying of answers among students.

A trial of the posttest was done with a group of second year computer science students in Kenyatta University who were doing a similar course through Moodle. It was found that most students were not able to attempt the discussion questions and a majority requested more examples in order to understand the concept. This prompted the instructors to give more examples on similar concepts.

Post Study Questionnaire
The purpose of this questionnaire was to collect data on the students' experiences of the group task. These experiences were grouped into the categories summarized in Table 5. Nineteen items in the questionnaire were closed-ended and three were open-ended. Google Docs was used to construct the questionnaire, which made it easier to make the questionnaire available online to the respondents.
Validation of the instrument. To ensure validity, content-related evidence was used, and two experts in e-learning were asked to review the content and the format of the questionnaire. Based on their comments, some of the items were rephrased, more items were added, some content was enriched, and reformatting was done as recommended. The questionnaire was also pretested with a group of second-year computer science students who were doing a similar course through Moodle. About fifty students were selected and emailed the questionnaires, which they completed online. The 5-point Likert scale items had satisfactory reliability (Cronbach's coefficient alpha = 0.86; Nunnally, 1978).
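For readers unfamiliar with the reliability coefficient reported above, the sketch below shows how Cronbach's alpha is conventionally computed from item scores: alpha = k/(k-1) * (1 - sum of item variances / variance of respondents' total scores), where k is the number of items. The function name cronbach_alpha and the example responses are hypothetical and are not the study's data; the study itself used SPSS.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item-score lists.

    responses: one list per respondent, each holding that respondent's
               scores on the same k Likert items (k >= 2).
    """
    k = len(responses[0])
    items = list(zip(*responses))          # regroup scores by item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 5-point Likert responses, for illustration only
example = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
alpha = cronbach_alpha(example)
```

Values closer to 1 indicate that the items vary together, which is why a coefficient of 0.86 is read as satisfactory internal consistency.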
Data collection and analysis. The pretest and posttest results were archived in the Moodle database. One-hundred eight students were emailed the final questionnaires that were completed online.
A total of 90 students responded (83% response rate) which was considered adequate for analysis. The collected data were exported to SPSS and coded in order to carry out both descriptive and inferential statistics as per the research objectives. Using SPSS, quantitative analysis was carried out and the results were tabulated. To compare the students' experiences with different group formation methods, crosstabulations were carried out on various items as per the research questions.

One way ANOVA on Pretest and Posttests
The ANOVA results shown in Table 7 indicate that the Sig. values (p) for the pretest and posttests were above the alpha value (0.05). Therefore, there was no statistically significant difference in the mean scores for any of the tests between the three classes.
Note. Sig. = the value to be compared with the alpha value (0.05)
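As an illustration of what a one-way ANOVA computes, the sketch below derives the F statistic from between-class and within-class variation. The scores are hypothetical and the function name one_way_anova_f is an assumption; the study's analysis was run in SPSS. Converting F to the Sig. (p) value requires the F distribution, which is outside the Python standard library, so the sketch stops at F and its degrees of freedom.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for two or more groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Variation of the class means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own class mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three classes, for illustration only
f_stat, df_b, df_w = one_way_anova_f([6, 7, 8], [7, 8, 9], [8, 9, 10])
```

A small F, and hence a p value above 0.05, indicates that the class means differ by no more than would be expected from the variation within classes.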

Post-study Questionnaire
Demographic information. A total of 90 of the 108 students who had participated in the study responded, with class one having 29 respondents, class two 29, and class three 32. There was a gender imbalance, as 75% were male and 17% female. The low percentage of female participants was expected because the study was based on students taking a computer science course, which had few female students enrolled. Table 8 summarizes the demographic data.

Note. n=90
Problems experienced during the group task. Table 9 summarizes the frequencies of the observed problems in terms of the class mean and the overall mean. Participants who experienced problems in individual contribution imbalance and problems with negotiation skills were fewer in class two than the other two classes. However, as observed from p values there was no statistically significant difference on the problems experienced during the group task between the three classes.
Note. The mean is equivalent to the proportion of yes responses. The p value = significance of difference between class one, class two and class three: *p<0.05
Group outcomes. Table 10 summarizes the group outcomes which include: (a) effectiveness of the group discussion as a learning tool, (b) effectiveness of the group leader, and (c) group task satisfaction.
Note. Ratings are based on a 5-point Likert scale where 1 = strongly disagree and 5 = strongly agree. p value = significance of difference between class one, class two and class three: *p<0.05

Experiences during the group task.
Through an open-ended item, the participants were asked to briefly explain the best and worst experiences they had during the discussion period. The results for best experiences were coded into seven items, which are shown in Figure 2, and those for worst experiences were coded into eight items, which are shown in Figure 3. The largest share of students reported that learning from peers was a good experience (27%), and some reported that it helped them understand the concepts studied (11%). For the worst experiences, slow access to the site or a slow internet connection (36%) was the major problem.
It was fantastic moment since I was able to learn a lot from my peers who are doing the same course as me since people who could not contribute on face to face discussion group may be due to lack of confidence and may be didn't know how to express themselves in front of people contributed and it was just surprising to see how they had good ideas which really helped a lot during discussion.
having the lecturers summarized notes online made learning easier and peaceful
Understanding the concept
I was able to understand the topic under discussion better than when I came in.
I experience the most effective way of learning, it built my knowledge on online skills
Learning experience was interesting
Discussing the subject matter and giving views.
The chance I got to interact with the other members in that platform was really good.
It was better than face to face discussions because I could research by myself and post to the group.
It was new, enjoyable. I got to learn about AI more than I did individually
Social Interaction and exchange of ideas online
During the online discussion, I manage to gain a lot since we were able to openly post. Question and discuss the possible answers in length unlike when we are in class. More so the discussion group minimal enough for effective discussion, furthermore those I did not know I was able to know them better

Discussion
We discuss the results of the study based on three research questions.

Research Question 1 (RQ1)
Which group of learners amongst the intelligently-grouped (class two), randomly-grouped (class three) and instructor-grouped using GPA (class one) performs best in an online collaborative environment task?
From the ANOVA analysis shown in Table 7, there was no statistically significant difference between the means of any of the posttest scores; therefore, there was no statistically significant difference among the classes. This suggests that the intelligent grouping algorithm is about as effective as the random assignment and GPA instructor-based grouping mechanisms. Therefore, the intelligent grouping algorithm was able to generate heterogeneous groups whose members have diverse backgrounds, including collaboration competencies, learning capabilities and social background, similar to what has been shown for random assignment. However, the method of group formation had a slight effect on the mean scores in all posttest scores. The differences in minimum scores could account for this slight difference in the mean scores. For example, in the quiz, class two had the highest median score (9.0) but a minimum score of 2.8. This low minimum score could have reduced the mean.
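The comparison described above can be illustrated with a minimal sketch of a one-way ANOVA across three classes. The scores below are hypothetical and are not the study's data; only the class two median (9.0) and minimum (2.8) echo the values quoted in the text. The function names are assumptions made for this illustration, and the closed-form p value applies only when exactly three groups are compared.

```python
from statistics import mean, median


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own class mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def p_value_for_three_groups(f_stat, df_between, df_within):
    """Upper-tail F probability; this closed form holds only for df_between == 2."""
    if df_between != 2:
        raise ValueError("closed form is valid only for three groups")
    return (df_within / (df_within + 2.0 * f_stat)) ** (df_within / 2.0)


# Hypothetical quiz scores (NOT the study's data).
class_one = [7.0, 8.0, 6.5, 9.0, 7.5]    # GPA-based grouping
class_two = [9.0, 9.5, 2.8, 9.0, 8.5]    # intelligent grouping
class_three = [7.5, 8.0, 7.0, 8.5, 6.0]  # random grouping

f_stat, df_between, df_within = one_way_anova([class_one, class_two, class_three])
p_value = p_value_for_three_groups(f_stat, df_between, df_within)

print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.3f}")
print(f"class two: median = {median(class_two)}, mean = {mean(class_two):.2f}")
```

In this made-up sample the p value is well above 0.05, so the class means would not be judged significantly different, and the single low score in class two pulls its mean below its median, which is the effect the paragraph above describes.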

Research Question 2 (RQ2)
What is the association between grouping method used and group outcomes in terms of: (a) students' learning experiences, (b) perceived problems, (c) group leadership satisfaction, and (d) group task satisfaction?
Findings from the study indicate that two major problems were experienced in all three classes: first, an individual contribution imbalance, with some members contributing less than others (52%); and second, a lack of participation feedback (48%). This coincides with other studies in which these two problems prevail in online collaborative learning environments (Capdeferro & Romero, 2012; Liu, Joy, & Griffiths, 2010; Muuro, Wagacha, Oboko, & Kihoro, 2014; Roberts & McInnerney, 2007; Zorko, 2009). The proportions of yes responses on the two major problems differed across the three classes. The GPA assignment method (class one, 66%) had more participants experiencing contribution imbalance than the intelligent grouping algorithm (class two, 41%) and random assignment (class three, 50%). This could probably be explained by the fact that the GPA method assigned students to groups based on their academic performance, such that each group had a student with a higher GPA. These students with higher GPAs could have dominated the discussion because they were more knowledgeable than the others, causing contribution imbalance. On the other hand, the intelligent grouping method had the fewest participants experiencing this problem. This could probably be explained by the fact that this method grouped students based on their collaboration competence levels, such that each group had at least one student with high collaborative competence. These students could have pulled the team together and made members collaborate more evenly, with minimal contribution imbalance. However, despite these differences, no statistically significant relationship was found among the three classes in group problems as per the p values (see Table 9).
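To illustrate how proportions such as 66%, 41% and 50% can differ without reaching statistical significance, the following is a minimal sketch of a chi-square test of independence on yes/no counts. The paper does not state which test produced the p values in Table 9, so the choice of test here is an assumption, as are the class size of 32 and the resulting counts, which were picked only so that the proportions match those quoted above.

```python
import math


def chi_square_independence(table):
    """Return (chi2, dof) for a table whose rows are classes and columns are [yes, no]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if class and response were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof


def p_value_for_dof_2(chi2, dof):
    """Upper-tail chi-square probability; exp(-x/2) is exact only for dof == 2."""
    if dof != 2:
        raise ValueError("closed form is valid only for two degrees of freedom")
    return math.exp(-chi2 / 2.0)


# Hypothetical counts (NOT the study's data): 32 students per class is assumed.
contribution_imbalance = [
    [21, 11],  # class one (GPA): about 66% yes
    [13, 19],  # class two (intelligent grouping): about 41% yes
    [16, 16],  # class three (random): 50% yes
]

chi2, dof = chi_square_independence(contribution_imbalance)
p_value = p_value_for_dof_2(chi2, dof)

print(f"chi2({dof}) = {chi2:.3f}, p = {p_value:.3f}")
```

With these assumed counts the p value stays above 0.05, which shows how a visible gap in percentages can still fall short of significance when classes are small.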
As observed from Table 10, all the items for evaluating the effectiveness of the discussion forum as a learning tool were positively rated, with some having an above-average overall mean value.
This coincides with other studies on the constructivist approach to learning, in which peer learning has been reported to be more effective than individual learning in helping learners to interpret, clarify and validate their understanding through constructed dialogue and negotiation with their peers (Garrison, 1993).
Furthermore, this supports the view that discussion forums support e-learning by enabling learners to actively construct knowledge by formulating ideas into words that are shared with and built on through the reactions and responses of their peers in the forum (Harasim, Hiltz, Teles, & Turoff, 1995). Although there was a slight mean difference in the learning experiences across the three classes, none of the p values for these items in Table 10 was less than 0.05; hence, there was no statistically significant relationship between the group formation method and the learning experience outcome.
Therefore, the study found that the learning experience outcome was similar for all learners regardless of the group formation method.
On the effectiveness of the group leader, the mean values ranged from 2.88 to 3.52; students positively recognized the roles played by the peer group leaders, with the highest-rated item being enjoyment of working together and the lowest-rated being summarization of the group's discussion. Groups formed using GPA (class one) had their group leaders assigned using GPA, with the student with the highest GPA in the group being assigned the role. The intelligently grouped class (class two) had its leaders drawn from cluster one, which contained the most collaborative students according to collaboration competence level. In the randomly grouped class (class three), group leaders were assigned randomly.
Regardless of the group formation and group leader assignment method, group leaders agreed that they enjoyed playing the leadership role and that this motivated them to read widely. Group members also enjoyed the role played by their leaders, but they acknowledged that most of the group leaders were unable to summarize the group's discussion. This indicates that some roles, such as summarizing and drawing conclusions in a discussion, are more difficult for a group leader to fulfil. Furthermore, there was a statistically significant relationship between the group leader summarization role and the group formation method (class type), with p = .020 (significance of difference between class one, class two and class three: p<0.05).
On group task satisfaction, all the items were positively rated in all three classes. Members enjoyed working in groups, and more specifically peer learning, where they were able to criticize one another and reach a consensus. The group size of four students per group was felt to be effective, and most students recommended more group work in future studies with the same group membership, with a few citing a need for a change in group membership to get new experiences and exposure from new members.
These group task outcome experiences were felt almost similarly in all classes regardless of the group formation technique. Accordingly, the study did not find any statistically significant association between the group formation method and group task satisfaction, as none of the corresponding p values in Table 10 was less than .05. These outcomes coincide with other studies which found that when group work learning is shifted from teacher control to student peer groups, it helps learners to acknowledge their dissent and disagreements and to share their doubts (Bruffee, 1999). In addition, students become co-constructors of knowledge rather than consumers.

Research Question 3 (RQ3)
What are the students' perceived benefits of online collaborative learning?
From the cited examples of benefits reported by the participants (see Table 11), students' responses confirmed that online collaborative learning has a number of benefits: peer learning, which provides a platform to freely criticize others' work and offer alternatives, making the learning process enjoyable; a platform for social interaction and exchange of ideas; and a better opportunity for understanding concepts which are difficult to learn individually. These cited benefits correspond to the advantages of the constructivist theory of learning (Palloff & Pratt, 1999) and to the benefits of online collaborative learning observed in other studies.

Conclusion
Results from this study provide empirical evidence on the capability of an intelligent grouping algorithm to group students in a desirable manner which provides learning opportunities among peers similar to those realized through random assignment and GPA instructor-based methods. In addition, this intelligent grouping method guarantees heterogeneity based on learners' collaboration competence level, unlike the random assignment method, which only increases the likelihood of heterogeneity in the group.
With the understanding that the GPA group formation method involves the instructor and may not be dynamic, instructors are more likely to adopt our intelligent grouping method, as the findings show that both have similar results. Overall, it appears that the intelligent grouping algorithm provides an added advantage in supporting group formation due to its guarantee of heterogeneity, dynamism, and less instructor involvement.
The positive findings on the role of group work as a learning tool from the students' perspective inform instructors of the importance of including collaborative work in instructional design. In addition, the positive findings suggest that group learning gives students with poor individual learning skills an opportunity to improve their learning. This enhances the overall quality of e-learning and increases learners' confidence.
Further research should explore how online collaborative learning can be made more effective by examining the instructors' role in supporting group work, perceptions of group work, and level of experience in conducting collaborative learning. This could also shed more light on how to improve the quality of online collaborative learning. Future studies could also consider examining the effectiveness of collaborative learning in enhancing students' learning skills and improving the level of knowledge constructed in blended e-learning platforms.