Wednesday 4 July 2012

Education Emptiness

I have read too much Roger Schank. And I've been thinking too much. This is something best avoided, especially when marking student essays. Why? Because it makes you question what it is we're doing in Higher Education.

Assessing Essays

Marking essays is a soul-destroying task, as you have too little time to spend on each essay, and you have a large pile of essays to process.  Most students spend days and weeks on preparing their essays, so it always feels wrong to read and assess them in about half an hour, including writing up your feedback. This is very unsatisfactory, but otherwise one simply cannot turn around the marking in the allocated time.

But the worst thing about marking is its reductionist nature. An essay is a complex piece of writing, composed of style, argument, expression of knowledge, understanding, interpretation, analysis, discussion, etc. And all these different dimensions get conflated into a single point on a one-dimensional scale: a grade between about 40 and 70. This is just not right.

Many essays end up having the same numerical grade assigned to them, but they are not really comparable. One student might write eloquently but superficially, another provides deep insights with terrible grammar. One student might have a great idea, but not much understanding of the underlying concepts. Another one has solidly learned all that was required but lacks the creativity to apply the principles to a given problem. Yet, they all get the same numerical value. Different feedback, sure, but that does not really count for anything.

School children get detailed reports (besides a few simple letter grades), but in HE there are simply not the resources to do this, as there are too few staff and too many students, unless you are in Oxbridge. In principle that should not be an insurmountable problem.

You're doing it wrong!

After grading, a mathematical crime is committed: the numerical grades are added up and averaged. This is simply not possible. The numbers are not numbers, they are labels that look like numbers. If we assigned the essays letters, then it would be more obvious: what is the average of A and B? But that is a completely different issue to be discussed on another day...
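To make the point about labels concrete, here is a small sketch in Python. Everything in it is made up for illustration: the two students, their letter grades and both number codings are hypothetical. It only shows that when grades are ordered labels, the outcome of averaging depends on which numbers you happen to attach to the labels.

```python
from statistics import mean

# Hypothetical essay grades for two students, as ordered labels (A > B > C).
student_x = ["A", "C", "C"]
student_y = ["B", "B", "C"]

# Two hypothetical codings; both respect the order A > B > C.
coding_1 = {"A": 70, "B": 62, "C": 50}
coding_2 = {"A": 80, "B": 60, "C": 50}


def average(grades, coding):
    """Average the numbers that a coding assigns to a list of labels."""
    return mean([coding[g] for g in grades])


def higher_average(coding):
    """Return which student comes out ahead under the given coding."""
    x, y = average(student_x, coding), average(student_y, coding)
    if x == y:
        return "tie"
    return "x" if x > y else "y"


print(higher_average(coding_1))
print(higher_average(coding_2))
```

Under the first coding student y comes out ahead, under the second it is student x, although nothing about the essays has changed. Both codings preserve the order of the labels, so neither is more "correct" than the other.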

Essentially, then, after a lot of adding and averaging, the whole three years a student spends at university is reduced to a single label again, the degree classification. This is again an enormous reduction of a multitude of information into a single point out of four. And this point decides what possible career a student can then pursue...

As the degree class is so important (and expensive, especially for the incoming cohorts of students), this tends to be at the forefront of students' minds. This is of course a wild generalisation, and there are many exceptions, but in my experience a lot of students are primarily interested in getting good grades. Learning becomes secondary, and only the means to the end of achieving good grades. That means curiosity, a central ingredient for successful learning, suffers, or rather, is redirected into finding out how to get grades. One cannot really blame students for trying to game the system, which is essentially what they learn to do in the end.

It does work elsewhere...

Postgraduate work is different, though, as it is less regimented. And, more importantly I think, PhD students do not get a grade. It's pass or fail. You either get a PhD, or you don't. There are, of course, differences: you could get through with major corrections, minor corrections, or no corrections. But nobody will know whether you scraped through with a 'revise and re-submit' or sailed through without any required corrections. If it works for PhDs, why not for UGs as well?

The problem is that we've got too many undergraduates, so there needs to be some differentiation. But why? Who wants it? Presumably those who employ graduates, so that they can see who is better or worse. But does the degree class really reflect vocational ability? I would doubt that, but in the end it is just another filter to reduce the 52 applications for each graduate job to a manageable number. With PhDs this is not so much of an issue, as you can get a more rounded picture by looking at their previous grades or even publications.

In-Conclusion

So what is the solution to this dilemma? It basically requires a system change, which is probably not feasible. Employers want differentiation, universities want to climb up the league tables (which nowadays tend to include employability metrics), and students want to have something that distinguishes them from the crowd. But in the process, education suffers. Learning is not really the focus of HE, and we're just churning out graduates who are good at spotting what is needed to get a good grade and doing just that.

We're assessing far too much, and it destroys what I think universities are all about: expanding your horizons, applying your knowledge and curiosity to interesting problems, being able to fail tasks without jeopardising your future career, and generally maturing and learning stuff.

Thursday 12 January 2012

On-line Education?

Last term I completed the on-line module in Artificial Intelligence offered by Stanford's Sebastian Thrun and Google's Peter Norvig, both top academics and experts in their field. I guess it was successful, as I received a grade of 79% (a 'first' in UK terms, but I have the suspicion it doesn't work like that). Given the minimal effort I put in (mainly due to lack of time) I could very likely have achieved a better result with some extra work. But with a full-time job it's not so easy to put aside 10 hours a week for doing so, which was the amount of time recommended by the course leaders.

So I got 79%, but did I learn anything? And does the 79% reflect my achievements? And what was the overall learning experience like?

First, the learning experience: the module was delivered as a series of short low-tech video lectures, interspersed with multiple-choice or number-entry quizzes. Then there was homework (multiple-choice and number-entry quizzes) and a mid-term and final exam (both multiple choice and... you get the idea).

The lectures were interesting: it was a camera view top-down on a piece of paper on which the lecturers would (hand-)write, not just a filmed 'lecture'. The tone was informal and friendly, and Thrun's charming German accent made me almost feel at home. And I also learned–from the few head-shot video sequences–that Peter Norvig likes colourful shirts.

The quizzes, however, were rather limited. There was the problem of turning quite complex material into a simple format, and also (which I found hardest) missing context. As a result the questions were often about trivial side-aspects, or impossible to answer due to ambiguity (judging from the few forum posts I looked at, many other people had the same issue). You can interpret a question in many different ways, especially if you need to take into account external constraints which have not been clearly specified.

Quite often you get an answer wrong, and then look at the explanation of the proper outcome, and you think "oh right, that's how they meant it".

I quite struggled with Bayes networks, and I consistently got the wrong answers when asked how many independent parameters I would need to describe one. To this day I do not know why I need to know this. I can guess, but it wasn't really explained. Formal logic was one of the things I felt very comfortable with, as I had covered that in my own UG studies as a computational linguist, but I only got 1 out of 4 points in the final exam question, as I made one small error; based on that the subsequent answers were also wrong.
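Going back to the Bayes network question: one plausible reason the count matters (a guess, since the course did not spell it out) is that it measures how much more compact a network is than the full joint probability table. The sketch below assumes the standard counting rule: a node with r possible values needs (r - 1) independent numbers for every combination of its parents' values. The network used is the textbook burglar-alarm example, chosen here purely as an illustration and not taken from the course quizzes; the function name and data layout are likewise my own.

```python
def independent_parameters(parents, arity=None):
    """Count independent parameters of a Bayes network.

    parents: dict mapping each node to a list of its parent nodes.
    arity:   optional dict mapping a node to its number of values
             (nodes not listed are assumed to be binary).
    """
    arity = arity or {}
    total = 0
    for node, node_parents in parents.items():
        # One conditional distribution per combination of parent values.
        combinations = 1
        for parent in node_parents:
            combinations *= arity.get(parent, 2)
        # Each distribution over r values has r - 1 free numbers,
        # because the probabilities must sum to one.
        total += (arity.get(node, 2) - 1) * combinations
    return total


# Illustrative network with five binary variables.
alarm_network = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}

network_count = independent_parameters(alarm_network)
# Full joint table over five binary variables, minus one for normalisation.
full_joint_count = 2 ** len(alarm_network) - 1

print(network_count, full_joint_count)
```

For this example the network needs 10 numbers (1 + 1 + 4 + 2 + 2), whereas the full joint table over the same five variables would need 31.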

My best results were in computer vision: 100%. And that's even though I'm short-sighted! But do I really understand computer vision so much better than all the other areas of AI? No. Thing is, all that was asked in the relevant quizzes was basic maths. There was a simple formula, relating various parameters such as focal length and distances to each other, and all you had to do was solve the equation for different values and work out the result. I would have been able to do this beforehand, and didn't even learn that in the course. Still, I was assessed on it and scored 100%. But anybody with basic maths would have been able to do that, even without watching a single minute of any of the class videos.

So my first criticism is: the quizzes were not designed properly. There is a lot more one can do with multiple-choice questions, but Thrun and Norvig didn't do it. The assessments felt like an ad-hoc addition, along the lines of "I need a quiz now, so what could I ask?".

My second criticism is the way the scoring worked. One slight mistake, nil points. In a real exam you would get points for results which are wrong, but only because of a mistake in an intermediate step. An all-or-nothing approach is not very helpful.

Is this the future of education? Are on-line classes like this all we need? I don't think so. Apart from the implementation–it'd be easy to come up with some better quizzes–it's also quite detached. There is little direct interaction (impossible with 140,000+ students), and at times you feel a bit lost. It is obvious that this was an experiment, and as such it is not possible to expect wonderful and perfect results, but there is still a long way to go.

Did I learn anything I would not have learned from reading a book? Probably not. The main advantage for me was to have the pressure of getting through the weekly session before the hand-in date, which makes you put aside time you would otherwise spend on something else. So in that respect it is alright; and the fact that it was delivered on-line was convenient as you could choose the time when you wanted to study it. But while this is good for a supplementary course, I am glad I did have proper seminars and lectures when I went to university.

While you can't argue with a free course (you did get more than you paid for!), there is still a lot of scope for improvement for this particular type of course, an on-line distance course, and I cannot see it replacing 'proper' seminars any time soon. But it was overall an interesting experience, if only to find out what 'real' teaching should be like.