Educational Research: The Gold Standard?

10 Feb


The power of blogging lies in a community of people with similar interests weighing in on a single subject from different perspectives.  As the information is generated, worked over and cooked, a sort of cross-pollination occurs in which an idea gets improved upon.  This is more than just criticism, though criticism can be part of the process.  It is about offering differing points of view and contributing different perspectives.


NCLBlog opened up a can of worms with their post on research.  John basically points out how differing interests manipulate research claims and studies to suit their own purposes.

Each side cites studies that support their claims while they refute the opposing studies for flawed methodology or being insignificant.


No kidding.


Educational research, under the best of conditions, is difficult.  The randomized group designs advocated by NCLB as the gold standard have their roots in the methods we use when testing the yield of various agricultural crops or the performance of animals.  For instance, if I want to test the effectiveness of weed control measures, I randomly assign different plots of crops to the experimental or control conditions.  Otherwise, all the plots get treated the same as far as weather, fertilizer, hours of daylight and pests.  The crops are monitored and observations are made throughout the growing season, and a person might be able to see the difference visually if the results are remarkable enough.  But the telling evidence is in the yield, when the crops are harvested.  If there is a significant difference in yield between the experimental plots and the control plots, then we might attribute it to the independent variable, which in this case is weed control.
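
For anyone who wants to see the bare bones of that design, here is a rough sketch in Python.  Everything in it is made up for illustration: the function names, the 40 plots, and the yield numbers (a base of 50 bushels, a 5 bushel boost from weed control, and some random noise) are my assumptions, not data from any real trial.  It only shows the two steps described above: shuffle the plots into two groups, then compare average yield at harvest.

```python
import random
import statistics

def assign_plots(plot_ids, seed=0):
    """Randomly split the plots into an experimental half and a control half."""
    rng = random.Random(seed)
    shuffled = list(plot_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def simulate_yield(rng, treated, base=50.0, effect=5.0, noise=4.0):
    """Hypothetical yield for one plot; all numbers are invented."""
    boost = effect if treated else 0.0
    return rng.gauss(base + boost, noise)

def run_trial(n_plots=40, seed=0):
    """Return mean experimental yield minus mean control yield."""
    rng = random.Random(seed)
    experimental, control = assign_plots(range(n_plots), seed=seed)
    experimental_yields = [simulate_yield(rng, True) for _ in experimental]
    control_yields = [simulate_yield(rng, False) for _ in control]
    return statistics.mean(experimental_yields) - statistics.mean(control_yields)

if __name__ == "__main__":
    print("Difference in mean yield:", run_trial())
```

Because the only systematic difference between the two groups is the treatment, a large enough gap in average yield can reasonably be credited to it.  That is the whole appeal of the design.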


The problem with using this method of research on students is that they are not plants, which are relatively easy to control.  Plants don’t ride home on a bus at the end of the day and enter a myriad of different environments that can affect educational performance.  They are not refugees from a hurricane, for instance.  Another problem, and this is even more critical, is that the random assignment of students yields results of sufficient statistical power only if the groups are large.  This is fine in a wheat field where one plant is pretty much like another.  Each wheat plant only represents a handful of grain.  But each individual student represents a life span much longer than that of any agricultural commodity and a potential resource to a family, neighborhood or community much greater than an entire field of wheat.  With large groups, statistical significance is measured only in terms of the aggregate, as if each student is a data point and might as well be a bushel of grain.
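
To put the point about group size in concrete terms, here is a small simulation sketch.  The assumptions are all mine: scores are drawn from a normal distribution, the true effect is 0.3 standard deviations, and significance is judged with a simple two-sample test using 1.96 as the cutoff (a rough stand-in for a proper t-test, and a bit generous for very small groups).  The function names are hypothetical.  It estimates how often a study of a given size would detect the effect at all.

```python
import random
import statistics

def is_significant(group_a, group_b, critical=1.96):
    """Rough two-sample test; 1.96 approximates a two-sided 5% level."""
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    standard_error = (
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    ) ** 0.5
    return abs(diff / standard_error) > critical

def estimate_power(n_per_group, effect_sd=0.3, trials=500, seed=0):
    """Share of simulated studies that detect an assumed true effect."""
    rng = random.Random(seed)
    detections = 0
    for _ in range(trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_sd, 1.0) for _ in range(n_per_group)]
        if is_significant(treated, control):
            detections += 1
    return detections / trials

if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, "students per group, estimated power:", estimate_power(n))
```

Under these assumptions, small groups miss a real effect most of the time, and the detection rate climbs as the groups grow.  Notice that nothing in the calculation knows or cares which student is which.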


I’ll give one more flaw in this methodology, and it is an ethical one.  Suppose I study a group of 200 students who are behind in reading, testing some type of new reading instruction designed to accelerate the reading abilities of fifth graders.  Students are randomly assigned to two different groups.  One group of 100 receives the new type of instruction from specially trained teachers.  The other group is the control group and receives whatever instruction is regularly given.  At the end of my study, I discover that my experimental group increased their reading ability by an entire standard deviation over the control group.  Woohoo!  High fives all around!  Right?  Well, yes.  And no.  What happens to those 100 students randomly assigned to the control group as they head off to middle school?


And this is considered the “gold standard” of NCLB?


This is an often-ignored flaw in educational research where groups are randomly assigned.  In a pretest/post-test design, large spans of time elapse between tests.  Through no fault of their own, the control group gets left…behind.  No one cares if an individual wheat plant is assigned to a condition that offers or denies a variable that might radically improve its yield.  But for parents, each child represents an investment that is significant beyond anything that can be measured by statistics.


This is not to say all group designs are unethical or useless.  But the more statistical power a group study has, the less meaning it has for any given individual.  This simply goes against the grain (oops, no pun intended!) of most teachers who know each student by name.  None of us want to leave any of them behind.  Just something to keep in mind during the debate over which study is or is not valid.