Essentially by scouring likely corpora (a much, much easier job than it used to be thanks to the Internet) for matching strings. As pointed out in Language Log, identical strings of more than five words are exceedingly rare unless they are clichés of some sort or another. If, for instance, an essay on Peoria includes the line "this fort would later burn to the ground", you can be all but certain that the author lifted it from the relevant Wikipedia article (or from the same source that its author cribbed from).
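To make the idea concrete, here is a rough sketch of that kind of matching, assuming you already have the essay and a candidate source as plain text. The function names, the six-word window, and both sample sentences are made up for illustration; real detection tools are surely more elaborate than this.

```python
import re

def word_ngrams(text, n=6):
    """Return the set of all runs of n consecutive words in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_phrases(essay, source, n=6):
    """Return the word n-grams that occur in both texts."""
    return word_ngrams(essay, n) & word_ngrams(source, n)

# Hypothetical sample texts, not real quotations apart from the shared phrase.
essay = "As everyone knows, this fort would later burn to the ground in a storm."
source = "According to the article, this fort would later burn to the ground."

matches = shared_phrases(essay, source)
for phrase in sorted(matches):
    print(phrase)
```

Any six-word run that turns up in both texts gets reported. A run that is not a cliché is worth a closer look, for the reason given above.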
no subject
Date: 2009-04-10 09:52 pm (UTC)