To this day, nobody is quite sure who sent the tape reel or whether they were authorized to forward it to IBM. But computer scientists there started experimenting with the data it contained, wondering if they could use it to assist in their efforts to develop a method for automated translation.

Most researchers working on computer translation at that time primarily understood it as a linguistic problem: solving it would mean uncovering how a given language was structured. Computerized translation efforts were therefore based on attempts to deeply analyze the grammars of two languages and then program complex sets of rules that would tell a computer how to transform one of those languages into the other. But IBM researchers had a different idea. They wondered what would happen if they simply looked at translation as a probability calculation, examining the frequency with which words appeared, and in what order, in any given language: sheer mathematical guesswork. Rather than a linguistic art, IBM saw translation as a matter of statistical optimization.

Running a probability analysis like that, one robust enough to map one entire language onto another solely on the basis of word frequency and order, requires a huge data set. Today, a computer can scan a book that contains side-by-side translation and extract its text relatively easily; in the 1980s, nothing remotely like that was possible. IBM researchers had a theory they had no ability to test. Then, the Hansard data arrived at their door. As IBM wrote, “We have chosen to work with the English and French languages because we were able to obtain the bilingual Hansard corpus of proceedings of the Canadian parliament.”

Several years later, Canadian computer-translation researchers sat, shocked, as they listened to an IBM team describe for conference audiences a revolutionary new translation method the company had developed using, as quoted in their published paper, “our Hansard data.” “We were all flabbergasted,” recalled Pierre Isabelle, a computer scientist who had been active in the field since 1975. Describing the conference for a 2009 article in a linguistics journal, Isabelle remembered people “shaking their heads and spurting grunts of disbelief or even of hostility.”