Errors tend to “multiply” from level to level, and as a result the quality of the analysis is so poor that nobody even mentions the desired end result, machine translation. However, one can do without “linguistic information” and apply statistical methods alone. This was tried in the early 1960s, and the results were not encouraging. Statistical methods were revisited some 30 years later, and this time many researchers were surprised. IBM published results from the Candide project (English-French language pair) showing that half of the translated phrases were absolutely correct (an exact match).
Verbmobil began in 1993 and ended in 2000, bringing together 22 universities and 7 companies. The languages of translation were English, German, and Japanese. Moreover, the task was to translate speech rather than text, which meant first digitizing the dialogues and recognizing words and sentences. Word recognition is made difficult by humming and hemming and, interestingly, it turned out that in conversation people sometimes repeated the same word several times in a row, or simply spoke ungrammatically.
While many earlier machine translation projects were rule-based, Verbmobil opted for a hybrid approach. One candidate translation is generated using multi-level linguistic analysis and another using statistical methods, and the best one is selected. According to translators' assessments, 74.2% of 25,000 translated examples were translated correctly. In other words, statistical methods are easy to apply, although the translation is not always apt, whereas using semantics takes a lot of time but produces a higher-quality translation. Yet some project participants claim that it was the statistical methods that “pulled the project through”. It would therefore be especially interesting to see detailed results for each subsystem separately: the one that used linguistic analysis and the one that used statistical methods. This is interesting because, for the first time, deep analysis was carried out for three languages, from word and sentence recognition through to discourse semantics, using modern formalisms, in particular HPSG for syntax and Discourse Representation Theory (DRT) for discourse. And, of course, it would be important to assess how much implementing and deploying deep linguistic analysis improves the quality of machine translation.
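The hybrid selection step mentioned above (one candidate from linguistic analysis, one from statistical methods, the best one chosen) can be illustrated with a minimal sketch. The text does not say how Verbmobil decided which candidate was best, so the `score` field, the `Candidate` type, and the `select_best` function are hypothetical names introduced purely for illustration, and the example sentences and scores are made up.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    engine: str   # which subsystem produced the translation
    text: str     # the candidate translation
    score: float  # hypothetical quality estimate in [0, 1] (an assumption)


def select_best(candidates):
    """Return the candidate with the highest (assumed) quality score."""
    if not candidates:
        raise ValueError("no candidate translations")
    return max(candidates, key=lambda c: c.score)


# Illustrative, made-up candidates for a single utterance.
candidates = [
    Candidate("linguistic", "Let us meet on Monday.", 0.81),
    Candidate("statistical", "We meet Monday.", 0.67),
]

best = select_best(candidates)
print(best.engine, best.text)
```

Keeping the engine name alongside each candidate would also make it possible to tally how often each subsystem wins, which is the kind of per-subsystem breakdown the text says would be interesting to see.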