...making Linux just a little more fun!

Apertium-en-es 0.7.0

Jimmy O'Regan [joregan at gmail.com]


Sat, 5 Dec 2009 16:44:28 +0000

---------- Forwarded message ----------

From: Jimmy O'Regan <joregan@gmail.com>
Date: 2009/12/5
Subject: Re: Apertium-en-es 0.7.0
To: Apertium-stuff <apertium-stuff@lists.sourceforge.net>

2009/12/5 Jimmy O'Regan <joregan@gmail.com>:

> I've just released en-es 0.7.0. I'm not a big fan of release notes, so
> here's the changelog:
>
> Sat Dec  5 15:23:45 GMT 2009
>
>  * Release 0.7.0 ('More of an escape than a release')
>  * Vastly improved handling of Anglo-Saxon genitive in transfer (Mireia)
>  * 'Apostrophe genitive' (Jesus' etc.) in analysis
>  * Pre-transfer processing of apostrophes/genitive (based on an idea by
>    Jacob Nordfalk/Francis Tyers)
>  * Greatly increased vocabulary (most notably, Paul "greenbreen" Breen)
>
> In addition, this version has some generation support for the two main
> dialects of English, British and American, though for the moment it
> only handles simple spelling differences ('favour' vs. 'favor').
>
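
As a rough illustration of what dialect-aware generation of simple spelling differences can look like, here is a minimal Python sketch. It is not Apertium's actual mechanism: the table, the dialect codes and the generate() function are all hypothetical, and the lookup is an exact match that ignores capitalisation and inflection.

```python
# Hypothetical variant table: one entry per word, one spelling per dialect.
# These names and dialect codes are assumptions for illustration only.
SPELLING_VARIANTS = {
    "favour": {"en_GB": "favour", "en_US": "favor"},
    "colour": {"en_GB": "colour", "en_US": "color"},
}


def generate(words, dialect="en_GB"):
    """Return the words with dialect-specific spellings where the table has one."""
    output = []
    for word in words:
        variants = SPELLING_VARIANTS.get(word)
        if variants is None:
            # No known variant: pass the word through unchanged.
            output.append(word)
        else:
            # Fall back to the input spelling for an unknown dialect.
            output.append(variants.get(dialect, word))
    return output


if __name__ == "__main__":
    print(" ".join(generate(["in", "your", "favour"], "en_US")))
```

Keeping the variants in a table means a new dialect pair only needs new data, not new code, which is about all that "simple spelling differences" require.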

Oh, and because I like to collect amusing translation errors, this error from 0.6 is fixed in 0.7:

0.6, es->en
'Fondo Monetario Internacional' -> 'International Monetary bottom'
0.7, es->en
'Fondo Monetario Internacional' -> 'International Monetary Fund'



Ben Okopnik [ben at linuxgazette.net]


Sat, 5 Dec 2009 12:24:28 -0500

On Sat, Dec 05, 2009 at 04:44:28PM +0000, Jimmy O'Regan wrote:

> 
> Oh, and because I like to collect amusing translation errors, this
> error from 0.6 is fixed in 0.7:
> 0.6, es->en
> 'Fondo Monetario Internacional' -> 'International Monetary bottom'
> 0.7, es->en
> 'Fondo Monetario Internacional' -> 'International Monetary Fund'

[grin] Spanish, for whatever reason, lends itself to this kind of thing. Given the colloquial variations, the slang [1], and the homophones, it can be quite entertaining... and gives Spanish speakers many reasons to laugh when they hear someone learning their language.

I'm not sure if I've told this story here before, that of a friend who was married to a Puerto Rican woman. He went on vacation somewhere in South America (I don't recall where) with her and his sister-in-law. His wife caused a bit of a sensation when they were down at the beach and she yelled to him to go get her sister - "agarre mi hermana!" "Agarrar", you see, literally means 'to grab' - and is interpreted quite differently in PR and wherever it was that they were vacationing...

[1] There should be some term that differentiates between the types of slang that use new words for old meanings - e.g., "skag", "boodle", etc. - and the types that use common words and assign new/additional meanings to them. The second type in Spanish is very broad - much broader than it is in English, in my opinion - and leads to much confusion, particularly when speakers from different Hispanic cultures come together.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *



Jimmy O'Regan [joregan at gmail.com]


Sat, 5 Dec 2009 17:53:27 +0000

2009/12/5 Ben Okopnik <ben@linuxgazette.net>:

> On Sat, Dec 05, 2009 at 04:44:28PM +0000, Jimmy O'Regan wrote:
>>
>> Oh, and because I like to collect amusing translation errors, this
>> error from 0.6 is fixed in 0.7:
>> 0.6, es->en
>> 'Fondo Monetario Internacional' -> 'International Monetary bottom'
>> 0.7, es->en
>> 'Fondo Monetario Internacional' -> 'International Monetary Fund'
>
> [grin] Spanish, for whatever reason, lends itself to this kind of thing.
> Given the colloquial variations, the slang [1], and the homophones, it
> can be quite entertaining... and gives Spanish speakers many reasons to
> laugh when they hear someone learning their language.
>

Eh... not particularly, for the limited subset of Spanish that we work with.

Our English-Spanish module is aimed at 'general purpose' translation, which more or less means 'things likely to appear in a newspaper'. The same (mostly) goes for our other translation modules, though some have 'dictionary variations': a preprocessor lets us build more domain-specific dictionaries alongside the 'general purpose' one.

Homophones don't present much of a problem, as, in 'newspaper grade' text, you can more or less assume that the editing staff have caught any potential errors caused by such ambiguities. I've started to work on speech-to-speech translation recently, but haven't encountered many problems with Spanish (English, however, is extremely problematic).

Anyway, I say 'not particularly' mostly because I've also got English-Italian and English-Portuguese slowly brewing, and both of those languages have a similar set of problem areas (check out how many meanings 'pasta' has in Italian :).

> I'm not sure if I've told this story here before, that of a friend who
> was married to a Puerto Rican woman. He went on vacation somewhere in
> South America (I don't recall where) with her and his sister-in-law. His
> wife caused a bit of a sensation when they were down at the beach and
> she yelled to him to go get her sister - "agarre mi hermana!" "Agarrar",
> you see, literally means 'to grab' - and is interpreted quite
> differently in PR and wherever it was that they were vacationing...
>

:)

>
> [1] There should be some term that differentiates between the types of
> slang that use new words for old meanings - e.g., "skag", "boodle", etc.
> - and the types that use common words and assign new/additional meanings
> to them. The second type in Spanish is very broad - much broader than
> it is in English, in my opinion - and leads to much confusion,
> particularly when speakers from different Hispanic cultures come
> together.

Heh. If you ever have a few years idle, studying slang always makes for an interesting PhD thesis...

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.
