Russian Language Blog

Four Resources for Natural-Sounding Russian

Posted on Feb 20, 2014 in language

We struggle with sounding idiomatic in our second language(s). We may master vocabulary and grammar and be able to express ideas in a way that is technically not wrong, but… так не говорят (you don’t say that). Naturally, learners are encouraged to watch TV, listen to music and podcasts, and travel to the country of the language, but what if you need to form a sentence right now and can’t do any of these things?

Let’s do a case study where we take two English phrases and test them on different resources to get idiomatic Russian results. For this example, I will use a basic phrase like “I’m a good student” and a more advanced one like “cutoff date.” Of course, you can look up the individual words in a dictionary, but you have no way to know if the translation is any good and, if it is, whether it’s something people actually say.

1. Russian-speaking friend

Pros: idiomatic – most likely correct
Cons: could be regional – could be colloquial – could be grammatically incorrect – your friend may not know the answer – you can’t tell how frequently their variant is used

The obvious answer is to ask a Russian-speaking friend for a translation. This will most likely give you a fluent-sounding answer, although you have to be careful since your friend does not represent all Russian-speaking people and may not be aware of register and regional differences. Lastly, you may simply not have a Russian-speaking friend!

For the purposes of this demo, let me be your Russian friend. I would go with “я хорошо учусь” and “конечный срок,” respectively. As you can see, you have no way of telling whether these answers are appropriate for all registers and whether most Russian speakers would choose them.

2. Search engine

Pros: gives you figures – points out non-existing phrases
Cons: you need to know what to look for – search may pick up homonyms

Another way of checking whether your suggested answer is something people usually say is to type it into a search engine (поисковая система), in quotation marks if you need to check an exact phrase, to see how many hits you get. You will then have numbers to back up your choice. Moreover, if your proposed choice is off the wall, you will get few to no hits.

Try using a Russian search engine like Yandex. In my experience, Google tends to be biased towards the US, especially if your system and browser use US English. “Я хороший студент” returns 185 hits in Yandex; “я хороший ученик,” 3,000; “я хорошо учусь,” 6,000; and “я учусь хорошо,” 4,000. The results for “cutoff date” are:

“крайний срок” – 394k
“последний срок” – 220k
“конечный срок” – 173k
“крайняя дата” – 10k
“конечная дата” – 131k
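
If you want to compare variants at a glance, you can turn the raw hit counts into shares. The snippet below is only a rough sketch: the counts are the Yandex figures quoted above (in thousands), while the names `hits_thousands` and `rank_by_share` are made up for this illustration. Hit counts are noisy, so treat the percentages as a hint, not a measurement.

```python
# Hit counts (in thousands) copied from the Yandex results quoted above.
# The variable and function names are hypothetical, chosen for this sketch.
hits_thousands = {
    "крайний срок": 394,
    "последний срок": 220,
    "конечный срок": 173,
    "крайняя дата": 10,
    "конечная дата": 131,
}


def rank_by_share(counts):
    """Return (phrase, share) pairs sorted from most to least frequent."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(phrase, count / total) for phrase, count in ranked]


ranking = rank_by_share(hits_thousands)
for phrase, share in ranking:
    # Print each candidate with its share of all hits, e.g. "42.5%".
    print(f"{phrase}: {share:.1%}")
```

The share only tells you how the candidates compare with each other; it says nothing about phrases you did not think to search for.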

The obvious problem arises when you don’t have any variants to propose, or when the ones you come up with return no matches. In the example above I was the one who came up with all the phrases, so a search engine may not be the first choice for beginner Russian speakers. Moreover, depending on the search engine, you may get a lot of noise due to homonymy and may not be able to control for context and register.

3. Monolingual corpus

Pros: robust search system (can control for grammar and sense) – authentic texts
Cons: complex search syntax – biased towards journalism and fiction – you need to know what to look for

The Russian National Corpus (Национальный корпус русского языка) is a robust resource that allows you to search more than 5 million words from authentic Russian texts. It has an English interface to make tagging for grammar and meaning easier. Unlike in bilingual corpora, most texts in this corpus were originally written in Russian.

“я учусь хорошо” – 3 results
“я хорошо учусь” – 5 results
“я хороший студент” – 0 results
“я хороший ученик” – 0 results

You can do the same for the numerous variants we had for “cutoff date.”

On the downside, the complex tagging system and the need to choose a sub-corpus may be daunting for the first-time user. In addition, this corpus is biased towards fiction and journalism and may not fully reflect current usage in business, technology, or commerce.

4. Bilingual corpus

Pros: can see context – can (sometimes) compare frequency
Cons: not all of the Russian is authentic – noise – biased toward business and finance

My favorite bilingual corpus, Linguee, recently added Russian to its list of languages. It is a collection of parallel texts harvested from the Internet. The huge advantage of this method is that you can input a whole phrase in English and look at the different contexts it appears in and their Russian equivalents. Moreover, it lets you think outside the English syntax rather than translate the English words literally.

The downside is that, unlike in the fiction-based National Corpus, not all texts were initially authored in Russian, and you may end up reading translations done by low-proficiency speakers. Searching for “cutoff date” gives you some viable results (“крайний срок”, “крайняя дата”), but searching for “good student” does not. This is probably due to the large proportion of business and trade texts in this corpus.

As a side note, the Russian National Corpus has a smaller, fiction-based bilingual section, and the (in)famous Google Translate uses a similar principle in collecting parallel data to train its machine translation engine.

[Image] I know you’ve seen it… but I couldn’t resist

I hope this article will help you with Russian usage next time you’re not sure what native speakers say. Happy browsing!


About the Author: Maria

Maria is a Russian-born translator from Western New York. She is excited to share her fascination with all things Russian on this blog. Maria's professional updates are available in English on her website and Twitter and in Russian on Telegram.


Comments:

  1. Roger:

    This raises a question about what the American English phrase “I am a good student” is intended to communicate. If the intention is to comment on the act or process of studying, the Russian phrase “я хорошо учусь” seems to be the most accurate translation of the thought.
    This is a basic problem of translating words or phrases from one language to another. Any process falls short if the intention of the speaker or writer is not known. I am not a native speaker; I have studied the Russian language for about 54 years. Even though I do not use the Russian language, I think I have a sense of meaning.

    • Maria:

      @Roger Roger, that’s an excellent point — we really want to express a similar intention rather than a similar syntactic structure. I would say, in Russian, if you want to comment on your ability in a certain capacity, verbs are preferable over nouns. So, “Who’s a better cook?” is “Кто лучше готовит?” (literally, “who cooks better?”). Unless you are a professional chef, you wouldn’t refer to yourself as “повар” (cook). Same goes for “I’m a slow learner,” etc.

  2. Jörg:

    Thank you for this interesting post. I’m gonna take a closer look at these corpora. In your response you really make a good point concerning the problem of word-for-word translations and the necessity of changing the sentence structure.

    • Maria:

      @Jörg Всегда пожалуйста! (You’re always welcome!)