... and here comes support for Portuguese
by Jens Guballa
The Vigenere Solver and the Substitution Solver are tools for breaking classical ciphers without knowing the key. Both rely on n-gram statistics, and I now generate these n-grams myself. So I thought adding another language would be a good exercise. I chose Portuguese, honestly for no particular reason. Or in other words: just because I can.
The Vigenere Solver uses a set of monograms (sometimes called unigrams), bigrams (sometimes called digrams) and trigrams. The Substitution Solver, on the other hand, uses quadgrams. This time the text corpus contained more than 95 million characters, not counting blanks, punctuation marks, numbers and other special characters. Some manual effort is still required, since characters with the acute accent, the circumflex accent, the tilde and so on must be handled, but all in all it is not too much work.
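To give an idea of what counting n-grams for Portuguese can look like, here is a minimal Python sketch. It is not the code behind the solvers. It assumes that accented letters are folded to their base letters (so "ã" counts as "A" and "ç" as "C"), that everything outside A-Z is dropped, and that n-grams may span word boundaries. The function names `normalize` and `count_ngrams` are made up for this illustration.

```python
import unicodedata
from collections import Counter


def normalize(text: str) -> str:
    """Fold accented letters to base letters and keep only A-Z (assumed policy)."""
    # NFD splits e.g. "ã" into "a" plus a combining tilde (category "Mn").
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, leaving only the base letters.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Uppercase and discard blanks, punctuation, digits and other characters.
    return "".join(ch for ch in stripped.upper() if "A" <= ch <= "Z")


def count_ngrams(text: str, n: int) -> Counter:
    """Count overlapping n-grams over the normalized letter stream."""
    letters = normalize(text)
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))


if __name__ == "__main__":
    sample = "São Paulo, coração!"
    print(normalize(sample))        # SAOPAULOCORACAO
    print(count_ngrams(sample, 4))  # quadgram counts for the sample
```

Calling `count_ngrams` with n = 1, 2, 3 and 4 would give the monogram, bigram, trigram and quadgram counts respectively. Whether folding accents is the right choice depends on how the ciphertexts were produced, so treat that part as one possible policy rather than the only one.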
What will be the next language to add?