Rediscovering Ruby: Abbrev

The Ruby standard library is full of hidden-in-plain-sight libraries that you would probably find extremely useful if only you knew they were there. For instance, at the top of the Ruby Doc Standard Library listing, the only entry under ‘A’ is Abbrev, a one-function API that can save some trouble when it comes to text processing.

The single function of the Abbrev library is to “calculate the set of unique abbreviations for a given set of strings”. Put more simply, given an array of strings, Abbrev returns every prefix that identifies exactly one of those strings. So given the words ‘fox’ and ‘fig’, the unique abbreviations for ‘fox’ are ‘fo’ and ‘fox’, and for ‘fig’ they are ‘fi’ and ‘fig’; the prefix ‘f’ is ambiguous and therefore excluded.
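To make the rule concrete, here is a hand-rolled sketch of the same idea. It is not the library's own implementation, and the method name unique_prefixes is made up for illustration: it counts how many words share each prefix and keeps only the prefixes that belong to a single word, plus each full word.

```ruby
# Illustrative sketch only; unique_prefixes is a hypothetical helper,
# not part of the Abbrev library.
def unique_prefixes(words)
  words = words.uniq

  # Count how many distinct words start with each prefix.
  counts = Hash.new(0)
  words.each do |word|
    (1..word.length).each { |len| counts[word[0, len]] += 1 }
  end

  # Keep a prefix if only one word owns it, or if it is the whole word.
  result = {}
  words.each do |word|
    (1..word.length).each do |len|
      prefix = word[0, len]
      result[prefix] = word if counts[prefix] == 1 || prefix == word
    end
  end
  result
end

prefixes = unique_prefixes(%w[fox fig])
```

For ‘fox’ and ‘fig’ the resulting hash has four keys (‘fo’, ‘fox’, ‘fi’ and ‘fig’), and ‘f’ is left out because both words share it.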

Abbrev has one method, abbrev, which can be called either as a module method on the Abbrev module or as an instance method that the library adds to Array. It returns a hash with each abbreviation as a key and the original word as its value.

>> require 'abbrev'
>> [ 'Fig', 'Fox' ].abbrev
=> {"Fi"=>"Fig", "Fo"=>"Fox", "Fig"=>"Fig", "Fox"=>"Fox"}
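The module-method form gives the same result as the Array form. As a small sketch, the snippet below also uses the optional second argument, which limits the result to abbreviations matching a pattern (a string is treated as a required prefix). The word list is my own example, and on newer Ruby versions abbrev ships as a separate gem, so it may need to be installed first.

```ruby
require 'abbrev'

words = %w[Fig Fox Dog]

# Module-method form; equivalent to words.abbrev.
all = Abbrev.abbrev(words)

# With a string pattern, only abbreviations starting with it are returned.
only_f = Abbrev.abbrev(words, 'F')
```

Here only_f holds the entries for ‘Fig’ and ‘Fox’ and nothing for ‘Dog’, while ‘F’ on its own is still excluded because it is ambiguous.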

My first use for Abbrev was generating short codes to use as graph labels from a list of names. Usually the labels all begin with a unique letter, but in the odd case where they do not, a two-letter short label will suffice. First we get the list of abbreviations:

labels = [ "Fox", "Fax", "Dog" ]
abbreviations = labels.abbrev

Group the abbreviations by the label:

grouped_abbreviations = abbreviations.group_by{ |abbreviation,label| label }

And then create a hash mapping each label to its abbreviations:

label_abbreviations = grouped_abbreviations.inject( {} ) do |hash, grouped_abbrevs|
  label = grouped_abbrevs[ 0 ]
  abbrevs = grouped_abbrevs[ 1 ].map{ |a| a[ 0 ] }
  hash.merge( label => abbrevs )
end

which produces the following hash:

{"Fox"=>["Fo", "Fox"], "Fax"=>["Fa", "Fax"], "Dog"=>["Do", "D", "Dog"]}

For each group of abbreviations, find the shortest one and build a hash that maps each label to its short abbreviation:

short_abbreviations = label_abbreviations.inject( {} ) do |hash, label_abbrevs|
  label = label_abbrevs[ 0 ]
  abbrevs = label_abbrevs[ 1 ]
  # Sort by length so the shortest abbreviation comes first
  shortest_abbrev = abbrevs.sort_by{ |abbrev| abbrev.length }[ 0 ]
  hash.merge( label => shortest_abbrev.upcase )
end

which produces the hash we want:

{"Fox"=>"FO", "Fax"=>"FA", "Dog"=>"D"}
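For comparison, the same steps can be collapsed into a single chain. This is just one possible condensed sketch, using group_by, each_with_object and min_by from Enumerable in place of the inject calls; the variable name short is mine.

```ruby
require 'abbrev'

labels = %w[Fox Fax Dog]

# Group the abbreviation/label pairs by label, then keep the
# shortest abbreviation for each label, upcased.
short = labels.abbrev
              .group_by { |_abbreviation, label| label }
              .each_with_object({}) do |(label, pairs), hash|
  hash[label] = pairs.map(&:first).min_by(&:length).upcase
end
```

Using min_by avoids sorting the whole list just to pick the first element, which is a small tidy-up more than a real optimisation at this size.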

In terms of visual appeal, I’ve found that it is sometimes best to strip the vowels out of abbreviated labels. For instance, for ‘Fox’ and ‘Fig’, ‘FX’ and ‘FG’ are much more readable than ‘FO’ and ‘FI’.
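One way this could be done is sketched below. The helper consonant_label is hypothetical and has nothing to do with Abbrev itself: it keeps the first letter, drops the vowels from the rest of the word and takes the first two characters.

```ruby
# Hypothetical helper, not part of Abbrev: keep the first letter,
# drop vowels from the rest, then take the first `length` characters.
def consonant_label(word, length = 2)
  head = word[0, 1]
  tail = word[1..-1].to_s.delete('aeiouAEIOU')
  (head + tail)[0, length].upcase
end

fox_label = consonant_label('Fox')
fig_label = consonant_label('Fig')
```

Note that this gives up the uniqueness guarantee that Abbrev provides: ‘Fox’ and ‘Fax’ would both become ‘FX’, so any clashes would still need to be checked for.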


Farrel Lifson is a lead developer at Aimred.

About Aimred

Aimred is a specialist Ruby and Ruby on Rails development house and consultancy based in Cape Town, South Africa.

We provide Ruby and Ruby on Rails development, consulting and training services to businesses and organisations of all sizes. If you want to find out how we can help you, contact us at info@aimred.com.
