Shortest unique prefix - it's a builtin function

Shortest unique prefix - hack part #3

  • I want to find the shortest unique prefix for each string in a short vector (fewer than 20 strings)
    • i.e. for each string in the vector, what’s the shortest prefix that still matches only that string?
  • This isn’t a computational bottleneck in the overall process, so there’s no need for heroic/drastic solutions.
  • I wasn’t going to write a trie implementation just to solve this problem.
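
To pin down what "shortest unique prefix" means here, a brute-force sketch in base R is below. It is not the code from the earlier posts: the helper name shortest_unique_prefix is made up for illustration, and it assumes a character vector with no NA values. For each string it tries prefixes of increasing length until no other string starts with that prefix.

```r
# Hypothetical helper for illustration only (assumes no NA values in x).
shortest_unique_prefix <- function(x) {
  vapply(seq_along(x), function(i) {
    for (n in seq_len(nchar(x[i]))) {
      p <- substr(x[i], 1, n)
      # The prefix is unique when no *other* element starts with it
      if (!any(startsWith(x[-i], p))) return(p)
    }
    # No unique prefix exists (e.g. the string is itself a prefix of another),
    # so fall back to the whole string
    x[i]
  }, character(1))
}

shortest_unique_prefix(c("blue", "black", "bold"))
# "blu" "bla" "bo"
```

With fewer than 20 strings the nested scan is cheap, which fits the "no heroics" constraint above.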

Yesterday’s solution got a bit ugly.

Earlier today I got all cute using charmatch().

And now I’ve just discovered that there’s a function in base R that already does this: abbreviate(). (Thanks to mdsummer for pointing this out.)

Feeling a bit silly now.

abbreviate(c("ab", "apple", "apart", "b", "ag"), use.classes = FALSE, minlength = 1)
   ab apple apart     b    ag 
 "ab" "app" "apa"   "b"  "ag" 
abbreviate(c("blue", "black", "bold"), use.classes = FALSE, minlength = 1)
 blue black  bold 
"blu" "bla"  "bo"
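
Since abbreviate() is a general-purpose abbreviation function rather than a dedicated prefix function, it seems worth checking that its output really behaves like a set of unique prefixes. A small sanity check, using only base R and the first example above (the variable names is_prefix and n_matches are just illustrative):

```r
x <- c("ab", "apple", "apart", "b", "ag")
abbr <- abbreviate(x, minlength = 1, use.classes = FALSE)

# Each abbreviation should be a prefix of its own string ...
is_prefix <- startsWith(x, unname(abbr))

# ... and should be a prefix of exactly one string in the vector
n_matches <- vapply(unname(abbr), function(p) sum(startsWith(x, p)), integer(1))

stopifnot(all(is_prefix), all(n_matches == 1L))
```

This only confirms the behaviour for the input it is run on, so it is a check to rerun on real data rather than a guarantee for every possible vector.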