Short unique reliable numbers
Have you ever had to enter a long code that you couldn't memorize at a glance and you get lost counting a dozen digits? I always wondered if it's possible to make a number that's both short and easy to read. For example, ABC123-DFG678 is easy to copy, although it's quite long. And you can also make the system detect errors if there were any. I had a few ideas on this topic, so I sat down and did the math.
Patterns are easier to read than a random set of letters and numbers. Two or three letters, two or three digits, then repeat if needed - seems like an obvious solution. It will be slightly shorter than just digits and slightly longer than alphanumeric encoding with arbitrary positions of digits and letters.
I've already shown what makes it easy to read
What to give the user
What about error resistance?
You can add a checksum character. And even if you make a mistake with one character when entering, it will be immediately noticeable. And if you add two characters, you can algorithmically correct a one-character error and detect a two-character error. So you don't accidentally confuse it with another account. Reliable.
What if you want to number things randomly instead of sequentially, and without collisions?
Usually the solution here is to choose a key that's a gazillion bits long, and let users suffer. But I'm against it. As long as all N-character keys haven't run out, please don't switch to N+1 character keys. But how to protect against collisions? Very simple. Don't check for uniqueness, of course. Just count from one and encode with any block cipher. That's how block ciphers work. You input sequential numbers 0000, 0001, 0002, and get "random", unique, non-repeating numbers 8372, 9274, 1290 as output. Absolutely reliable without collisions, very fast with minimal memory usage - you don't need to store all numbers. The reverse procedure is similar - take the "random" key, "decrypt" it and get its "number". As long as the number hasn't reached the maximum, you definitely haven't exhausted all "random" key variants and you have no collisions.
You just need to choose a block cipher with the same block length as the key length you want. And if there's no such cipher - create an algorithm with that block size, take AES for example and modify it. It's simple. There won't be cryptographic strength, but that's not what we were aiming for. Issuing "random" numbers without collisions will work the same way.
We could end here, but I had an interesting idea...
What if we generate numbers so that making a one-character error can't lead to another valid number? A brilliant idea.
I tried to find algorithms, but didn't have much luck. Well, brute force it is - generate values, and if making a one-character error leads us to at least one existing unique value, then we don't add this number to the list. I brute-forced different patterns - alphanumeric numbers, numeric numbers. And then it hit me. This is the same as a checksum with one character. In short — KISS, use a one-character checksum. The irony is that I started writing this blog post while the brute force was running, wanting to share the results of my genius idea. But it turns out this was all figured out long ago. Well, reinventing the wheel is still useful.
Comments