This topic bothered me for a while. Thanks to an interview code challenge, I found a proper solution for it.
My source code and RSpec tests are on GitHub: https://github.com/rickwxc/ruby-detect-given-string-is-numeric/tree/master
I am happy to discuss it in more detail.
Step 1. Define what a valid numeric string format is
Given any string x, if x.to_f returns the value you would expect from reading the whole string, I treat x as a valid numeric format.
Here are some examples of valid formats:
irb(main):001:0> ".23".to_f
=> 0.23
irb(main):002:0> "-.23".to_f
=> -0.23
irb(main):003:0> "-.8e-16".to_f
=> -8.0e-17
irb(main):005:0> '+.8e-16'.to_f
=> 8.0e-17
However, Ruby will make a number out of any string that starts with one. I still consider such strings NOT VALID numeric format, because I prefer to report them as invalid rather than silently cast them to a number.
# Ruby casts a leading number even when it is followed by other characters
irb(main):009:0> "13abc".to_f
=> 13.0
irb(main):009:0> "13.".to_f
=> 13.0
# Even though these end up with a proper value, I still consider "13abc" and "13." invalid numeric strings.
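A minimal sketch of why the result of to_f alone cannot serve as the validity check: to_f never raises, and it returns 0.0 both for "0" and for strings that contain no number at all. The variable names `samples` and `results` are only for illustration.

```ruby
# String#to_f never raises: it converts the longest numeric prefix it can
# find and returns 0.0 when there is none.
samples = ["13abc", "abc", "0", ""]

# Map each sample string to its to_f result.
results = samples.map { |s| [s, s.to_f] }.to_h

# "abc", "0" and "" all come back as 0.0, so the return value cannot tell
# a valid "0" apart from garbage input.
results.each { |s, f| puts "#{s.inspect}.to_f => #{f}" }
```

This is why the rest of the post validates the string format first, instead of inspecting what to_f returns.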
Step 2. Break down the valid float format and build the regex
According to this article, a valid float has 5 components.
Here is the regex given by that article:
^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$
part 1: a sign, - or +, or nothing
part 2: the digits before the . (can be empty)
part 3: the . (0 or 1 occurrence)
part 4: the digits after the . (at least one)
part 5: the exponent: e, then - or +, then digits (optional)
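The five parts can be lined up against the article's regex by writing it in Ruby's extended mode (/x), where whitespace and trailing comments inside the pattern are ignored. This is only a sketch for reading the pattern; the constant name `ARTICLE_FLOAT` is made up for this example.

```ruby
# The article's regex, spread over several lines with one comment per part.
ARTICLE_FLOAT = /
  ^
  [-+]?                 # part 1: optional sign
  [0-9]*                # part 2: digits before the decimal point, may be empty
  \.?                   # part 3: optional decimal point
  [0-9]+                # part 4: at least one digit
  ([eE][-+]?[0-9]+)?    # part 5: optional exponent with an optional sign
  $
/x

["12", "-.8e-16", "12e5", "13.", "13abc"].each do |s|
  puts "#{s.inspect} => #{ARTICLE_FLOAT.match?(s)}"
end
```

Because part 4 demands at least one digit right after the optional decimal point, a string such as "13." cannot match.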
And here is my regex. It is basically the same as the one above, except that mine requires an explicit sign (+ or -) after the e:
/^((\+|-)?\d*\.?\d+)([eE](\+|-){1}\d+)?$/
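Here is a small sketch that wraps the regex above in a predicate. The names `NUMERIC`, `STRICT_NUMERIC` and `numeric_string?` are hypothetical and not taken from the repository. The second pattern is a variant, not the regex from this post: it swaps ^ and $ for \A and \z, because in Ruby ^ and $ match at line boundaries, so a multi-line string can pass as long as one of its lines looks numeric.

```ruby
# The regex from this post, copied verbatim.
NUMERIC = /^((\+|-)?\d*\.?\d+)([eE](\+|-){1}\d+)?$/

# Assumed variant: same pattern, anchored to the whole string instead of a line.
STRICT_NUMERIC = /\A((\+|-)?\d*\.?\d+)([eE](\+|-){1}\d+)?\z/

# Hypothetical helper: true when the string matches the given pattern.
def numeric_string?(str, pattern = NUMERIC)
  pattern.match?(str)
end

puts numeric_string?("+123.123e-13")              # sign after e is present
puts numeric_string?("12e5")                      # no sign after e, so rejected
puts numeric_string?("abc\n12")                   # second line matches ^...$
puts numeric_string?("abc\n12", STRICT_NUMERIC)   # whole string must match
```

If the input can contain newlines, the \A and \z form is probably the safer choice for validation.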
Here are some valid examples (already included in the test cases):
Without e:
12
+12
-12
12.3
.3
-.3
+.3
With e:
+123.123e-13
-.8e+11
As for INVALID examples, the trickiest formats look like these:
# NOTE: none of the strings below is considered a valid numeric string
irb(main):037:0> "-456.e+13".to_f
=> -456.0
irb(main):039:0> '-3.e-16'.to_f
=> -3.0
WHY?
If you pay attention to the regex, you will figure it out:
Whenever there is a . (decimal point), it must be followed by at least one digit.
\.?\d+
That is the reason why "456.", "3." and "12.e+10" are considered invalid.
Since Ruby does not treat the whole string as a valid float, to_f converts from the start of the string until it meets a character that cannot be part of the number.
irb(main):040:0> "12.e+10".to_f
=> 12.0
# to_f starts from the beginning and stops at the "."; at this point the decimal point is treated as a non-digit, so it returns the longest number it found: 12.
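Putting the two steps together, one possible approach is to validate with the regex first and only then call to_f. The sketch below assumes a helper named `parse_numeric`, which is made up for this example and is not part of the repository.

```ruby
# The regex from this post, copied verbatim.
NUMERIC = /^((\+|-)?\d*\.?\d+)([eE](\+|-){1}\d+)?$/

# Hypothetical helper: returns a Float when the string passes the regex,
# and nil otherwise, instead of letting to_f cast a numeric prefix.
def parse_numeric(str)
  NUMERIC.match?(str) ? str.to_f : nil
end

# The tricky strings from above are rejected because no digit follows the ".".
["-456.e+13", "-3.e-16", "12.e+10", "456.", "3."].each do |s|
  puts "#{s.inspect} => #{parse_numeric(s).inspect}"
end

# A well-formed string is still converted as usual.
puts parse_numeric("-.8e+11").inspect
```

Returning nil keeps the "report it as invalid" decision with the caller rather than silently producing a number.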
Hopefully, this article makes some sense of how Ruby's to_f method works
and what a properly formatted float number looks like.