# More Bad Input

Now that the `from_roman()` function works properly with good input, it’s time to fit in the last piece of the puzzle: making it work properly with bad input. That means finding a way to look at a string and determine if it’s a valid Roman numeral. This is inherently more difficult than validating numeric input in the `to_roman()` function, but you have a powerful tool at your disposal: regular expressions. (If you’re not familiar with regular expressions, now would be a good time to read the regular expressions chapter.)

As you saw in Case Study: Roman Numerals, there are several simple rules for constructing a Roman numeral, using the letters `M`, `D`, `C`, `L`, `X`, `V`, and `I`. Let’s review the rules:

- Sometimes characters are additive. `I` is `1`, `II` is `2`, and `III` is `3`. `VI` is `6` (literally, “5 and 1”), `VII` is `7`, and `VIII` is `8`.
- The tens characters (`I`, `X`, `C`, and `M`) can be repeated up to three times. At `4`, you need to subtract from the next highest fives character. You can’t represent `4` as `IIII`; instead, it is represented as `IV` (“`1` less than `5`”). `40` is written as `XL` (“`10` less than `50`”), `41` as `XLI`, `42` as `XLII`, `43` as `XLIII`, and then `44` as `XLIV` (“`10` less than `50`, then `1` less than `5`”).
- Sometimes characters are… the opposite of additive. By putting certain characters before others, you subtract from the final value. For example, at `9`, you need to subtract from the next highest tens character: `8` is `VIII`, but `9` is `IX` (“1 less than 10”), not `VIIII` (since the `I` character can not be repeated four times). `90` is `XC`, `900` is `CM`.
- The fives characters can not be repeated. `10` is always represented as `X`, never as `VV`. `100` is always `C`, never `LL`.
- Roman numerals are read left to right, so the order of characters matters very much. `DC` is `600`; `CD` is a completely different number (`400`, “`100` less than `500`”). `CI` is `101`; `IC` is not even a valid Roman numeral (because you can’t subtract `1` directly from `100`; you would need to write it as `XCIX`, “`10` less than `100`, then 1 less than `10`”).
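To make the rules above concrete, here is a minimal sketch of how they might be encoded as a single regular expression. The names `ROMAN_NUMERAL` and `is_valid_roman` are hypothetical and used only for illustration, and the pattern assumes that `M` follows the same three-repetition limit as the other tens characters, so the largest value it accepts is `MMMCMXCIX`. It is one possible encoding, not necessarily the one this chapter arrives at.

```python
import re

# Hypothetical pattern: each group covers one decimal place of the numeral.
ROMAN_NUMERAL = re.compile(
    r"""
    M{0,3}                # thousands: 0 to 3 Ms (assumed upper limit)
    (CM|CD|D?C{0,3})      # hundreds: 900, 400, or an optional D plus 0 to 3 Cs
    (XC|XL|L?X{0,3})      # tens: 90, 40, or an optional L plus 0 to 3 Xs
    (IX|IV|V?I{0,3})      # ones: 9, 4, or an optional V plus 0 to 3 Is
    """,
    re.VERBOSE,
)


def is_valid_roman(s):
    """Return True if s is a non-empty, well-formed Roman numeral."""
    # Every part of the pattern is optional, so the empty string
    # would match; it has to be rejected separately.
    return bool(s) and ROMAN_NUMERAL.fullmatch(s) is not None
```

With this sketch, `is_valid_roman("XLIV")` returns `True`, while `is_valid_roman("IC")` and `is_valid_roman("VIIII")` both return `False`, matching the examples in the list.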

Thus, one useful test would be to ensure that the `from_roman()` function fails when you pass it a string with too many repeated numerals. How many is “too many” depends on the numeral.
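As a rough sketch of what such a test might look like, the block below pairs a `unittest` test case with a stand-in `from_roman()` so that it can run on its own. The stand-in, the `InvalidRomanNumeralError` exception name, and the `MAX_REPEATS` table are all assumptions made for illustration; the stand-in enforces only the repetition rule and is not the chapter's implementation. The limit of three for `M` is likewise an assumption.

```python
import itertools
import unittest


class InvalidRomanNumeralError(ValueError):
    """Hypothetical exception raised for malformed Roman numerals."""


# "Too many" depends on the numeral: tens characters may appear up to
# three times in a row, fives characters only once.
MAX_REPEATS = {"M": 3, "C": 3, "X": 3, "I": 3, "D": 1, "L": 1, "V": 1}

NUMERAL_VALUES = (
    ("M", 1000), ("CM", 900), ("D", 500), ("CD", 400),
    ("C", 100), ("XC", 90), ("L", 50), ("XL", 40),
    ("X", 10), ("IX", 9), ("V", 5), ("IV", 4), ("I", 1),
)


def from_roman(s):
    """Stand-in converter: checks repetition only, not ordering rules."""
    if not s:
        raise InvalidRomanNumeralError("empty string")
    # Reject any run of identical characters longer than its limit.
    # Characters missing from the table have a limit of 0.
    for char, run in itertools.groupby(s):
        if sum(1 for _ in run) > MAX_REPEATS.get(char, 0):
            raise InvalidRomanNumeralError(f"too many repeated numerals: {s}")
    result = 0
    index = 0
    for numeral, value in NUMERAL_VALUES:
        while s[index:index + len(numeral)] == numeral:
            result += value
            index += len(numeral)
    return result


class FromRomanBadInput(unittest.TestCase):
    def test_too_many_repeated_numerals(self):
        """from_roman should fail with too many repeated numerals"""
        for s in ("MMMM", "DD", "CCCC", "LL", "XXXX", "VV", "IIII"):
            with self.subTest(numeral=s):
                with self.assertRaises(InvalidRomanNumeralError):
                    from_roman(s)
```

Saved to a file, the test can be run with `python -m unittest <filename>`. Using `subTest` means a failure on one numeral is reported without hiding the results for the others.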
