🇫🇷 From “Vingt et Un” to 21: Building a Lightning-Fast French Number Parser in Ruby

Ever tried to parse French numbers in your Ruby application?

If you’ve worked with French text data, you know the pain: “quatre-vingt-quatorze” should become 94, “trois millions deux cent mille” should transform to 3,200,000, but good luck finding a performant solution that handles all the linguistic quirks of French numbers!

That’s exactly the problem I set out to solve with StringToNumber, a high-performance Ruby gem that converts French written numbers into their numeric equivalents with blazing speed and bulletproof reliability.

🤔 The Problem: French Numbers Are… Complex

French numbers aren’t just “difficult”; they’re linguistically fascinating and computationally challenging:

  • Special cases: “quatre-vingts” (80) vs “quatre-vingt-un” (81)
  • Compound forms: “soixante-dix” (literally “sixty-ten” = 70)
  • Multiple formats: “vingt-et-un” vs “vingt et un” vs “vingt-un”
  • Scale handling: “deux millions trois cent mille” (2,300,000)
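
To make the arithmetic behind these quirks concrete, here is a minimal sketch of how such phrases compose. It is not the gem's implementation: the names `UNITS`, `SCALES`, and `toy_french_to_i` are hypothetical, and the vocabulary is deliberately tiny.

```ruby
# Toy illustration only: NOT StringToNumber's parser, just the arithmetic.
UNITS = {
  'un' => 1, 'deux' => 2, 'trois' => 3, 'quatre' => 4, 'dix' => 10,
  'quatorze' => 14, 'vingt' => 20, 'soixante' => 60
}.freeze

SCALES = { 'mille' => 1_000, 'million' => 1_000_000, 'millions' => 1_000_000 }.freeze

def toy_french_to_i(text)
  total = 0       # completed groups (thousands, millions)
  current = 0     # group still being built
  previous = nil
  text.downcase.split(/[\s-]+/).each do |word|
    next if word == 'et' # "vingt et un" and "vingt-et-un" read the same
    if %w[vingt vingts].include?(word) && previous == 'quatre'
      current += 76 # "quatre-vingt(s)" is 4 * 20: swap the 4 already added for 80
    elsif UNITS.key?(word)
      current += UNITS[word] # additive forms: "soixante-dix" = 60 + 10
    elsif %w[cent cents].include?(word)
      current = [current, 1].max * 100
    elsif SCALES.key?(word)
      total += [current, 1].max * SCALES[word]
      current = 0
    else
      raise ArgumentError, "unknown word: #{word}"
    end
    previous = word
  end
  total + current
end

puts toy_french_to_i('quatre-vingts')                  # 80
puts toy_french_to_i('soixante-dix')                   # 70
puts toy_french_to_i('vingt-et-un')                    # 21
puts toy_french_to_i('deux millions trois cent mille') # 2300000
```

Even this toy version needs a special case for “quatre-vingt”, which hints at why a complete parser takes real care.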

Most existing solutions either:

  • ❌ Don’t handle edge cases correctly
  • ⚠️ Have terrible performance for large datasets
  • 💔 Lack proper caching and memory management
  • 🐛 Break on complex compound numbers

✨ The Solution: StringToNumber Gem

StringToNumber tackles these challenges head-on with a dual-architecture approach:

🚀 Performance That Scales

  • Up to 460x faster than naive implementations
  • Intelligent LRU caching with thread-safe operations
  • Pre-compiled regex patterns eliminate compilation overhead
  • Zero-allocation matching for common cases

🎯 Comprehensive French Support

require 'string_to_number'

# Basic numbers
StringToNumber.in_numbers('quinze') #=> 15
StringToNumber.in_numbers('quatre-vingts') #=> 80

# Complex compounds
StringToNumber.in_numbers('soixante-dix-sept') #=> 77
StringToNumber.in_numbers('quatre-vingt-quatorze') #=> 94

# Large numbers
StringToNumber.in_numbers('deux millions trois cent mille') #=> 2_300_000
StringToNumber.in_numbers('neuf mille neuf cent quatre-vingt-dix-neuf') #=> 9999

🛡️ Production-Ready Features

  • Thread-safe concurrent operations
  • Input validation with helpful error messages
  • Memory efficient with configurable cache limits
  • Backward compatibility mode for testing

📊 Performance That Will Blow Your Mind
Here’s where StringToNumber really shines. Check out these benchmark results:

+-------------------+----------+----------------+-------------+
| Input complexity  | Original | StringToNumber | Improvement |
+-------------------+----------+----------------+-------------+
| Short numbers     | 0.5ms    | 0.035ms        | 14x faster  |
| Medium complexity | 2.1ms    | 0.045ms        | 47x faster  |
| Long compounds    | 23ms     | 0.05ms         | 460x faster |
+-------------------+----------+----------------+-------------+
# This processes 800,000+ conversions per second! 🔥
1000.times { StringToNumber.in_numbers('vingt et un') }
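
The loop above runs the conversions but does not time them. One hedged way to check a throughput figure on your own machine is Ruby's standard `Benchmark` module. The helper name `conversions_per_second` is hypothetical, and the lambda is a stand-in so the snippet runs without the gem installed.

```ruby
require 'benchmark'

# Hypothetical helper: times `iterations` calls of any callable converter
# and returns the measured calls per second.
def conversions_per_second(converter, input, iterations: 10_000)
  elapsed = Benchmark.realtime { iterations.times { converter.call(input) } }
  iterations / elapsed
end

# Stand-in converter so the sketch is self-contained. With the gem loaded,
# pass StringToNumber.method(:in_numbers) instead.
stand_in = ->(text) { text == 'vingt et un' ? 21 : nil }

rate = conversions_per_second(stand_in, 'vingt et un')
puts format('%.0f conversions/second', rate)
```

Results depend heavily on hardware, Ruby version, and whether repeated inputs are served from a cache, so treat any single number as indicative.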

🎯 Real-World Use Cases
Who should use StringToNumber?

  • 🏦 Financial apps processing French invoices/documents
  • 📊 Data pipelines cleaning French numerical text
  • 🤖 NLP projects working with French language data
  • 📱 Mobile apps supporting French localization
  • 🔍 Search engines normalizing French numerical queries
  • 📈 Analytics platforms parsing French business data

Example: Processing French Financial Data

# Clean messy financial data
invoices = [
 "Montant: trois mille deux cent euros",
 "Total: quinze mille neuf cent vingt",
 "Crédit: un million deux cent mille"
]
amounts = invoices.map do |invoice|
 # Capture the words after ": ", dropping an optional trailing "euro(s)"
 number_text = invoice[/: (.+?)(?: euros?)?$/, 1]
 StringToNumber.in_numbers(number_text) if number_text
end
#=> [3200, 15920, 1200000]

🚀 Quick Start Guide
Installation

gem install string_to_number
# or add to Gemfile
gem 'string_to_number'

Basic Usage

require 'string_to_number'

# Convert any French number
result = StringToNumber.in_numbers('mille deux cent trente-quatre')
puts result #=> 1234

# Validate input before processing
if StringToNumber.valid_french_number?('vingt et un')
 puts StringToNumber.in_numbers('vingt et un') #=> 21
end

# Check performance stats
stats = StringToNumber.cache_stats
puts "Cache hit ratio: #{stats[:cache_hit_ratio]}"

Advanced Features

# Batch processing with automatic caching
french_numbers = ['un', 'deux', 'trois', 'vingt', 'cent']
results = french_numbers.map { |num| StringToNumber.in_numbers(num) }

# Memory management for long-running processes
StringToNumber.clear_caches! # Reset when processing new datasets

# Backward compatibility testing
old_result = StringToNumber.in_numbers('cent', use_optimized: false)
new_result = StringToNumber.in_numbers('cent', use_optimized: true)
puts old_result == new_result #=> true
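
To run that comparison over more than one input, a small helper can collect the inputs on which two converters disagree. This is a sketch under assumptions: `parser_mismatches` is a hypothetical name, and the lambdas are stand-ins so the snippet runs without the gem.

```ruby
# Hypothetical helper: returns the inputs on which two converters disagree.
def parser_mismatches(inputs, reference, candidate)
  inputs.reject { |text| reference.call(text) == candidate.call(text) }
end

# Stand-in converters keep the sketch runnable on its own. With the gem, use
# e.g. ->(t) { StringToNumber.in_numbers(t, use_optimized: false) } and
#      ->(t) { StringToNumber.in_numbers(t, use_optimized: true) }.
table = { 'cent' => 100, 'vingt et un' => 21 }
reference = ->(text) { table[text] }          # nil for unknown input
candidate = ->(text) { table.fetch(text, 0) } # 0 for unknown input

p parser_mismatches(['cent', 'vingt et un', 'mille'], reference, candidate)
#=> ["mille"]
```

An empty result means the two converters agree on every input in the list.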

🏗️ Under the Hood: Architecture Highlights
The gem uses a sophisticated dual-parser approach:

  • Optimized Parser (default): High-performance with caching
  • Original Parser: Reference implementation for compatibility

Key optimizations include:

  • LRU caching with thread-safe mutex protection
  • Instance memoization reduces initialization overhead
  • Pre-compiled regex patterns eliminate compilation costs
  • Intelligent word matching with zero allocations
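
As an illustration of the first point, here is a minimal sketch of an LRU cache guarded by a `Mutex`. The class name `TinyLruCache` and its interface are hypothetical; the gem's actual cache may be structured quite differently.

```ruby
# Hypothetical sketch of a thread-safe LRU cache; not the gem's real code.
class TinyLruCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}        # Ruby hashes preserve insertion order
    @mutex = Mutex.new
  end

  # Returns the cached value for key, computing it with the block on a miss.
  def fetch(key)
    @mutex.synchronize do
      if @store.key?(key)
        value = @store.delete(key) # re-insert to mark as most recently used
        @store[key] = value
      else
        value = yield(key)
        @store[key] = value
        @store.shift if @store.size > @max_size # evict least recently used
        value
      end
    end
  end

  def size
    @mutex.synchronize { @store.size }
  end
end

cache = TinyLruCache.new(2)
cache.fetch('cent') { 100 }
cache.fetch('vingt') { 20 }
cache.fetch('cent') { raise 'not called on a hit' }
cache.fetch('mille') { 1000 } # evicts 'vingt', the least recently used key
puts cache.size               # 2
```

This sketch holds the lock while the block runs, which keeps it simple but serializes misses; a production cache might compute outside the lock.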

🤝 Join the Community!
StringToNumber is just getting started, and I’d love your help making it even better!

💡 Ways to Contribute:

🎯 What’s Next?
I’m actively working on:

  • Regional variants support (Belgian/Swiss French)
  • Decimal number parsing (“trois virgule quatorze”)
  • Ordinal numbers (“premier”, “deuxième”)

Ready to supercharge your French number processing? Install StringToNumber today and transform your text processing pipeline from sluggish to lightning-fast!

gem install string_to_number

Your French data deserves better than regex soup. Give it the StringToNumber treatment! 🚀

What do you think? I’d love to hear how StringToNumber works in your projects. Drop me a line, or star the repo if it saves you time!

Happy coding! 🇫🇷💎
