Monday 18 January 2010

fscanf with python

Python does not have an fscanf function like C. Regular expressions can be used efficiently to do the job. Here is an example of how a file stream can be scanned row by row, returning a list of all the numbers found in each row.
First, a regex object is created with this pattern:
regex = re.compile(r"(\d+\.*\d*|\d*\.\d+)((e|E)(\+|\-)(\d+))*")

then it is used to find all matches within a string:
nums = regex.findall(line)
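
Because the pattern contains capturing groups, findall returns one tuple of groups per match rather than the matched text itself. A small sketch, using a made-up sample line, shows the shape of the result:

```python
import re

regex = re.compile(r"(\d+\.*\d*|\d*\.\d+)((e|E)(\+|\-)(\d+))*")

# sample line invented for illustration
line = "x = 3.14, n = 42, big = 1.5e+3"

matches = regex.findall(line)
for match in matches:
    print(match)
# ('3.14', '', '', '', '')
# ('42', '', '', '', '')
# ('1.5', 'e+3', 'e', '+', '3')
```

The first element of each tuple is the mantissa and the second is the whole exponent part (empty when there is none). Note that the exponent is only recognised when it carries an explicit sign, so a string like 1e5 would be split into two separate matches.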

finally, each match (a tuple of captured groups) is turned into a number by joining its mantissa and exponent parts and evaluating the result:
for num in nums:
    res.append(eval("%s%s" % (num[0],num[1])))
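
eval does the job here, but it executes whatever text it is given, and on Python 3 it rejects integers with leading zeros such as 007. As a possible alternative (not the method used in this post), the conversion could be done with int and float. The helper name to_number is hypothetical:

```python
def to_number(mantissa, exponent=""):
    """Convert the mantissa and exponent strings of one match to a number.

    Hypothetical helper, not part of the original post: it tries int first
    and falls back to float, so "42" stays an integer.
    """
    text = mantissa + exponent
    try:
        return int(text)
    except ValueError:
        return float(text)


print(to_number("42"))          # 42
print(to_number("3.14"))        # 3.14
print(to_number("1.5", "e+3"))  # 1500.0
print(to_number("007"))         # 7
```

With this helper the loop body would become res.append(to_number(num[0], num[1])). A mantissa with repeated dots such as 1..., which the pattern also accepts, would still raise ValueError.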

Here is the complete function:

  import re

  regex = re.compile(r"(\d+\.*\d*|\d*\.\d+)((e|E)(\+|\-)(\d+))*")
  # this object will find numbers within a string
  # usage: regex.findall()

  def numscan(fstream):
      '''
      Scan a row of a given filestream 
      searching for numbers
      '''
      res = []
      # initialize an empty list

      line = fstream.readline()
      # read a line from fstream

      nums = regex.findall(line)
      # find numbers in line

      for num in nums:
          res.append(eval("%s%s" % (num[0],num[1])))
          # join mantissa and exponent, evaluate the text as a number
          # and append it to the res list

      return res
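
A short usage sketch, assuming the function is called once per row until it returns an empty list. io.StringIO stands in for an open text file, the sample data is made up, and numscan is repeated in compact form only so that the snippet runs on its own:

```python
import io
import re

regex = re.compile(r"(\d+\.*\d*|\d*\.\d+)((e|E)(\+|\-)(\d+))*")


def numscan(fstream):
    """Same logic as the function above, condensed for this sketch."""
    line = fstream.readline()
    return [eval("%s%s" % (num[0], num[1])) for num in regex.findall(line)]


# in-memory stand-in for a data file
stream = io.StringIO("1 2.5 3e+2\n10 20 30\n")

rows = []
while True:
    row = numscan(stream)
    if not row:
        # empty list: end of file, or a row without any number
        break
    rows.append(row)

print(rows)  # [[1, 2.5, 300.0], [10, 20, 30]]
```

Since an empty result cannot distinguish the end of the file from a row that contains no numbers, a file with blank or text-only rows would need a different stopping condition.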


1 comment:

  1. Thanks for posting this example, it came in handy for a program I wrote to visualize numerical data.
    However, I noticed that your regular expression does not handle negative numbers.
    I simply added -* to the front of your regular expression to make it work:
    regex = re.compile("(-*\d+\.*\d*|\d*\.\d+)((e|E)(\+|\-)(\d+))*")
