Wednesday 27 January 2010

How to convert a matrix from an NxM to an X-Y-Z format

Sometimes a matrix needs to be converted from the standard NxM format to a list of X-Y-Z triplets (row index, column index, value).
Here is a simple way to convert a 2D array to X-Y-Z format in Python:


from pylab import *

def mat2triplet(mat):

    rows = arange(mat.shape[0]*mat.shape[1])//mat.shape[1]
    # a vector of row indices

    cols = arange(mat.shape[0]*mat.shape[1])%mat.shape[1]
    # a vector of column indices

    values = mat.flatten()
    # a vector of values

    triplet = dstack((rows,cols,values))
    triplet = squeeze(triplet)

    return triplet


A matrix:
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]


is converted into:
[[ 0  0  0]
 [ 0  1  1]
 [ 0  2  2]
 [ 1  0  3]
 [ 1  1  4]
 [ 1  2  5]
 [ 2  0  6]
 [ 2  1  7]
 [ 2  2  8]
 [ 3  0  9]
 [ 3  1 10]
 [ 3  2 11]]
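
The same result can also be obtained by letting NumPy generate the index grids. The following is only a sketch under assumptions: the function name mat2triplet_indices is made up for this example, and it imports numpy directly instead of pylab. It uses column_stack rather than dstack plus squeeze, so a 1x1 matrix still yields a 2D array with a single row.

```python
import numpy as np

def mat2triplet_indices(mat):
    # hypothetical alternative to mat2triplet, for illustration only
    mat = np.asarray(mat)

    # np.indices returns one array of row indices and one of column
    # indices, each with the same shape as mat
    rows, cols = np.indices(mat.shape)

    # flatten everything in row-major order and stack as columns
    return np.column_stack((rows.ravel(), cols.ravel(), mat.ravel()))

m = np.arange(12).reshape(4, 3)
t = mat2triplet_indices(m)
print(t)
```

For the 4x3 matrix shown above, this gives the same twelve triplets in the same order.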

Monday 18 January 2010

fscanf with python

Python does not have an fscanf function like C, but regular expressions can do the job efficiently. Here is an example of how a file stream can be scanned row by row, returning a list of all the numbers found in each row.
First, a regex object is created with this pattern:
regex = re.compile(r"(\d+\.?\d*|\d*\.\d+)((e|E)(\+|\-)?(\d+))?")

then it is used to find all matches within a string:
nums = regex.findall(line)
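
When a pattern contains groups, findall returns one tuple per match (one item per group) instead of the matched text, which is why the loop below indexes num[0] and num[1]. A small illustration, using a simplified two-group pattern that is an assumption made for this sketch and not the pattern above:

```python
import re

# simplified pattern, for illustration only:
# group 1 = mantissa, group 2 = optional exponent
demo = re.compile(r"(\d+\.?\d*)([eE][+-]?\d+)?")

matches = demo.findall("x=12 y=3.5e-2")
print(matches)
# each item is a (mantissa, exponent) tuple; the exponent is an
# empty string when the number has none
```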

finally each match is converted to a number (mantissa and exponent are joined and passed to float, so every value is returned as a float) and appended to the result list:
for num in nums:
    res.append(float(num[0] + num[1]))

Here is the complete function:

  import re

  regex = re.compile(r"(\d+\.?\d*|\d*\.\d+)((e|E)(\+|\-)?(\d+))?")
  # this object will find numbers within a string
  # usage: regex.findall()

  def numscan(fstream):
      '''
      Scan a row of a given filestream 
      searching for numbers
      '''
      res = []
      # initialize an empty list

      line = fstream.readline()
      # read a line from fstream

      nums = regex.findall(line)
      # find numbers in line

      for num in nums:
          res.append(float(num[0] + num[1]))
          # join mantissa and exponent, convert to float
          # and append it to the res list

      return res
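
Note that numscan reads one line per call and returns an empty list both for a line without numbers and at the end of the file, so the caller cannot tell the two cases apart. One possible way around this is to iterate over the stream instead. The following is only a sketch: the names NUMBER and numscan_all are made up for this example, the pattern is a two-group variant written for it, and io.StringIO stands in for a real file. Like the function above, it ignores a leading minus sign.

```python
import io
import re

# two groups only: mantissa and optional exponent
NUMBER = re.compile(r"(\d+\.?\d*|\d*\.\d+)([eE][+-]?\d+)?")

def numscan_all(fstream):
    '''
    Scan every row of a given filestream
    and return one list of floats per row
    '''
    table = []
    for line in fstream:
        # findall yields (mantissa, exponent) tuples
        table.append([float(m + e) for m, e in NUMBER.findall(line)])
    return table

sample = io.StringIO("x 1 2.5\ny 3e2 .5\n\n")
rows = numscan_all(sample)
print(rows)
```

Rows without numbers show up as empty lists, and the loop simply stops at the end of the stream.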