Saturday, 26 September 2009

Haskell: Parsing SVN log output

Problem:
I decided to solve a problem that had never been solved properly at work. We use Lotus Notes to keep track of bugs in the system. Developers would jot down which files were changed for each bug, and in SVN we would write the bug number in the commit message. The problem was that everybody had their own idea of how to write the commit description. The previous build engineer googled for an existing solution instead of building one. The tool that turned up was useless, which was a problem for whoever was tasked with performing the deployment.

Over one weekend, I decided to solve this problem. Here's the code to parse the output from svn log.
module SVN (parse, parseFile, Log(MkLog, revision, author, comment, changes), FileChange(Add, Delete, Modify, Unknown)) where

import String (split, isBlankLine, trim)  -- my own helper module, not a standard library
import Data.Maybe (catMaybes)

data FileChange = Add String | Delete String | Modify String | Unknown String
    deriving (Show)

data Log = MkLog {
    revision :: String,
    author :: String,
    comment :: String,
    changes :: [FileChange]
    } deriving (Show)

parseFile :: FilePath -> IO [Log]
parseFile path = do
  content <- readFile path
  return (parse content)

parse :: String -> [Log]
parse xs = parseLogs (lines xs) [] 

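-- Consume one entry per separator line; entries are consed on, so the result is in reverse file order.
parseLogs :: [String] -> [Maybe Log] -> [Log]
parseLogs [] logs = catMaybes logs
parseLogs xs logs = parseLogs (drop 1 rest) ((parseLog log) : logs)
  where (log, rest) = splitBy isSeperatorLine xs  -- drop 1 skips the separator and is safe when none is left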

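-- An entry is: header line, "Changed paths:" line, file changes, blank line, comment.
-- Anything that does not match that shape yields Nothing instead of a pattern-match failure.
parseLog :: [String] -> Maybe Log
parseLog xs
  | length xs > 4 =
      case splitBy isBlankLine xs of
        ((x:_:filechanges), (_:comment)) ->
          case map trim (split '|' x) of
            (revision:author:_) -> Just (MkLog revision author (trim (unlines comment)) (parseFileChanges filechanges))
            _                   -> Nothing
        _ -> Nothing
  | otherwise = Nothing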

parseFileChanges :: [String] -> [FileChange]
parseFileChanges [] = []
parseFileChanges xs = map parseFileChange xs

-- Match the action letter and the path together, so blank or one-word lines fall through to Unknown.
parseFileChange :: String -> FileChange
parseFileChange x =
  case words x of
    ("A":path:_) -> Add path
    ("D":path:_) -> Delete path
    ("M":path:_) -> Modify path
    _            -> Unknown x

isSeperatorLine :: String -> Bool
isSeperatorLine []    = False
isSeperatorLine (x:_) = '-' == x

-- Split at the first element satisfying the predicate; that element starts the second list.
-- This is exactly Prelude's break, which also avoids the repeated (xs ++ [y]) appends.
splitBy :: (a -> Bool) -> [a] -> ([a], [a])
splitBy = break
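
To make the expected input concrete, here is a small self-contained sketch of the entry-splitting step. It assumes the verbose form of the log (the shape `svn log -v` prints, with a "Changed paths:" section), which is what parseLog appears to rely on. The names `sampleLog` and `entries` and the sample data are made up for illustration and are not part of the SVN module above.

```haskell
-- Illustrative sketch only: sampleLog and entries are hypothetical names.

-- Two made-up entries in the shape printed by `svn log -v`.
sampleLog :: String
sampleLog = unlines
  [ "------------------------------------------------------------------------"
  , "r2 | alice | 2009-09-25 10:00:00 +0800 (Fri, 25 Sep 2009) | 1 line"
  , "Changed paths:"
  , "   M /trunk/Main.hs"
  , ""
  , "Bug 1234: fix crash on startup"
  , "------------------------------------------------------------------------"
  , "r1 | bob | 2009-09-24 09:00:00 +0800 (Thu, 24 Sep 2009) | 1 line"
  , "Changed paths:"
  , "   A /trunk/Main.hs"
  , ""
  , "Initial import"
  , "------------------------------------------------------------------------"
  ]

-- Group the lines found between separator lines, dropping empty groups.
-- Unlike parseLogs, this keeps the entries in file order.
entries :: String -> [[String]]
entries = go . lines
  where
    go [] = []
    go ls =
      let (entry, rest) = break isSeparator ls
          others        = go (drop 1 rest)  -- drop 1 skips the separator, if any
      in if null entry then others else entry : others
    -- Same test as isSeperatorLine: the line starts with a dash.
    isSeparator l = take 1 l == "-"
```

Each group returned by `entries sampleLog` is the list of lines that parseLog receives for one revision. Note that a comment line beginning with a dash would also be treated as a separator, both here and in the module above.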

The String module is my own implementation; its functions (split, isBlankLine and trim) do what their names suggest.
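
Since the String module itself is not shown, here is a minimal sketch of what the three helpers might look like. These are assumed stand-ins inferred from how the parser calls them, not the original implementation; in particular, whether the original `split` keeps empty fields is a guess.

```haskell
import Data.Char (isSpace)

-- Hypothetical stand-ins for the String module; the real one is not shown in the post.
-- To use them with the parser, they would live in String.hs under
-- `module String (split, isBlankLine, trim) where`.

-- Split on every occurrence of the delimiter, keeping empty fields.
split :: Char -> String -> [String]
split delim s = case break (== delim) s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : split delim rest

-- True for a line that is empty or contains only whitespace.
isBlankLine :: String -> Bool
isBlankLine = all isSpace

-- Remove leading and trailing whitespace.
trim :: String -> String
trim = dropTrailing . dropWhile isSpace
  where dropTrailing = reverse . dropWhile isSpace . reverse
```

With these definitions, `map trim (split '|' "r42 | alice | 1 line")` evaluates to `["r42","alice","1 line"]`, which is the shape parseLog takes the revision and author from.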
