Rss Feed Tweeter button Facebook button Technorati button Reddit button Linkedin button Webonews button Delicious button Digg button Flickr button Stumbleupon button Newsvine button

A Waage Blog

Ruby, Rails, Life

CSV Escaping Double Quotes

with 3 comments

What is the proper way to escape double-quotes in CSV files?

When you try to export from MySQL in CSV format, the MySQL reference manual itself suggests using

FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'

Now, I thought I had a nicely formatted CSV file where fields that have double quotes would be properly escaped. Looking through this .csv file, I can see that sure enough, all the double-quotes are turned from ” into \”.

This is fine, except that when I tried to use Excel to open the CSV file or FasterCSV (a nice Ruby library for working with CSV files), I find that the double quotes were not escaped “properly”. They are expecting to escape double-quotes with TWO double-quotes (”"). According to CSVreader.com, this is a known discrepancy between the way different programs escape double-quotes in CSV files. Great.

Well, don’t worry – here’s a sed 1-liner that you can use on the command line to convert those \” (backslash double-quote) into “” (two double-quotes):

$cat file.csv | sed -E 's/\\"/""/g' > file2.csv

Hope that helps someone out there!

Written by Andrew Waage

February 16th, 2012 at 12:51 am

3 Responses to 'CSV Escaping Double Quotes'

Subscribe to comments with RSS or TrackBack to 'CSV Escaping Double Quotes'.

  1. FYI, this breaks if you ever happen to have a backslash immediately before a field-closing quote. e.g., this:

    echo ’something,”something \”quoted\” \\”‘ | sed -E ’s/\\”/”"/g’

    yields this:

    something,”something “”quoted”" \”"

    but it is *supposed* yield this:

    something,”something “”quoted”" \\”

    Jordan B

    28 Sep 12 at 11:41 am

  2. Thinking about it a bit more, it should really yield this:

    something,”something “”quoted”” \”

    And likewise all the other stuff that MySQL escapes with “\” (see http://dev.mysql.com/doc/refman/5.1/en/load-data.html) needs to be un-backslashed to robustly be converted over to the format that FasterCSV uses.

    Jordan B

    28 Sep 12 at 11:58 am

  3. If your happy enough having just a multi-dimensional array, this should work fine. I had wanted to use the one provided by keananda but it was choking on pr($lines). <?php function f_parse_csv($file, $longest, $delimiter) { $mdarray = array(); $file = fopen($file, “r”); while ($line = fgetcsv($file, $longest, $delimiter)) { array_push($mdarray, $line); } fclose($file); return $mdarray; } ?> $longest is a number that represents the longest line in the csv file as required by fgetcsv(). The page for fgetcsv() said that the longest line could be set to 0 or left out, but I couldn’t get it to work without. I just made it extra large when I had to use it.

    Janette Cantu

    9 Oct 13 at 5:00 am

Leave a Reply