Recently, we got an inquiry through the contact form about how to import a particular data set. Since importing text-based data is a common task that many people have questions on, today’s post walks through that example. Specifically, this post focuses on using tdfread to import tab delimited data. The tdfread command is a MATLAB function (it ships with the Statistics Toolbox) that reads in tab delimited data. It’s amazingly easy to use! In addition, we’ll revisit the textscan command and use it to import the same data from a .csv file.

The Input Data Set

Scott wrote:

I have a problem that I need some help solving. Pasted onto this message is
following data. It contains column headers for the Region, State, Sales, Head
Count. What needs to be done is to read this data in (either by .txt or .csv)
and organize it then sum the data and display in table form.

The data as .txt (this data is tab delimited):

Region  State   Sales   Head Count
North   North Dakota    80078   81
North   Montana 90608   391
North   Michigan        4598    27
North   Wisconsin       11622   36
South   Florida 9788    73
South   Georgia 86456   385
South   Alabama 94766   91
South   Mississippi     13004   61
East    North Carolina  612     25
East    Virginia        85508   233
West    California      84419   262
West    Washington      92682   97
West    Oregon  53185   51

The data in .csv format:

i,Region,State,Sales,Head Count
1,North,North Dakota,80078,81
2,North,Montana,90608,391
3,North,Michigan,4598,27
4,North,Wisconsin,11622,36
5,South,Florida,9788,73
6,South,Georgia,86456,385
7,South,Alabama,94766,91
8,South,Mississippi,13004,61
9,East,North Carolina,612,25
10,East,Virginia,85508,233
11,West,California,84419,262
12,West,Washington,92682,97
13,West,Oregon,53185,51

You can download the sample files here: Text Sample Data | CSV Sample Data

Which Data Set Should I Work With?

Half the battle was deciding which data set was easier to work with. To make this post more interesting, Rob and I decided on a friendly wager to see who could import the data into MATLAB first. The victorious party (to be determined from reader comments) wins a free lunch.

My first inclination was to work with the tab delimited data. Since I’ve worked with a bunch of tab delimited data before, I was more familiar with this. I was crossing my fingers that I would prevail against Mr. Slazas.

On the other hand, our resident textscan master, Rob Slazas, chose to work with the .csv file. Since he uses textscan in his sleep, Rob immediately recognized that the .csv file was set up perfectly for textscan.

We each imported our file using a different technique (and a different file format). You can be the judge of which method works better.

Using TDFREAD

A very useful function that I discovered when working with tab delimited data is tdfread.

Querying the MATLAB help for tdfread returns the following:

TDFREAD Read in text and numeric data from tab-delimited file.

Using the following command, I was able to import the data quite easily.

%notice that I didn't have to use fopen!
quanData = tdfread('SampleData.txt')

The result is returned as a structure with one field per column. Note that the "Head Count" header becomes the field name Head_Count.

quanData = 

        Region: [13x5 char]
         State: [13x14 char]
         Sales: [13x1 double]
    Head_Count: [13x1 double]

At this point, the data is successfully imported into MATLAB and is ready to be processed however you like. The text columns are stored as character matrices and the numeric columns as double vectors.
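As a minimal sketch of one possible next step (summing by region, as Scott asked), the structure returned by tdfread can be grouped with unique and accumarray. This assumes SampleData.txt is in the current folder; the variable names below are illustrative and not part of the original solution.

```matlab
% Assumes SampleData.txt (the tab delimited file above) is in the current folder.
quanData = tdfread('SampleData.txt');

% tdfread returns text columns as padded char matrices; cellstr converts
% them to a cell array of strings and drops the trailing padding.
regions = cellstr(quanData.Region);

% Map every row to a group index (regionNames comes back sorted alphabetically).
[regionNames, ~, groupIdx] = unique(regions);

% Sum Sales and Head_Count within each region.
totalSales = accumarray(groupIdx(:), quanData.Sales);
totalHeads = accumarray(groupIdx(:), quanData.Head_Count);

% Display the totals as a simple text table.
fprintf('%-8s %10s %12s\n', 'Region', 'Sales', 'Head Count');
for k = 1:numel(regionNames)
    fprintf('%-8s %10d %12d\n', regionNames{k}, totalSales(k), totalHeads(k));
end
```

The groupIdx(:) reshaping is there because accumarray expects its subscripts as a column vector.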

Using TEXTSCAN

The textscan command is a personal favorite of Rob Slazas. Rob had previously dedicated a post to this wonderful command. Independently of me, Rob chose to use the .csv file and the textscan command to import the data.

He came up with the following code:

%open up the data file
fid = fopen('SampleData.csv');
%use textscan to import the data
robData = textscan(fid,'%f%s%s%f%f','delimiter',',','headerlines',1);
%close the data file
fclose(fid);

At this point, the imported data is stored in a cell array with one cell per column, as shown below. It should be straightforward to manipulate the data in this form.

robData = 

    [13x1 double]    {13x1 cell}    {13x1 cell}    [13x1 double]    [13x1 double]
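To illustrate how the cell array might be used, here is a short sketch that unpacks each column into its own variable and then queries the data. It assumes SampleData.csv is in the current folder; the variable names are made up for this example.

```matlab
% Re-run Rob's import (assumes SampleData.csv is in the current folder).
fid = fopen('SampleData.csv');
robData = textscan(fid, '%f%s%s%f%f', 'delimiter', ',', 'headerlines', 1);
fclose(fid);

% Each cell holds one column, so the cell array can be unpacked in one line.
[rowIdx, region, state, sales, headCount] = robData{:};

% Logical indexing with strcmp selects the rows for a single region.
isSouth = strcmp(region, 'South');
southSales = sum(sales(isSouth));
fprintf('Total sales in the South: %d\n', southSales);

% max returns both the largest value and its row, which indexes the text columns.
[maxSales, iMax] = max(sales);
fprintf('Highest sales: %s (%s) with %d\n', state{iMax}, region{iMax}, maxSales);
```

Because the text columns come back as cell arrays of strings, they are indexed with curly braces, while the numeric columns are ordinary double vectors.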

The Best Way to Import Data?

Is there a best way to import data? No, not really. I would suggest that you first try the ready-made import commands, such as csvread, dlmread, load, importdata, and tdfread (the last of which requires the Statistics Toolbox), as they could potentially save you some time. Depending on your data file, these functions may or may not work. If you find that these commands cannot successfully import your data, then the textscan command is probably your best bet. It’s very flexible and provides tons of formatting options.
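The "try the simple command first, fall back to textscan" advice can be sketched roughly as follows. This is only one way to arrange it, assuming SampleData.txt is in the current folder; the try/catch fallback and the variable names are assumptions of this example rather than anything from the original code.

```matlab
% Try the convenient reader first; fall back to textscan if it fails
% (for example, when the Statistics Toolbox is not installed).
try
    data = tdfread('SampleData.txt');
    regions = cellstr(data.Region);
    sales = data.Sales;
catch
    % Same file, read by hand: two text columns, then two numeric columns.
    fid = fopen('SampleData.txt');
    cols = textscan(fid, '%s%s%f%f', 'delimiter', '\t', 'headerlines', 1);
    fclose(fid);
    regions = cols{1};
    sales = cols{3};
end

fprintf('Read %d rows; total sales = %d\n', numel(regions), sum(sales));
```

Either branch leaves regions as a cell array of strings and sales as a double vector, so the code that follows does not need to know which reader was used.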

So who wins the battle of importing data? Quan or Rob? The author of this post thinks that Quan takes this contest hands down. Then again, the author of this post is Quan himself.