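The code below gives MATLAB scripts for converting a dataset in .mat format to the ARFF and txt formats.
Note that each .mat file contains exactly one dataset with m+1 columns, the last of which is the label.
Converting to ARFF: mat2arff.m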
%
% This script converts the input data to the '.arff'
% file format, which is compatible with the Weka file format ...
%
% Parameters:
% input_filename -- input file name; only the '.mat', '.txt'
%                   or '.csv' file formats can be converted ...
% arff_filename -- the output '.arff' file ...
% NOTEs:
% The input 'M*N' data must have the following format:
%   M: number of samples;
%   N: sample features and label, "1:N-1" -- features, "N" -- sample label ...

% Read the file data ...
clear
clc
input_filename = 'GLIOMA-t.mat';
arff_filename = 'GLIOMA.arff';
if strfind(input_filename,'.mat')
    matdata = importdata(input_filename);
elseif strfind(input_filename,'.txt')
    matdata = textread(input_filename);
elseif strfind(input_filename,'.csv')
    matdata = csvread(input_filename);
end
[row,col] = size(matdata);

f = fopen(arff_filename,'wt');
if (f < 0)
    error('Unable to open the file %s',arff_filename);
end
fprintf(f,'%s\n',['@relation ',arff_filename]);
for i = 1 : col - 1
    st = ['@attribute att_',num2str(i),' numeric'];
    fprintf(f,'%s\n',st);
end

% Write the last header line: the class label attribute
floatformat = '%.16g';
Y = matdata(:,col);
uY = unique(Y); % distinct label values
st = '@attribute label {';
for j = 1 : numel(uY) - 1
    st = [st sprintf([floatformat ' ,'],uY(j))];
end
st = [st sprintf([floatformat '}'],uY(numel(uY)))];
fprintf(f,'%s\n\n',st);

% Write the data ...
labelformat = [floatformat ' '];
fprintf(f,'@data\n');
for i = 1 : row
    Xi = matdata(i,1:col-1);
    s = sprintf(labelformat,Y(i));
    s = [sprintf([floatformat ' '],Xi) s];
    fprintf(f,'%s\n',s);
end
fclose(f);
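The script above separates the values in each @data row with spaces, whereas the ARFF format description specifies comma-delimited instance values. If a tool turns out to be strict about this, a comma-delimited variant might look like the sketch below. It is only an illustration: toy_data and toy.arff are made-up names, and the row-by-row loop is replaced by a single fprintf call, which is a different technique from the one used in mat2arff.m.

```matlab
% Sketch only: toy_data and 'toy.arff' are hypothetical names for illustration.
toy_data = [0.5 1.2 1; 0.7 0.3 2; 1.1 0.9 1];   % 3 samples, 2 features + label
[~, col] = size(toy_data);

f = fopen('toy.arff', 'wt');
if f < 0
    error('Unable to open the file %s', 'toy.arff');
end

% Header: relation name and one numeric attribute per feature column
fprintf(f, '@relation toy\n');
for i = 1 : col - 1
    fprintf(f, '@attribute att_%d numeric\n', i);
end

% Nominal class attribute built from the distinct label values
uY = unique(toy_data(:, col));
labels = arrayfun(@(v) sprintf('%.16g', v), uY(:).', 'UniformOutput', false);
fprintf(f, '@attribute label {%s}\n\n', strjoin(labels, ','));

% Data: col-1 comma-terminated values per row, then the label and a newline
fprintf(f, '@data\n');
rowformat = [repmat('%.16g,', 1, col - 1) '%.16g\n'];
% fprintf consumes its argument column by column, so pass the transpose
fprintf(f, rowformat, toy_data.');
fclose(f);
```

Because the format string is reused for every group of col values, one call writes the whole matrix, which tends to be noticeably faster than a per-row loop on wide datasets.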
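Converting to txt: mat2txt.m
Of course, save can also do the conversion directly, but then every line of the output begins with two spaces.
Note that the dataset variable stored in dataName.mat is named data.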
clc
clear
load('dataName.mat')
fid = fopen('dataName.txt', 'wt');
for i = 1 : size(data, 1)
    for j = 1 : size(data, 2) - 1
        fprintf(fid,'%e ',data(i, j));
    end
    fprintf(fid,'%e\n',data(i, size(data, 2)));
end
fclose(fid);
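The nested loop above calls fprintf once per element. As a possible alternative, the same space-separated layout can be produced with one call by building a format string for a whole row. This is a minimal sketch under assumptions: toy_data and toy.txt are made-up names standing in for data and dataName.txt.

```matlab
% Sketch only: toy_data and 'toy.txt' are hypothetical names for illustration.
toy_data = [1 2 3; 4 5 6];
ncol = size(toy_data, 2);

% ncol-1 space-terminated fields followed by a newline-terminated last field
rowformat = [repmat('%e ', 1, ncol - 1) '%e\n'];

fid = fopen('toy.txt', 'wt');
if fid < 0
    error('Unable to open the file %s', 'toy.txt');
end
% fprintf walks its argument in column-major order, hence the transpose
fprintf(fid, rowformat, toy_data.');
fclose(fid);
```

Each output line starts directly with the first value, so the leading spaces produced by save do not appear here either.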