csv_decoder_new

SYNOPSIS
Instantiate a parser for CSV data

USAGE
obj = csv_decoder_new (filename|File_Type|Strings[])

DESCRIPTION
This function instantiates an object that may be used to parse and
read so-called comma-separated-value (CSV) data. It requires a
single argument, which may be the name of a file, an open file
pointer, or an array of strings.

QUALIFIERS
; delim: character used for the delimiter (default: ',')
; quote: character used for quoting fields (default: '"')
; skiplines: number of lines to skip before parsing (default: 0)
; comment: lines beginning with this string will be skipped
; blankrows: default for how blank rows should be handled (default: "skip")

METHODS
.readrow : Read and parse a row from the CSV object
.readcol : Read one or more columns from the CSV object

EXAMPLE
See the documentation for the `csv.readcol' and
`csv.readrow' methods for examples.

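As a minimal sketch of the array-of-strings form of the argument, the
following feeds three in-memory lines to the decoder and reads them back
one row at a time. The sample data and variable names are made up for
illustration. Each string is newline-terminated here as a precaution;
that is an assumption, not a documented requirement.

```slang
require ("csv");

% Hypothetical in-memory CSV data: one array element per line of input.
variable lines = ["name,value\n", "alpha,1\n", "beta,2\n"];

variable csv = csv_decoder_new (lines);
variable row;

% readrow returns an array of strings, or NULL at the end of input.
forever
  {
     row = csv.readrow ();
     if (row == NULL)
       break;
     vmessage ("%d fields: %s", length (row), strjoin (row, " | "));
  }
```
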
NOTES
The current implementation assumes the CSV format specified by
RFC 4180.

It is important to understand the difference between a ROW and a LINE
in a CSV formatted file: a row may span more than one line in a file.
The `skiplines' qualifier specifies the number of LINES to be
skipped, not ROWS.

CSV files have no notion of data-types: all field values are strings.
For this reason, the `type' qualifier introduces an extra layer
that is not part of the CSV format.

SEE ALSO
csv.readcol, csv.readrow

--------------------------------------------------------------

csv.readcol

SYNOPSIS
Read one or more columns from a CSV file

USAGE
datastruct = csv.readcol([columns])

DESCRIPTION
This method may be used to read one or more columns from a
comma-separated-value file. If called with no arguments, all columns
of the file will be returned. Otherwise, only those columns
specified by the columns argument will be returned.

The return value is a structure with fields that correspond to the
desired columns. The default is for the structure to have field
names `col1', `col2', etc., where the integer suffix
specifies the column number. The `fields' and `header'
qualifiers may be used to specify a different set of names.

QUALIFIERS
; fields: An array of field names to use for the returned structure
; header: Array of strings that correspond to the header row
; type: A scalar or array type-specifier, see below
; typeN: Type-specifier for column N
; snan: String value to use for an empty string element (default: "")
; inan: Integer value to use for an empty integer element (default: 0)
; lnan: Long int value to use for an empty long int element (default: 0L)
; fnan: Float value to use for an empty float element (default: _NaN)
; dnan: Double value to use for an empty double element (default: _NaN)
; nanN: Value used for an empty element in the column N
; blankrows: How a blank row should be handled (default: "skip")

The type-specifier is used to specify the type of a field. It must
be one of the following characters:

    's' (String_Type)
    'i' (Int_Type)
    'l' (Long_Type)
    'f' (Float_Type)
    'd' (Double_Type)

If the value of the `type' qualifier is scalar, then all
columns will default to use the corresponding type. If different
types are desired, then an array of type-specifiers may be used.
The length of the array must be the same as the number of columns to
be returned. The `typeN' qualifier may be used to give the
type of column N.

If the `columns' argument is string-valued, then the
`header' qualifier must be supplied to provide a mapping
from column names to column numbers. If it is present, it will also
be used to give normalized field names to the returned structure.
For normalization, the column name is first lower-cased, then all
non-alphanumeric characters are converted to "_", and excess underscore
characters are removed.

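The normalization rule described above can be approximated with standard
S-Lang string functions. This is only a sketch of the stated rule and not
necessarily how the csv module implements it; the function name
`normalize_name' is hypothetical. The sample input is the header name
used in the EXAMPLE section, so the result can be compared with the
normalized field name shown there.

```slang
% Sketch of the normalization rule; normalize_name is a hypothetical
% helper and is not part of the csv module.
define normalize_name (name)
{
   name = strlow (name);                    % lower-case the name
   name = strtrans (name, "^a-z0-9", "_");  % non-alphanumerics become "_"
   return strcompress (name, "_");          % collapse and trim underscores
}

vmessage ("%s", normalize_name ("Notes - or - Comments"));
```
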
See the documentation for `csv.readrow' for more
information about how blank rows should be handled.

EXAMPLE
Suppose that `data.csv' is a file that contains

    # The data below are from runs 6 and 7
    x,y,errx,erry,Notes - or - Comments
    10.2,0.5,,0.1,
    13.4,0.9,0.1,0.16,
    20.7,18.2,,0.3,Vacuum leak in beam line
    29.6,1.3,,0.31,
    31.2,1.2,0.11,0.33,"This data point
    taken from run 7"

This file consists of 8 lines and forms a CSV file with 7 rows.
The first row consists of a single column, and the subsequent rows
consist of 5 columns. Note that the last row is split
across two lines. The row with the single column will be regarded as
a comment in what follows.

The first step is to instantiate a parser object using:

    csv = csv_decoder_new ("data.csv" ;comment="#");

The use of the `comment' qualifier will cause all lines
beginning with `"#"' to be skipped. Alternatively, the first
line could have been skipped using

    csv = csv_decoder_new ("data.csv" ;skiplines=1);

The second row (also second line) in the file is the header line:
it gives the names of the columns. It may be read using

    header = csv.readrow ();

The rest of the file consists of the data values. We want to read
the first 4 columns as single precision (Float_Type) values,
and the 5th as a string. One way to do this is

    table = csv.readcol (;type=['f','f','f','f','s']);

This will result in `table' set to a structure of the form

    struct { col1 = Float_Type[5],
             col2 = Float_Type[5],
             col3 = Float_Type[5],
             col4 = Float_Type[5],
             col5 = String_Type[5]
           }

The same result could also have been achieved using

    table = csv.readcol (;type='f', type5='s');

If the `header' qualifier is used, then

    table = csv.readcol (;type='f', type5='s', header=header);

would produce the structure

    struct {x=Float_Type[5],
            y=Float_Type[5],
            errx=Float_Type[5],
            erry=Float_Type[5],
            notes_or_comments=String_Type[5]
           }

Note how the "Notes - or - Comments" value was normalized.

To read just the `x' and `y' columns, either of the
following may be used:

    table = csv.readcol ([1,2] ;type='f');
    table = csv.readcol (["x","y"] ;type='f', header=header);

The `header' qualifier was required in the last form to map the
column names to numbers.

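Several `errx' fields in `data.csv' are empty, as are most of the
comment fields. The following sketch shows how the documented `nanN' and
`snan' qualifiers might be combined with the calls above to choose the
substituted values. The replacement values (0 and "n/a") are arbitrary
choices for illustration, and the explicit typecast to Float_Type is a
precaution rather than a documented requirement.

```slang
require ("csv");

variable csv = csv_decoder_new ("data.csv" ;comment="#");
variable header = csv.readrow ();

% nan3 applies to column 3 (errx); snan applies to empty string fields.
% Both replacement values are illustrative assumptions.
variable table = csv.readcol (;type='f', type5='s', header=header,
                              nan3=typecast (0, Float_Type),
                              snan="n/a");

% Without nan3, the empty errx fields would use the fnan default (_NaN).
vmessage ("read %d rows", length (table.errx));
```
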
SEE ALSO
csv_decoder_new, csv_readcol, readascii

--------------------------------------------------------------

csv.readrow

SYNOPSIS
Read a row from a CSV file

USAGE
row = csv.readrow ()

DESCRIPTION
The `csv.readrow' method may be used to read the next
row from the underlying CSV (comma-separated-value) parser object.
The object must have already been instantiated using the
`csv_decoder_new' function. It returns the row data in the form
of an array of strings. If the end of input is reached, NULL will
be returned.

QUALIFIERS
; blankrows: How a blank row should be handled (default: "skip")

The `blankrows' qualifier is used to specify how a blank row
should be handled. A blank row is defined as a row made up of no
characters except for the newline or carriage-return sequence. For
example, the following 9 lines contain one blank row, which occurs on
line 8:

    "12.3"
    "4

    5"
    "5.1"
    ""
    "7.2"

    "6.2"

Line 3 is also empty, but it lies inside a quoted field and so is part
of the second row rather than a blank row.

If the value of `blankrows' is `"skip"', then blank rows
will be ignored by the parser. If the value is `"stop"', then the row
will be returned as an empty array of strings (length equal to 0).
Otherwise the row will be treated as if it contained the empty
string and returned as an array of length 1 with a value of "".
The default behavior is to skip such rows.

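The two named settings can be compared with a small experiment. This is
a sketch under stated assumptions: the helper `show_rows' and the sample
input are hypothetical, and the input strings are newline-terminated as
a precaution. According to the description above, the "stop" run should
report one zero-length row where the "skip" run reports nothing.

```slang
require ("csv");

% Hypothetical helper (not part of the csv module): report the number
% of fields in each row returned for a given blankrows setting.
define show_rows (lines, mode)
{
   variable csv = csv_decoder_new (lines);
   variable row;

   forever
     {
        row = csv.readrow (;blankrows=mode);
        if (row == NULL)
          break;
        vmessage ("blankrows=%s: row with %d field(s)", mode, length (row));
     }
}

% Made-up input whose second line is blank.
variable lines = ["a,b\n", "\n", "c,d\n"];
show_rows (lines, "skip");
show_rows (lines, "stop");
```
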
SEE ALSO
csv_decoder_new, csv.readcol, csv_readcol

--------------------------------------------------------------

csv_readcol

SYNOPSIS
Read one or more columns from a CSV file

USAGE
Struct_Type csv_readcol (file|fp [,columns] ;qualifiers)

DESCRIPTION
This function may be used to read one or more of the columns in the specified
CSV file. If the `columns' argument is present, then only those
columns will be read; otherwise all columns in the file will be read.
The columns will be returned in the form of a structure.

QUALIFIERS
This function supports all of the qualifiers supported by the
`csv_decoder_new' function and the `csv.readcol' method.
In addition, if the `has_header' qualifier is present, the first
line processed (after skipping any lines implied by the
`skiplines' and `comment' qualifiers) will be regarded as
the header.

If the `rdb' qualifier is present, the file is assumed to be
in the so-called RDB file format. This is a tab-delimited format
that consists of a line that contains the names of the fields,
followed by a line that specifies the data types of the columns.

EXAMPLE

    data = csv_readcol ("mirror.csv" ;comment="#", has_header, delim='|');
    data = csv_readcol ("foo.rdb" ; rdb);

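A round trip is one way to see how `has_header' interacts with the
header written by `csv_writecol'. The sketch below writes two columns
and reads them back. The file name `roundtrip.csv' and the data are
made up, and it assumes that the header names "x" and "y" pass through
field-name normalization unchanged.

```slang
require ("csv");

variable x = [1.0, 2.0, 3.0];
variable y = x^2;

% Write two columns with a header row naming them.
csv_writecol ("roundtrip.csv", x, y ;names=["x", "y"]);

% Read them back as doubles, taking the field names from the header row.
variable t = csv_readcol ("roundtrip.csv" ;has_header, type='d');
vmessage ("read %d rows into fields x and y", length (t.x));
```
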
SEE ALSO
csv_decoder_new, csv.readcol, csv.readrow, csv.writecol, csv_encoder_new

--------------------------------------------------------------

csv_encoder_new

SYNOPSIS
Create an object for writing CSV files

USAGE
csv = csv_encoder_new ()

DESCRIPTION
The `csv_encoder_new' function returns an object that may
be used for creating a CSV file.

QUALIFIERS
; delim: Character used for the field delimiter (default: ',')
; quote: Character used for quoting fields (default: '"')
; quoteall: Quote all field values
; quotesome: Quote only those fields where quoting is necessary

METHODS
.writecol : Write one or more columns to a file. For more information
about this method, see the documentation for `csv.writecol'.

EXAMPLE
    x = [0:2*PI:#100];
    csv = csv_encoder_new (;delim='|');
    csv.writecol ("sinx.csv", x, sin(x) ; names=["x", "sin of x"]);

NOTES
The `set_float_format' function may be used to specify the
format used when writing floating point numbers to the CSV file.

SEE ALSO
csv.writecol, csv_writecol, csv_readcol

--------------------------------------------------------------

csv_writecol

SYNOPSIS
Write to a file using a CSV format

USAGE
csv_writecol(file|fp, datalist | datastruct | col1,...,colN)

DESCRIPTION
This function writes one or more data columns to a file or open
file descriptor using a comma-separated-value format. The data
values may be expressed in several ways: as a list of column
values (`datalist'), a structure whose fields specify the
column values (`datastruct'), or passed explicitly as
individual values (`col1',...`colN').

QUALIFIERS
; delim: character used for the delimiter (default: ',')
; names: An array of strings used to name the columns
; noheader: Do not write a header to the file
; quote: character used for quoting fields (default: '"')
; quoteall: Quote all field values
; quotesome: Quote only those fields where quoting is necessary
; rdb: Write in an rdb-style TAB delimited format

Unless the `noheader' qualifier is given, a header containing
the names of the columns will be written if either the `names'
qualifier is present, or the data values are passed as a structure.
In the latter case, the structure field names will be used for the
column names.

The function will throw an `IOError' exception if a write error occurs.

EXAMPLE
For the purposes of this example, assume that three arrays are
given: `time', `temp', `humidity', which represent a
time series of temperature and humidity. Here are three equivalent
ways of writing the arrays to a CSV file:

    % Via a structure
    data = struct {time=time, temperature=temp, humidity=humidity};
    csv_writecol ("weather.dat", data);

    % Via a list
    data = {time, temp, humidity};
    csv_writecol ("weather.dat", data
                  ;names=["time","temperature","humidity"]);

    % Via explicit arguments
    csv_writecol ("weather.dat", time, temp, humidity
                  ;names=["time","temperature","humidity"]);

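The USAGE line also permits an open file pointer in place of a file
name. The sketch below combines that form with handling of the
documented `IOError' exception. The sample values are made up, and the
time array is named `t' here only to avoid any clash with the intrinsic
`time' function at global scope.

```slang
require ("csv");

% Made-up sample data for illustration.
variable t = [0.0, 1.0, 2.0];
variable temp = [20.5, 21.0, 21.4];
variable humidity = [0.40, 0.42, 0.45];

variable fp = fopen ("weather.dat", "w");
if (fp == NULL)
  throw OpenError, "Unable to open weather.dat for writing";

try
  {
     csv_writecol (fp, t, temp, humidity
                   ;names=["time","temperature","humidity"]);
  }
catch IOError:
  {
     vmessage ("write to weather.dat failed");
  }
finally
  {
     % The caller opened the file pointer, so the caller closes it.
     () = fclose (fp);
  }
```
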
NOTES
This function is a simple wrapper around `csv_encoder_new' and
`csv.writecol'.

SEE ALSO
csv_encoder_new, csv.writecol, csv_readcol

--------------------------------------------------------------

csv.writecol

SYNOPSIS
Write to a file using a CSV format

USAGE
csv.writecol(file|fp, datalist | datastruct | col1,...,colN)

DESCRIPTION
The `csv.writecol' method may be used to write one or
more data columns to a file or file descriptor via a CSV encoder
object instantiated using the `csv_encoder_new' function.

The data values may be expressed in several ways: as a list of
column values (`datalist'), a structure whose fields specify
the column values (`datastruct'), or passed explicitly as
individual values (`col1',...`colN').

An `IOError' exception will be thrown if a write error occurs.

QUALIFIERS
; delim: character used for the delimiter (default: ',')
; names: An array of strings used for names of the columns
; noheader: Do not write a header to the file
; quoteall: Quote all field values
; quotesome: Quote only those fields where quoting is necessary
; rdb: Write in an rdb-style TAB delimited format

Unless the `noheader' qualifier is given, a header containing
the names of the columns will be written if either the `names'
qualifier is present, or the data values are passed as a structure.
In the latter case, the structure field names will be used for the
column names.

EXAMPLE
For the purposes of this example, assume that three arrays are
given: `time', `temp', `humidity', which represent a
time series of temperature and humidity. Here are three equivalent
ways of writing the arrays to a TAB delimited file:

    csv = csv_encoder_new (;delim='\t');

    % Via a structure
    data = struct {time=time, temperature=temp, humidity=humidity};
    csv.writecol ("weather.dat", data);

    % Via a list
    data = {time, temp, humidity};
    csv.writecol ("weather.dat", data
                  ;names=["time","temperature","humidity"]);

    % Via explicit arguments
    csv.writecol ("weather.dat", time, temp, humidity
                  ;names=["time","temperature","humidity"]);

SEE ALSO
csv_encoder_new, csv_writecol, csv_readcol

--------------------------------------------------------------