./

pp_parsetext.pro

Routines

top source pp_parsetext

result = pp_parsetext(file [, header=header] [, lines=lines] [, splitlines=splitlines] [, as_struct=as_struct] [, fieldnames=fieldnames] [, types=types] [, trim=trim] [, spacedelimited=spacedelimited] [, skipblank=skipblank] [, delimiter=delimiter] [, stripquotes=stripquotes] [, isinteger=isinteger] [, isfloat=isfloat] [, missingint=missingint] [, missingfloat=missingfloat] [, blank=blank] [, buffer=buffer] [, nheader=nheader])

Parses table data in a text file (or text array) as an array, with multiple options to specify different file formats and processing to be applied to the file. The output can be a string array or an array of structures.

Parameters

file in required

A string with the file name to read. If buffer is set, this should be a string array, where each element correponds to what would be a file line.

Keywords

header out optional

The header line(s) from the text file, unparsed. The number of header lines is set by nheader.

lines out optional

A string array, with one element per line in the input file.

splitlines out optional

A list where each element is a string array corresponding to one line of the input file. Each element in the array is one column from that file line.

as_struct in optional default=0

If set, the output is an array of structures, one structure per input line.

fieldnames in out optional

The names for the structure fields returned when as_struct is set. If this is not given, field names are taken from the last line of the file header

types in out optional

A hash containing type specifications for each of the structure fields to be created when as_struct is set. If not given, it will be determined by guessing from the file's column contents.

trim in optional default=2

Determines the type of leading/trailing trimming to be applied to the file lines. It is passed to strtrim, which is applied to all file lines.

spacedelimited in optional default=0

If set, the columns are assumed to be separated by any positive number of blank spaces. If not set, the columns are assumed to be fixed length, equal to the lengths used in the header line.

skipblank in optional default=0

If set, blank lines in the file are skipped.

delimiter in optional

The character(s) used as column delimiter in the file (the columns are split with strplit). If not given, the input columns are assumed to be separated by blank space.

stripquotes in optional default=0

If set, table elements enclosed in quotes will have the quotes removed.

isinteger out optional

If provided, will return a list, with one element per column of the file. Each element is an array that informs whether the corresponding column element in the input is an integer. Most often used for debugging and finding anomalous values in the input.

isfloat out optional

If provided, will return a list, with one element per column of the file. Each element is an array that informs whether the corresponding column element in the input is a float. Most often used for debugging and finding anomalous values in the input.

missingint in optional

If provided, any missing values in columns with integers will be filled with this value.

missingfloat in optional

If provided, any missing values in columns with floats will be filled with this value.

blank in optional

Passed to pp_isnumber. If set, blank strings are considered valid numbers.

buffer in optional default=0

If set, the first argument (file) is taken as a string array of the file contents, instead of a file name to be read.

nheader in optional default=1

The number of header lines contained in the file. If as_struct is set and field names are not provided, the last line on the header is used to determine column names.

Examples

Read some example files provided with IDL, as structures:

 file=filepath('ascii.txt',subdirectory=['examples','data'])
 a=pp_parsetext(file,/skipblank,nheader=4,header=header,delimiter=',',$
 /as_struct,fieldnames=['lon','lat','el','temp','dew','speed','dir'])
 help,a
 ;A               STRUCT    = -> <Anonymous> Array[15]
 ;help,a[0]
 ;** Structure <4023e918>, 7 tags, length=56, data length=56, refs=2:
 ;LON             DOUBLE          -156.95000
 ;LAT             DOUBLE           20.783300
 ;EL              LONG64                       399
 ;TEMP            LONG64                        68
 ;DEW             LONG64                        64
 ;SPEED           LONG64                        10
 ;DIR             LONG64                        60
 print,header
 ;This file contains ASCII format weather data in a comma delimited table with comments prefaced by the "%" character. The columns represent:
 ;Longitude, latitude, elevation (in feet), temperature (in degrees F),  dew point (in degrees  F), wind speed (knots), wind direction (degrees)

Author information

Author:

Paulo Penteado (http://www.ppenteado.net), Mar/2015

Other attributes

Todo:

Expand documentation, with more examples. This function has received many options to be capable of parsing different kinds of text files I encounter, which means its options make for a large variety of possibilities in file formats.

Requires:

pp_isnumber, pp_readtxt

Statistics

Lines:	73 lines
Cyclomatic complexity:	30
Modified cyclomatic complexity:	27

File attributes

Modification date:	Wed Jun 29 22:15:28 2016
Lines:	73
Docformat:	rst rst

Documentation for pp_lib

Generated by IDLdoc

./