Module raw_file_readers

This file contains a collection of file reader functions which can be called when reading in the raw data files. The data files must be in an array (1D or 2D) format.

The functions must have the following calling signature: reader_fun(input_file, other_parameters).

Each function must return a tuple (data, raw_data, metadata).

The data must have the format: time (in matplotlib format), other (non-time related) data columns. The time always has to be in the first column.

Functions
  readfile_raw(input_file, separator=None, comment=None, start=0, stop=-1, ignore_empty=False)
      Reads a text file line by line and returns the raw data as the nested list raw_data. Returns: list
  campbell2num_date(campbell_date)
      Convert our standard Campbell notation into pylab format.
  num_date2campbell(num_date, secs=False)
      Convert a numerical date as in pylab to our standard Campbell notation (for a numpy array or a single date).
  iso_time_to_date(isostrings, method='hash')
      Converts an ISO 8601 date & time string (well, a slightly nonstandard variant of it) into a matplotlib date number. Returns: np.array of floats
  read_campbell_cr10x(input_file, headers=None, secs=False, year=None)
      Reads the file in standard Campbell CR10X data format. Returns: tuple
  read_campbell_TAO5(input_file, given_headers=[])
      Reads the file in TOA5 Campbell data format as used by CR1000. Returns: tuple
  read_maw_file(input_file)
      Reads a standard MAW file (as only used by me, Mauro A Werder). Returns: tuple
Variables
  __package__ = 'fieldpy.core'
Function Details

readfile_raw(input_file, separator=None, comment=None, start=0, stop=-1, ignore_empty=False)

Reads a text file line by line and returns the raw data as the nested list raw_data. Ignores all trailing empty lines.

Parameters:
  • input_file (string) - The file to read.
  • separator (string) - Column separator in the file; if equal to 'no_split', columns will not be split.
  • comment (string) - a string which comments out the rest of the line
  • start (int) - which line to start on (default 0)
  • stop (int) - which line to stop on (default -1, i.e. to the end)
  • ignore_empty (boolean) - if True, ignore empty lines
Returns: list
Returns a nested list containing the split raw file lines (as strings).
>>> readfile_raw('test_files/maw_file_test.maw', separator=',', comment='#')
[['2010-07-13 08:49:00', '0', '0.3030', '5', 'asdd asdlkj asl'], ['2010-07-13 08:56:00', '15', '0.2320', '8866', 'asdd asdlkj asl'], ['2010-07-13 08:58:00', '25', '0.2055', '5', '7'], ['2010-07-13 09:03:00', '50', '0.1620', '5', '']]
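
The splitting behaviour described above can be sketched in a few lines. This is a minimal illustration and not the fieldpy implementation: the helper name split_lines is hypothetical, it works on an iterable of strings rather than on a file name, and the treatment of comment-only lines, of whitespace around fields and of 'no_split' are assumptions inferred from the example output.

```python
def split_lines(lines, separator=None, comment=None, ignore_empty=False):
    """Split text lines into a nested list of string fields (illustrative sketch)."""
    rows = []
    for line in lines:
        text = line.rstrip("\n")
        had_comment = False
        if comment is not None and comment in text:
            # Everything after the comment string is discarded.
            text = text.split(comment, 1)[0]
            had_comment = True
        text = text.strip()
        if not text:
            # Assumption: comment-only lines are always dropped,
            # truly empty lines only if ignore_empty is True.
            if had_comment or ignore_empty:
                continue
            rows.append([])
            continue
        if separator == "no_split":
            # Assumption: an unsplit line is kept as a one-element row.
            rows.append([text])
        else:
            rows.append([field.strip() for field in text.split(separator)])
    # Trailing empty lines are ignored in all cases.
    while rows and not rows[-1]:
        rows.pop()
    return rows


example = ["# header", "a, b", "", "1, 2,  # note", "", ""]
print(split_lines(example, separator=",", comment="#"))
```

With ignore_empty=True the empty row in the middle of the example would be dropped as well.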

campbell2num_date(campbell_date)

Convert our standard Campbell notation into pylab format.

Parameters:
  • campbell_date (list of lists or numpy array) - Campbell dates, one row per date: [year, julian day, time (, seconds)]
Returns:
numpy array
>>> cd1 = [[2006, 139, 1245]]
>>> cd2 = [[2006, 139, 1245, 34]]
>>> campbell2num_date(cd1)
array([ 732450.53125])
>>> campbell2num_date(cd2)
array([ 732450.53164352])
>>> np.alltrue(cd1 == num_date2campbell(campbell2num_date(cd1)))
True
>>> np.alltrue(cd2 == num_date2campbell(campbell2num_date(cd2)))
False
>>> np.alltrue(cd2 == num_date2campbell(campbell2num_date(cd2), secs=True))
True
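
The arithmetic behind these values can be reproduced with the standard library alone. The sketch below is an illustration under stated assumptions and not the module's code: campbell_to_datenum is a hypothetical name, the time column is taken to be an integer of the form HHMM, and the date number is assumed to count days since 0001-01-01 plus one. That is the classic pylab convention and it matches the doctest values above; matplotlib 3.3 and later default to a 1970 epoch instead.

```python
import datetime


def campbell_to_datenum(year, day_of_year, hhmm, seconds=0):
    """Convert one Campbell date to a classic pylab date number (sketch)."""
    hours, minutes = divmod(hhmm, 100)
    # Ordinal of 1 January of that year; date(1, 1, 1).toordinal() == 1.
    first_day = datetime.date(year, 1, 1).toordinal()
    day_fraction = (hours * 3600 + minutes * 60 + seconds) / 86400.0
    return first_day + (day_of_year - 1) + day_fraction


print(campbell_to_datenum(2006, 139, 1245))      # compare with cd1 above
print(campbell_to_datenum(2006, 139, 1245, 34))  # compare with cd2 above
```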

num_date2campbell(num_date, secs=False)

Convert a numerical date as in pylab to our standard Campbell notation (for a numpy array or a single date).

Parameters:
  • num_date (numpy array or list) - vector of pylab dates
  • secs (boolean) - if True secs are appended
Returns:
numpy array with rows [year, julian day, time (,seconds)]
>>> dt = datetime.datetime(2006,6,6,12,37,25)
>>> nd = plt.date2num(dt)
>>> num_date2campbell([nd])
array([[2006,  157, 1237]])
>>> num_date2campbell([nd], secs=True)
array([[2006,  157, 1237,   25]])
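
The reverse direction can be sketched in the same spirit. The name datenum_to_campbell is hypothetical, the function handles a single date number rather than an array, and it assumes the classic pylab epoch (days since 0001-01-01 plus one). Rounding to whole seconds, and dropping them when secs is False, are assumptions; the original may treat fractional seconds differently.

```python
import datetime


def datenum_to_campbell(num_date, secs=False):
    """Convert a classic pylab date number to [year, julian day, HHMM(, SS)] (sketch)."""
    # Work in whole seconds so that float noise cannot produce 59.9999 s.
    total_seconds = int(round(num_date * 86400))
    ordinal, day_seconds = divmod(total_seconds, 86400)
    day = datetime.date.fromordinal(ordinal)
    hours, remainder = divmod(day_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    row = [day.year, day.timetuple().tm_yday, hours * 100 + minutes]
    if secs:
        row.append(seconds)
    return row


nd = datetime.datetime(2006, 6, 6).toordinal() + (12 * 3600 + 37 * 60 + 25) / 86400.0
print(datenum_to_campbell(nd, secs=True))
```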

iso_time_to_date(isostrings, method='hash')

Converts an ISO 8601 date & time string (well, a slightly nonstandard variant of it) into a matplotlib date number. Note that this implementation is not particularly fast, as it uses several try/except blocks. If efficiency is a concern, hard-code the parsing for your format.

Parameters:
  • isostrings (list of strings) - ISO 8601 date & time strings; the following formats are supported:
  • method (string) - Switch to select a different algorithm. In order of decreasing speed:
    • magic 40x
    • hash 3x
    • fast 4x (needs numpy>1.5)
    • '' 1x

    Set to '' to get good error checking/reporting.

Returns: np.array of floats
matplotlib date numbers
>>> iso_time_to_date(["2010-07-07 00:00:00"])
array([ 733960.])
>>> iso_time_to_date(["2010-07-07 00:00:00","2010-07-07 00:00:00.5","2010-09-07 03:01:00.5"])
array([ 733960.        ,  733960.00000579,  734022.12570023])
>>> iso_time_to_date(["2010-07-07 00:00:00","2010-07-07 00:01:00","2010-09-07 03:01:00"])
array([ 733960.        ,  733960.00069444,  734022.12569444])
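
A try/except based conversion of the kind mentioned above might look as follows. This is a rough sketch and not one of the module's algorithms: iso_to_datenum and FORMATS are hypothetical names, only the two formats that appear in the examples are tried, and the result assumes the classic pylab epoch (days since 0001-01-01 plus one), which is consistent with the example values.

```python
import datetime

# Formats tried in turn; both are taken from the examples above.
FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S")


def iso_to_datenum(isostring):
    """Convert one date & time string to a classic pylab date number (sketch)."""
    for fmt in FORMATS:
        try:
            stamp = datetime.datetime.strptime(isostring, fmt)
            break
        except ValueError:
            continue
    else:
        raise ValueError("unsupported date & time string: %r" % isostring)
    day_seconds = (stamp.hour * 3600 + stamp.minute * 60
                   + stamp.second + stamp.microsecond / 1e6)
    return stamp.toordinal() + day_seconds / 86400.0


print([iso_to_datenum(s) for s in ["2010-07-07 00:00:00", "2010-09-07 03:01:00.5"]])
```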

read_campbell_cr10x(input_file, headers=None, secs=False, year=None)

Reads the file in standard Campbell CR10X data format:

number, year, julian day, time, data, ...

Or if year is not None: number, julian day, time, data, ...

Parameters:
  • input_file (string) - input file name
  • headers ([string]) - a list of headers to be given to the variable columns (default is [var1, var2, var3...])
  • secs (boolean) - If True, the fifth column is interpreted as seconds, otherwise as data
  • year (integer) - If not None, the data file is assumed to contain no year column and the value of the 'year' parameter is used instead. (Note: the columns are counted from zero.)
Returns: tuple
tuple (data, raw_data, metadata)
>>> data, raw_data, metadata = read_campbell_cr10x('test_files/cr10x.dat')
>>> data, metadata, raw_data # doctest:+ELLIPSIS
(array([(732102.9722222222, -0.30595, 3.2896, 335.44),
       (732102.9791666667, -0.30629, 3.2656, 332.99),
       (732102.9861111111, -0.27962, 3.2405, 330.43),
       (732102.9930555556, -0.30513, 3.205, 326.81),
       (732103.0, -0.30523, 3.1689, 323.13),
       (732103.0069444445, -0.30457, 3.141, 320.29)], 
      dtype=[('time', '<f8'), ('var0', '<f8'), ('var1', '<f8'), ('var2', '<f8')]), {'headers': ['time', 'var0', 'var1', 'var2'],
 'input_file': 'test_files/cr10x.dat',
 'raw_headers': ['station number',
                 'year',
                 'julian day',
                 'time',
                 'var0',
                 'var1',
                 'var2'],
 'secs': False,
 'units': [],
 'year': None}, array([[  1.05000000e+02,   2.00500000e+03,   1.56000000e+02,
          2.32000000e+03,  -3.05950000e-01,   3.28960000e+00,
          3.35440000e+02],
...
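
How the secs and year parameters change the column layout can be illustrated on a single row. The helper below is hypothetical (parse_cr10x_row is not part of fieldpy), takes the already split fields of one line, and assumes the classic pylab epoch for the returned time. It also assumes that the seconds column, when present, directly follows the time column.

```python
import datetime


def parse_cr10x_row(fields, secs=False, year=None):
    """Split one CR10X row into (date number, list of data values) (sketch)."""
    values = [float(field) for field in fields]
    values.pop(0)  # station number, not used for the time stamp
    if year is None:
        # The file carries its own year column.
        year = int(values.pop(0))
    day_of_year = int(values.pop(0))
    hours, minutes = divmod(int(values.pop(0)), 100)
    seconds = values.pop(0) if secs else 0.0
    first_day = datetime.date(year, 1, 1).toordinal()
    day_fraction = (hours * 3600 + minutes * 60 + seconds) / 86400.0
    return first_day + (day_of_year - 1) + day_fraction, values


row = ["105", "2005", "156", "2320", "-0.30595", "3.2896", "335.44"]
print(parse_cr10x_row(row))
```

The row used here is the first row of the example output above, so the time can be compared with the first entry of data.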

read_campbell_TAO5(input_file, given_headers=[])

Reads the file in TOA5 Campbell data format as used by CR1000:

Resources: http://www.campbellsci.com/documents/manuals/loggernet_3-1.pdf Section B.1.4

Header format:

    file format, station, logger type, serial number, OS version, logger-program file name, logger-program file checksum, table name
    "TIMESTAMP","RECORD",fieldname,fieldname,...
    "TS","RN", field-units, field-units,...
    "","",field recording method,field recording method,...

If a field name is not specified, a header of the form 'var1' etc. is assigned, unless one is specified in the list given_headers.

Parameters:
  • input_file (string) - input file name
  • given_headers (list) - list of header names to give in the record array data. If an entry is None then the default one is used. Note that the field 'RECORD' is ignored in the data and thus does not feature in this list.
Returns: tuple
tuple (data, raw_data, metadata)

Note: It is assumed that any string-like thing is a date+time string

>>> d,rd,md = read_campbell_TAO5('test_files/TOA5_cr1000.dat')
>>> d,rd,md # doctest:+ELLIPSIS
(array([ (733960.0, 13.72, 12.6, 733959.9930787038, 13.43, 10.2, 733959.5997685185, 4.493, 7),
       (733961.0, 13.78, 12.48, 733960.2569675926, 13.15, 17.09, 733960.6921296297, 4.064, 8),
       (733962.0416666666, 13.74, 12.5, 733961.2257175926, 13.07, 17.36, 733961.6785185186, 5.637, 10)], 
      dtype=[('TIMESTAMP', '<f8'), ('Batt_Volt_Max', '<f8'), ('Batt_Volt_Min', '<f8'), ('Batt_Volt_TMn', '<f8'), ('Batt_Volt_Avg', '<f8'), ('Panel_Temp_Max', '<f8'), ('Panel_Temp_TMx', '<f8'), ('var7', '<f8'), ('Panel_Temp_Avg', '<i8')]), ...
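
The four header lines described above can be picked apart with the csv module. The following is only a sketch: parse_toa5_header is a hypothetical helper, the sample lines are invented for illustration, and numbering unnamed fields by their position after 'RECORD' has been removed is an assumption based on the 'var7' entry in the example output.

```python
import csv


def parse_toa5_header(lines, given_headers=()):
    """Parse the four TOA5 header lines into a metadata dict (sketch)."""
    environment, names, units, methods = list(csv.reader(lines[:4]))
    # The 'RECORD' column is ignored in the data, so drop it everywhere.
    keep = [i for i, name in enumerate(names) if name != "RECORD"]
    headers = []
    for position, i in enumerate(keep):
        override = given_headers[position] if position < len(given_headers) else None
        # Precedence: given header, then file header, then a default name.
        headers.append(override or names[i] or "var%d" % position)
    return {
        "station": environment[1],
        "table": environment[-1],
        "headers": headers,
        "units": [units[i] for i in keep],
        "methods": [methods[i] for i in keep],
    }


sample = [  # invented example header, not taken from a real logger file
    '"TOA5","mystation","CR1000","1234","CR1000.Std.16","prog.CR1","5678","Daily"',
    '"TIMESTAMP","RECORD","Batt_Volt_Max","","Panel_Temp_Avg"',
    '"TS","RN","Volts","","Deg C"',
    '"","","Max","","Avg"',
]
print(parse_toa5_header(sample, given_headers=[None, "battery"])["headers"])
```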

read_maw_file(input_file)

Reads a standard MAW file (as only used by me, Mauro A Werder) with format:

    #maw name of dataset
    # comment line
    #metadata is a metadata tag:
    #metadata.eg = 'asdf' # will create an attribute in metadata.eg with value 'asdf'
    #metadata.num = '1.234'
    #
    # the last comment line has the format and will be put into
    # metadata['headers'], metadata['units'] and used as datatype:
    # name0 (units) [dtype], name1 (units) [dtype], name2 (units) [dtype], ...
    val0, val1, val2 ...
    .
    .
    .

dtype is one of the following: int, float, str, time_str

Time is represented as an ISO 8601 string: "yyyy-mm-dd HH:MM:SS(.FF)", excluding the 'T' and without time zone information (which should be given in the units, e.g. (UTC-7)).

The idea is to have an easy-to-parse text representation of (a subset of) what can be contained in a netcdf3 file.

Parameters:
  • input_file (string) - input file name
Returns: tuple
tuple (data, raw_data, metadata)
>>> d,rd,md = read_maw_file('test_files/maw_file_test.maw')
>>> d,rd,md 
(array([(733966.3673611111, 0.0, 0.303, 5, 'asdd asdlkj asl'),
       (733966.3722222223, 15.0, 0.232, 8866, 'asdd asdlkj asl'),
       (733966.3736111111, 25.0, 0.2055, 5, '7'),
       (733966.3770833333, 50.0, 0.162, 5, '')], 
      dtype=[('time', '<f8'), ('var1', '<f8'), ('var2', '<f8'), ('var3', '<i8'), ('var4', '|O8')]), array([('2010-07-13 08:49:00', 0.0, 0.303, 5, 'asdd asdlkj asl'),
       ('2010-07-13 08:56:00', 15.0, 0.232, 8866, 'asdd asdlkj asl'),
       ('2010-07-13 08:58:00', 25.0, 0.2055, 5, '7'),
       ('2010-07-13 09:03:00', 50.0, 0.162, 5, '')], 
      dtype=[('time', '|O8'), ('var1', '<f8'), ('var2', '<f8'), ('var3', '<i8'), ('var4', '|O8')]), {'calibaration_solution_concentration': 10.0,
 'calibaration_solution_concentration_units': 'g/l',
 'dtypes': ['time_str', 'float', 'float', 'int', 'str'],
 'experimenter': 'MAW + UM',
 'headers': ['time', 'var1', 'var2', 'var3', 'var4'],
 'raw_headers': ['time', 'var1', 'var2', 'var3', 'var4'],
 'title': 'Test file',
 'units': ['UTC-7', 'ml', '', 'm^3', '']})
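
The last comment line of the header, of the form name (units) [dtype], can be split into headers, units and dtypes with a regular expression. This is a sketch under assumptions, not the parser used by read_maw_file: parse_format_line and COLUMN are hypothetical names, empty units are assumed to be written as '()', and units are assumed not to contain commas or closing parentheses.

```python
import re

# One column description: name, then (units), then [dtype].
COLUMN = re.compile(
    r"^\s*(?P<name>[^(\[]*?)\s*\((?P<units>[^)]*)\)\s*\[(?P<dtype>\w+)\]\s*$"
)


def parse_format_line(line):
    """Parse '# name0 (units) [dtype], name1 (units) [dtype], ...' (sketch)."""
    spec = line.lstrip("#").strip()
    parsed = {"headers": [], "units": [], "dtypes": []}
    for column in spec.split(","):
        match = COLUMN.match(column)
        if match is None:
            raise ValueError("cannot parse column description: %r" % column)
        parsed["headers"].append(match.group("name"))
        parsed["units"].append(match.group("units").strip())
        parsed["dtypes"].append(match.group("dtype"))
    return parsed


line = "# time (UTC-7) [time_str], var1 (ml) [float], var2 () [float], var3 (m^3) [int], var4 () [str]"
print(parse_format_line(line))
```

The example line is made up so that the result resembles the 'headers', 'units' and 'dtypes' entries of the metadata shown above.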