Standard reduction pipeline
Note: This procedure currently creates approximately 5 GB of data on top of the downloaded file size.
- Download the data dump `xxx.csv.tar.gz` from the email you get every Sunday (if you don't receive it, contact Meg).
- Unpack it. On a Mac, the Archive Utility should unpack it properly; on Linux, use `tar zxvf yyyy_mm_dd_xxxx.csv.tar.gz`.
- Clone this repository (see here for instructions).
- `cd` into `P4_sandbox/planet4`.
- Run `python reduction.py path_to_csv_file`, where the argument is the full path to the unpacked CSV file.
You need a current Python environment that includes the following modules:
- pandas
- PyTables (tables)
- scipy / numpy
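
If you want to verify your environment before running the pipeline, a minimal check like the following (just a convenience sketch, not part of the repository) will tell you whether all required modules are importable:

```python
# Convenience sketch: report whether each module the pipeline needs
# is importable, and print its version if it is.
for name in ("pandas", "tables", "scipy", "numpy"):
    try:
        module = __import__(name)
        print(name, getattr(module, "__version__", "unknown version"))
    except ImportError:
        print(name, "is MISSING - install it first")
```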
I can recommend the Anaconda distribution from Continuum Analytics; it includes extra features for academic users. I have also used Enthought's Canopy successfully for years, although on Linux I don't like the hoops one has to jump through for a multi-user installation.
The script will create both the queryable and the fast-read HDF5 database files in the same folder as the given CSV file.
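
Once the reduction has finished, the HDF5 files can be read back with pandas. The file name and store key below are placeholders (they depend on your input CSV and on how `reduction.py` names its output), so adjust them to match what you find next to your CSV file:

```python
import pandas as pd

# NOTE: the file name and store key here are hypothetical examples;
# check the actual names that reduction.py produced next to your CSV.
fname = "yyyy_mm_dd_xxxx_queryable.h5"

# Load the full table from the store into a DataFrame ...
df = pd.read_hdf(fname, "df")

# ... or, with the queryable store, pull out only the rows you need
# without reading the whole file into memory:
subset = pd.read_hdf(fname, "df",
                     where="classification_id == 'some_id'")
```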