pandas is a Python library for data manipulation and analysis. It can work with many kinds of data, including:
- symbol separated data (tsv, csv, etc.)
- ordered and unordered time series data
- matrix and table data
- labelled and unlabelled data
Reading tab delimited data
First create a data file. Open a text editor and enter three lines of text, with three words on each line separated by tabs, then save the file as data.tsv. Something like the following:
id name dob
11 Alice January
12 Bob February
Code
import pandas as pd
mydata = pd.read_table('data.tsv')
# for remote data
# pd.read_table('http://...')
mydata.head() # head() shows the first 5 rows
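To try this without creating the file by hand, the following is a minimal sketch that writes the sample rows above to a temporary file with real tab characters and reads them back. The temporary directory and the variable names (rows, tsv_data) are assumptions made for illustration only.

```python
import os
import tempfile

import pandas as pd

# Sample rows from the text, joined with real tab characters.
rows = [
    ["id", "name", "dob"],
    ["11", "Alice", "January"],
    ["12", "Bob", "February"],
]
text = "\n".join("\t".join(row) for row in rows) + "\n"

# Write to a temporary directory (an assumption, so no files are left behind).
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.tsv")
    with open(path, "w") as f:
        f.write(text)
    # read_table splits on tabs by default and uses the first line as the header.
    tsv_data = pd.read_table(path)

print(tsv_data.columns.tolist())
print(tsv_data.head())
```

The first line of the file becomes the column labels, so the resulting table holds the two remaining rows.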
Reading character delimited files
Sample data. Save as data.txt
11|Alice|January
12|Bob|February
Code
import pandas as pd
cols = ['id', 'name', 'dob']
mydata = pd.read_table('data.txt', sep='|', header=None, names=cols)
Note that the sample data has no header row. Passing header=None tells pandas to treat the first line as data, and names=cols supplies the column names.
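The effect of header=None is easiest to see side by side. The sketch below reads the same pipe-delimited sample twice from an in-memory string (io.StringIO stands in for data.txt, which is an assumption for the sake of a self-contained example). The column labels passed to names= are illustrative choices that match the sample values.

```python
import io

import pandas as pd

# Same pipe-delimited sample as above, held in memory instead of data.txt.
sample = "11|Alice|January\n12|Bob|February\n"

# Default behaviour: the first line is consumed as the header,
# so only one data row remains.
without_fix = pd.read_table(io.StringIO(sample), sep="|")

# header=None keeps every line as data; names= supplies the labels.
with_fix = pd.read_table(
    io.StringIO(sample),
    sep="|",
    header=None,
    names=["id", "name", "dob"],  # illustrative labels
)

print(without_fix.columns.tolist(), len(without_fix))
print(with_fix.columns.tolist(), len(with_fix))
```

Without header=None the first record silently turns into column labels, which is easy to miss in a large file.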