{"id":375,"date":"2020-08-19T05:28:00","date_gmt":"2020-08-19T09:28:00","guid":{"rendered":"https:\/\/molecularsciences.org\/content\/?p=375"},"modified":"2024-02-08T08:21:13","modified_gmt":"2024-02-08T13:21:13","slug":"using-pandas-to-read-file-data","status":"publish","type":"post","link":"https:\/\/molecularsciences.org\/content\/using-pandas-to-read-file-data\/","title":{"rendered":"Using Pandas to read file data"},"content":{"rendered":"\n<p>Pandas is a Python library for data manipulation and analysis. It can work with many different kinds of data, including:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>delimiter-separated data (TSV, CSV, etc.)<\/li><li>ordered and unordered time series data<\/li><li>matrix and table data<\/li><li>labelled and unlabelled data<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Reading tab-delimited data<\/h3>\n\n\n\n<p>First, create a data file. Open a text editor and enter three lines, each with three values separated by tabs, then save the file as data.tsv. For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>id    name    dob\n11    Alice    January\n12    Bob    February<\/code><\/pre>\n\n\n\n<p>Code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n# read_table() uses a tab as its default separator\nmydata = pd.read_table('data.tsv')\n# for remote data, pass a URL instead of a file name\n# pd.read_table('http:\/\/...')\nmydata.head()    # head() shows the first 5 rows<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Reading character-delimited files<\/h3>\n\n\n\n<p>Sample data. Save it as data.txt:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>11|Alice|January\n12|Bob|February<\/code><\/pre>\n\n\n\n<p>Code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n# column names to use, since the file has no header row\ncols = &#91;'id', 'name', 'dob']\nmydata = pd.read_table('data.txt', sep='|', header=None, names=cols)<\/code><\/pre>\n\n\n\n<p>Note that the sample data has no header row. Passing header=None tells pandas not to treat the first line as a header, and names=cols supplies the column names.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pandas is a Python library for data manipulation and analysis. 
It can work with many different kinds of data, including: delimiter-separated data (TSV, CSV, etc.) ordered and unordered time series data matrix and table data labelled and unlabelled data Reading tab-delimited data First, create a data file. Open a text editor and enter three lines, each with three [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":548,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[203],"tags":[137],"class_list":["post-375","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python","tag-python"],"_links":{"self":[{"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/posts\/375","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/comments?post=375"}],"version-history":[{"count":2,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/posts\/375\/revisions"}],"predecessor-version":[{"id":550,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/posts\/375\/revisions\/550"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/media\/548"}],"wp:attachment":[{"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/media?parent=375"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/categories?post=375"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/tags?post=375"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":
true}]}}