Python3 with Oracle database

Installation

These instructions are for macOS:

First install Homebrew. Never use sudo with brew.

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
$ which brew
/usr/local/bin/brew

Install Python3. The installation comes with pip3.

$ brew install python3
$ which pip3
/usr/local/bin/pip3

Install cx_Oracle, the library that allows Python3 to communicate with an Oracle database.

What is Caching

A cache is a temporary data store that usually contains pre-computed data. Its purpose is to provide that data the next time someone asks for it without having to re-compute it. For example, suppose your website needs to run a complex query to fetch the results of a user request. A complex query takes time to run and uses system resources. Without a cache, the query has to run every time someone clicks on the link. If you cache the results, you can simply serve them from the cache rather than re-querying the data.
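The idea can be sketched in a few lines of Python. This is only an illustration: run_complex_query and get_results are hypothetical names, and the "query" is a stand-in for real database work. A plain dictionary plays the role of the cache.

```python
cache = {}        # maps a request to its previously computed result
query_runs = 0    # counts how often the expensive query actually runs


def run_complex_query(request):
    """Hypothetical stand-in for a slow, resource-hungry query."""
    global query_runs
    query_runs += 1
    return "results for " + request


def get_results(request):
    """Serve from the cache when possible; run the query only on a miss."""
    if request not in cache:
        cache[request] = run_complex_query(request)
    return cache[request]


first = get_results("top sellers")    # cache miss: the query runs
second = get_results("top sellers")   # cache hit: the query is skipped
print(first == second)
print(query_runs)
```

A real cache also needs a policy for expiring stale entries, which this sketch leaves out.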

Quick Introduction to Google Cloud

Google Cloud Platform (GCP) offers many services. They can loosely be classified as:

  • compute services
  • storage services
  • big data services
  • identity and security services
  • management tools
  • developer tools
  • other services

Compute services include virtual machines, containers, and functions.

Storage services allow storage of files, archival storage, and persistent data. Data services include NoSQL, RDBMS, Hadoop and Spark.

Quick Introduction to Cloud

Cloud refers to services hosted over the Internet. For example, Google Drive and Google Docs are cloud services. Google Drive allows you to save your files on Google's hardware. Google Docs is a collection of software (word processor, spreadsheet, etc.) hosted on Google's servers. The software resides on Google's hardware and uses their memory and CPU. You access it through the Internet. There are three categories of cloud services:

Python3 Basics

Variables

See the following program. Save as datatype.py:

a = 100    # integer
b = 1.23   # float
c = "python"    # string

# print variables
print(a)
print(b)
print(c)

# convert int to float
print(float(100))         

# convert float to int
print(int(3.14))

# convert string to int
d = "12"
e = "12.3"
print(int(d))
# print(int(e)) - this will generate an error

# convert string to float
print(float(d))
print(float(e))

# convert int to string
f = str(12)
print(type(f))
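The commented-out int(e) line fails because int() only parses strings that look like whole numbers. Here is a minimal sketch of the error and one possible workaround, converting through float() first. Note that this drops the fractional part, which may or may not be what you want.

```python
e = "12.3"

# int() cannot parse a string with a decimal point
try:
    int(e)
except ValueError as err:
    print("int(e) failed:", err)

# convert to float first; int() then truncates toward zero
converted = int(float(e))
print(converted)
```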

To run

Python3 File Management

This page shows how to work with text files using Python3 code. The following is a sample text file we will be using. Let's call it stocks.txt:

bce.to
hnd.to
mtl.to

Reading from File

You can use the read(), readline(), or readlines() functions to read from a file. This example uses read():

stocks = open("data/stocks.txt","r")
stocks.read()
stocks.close()

output:

'bce.to\nhnd.to\nmtl.to'

Using readline():
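A minimal sketch, assuming the same three-line stocks.txt shown earlier. So that it runs on its own, it first writes a temporary copy of the file (the temporary path is an assumption, not part of the original example). Each call to readline() returns the next line, including its trailing newline, and readlines() returns whatever lines remain as a list.

```python
import os
import tempfile

# write a temporary copy of the sample file so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "stocks.txt")
with open(path, "w") as f:
    f.write("bce.to\nhnd.to\nmtl.to")

# the with statement closes the file automatically
with open(path, "r") as stocks:
    first = stocks.readline()    # first line, newline included
    second = stocks.readline()   # second line, newline included
    rest = stocks.readlines()    # list of the remaining lines

print(first.strip())
print(second.strip())
print(rest)
```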

Python3 Code Snippets

Get Current Date

import datetime as dt
now = dt.datetime.now()
print(now.year)
print(now.month)
print(now.day)

Download File from Internet

import urllib.request
url = 'http://molecularsciences.org'
response = urllib.request.urlopen(url)
# read() returns the page as bytes
mydata = response.read()
# decode the bytes into a string
mytext = mydata.decode('utf-8')
print(mytext)
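Network requests can fail, so it may help to wrap the download in a small function. The following is a sketch under assumptions: fetch_text is a hypothetical helper name, and the 10-second timeout and UTF-8 default are arbitrary choices.

```python
import urllib.error
import urllib.request


def fetch_text(url, encoding="utf-8", timeout=10):
    """Return the decoded body of url, or None if urllib reports a URLError."""
    try:
        # the with statement closes the response when the block ends
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode(encoding)
    except urllib.error.URLError as err:
        print("download failed:", err.reason)
        return None
```

Calling fetch_text('http://molecularsciences.org') would then return the page text, or None if the request could not be completed.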

Hive

Hive provides an SQL-like query language (HiveQL) for processing and analyzing data in Hadoop. It does not require knowledge of any programming language. Hive is not suitable for OLTP; it is designed for analyzing big data.

Hadoop is not a database. It is an ecosystem of tools that provides the features we need when dealing with big data. Hadoop stores data in HDFS, and its native processing model is MapReduce. Hive converts your SQL commands to MapReduce jobs. Hive also supports workflow integration with other tools such as Excel or Cognos.
