Python TutorialGetting Started with PythonPython Basic SyntaxPython DatatypesPython IndentationPython Collection TypesPython Basic Input and OutputPython Built in Modules and FunctionsPython FunctionsChemPy - python packageCreating Python packagesFunctional Programming in PythonIncompatibilities moving from Python 2 to Python 3IoT Programming with Python and Raspberry PIKivy - Cross-platform Python Framework for NUI DevelopmentMutable vs Immutable (and Hashable) in PythonPyInstaller - Distributing Python CodePython *args and **kwargsPython 2to3 toolPython Abstract Base Classes (abc)Python Abstract syntax treePython Alternatives to switch statement from other languagesPython and ExcelPython Anti-PatternsPython ArcPyPython ArraysPython Asyncio ModulePython Attribute AccessPython AudioPython Binary DataPython Bitwise OperatorsPython Boolean OperatorsPython Checking Path Existence and PermissionsPython ClassesPython CLI subcommands with precise help outputPython Code blocks, execution frames, and namespacesPython Collections modulePython Comments and DocumentationPython Common PitfallsPython Commonwealth ExceptionsPython ComparisonsPython Complex mathPython concurrencyPython ConditionalsPython configparserPython Context Managers (with Statement)Python Copying dataPython CountingPython ctypesPython Data SerializationPython Data TypesPython Database AccessPython Date and TimePython Date FormattingPython DebuggingPython DecoratorsPython Defining functions with list argumentsPython DeploymentPython Deque ModulePython DescriptorPython Design PatternsPython DictionaryPython Difference between Module and PackagePython DistributionPython DjangoPython Dynamic code execution with `exec` and `eval`Python EnumPython ExceptionsPython ExponentiationPython Files & Folders I/OPython FilterPython FlaskPython Functools ModulePython Garbage CollectionPython GeneratorsPython getting start with GZipPython graph-toolPython groupby()Python hashlibPython HeapqPython Hidden FeaturesPython HTML ParsingPython HTTP ServerPython IdiomsPython ijsonPython Immutable datatypes(int, float, str, tuple and frozensets)Python Importing modulesPython Indexing and SlicingPython Input, Subset and Output External Data Files using PandasPython Introduction to RabbitMQ using AMQPStorm

Python Input, Subset and Output External Data Files using Pandas

From WikiOD

This section shows basic code for reading, sub-setting and writing external data files using pandas.

Basic Code to Import, Subset and Write External Data Files Using Pandas[edit | edit source]

# Print the working directory
import os
print os.getcwd()
# C:\Python27\Scripts

# Set the working directory
os.chdir('C:/Users/general1/Documents/simple Python files')
print os.getcwd()
# C:\Users\general1\Documents\simple Python files

# load pandas
import pandas as pd

# read a csv data file named 'small_dataset.csv' containing 4 lines and 3 variables
my_data = pd.read_csv("small_dataset.csv")
#     x   y   z
# 0   1   2   3
# 1   4   5   6
# 2   7   8   9
# 3  10  11  12

my_data.shape       # number of rows and columns in data set
# (4, 3)

my_data.shape[0]    # number of rows in data set
# 4

my_data.shape[1]    # number of columns in data set
# 3

# Python uses 0-based indexing.  The first row or column in a data set is located
# at position 0.  In R the first row or column in a data set is located
# at position 1.

# Select the first two rows
#    x   y   z
#0   1   2   3
#1   4   5   6

# Select the second and third rows
#    x  y  z
# 1  4  5  6
# 2  7  8  9

# Select the third row
#    x   y   z
#2   7   8   9

# Select the first two elements of the first column
my_data.iloc[0:2, 0:1]
#    x
# 0  1
# 1  4

# Select the first element of the variables y and z
my_data.loc[0, ['y', 'z']]
# y    2
# z    3

# Select the first three elements of the variables y and z
my_data.loc[0:2, ['y', 'z']]
#    y  z
# 0  2  3
# 1  5  6
# 2  8  9

# Write the first three elements of the variables y and z
# to an external file.  Here index = 0 means do not write row names.

my_data2 = my_data.loc[0:2, ['y', 'z']]

my_data2.to_csv('my.output.csv', index = 0)