Python split string to list

The split method breaks a string into a list of substrings, which makes the text easier to process. By default, it splits on whitespace (spaces, tabs, newlines, etc.). You can override this by passing a separator that is more meaningful to your application.
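To make the default behaviour concrete, here is a minimal sketch comparing split() with no argument against split(" ") with an explicit single-space separator. The sample string `messy` is made up for illustration.

```python
# Hypothetical sample string with irregular whitespace (illustration only).
messy = "  warm   hat\tcold\noutside  "

# No argument: any run of whitespace counts as one separator,
# and leading/trailing whitespace is ignored.
default_parts = messy.split()

# Explicit single space: every space is its own separator,
# so empty strings appear and the tab/newline are left in place.
space_parts = messy.split(" ")

print(default_parts)  # ['warm', 'hat', 'cold', 'outside']
print(space_parts)
```

The explicit-separator form is stricter, which is usually what you want for delimited data but not for free-form text.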

Consider the following example, which breaks the provided text into either words or sentences, depending on the argument passed to split:

text = "Wear a warm hat. It is cold outside. Stay Warm."

words = text.split()
sentences = text.split('.')

print(words) 
print("------")
print(sentences)

When we run it, we get the following:

> python split.py
['Wear', 'a', 'warm', 'hat.', 'It', 'is', 'cold', 'outside.', 'Stay', 'Warm.']
------
['Wear a warm hat', ' It is cold outside', ' Stay Warm', '']
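Note that splitting on '.' leaves a leading space on the later sentences and an empty string after the final full stop. One possible way to tidy this up, sketched here with a list comprehension and strip, is:

```python
text = "Wear a warm hat. It is cold outside. Stay Warm."

# Strip surrounding whitespace from each piece and drop pieces that end up empty.
sentences = [part.strip() for part in text.split('.') if part.strip()]

print(sentences)
# ['Wear a warm hat', 'It is cold outside', 'Stay Warm']
```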

We are fond of CSV (comma-separated values) files, and use pandas to process them. Still, it is easy to see how we could build something similar ourselves based on split.

Consider the following file, database.csv:

id,colour,weight,type,quantity
1,blue,finger,wool,22.0
2,green,finger,wool,4.0
3,blue,sock,wool-blend,20.0
6,black,chunky,acrylic,35.0
7,white,chunky,acrylic,11.0

The following code reads the file in its entirety, rather than line by line, so that we can also demonstrate the splitlines method, which we use to isolate the individual lines. We then split each line on a comma separator.

file = "database.csv"

# Read in the database; the with block closes the file when we are done
with open(file, mode='r') as f:
    entireFile = f.read()

# Split into lines
lines = entireFile.splitlines()
print(lines)

print("------")

# Split each line into fields
for line in lines:
    fields = line.split(',')
    print(fields)

When we run it, we get the following:

> python split.py
['id,colour,weight,type,quantity', '1,blue,finger,wool,22.0', '2,green,finger,wool,4.0', '3,blue,sock,wool-blend,20.0', '6,black,chunky,acrylic,35.0', '7,white,chunky,acrylic,11.0']
------
['id', 'colour', 'weight', 'type', 'quantity']
['1', 'blue', 'finger', 'wool', '22.0']
['2', 'green', 'finger', 'wool', '4.0']
['3', 'blue', 'sock', 'wool-blend', '20.0']
['6', 'black', 'chunky', 'acrylic', '35.0']
['7', 'white', 'chunky', 'acrylic', '11.0']
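As a possible next step, the header row can be paired with each data row to produce one dictionary per record. The sketch below is an illustration rather than a replacement for pandas: it keeps a few rows of the sample data in a string so it runs without the file, and it assumes no field contains a comma or quotes (for real-world CSV, Python's built-in csv module handles those cases).

```python
# A few rows of the sample data, held in a string so the sketch
# runs without database.csv being present.
data = """id,colour,weight,type,quantity
1,blue,finger,wool,22.0
2,green,finger,wool,4.0
3,blue,sock,wool-blend,20.0
"""

lines = data.splitlines()

# The first line holds the column names.
header = lines[0].split(',')

records = []
for line in lines[1:]:
    fields = line.split(',')
    # Pair each column name with the matching field value.
    record = dict(zip(header, fields))
    # split always returns strings, so convert numeric columns explicitly.
    record['quantity'] = float(record['quantity'])
    records.append(record)

print(records[0])
# {'id': '1', 'colour': 'blue', 'weight': 'finger', 'type': 'wool', 'quantity': 22.0}
```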