Week 15 summary -- Python files and command line
08 Dec 2011
We covered some examples outlined in last week's agenda. Next week will have a quiz -- I'll post some practice problems this weekend.
Line count program
Create a program that counts the number of lines in the file specified on the command line. So if you type linecount FILENAME, it will print the number of lines in FILENAME.
Create an alternate version that counts the lines from stdin if no FILENAME is specified.
import sys
# test to see if a filename was specified
if len(sys.argv) == 2:
    # get filename
    name = sys.argv[1]
    # open file
    f = open(name)
else:
    f = sys.stdin
# get list of lines
lines = f.readlines()
# print number of lines
print(len(lines))
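Calling readlines() builds a list of every line just to count them. As a minimal sketch of an alternative, the loop below counts lines while iterating, so only one line is held at a time. The function name count_lines is hypothetical and not part of the class code, and io.StringIO only stands in for a real file or sys.stdin.

```python
import io

def count_lines(f):
    """Count lines in an open file (or any iterable of lines) without storing them."""
    count = 0
    for _ in f:
        count = count + 1
    return count

# io.StringIO stands in for a real file or sys.stdin here (an assumption for the demo)
sample = io.StringIO("one\ntwo\nthree\n")
print(count_lines(sample))  # 3
```

In the program above, this would replace the last two statements with print(count_lines(f)).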
Word count program
Create a program that counts the number of lines containing a particular word. So if you type wordlines WORD FILENAME, it will print the number of lines that contain WORD in the file FILENAME.
Create an alternate version that prints all those lines instead of counting them. Try using this version in a pipeline with your linecount program.
import sys
word = sys.argv[1]
name = sys.argv[2]
f = open(name)
# look in all lines of file for word
count = 0
for line in f:
    if word in line:
        print(line, end='') # don't print extra newlines
        # First version: keep track of number of matches
        # count = count + 1
# First version: print number of matches
# print(count)
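One thing to keep in mind: the test word in line matches substrings, so searching for cat also matches a line containing concatenate. The sketch below contrasts that with a whole-word test built on line.split(). Both function names are hypothetical, chosen only for this illustration.

```python
def lines_containing(word, lines):
    """Return the lines in which word appears anywhere, even inside another word."""
    return [line for line in lines if word in line]

def lines_with_whole_word(word, lines):
    """Return the lines in which word appears as a whitespace-separated word."""
    return [line for line in lines if word in line.split()]

sample = ["the cat sat\n", "concatenate strings\n", "no match here\n"]
print(len(lines_containing("cat", sample)))       # 2
print(len(lines_with_whole_word("cat", sample)))  # 1
```

The whole-word version is still simple-minded: punctuation stays attached to words after split(), so a line ending in "cat." would not count as containing cat.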
Longest word program
Create a program that finds the longest word in a file. Hints:
- words = line.split() will take a line and split it into a list of words.
- Break the program up into small functions. Define a function that finds the longest word in a list of words. Then define a function that finds the longest word in the entire file, using the first function.
- For finding the longest word, think about how you defined the max function to find the max number in a list. You can find the longest word in almost the same way -- you just need to transform the input in some way.
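To make the last hint concrete, here is a rough sketch of what a max function for numbers might look like. The name max_number and its exact form are assumptions, not necessarily what was written in class. Feeding it the word lengths instead of the words shows the kind of transformation the hint is pointing at.

```python
def max_number(numbers):
    """Return the largest number in a non-empty list (hypothetical class-style max)."""
    largest = numbers[0]
    for n in numbers:
        if n > largest:
            largest = n
    return largest

words = "a quick brown fox".split()
# Transform the input: compare lengths rather than the words themselves.
lengths = [len(w) for w in words]
print(max_number(lengths))  # 5
```

This only gives the length of the longest word. To get the word itself, keep the same loop but remember the word and compare with len().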
import sys
# split a line into words.
# This could be tweaked later on to split more intelligently.
def words(line):
    return line.split()
# return longest word in a list
def longest(words):
    long = ''
    for w in words:
        if len(w) > len(long):
            long = w
    return long
name = sys.argv[1]
f = open(name)
# Check all lines in file, keeping track of longest
# word seen so far.
longest_so_far = ''
for line in f:
    w = words(line)
    longest_on_line = longest(w)
    if len(longest_on_line) > len(longest_so_far):
        longest_so_far = longest_on_line
# When loop is done, longest_so_far contains longest in entire file.
print(longest_so_far)
Instead of working line-by-line, we could read in the entire file all at once, and then split that whole thing into words. Our original words function still works for this purpose. If we get all words in the file, we can just feed all of those into the longest function and be done. One downside is that for a very large file, we have to store the whole thing in memory with this version.
import sys
# split a string into words.
# This could be tweaked later on to split more intelligently.
def words(line):
    return line.split()
# return longest word in a list
def longest(words):
    long = ''
    for w in words:
        if len(w) > len(long):
            long = w
    return long
name = sys.argv[1]
f = open(name)
contents = f.read()
all_words = words(contents)
print(longest(all_words))
As a final version, Python's built-in max function takes a key argument that specifies how the items are compared. Using that, the program can be written on one line if you really want:
import sys
print(max(open(sys.argv[1]).read().split(), key=len))
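The one-liner reads the whole file into memory, and max raises a ValueError if the file contains no words. As a sketch of one way around both, the version below feeds max a generator that yields words one line at a time and passes default='' (available in Python 3.4 and later) for the empty case. The function name longest_word is hypothetical, and io.StringIO only stands in for an opened file.

```python
import io

def longest_word(f):
    """Return the longest word in an open file, reading one line at a time."""
    # Generator expression: words are produced lazily, never stored as a full list.
    all_words = (w for line in f for w in line.split())
    # default='' is returned when there are no words at all.
    return max(all_words, key=len, default='')

sample = io.StringIO("a short line\nan extraordinarily long one\n")
print(longest_word(sample))  # extraordinarily
```

When several words tie for longest, max returns the first one it sees, which matches the behavior of the longest function above.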