Homework 4: Command line

Update: answers posted below each problem.

Due: Thursday October 11, 11:59PM.

To submit: Send an email to me at jal2016@email.vccs.edu with subject CSC 110 HW4 with your work attached in a zip file.

For this homework assignment, you'll be creating several text files by using command line utilities. In addition, create another text file called csc110-hw4.txt to contain answers to other questions.

  1. Browse this command line reference site, and select three commands you'd like to try out. (Read the descriptions to make sure that they don't do anything destructive to your computer.) You can also type help at the command line for a summary of some of the available commands. For each command, execute the command and redirect the output to a file called command-name.txt (replacing command-name with the actual name of the command you chose).

    For example, if I chose the dir command, I would execute dir > dir.txt. You should choose commands we have not already used in class.

    Submit the three files you create as part of your homework zip file. In addition, in your homework answers file, briefly describe each command and explain why it is useful.

    Sample answer:

    The for command is used to execute another command repeatedly. It has lots of complicated options, but here are two simple cases:

    • Used with the /L flag, the command will cycle through a range of numbers given as (start,step,end). The following example cycles through the numbers from 1 to 5 and executes the command echo %G (appending the output to for.txt) for each number (with %G replaced by the number):

      C:\Users\jlepak>for /L %G IN (1,1,5) do echo %G >> for.txt
      
      C:\Users\jlepak>echo 1  1>>for.txt
      
      C:\Users\jlepak>echo 2  1>>for.txt
      
      C:\Users\jlepak>echo 3  1>>for.txt
      
      C:\Users\jlepak>echo 4  1>>for.txt
      
      C:\Users\jlepak>echo 5  1>>for.txt
      
      C:\Users\jlepak>type for.xt
      The system cannot find the file specified.
      
      C:\Users\jlepak>type for.txt
      1
      2
      3
      4
      5
      

      Note that all 5 lines that look like C:\Users\jlepak>echo N  1>>for.txt are printed automatically after executing the for command. (The 1>> is just the way the command prompt displays >>; 1 refers to standard output.) This printing can be disabled by specifying @echo off before running the command.

    • Used without any extra flags, it will cycle through all files listed by a glob pattern. The following sequence of commands shows a way to rename all files in your working directory by adding a prefix ("new-" in this case) to the filename. The example directory contains files named x1, x2, and x3 initially:

      C:\Users\jlepak\tmp\test>dir /b
      x1
      x2
      x3
      
      C:\Users\jlepak\tmp\test>for %f in (*) do move %f new-%f
      
      C:\Users\jlepak\tmp\test>move x1 new-x1
              1 file(s) moved.
      
      C:\Users\jlepak\tmp\test>move x2 new-x2
              1 file(s) moved.
      
      C:\Users\jlepak\tmp\test>move x3 new-x3
              1 file(s) moved.
      
      C:\Users\jlepak\tmp\test>dir /b
      new-x1
      new-x2
      new-x3
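
If you'd like to cross-check both loops outside the command prompt, here is a rough Python sketch of the same two ideas. It is not part of the assignment. The temporary directory and the variable names are my own choices for illustration, and the file names x1, x2, x3 simply mirror the example above.

```python
import tempfile
from pathlib import Path

# for /L %G in (start,step,end) includes the end value,
# so the equivalent range() call needs end + 1.
start, step, end = 1, 1, 5
numbers = list(range(start, end + 1, step))

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Create empty files named like the ones in the example directory.
    for name in ("x1", "x2", "x3"):
        (folder / name).touch()

    # Take a snapshot of the file list before renaming anything,
    # so a renamed file is never picked up a second time.
    originals = sorted(p for p in folder.iterdir() if p.is_file())
    for path in originals:
        path.rename(path.with_name("new-" + path.name))

    renamed = sorted(p.name for p in folder.iterdir())

print(numbers)
print(renamed)
```

The snapshot step is a design choice in the sketch: the list of files is fixed before the first rename happens.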
      
  2. Run the following commands:

    set a=50
    set b=25
    set ab=%a% + %b%
    set /a c=%a% + %b%
    
    • After running those commands, what is the output of the following commands?

      echo c = %c%
      echo ab = %ab%
      

      Answers:

      C:\Users\jlepak>set a=50
      
      C:\Users\jlepak>set b=25
      
      C:\Users\jlepak>set ab=%a% + %b%
      
      C:\Users\jlepak>set /a c=%a% + %b%
      75
      C:\Users\jlepak>echo c = %c%
      c = 75
      
      C:\Users\jlepak>echo ab = %ab%
      ab = 50 + 25
      
    • What are the commands you'd run to execute the following?

      • Create a variable named x with value 4321.
      • Create a variable named y and set it to the value of x squared.
      • Print the value of y.

      Answers:

      C:\Users\jlepak>set x=4321
      
      C:\Users\jlepak>set /a y=%x% * %x%
      18671041
      C:\Users\jlepak>echo %y%
      18671041
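
The difference between set and set /a is the difference between building a string and doing arithmetic. As a cross-check outside the command prompt, here is a small Python sketch of the same distinction. The variable names mirror the ones above. One caveat: set /a works with 32-bit signed integers, while Python integers have no such limit, so the two only agree while the results stay in that range (as they do here).

```python
# Environment variables hold text, so start from strings.
a = "50"
b = "25"

# Like: set ab=%a% + %b%   (plain text substitution, no arithmetic)
ab = f"{a} + {b}"

# Like: set /a c=%a% + %b%   (the expression is evaluated as integers)
c = int(a) + int(b)

# Like: set x=4321 followed by set /a y=%x% * %x%
x = "4321"
y = int(x) * int(x)

print(f"ab = {ab}")
print(f"c = {c}")
print(f"y = {y}")
```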
      
  3. Download these example sent emails (part of this Enron email dataset), and extract the contents of the zip file. Use the dir, find, and findstr commands to answer the following questions. You'll need to examine the help message for each command to determine how to use them: type dir /?, find /?, or findstr /?. Warning: some files may contain offensive language.

    • What was the largest sent email? Look for the dir flag that sorts by file size.
    • How many emails contain the word "bankrupt"? Hint: find the flag to use with the findstr command that will print just the names of the files that match (and not all the lines).
    • How many lines total contain the word "bankrupt"? Hint: pipe the output of type * (which will print the contents of every file) to a find command. Save each such line in a file called "bankrupt-lines.txt".

    Answers:

    C:\Users\jlepak\tmp\enron-sent-sample>dir /os
    ...
    10/02/2012  09:12 AM            32,224 1887
    10/02/2012  09:12 AM            34,784 6557
    10/02/2012  09:12 AM           115,021 6012
    10/02/2012  09:12 AM           115,990 6013
    10/02/2012  09:12 AM           151,144 5691
                7871 File(s)     16,924,252 bytes
                   2 Dir(s)  17,110,171,648 bytes free
    

    So email 5691 is the largest.

    C:\Users\jlepak\tmp\enron-sent-sample>findstr /M bankrupt *
    1149
    1335
    229
    2840
    2943
    2944
    3632
    4002
    483
    5045
    5448
    ...
    

    The total printout is 154 different files. findstr can also accept a regular expression if the /R flag is specified, which lets you specify the search target more precisely. In findstr's syntax, \< marks the beginning of a word and \> marks the end of a word. The following command attempts to find "bankrupt" only as a complete word (not as part of a word like "bankruptcy"), and finds only 27 files:

    C:\Users\jlepak\tmp\enron-sent-sample>findstr /M /R "\<bankrupt\>" *
    1149
    3632
    483
    5646
    5691
    5838
    5844
    5845
    ...
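
To see why the whole-word search finds fewer files, here is a minimal Python sketch of substring matching versus whole-word matching. The three sample lines are made up for illustration and do not come from the dataset. Python's \b is used as a stand-in for findstr's \< and \>; the two tools may not agree on every edge case of what counts as a word.

```python
import re

# Hypothetical sample lines, not taken from the Enron emails.
lines = [
    "The company may go bankrupt next year.",
    "They filed for bankruptcy in December.",
    "Nothing relevant here.",
]

# Plain substring search, like: findstr bankrupt
substring_hits = [ln for ln in lines if "bankrupt" in ln]

# Whole-word search, roughly like: findstr /R "\<bankrupt\>"
whole_word = re.compile(r"\bbankrupt\b")
word_hits = [ln for ln in lines if whole_word.search(ln)]

print(len(substring_hits))
print(len(word_hits))
```

The line containing "bankruptcy" matches the substring search but not the whole-word search.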
    

    To count the total number of lines, you can use the command:

    C:\Users\jlepak\tmp\enron-sent-sample>type * | find /c "bankrupt"
    

    This will pipe the contents of every file through the find command. It will display all filenames as they are read, but the final line of output is the number of lines counted by find, which is 367. Note that the command find /c "bankrupt" * is similar, but will print the count of lines for each file separately.
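
As a cross-check, the per-file counts and the overall total can be sketched in Python. This is only an illustration under my own assumptions: the function name count_lines_per_file is hypothetical, and the demo runs on two made-up files in a temporary directory rather than on the real emails.

```python
import tempfile
from pathlib import Path


def count_lines_per_file(directory, needle):
    """Return {file name: number of lines containing needle}."""
    counts = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a file has odd bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            counts[path.name] = sum(1 for line in handle if needle in line)
    return counts


# Demo on made-up files; the names "1" and "2" imitate the numbered emails.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "1").write_text("going bankrupt\nnothing\n", encoding="utf-8")
    Path(tmp, "2").write_text("bankruptcy court\nbankrupt again\n", encoding="utf-8")
    demo_counts = count_lines_per_file(tmp, "bankrupt")

# Per-file counts resemble find /c "bankrupt" *; their sum resembles
# type * | find /c "bankrupt".
demo_total = sum(demo_counts.values())
print(demo_counts)
print(demo_total)
```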

    To save each such line, use the command:

    C:\Users\jlepak\tmp\enron-sent-sample>type * | find "bankrupt" > ..\bankrupt-lines.txt
    

    Note that I had to save the output to a file in the parent directory. This is due to an issue with the way that the expansion of * works. If I had saved to a file in the current directory, that same file would be made part of the list that is printed, which is not what I would want.
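
The same precaution can be sketched in Python: decide which files are inputs before writing anything, and keep the output file out of that list. The function name save_matching_lines and the demo files are hypothetical, chosen only to illustrate the idea.

```python
import tempfile
from pathlib import Path


def save_matching_lines(directory, needle, out_path):
    """Write lines containing needle to out_path; return how many were saved."""
    directory = Path(directory)
    out_path = Path(out_path)
    # Fix the list of input files up front and exclude the output file,
    # so the output can never be read back in as if it were an email.
    inputs = [
        p for p in sorted(directory.iterdir())
        if p.is_file() and p.resolve() != out_path.resolve()
    ]
    matches = []
    for path in inputs:
        with path.open(encoding="utf-8", errors="replace") as handle:
            matches.extend(line for line in handle if needle in line)
    out_path.write_text("".join(matches), encoding="utf-8")
    return len(matches)


# Demo: the emails live in a subdirectory, the output goes to its parent.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp, "sample")
    sample.mkdir()
    Path(sample, "1").write_text("going bankrupt\nnothing\n", encoding="utf-8")
    Path(sample, "2").write_text("bankrupt again\n", encoding="utf-8")
    out_file = Path(tmp, "bankrupt-lines.txt")
    saved = save_matching_lines(sample, "bankrupt", out_file)
    saved_text = out_file.read_text(encoding="utf-8")

print(saved)
```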