Discussion 1
Survey:
- your name
- your program, your year
- your next step? continue research or industry
- what is your favorite programming language?
Expectations
- active participation
- ask questions
- try things out, spend time debugging
- help others with their practical roadblocks
I will not:
- google things for you. For example, if you ask me how to install Jupyter Notebook, I will send you this link
- teach you how to use a programming language
- tell you how past students did the module. You also shouldn’t ask for any kind of code or material from them. We will check.
Questions?
Some of the following contents are from Prof. Ané’s STAT 679 and Wikibook.
The Unix Shell
GUI (graphical user interface): easy but not reproducible.
CLI (command line interface) or REPL (read-evaluate-print loop): steep learning curve but reproducible and powerful.
After you have a terminal open, type echo $SHELL.
You may see this:
$ echo $SHELL
/bin/bash
or you may get this:
% echo $SHELL
/bin/zsh
The terminal is the “window” (more or less), while the shell is a program (or a programming language, like R and Python are). You can even write your own shell!
There are several shell programs, bash (and zsh) being the most common.
For the commands used here, they are almost equivalent.
A list of shell commands
- directory structure, root is /
- relative versus absolute paths
- in your code and projects: use relative paths as much as possible: it makes your code more portable, for others, and for yourself if you re-locate your own project folder
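- shortcuts: . (current directory), .. (parent directory), ~ (home directory), - (previous directory, as an argument to cd). cd - is so useful!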
- tab completion to get program and file names
- up/down arrows and ! to repeat commands
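To try the path shortcuts above without touching your real files, here is a minimal sketch. It assumes mktemp is available (it is on typical Linux and macOS systems), and the directory names project and data are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)              # scratch location chosen by the system
mkdir -p "$demo/project/data"  # hypothetical project layout

cd "$demo/project"             # absolute path: starts from the root /
cd data                        # relative path: resolved from where we are now
cd ..                          # .. is the parent directory, so back in project
here=$(basename "$(pwd)")      # project

home_dir=~                     # ~ expands to your home directory

cd /                           # jump somewhere else entirely
cd - > /dev/null               # - is the previous directory (cd - prints it, silenced here)
back=$(basename "$(pwd)")      # project again

cd / && rm -r "$demo"          # clean up the scratch directory
```

Note that only the first cd uses an absolute path; everything after it would still work if the project folder were moved somewhere else.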
| command | description |
|---|---|
| man | display the manual of a command |
| pwd | print working directory. where am I? |
| ls | list. many options, e.g. -a (all) -l (long) -lrt (reverse-sorted by time) |
| cd | change directory |
| mkdir | make directory |
| rm | remove (forever). -f to force, -i to ask interactively, -r recursively |
| mv | move (and rename). can overwrite existing files, unless -i to ask |
| cp | copy. would also overwrite existing files |
| wc | word count: lines, words, characters (bytes). -l, -w, -c |
| cat | concatenate files and print them |
| less | view a file one page at a time, because “less is more”. q to quit. |
| sort | sort lines. -n for numerical sorting |
| head | first 10 lines. -n 3 for first 3 lines (etc.) |
| tail | last 10 lines. -n 3 for last 3 lines, -n +30 for line 30 and up |
| echo | print its arguments |
| ps | provide information about the currently running processes |
| kill | kill a process manually |
| chmod | change file permissions |
| history | shows the history of all previous commands, numbered |
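A few of these commands in action, as a small sketch on a made-up file (numbers.txt and its contents are invented for illustration; mktemp is assumed to be available). The > used here sends a command's output into a file instead of the terminal.

```shell
demo=$(mktemp -d)                          # scratch directory
cd "$demo"
printf '10\n9\n100\n2\n' > numbers.txt     # a tiny made-up data file, one number per line

wc -l numbers.txt                          # reports 4 lines (padding differs between systems)

sort numbers.txt > as_text.txt             # default: compares lines as text
sort -n numbers.txt > as_numbers.txt       # -n: compares lines as numbers

first_text=$(head -n 1 as_text.txt)        # 10, because "1" sorts before "2" and "9"
first_number=$(head -n 1 as_numbers.txt)   # 2
last_number=$(tail -n 1 as_numbers.txt)    # 100
from_line_3=$(tail -n +3 numbers.txt)      # line 3 and up: 100 and 2

cd / && rm -r "$demo"                      # clean up
```

The difference between first_text and first_number is the usual reason to remember sort -n.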
The shell is an incredibly powerful tool:
- The Unix shell can do great things, but power comes with danger: it’s unsafe!
$ rm -rf *
deletes everything in the current directory, subdirectories included (only hidden files, whose names start with a dot, are spared).
- rm: remove files & directories
- -r: do it recursively (enter each directory within each directory)
- -f: “force” removal without asking you for confirmation for each individual file
- *: expanded by the shell to match every non-hidden name in the current directory
You should understand what the command is doing before executing it!
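One habit that helps: preview what * expands to before handing it to rm. The sketch below does this inside a scratch directory, so it is safe to run; the file names are invented, and mktemp is assumed to be available.

```shell
demo=$(mktemp -d)                  # scratch directory: the only place we delete from
cd "$demo"
touch notes.txt data.csv .hidden   # three made-up files, one of them hidden
mkdir results

# Preview: echo shows exactly what the shell substitutes for *
matched=$(echo *)                  # data.csv notes.txt results (.hidden is not matched)

rm -r -- *                         # -- stops a file name starting with - from being read as an option
left=$(ls -A)                      # .hidden survives, since * skips dot-files by default

cd / && rm -r "$demo"              # clean up
```

Replacing rm with echo (or ls) in a command you are unsure about is a cheap dry run; rm -i is another option when only a few files are involved.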
The Unix world: one file after another
When you think of a computer, you usually come up with the following things:
- The computer itself
- The keyboard
- The mouse
- Your hard drive with your files and directories on it
- The network connection leading to the Internet
However, everything in the whole (Unix) universe is a file. Your (data) files are files. Your directories are files. Your hard drive is a file. Your keyboard is a read-only file of infinite size. Your monitor is a write-only file of infinite size. Your network connection is a read/write file.
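You can see this idea at work in /dev, where devices show up as files. A small sketch, assuming a typical Linux or macOS system where /dev/null and /dev/urandom exist and head supports -c:

```shell
ls -l /dev/null                  # listed like any other file

echo "discard me" > /dev/null    # writing succeeds; the data is simply thrown away

# /dev/urandom never runs out of data, so we read just the first 16 bytes.
head -c 16 /dev/urandom > sample.bin
n_bytes=$(wc -c < sample.bin)    # 16 (some systems pad the number with spaces)
rm sample.bin                    # clean up
```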
Streams: what goes between files
Everything in Unix is a file – except that which sits between files. Between files Unix defines a mechanism that allows data to move from one file to another: the stream.
Unix has the following three standard streams:
- Standard in (stdin): the standard stream for input into a file.
- Standard out (stdout): the standard stream for output out of a file.
- Standard error (stderr): the standard stream for error output from a file.
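The three streams can be redirected separately: > for stdout, 2> for stderr, and < for stdin. A minimal sketch with invented file names (mktemp is assumed to be available):

```shell
demo=$(mktemp -d)              # scratch directory
cd "$demo"
touch exists.txt               # missing.txt is deliberately never created

# ls writes the listing to stdout and its complaint about missing.txt to stderr.
# "|| true" lets a script carry on even though ls reports failure.
ls exists.txt missing.txt > out.txt 2> err.txt || true

out=$(cat out.txt)             # exists.txt
err=$(cat err.txt)             # the error message (exact wording varies by system)

n_out=$(wc -l < out.txt)       # < feeds out.txt to wc through stdin: 1 line

cd / && rm -r "$demo"          # clean up
```

Keeping stdout and stderr apart means that error messages do not end up mixed into your data.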
Text streams
Text streams let us process data as it flows through a program, rather than holding it all in memory.
Example: concatenate two data files. Open both in an editor, then copy one and paste it into the other?
- may not have enough memory
- manual operation: error-prone and not reproducible.
Instead: print the files’ contents to the standard output stream, and redirect this stream from our terminal to the file where we wish to save the combined result.
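cat a.txt b.txt c.txt       # print the three files, one after another, to the terminal
cat *.txt > combine.txt     # print every .txt file, with the output redirected into combine.txt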
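To check that the redirection did what we wanted, count lines before and after. A sketch with three made-up files in a scratch directory (the file contents are invented; mktemp is assumed to be available):

```shell
demo=$(mktemp -d)                        # scratch directory
cd "$demo"
printf 'a1\na2\n' > a.txt                # 2 lines
printf 'b1\n' > b.txt                    # 1 line
printf 'c1\nc2\nc3\n' > c.txt            # 3 lines

cat a.txt b.txt c.txt > combine.txt      # stdout goes to combine.txt, not the terminal

total=$(wc -l < combine.txt)             # 6 = 2 + 1 + 3
first=$(head -n 1 combine.txt)           # a1: files are joined in the order given
last=$(tail -n 1 combine.txt)            # c3

cd / && rm -r "$demo"                    # clean up
```

The inputs are listed by name here on purpose: once combine.txt exists, it also matches *.txt, so re-running the wildcard version would try to read the output file as an input. Writing the result to a name outside the pattern avoids that.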
Mock interview question
How can you generate a uniformly distributed random number between 1 and 7 with only a die?