104 Basic Commands


The commands top, ps, and kill are essential Linux tools for monitoring and managing processes and system resources. There is no native cpu command, so CPU information is covered through lscpu and mpstat. Each command is covered in detail, including:

  • Purpose

  • Syntax

  • Common use-cases

  • Example output

  • How to analyze the output


🔹 1. top: Real-time System Monitoring

📌 Purpose:

Displays real-time information about system processes, CPU, memory usage, and load average.

📌 Syntax:

top

📌 Example Output (Partial):

top - 08:45:26 up  2:34,  2 users,  load average: 0.15, 0.20, 0.25
Tasks: 138 total,   1 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.0 us,  1.0 sy,  0.0 ni, 95.0 id,  1.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7850.2 total,  2200.5 free,  3400.6 used,  2249.1 buff/cache
PID  USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
1324 root      20   0  123456  45678   1234 R  23.4  0.6   0:10.53 chrome

📌 How to Analyze:

  • Load average: The first line shows the 1-, 5-, and 15-minute system load. Rule of thumb: a load that stays above the number of CPUs means work is queuing and the system is likely overloaded.

  • %Cpu(s): High us (user) means the CPU is busy running application code, sy (system) is time spent in the kernel, id is idle time, and wa is time spent waiting on I/O.

  • %MEM: Memory usage per process.

  • PID/COMMAND: Helps identify the exact process consuming resources.
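
The load-average rule of thumb can be checked mechanically. The following is a minimal sketch, assuming a Linux system where /proc/loadavg and nproc are available; load_status is a hypothetical helper name, not a standard tool.

```shell
# load_status LOAD NCPU: print "overloaded" when LOAD exceeds NCPU, else "ok".
load_status() {
    awk -v load="$1" -v cpus="$2" \
        'BEGIN { if (load > cpus) print "overloaded"; else print "ok" }'
}

# Live check (Linux only): the first field of /proc/loadavg is the
# 1-minute load average, the same number top shows first.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one_min _ < /proc/loadavg
    cpus=$(nproc)
    echo "1-min load $one_min on $cpus CPUs: $(load_status "$one_min" "$cpus")"
fi
```

With the sample output above (load 0.15) on an 8-CPU machine, load_status 0.15 8 reports ok. A single reading says little; a load that stays above the CPU count across the 5- and 15-minute figures is the stronger signal.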


🔹 2. ps: Snapshot of Current Processes

📌 Purpose:

Displays a snapshot of current processes (unlike top, which updates in real time).

📌 Syntax:

ps aux       # All processes, BSD-style columns
ps -ef       # All processes, UNIX-style columns (includes PPID)

📌 Example Output:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  8960  2340 ?        Ss   08:00   0:01 /sbin/init
prakash    1324  5.5  1.2 123456 45678 ?       Sl   08:45   0:10 chrome

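📌 Key Fields to Analyze:

  • PID: Process ID.

  • %CPU, %MEM: CPU and memory usage.

  • VSZ/RSS: Virtual and resident memory size, in KiB.

  • STAT: Process state (R running, S sleeping, D uninterruptible sleep, T stopped, Z zombie). Extra characters are modifiers, such as s (session leader), l (multi-threaded), and + (foreground).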

  • COMMAND: The command/process name.
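
Because ps aux prints fixed columns, its output is easy to filter with awk. The sketch below is illustrative only: high_cpu is a hypothetical helper, and it assumes the default ps aux column order shown in the example output (PID in column 2, %CPU in column 3, COMMAND in column 11).

```shell
# high_cpu LIMIT: read `ps aux`-style text on stdin and print "PID COMMAND"
# for each process whose %CPU (column 3) is above LIMIT.
high_cpu() {
    awk -v limit="$1" 'NR > 1 && $3 + 0 > limit + 0 { print $2, $11 }'
}

# Sample rows copied from the example output above.
sample='USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  8960  2340 ?        Ss   08:00   0:01 /sbin/init
prakash    1324  5.5  1.2 123456 45678 ?       Sl   08:45   0:10 chrome'

printf '%s\n' "$sample" | high_cpu 5
```

On a live system the same filter would be used as ps aux | high_cpu 50.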


🔹 3. kill: Send a Signal to a Process

📌 Purpose:

Sends a signal to a process identified by its PID. The default signal, SIGTERM, asks the process to terminate.

📌 Syntax:

kill <PID>               # Send SIGTERM (default)
kill -9 <PID>            # Send SIGKILL (force kill)
kill -l                  # List all signals

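📌 Example:

ps aux | grep '[c]hrome'   # the [c] keeps grep from matching its own command line
# prakash   1324  5.5  1.2 ... chrome
kill 1324                  # ask politely first (SIGTERM)
kill -9 1324               # only if the process ignores SIGTERM

📌 How to Analyze:

  • Try plain kill (SIGTERM) first so the process can clean up; use kill -9 only if it stays unresponsive.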

  • Use ps or top to find the PID of the problem process before killing.
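
A common pattern is to send SIGTERM, give the process a grace period, and escalate to SIGKILL only if it is still there. The following is a minimal sketch of that pattern; stop_gracefully and its default of 5 seconds are assumptions made for illustration.

```shell
# stop_gracefully PID [SECONDS]: send SIGTERM, wait up to SECONDS (default 5)
# for the process to exit, then fall back to SIGKILL.
stop_gracefully() {
    pid=$1
    grace=${2:-5}
    # If SIGTERM cannot be delivered, the PID is gone or not ours to signal.
    kill -TERM "$pid" 2>/dev/null || return 1
    while [ "$grace" -gt 0 ]; do
        # kill -0 sends no signal; it only tests whether the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    # Still present after the grace period: force termination.
    kill -KILL "$pid" 2>/dev/null || true
}
```

For the example above this would be called as stop_gracefully 1324 10. Note that kill -0 also succeeds for a zombie, so a process that has exited but not yet been reaped by its parent can make the function wait the full grace period.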


🔹 4. cpu (Note: There's no native cpu command in Linux)

📌 Possible Meanings:

  • You might be referring to:

    • Checking CPU usage via top, htop, or mpstat.

    • lscpu — to get CPU architecture info.

✅ Example 1: View CPU architecture

lscpu

Output:

Architecture:        x86_64
CPU(s):              8
Model name:          Intel(R) Core(TM) i7-8565U
CPU MHz:             1800.000

✅ Example 2: CPU usage

mpstat -P ALL 1

Output:

11:28:01 AM  CPU    %usr   %sys   %idle
11:28:02 AM  all     5.00   1.00   94.00

πŸ” Summary Comparison Table

Command Use Case Output Highlights When to Use
top Real-time process monitoring Load avg, CPU %, MEM %, PID, COMMAND Live resource troubleshooting
ps Snapshot of process state PID, %CPU, %MEM, STAT, COMMAND Get PID or process list
kill Send signals (terminate) N/A (command-line tool) Stop/kill hung or rogue processes
lscpu View CPU architecture info Model name, cores, threads Debug CPU availability or model
mpstat CPU usage per core over time %user, %sys, %idle Performance bottleneck diagnosis

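📌 Real-world Interview Tip (FAANG-ready):

Be prepared to:

  • Find high CPU processes using top or ps.

  • Stop a runaway process with kill (SIGTERM first, kill -9 as a last resort), and explain that a zombie cannot be killed and must be reaped by its parent.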

  • Monitor memory leaks or CPU bottlenecks.

  • Explain CPU load and how to scale (e.g., vertical scaling, multithreading).



🔸 Category: General Process Monitoring

1. Q: How do you find the top 5 memory-consuming processes on a Linux system?

A:

ps aux --sort=-%mem | head -n 6
  • --sort=-%mem: Sorts descending by memory usage.

  • head -n 6: First line is header.
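
The --sort option may not be available on every ps implementation, so the same ranking can also be done on the text with sort. This is a sketch under that assumption; top_mem is a hypothetical helper that expects ps aux column order (%MEM in column 4).

```shell
# top_mem N: read `ps aux`-style text on stdin, drop the header row, sort
# numerically descending on %MEM (column 4), and print "PID %MEM COMMAND"
# for the first N rows.
top_mem() {
    tail -n +2 | sort -k4,4nr | head -n "$1" | awk '{ print $2, $4, $11 }'
}

# Sample rows in ps aux format.
sample='USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  8960  2340 ?        Ss   08:00   0:01 /sbin/init
prakash    1324  5.5  1.2 123456 45678 ?       Sl   08:45   0:10 chrome'

printf '%s\n' "$sample" | top_mem 1
```

On a live system: ps aux | top_mem 5.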


2. Q: How do you monitor CPU usage per core in real-time?

A:

mpstat -P ALL 1
  • -P ALL: Show stats for all cores.

  • 1: Refresh every second.


3. Q: A process is stuck in zombie state. Can you kill it?

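A: No. A zombie process is already dead; it stays in the process table only until its parent collects its exit status with wait(). Signals, including SIGKILL, have no effect on it. If the parent never calls wait(), terminating the parent re-parents the zombie (typically to PID 1), which then reaps it.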


4. Q: How do you identify zombie processes?

A:

ps aux | awk '$8 ~ /^Z/ { print $2, $11 }'      # STAT may be Z, Z+, Zs, ...

OR

ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'      # match the STAT column only, not a Z in the command
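
Since a zombie can only be cleaned up through its parent, it is often more useful to list the parents than the zombies themselves. A minimal sketch, assuming ps -eo pid,ppid,stat,comm column order; zombie_parents and the sample PIDs are hypothetical.

```shell
# zombie_parents: read `ps -eo pid,ppid,stat,comm`-style text on stdin and
# print "PPID COUNT" for every parent that has unreaped (zombie) children.
zombie_parents() {
    awk 'NR > 1 && $3 ~ /^Z/ { n[$2]++ }
         END { for (ppid in n) print ppid, n[ppid] }' | sort -n
}

# Hypothetical sample: PID 2001 has two zombie children.
sample='  PID  PPID STAT COMMAND
    1     0 Ss   systemd
 2001     1 S    worker
 2002  2001 Z    worker <defunct>
 2003  2001 Z+   worker <defunct>
 2004  2001 S    worker'

printf '%s\n' "$sample" | zombie_parents
```

On a live system: ps -eo pid,ppid,stat,comm | zombie_parents. The PPIDs it prints are the processes to investigate or restart.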

5. Q: How do you kill all Java processes running on the system?

A:

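pkill -f java       # -f matches the full command line, so any process with "java" in its arguments is hit

Or:

pgrep -f java | xargs kill      # same selection via PIDs; escalate to kill -9 only if SIGTERM is ignored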

🔸 Category: top Analysis

6. Q: What does the load average in top mean?

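A: It is the average number of processes that are running or waiting to run. On Linux it also counts processes in uninterruptible sleep (usually waiting on disk I/O).

  • First = 1-minute average

  • Second = 5-minute average

  • Third = 15-minute average

If it stays above the number of cores, the system is overloaded.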


7. Q: In top, what does %wa mean in CPU stats?

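A: wa (I/O wait) is the percentage of time the CPU was idle while the system had outstanding I/O requests. A high value usually points to a storage bottleneck (local disk, or a network filesystem such as NFS).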


8. Q: How do you sort by memory in top?

A: Inside top, press Shift + M.


9. Q: You see a process consuming 100% CPU in top. How do you find out what it's doing?

A:

  1. Get PID from top

  2. Use strace -p <pid> to trace syscalls.

  3. Use lsof -p <pid> to inspect open files/sockets.


10. Q: How can you monitor top 10 CPU processes in real-time with a script?

A:

watch -n 2 "ps -eo pid,comm,%cpu,%mem --sort=-%cpu | head -n 11"

🔸 Category: ps and kill

11. Q: What's the difference between kill and kill -9?

A:

  • kill sends SIGTERM (15): Graceful shutdown.

  • kill -9 sends SIGKILL (9): Force kill, can't be trapped or ignored.
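
The difference is easiest to see from the receiving side. The sketch below shows a shell process that traps SIGTERM and cleans up; the function name worker and its message are made up for illustration. No handler can be written for SIGKILL.

```shell
# worker: loop forever, but exit cleanly when SIGTERM arrives.
worker() {
    # The handler runs once the current foreground command (sleep) returns.
    trap 'echo "SIGTERM received, cleaning up"; exit 0' TERM
    while :; do
        sleep 1
    done
}
```

Started with worker &, a plain kill $! prints the message and exits with status 0 within about a second. kill -9 $! ends the process immediately and the cleanup code never runs, which is why SIGKILL is a last resort.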


12. Q: How do you list all running processes by a specific user?

A:

ps -u <username>

13. Q: How do you kill all child processes of a PID, or all processes in a process group?

A:

pkill -P <parent_pid>     # direct children of <parent_pid> only
kill -- -<pgid>           # every process in the process group <pgid>

14. Q: How do you determine parent-child relationships between processes?

A:

ps -eo pid,ppid,cmd
  • ppid: Parent PID

  • Use pstree for visual tree
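
The PPID column is enough to reconstruct a process's ancestry by following parent links. A minimal sketch, assuming ps -eo pid,ppid,comm column order; ancestors is a hypothetical helper and the sample PIDs are illustrative.

```shell
# ancestors PID: read `ps -eo pid,ppid,comm`-style text on stdin and print
# the chain from PID up to the root, one "name(pid)" per line.
ancestors() {
    awk -v start="$1" '
        NR > 1 { parent[$1] = $2; name[$1] = $3 }
        END {
            pid = start
            # steps guards against a malformed table that contains a cycle
            for (steps = 0; (pid in parent) && steps < NR; steps++) {
                printf "%s(%s)\n", name[pid], pid
                pid = parent[pid]
            }
        }'
}

# Illustrative process table: an SSH login running pstree.
sample='PID PPID COMMAND
1 0 systemd
980 1 sshd
1032 980 sshd
1033 1032 bash
1080 1033 pstree'

printf '%s\n' "$sample" | ancestors 1080
```

On a live system, ps -eo pid,ppid,comm | ancestors $$ shows how the current shell was spawned.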


15. Q: How do you identify processes using the most open files?

A:

lsof | awk '{ print $2 }' | sort | uniq -c | sort -nr | head

🔸 Category: CPU & System Info

16. Q: How do you get CPU core count and thread info on Linux?

A:

lscpu
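  • CPU(s): Logical CPUs (sockets × cores per socket × threads per core)

  • Core(s) per socket: Physical cores per CPU package

  • Thread(s) per core: Hardware threads per core (2 usually means Hyper-Threading/SMT is enabled)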
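
The three topology fields multiply out to the logical CPU count. The sketch below computes that product from lscpu-style text. logical_cpus is a hypothetical helper, and the socket, core, and thread values in the sample are illustrative assumptions chosen to agree with CPU(s): 8.

```shell
# logical_cpus: read lscpu-style "Key: value" text on stdin and print
# sockets * cores-per-socket * threads-per-core.
logical_cpus() {
    awk -F: '
        /^Thread\(s\) per core/ { threads = $2 + 0 }
        /^Core\(s\) per socket/ { cores = $2 + 0 }
        /^Socket\(s\)/          { sockets = $2 + 0 }
        END { print sockets * cores * threads }'
}

# Illustrative values, not captured from a real machine.
sample='Architecture:        x86_64
CPU(s):              8
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1'

printf '%s\n' "$sample" | logical_cpus
```

On a live system, LC_ALL=C lscpu | logical_cpus keeps the field names in English; the result can differ from CPU(s) when some CPUs are offline.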


17. Q: How do you monitor CPU utilization over time and graph trends?

A:

  • Use sar from sysstat:

sar -u 1 10
  • Or use tools like Grafana + Prometheus.


18. Q: How do you detect a CPU bottleneck vs memory bottleneck?

A:

  • High %us or %sy and low %id in top: CPU bottleneck.

  • High %wa: I/O bottleneck.

  • High memory usage plus swapping (non-zero si/so columns in vmstat; in top, si means soft interrupts instead): Memory bottleneck.
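
These rules can be written down as a small decision function. This is only a sketch: classify is a hypothetical helper, and the thresholds (any swap activity, more than 20% I/O wait, less than 10% idle) are arbitrary assumptions for illustration, not recommended limits.

```shell
# classify IDLE IOWAIT SWAP_IN SWAP_OUT: print the most likely bottleneck.
# IDLE and IOWAIT are percentages from top; SWAP_IN and SWAP_OUT are the
# si/so columns from vmstat. All thresholds are illustrative assumptions.
classify() {
    awk -v id="$1" -v wa="$2" -v si="$3" -v so="$4" 'BEGIN {
        if (si + so > 0)  print "memory"   # swapping is happening
        else if (wa > 20) print "io"       # CPU mostly waiting on I/O
        else if (id < 10) print "cpu"      # CPU almost never idle
        else              print "ok"
    }'
}

classify 95 1 0 0
```

The call uses the idle and I/O wait figures from the sample top output (95.0 id, 1.0 wa) with no swapping and reports ok. Swapping is checked first because a swapping system also tends to show high I/O wait.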


19. Q: What is the difference between VSZ and RSS in ps?

A:

  • VSZ: Virtual memory size (includes code, data, shared libs).

  • RSS: Resident Set Size (actual physical memory in use).


20. Q: How do you trace what files or ports a process is using?

A:

lsof -p <pid>
  • Shows open files, sockets, network connections, etc.


✅ Bonus Tip for Interviews:

  • Always explain why a tool is used.

  • Tie back to real-world issues (e.g., server slowness, memory leaks, runaway processes).

  • Use watch + ps/top/lsof/netstat combinations to show dynamic diagnosis.

  • Know when to use strace, perf, or iotop for deeper profiling.



If a process is consuming 100% CPU in top, you can analyze what it is doing using a combination of strace, lsof, and other tools. Here's a complete step-by-step example to guide you through this investigation:


πŸ” Scenario:

You observe high CPU usage in top, and you want to investigate what that process is doing.


✅ Step-by-Step Analysis:

Step 1: Identify the Process in top

top

Look for the process using the most CPU (e.g., 100%).

Sample output:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
12345 user1     20   0   10240    920    860 R 100.0  0.0   1:23.45 python

→ The PID is 12345, and it's a Python process.


Step 2: Use strace to Trace System Calls

Attach strace to the running process to see what system calls it's making:

sudo strace -p 12345

Sample output:

clock_gettime(CLOCK_MONOTONIC, {tv_sec=12345, tv_nsec=678900000}) = 0
read(3, "", 4096)                         = 0
epoll_wait(4, [], 128, 0)                = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=12345, tv_nsec=679000000}) = 0

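This shows the process making the same system calls over and over. epoll_wait is called with a timeout of 0, so it returns immediately instead of blocking, which points to polling in a busy loop.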

You can also log the output to a file and analyze later:

sudo strace -tt -T -p 12345 -o strace.log
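
A saved trace is easier to read once it is reduced to a count per system call. A minimal sketch, assuming the log has one call per line, optionally prefixed by the -tt timestamp; syscall_counts is a hypothetical helper.

```shell
# syscall_counts: read strace output on stdin and print "COUNT NAME" per
# system call, most frequent first.
syscall_counts() {
    awk '{
        line = $0
        sub(/^[0-9:.]+ +/, "", line)        # drop a leading -tt timestamp, if any
        if (match(line, /^[a-z_0-9]+\(/))   # the name is the text before "("
            count[substr(line, 1, RLENGTH - 1)]++
    }
    END { for (name in count) print count[name], name }' | sort -rn
}

# The four lines from the sample strace output above.
sample='clock_gettime(CLOCK_MONOTONIC, {tv_sec=12345, tv_nsec=678900000}) = 0
read(3, "", 4096)                         = 0
epoll_wait(4, [], 128, 0)                = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=12345, tv_nsec=679000000}) = 0'

printf '%s\n' "$sample" | syscall_counts | head -n 1
```

Against the saved log this would be syscall_counts < strace.log. strace can also produce a similar per-call summary on its own with sudo strace -c -p 12345, printed when the trace is interrupted.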

Step 3: Use lsof to See Open Files/Sockets

This tells you which files, libraries, sockets, etc., the process is using.

sudo lsof -p 12345

Sample output:

COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
python  12345  user1  cwd    DIR  8,1     4096      123456 /home/user1/myapp
python  12345  user1  txt    REG  8,1   123456      654321 /usr/bin/python3.8
python  12345  user1  mem    REG  8,1    45678      987654 /lib/x86_64-linux-gnu/libc.so.6
python  12345  user1    3u   REG  8,1   345678      456789 /home/user1/myapp/log.txt
python  12345  user1    4u  IPv4 23456      0t0        TCP 127.0.0.1:45678->127.0.0.1:3306 (ESTABLISHED)

→ It’s connected to a MySQL server on port 3306 — maybe it’s stuck querying DB.
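
When a process has many descriptors open, pulling out only the established TCP peers makes this kind of finding quicker to spot. A minimal sketch, assuming the lsof column layout shown above; tcp_peers is a hypothetical helper.

```shell
# tcp_peers: read `lsof -p PID` output on stdin and print the remote
# address:port of every established TCP connection.
tcp_peers() {
    awk '$NF == "(ESTABLISHED)" { split($(NF - 1), ends, "->"); print ends[2] }'
}

# Two rows from the sample lsof output above.
sample='COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
python  12345  user1    3u   REG  8,1   345678      456789 /home/user1/myapp/log.txt
python  12345  user1    4u  IPv4 23456      0t0        TCP 127.0.0.1:45678->127.0.0.1:3306 (ESTABLISHED)'

printf '%s\n' "$sample" | tcp_peers
```

On a live system: sudo lsof -nP -p 12345 | tcp_peers, where -n and -P keep addresses and ports numeric.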


Step 4: Optional - Check the Stack Trace

Use pstack (if installed) to print the stack trace of the process; gdb -p 12345 -batch -ex bt gives an equivalent backtrace where pstack is unavailable:

sudo pstack 12345

Sample output:

#0  0x00007f3f5dbae430 in __libc_recv () from /lib64/libc.so.6
#1  0x000055b6d2e5a891 in socket_read ()
#2  0x000055b6d2e4ff3a in main_loop ()

This helps in debugging loops or tight recursion inside the application.


🧠 Interview-ready Summary

Q: You see a process using 100% CPU in top. How do you find out what it's doing?

A:

  1. Get the PID from top.

  2. Use strace -p <pid> to trace system calls in real-time.

  3. Use lsof -p <pid> to check files, sockets, and shared libs used.

  4. (Optional) Use pstack <pid> for stack trace.

  5. Analyze logs or application logic based on what these tools show.

This helps you understand whether it’s stuck in a loop, polling, doing excessive computation, or waiting on I/O.


The following compares kill and pkill: how they differ, when to use each, and worked examples.


🔪 kill vs pkill in Linux

Feature     | kill                           | pkill
Targets     | Process by PID                 | Process by name or pattern
Signal      | Default is SIGTERM (15)        | Default is SIGTERM (15)
Flexibility | Needs a PID                    | More user-friendly (no PID needed)
User Scope  | Affects any PID (if permitted) | Can restrict by user, session, terminal

kill — Send signal to PID

📌 Syntax:

kill [-SIGNAL] PID

✅ Example:

top

Find a process with PID 12345.

Kill it:

kill 12345

Send a specific signal (e.g., SIGKILL = 9):

kill -9 12345

📌 You must know the PID beforehand.


pkill — Send signal to process name

📌 Syntax:

pkill [-SIGNAL] pattern

✅ Example:

To kill all processes whose name matches python (the pattern is a regular expression, so python3 matches too):

pkill python

To force kill (SIGKILL) all nginx processes:

pkill -9 nginx

Restrict to processes run by a specific user:

pkill -u prakash java

Send signal only to processes matching name and terminal:

pkill -t pts/1 bash

📌 You don’t need to look up the PID manually.
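
Because pkill treats its argument as an extended regular expression matched anywhere in the process name, a short pattern can hit more processes than intended. The sketch below illustrates that matching rule only. It uses grep -E as a stand-in for pkill's matcher, and match_names and the name list are made up for the example.

```shell
# match_names [-x] PATTERN: filter process names on stdin the way pkill
# selects them, using grep -E as a stand-in for the matching step only.
match_names() {
    if [ "$1" = "-x" ]; then
        grep -E -x -- "$2"     # whole name must match, like pkill -x
    else
        grep -E -- "$1"        # pattern may match anywhere in the name
    fi
}

# Hypothetical process names.
names='nginx
nginx-debug
python
python3.8'

printf '%s\n' "$names" | match_names python
printf '%s\n' "$names" | match_names -x nginx
```

The first call selects both python and python3.8; the second selects only nginx. On a live system, pgrep -l <pattern> lists what a pkill with the same pattern would signal, which is a cheap check before sending anything.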


🧠 Real-World Use Case Comparison

Task                                        | Command
Kill process with known PID                 | kill 9876
Kill all Java processes                     | pkill java
Gracefully stop nginx (default SIGTERM)     | pkill nginx
Force kill Python script by PID             | kill -9 13579
Kill all processes for a user               | pkill -u prakash
Kill process by exact match (not substring) | pkill -x nginx

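🛑 Common Signals

Signal Name | Number | Purpose
SIGTERM     | 15     | Graceful termination
SIGKILL     | 9      | Forceful termination (cannot be caught or ignored)
SIGHUP      | 1      | Hangup; many daemons reload their configuration on it
SIGSTOP     | 19     | Pause process (cannot be caught or ignored)
SIGCONT     | 18     | Resume stopped process

The numbers for SIGSTOP and SIGCONT are the ones used on x86 and ARM Linux and differ on some other architectures; kill -l shows the local values.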
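
The shell can translate between signal numbers and names, which avoids hard-coding a number that may differ on another platform. A short sketch; signame is a hypothetical wrapper around the standard kill -l.

```shell
# signame NUMBER: print the local name of a signal number (without "SIG").
signame() {
    kill -l "$1"
}

for n in 1 9 15; do
    printf '%s=%s\n' "$n" "$(signame "$n")"
done
```

This prints 1=HUP, 9=KILL, and 15=TERM. Signals can also be sent by name, for example kill -s STOP <PID> and kill -s CONT <PID>, which reads more clearly than the numeric form.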

🧠 Interview-ready Summary

Q: What's the difference between kill and pkill?

A:

  • kill sends signals to a specific process ID (PID).

  • pkill matches process names or patterns, making it easier to target multiple or unknown PIDs.

Example:

  • kill -9 1234 kills PID 1234.

  • pkill -9 nginx kills all nginx processes.

This section explains the pstree command with example output, useful for Linux interviews and live debugging.


🌳 What is pstree?

pstree shows processes in a tree format, visualizing parent-child relationships (i.e., which process spawned which).


✅ Basic Syntax:

pstree

📌 Example Output:

systemd─┬─NetworkManager───2*[{NetworkManager}]
        ├─sshd───sshd───bash───pstree
        ├─cron
        ├─dbus-daemon
        ├─gnome-shell───2*[{gnome-shell}]
        ├─firefox───9*[{firefox}]
        ├─pulseaudio───{pulseaudio}
        └─cupsd

πŸ” How to Read It:

  • systemd is the root (PID 1).

  • sshd started a child sshd session, which launched bash, which launched pstree.

  • firefox has 9 threads shown as {firefox}.

  • 2*[{NetworkManager}] means two threads under NetworkManager.


🔧 Useful Options:

Command           | Description
pstree -p         | Show PIDs
pstree -u         | Show usernames
pstree -a         | Show command line arguments
pstree -h         | Highlight current process tree
pstree <username> | Show processes owned by a specific user
pstree <PID>      | Show subtree for a specific process

📌 Example with PID and arguments:

pstree -p -a

Output:

systemd(1)─┬─cron(585)
           ├─sshd(980)───sshd(1032)───bash(1033)───pstree(1080) -p -a
           ├─nginx(850) -g daemon off;───nginx(851)

🧠 Interview Tip

Q: Why use pstree instead of ps?

A: pstree visually shows parent-child process hierarchy, helping you trace how a process was spawned (e.g., sshd → bash → python script) — useful in debugging background services, daemons, or containerized environments.



The text diagram below maps common bottlenecks in a Linux system to its key components: CPU, memory, disk I/O, network, and processes. It is useful for both troubleshooting and interviews.


📊 Linux System Bottlenecks: Block Diagram (Text-Based)

+-----------------+        +----------------+       +------------------+
|     CPU         |<-----> |    Processes   |<----->|      Memory      |
+-----------------+        +----------------+       +------------------+
      ^  |                          ^   |                      |
      |  |                          |   |                      v
      |  |                    +-----+   +------+        +-------------+
      |  |                    | High CPU usage |        |   Swapping  |
      |  +------------------->| Infinite loops |        |   Thrashing |
      |                       | Busy-waiting   |        +-------------+
      |
      |          +----------------+    
      |          | Interrupts     |     
      +----------| Context Switch |    
                 +----------------+   

        |
        v
+----------------+
|     Load Avg   |
|  (uptime, top) |
+----------------+

+-------------------+        +-------------------+
|     Disk I/O      |<-----> |     Processes     |
+-------------------+        +-------------------+
        |                              ^
        |                              |
        v                              |
+-------------------+        +--------------------------+
|  High I/O Wait    |<-------|  Log-heavy apps, DBs     |
|  (iostat, iotop)  |        |  Misconfigured writes    |
+-------------------+        +--------------------------+

+-------------------+        +-------------------+
|    Network I/O    |<-----> |     Processes     |
+-------------------+        +-------------------+
        |                              ^
        |                              |
        v                              |
+---------------------+       +-------------------------+
| High RX/TX, Drops   |<------| Chatty apps, poor MTU   |
| (iftop, netstat)    |       | Packet loss, congestion |
+---------------------+       +-------------------------+

πŸ” Mapping Bottlenecks to Tools

Bottleneck Area Symptoms Commands to Diagnose
CPU High load, 100% usage top, htop, mpstat, pidstat
Memory Swapping, OOM kills free -m, vmstat, dmesg, top
Disk I/O High iowait, slow FS access iotop, iostat, dstat, df -h, lsof
Network Packet drops, latency spikes iftop, nethogs, ss, netstat, ping
Process Zombie/stuck/high CPU tasks ps aux, top, pstree, strace, lsof
System Load Load avg spikes uptime, top, w, sar, vmstat

🧠 Interview Tip

Q: How would you diagnose a slow Linux system?

A:
Start by checking system-wide metrics:

  • top/htop for CPU, memory, process bottlenecks.

  • iotop or iostat for disk I/O wait.

  • iftop or netstat for network delays.

Then drill down using:

  • strace, lsof, vmstat, and dmesg to isolate the culprit.



