106 Chronologically structured chapter that builds foundational Linux concepts step-by-step

Here's a chronologically structured chapter that builds foundational Linux concepts step-by-step and prepares you for FAANG-level interviews with tools, commands, and real-time scenarios.

🧠 Chapter: Mastering Linux Internals for Interviews

1. The Big Picture: How Linux Works

Overview: Role of the kernel, user space vs kernel space
Key Concept: Linux as a multitasking, multiuser, monolithic kernel system

2. CPU: The Core Executor

What: Executes instructions, context switching, scheduling
Terms: User time, system time, idle time
Command: top, htop, mpstat, uptime
Interview Insight: Explain CPU-bound vs I/O-bound processes

3. Memory: RAM and Virtual Memory

Concepts: Virtual memory, paging, swapping, buffers, cache
Commands: free -h, vmstat, /proc/meminfo, top (RES/VIRT/SHR)
Interview Tip: What causes high swap usage?

4. Processes and Threads

Process: An independent executing program (PID, PPID)
Thread: Lightweight process sharing the same address space
Commands: ps -ef, pstree, top, htop
System Calls: fork(), exec(), exit(), wait()
Key Difference: fork() duplicates, exec() replaces, exit() terminates

5. Kernel: The Brain

Components: Process scheduler, memory manager, I/O manager
System Calls Interface: Bridge between user and kernel space
Command: uname -a, dmesg
Real World: Debugging kernel logs with dmesg

6. I/O and Disk Subsystems

Concepts: Block vs character devices, buffered I/O, async I/O
Commands: iostat, iotop, df -h, du -sh, lsblk, mount
Use Case: Identify I/O bottlenecks with iotop, pidstat -d

7. System Calls Deep Dive

What: Interface to kernel services (e.g., file ops, process control)
Examples: read(), write(), open(), close(), kill()
Tool: strace — trace system calls and signals
Interview: How exec() works under the hood? Use strace to show.

8. Process States and Lifecycle

States: Running, Sleeping, Uninterruptible Sleep, Stopped, Zombie (an orphan is a process whose parent has exited, not a scheduler state)
Monitoring Tools:
- top, htop – for real-time process view
- watch -n 1 'ps aux | grep <pid>'
- pidstat – CPU, memory, I/O usage over time
- lsof – list open files by a process
Real-World Debug: A zombie process scenario

9. Signals in Depth

Types: TERM, KILL, STOP, CONT, HUP, INT, etc.
Commands:
- kill -SIGTERM <pid> – graceful shutdown
- kill -9 <pid> – force kill
- trap in shell scripts
Tools: strace -p <pid> to inspect signal handling
Real-Time Example: Releasing stuck processes via SIGKILL

10. Background & Detached Execution

Commands:
- nohup command & – runs even after logout
- disown %job_id – remove from job table
Use Case: Run long jobs on remote systems safely

11. Advanced Performance Debugging

renice – change process priority
pidstat – profile specific PIDs
strace – syscall tracing
lsof – open file/socket tracking
dmesg – kernel ring buffer

12. Process Monitoring in Production

Monitor All States:
- ps -eo pid,state,cmd
- top -H – thread view
- watch -n 1 'ps aux | grep <app>'
Real-World Case: Memory leak or CPU spike in production
- top → strace → lsof → kill or renice

Let's break down and deeply explain the concepts of CPU time and the Linux scheduler using real-world analogies, command-line examples, and system internals. This will help you understand it at an interview level, especially for FAANG or senior DevOps/System Engineer roles.

🔧 Part 1: What is CPU Time?

✅ Definition:

CPU Time refers to the amount of time a CPU spends executing a specific process's instructions, excluding any time the process is idle or waiting for I/O (disk/network) operations.

🔄 Breakdown:

There are typically two types of CPU time:

User CPU Time: Time spent executing user-space code (your application).
System CPU Time: Time spent in the kernel (system calls, managing files, sockets, memory).

💡 Analogy:

Imagine the CPU as a chef in a kitchen.

Each process is a customer placing an order (program to run).
CPU time is the time the chef (CPU) actually spends cooking the dish (executing instructions) — not waiting for ingredients (I/O).

📌 Real Example:

$ time ls -l

Output:

real    0.003s
user    0.001s
sys     0.002s

real: Total elapsed wall-clock time (you watching).
user: Time spent executing in user space (0.001s).
sys: Time spent in kernel space (0.002s).

So CPU time = user + sys = 0.003s.
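The same user/system split is kept per process in /proc. The sketch below is a minimal illustration, assuming a Linux /proc filesystem; the helper name cpu_ticks is hypothetical, and the counters are in clock ticks (commonly 100 per second, which getconf CLK_TCK reports).

```shell
#!/bin/sh
# Print "utime stime" (in clock ticks) for a PID, read from /proc/<pid>/stat.
# cpu_ticks is an illustrative helper name, not a standard command.
cpu_ticks() {
    # Strip "pid (comm) " first: comm may itself contain spaces or ")".
    # After stripping, utime is field 12 and stime is field 13.
    sed 's/^.*) //' "/proc/$1/stat" | cut -d' ' -f12,13
}

# Burn some user-space CPU in this shell, then report its own counters.
i=0
while [ "$i" -lt 200000 ]; do i=$((i + 1)); done
ticks=$(cpu_ticks "$$")
echo "user=${ticks% *} sys=${ticks#* } (clock ticks)"
```

A busy loop like this one should raise mostly the user counter, whereas a loop full of file reads would raise the system counter.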

⚙️ Part 2: Linux Scheduler (How CPU Time is Shared)

✅ What is the Scheduler?

The Linux scheduler is a kernel component responsible for deciding which process/thread runs on the CPU and for how long.

It manages CPU time sharing to ensure:

Efficiency
Fairness (all get CPU)
Responsiveness (interactive processes run fast)
Throughput (keep CPUs busy)

📦 Types of Scheduling Policies in Linux:

Policy	Description
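`SCHED_OTHER` (default)	Normal time-sharing policy, historically implemented by CFS (Completely Fair Scheduler), which balances CPU time fairly across processes
`SCHED_FIFO`	Real-time: first-in, first-out. No time slice; runs until it blocks, yields, or is preempted by a higher-priority real-time task.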
`SCHED_RR`	Real-time: Round-robin. Time slice-based rotation.
`SCHED_DEADLINE`	Guarantees deadlines for real-time tasks.

🧠 How the CFS Scheduler Works (Deep Dive)

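CFS (Completely Fair Scheduler) was the default scheduler for normal tasks from kernel 2.6.23 until kernel 6.6, which replaced it with EEVDF (Earliest Eligible Virtual Deadline First). EEVDF keeps the weighted virtual-runtime accounting described here, so the concepts below still apply.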

🔁 Key Concept:

Each process is assigned a virtual runtime (vruntime). The process with the lowest vruntime gets the CPU.

📈 Idea:

Track how long a process has used the CPU.
If a process has had less CPU time than others, it is prioritized next.
Ensures fair CPU time proportionate to process weight (nice value).
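The "lowest vruntime runs next" rule can be sketched as a toy simulation. This is not kernel code: the task names and the 2:1 weights are made up for illustration (the real kernel derives weights from nice values), and simulate_cfs is a hypothetical helper name.

```shell
#!/bin/sh
# Toy model of "pick the task with the lowest vruntime".
# Task names and weights are invented; only the selection rule mirrors CFS.
simulate_cfs() {
    awk -v ticks="$1" 'BEGIN {
        n = 2
        name[1] = "A"; weight[1] = 2; vruntime[1] = 0   # A is entitled to twice the share of B
        name[2] = "B"; weight[2] = 1; vruntime[2] = 0
        for (t = 1; t <= ticks; t++) {
            # Select the task with the smallest vruntime (ties go to the first task).
            pick = 1
            for (i = 2; i <= n; i++)
                if (vruntime[i] < vruntime[pick]) pick = i
            ran[pick]++
            # vruntime advances more slowly for heavier tasks.
            vruntime[pick] += 1 / weight[pick]
        }
        for (i = 1; i <= n; i++) printf "%s ran %d ticks\n", name[i], ran[i]
    }'
}

simulate_cfs 30
```

Over 30 ticks the heavier task ends up with two thirds of the CPU, which is the proportional fairness the weights are meant to express.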

🛠️ Tools:

You can view scheduling details using:

ps -eo pid,comm,ni,pri,cls,stat --sort=pid

ni – Nice value (lower = higher priority)
pri – Kernel priority
cls – Scheduling class (TS = CFS, FF = FIFO)
stat – Process state

🧮 Example of Scheduler in Action

Imagine three processes:

Process A (Interactive shell)
Process B (Background database)
Process C (CPU-intensive encoding)

What scheduler does:

Process A: gets quick CPU bursts so shell is responsive.
Process B: gets occasional CPU as it's mostly waiting for I/O.
Process C: gets fair chunk, but not all CPU, to keep system responsive.

📟 Demo: Viewing CPU Time & Scheduling

🧪 Check CPU time per process:

ps -eo pid,etime,time,comm --sort=-time | head

etime – Elapsed real time since the process started
time – Total CPU time (user + system) consumed

👨‍💻 Strace Example:

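To see how long a process spends inside individual system calls:

strace -T -p <pid>

-T shows the wall-clock time spent in each system call (this includes time spent blocked, not only CPU time). For a per-syscall summary of call counts and kernel CPU time, run strace -c -p <pid> and detach with Ctrl+C.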

🧠 Interview-Level Questions

Question	Answer
What is CPU time?	Time CPU spends executing user + system code of a process.
Difference between wall-clock and CPU time?	Wall-clock is total elapsed; CPU time is time the CPU actually executed your process.
What does the Linux scheduler do?	Decides which process/thread to run next on the CPU.
What is vruntime in CFS?	A measure of how much CPU time a process has had; lower values are run first.
What is the default scheduler in Linux?	CFS (Completely Fair Scheduler) from kernel 2.6.23 through 6.5; EEVDF since 6.6.

📚 Summary (In a Nutshell)

Concept	Description
CPU Time	Actual processing time used by the CPU for a process.
User Time	Time in user space (application code).
System Time	Time in kernel space (system calls).
Linux Scheduler	Kernel component that selects which process gets CPU time.
CFS	Ensures fair CPU sharing using `vruntime`.
Tools	`top`, `ps`, `strace`, `htop`, `time`, `pidstat`

105 Process states and Management

Here’s a complete guide on Process Management – Process States, focused on FAANG-level interview preparation, with clear explanations, real-world examples, and 20 high-quality Q&A to test and reinforce your understanding.

🔧 1. Key Topics to Cover (Process Management & Process States)

Category	Topics
Process Basics	PID, PPID, UID, GID, nice, renice
Process Lifecycle	Creation, Execution, Termination
Process States (Linux/Unix)	`Running`, `Waiting`, `Stopped`, `Zombie`, `Sleeping`, `Dead`
State Transitions	How and when process moves between states
Signals	`kill`, `SIGKILL`, `SIGTERM`, `SIGSTOP`, `SIGCONT`, etc.
Parent-Child Relationship	Orphan and Zombie processes
Foreground/Background Jobs	`fg`, `bg`, `jobs`, `&`, `nohup`
Process Scheduling	`nice`, `renice`, priority (niceness level), time slicing
Troubleshooting Tools	`ps`, `top`, `htop`, `pstree`, `strace`, `lsof`, `kill`, `nice`, `renice`, `watch`

🚦 2. Process States in Linux (FAANG-focused)

State	Description
`R` (Running)	Executing on a CPU or runnable and waiting in the run queue
`S` (Sleeping)	Interruptible sleep: waiting for an event such as I/O, a timer, or a signal
`D` (Uninterruptible sleep)	Waiting inside the kernel, usually for disk or NFS I/O; signals are not delivered until the wait ends
`T` (Stopped)	Suspended by a job-control signal such as SIGSTOP or SIGTSTP
`Z` (Zombie)	Process has exited but its parent has not yet called `wait()`
`X` (Dead)	Terminated; transient and practically never visible
`I` (Idle)	Idle kernel threads only
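The state letters can be watched changing on a live process. The sketch below is a minimal illustration, assuming a Linux /proc filesystem; proc_state is a hypothetical helper and sleep 30 stands in for a real workload.

```shell
#!/bin/sh
# Print the one-letter state of a PID from /proc/<pid>/stat.
# proc_state is an illustrative helper name.
proc_state() {
    # Drop "pid (comm) " so the state letter becomes the first field.
    sed 's/^.*) //' "/proc/$1/stat" | cut -d' ' -f1
}

sleep 30 &                      # stand-in workload: blocks in a timer wait
pid=$!
sleep 1
state_sleeping=$(proc_state "$pid")

kill -s STOP "$pid"             # suspend it
sleep 1
state_stopped=$(proc_state "$pid")

kill -s CONT "$pid"             # resume it
sleep 1
state_resumed=$(proc_state "$pid")

echo "sleeping=$state_sleeping stopped=$state_stopped resumed=$state_resumed"
kill "$pid"
wait "$pid" 2>/dev/null || true
```

The waits between steps are there because signal delivery is asynchronous; reading the state immediately after kill can still show the old letter.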

📚 3. FAANG-Ready Interview Questions + Answers

✅ Basic to Intermediate

Q1. What are the different states a process can be in? Explain.

Answer:

Running (R): Actively on CPU or ready to run.
Sleeping (S): Waiting for an event such as I/O, a timer, or a lock; can be interrupted by signals.
Uninterruptible Sleep (D): Waiting on resources like disk; cannot be interrupted.
Stopped (T): Halted by a signal (e.g., SIGSTOP).
Zombie (Z): Terminated but not reaped by parent.
Dead (X): Process terminated and removed from system.

Q2. How can you identify zombie processes on Linux?

Answer:

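ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'

Or, using the single-letter state column:

ps -eo pid,ppid,state,cmd | awk '$3 == "Z"'

Filtering on the state column avoids the false matches a plain grep Z produces on any line that contains the letter Z. Zombie processes show as state Z and are cleared only when the parent calls wait() or exits, after which init reaps them.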

Q3. What is the difference between zombie and orphan process?

Zombie	Orphan
Child terminated, parent alive but didn’t call wait()	Child alive, parent terminated
Consumes PID	Reassigned to init (PID 1)
Can’t be killed	Kernel adopts it

Q4. How does a process move from running to waiting (sleeping)?

Answer:
A process transitions from running to sleeping when it requests I/O or is waiting for a resource (disk, network, user input). The scheduler then switches to another process until I/O is ready.

Q5. Explain the lifecycle of a Linux process.

Answer:

Created – via fork() (or clone()); exec() then replaces the program image inside the new process
Ready – added to scheduler queue
Running – executes instructions
Waiting – for I/O or other resource
Terminated – exits using exit()
Zombie – if not reaped by parent

✅ Advanced FAANG-Level

Q6. How do you handle a zombie process in a production server?

Answer:

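Identify with ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'
Check the PPID to find the parent that is not calling wait().
Fix or restart the parent; sending it SIGCHLD (kill -s CHLD <ppid>) sometimes prompts it to reap. As a last resort, kill the parent so init (PID 1) adopts and reaps the zombie.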

Q7. How to find which syscall a process is waiting on?

Answer:
Use strace:

strace -p <pid>

This shows the system calls being waited on.

Q8. What does the `D` state (Uninterruptible sleep) mean and why is it dangerous?

Answer:
It indicates the process is waiting for I/O and cannot be killed or interrupted (even with kill -9). This can be a sign of hardware issues, NFS hang, or I/O lock.

Q9. How does the Linux kernel schedule processes?

Answer:

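Linux schedules normal tasks with CFS (Completely Fair Scheduler), which kernel 6.6 replaced with the closely related EEVDF scheduler.
CFS keeps runnable tasks in a red-black tree ordered by virtual runtime and runs the left-most (least-served) task.
Each process gets a share of CPU time weighted by its nice value; real-time policies (SCHED_FIFO, SCHED_RR) take precedence over normal tasks.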

Q10. What happens when you run `kill -9 <pid>` on a zombie?

Answer:
Nothing. Zombie processes are already terminated. They can’t be killed. You need to terminate the parent process so the kernel can clean up the zombie.

✅ Real-World + Debugging

Q11. How do you debug high CPU usage from a process?

Answer:

top or htop
pidstat -p <pid> 1
strace -p <pid>
lsof -p <pid>

These tools help track what the process is doing, system calls, and open files.

Q12. How do you change the priority of a running process?

Answer:

renice -n 10 -p <pid>

Niceness range: -20 (highest priority) to 19 (lowest priority).
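To confirm what niceness a process actually has, the value can be read back from /proc. The sketch below is a minimal illustration, assuming a Linux /proc filesystem; nice_of is a hypothetical helper, and the job is started with nice (an increment relative to the shell) so the demo needs no root privileges.

```shell
#!/bin/sh
# Read a process's nice value from /proc/<pid>/stat.
# It is field 17 once the leading "pid (comm) " is stripped.
nice_of() {
    sed 's/^.*) //' "/proc/$1/stat" | cut -d' ' -f17
}

shell_nice=$(nice_of "$$")
nice -n 5 sleep 5 &             # start a job 5 niceness levels above this shell's value
job_pid=$!
sleep 1                         # let nice apply the change and exec sleep
job_nice=$(nice_of "$job_pid")
echo "shell nice=$shell_nice, job nice=$job_nice"
kill "$job_pid"
wait "$job_pid" 2>/dev/null || true
```

Any user may raise the niceness of their own processes; lowering it (giving more priority) normally requires root.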

Q13. What’s the use of the `pstree` command?

Answer:
Shows process hierarchy, useful for visualizing parent-child relationships:

pstree -p

Q14. What signal is sent by default using `kill` command?

Answer:
SIGTERM (15) – allows process to gracefully shut down.

Q15. When would you use `nohup`?

Answer:
To run a process after logout/session close:

nohup ./my_script.sh &

✅ Behavioral + Situational

Q16. You found a process in `D` state for hours. What do you do?

Answer:

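Check dmesg for hung-task warnings or I/O errors, and read /proc/<pid>/wchan (or /proc/<pid>/stack as root) to see where in the kernel it is blocked; strace usually shows nothing because the process never returns to user space.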
Check if NFS, disk, or DB is hung.
Consider restarting the service or rebooting if stuck.

Q17. A user reports background jobs disappearing after logout. Why?

Answer:

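When the session ends, the jobs started from that terminal receive SIGHUP, which terminates them by default. Use nohup or disown to detach processes from the terminal session.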

nohup script.sh &
disown
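The effect of nohup can be checked without logging out by sending the hangup signal by hand. This is a minimal sketch; sleep 5 stands in for a real long-running job, and output is sent to /dev/null so no nohup.out file is created.

```shell
#!/bin/sh
# Start a job that ignores SIGHUP, then simulate the hangup a logout would send.
nohup sleep 5 > /dev/null 2>&1 &
job_pid=$!
sleep 1                          # let nohup set SIGHUP to "ignore" and exec sleep
kill -s HUP "$job_pid"
sleep 1
if kill -0 "$job_pid" 2>/dev/null; then survived=yes; else survived=no; fi
echo "job survived SIGHUP: $survived"
kill "$job_pid" 2>/dev/null || true   # clean up the demo job
```

Running the same steps without nohup should report that the job did not survive, since the default action for SIGHUP is to terminate.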

Q18. How to monitor all states of processes?

Answer:

ps -eo pid,user,state,cmd | sort

Also use top, htop, watch.

Q19. What causes a process to become a zombie?

Answer:
Child process exits, but parent hasn’t called wait() to read its exit status. Kernel keeps it to return info when parent requests.
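A zombie can be produced on purpose to see this state first-hand. The sketch below is a minimal illustration, assuming a Linux /proc filesystem; children_states is a hypothetical helper. The inner shell starts a child that exits at once and then replaces itself with sleep, which never calls wait(), so the child stays a zombie until its parent goes away.

```shell
#!/bin/sh
# Print "<pid> <state>" for every direct child of the given parent PID.
# children_states is an illustrative helper name.
children_states() {
    for stat_file in /proc/[0-9]*/stat; do
        # A process may vanish mid-scan, so skip files that cannot be read.
        fields=$(sed 's/^.*) //' "$stat_file" 2>/dev/null) || continue
        state=${fields%% *}          # first field after "pid (comm) "
        rest=${fields#* }
        ppid=${rest%% *}             # second field
        if [ "$ppid" = "$1" ]; then
            pid=${stat_file#/proc/}
            echo "${pid%/stat} $state"
        fi
    done
}

# Parent that never reaps: the child (true) exits, the parent becomes sleep.
sh -c 'true & exec sleep 3' &
parent_pid=$!
sleep 1
zombie_report=$(children_states "$parent_pid")
echo "$zombie_report"

# Killing the parent hands the zombie to init (or a subreaper), which reaps it.
kill "$parent_pid"
wait "$parent_pid" 2>/dev/null || true
```

The report line ends in Z while the parent is alive; once the parent is killed the entry disappears from /proc.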

Q20. How can you force the cleanup of multiple zombie processes?

Answer:
Find the parent with:

ps -eo ppid,pid,state,cmd | awk '$3 == "Z"'

Then kill or restart the parent to allow init to reap the child processes.

📌 Summary Cheat Sheet

Process Tool	Purpose
`top` / `htop`	Real-time resource usage
`ps`	List process info
`strace`	Trace system calls
`lsof`	List open files/sockets
`kill` / `pkill` / `killall`	Send signals
`nice` / `renice`	Adjust priority
`jobs` / `bg` / `fg`	Background job control
`pstree`	Visualize process hierarchy

104 Basic Commands

The commands top, ps, and kill, together with CPU inspection tools such as lscpu and mpstat, are essential in Linux for monitoring and managing processes and system resources. Let’s go over each in detail, including:

Purpose
Syntax
Common use-cases
Example output
How to analyze the output

🔹 1. `top` — Real-time System Monitoring

📌 Purpose:

Displays real-time information about system processes, CPU, memory usage, and load average.

📌 Syntax:

top

📌 Example Output (Partial):

top - 08:45:26 up  2:34,  2 users,  load average: 0.15, 0.20, 0.25
Tasks: 138 total,   1 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.0 us,  1.0 sy,  0.0 ni, 95.0 id,  1.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7850.2 total,  2200.5 free,  3400.6 used,  2249.1 buff/cache
PID  USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
1324 root      20   0  123456  45678   1234 R  23.4  0.6   0:10.53 chrome

📌 How to Analyze:

Load average: First line shows 1, 5, and 15-minute system load. Rule of thumb: if load > number of CPUs, the system is overloaded.
%CPU: High %us (user) means CPU is working on your tasks, %sy (system) is kernel, %id is idle time.
%MEM: Memory usage per process.
PID/COMMAND: Helps identify the exact process consuming resources.

🔹 2. `ps` — Snapshot of Current Processes

📌 Purpose:

Displays a snapshot of current processes (unlike top which is real-time).

📌 Syntax:

ps aux       # All processes
ps -ef       # Also shows all, but with different format

📌 Example Output:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  8960  2340 ?        Ss   08:00   0:01 /sbin/init
prakash    1324  5.5  1.2 123456 45678 ?       Sl   08:45   0:10 chrome

📌 Key Fields to Analyze:

PID: Process ID.
%CPU, %MEM: CPU and memory usage.
VSZ/RSS: Virtual and resident memory size.
STAT: Process state (R running, S sleeping, Z zombie).
COMMAND: The command/process name.

🔹 3. `kill` — Send Signal to a Process

📌 Purpose:

Sends a signal (SIGTERM by default) to a process identified by its PID.

📌 Syntax:

kill <PID>               # Send SIGTERM (default)
kill -9 <PID>            # Send SIGKILL (force kill)
kill -l                  # List all signals

📌 Example:

ps aux | grep chrome
# prakash   1324  5.5  1.2 ... chrome
kill -9 1324

📌 How to Analyze:

If a process ignores SIGTERM and remains unresponsive, escalate to kill -9; SIGKILL gives the process no chance to clean up.
Use ps or top to find the PID of the problem process before killing.

🔹 4. `cpu` — (Note: There's no native `cpu` command in Linux)

📌 Possible Meanings:

You might be referring to:
- Checking CPU usage via top, htop, or mpstat.
- lscpu — to get CPU architecture info.

✅ Example 1: View CPU architecture

lscpu

Output:

Architecture:        x86_64
CPU(s):              8
Model name:          Intel(R) Core(TM) i7-8565U
CPU MHz:             1800.000

✅ Example 2: CPU usage

mpstat -P ALL 1

Output:

11:28:01 AM  CPU    %usr   %sys   %idle
11:28:02 AM  all     5.00   1.00   94.00

🔍 Summary Comparison Table

Command	Use Case	Output Highlights	When to Use
`top`	Real-time process monitoring	Load avg, CPU %, MEM %, PID, COMMAND	Live resource troubleshooting
`ps`	Snapshot of process state	PID, %CPU, %MEM, STAT, COMMAND	Get PID or process list
`kill`	Send signals (terminate)	N/A (command-line tool)	Stop/kill hung or rogue processes
`lscpu`	View CPU architecture info	Model name, cores, threads	Debug CPU availability or model
`mpstat`	CPU usage per core over time	%user, %sys, %idle	Performance bottleneck diagnosis

📌 Real-world Interview Tip (FAANG-ready):

Be prepared to:

Find high CPU processes using top or ps.
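Stop a runaway process with kill (SIGTERM first, kill -9 as a last resort), and explain why a zombie cannot be killed and must be reaped by its parent.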
Monitor memory leaks or CPU bottlenecks.
Explain CPU load and how to scale (e.g., vertical scaling, multithreading).

🔸 Category: General Process Monitoring

1. Q: How do you find the top 5 memory-consuming processes on a Linux system?

ps aux --sort=-%mem | head -n 6

--sort=-%mem: Sorts descending by memory usage.
head -n 6: First line is header.

2. Q: How do you monitor CPU usage per core in real-time?

mpstat -P ALL 1

-P ALL: Show stats for all cores.
1: Refresh every second.

3. Q: A process is stuck in zombie state. Can you kill it?

A: No. Zombie processes are already dead; they just haven’t been cleaned up. The parent process must wait() to release them. You can kill the parent process to clean it up.

4. Q: How do you identify zombie processes?

ps aux | awk '$8 ~ /^Z/ { print $2, $11 }'

Or:

ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'

5. Q: How do you kill all Java processes running on the system?

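pkill -f java    # -f matches the full command line, so anything containing "java" is hit

Or, by PID list:

pgrep -f java | xargs -r kill    # SIGTERM first; add -9 only if the processes ignore it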

🔸 Category: `top` Analysis

6. Q: What does the load average in `top` mean?

A: It shows the average number of tasks that are running, waiting to run, or (on Linux) in uninterruptible sleep, typically blocked on disk I/O:

First = 1-minute average
Second = 5-minute average
Third = 15-minute average
If it's > number of cores, system is overloaded.
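The rule of thumb above can be turned into a small check. This is a minimal sketch, assuming /proc/loadavg and the coreutils nproc command are available; load_status is a hypothetical helper, and "overloaded" only means the load exceeds the CPU count, not that the cause is known.

```shell
#!/bin/sh
# load_status LOAD CPUS -> prints "overloaded" if LOAD exceeds CPUS, else "ok".
# Hypothetical helper; awk performs the floating-point comparison.
load_status() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        result = (load + 0 > cpus + 0) ? "overloaded" : "ok"
        print result
    }'
}

# Live values: the first field of /proc/loadavg is the 1-minute average.
read -r load_1min rest < /proc/loadavg
cpu_count=$(nproc)
echo "1-min load $load_1min on $cpu_count CPUs: $(load_status "$load_1min" "$cpu_count")"
```

For example, load_status 9.5 8 prints overloaded and load_status 2.0 8 prints ok.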

7. Q: In `top`, what does `%wa` mean in CPU stats?

A: %wa is the percentage of time the CPU sat idle while tasks were waiting for I/O to complete. High %wa usually points to slow disks or stalled network filesystems such as NFS.

8. Q: How do you sort by memory in `top`?

A: Inside top, press Shift + M.

9. Q: You see a process consuming 100% CPU in `top`. How do you find out what it's doing?

Get PID from top
Use strace -p <pid> to trace syscalls.
Use lsof -p <pid> to inspect open files/sockets.

10. Q: How can you monitor top 10 CPU processes in real-time with a script?

watch -n 2 "ps -eo pid,comm,%cpu,%mem --sort=-%cpu | head -n 11"

🔸 Category: `ps` and `kill`

11. Q: What's the difference between `kill` and `kill -9`?

kill sends SIGTERM (15): Graceful shutdown.
kill -9 sends SIGKILL (9): Force kill, can't be trapped or ignored.

12. Q: How do you list all running processes by a specific user?

ps -u <username>

13. Q: How do you kill all processes in a specific group (e.g., all child processes of a PID)?

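pkill -P <parent_pid>     # direct children of the given PID only
kill -- -<pgid>           # every process in the process group <pgid>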

14. Q: How do you determine parent-child relationships between processes?

ps -eo pid,ppid,cmd

ppid: Parent PID
Use pstree for visual tree

15. Q: How do you identify processes using the most open files?

lsof | awk '{ print $2 }' | sort | uniq -c | sort -nr | head
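The lsof pipeline counts every lsof row per PID, which also includes memory-mapped libraries and the working directory. To count only open file descriptors, /proc can be read directly. This is a minimal sketch, assuming a Linux /proc filesystem; fd_count and top_fd_users are hypothetical helpers, and without root the descriptors of other users' processes are unreadable and count as 0.

```shell
#!/bin/sh
# Count open file descriptors of a PID by listing /proc/<pid>/fd.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Rank every process by descriptor count, highest first.
# Prints "<count> <pid>" lines; the argument limits how many are shown.
top_fd_users() {
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        echo "$(fd_count "$pid") $pid"
    done | sort -rn | head -n "${1:-5}"
}

top_fd_users 5
```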

🔸 Category: CPU & System Info

16. Q: How do you get CPU core count and thread info on Linux?

lscpu

CPU(s): Logical cores
Core(s) per socket: Physical cores per CPU
Thread(s) per core: Hyper-threading

17. Q: How do you monitor CPU utilization over time and graph trends?

Use sar from sysstat:

sar -u 1 10

Or use tools like Grafana + Prometheus.

18. Q: How do you detect a CPU bottleneck vs memory bottleneck?

High %us or %sy and low %id in top: CPU bottleneck.
High %wa: I/O bottleneck.
High memory usage, swapping (high si/so in vmstat): Memory bottleneck.
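The percentages top reports are derived from cumulative counters in /proc/stat. The sketch below is a minimal illustration of that calculation, assuming a Linux /proc filesystem; cpu_counters and cpu_percentages are hypothetical helpers, and all non-idle time is lumped together as "busy" for brevity.

```shell
#!/bin/sh
# Print cumulative "busy idle iowait" tick counters from the aggregate cpu line.
cpu_counters() {
    # Fields after "cpu": user nice system idle iowait irq softirq steal ...
    awk '/^cpu / { printf "%.0f %.0f %.0f\n", $2 + $3 + $4 + $7 + $8 + $9, $5, $6 }' /proc/stat
}

# cpu_percentages "b1 i1 w1" "b2 i2 w2" -> "busy% idle% iowait%" over the interval.
cpu_percentages() {
    echo "$1 $2" | awk '{
        busy = $4 - $1; idle = $5 - $2; iowait = $6 - $3
        total = busy + idle + iowait
        if (total <= 0) { print "0.0 0.0 0.0"; exit }
        printf "%.1f %.1f %.1f\n", 100 * busy / total, 100 * idle / total, 100 * iowait / total
    }'
}

before=$(cpu_counters)
sleep 1
after=$(cpu_counters)
echo "busy idle iowait (%): $(cpu_percentages "$before" "$after")"
```

A high first number with low idle suggests a CPU bottleneck, while a high third number suggests the CPUs are mostly waiting on I/O.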

19. Q: What is the difference between VSZ and RSS in `ps`?

VSZ: Virtual memory size (includes code, data, shared libs).
RSS: Resident Set Size (actual physical memory in use).

20. Q: How do you trace what files or ports a process is using?

lsof -p <pid>

Shows open files, sockets, network connections, etc.

✅ Bonus Tip for Interviews:

Always explain why a tool is used.
Tie back to real-world issues (e.g., server slowness, memory leaks, runaway processes).
Use watch + ps/top/lsof/netstat combinations to show dynamic diagnosis.
Know when to use strace, perf, or iotop for deeper profiling.

If a process is consuming 100% CPU in top, you can analyze what it is doing using a combination of strace, lsof, and other tools. Here's a complete step-by-step example to guide you through this investigation:

🔍 Scenario:

You observe high CPU usage in top, and you want to investigate what that process is doing.

✅ Step-by-Step Analysis:

Step 1: Identify the Process in `top`

top

Look for the process using the most CPU (e.g., 100%).

Sample output:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
12345 user1     20   0   10240    920    860 R 100.0  0.0   1:23.45 python

→ The PID is 12345, and it's a Python process.

Step 2: Use `strace` to Trace System Calls

Attach strace to the running process to see what system calls it's making:

sudo strace -p 12345

Sample output:

clock_gettime(CLOCK_MONOTONIC, {tv_sec=12345, tv_nsec=678900000}) = 0
read(3, "", 4096)                         = 0
epoll_wait(4, [], 128, 0)                = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=12345, tv_nsec=679000000}) = 0

This shows the process is making repeated system calls—maybe polling or in a busy loop.

You can also log the output to a file and analyze later:

sudo strace -tt -T -p 12345 -o strace.log

Step 3: Use `lsof` to See Open Files/Sockets

This tells you which files, libraries, sockets, etc., the process is using.

sudo lsof -p 12345

Sample output:

COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
python  12345  user1  cwd    DIR  8,1     4096      123456 /home/user1/myapp
python  12345  user1  txt    REG  8,1   123456      654321 /usr/bin/python3.8
python  12345  user1  mem    REG  8,1    45678      987654 /lib/x86_64-linux-gnu/libc.so.6
python  12345  user1    3u   REG  8,1   345678      456789 /home/user1/myapp/log.txt
python  12345  user1    4u  IPv4 23456      0t0        TCP 127.0.0.1:45678->127.0.0.1:3306 (ESTABLISHED)

→ It’s connected to a MySQL server on port 3306 — maybe it’s stuck querying DB.

Step 4: Optional - Check the Stack Trace

Use pstack (if installed) to print the stack trace of the process:

sudo pstack 12345

Sample output:

#0  0x00007f3f5dbae430 in __libc_recv () from /lib64/libc.so.6
#1  0x000055b6d2e5a891 in socket_read ()
#2  0x000055b6d2e4ff3a in main_loop ()

This helps in debugging loops or tight recursion inside the application.

🧠 Interview-ready Summary

Q: You see a process using 100% CPU in top. How do you find out what it's doing?

Get the PID from top.
Use strace -p <pid> to trace system calls in real-time.
Use lsof -p <pid> to check files, sockets, and shared libs used.
(Optional) Use pstack <pid> for stack trace.
Analyze logs or application logic based on what these tools show.

This helps you understand whether it’s stuck in a loop, polling, doing excessive computation, or waiting on I/O.

Here's a clear and detailed explanation of kill vs pkill, including differences, use cases, and real examples, perfect for interview prep:

🔪 `kill` vs `pkill` in Linux

Feature	`kill`	`pkill`
Targets	Process by PID	Process by name or pattern
Signal	Default is `SIGTERM` (15)	Default is `SIGTERM` (15)
Flexibility	Needs PID only	More user-friendly (no PID needed)
User Scope	Affects any PID (if permitted)	Can restrict by user, session, terminal

✅ `kill` — Send signal to PID

📌 Syntax:

kill [-SIGNAL] PID

✅ Example:

top

Find a process with PID 12345.

Kill it:

kill 12345

Send a specific signal (e.g., SIGKILL = 9):

kill -9 12345

📌 You must know the PID beforehand.

✅ `pkill` — Send signal to process name

📌 Syntax:

pkill [-SIGNAL] pattern

✅ Example:

To kill all processes with the name python:

pkill python

To force kill (SIGKILL) all nginx processes:

pkill -9 nginx

Restrict to processes run by a specific user:

pkill -u prakash java

Send signal only to processes matching name and terminal:

pkill -t pts/1 bash

📌 You don’t need to look up the PID manually.

🧠 Real-World Use Case Comparison

Task	Command
Kill process with known PID	`kill 9876`
Kill all Java processes	`pkill java`
Gracefully stop nginx (default SIGTERM)	`pkill nginx`
Force kill Python script by PID	`kill -9 13579`
Kill all processes for a user	`pkill -u prakash`
Kill process by exact match (not substring)	`pkill -x nginx`

🛑 Common Signals

Signal Name	Number	Purpose
`SIGTERM`	15	Graceful termination
`SIGKILL`	9	Forceful termination
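`SIGHUP`	1	Terminal hangup; many daemons treat it as "reload configuration"
`SIGSTOP`	19	Pause process (cannot be caught; number shown is for x86/ARM)
`SIGCONT`	18	Resume stopped process (number shown is for x86/ARM)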

🧠 Interview-ready Summary

Q: What's the difference between kill and pkill?

kill sends signals to a specific process ID (PID).
pkill matches process names or patterns, making it easier to target multiple or unknown PIDs.

Example:

kill -9 1234 kills PID 1234.
pkill -9 nginx kills all nginx processes.

Here’s a detailed explanation of the pstree command with an example output — useful for Linux interviews and real-time debugging.

🌳 What is `pstree`?

pstree shows processes in a tree format, visualizing parent-child relationships (i.e., which process spawned which).

✅ Basic Syntax:

pstree

📌 Example Output:

systemd─┬─NetworkManager───2*[{NetworkManager}]
        ├─sshd───sshd───bash───pstree
        ├─cron
        ├─dbus-daemon
        ├─gnome-shell───2*[{gnome-shell}]
        ├─firefox───9*[{firefox}]
        ├─pulseaudio───{pulseaudio}
        └─cupsd

🔍 How to Read It:

systemd is the root (PID 1).
sshd started a child sshd session, which launched bash, which launched pstree.
firefox has 9 threads shown as {firefox}.
2*[{NetworkManager}] means two threads under NetworkManager.

🔧 Useful Options:

Command	Description
`pstree -p`	Show PIDs
`pstree -u`	Show usernames
`pstree -a`	Show command line arguments
`pstree -h`	Highlight current process tree
`pstree <username>`	Show processes owned by a specific user
`pstree <PID>`	Show subtree for a specific process

📌 Example with PID and arguments:

pstree -p -a

Output:

systemd(1)─┬─cron(585)
           ├─sshd(980)───sshd(1032)───bash(1033)───pstree(1080) -p -a
           ├─nginx(850) -g daemon off;───nginx(851)

🧠 Interview Tip

Q: Why use pstree instead of ps?

A: pstree visually shows parent-child process hierarchy, helping you trace how a process was spawned (e.g., sshd → bash → python script) — useful in debugging background services, daemons, or containerized environments.

Here's a code-block-style diagram to help you visually understand bottlenecks in a Linux system, categorized by key system components like CPU, memory, disk I/O, network, and processes. This is extremely useful for both troubleshooting and interviews:

📊 Linux System Bottlenecks – Block Diagram (Text-Based)

+-----------------+        +----------------+       +------------------+
|     CPU         |<-----> |    Processes   |<----->|      Memory       |
+-----------------+        +----------------+       +------------------+
      ^  |                          ^   |                      |
      |  |                          |   |                      v
      |  |                    +-----+   +------+        +-------------+
      |  |                    | High CPU usage |        |   Swapping  |
      |  +------------------->| Infinite loops |        |   Thrashing |
      |                       | Busy-waiting   |        +-------------+
      |
      |          +----------------+    
      |          | Interrupts     |     
      +----------| Context Switch |    
                 +----------------+   

        |
        v
+----------------+
|     Load Avg   |
|  (uptime, top) |
+----------------+

+-------------------+        +-------------------+
|     Disk I/O      |<-----> |     Processes     |
+-------------------+        +-------------------+
        |                              ^
        |                              |
        v                              |
+-------------------+        +--------------------------+
|  High I/O Wait    |<-------|  Log-heavy apps, DBs     |
|  (iostat, iotop)  |        |  Misconfigured writes    |
+-------------------+        +--------------------------+

+-------------------+        +-------------------+
|    Network I/O    |<-----> |     Processes     |
+-------------------+        +-------------------+
        |                              ^
        |                              |
        v                              |
+---------------------+       +-------------------------+
| High RX/TX, Drops   |<------| Chatty apps, poor MTU   |
| (iftop, netstat)    |       | Packet loss, congestion |
+---------------------+       +-------------------------+

🔍 Mapping Bottlenecks to Tools

Bottleneck Area	Symptoms	Commands to Diagnose
CPU	High load, 100% usage	`top`, `htop`, `mpstat`, `pidstat`
Memory	Swapping, OOM kills	`free -m`, `vmstat`, `dmesg`, `top`
Disk I/O	High iowait, slow FS access	`iotop`, `iostat`, `dstat`, `df -h`, `lsof`
Network	Packet drops, latency spikes	`iftop`, `nethogs`, `ss`, `netstat`, `ping`
Process	Zombie/stuck/high CPU tasks	`ps aux`, `top`, `pstree`, `strace`, `lsof`
System Load	Load avg spikes	`uptime`, `top`, `w`, `sar`, `vmstat`

🧠 Interview Tip

Q: How would you diagnose a slow Linux system?

A:
Start by checking system-wide metrics:

top/htop for CPU, memory, process bottlenecks.
iotop or iostat for disk I/O wait.
iftop or netstat for network delays.
Then drill down using:
strace, lsof, vmstat, and dmesg to isolate the culprit.

103 Sudo vs Su and Principle of least privileges

1. Difference Between `su` and `sudo`

Basic Definition

su (Substitute User or Switch User):
Allows you to switch to another user account, typically the root user. You’ll be prompted to enter the target user’s password.
sudo (Superuser Do):
Allows you to run a single command with elevated privileges, but you use your own password (not root's). Access is controlled through the /etc/sudoers file.

Key Differences

Feature	`su`	`sudo`
Purpose	Switch to another user session	Run a command with elevated privileges
Password Needed	Target user's password (e.g. root)	Your own password
Security	Less secure (full shell access)	More secure (limited command access)
User Traceability	Logs the switch itself, not the commands run afterwards	Each command logged (e.g. `/var/log/auth.log` on Debian/Ubuntu, `/var/log/secure` on RHEL-family)
Configuration	No configuration	Highly configurable via `/etc/sudoers`
Session Scope	Creates a new shell	Executes a single command

Real-World Usage

Use su when:
- You need to maintain a full root shell session.
- You're working in an environment where sudo isn’t configured.
Use sudo when:
- You want to minimize risk by limiting access to specific commands.
- You're working in a multi-user environment and need an audit trail.
- You want to adhere to best security practices.

Example

# Using su to become root
su -
# (enter root password)
apt update

# Using sudo to run a command as root
sudo apt update
# (enter your password)

2. Principle of Least Privilege (PoLP)

What It Means

The Principle of Least Privilege is a security best practice stating that users and processes should be granted only the minimum level of access needed to perform their tasks — no more, no less.

💡 Why It’s Important

Reduces attack surface: Limits what attackers can do if they gain access.
Minimizes human error: Prevents accidental deletion or system changes.
Improves auditability: Easier to track and understand permission usage.
Supports compliance: Many regulations (e.g., HIPAA, GDPR) require it.

How It's Applied in Linux

User Permissions
Users are placed into groups and given access to only necessary files or directories.
Sudo Configuration
The /etc/sudoers file is used to allow users to run specific commands as root, without giving full root access.

Example:
```
john ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart apache2
```
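This lets john restart Apache through sudo without a password prompt (NOPASSWD) and, provided no other sudoers rule applies to him, run nothing else as root. Edit the file with visudo so syntax errors are caught before they lock you out.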
File Ownership and Permissions
The chmod, chown, and chgrp commands are used to tightly control file access.
Service Accounts
Processes like web servers or databases run under dedicated users (e.g., www-data) that only have access to their specific directories.

Interview-Worthy Talking Points

“Using sudo instead of su enforces least privilege by giving users temporary access to specific tasks.”
“We implement PoLP by reviewing user permissions regularly and removing unnecessary sudo privileges.”
“I once audited a system where developers had full root access — we moved them to role-based sudo rules, which enhanced security significantly.”
