Linux Tips - Tips N TRIKS

Monday, 17 February 2014

Linux Tips


Reset your root password


OK, this first trick involves getting your root password back when you’ve lost it. Ever had that happen? It can be rather nerve-wracking. And if you try to reboot the box and bring it up into single-user, it may actually ask for your password before allowing you to get a shell. That simply won’t do.
It’s easy, though, if you have access to the console, whether physical or serial.


When you first boot a Linux box, the kernel goes through and does its kerneley things. Setting up drivers, sacrificing firstborns, all that neat stuff. But after it’s all done and ready to start stuff in userspace, it starts up a program called “init”, which is the alpha and the omega of all userspace programs. It is the father of every single program that you see in ps except for the virtual ones that the kernel spawns, such as ksoftirqd.
You can use this fact to completely subvert the entire bootup process to your own ends.
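When your system comes up to the grub prompt, press “e” for edit. Then move the cursor to the “kernel” line and press “e” again. This will let you edit the line. (On GRUB 2 the line begins with “linux” instead, and you boot the edited entry with Ctrl-X rather than “b”.)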
Add the following to the end:
init=/bin/sh
And type “b” to boot. The kernel will boot, but instead of starting init, it will instead start a shell (you have basically fooled the kernel into thinking that /bin/sh is init).
Now run the following commands:
mount -o remount,rw /
mount /usr
passwd root
sync
sync
exit
The first command remounts the root filesystem read-write (you’ll need this in order to make any changes). The second command, which only matters if /usr is a separate filesystem, gives you some other tools – most likely your favorite text editor. The third command changes the password, but the fourth and fifth commands are the most important. Since there are no system services running at all, the chances of your changes actually being flushed to disk on their own are pretty low. Running the first “sync” causes it to happen. Running the second “sync” forces you to wait long enough for the data to actually reach the disk (these things don’t happen instantly).
The kernel will probably panic when you run exit, because the shell it was treating as init has just gone away. Don’t worry about this. Just reboot the system. You should be able to log in with your new root password.

Backups

This is one of those topics that separates the men from the boys, so to speak. Backups. Always, always, always have them.
Do not treat RAID arrays as if they are inviolable. It’s always a possibility that more than one disk could fail. And it’s also a possibility that the hardware itself could fail, thus corrupting the array. RAID is more fault tolerant than one disk, but only just. Don’t depend on it. A high-traffic company learned this a few weeks ago, taking no backups and depending on RAID mirroring. I’d be surprised if the sysadmin who worked there ever works again. Not doing at least some kind of backup is stupid, and possibly even negligent.
It doesn’t really matter what medium you put the backups on. If you have a small amount of critical data, you might consider putting it on CDs or DVDs and sending it offsite. If you have a lot of servers, you might want to consider tape. For my personal systems, I just copy it off onto hard drives that aren’t physically onsite. Restoring through a DSL connection is a pain, but it beats losing everything.
There are several ways of doing backups, and the good news is for Linux systems all of the software you need comes with your system. You can use scp (which is good for brute force backups). You can use rsync (which will keep your backups up to date, good for having a recent copy, not good for having archival copies you can go back to). mkisofs and cdrecord are good tools to create your backups to send to DVD. There are also enterprisey systems like Amanda. There are lots of different ways to do it and I won’t go into them here, but you can feel free to put your favorite method in the comments if you think it will help.
The main thing that I want to get across in this particular article is – do it! It doesn’t take much time to set it up, and you’re not going to think much about it until things die – at which point it will (maybe even literally, depending on whose data you’re storing) save your life.

Process States


If you look in ps for a process, you will usually see the characters S or R… and sometimes others. But what do they mean?
The kernel contains something called a run queue. When a process is ready to run, it tells the kernel that it needs some cycles from the CPU. Once it does this, it is said to be “in the run queue”. Its status at this point is runnable, or R.
When the process is not in the run queue, for example when it is waiting on input or doing something else that does not require processing time from the CPU, it is said to be in the “sleeping” state (interruptible sleep), or S.
D state is particularly annoying. It means “uninterruptible IO wait”. This means that the process is stuck in a system call waiting for some IO. When it is in this state, you cannot do anything with the process, not even send a signal (the signals are queued up waiting for the process to leave D state). There are only two ways out of this state – either fix the condition that is causing it (it’s usually NFS related), or reboot the system. There are no other options.
T state means the process is stopped, usually because it was sent a SIGSTOP or was suspended from the shell. You will need to send a SIGCONT to start the process again.
Z means the process is defunct. Either kill the parent, reboot the box, or live with it. This is actually fairly harmless, except that it takes up space in the process table.
There are some other values as well. You can “man ps” to find out what they are (look under PROCESS STATE CODES). You’ll find the output of ps to be much more informative than you thought, once you know how to read it.
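If you want to see these state codes without going through ps, here is a minimal sketch in Python that reads the same letter straight from /proc. It assumes a Linux system with /proc mounted, and the helper name proc_state is made up for this example.

```python
import os


def proc_state(pid):
    """Return the one-letter state code (R, S, D, T, Z, ...) for a PID.

    Assumes Linux: the code is the first field after the command name
    in /proc/<pid>/stat.
    """
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ")".
    after_name = data.rsplit(")", 1)[1]
    return after_name.split()[0]


own_state = proc_state(os.getpid())
# A process that is busy reading its own state is on the CPU, so this is R.
print("this process:", own_state)
```

Point it at the PID of an idle daemon and you will most likely get S back instead.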

Signals


Signals are one of the most visible aspects of the Linux operating system. They are also one of the least understood. Every sysadmin, even the PFYs who aren’t PFed yet, knows how to kill a process. But do you know how this works underneath? Do you know how flexible the Linux signalling system truly is?
If not, you’re about to find out.
Signals are yet another one of those kernel interfaces, like system calls and device drivers. They are not IPC in the sense that they cannot be used to send information to a program of themselves. They are basically an asynchronous way of telling a program that something is expected of it. They are also the kernel’s way of telling a program something as well.
There are three signals that are the most common, and a few more that are less common but just as important. These are:
Critically important
  • SIGKILL (9) – Terminate a process. Cannot be caught or ignored.
  • SIGSEGV (11) – Segmentation Fault.
  • SIGTERM (15) – Terminate a process in an orderly fashion.
And then there are the less well known but at least as important:
  • SIGBUS (7) – Bus Error
  • SIGCHLD (17) – Child process terminated
  • SIGSTOP (19) – Stop executing
  • SIGCONT (18) – Continue executing after a stop
All of these different signals have a specific meaning.
SIGSEGV is one you’ll be very familiar with. It means “Segmentation Fault”. This is actually triggered by something very deep in the hardware itself, but is usually caused by a careless programmer. It is raised when a program attempts to write to or read from memory that it has not actually been given. I’ll go into more details on that when I write about virtual memory.
A program can install a handler for this signal, but there is rarely anything useful it can do afterwards other than report the crash and exit.
SIGTERM and SIGKILL are two ways of saying “kill the process”. The difference is that SIGTERM can be caught or even ignored – the process can decide not to listen to this signal. It does not have the same option when it comes to SIGKILL. When you run kill without specifying a signal, a SIGTERM is sent. When you run kill -9, a SIGKILL is sent.
Because SIGKILL can’t be caught or ignored, the process does not have the ability to clean up after itself, and whatever it was doing at the time is left in an indeterminate state. Think of it this way – a SIGTERM is quitting time for the day, you get to pack up and take everything with you. A SIGKILL is like a fire alarm – you drop everything and leave the building.
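The difference between the two is easy to see from inside a program. The following is only a sketch, in Python, of a process that catches its own SIGTERM and then finds out that the kernel will not let it do the same for SIGKILL. The handler name on_term and the variable names are invented for the example.

```python
import os
import signal
import time

received = []


def on_term(signum, frame):
    # Runs instead of the default action (termination), which is where a
    # real program would do its cleanup.
    received.append(signal.Signals(signum).name)


# SIGTERM can be caught: install a handler, then signal ourselves.
signal.signal(signal.SIGTERM, on_term)
os.kill(os.getpid(), signal.SIGTERM)
time.sleep(0.1)  # give the interpreter a moment to run the handler
print("caught:", received)

# SIGKILL cannot be caught: the kernel refuses to install a handler.
try:
    signal.signal(signal.SIGKILL, on_term)
    kill_catchable = True
except OSError as exc:
    kill_catchable = False
    print("cannot catch SIGKILL:", exc)

# Put SIGTERM back to its default behaviour.
signal.signal(signal.SIGTERM, signal.SIG_DFL)
```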
SIGBUS is a bus error. This also originates deep in the hardware, but you’ll get this either under the same circumstances as SIGSEGV, or when hardware is failing. It’s just as catastrophic to a program as SIGSEGV.
SIGCHLD is the reason zombie processes exist. When a Linux process spawns a child (I’ll go into this process some other time), it basically owns the child. When the child dies, the parent process is notified of this fact via a SIGCHLD signal. The parent process is required to call the wait() system call in order to “reap” the child process. Between the time the SIGCHLD is sent and the time the parent reaps the child, the child exists only as an entry in the process table, also known as a defunct process, or zombie. So if you see a defunct process, one of two things has happened: the parent process is unable to reap the child, or whoever wrote the parent process screwed up.
I really have no idea why it was designed this way – I’m sure there’s some historical reason that will make perfect sense once I hear it, but it seems like an extra step to me.
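You can watch the zombie window described above open and close. This is a rough sketch in Python, assuming Linux with /proc mounted. The helper proc_state and the exit status 7 are arbitrary choices for the demonstration. Notice what the leftover process-table entry is holding on to: the child’s exit status, which the parent only receives when it waits.

```python
import os
import subprocess
import sys


def proc_state(pid):
    # Assumes Linux: the state code follows the ")" that closes the
    # command name in /proc/<pid>/stat.
    with open(f"/proc/{pid}/stat") as f:
        return f.read().rsplit(")", 1)[1].split()[0]


# Start a child that exits immediately with status 7.
child = subprocess.Popen([sys.executable, "-c", "raise SystemExit(7)"])

# Block until the child has exited, but leave it unreaped (WNOWAIT).
os.waitid(os.P_PID, child.pid, os.WEXITED | os.WNOWAIT)
state_before_reap = proc_state(child.pid)
print("before wait():", state_before_reap)

# Reap it. This is the step a buggy parent forgets.
exit_status = child.wait()
still_listed = os.path.exists(f"/proc/{child.pid}")
print("exit status:", exit_status)
print("still in the process table:", still_listed)
```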
SIGSTOP and SIGCONT are two special signals. SIGSTOP is sent to a process to tell it to stop. At this point, if you run ps on the process, it will show up with a status of “T”. Then, it will start executing again when you send SIGCONT.
Strace and other processes that attach to a running process use these signals.
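Here is a small sketch of the stop and continue cycle, written in Python and assuming Linux with /proc mounted. The child is just a Python process sleeping for 30 seconds, and proc_state is a helper invented for the example.

```python
import os
import signal
import subprocess
import sys


def proc_state(pid):
    # Assumes Linux: the state code follows the ")" that closes the
    # command name in /proc/<pid>/stat.
    with open(f"/proc/{pid}/stat") as f:
        return f.read().rsplit(")", 1)[1].split()[0]


child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

os.kill(child.pid, signal.SIGSTOP)
os.waitpid(child.pid, os.WUNTRACED)    # returns once the child has stopped
state_stopped = proc_state(child.pid)
print("after SIGSTOP:", state_stopped)

os.kill(child.pid, signal.SIGCONT)
os.waitpid(child.pid, os.WCONTINUED)   # returns once it has been continued
state_resumed = proc_state(child.pid)
print("after SIGCONT:", state_resumed)

# Clean up the demo child.
child.kill()
child.wait()
```

After the SIGCONT the child will normally show S again, since all it is doing is sleeping.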
There are many other signals as well. SIGUSR1 and SIGUSR2 are some pretty interesting ones – they’re user defined. Some processes will listen for these signals and do some interesting things – such as increase logging, for example.
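As an illustration of the logging case, here is one way a program might wire that up. This is a sketch in Python, not how any particular daemon does it; the logger name “demo” and the toggle behaviour are assumptions made for the example.

```python
import logging
import os
import signal
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demo")  # hypothetical logger name


def toggle_debug(signum, frame):
    # Flip between INFO and DEBUG each time SIGUSR1 arrives.
    new_level = logging.INFO if log.level == logging.DEBUG else logging.DEBUG
    log.setLevel(new_level)


signal.signal(signal.SIGUSR1, toggle_debug)

log.debug("not shown: still at the default level")
os.kill(os.getpid(), signal.SIGUSR1)   # same effect as: kill -USR1 <pid>
time.sleep(0.1)                        # give the handler a moment to run
log.debug("shown: SIGUSR1 switched on debug logging")
```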
Look in /usr/include/asm/signal.h for a complete list of signals, or run kill -l.
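If you would rather get the list from a script, Python’s signal module exposes the same table. The sketch below prints the signals discussed in this article. Keep in mind that only some numbers, such as 9 for SIGKILL and 15 for SIGTERM, are the same everywhere; others, including SIGBUS, SIGCHLD, SIGSTOP and SIGCONT, differ between architectures, so trust the names rather than the numbers.

```python
import signal

# Map every signal name known on this platform (aliases included)
# to its number.
signal_numbers = {
    name: int(sig) for name, sig in signal.Signals.__members__.items()
}

for name in ("SIGKILL", "SIGSEGV", "SIGTERM", "SIGBUS",
             "SIGCHLD", "SIGCONT", "SIGSTOP"):
    print(f"{name:8} {signal_numbers[name]}")
```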

Different kinds of files


Linux has many different kinds of files. First let’s start with a little more basic discussion: what is a file?
Basically, a file is anything that can have a file descriptor associated with it.
What is a file descriptor?
Ahh. Glad you asked. Sit down, this could take a bit.
Linux is POSIX compliant, which means its API (Application Programming Interface) is consistent with a set of standards developed for Unix a long time ago. It defines a set of system calls (system calls are basically a way of requesting services from the kernel) and library calls. Anything that is POSIX compliant is going to have the same basic core API, although the standards are vague enough that there is a little wiggle room here and there.
There are some important system calls when it comes to file manipulation. Four of them are:
  • open
  • read
  • write
  • close
These system calls make up the foundations of file manipulation, although there are other calls that are just as important to do things like erasing or moving a file.
A file descriptor is a number returned from an open() system call. That’s all it is: a number. However, once a file is opened, that descriptor is used to tell the kernel which opened file you are trying to operate on. The descriptor is passed into any other system call that is referencing that file, such as read(), write(), and close().
So, basically, a file is anything you can open using open().
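To make the descriptor idea concrete, here is a minimal sketch using Python’s os module, whose open, write, read and close functions are thin wrappers around the system calls of the same names. The file name demo.txt is a throwaway chosen for the example, and the lseek call is only there to rewind before reading back.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # throwaway file

# open() hands back a small integer: the file descriptor.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
print("descriptor:", fd, type(fd))

# Every later call names the file by that number, not by its path.
os.write(fd, b"hello, descriptor\n")
os.lseek(fd, 0, os.SEEK_SET)   # rewind so read() starts at the beginning
data = os.read(fd, 100)
os.close(fd)

print(data)
```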
You will find that nearly everything in Linux is a file – including network connections (though these don’t appear on the filesystem, you interact with them in nearly the same way as you do a regular file).
There are several different types of files.
  • File
  • Directory
  • Link
  • Named pipe
  • Block special file
  • Character special file
  • Socket
All of these file types ARE files, but they show up differently when listing a filesystem (the first character of the permissions shows you what kind of file it is) and, more importantly, behave differently when you try to operate on them.
A regular file (indicated by a “-”) is just that, a regular file. You can write to it, read from it, erase it, or whatever.
A directory (indicated by a “d”) is basically a file that contains a list of other files. It’s still a file, however.
A symbolic link (indicated by an “l”) is a file that points to another file – in such a way that the libraries and OS know how to follow it.
A named pipe (indicated by a “p”) is basically a FIFO (first in first out) that is exposed on the filesystem. These are used in interprocess communication – a process can have it open for reading, for example, while another has it open for writing. A socket (indicated by an “s”) is similar to a named pipe.
Character special (indicated by a “c”) and block special (indicated by a “b”) files are both ways to interface with kernel devices. For example, /dev/null is a character special file. When you write into it, the kernel takes the bytes and dumps them into the bitbucket. Other drivers do different things, for example, /dev/tty. When you do a ls of one of these, you’ll see a device major and device minor number – these numbers are the kernel’s way of keeping track of what goes where. You could rename /dev/null to /dev/Bush if you wanted to, and as long as it had the same major and minor numbers it would behave identically. The kernel doesn’t care what it’s called, only what it is.
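The type characters and device numbers described above can also be read from a program. The following Python sketch creates a regular file, a symbolic link and a named pipe in a temporary directory and prints the character ls would show for each, then looks up the major and minor numbers of /dev/null. The helper name type_char and the file names are made up for the example, and it assumes a Linux system with /dev/null present.

```python
import os
import stat
import tempfile


def type_char(path):
    """Return the type character ls would print first for this path."""
    mode = os.lstat(path).st_mode   # lstat: do not follow symlinks
    return stat.filemode(mode)[0]


workdir = tempfile.mkdtemp()
regular = os.path.join(workdir, "regular")
link = os.path.join(workdir, "link")
fifo = os.path.join(workdir, "fifo")

open(regular, "w").close()   # an empty regular file
os.symlink(regular, link)    # a symbolic link to it
os.mkfifo(fifo)              # a named pipe

paths = (regular, workdir, link, fifo, "/dev/null")
kinds = [type_char(p) for p in paths]
for kind, p in zip(kinds, paths):
    print(kind, p)

# Device files carry the major and minor numbers the kernel goes by.
rdev = os.stat("/dev/null").st_rdev
print("/dev/null major, minor:", os.major(rdev), os.minor(rdev))
```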
Now that you understand what the different types of files are, how about a little tip on how to use them?
You probably already know about “ls”, so I’m not going to go into it. But did you know about lsof? lsof will show you all of the open files on your system – including network connections. (Remember I told you that network connections were files too? Here’s proof).
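For a feel of what lsof is reporting, here is a toy version for a single process, sketched in Python. It assumes Linux, where /proc/<pid>/fd holds one symbolic link per open descriptor; the function name open_files is invented for the example. The script opens a network socket first so you can see it listed next to the ordinary files.

```python
import os
import socket


def open_files(pid="self"):
    """Map each open descriptor of a process to what it points at.

    Assumes Linux: /proc/<pid>/fd holds one symlink per descriptor.
    """
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor used to list the directory is already gone.
            continue
    return result


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock_fd = sock.fileno()

files = open_files()
for fd, target in sorted(files.items()):
    print(fd, target)

sock.close()
```

The socket shows up as an entry of the form socket:[inode], sitting in the same table as the regular files.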
Another useful little command is mknod. This is how you create the character special and block special files (though don’t do it directly if you can avoid it, use MAKEDEV instead). This is useful to know if you, somehow, end up with /dev/null as a regular file. (It happens).
And don’t forget about the simple but tried-and-true command, ln. This creates symbolic links if given with the -s option, and hard links if not (but I’m not going to go into what those are right now).
Unexpectedly complex, huh? You’ll find every aspect of the Linux OS to be like that – a beguiling simplicity overlaying a fiendishly complex nest of interrelated subsystems.
It’s worth it to know all of these things, though. You never know when that kind of knowledge will come in handy.
