Troubleshooting part 1: Conceptual, System Access Troubleshooting, Filesystem Troubleshooting

Conceptual

Cannot connect to the server

Always ask three questions:

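  1. Source:

    a) Ping 127.0.0.1 to check that the local TCP/IP stack works (the loopback address never leaves the machine, so this does not test the physical connection);

    b) Ping another website to check outside connectivity;

    c) Ping the default gateway. It is usually on the same subnet as the host: if the host is 192.168.8.x, the gateway is typically 192.168.8.1

  2. Destination:

    a) Ping another host on the same subnet to find out whether the whole network is down or only that one computer

    b) The service on the destination computer may be down

  3. Protocol:

    a) Ping, telnet, curl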

System Access Troubleshooting

Server is not reachable

Ping the server by name. If the name is not recognized, check whether it resolves with

nslookup

[email protected] ~ % nslookup google.com
Server:		192.168.1.254
Address:	192.168.1.254#53

Non-authoritative answer:
Name:	google.com
           

netstat -rnv

Look up the gateway (the route with destination 0.0.0.0 and the G flag)

[[email protected] etc]# netstat -rnv
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.1.254   0.0.0.0         UG        0 0          0 enp0s3
192.168.1.0     0.0.0.0         255.255.255.0   U         0 0          0 enp0s3
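
The gateway can also be pulled out of that routing table by a script. This is a minimal sketch, assuming the column layout shown in the transcript (destination in column 1, gateway in column 2, flags in column 4); the function name default_gateway is made up for illustration.

```shell
# Print the gateway of the default route from `netstat -rn` style output on stdin.
# Assumption: the default route has destination 0.0.0.0 and carries the G flag.
default_gateway() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2; exit }'
}
```

A possible use is `netstat -rn | default_gateway`, followed by a ping of the address it prints.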
           

Cannot connect to a website or application

Some networking commands

telnet IP port

[[email protected] etc]# telnet 142.250.179.78 80
Trying 142.250.179.78...
Connected to 142.250.179.78.
Escape character is '^]'.
           

If the output says Connected, the service is running and listening on that port

Cannot SSH as root/user

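If

telnet ip 22

connects, the SSH service is reachable, so the problem is with the login itself:

  1. Root login may not be permitted.

    If PermitRootLogin is set to yes in /etc/ssh/sshd_config (the server configuration, not the client file /etc/ssh/ssh_config), the most likely cause is a wrong password.

    Failed logins are recorded in the file /var/log/secure, so check there.

  2. The user may not exist. Check with

    id user

    For example, id aws shows that the user exists, while id ali reports no such user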
[[email protected] log]# id aws
uid=1002(aws) gid=1002(aws) groups=1002(aws)
[[email protected] log]# id ali
id: ‘ali’: no such user
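
Both checks can be scripted. The following is a rough sketch under stated assumptions: the helper names user_exists and permit_root_login are hypothetical, the server configuration is assumed to be /etc/ssh/sshd_config, and Match blocks and Include files are ignored, so treat the result as a first hint only.

```shell
# Succeed if the account exists (same test as running `id user` by hand).
user_exists() {
    id "$1" >/dev/null 2>&1
}

# Print the first uncommented PermitRootLogin value, or "unset" if there is none.
# $1: path to the sshd configuration (assumed default: /etc/ssh/sshd_config).
permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print $2; found = 1; exit }
         END { if (!found) print "unset" }' "${1:-/etc/ssh/sshd_config}"
}
```

For example, `user_exists aws && echo present` mirrors the id check above, and `permit_root_login` with no argument reads the default file.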
           

Firewall

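  1. Use

    ps -ef | grep iptable

    to check whether iptables is running. In the output below only the grep process itself matches, so it is not.
  2. Then

    systemctl status firewalld

    shows that firewalld is active. Turn the firewall off with

    systemctl stop firewalld

    or

    systemctl disable firewalld

    and retry telnet
  3. The difference between stop and disable: stop shuts the service down now, but it starts again at the next reboot; disable keeps it from starting at boot, but does not stop a service that is currently running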
[[email protected] log]# ps -ef | grep iptable
root        2013    1511  0 12:26 pts/0    00:00:00 grep --color=auto iptable
[[email protected] log]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor p>
   Active: active (running) since Thu 2021-09-09 23:21:06 EDT; 13h ago
     Docs: man:firewalld(1)
 Main PID: 837 (firewalld)
    Tasks: 2 (limit: 4928)
   Memory: 26.9M
   CGroup: /system.slice/firewalld.service
           └─837 /usr/libexec/platform-python -s /usr/sbin/firewalld --nofork ->

Sep 09 23:21:05 localhost.localdomain systemd[1]: Starting firewalld - dynamic >
Sep 09 23:21:06 localhost.localdomain systemd[1]: Started firewalld - dynamic f>
Sep 09 23:21:07 localhost.localdomain firewalld[837]: WARNING: AllowZoneDriftin
           

Filesystem Troubleshooting

Cannot cd into a directory

This is usually a question of absolute path vs. relative path

Cannot find a file

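  1. Search with find

    find / -name "name"

    The leading / makes the search start at the filesystem root, so the whole system is searched. This is very important: with any other starting directory only that subtree is searched.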
[[email protected] log]# find / -name "ssh_config"
/etc/ssh/ssh_config
/usr/etc/ssh/ssh_config
           
  2. Search with locate

    Note that locate reads from a database that is refreshed about once a day, not in real time. To find recently created files, first run

    updatedb

[[email protected] log]# cd ~
[[email protected] ~]# locate aya_test
[[email protected] ~]# updatedb
[[email protected] ~]# locate aya_test
/var/log/aya_test
           

Cannot create links

  1. Inode

Each file has an inode (index node). The inode works like the file's database record, or like a passport or ID card that does not carry your name: it holds the file's metadata (owner, permissions, size, timestamps, location of the data blocks) but not two things: the name of the file and the content of the file.

  1. Soft links

    A soft (symbolic) link is much like a shortcut in Windows: it only stores the path of the target, so it breaks when the target is deleted or moved

  2. Hard links

    A hard link behaves like a copy of the original, but it is really a second name for the same inode. Even if the original file is deleted, the hard-linked file still exists with the same content

Creating a link

First create a file named test
[[email protected] troubleshooting]# vim test
[[email protected] troubleshooting]# cat test
#!/bin/bash

#This is a file for testing
           
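  1. Then create the link with

    ln -s SOURCE_ABSOLUTE_PATH LINK_ABSOLUTE_PATH

  2. Go to the directory where the link was created and look for it
  3. cat test_1 shows the same content as the original file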
[[email protected] troubleshooting]# ln -s /root/troubleshooting/test /tmp/test_1


[root@localhost troubleshooting]# cd /tmp/
[[email protected] tmp]# ls -ltr
total 0
drwx------. 3 root root 17 Sep 10 10:59 systemd-private-4b0288c1199f499f8e5089f9bfa3e9e0-chronyd.service-baejoi
lrwxrwxrwx. 1 root root 26 Sep 11 05:25 test_1 -> /root/troubleshooting/test


[[email protected] tmp]# cat test_1 
#!/bin/bash

#This is a file for testing
           

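*!!! Note: always use absolute paths here. A relative source path is stored as-is and resolved relative to the directory that contains the link, so the link breaks when it is created in a different directory.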
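
The difference between the two link types shows up once the original file is deleted. Here is a small sketch that demonstrates it in a throwaway temporary directory; the function name link_demo and the file names are chosen only for illustration.

```shell
# Create one hard link and one soft link to the same file, delete the
# original, and report what is left. Everything happens in a temp directory.
link_demo() {
    demo_dir=$(mktemp -d) || return 1
    printf 'This is a file for testing\n' > "$demo_dir/test"
    ln "$demo_dir/test" "$demo_dir/test_hard"       # hard link: second name, same inode
    ln -s "$demo_dir/test" "$demo_dir/test_soft"    # soft link: stores the absolute path
    rm "$demo_dir/test"
    # -e follows symbolic links, so it is false for a link whose target is gone.
    if [ -e "$demo_dir/test_hard" ]; then echo "hard link: still readable"; fi
    if [ ! -e "$demo_dir/test_soft" ]; then echo "soft link: dangling"; fi
    rm -rf "$demo_dir"
}
```

Running link_demo prints one line per link type. The hard link keeps the content alive because the inode is only freed when its last name is removed.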

Cannot write to a file

Cannot change permission or ownership

As root, create helloworld/ inside /home/aws/, the home directory of the user aws.

ls -ltr

shows that the directory belongs to root

[[email protected] ~]# cd /home/aws
[[email protected] aws]# mkdir helloworld
[[email protected] aws]# ls -ltr
total 1452
-rwxr-xr-x. 1 aws  aws  1486618 Sep 11 06:31 messages
drwxr-xr-x. 2 root root       6 Sep 11 12:14 helloworld
           

So the aws user cannot touch a file inside it

[[email protected] ~]$ cd helloworld/
[[email protected] helloworld]$ touch hello
touch: cannot touch 'hello': Permission denied
           

Change the ownership as root

chown user file/dir

[[email protected] aws]# chown aws helloworld/
[[email protected] aws]# ls -ltr
total 1452
-rwxr-xr-x. 1 aws aws  1486618 Sep 11 06:31 messages
drwxr-xr-x. 2 aws root      22 Sep 11 12:22 helloworld
           

The owner has changed from root to aws. Trying

touch file

again now succeeds

[[email protected] ~]$ cd helloworld/
[[email protected] helloworld]$ touch hello.py
[[email protected] helloworld]$ ls 
hello.py
           

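!!! Note: if the ownership of the parent directory changes, the user can no longer delete or rename a file in it, even one the user created, because those operations need write permission on the directory

For example, create hello.py in helloworld/ . The file is owned by aws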

[[email protected] helloworld]$ ls -ltr
total 0
-rw-rw-r--. 1 aws aws 0 Sep 11 12:22 hello.py
           

But if the ownership of helloworld/ is moved back to root

[[email protected] aws]# ls -ltr
total 1452
-rwxr-xr-x. 1 aws  aws  1486618 Sep 11 06:31 messages
drwxr-xr-x. 2 root root      22 Sep 11 12:22 helloworld
           

then aws has no permission to delete hello.py, even though aws created it

[[email protected] helloworld]$ rm hello.py 
rm: cannot remove 'hello.py': Permission denied
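
Whether a file can be removed depends on the directory that holds it, not on who owns the file. A simplified sketch of that check follows; the helper name can_delete is hypothetical, and it deliberately ignores the sticky bit, ACLs and the fact that root bypasses these checks.

```shell
# Succeed if the current user may remove entries from the file's directory.
# Removing a name needs write and search (x) permission on the parent directory.
can_delete() {
    parent_dir=$(dirname "$1")
    [ -w "$parent_dir" ] && [ -x "$parent_dir" ]
}
```

Run as the aws user, `can_delete /home/aws/helloworld/hello.py || echo "no write access to the directory"` would explain the Permission denied from rm before trying it.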
           

Disk space full or Add more disk

iostat is the command that reports I/O statistics.

Create a link to another filesystem that has free space. For example: ln -s /usr/var/log/sap /temp/sap

Adding new disk and creating partition

  1. Add the disk
  2. fdisk -l | more

Disk /dev/sda: 8 GiB, 8589934592 bytes, 16777216 sectors
           

The new disk may show up as /dev/sdb , /dev/sdc , etc.

  3. create partition by

    fdisk /dev/sdb

  4. create physical volume on the new partition (assumed here to be /dev/sdb1) by

    pvcreate /dev/sdb1

  5. create volume group (named oracle_vg here) by

    vgcreate oracle_vg /dev/sdb1

  6. create logical volume (named oracle_lv here) by

    lvcreate -n oracle_lv --size 1G oracle_vg

    The trailing oracle_vg is the volume group that oracle_lv is created in
  7. format the logical volume by

    mkfs.xfs /dev/oracle_vg/oracle_lv

  8. create the new directory to mount on

    mkdir /oracle

  9. mount the logical volume

    mount /dev/oracle_vg/oracle_lv /oracle

    then check with

    df -h
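
Because these commands are destructive, it can help to print the whole sequence for review before running anything. The sketch below only echoes the commands and executes none of them. It assumes the physical volume goes on a partition such as /dev/sdb1 and that lvcreate takes the volume name through -n; the function name lvm_plan is made up for illustration.

```shell
# Print (do not run) the LVM setup commands for one new partition.
# Arguments: partition, volume group, logical volume, size, mount point.
lvm_plan() {
    plan_part=$1; plan_vg=$2; plan_lv=$3; plan_size=$4; plan_mnt=$5
    cat <<EOF
pvcreate $plan_part
vgcreate $plan_vg $plan_part
lvcreate -n $plan_lv --size $plan_size $plan_vg
mkfs.xfs /dev/$plan_vg/$plan_lv
mkdir -p $plan_mnt
mount /dev/$plan_vg/$plan_lv $plan_mnt
df -h $plan_mnt
EOF
}
```

For the example in this section the call would be `lvm_plan /dev/sdb1 oracle_vg oracle_lv 1G /oracle`; the printed lines can then be run one at a time as root.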

Extend disk with LVM

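The first steps are the same as above, except that the new disk (here sdd) is added to the existing volume group after its physical volume is created

pvcreate /dev/sdd

  1. extend the volume group by

    vgextend oracle_vg /dev/sdd

  2. extend logical volume by

    lvextend -L+1G /dev/mapper/oracle_vg-oracle_lv

  3. grow the filesystem so the new space becomes usable (for XFS, on the mounted filesystem)

    xfs_growfs /oracle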

How to delete old files

[root@localhost test]# find /root/test -type f -mtime +90 -exec ls -l {} \;
-rw-r--r--. 1 root root 0 Mar  1  2021 /root/test/a
-rw-r--r--. 1 root root 0 Mar  1  2021 /root/test/b
-rw-r--r--. 1 root root 0 Mar  1  2021 /root/test/c
-rw-r--r--. 1 root root 0 Mar  1  2021 /root/test/d
           

find /root/test -type f -mtime +90 -exec mv {} {}.old \;

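Here

mv {} {}.old \;

means that every file found is renamed with a .old suffix appended. {} stands for the current file, and \; ends the -exec command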

[[email protected] test]# find /root/test -type f -mtime +90 -exec mv {} {}.old \;

[[email protected] test]# ls -ltr
total 4
-rw-r--r--. 1 root root  0 Mar  1  2021 d.old
-rw-r--r--. 1 root root  0 Mar  1  2021 c.old
-rw-r--r--. 1 root root  0 Mar  1  2021 b.old
-rw-r--r--. 1 root root  0 Mar  1  2021 a.old
-rwxr-xr-x. 1 root root 76 Sep 12 02:49 delteodlfile
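
The list-first, rename-second workflow can be wrapped in two small functions. This is a sketch with hypothetical names (old_files, rename_old). The rename goes through sh -c because substituting {} inside a longer argument such as {}.old is an extension found in GNU find and may not work with other implementations.

```shell
# List regular files under directory $1 last modified more than $2 days ago.
old_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Append .old to each of those files. The file name reaches the inner shell
# as "$1", which also keeps names containing spaces intact.
rename_old() {
    find "$1" -type f -mtime +"$2" -exec sh -c 'mv "$1" "$1.old"' _ {} \;
}
```

Running `old_files /root/test 90` first and reviewing the list before `rename_old /root/test 90` keeps the operation reversible; the renamed files can be removed later once nothing is missed.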
           

Filesystem is corrupted

Your system runs very slowly; identify the affected filesystem

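Note:

  1. Run fsck on the device, e.g. /dev/sda1, not on the directory where it is mounted
  2. fsck fails on a mounted filesystem, so umount it first. For an XFS filesystem, as in the output below, fsck only points to xfs_repair(8)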
[[email protected] ~]# fsck /boot
fsck from util-linux 2.32.1
If you wish to check the consistency of an XFS filesystem or
repair a damaged filesystem, see xfs_repair(8).


[[email protected] ~]# fsck /dev/sda1
fsck from util-linux 2.32.1
If you wish to check the consistency of an XFS filesystem or
repair a damaged filesystem, see xfs_repair(8).
           

/etc/fstab Corruption

[[email protected] etc]# vim fstab 


# 
# /etc/fstab
# Created by anaconda on Sun May  9 08:50:43 2021
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
/dev/mapper/cs-root     /                       xfs     defaults        0 0
UUID=677759ee-ef4a-4346-a958-1b27081de187 /boot                   xfs     defaults        0 0
/dev/mapper/cs-swap     none                    swap    defaults        0 0
~                                                                            
           

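The first column is the block device, the second (/boot etc.) is the mount point, the third is the filesystem type, the fourth is the mount options, the fifth (0) controls backup (1 means the dump utility should back up the partition), and the sixth (0) is the fsck pass order (0 means fsck does not check this device at boot).

If the contents of fstab are broken, the system has to be booted into rescue mode.

Then fix fstab under /mnt/sysimage/etc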
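
Before rebooting after an edit, a quick look at the fields can catch an obviously malformed line. The following is a minimal sketch, not a full validator: the function name fstab_summary is hypothetical, it only counts whitespace-separated fields, and it treats a missing sixth field as 0.

```shell
# Summarize each entry of an fstab-style file and flag lines that have
# fewer than the four mandatory fields.
# $1: file to read (assumed default: /etc/fstab).
fstab_summary() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }    # skip comments and blank lines
        NF < 4 { printf "line %d: only %d fields\n", NR, NF; next }
        { printf "%s on %s type %s (pass %s)\n", $1, $2, $3, (NF >= 6 ? $6 : 0) }
    ' "${1:-/etc/fstab}"
}
```

In rescue mode the same function could be pointed at the copy under /mnt/sysimage/etc, for example `fstab_summary /mnt/sysimage/etc/fstab`.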
