
Tuesday, 16 February 2016

Understanding the difference between 7-Mode and c-DOT from a systemshell point of view

There is a huge difference in the way the ONTAP shell communicates with the layer beneath. In 7-Mode, all the configuration files are stored in the /etc directory, for example: /etc/rc, /etc/exports, /etc/hosts.

But when it comes to cluster mode, the files are still present, yet they are not really used. Instead, all the configuration is managed by the RDB (replicated database).


A replication ring is a set of identical processes running on all nodes in the cluster.
The basis of clustering is the replicated database (RDB). An instance of the RDB is maintained on each node in a cluster. There are a number of processes that use the RDB to ensure consistent data across the cluster. These processes include the management application (mgmt), volume location database (vldb), virtual-interface manager (vifmgr), and SAN management daemon (bcomd).
For instance, the vldb replication ring for a given cluster consists of all instances of vldb running in the cluster.
RDB replication requires healthy cluster links among all nodes in the cluster. If the cluster network fails in whole or in part, file services can become unavailable. The 'cluster ring show' command displays the status of the replication rings and can assist with troubleshooting efforts.
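Before dropping to the systemshell, you can check the health of these rings from the normal cluster shell. A quick sketch (the cluster name is just an example; the command needs advanced privilege):

```
cluster600::> set advanced
cluster600::*> cluster ring show
```

A healthy cluster shows one master per unit name (mgmt, vldb, vifmgr, bcomd) and every other node as secondary; anything listed as offline points at a replication problem on that node.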
Let's jump into the systemshell, and from there into the bash shell, and see exactly how it works.
PLEASE DON'T TRY IT IN PRODUCTION, UNLESS YOU KNOW WHAT YOU ARE DOING.
cluster600::*> systemshell
  (system node systemshell)

Data ONTAP/amd64 (cluster600-01) (pts/2)
login: diag
Password:
Last login: Fri Feb 12 11:33:17 from localhost
Warning:  The system shell provides access to low-level
diagnostic tools that can cause irreparable damage to
the system if not used properly.  Use this environment
only when directed to do so by support personnel.

cluster600-01%
cluster600-01%
cluster600-01% sudo bash
bash-3.2#
bash-3.2#
bash-3.2# cd /mroot/etc/cluster_config/rdb
bash-3.2# ls
Bcom            Management      VLDB            VifMgr
bash-3.2#

These are the four directories that hold the databases of the RDB. Can we change them? Yes we can, but it takes a lot of understanding of how it all works. I have tried it and it works, but it is a long process. The better option is to use the cluster command line; that is the supported way. Editing the RDB directly is really only the way to go in disaster scenarios.
In the next blog, I'll walk through how junction paths work and what the difference is between 7-Mode and c-DOT.
PLEASE WRITE BACK IF THIS WAS USEFUL.

Sunday, 14 February 2016

Netapp monitoring solution from scratch




I created a NetApp monitoring solution on a UNIX machine and have uploaded the static files to a free PHP server so that you can see how it looks.
Since only static files are uploaded there, it is not exactly the same as the live version on the UNIX machine, but you can still get a feel for it.

Advantages:
  • No third-party tools used, like Nagios or PRTG.
  • Built from scratch.
  • Fully customizable.
  • Any required parameter can be monitored.
  • The complete look and feel can be changed as per user needs.
  • Easy to use.



If anyone is interested to know more about it, feel free to drop me a message.

Example Link:
This is for cluster mode; I have done a similar one for 7-mode as well. In the live solution it even monitors the NFS and CIFS load from the hosts connected to the filer, along with a detailed host load analysis.
nfs600_vol1 is the only volume that was used to write data from a server, so you will see load only on this volume. This is a test NetApp simulator that I used in this solution.

http://anirvan_lahiri.net23.net/monitoring/monitor.php

Friday, 12 February 2016

NetApp Clustermode Qtree Quota


1. vol create -vserver vs-nfs -volume nfsvol3 -aggregate aggrn1 -size 300m
2. vol mount -vserver vs-nfs -volume nfsvol3 -junction-path /nfsvol3
3. export-policy create -vserver vs-nfs -policyname nfspol
4. export-policy rule create -vserver vs-nfs -policyname nfspol -clientmatch 0.0.0.0/0 -rorule any -rwrule any -superuser any
5. qtree create -vserver vs-nfs -volume nfsvol3 -qtree q1
6. quota policy create -vserver vs-nfs -policy-name nfs3pol
7. quota policy rule create -vserver vs-nfs -policy-name nfs3pol -volume nfsvol3 -type tree -target q1 -disk-limit 20m
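One thing the list above skips, as far as I know: a named quota policy is not active until it is assigned to the SVM, and quotas have to be switched on for the volume. Roughly, continuing the numbering:

```
8. vserver modify -vserver vs-nfs -quota-policy nfs3pol
9. volume quota on -vserver vs-nfs -volume nfsvol3 -foreground true
10. volume quota report -vserver vs-nfs
```

The quota report at the end lets you verify that the 20m tree limit is in effect on q1.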

NetApp Cluster Mode SFO.





In Cluster Mode, when a failover or takeover has taken place, the root aggregate
of the partner node is owned by the surviving partner.
How do you get to the root volume of the partner's root aggregate?
1. Log in to the systemshell.
2. Run the command 'mount_partner'.
PS: The root volume of the partner is then mounted on /partner.

Cluster Mode mhost troubleshooting

If you need any help, feel free to ask.
1. go to the systemshell
set diag
systemshell -node cl1-01
2. unmount mroot
cd /etc
./netapp_mroot_unmount
logout
3. run cluster show a couple of times and see that health is false
cluster show
4. run cluster ring show to see that M-host is offline
cluster ring show
Node UnitName Epoch DB Epoch DB Trnxs Master Online
--------- -------- -------- -------- -------- --------- ---------
cl1-01 mgmt 6 6 699 cl1-01 master
cl1-01 vldb 7 7 84 cl1-01 master
cl1-01 vifmgr 9 9 20 cl1-01 master
cl1-01 bcomd 7 7 22 cl1-01 master
cl1-02 mgmt 0 6 692 – offline
cl1-02 vldb 7 7 84 cl1-01 secondary
cl1-02 vifmgr 9 9 20 cl1-01 secondary
cl1-02 bcomd 7 7 22 cl1-01 secondary
5. try to create a volume and see that the status of the aggregate
cannot be determined if you pick the aggregate from the broken M-host.
6. now vldb will also be offline.
7. remount mroot by starting mgwd from the systemshell
set diag
systemshell -node cl1-01
/sbin/mgwd -z &
8. when you run cluster ring show it should show vldb offline
cl1::*> cluster ring show
Node UnitName Epoch DB Epoch DB Trnxs Master Online
--------- -------- -------- -------- -------- --------- ---------
cl1-01 mgmt 6 6 738 cl1-01 master
cl1-01 vldb 7 7 87 cl1-01 master
cl1-01 vifmgr 9 9 24 cl1-01 master
cl1-01 bcomd 7 7 22 cl1-01 master
cl1-02 mgmt 6 6 738 cl1-01 secondary
cl1-02 vldb 0 7 84 – offline
cl1-02 vifmgr 0 9 20 – offline
cl1-02 bcomd 7 7 22 cl1-01 secondary
Notice that vifmgr has gone bad as well.
9. start vldb by running spmctl -s -h vldb,
or run /sbin/vldb.
In this case, do the same for vifmgr.
Please leave your comments, that would be helpful.

Cluster NetApp: convert SnapMirror to SnapVault


Steps:
  • Break the data protection mirror relationship by using the snapmirror break command. The relationship is broken and the disaster protection volume becomes a read-write volume.
  • Delete the existing data protection mirror relationship, if one exists, by using the snapmirror delete command.
  • Remove the relationship information from the source SVM by using the snapmirror release command. (This also deletes the Data ONTAP created Snapshot copies from the source volume.)
  • Create a SnapVault relationship between the primary volume and the read-write volume by using the snapmirror create command with the -type XDP parameter.
  • Convert the destination volume from a read-write volume to a SnapVault volume and establish the SnapVault relationship by using the snapmirror resync command. (Warning: all data newer than the snapmirror.xxxxxx Snapshot copy will be lost, and the SnapVault destination should not be the source of another SnapVault relationship.)
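Putting those steps together as a rough command sequence (the paths vs1:srcvol and vs2:dstvol are placeholders, not from the original post):

```
snapmirror break   -destination-path vs2:dstvol
snapmirror delete  -destination-path vs2:dstvol
snapmirror release -source-path vs1:srcvol -destination-path vs2:dstvol
snapmirror create  -source-path vs1:srcvol -destination-path vs2:dstvol -type XDP
snapmirror resync  -destination-path vs2:dstvol
```

The break and delete run on the destination cluster, the release runs on the source cluster, and the resync is the point of no return mentioned in the warning above.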


Please leave your comments.. that would be helpful.

Clustermode mroot destroyed


HA-PAIR.
I logged into the systemshell, completely emptied mroot on both nodes, and
rebooted.
During boot, an mroot.tgz is extracted from /tmp and a new mroot is
regenerated.
You will have to set a new root password (boot menu) and a new diag password (security login password).


Please leave your comments for me to improve

Clustermode mroot recovery (only use if you know what you are doing)

The backups are in /mroot/etc/backups/config,
and the RDB environment is in /mroot/etc/cluster_config/.
I tarred the DBs into a file called 'tarred' in /mroot/etc/cluster_config/rdb
on the surviving node, and copied that to a third location.
The DBs were not restored by the backup and the directories
remained empty.
So I copied the tar file from the third location to the rdb
directory and untarred it. After a reboot, the node functions properly again.
1. on surviving node:
cd /mroot/etc/cluster_config/rdb
tar cf tarred .
scp tarred root@192.168.1.159:/tmp/
2. on broken node:
boot the system and log in
kp-01::system configuration*> backup show
Node Backup Tarball Time Size
--------- ----------------------------------------- ------------------ -----
kp-01 kp-01.daily.2015-04-06.00_10_00.7z 04/06 00:10:00 4.66MB
kp-01 kp-01.daily.2015-04-07.00_10_01.7z 04/07 00:10:01 4.18MB
kp-01 kp-02.daily.2015-04-06.00_10_00.7z 04/06 00:10:00 3.27MB
kp-01 kp-02.daily.2015-04-07.00_10_01.7z 04/07 00:10:01 3.21MB
kp-01 kp.8hour.2015-04-06.18_15_00.7z 04/06 18:15:00 7.35MB
kp-01 kp.8hour.2015-04-07.02_15_04.7z 04/07 02:15:04 7.68MB
kp-01 kp.8hour.2015-04-07.10_15_00.7z 04/07 10:15:00 7.97MB
kp-01 kp.daily.2015-04-05.00_10_00.7z 04/05 00:10:00 7.21MB
kp-01 kp.daily.2015-04-06.00_10_00.7z 04/06 00:10:00 8.11MB
kp-01 kp.daily.2015-04-07.00_10_01.7z 04/07 00:10:01 7.56MB
kp-01 kp.weekly.2015-03-11.10_08_12.7z 03/11 10:08:12 3.26MB
kp-01 kp.weekly.2015-03-17.00_15_00.7z 03/17 00:15:00 6.06MB
kp-01 kp.weekly.2015-04-07.00_15_06.7z 04/07 00:15:06 7.57MB
kp-01::system*> configuration recovery node restore -backup kp.8hour.2015-04-07.02_15_04.7z
3. on broken node
cd /mroot/etc/cluster_config/rdb
scp root@192.168.1.159:/tmp/tarred .
tar xf tarred
sudo reboot
