Part Deux – Diggin’ into ASM Add Disk to DiskGroup operation

This is part 2 of the ODA storage expansion series. If you remember from Part 1, although we added a whole disk shelf, I'm illustrating the addition with a specific disk in Slot 14.

This disk is presented to the OS as a disk device via the event handler. Since multipathing is enabled, we will see two paths, and thus two OS disk names, for the same slot device.
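Both device names resolve to the same physical disk; the tell is the shared scsi-id (5000cca0a101ac54 in the log below). Here is a minimal sketch using the values from this post; on a live box you would typically get the IDs from `multipath -ll` or `/lib/udev/scsi_id`:

```shell
# Two OS device names, one physical disk: group the paths by SCSI ID.
# Sample data taken from the oak log in this post.
printf '%s\n' \
  '/dev/sdr  5000cca0a101ac54' \
  '/dev/sdao 5000cca0a101ac54' |
awk '{paths[$2] = paths[$2] " " $1}
     END {for (id in paths) print id " ->" paths[id]}'
# -> 5000cca0a101ac54 -> /dev/sdr /dev/sdao
```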

This section from the oak log describes the disk characteristics, including capacity, HBA port, and state:

2019-02-05 10:55:35.510: [   STMHW][710730400] Sha::Inserting OSDevName /dev/sdr for slot 14.        <— SDR

2019-02-05 10:55:35.510: [   STMHW][710730400] Sha::Inserting OSDevName /dev/sdao for slot 14.      <— SDAO

2019-02-05 10:55:35.510: [   STMHW][710730400] Physical Disk [14] Info:

2019-02-05 10:55:35.510: [   STMHW][710730400] Slot Num    = 14

2019-02-05 10:55:35.510: [   STMHW][710730400] Col  Num    = 2

2019-02-05 10:55:35.510: [   STMHW][710730400] OsDevNames  = |/dev/sdao||/dev/sdr|

2019-02-05 10:55:35.510: [   STMHW][710730400] Serial Num  = 1839J5XJ9X

2019-02-05 10:55:35.510: [   STMHW][710730400] Disk Type   = SSD

2019-02-05 10:55:35.510: [   STMHW][710730400] Expander    = 0 : 508002000231a17e

2019-02-05 10:55:35.510: [   STMHW][710730400] scsi-id     = 5000cca0a101ac54

2019-02-05 10:55:35.510: [   STMHW][710730400] sectors     = 781404246

2019-02-05 10:55:35.510: [   STMHW][710730400] OsDisk[14] Info:

2019-02-05 10:55:35.510: [   STMHW][710730400] OsDevName: /dev/sdr, Id = 14, Slot = 14, Capacity = 3200631791616: 3200gb, Type = SSD, hba port = 14 State = State: GOOD, expWwn = 5080020002311fbe, scsiId = 5000cca0a101ac54, Ctrlr = 0

2019-02-05 10:55:35.510: [   STMHW][710730400] OsDisk[38] Info:

2019-02-05 10:55:35.510: [   STMHW][710730400] OsDevName: /dev/sdao, Id = 38, Slot = 14, Capacity = 3200631791616: 3200gb, Type = SSD, hba port = 14 State = State: GOOD, expWwn = 508002000231a17e, scsiId = 5000cca0a101ac54, Ctrlr = 1

2019-02-05 10:55:35.512: [   STMHW][710730400] Sha::Inserting OSDevName /dev/sdr for slot 14

2019-02-05 10:55:35.512: [   STMHW][710730400] Sha::Inserting OSDevName /dev/sdao for slot 14

2019-02-05 10:55:35.512: [   STMHW][710730400] Physical Disk [14] Info:

2019-02-05 10:55:35.512: [   STMHW][710730400] Slot Num    = 14

2019-02-05 10:55:35.512: [   STMHW][710730400] Col  Num    = 2

2019-02-05 10:55:35.512: [   STMHW][710730400] OsDevNames  = |/dev/sdao||/dev/sdr|

2019-02-05 10:55:35.512: [   STMHW][710730400] Serial Num  = 1839J5XJ9X

2019-02-05 10:55:35.512: [   STMHW][710730400] Disk Type   = SSD

2019-02-05 10:55:35.512: [   STMHW][710730400] Expander    = 0 : 508002000231a17e

2019-02-05 10:55:35.512: [   STMHW][710730400] scsi-id     = 5000cca0a101ac54

2019-02-05 10:55:35.512: [   STMHW][710730400] sectors     = 781404246

2019-02-05 10:55:35.512: [   STMHW][710730400] OsDisk[14] Info:
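A quick sanity check on the numbers above: the reported capacity equals the sector count times a 4096-byte sector size (these are 4K-native SSDs; SectorSize = 4096 also shows up in the resource attributes later in the log):

```shell
# Capacity check using the figures from the oak log above.
sectors=781404246
sector_size=4096
echo $((sectors * sector_size))   # -> 3200631791616 (matches Capacity in the log)
```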

This section from the oak log describes the disk details from the PDiskAdapter.scr action script and FishWrap. Note the AutoDiscoveryHint, which drives how the disk is partitioned for the different diskgroups:

2019-02-05 10:55:35.946: [   STMHW][150968064]{1:11302:2} Sha::Inserting OSDevName /dev/sdr for slot 14

2019-02-05 10:55:35.946: [   STMHW][150968064]{1:11302:2} Sha::Inserting OSDevName /dev/sdao for slot 14

2019-02-05 10:55:35.946: [ ADAPTER][150968064]{1:11302:2} Running predictive failure check for: /dev/sdao

2019-02-05 10:55:35.946: [    SCSI][150968064]{1:11302:2} SCSI Inquiry Command response for /dev/sdao

2019-02-05 10:55:35.946: [   OAKFW][167753472]{1:11302:2} [ActionScript] = /opt/oracle/oak/adapters/PDiskAdapter.scr

2019-02-05 10:55:35.946: [    SCSI][150968064]{1:11302:2} Vendor = HGST     Product = HBCAC2DH2SUN3.2T Revision = A170

2019-02-05 10:55:35.946: [   OAKFW][167753472]{1:11302:2} [ActionTimeout] = 1500

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [ActivePath] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [AgentFile] = %COMET_MS_HOME%/bin/%TYPE_NAME%

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [AsmDiskList] = |0|

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [AutoDiscovery] = 1

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [AutoDiscoveryHint] = |data:80:SSD||reco:20:SSD||redo:100:SSD|

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [CheckInterval] = 600

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [ColNum] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [DiskId] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [DiskType] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [Enabled] = 1

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [ExpNum] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [MultiPathList] = |0|

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [Name] = PDType

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [NewPartAddr] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [OSUserType] = |userType:Multiuser|

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [PlatformName] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [PrevUsrDevName] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [SectorSize] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [SerialNum] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [Size] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [SlotNum] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [TotalSectors] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [UsrDevName] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [gid] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [mode] = 660

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [uid] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [DependListOpr] = add

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [Dependency] = |0|

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [IState] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [Initialized] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [IsConfigDependency] = false

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [MonitorFlag] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [Name] = ResourceDef

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [PrevState] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [State] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [StateChangeTs] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [StateDetails] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} [TypeName] = 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} Added new resource : e0_pd_11 to the agfw

2019-02-05 10:55:35.947: [   OAKFW][167753472][F-ALGO]{1:11302:2} Resource name : e0_pd_11, state : 0

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} PE invalidating the data model

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} Evaluating Add Resource for e0_pd_11

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} Executing plan size: 1

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} PE: Sending message to agent : RESOURCE_VALIDATE[e0_pd_11] ID 4361:96

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} Engine received the message : RESOURCE_VALIDATE[e0_pd_09] ID 4361:90

2019-02-05 10:55:35.947: [   OAKFW][167753472]{1:11302:2} Preparing VALIDATE command for : e0_pd_09

2019-02-05 10:55:35.948: [   STMHW][150968064]{1:11302:2} Sha::Inserting OSDevName /dev/sdr for slot 14

2019-02-05 10:55:35.948: [   STMHW][150968064]{1:11302:2} Sha::Inserting OSDevName /dev/sdao for slot 14

2019-02-05 10:55:35.948: [ ADAPTER][150968064]{1:11302:2} Creating resource for PD: SSD_E0_S14_2701241428

2019-02-05 10:55:35.948: [ ADAPTER][150968064]{1:11302:2} partName datapctStr  80 diskType =SSD
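The AutoDiscoveryHint string is the partitioning recipe: pipe-delimited name:percent:mediaType triples. A small parse of the exact value from the log above makes it readable:

```shell
# The exact hint value from the oak log above.
hint='|data:80:SSD||reco:20:SSD||redo:100:SSD|'
echo "$hint" | tr '|' '\n' |
awk -F: 'NF == 3 {printf "%s gets %s%% (media %s)\n", $1, $2, $3}'
# -> data gets 80% (media SSD)
#    reco gets 20% (media SSD)
#    redo gets 100% (media SSD)
```

So a shared disk like the one in slot 14 gets an 80% partition for DATA and a 20% partition for RECO, while dedicated redo disks are used at 100%.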

This section from the oak log describes the disk validation:

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] print_args called with argument : validate

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] Arguments passed to PDiskAdapter:

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] ResName = e0_pd_14

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] DiskId = 35000cca0a101ac54

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] DevName = SSD_E0_S14_2701241428

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] MultiPaths = /dev/sdao /dev/sdr

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] ActivePath = /dev/sdao

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] DiskType = SSD

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] Expander = 0

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] Size = 3200631791616

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] Sectors = 781404246

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] ExpColNum = 2

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] NewPartAddr = 0

2019-02-05 10:55:36.015: [        ][4177499904]{1:11302:2} [validate] DiskSerial# = 1839J5XJ9X

2019-02-05 10:55:36.023: [        ][4085245696]{1:11302:2} [validate] [Tue Feb 5 10:55:35 EST 2019] Action script ‘/opt/oracle/oak/adapters/PDiskAdapter.scr’ for resource [e0_pd_15] called for action validate

In this section from the oak log, we see the Linux kernel settings applied once the device entry is created, e.g., I/O scheduler, queue depth, and other property values:

2019-02-05 10:55:36.166: [        ][4177499904]{1:11302:2} [validate] Running echo deadline > /sys/block/sdao/queue/scheduler;echo 4096 > /sys/block/sdao/queue/nr_requests;echo 128 > /sys/block/sdao/queue/read_ahead_kb;

2019-02-05 10:55:36.166: [        ][4177499904]{1:11302:2} [validate] Running echo deadline > /sys/block/sdr/queue/scheduler;echo 4096 > /sys/block/sdr/queue/nr_requests;echo 128 > /sys/block/sdr/queue/read_ahead_kb;

2019-02-05 10:55:36.166: [        ][4177499904]{1:11302:2} [validate] Running echo 64 > /sys/block/sdao/device/queue_depth

2019-02-05 10:55:36.166: [        ][4177499904]{1:11302:2} [validate] Running echo 64 > /sys/block/sdr/device/queue_depth

2019-02-05 10:55:36.166: [        ][4177499904]{1:11302:2} [validate] Running echo 30 > /sys/block/sdao/device/timeout

2019-02-05 10:55:36.166: [        ][4177499904]{1:11302:2} [validate] Running echo 30 > /sys/block/sdr/device/timeout

2019-02-05 10:55:36.166: [   OAKFW][4177499904]{1:11302:2} Command : validate for: e0_pd_14 completed with status: SUCCESS

2019-02-05 10:55:36.166: [   OAKFW][167753472][F-ALGO]{1:11302:2} Engine received reply for command : validate for: e0_pd_14

2019-02-05 10:55:36.166: [   OAKFW][167753472]{1:11302:2} PE: Received last reply for : RESOURCE_VALIDATE[e0_pd_14] ID 4361:107

2019-02-05 10:55:36.166: [CLSFRAME][167753472]{1:11302:2} String params:CmdUniqId=SYS_START_-185246_e0_pd_14|ResId=e0_pd_14|ResTypeName=PDType|

2019-02-05 10:55:36.166: [CLSFRAME][167753472]{1:11302:2} Int params:ConfigVers=0|ErrCode=0|MsgId=4356|ProbeResource=0|sflag=4097|

2019-02-05 10:55:36.166: [   OAKFW][167753472]{1:11302:2} PE sending last reply for : RESOURCE_ADD[e0_pd_14] ID 4356:98

2019-02-05 10:55:36.166: [   OAKFW][167753472]{1:11302:2} PE sending last reply for : MIDTo:1|OpID:1|FromA:{Absolute|Node:1|Process:4294781990|Type:1}|ToA:{Absolute|Node:1|Process:4294781990|Type:1}|MIDFrom:7|Type:1|Pri2|Id:98:Ver:2String params:CmdUniqId=SYS_START_-185246_e0_pd_14|ResId=e0_pd_14|ResTypeName=PDType|Int params:ConfigVers=0|ErrCode=0|MsgId=4356|ProbeResource=0|sflag=4097|Map params: Map [PulldownDeps] BEGIN OF VALUES: END OF VALUESMap [PullupDeps] BEGIN OF VALUES: END OF VALUESMap [ResAttrList] BEGIN OF VALUES:ActivePath=/dev/sdao|AsmDiskList=|e0_data_14||e0_reco_14||ColNum=2|DiskId=35000cca0a101ac54|DiskType=SSD|ExpNum=0|MultiPathList=|/dev/sdao||/dev/sdr||Name=e0_pd_14|NewPartAddr=0|PlatformName=X7_2_ODA_HA|PrevUsrDevName=|SectorSize=4096|SerialNum=1839J5XJ9X|Size=3200631791616|SlotNum=14|TotalSectors=781404246|UsrDevName=SSD_E0_S14_2701241428| END OF VALUESMap [StartupDeps] BEGIN OF VALUES: END OF VALUESMap [StopDeps] BEGIN OF VALUES: END OF VALUES

2019-02-05 10:55:36.174: [        ][4085245696]{1:11302:2} [validate] INFO: DCS stack running on ODA-HA system

2019-02-05 10:55:36.174: [        ][4085245696]{1:11302:2} [validate] failed to stat() /dev/mapper/SSD_E0_S15_2701258144
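The sysfs tuning above can be captured as a small helper. This is only a sketch of what the adapter runs; the SYSFS variable exists solely so the function can be dry-run against a scratch directory (it defaults to /sys, and the sdao/sdr device names are the ones from this log). Run as root on a live node:

```shell
# Apply the same per-device tuning PDiskAdapter runs (sketch, not the
# actual adapter code). SYSFS defaults to /sys; override it only when
# exercising the function against a fake directory tree.
tune_disk() {
  local base="${SYSFS:-/sys}/block/$1"
  echo deadline > "$base/queue/scheduler"     # deadline I/O scheduler
  echo 4096     > "$base/queue/nr_requests"
  echo 128      > "$base/queue/read_ahead_kb"
  echo 64       > "$base/device/queue_depth"
  echo 30       > "$base/device/timeout"      # 30s SCSI command timeout
}
# Example (as root, once per path of the multipathed disk):
# tune_disk sdao
# tune_disk sdr
```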

This section from the oak log validates the state and completes the insertion:

2019-02-05 10:55:51.997: [   STMHW][4177499904]{1:11302:2} getState : 1

2019-02-05 10:55:51.997: [        ][4177499904]{1:11302:2} [check] Validating disk header for : SSD_E0_S14_2701241428

2019-02-05 10:55:51.997: [ ADAPTER][4177499904]{1:11302:2} Succefully opened the device: /dev/sdao

2019-02-05 10:55:51.997: [ ADAPTER][4177499904]{1:11302:2} Diskheader.Read: devName = /dev/sdao, master_inc = 0, m_slave_inc = 0, disk_status = 0 disk_inc = 0, slot_num= 0, serial num =  chassis snum =   part_loaded_cnt=0

2019-02-05 10:55:52.608: [   STMHW][4177499904]{1:11302:2} getState : 1

2019-02-05 10:55:52.608: [   STMHW][4177499904]{1:11302:2} State has been changed for: /dev/sdr Old State: GOOD, New State: INSERTED

2019-02-05 10:55:52.608: [   STMHW][4177499904]{1:11302:2} State has been changed for: /dev/sdao Old State: GOOD, New State: INSERTED

2019-02-05 10:55:52.608: [        ][4177499904]{1:11302:2} [check] Found the disk in uninitialized state.

2019-02-05 10:55:52.608: [   STMHW][4177499904]{1:11302:2} getState : 1

2019-02-05 10:55:52.608: [        ][4177499904]{1:11302:2} [check] Running ssd wear level check for: /dev/sdao

2019-02-05 10:55:52.609: [    SCSI][4177499904]{1:11302:2} SSD Media used endurance indicator: 0%

2019-02-05 10:55:52.609: [   STMHW][4177499904]{1:11302:2} Sha::Inserting OSDevName /dev/sdr for slot 14

2019-02-05 10:55:52.609: [   STMHW][4177499904]{1:11302:2} Sha::Inserting OSDevName /dev/sdao for slot 14

2019-02-05 10:55:52.609: [        ][4177499904]{1:11302:2} [check] Disk State: 1,  Label: NewDiskInserted

2019-02-05 10:55:53.856: [   STMHW][4085245696]{1:11302:2} getState : 1

2019-02-05 10:55:53.856: [   STMHW][4085245696]{1:11302:2} State has been changed for: /dev/sdr Old State: INSERTED, New State: GOOD

2019-02-05 10:55:53.856: [   STMHW][4085245696]{1:11302:2} State has been changed for: /dev/sdao Old State: INSERTED, New State: GOOD

2019-02-05 10:55:53.856: [        ][4085245696]{1:11302:2} [check] Validating disk header for : SSD_E0_S14_2701241428

2019-02-05 10:55:53.856: [ ADAPTER][4085245696]{1:11302:2} Succefully opened the device: /dev/sdao

2019-02-05 10:55:53.856: [ ADAPTER][4085245696]{1:11302:2} Diskheader.Read: devName = /dev/sdao, master_inc = 0, m_slave_inc = 0, disk_status = 0 disk_inc = 0, slot_num= 0, serial num =  chassis snum =   part_loaded_cnt=0

2019-02-05 11:06:16.741: [   OAKFW][167753472]{0:7:2} OAKD received the UI request, filter = PDType

2019-02-05 11:06:16.741: [CLSFRAME][167753472]{0:7:2} payload=| NAME           PATH           TYPE           STATE          STATE_DETAILS

|| e0_pd_00       /dev/sdab      SSD            ONLINE         Good           || e0_pd_01       /dev/sdac      SSD            ONLINE         Good           || e0_pd_02       /dev/sdad      SSD            ONLINE         Good           || e0_pd_03       /dev/sdae      SSD            ONLINE         Good           || e0_pd_04       /dev/sdf       SSD            ONLINE         Good           || e0_pd_05       /dev/sdaf      SSD            UNKNOWN        NewDiskInserted|| e0_pd_06       /dev/sdag      SSD            UNKNOWN        NewDiskInserted|| e0_pd_07       /dev/sdah      SSD            UNKNOWN        NewDiskInserted|| e0_pd_08       /dev/sdai      SSD            UNKNOWN        NewDiskInserted|| e0_pd_09       /dev/sdaj      SSD            UNKNOWN        NewDiskInserted|| e0_pd_10       /dev/sdak      SSD            UNKNOWN 

2019-02-05 11:06:16.741: [CLSFRAME][167753472]      NewDiskInserted|| e0_pd_11       /dev/sdal      SSD            UNKNOWN        NewDiskInserted|| e0_pd_12       /dev/sdam      SSD            UNKNOWN        NewDiskInserted|| e0_pd_13       /dev/sdan      SSD            UNKNOWN        NewDiskInserted|| e0_pd_14       /dev/sdao      SSD            UNKNOWN        NewDiskInserted|| e0_pd_15       /dev/sdap      SSD            UNKNOWN        NewDiskInserted|| e0_pd_16       /dev/sdaq      SSD            UNKNOWN        NewDiskInserted|| e0_pd_17       /dev/sdar      SSD            UNKNOWN        NewDiskInserted|| e0_pd_18       /dev/sdas      SSD            UNKNOWN        NewDiskInserted|| e0_pd_19       /dev/sdat      SSD            UNKNOWN        NewDiskInserted|| e0_pd_20       /dev/sdau      SSD            ONLINE         Good           || e0_pd_21       /dev/sdav      SSD            ONLINE         Good           || e0_pd_22       /dev/sdaw      SSD           

2019-02-05 11:06:16.741: [CLSFRAME][167753472] ONLINE         Good           || e0_pd_23       /dev/sdaa      SSD            ONLINE         Good           ||

2019-02-05 11:06:16.741: [CLSFRAME][167753472]{0:7:2} String params:CmdUniqId=|filter=PDType|pname=Resources:|

Diggin’ into ASM Add Disk to DiskGroup operation

Recently we had to add storage to our X7-HA ODA. A storage add is a multi-step process, generally handled by the ODA OAK automation. We simply inserted the disks into their slots, and the oakd daemon and its workflow took care of the device management. The key things the oakd automation does are:

  • Instantiates the disk device in the OS
  • Builds the partition tables
  • Creates the devmapper device names
  • Updates asmappl.config (*** DO NOT TOUCH or EDIT THIS FILE… or apocalyptic things will HAPPEN ***)
  • Generates the ASM disk add commands to add the disks to the DATA and RECO diskgroups in the pre-defined 80/20 configuration
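Applied to one of the 3.2 TB SSDs in this shelf, the 80/20 split works out roughly like this (integer math, so the figures are approximate; the exact partition boundaries are oakd's business):

```shell
disk_bytes=3200631791616                 # one 3.2 TB SSD from this shelf
data_bytes=$((disk_bytes * 80 / 100))    # P1 -> DATA diskgroup
reco_bytes=$((disk_bytes * 20 / 100))    # P2 -> RECO diskgroup
echo "DATA P1 ~ ${data_bytes} bytes, RECO P2 ~ ${reco_bytes} bytes"
```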

This blog will cover the disk part and walk you through the trace files. I'll blog about the automation stuff later.

If you peek at the oakd.log or look in the ASM alert.log, you'll see the actual command that gets executed. I have shown only the DATA diskgroup disk add operation. The RECO operation is the same but uses disk partition P2.

My comments are inline:

SQL> ALTER DISKGROUP /*+ _OAK_AsmCookie */ DATA ADD DISK   <- This OAK ASM Cookie invokes ODA specific backend operations
'AFD:SSD_E0_S05_2701246684P1' NAME SSD_E0_S05_2701246684P1,   <– This is the list of disks that will be added to the ODA
'AFD:SSD_E0_S06_2701246400P1' NAME SSD_E0_S06_2701246400P1,
'AFD:SSD_E0_S07_2701243880P1' NAME SSD_E0_S07_2701243880P1,
'AFD:SSD_E0_S08_2701246408P1' NAME SSD_E0_S08_2701246408P1,
'AFD:SSD_E0_S09_2701257952P1' NAME SSD_E0_S09_2701257952P1,
'AFD:SSD_E0_S10_2701255368P1' NAME SSD_E0_S10_2701255368P1,
'AFD:SSD_E0_S11_2701247132P1' NAME SSD_E0_S11_2701247132P1,
'AFD:SSD_E0_S12_2701246568P1' NAME SSD_E0_S12_2701246568P1,
'AFD:SSD_E0_S13_2701251260P1' NAME SSD_E0_S13_2701251260P1,
'AFD:SSD_E0_S14_2701259824P1' NAME SSD_E0_S14_2701259824P1,
'AFD:SSD_E0_S15_2701255760P1' NAME SSD_E0_S15_2701255760P1,
'AFD:SSD_E0_S16_2701229772P1' NAME SSD_E0_S16_2701229772P1,
'AFD:SSD_E0_S17_2701232460P1' NAME SSD_E0_S17_2701232460P1,
'AFD:SSD_E0_S18_2701257420P1' NAME SSD_E0_S18_2701257420P1,
'AFD:SSD_E0_S19_2701253140P1' NAME SSD_E0_S19_2701253140P1
kfdp_query: callcnt 78 grp 1 (DATA)                                  <– It's being added to DATA
kfdp_query: callcnt 79 grp 1 (DATA)
NOTE: Assigning number (1,5) to disk (AFD:SSD_E0_S05_2701246684P1) . <– Each disk is assigned a disk#
Disk 0x766f50e0 (1:5:AFD:SSD_E0_S05_2701246684P1) is being named (SSD_E0_S05_2701246684P1)
NOTE: Assigning number (1,6) to disk (AFD:SSD_E0_S06_2701246400P1)
2019-02-21 14:37:14.762*:kgfm.c@547: kgfmInitialize          <– Here the disks get initialized, using an array 
2019-02-21 14:37:14.763*:kgf.c@926: kgfArray_construct 0x7f2e648af208 len=0 nsegs=0
2019-02-21 14:37:14.763*:kgf.c@926: kgfArray_construct 0x7f2e648b75a0 len=0 nsegs=0
2019-02-21 14:37:14.763*:kgf.c@926: kgfArray_construct 0x7f2e69a676e8 len=0 nsegs=0
…<deleted repeated lines>

<– kgfmReadOak reads in OAK configuration information into memory structures

2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: max_disk_count is 100
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [max_disk_count] = [100]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=0->1 nsegs=0->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: appliance_name is ODA
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [appliance_name] = [ODA]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=1->2 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: diskstring is AFD:*
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [diskstring] = [AFD:*]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=2->3 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: file_version is 2
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [file_version] = [2]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=3->4 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: oda_version is 3
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [oda_version] = [3]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=4->5 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: jbod_count is 1
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [jbod_count] = [1]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=5->6 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: jbod_slot_count is 24 <– all 24 slots in the ODA are filled
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [jbod_slot_count] = [24]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=6->7 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: data_slot_count is 20 . <– 20 disks for DATA DG
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [data_slot_count] = [20]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=7->8 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: reco_slot_count is 20
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [reco_slot_count] = [20]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=8->9 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: redo_slot_count is 4       <– 4 disks for REDO DG
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [redo_slot_count] = [4]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=9->10 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: max_missing is 0
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [max_missing] = [0]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=10->11 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: min_partners is 2
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [min_partners] = [2]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=11->12 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: agent_sql_identifier is /*+ _OAK_AsmCookie
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [agent_sql_identifier] = [/*+ _OAK_AsmCookie ]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=12->13 nsegs=1->1
2019-02-21 14:37:14.763*:kgfm.c@1773: kgfmReadOak: rdbms_compatibility is 12.1.0.2
2019-02-21 14:37:14.763*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [rdbms_compatibility] = [12.1.0.2]
2019-02-21 14:37:14.763*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=13->14 nsegs=1->1
2019-02-21 14:37:14.764*:kgfm.c@1773: kgfmReadOak: asm_compatibility is 12.2.0.1
2019-02-21 14:37:14.764*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [asm_compatibility] = [12.2.0.1]
2019-02-21 14:37:14.764*:kgf.c@1025: kgfArray_grow 0x7f2e69a77e90 len=14->15 nsegs=1->1
2019-02-21 14:37:14.764*:kgfm.c@1773: kgfmReadOak: _asm_hbeatiowait is 100
2019-02-21 14:37:14.764*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [_asm_hbeatiowait] = [100]

This next section starts the disk association and partnership mapping


2019-02-21 14:37:14.769*:kgfm.c@2857: kgfmAddDisk: enc=0 slot=23 part=1 path=AFD:SSD_E0_S23_2181135920P1
2019-02-21 14:37:14.769*:kgf.c@1025: kgfArray_grow 0x7f2e648af208 len=0->1 nsegs=0->1
2019-02-21 14:37:14.769*:kgf.c@1025: kgfArray_grow 0x7f2e69a63530 len=0->24 nsegs=0->1
2019-02-21 14:37:14.769*:kgfm.c@1968: kgfmReadOak: disk 23 partners [ 2019-02-21 14:37:14.769*:kgfm.c@1970: 22 2019-02-21 14:37:14.769*:kgfm.c@1970: 21 2019-02-21 14:37:14.769*:kgfm.c@1970: 20 2019-02-21 14:37:14.769*:kgfm.c@1971: ]


2019-02-21 14:37:14.770*:kgfm.c@2857: kgfmAddDisk: enc=0 slot=22 part=1 path=AFD:SSD_E0_S22_2181131148P1
2019-02-21 14:37:14.770*:kgf.c@1025: kgfArray_grow 0x7f2e648af208 len=1->2 nsegs=1->1
2019-02-21 14:37:14.770*:kgfm.c@1968: kgfmReadOak: disk 22 partners [ 2019-02-21 14:37:14.770*:kgfm.c@1970: 23 2019-02-21 14:37:14.770*:kgfm.c@1970: 21 2019-02-21 14:37:14.770*:kgfm.c@1970: 20 2019-02-21 14:37:14.770*:kgfm.c@1971: ]

This is repeated for every disk in the ODA.

….
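Each disk's partner list (its mirror partners within the shelf) is hard to read because every partner number carries its own timestamped trace prefix. Filtering on the kgfm.c@1970 lines pulls just the numbers out; for example, for disk 23:

```shell
# The exact interleaved trace line for disk 23 from this dump.
line='2019-02-21 14:37:14.769*:kgfm.c@1968: kgfmReadOak: disk 23 partners [ 2019-02-21 14:37:14.769*:kgfm.c@1970: 22 2019-02-21 14:37:14.769*:kgfm.c@1970: 21 2019-02-21 14:37:14.769*:kgfm.c@1970: 20 2019-02-21 14:37:14.769*:kgfm.c@1971: ]'
echo "$line" | grep -o 'kgfm\.c@1970: [0-9]*' | awk '{print $2}' | xargs
# -> 22 21 20   (disk 23 partners disks 22, 21, and 20)
```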

Define DiskGroup Attributes:

2019-02-21 14:37:14.774*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [agent_sql_identifier] = [/*+ _OAK_AsmCookie ]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [oda_version] = [1]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [jbod_count] = [1]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [jbod_slot_count] = [24]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [data_slot_count] = [20]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [reco_slot_count] = [20]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [redo_slot_count] = [4]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [diskstring] = [(null)]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [appliance_name] = [ODA]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [agent_sql_identifier] = [/* ASM Appliance Agent */]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [asm_compatibility] = [11.2.0.3]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [rdbms_compatibility] = [11.2.0.2]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [max_disk_count] = [100]
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 0 [_asm_hbeatiowait] = [0]
2019-02-21 14:37:14.775*:kgfm.c@2643: kgfmStaticCardinatliy: odaver 3 njbods 1 tmpl 1 card0 3 card1 3
2019-02-21 14:37:14.775*:kgfm.c@2643: kgfmStaticCardinatliy: odaver 3 njbods 1 tmpl 1 card0 5 card1 5
2019-02-21 14:37:14.775*:kgfm.c@2954: kgfmAddAttribute: tmpl 1 [max_missing] = [0]

NOTE: running client discovery for group 1 (reqid:7106731798842955164)

<– At this point the disks are added and the client (database) re-discovers the diskgroup

*** 2019-02-21T14:37:20.381245-05:00
NOTE: running client discovery for group 1 (reqid:7106731798842926578)

*** 2019-02-21T14:37:21.421653-05:00
kfdp_updateReconf(): callcnt 80 grp 1 scope 0x204   <– Once you see kfdp_updateReconf… it's DONE!
NOTE: group 1 PST updated.
PST verChk [reconf]: id=46860409, grp=1, requested=9 at 02/21/2019 14:37:21
PST verChk [reconf]: id=46860409 grp=1 completed at 02/21/2019 14:37:21
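The "done" marker can even be scripted: once a kfdp_updateReconf line shows up for the group in the ASM alert log, the add has completed. A sketch over sample lines from this trace (in practice, point grep at your own instance's alert log instead of the printf):

```shell
# 0 matches means the reconfig hasn't completed; >0 means the marker appeared.
printf '%s\n' \
  'kfdp_query: callcnt 79 grp 1 (DATA)' \
  'kfdp_updateReconf(): callcnt 80 grp 1 scope 0x204' \
  'NOTE: group 1 PST updated.' |
grep -c 'kfdp_updateReconf'
# -> 1
```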

Expanding /u01 filesystem on Oracle Database Appliance


Recently I had to expand /u01 on our ODA because we were in the process of consolidating several new Oracle database systems, each with its own Oracle Home (don't ask… it's what the lines of business wanted).

Although a lot of this is just simple Linux LVM stuff, I feel it warrants a blog entry, since folks view an ODA as a different beast 🙂

[root@vna-oda1-0 ~]# pvdisplay

  --- Physical volume ---

  PV Name               /dev/md1

  VG Name               VolGroupSys

  PV Size               446.03 GiB / not usable 29.00 MiB

  Allocatable           yes 

  PE Size               32.00 MiB

  Total PE              14272

  Free PE               7424

  Allocated PE          6848

  PV UUID               Kw1O64-n9j0-4OW7-yUCZ-8FHc-HKug-mOPyP4
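Before extending, check that the volume group actually has headroom: Free PE × PE Size. From the pvdisplay output above (7424 free extents at 32 MiB each):

```shell
free_pe=7424   # Free PE from pvdisplay
pe_mib=32      # PE Size from pvdisplay, in MiB
echo "$((free_pe * pe_mib / 1024)) GiB free"   # -> 232 GiB free, so +50G fits easily
```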

[root@vna-oda1-0 ~]# df -m /u01

Filesystem           1M-blocks  Used Available Use% Mounted on

/dev/mapper/VolGroupSys-LogVolU01

                        100666 64564     30982  68% /u01

[root@vna-oda1-0 ~]# lvdisplay /dev/mapper/VolGroupSys-LogVolU01

  --- Logical volume ---

  LV Path                /dev/VolGroupSys/LogVolU01

  LV Name                LogVolU01

  VG Name                VolGroupSys

  LV UUID                UGVuZY-Xia1-u0Th-TaZ2-JF9Q-FH01-jDVfOn

  LV Write Access        read/write

  LV Creation host, time localhost.localdomain, 2018-05-23 13:14:22 -0400

  LV Status              available

  # open                 1

  LV Size                100.00 GiB

  Current LE             3200

  Segments               1

  Allocation             inherit

  Read ahead sectors     auto

  - currently set to     256

  Block device           249:40

[root@vna-oda1-0 ~]# df -kh 

Filesystem            Size  Used Avail Use% Mounted on

/dev/mapper/VolGroupSys-LogVolRoot

                       30G  5.8G   23G  21% /

tmpfs                 378G  1.3G  376G   1% /dev/shm

/dev/md0              477M  115M  338M  26% /boot

/dev/sda1             500M  304K  500M   1% /boot/efi

/dev/mapper/VolGroupSys-LogVolOpt

                       59G   36G   21G  64% /opt

/dev/mapper/VolGroupSys-LogVolU01

                       99G   64G   31G  68% /u01

lvextend --size +50G /dev/VolGroupSys/LogVolU01

  Size of logical volume VolGroupSys/LogVolU01 changed from 100.00 GiB (3200 extents) to 150.00 GiB (4800 extents).

  Logical volume LogVolU01 successfully resized.

[root@vna-oda0-0 oak]# 

[root@vna-oda0-0 oak]# 

[root@vna-oda0-0 oak]# lvdisplay /dev/mapper/VolGroupSys-LogVolU01

  --- Logical volume ---

  LV Path                /dev/VolGroupSys/LogVolU01

  LV Name                LogVolU01

  VG Name                VolGroupSys

  LV UUID                g2CMK8-kERY-43uu-p0ZD-V5In-otaL-S2zSdX

  LV Write Access        read/write

  LV Creation host, time localhost.localdomain, 2018-05-21 11:04:34 -0400

  LV Status              available

  # open                 1

  LV Size                150.00 GiB

  Current LE             4800

  Segments               2

  Allocation             inherit

  Read ahead sectors     auto

  - currently set to     256

  Block device           249:25

[root@vna-oda0-0 oak]# resize2fs /dev/VolGroupSys/LogVolU01

resize2fs 1.43-WIP (20-Jun-2013)

Filesystem at /dev/VolGroupSys/LogVolU01 is mounted on /u01; on-line resizing required

old_desc_blocks = 7, new_desc_blocks = 10

Performing an on-line resize of /dev/VolGroupSys/LogVolU01 to 39321600 (4k) blocks.

The filesystem on /dev/VolGroupSys/LogVolU01 is now 39321600 blocks long.
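As a cross-check, the resize2fs block count agrees with the new LV size: 39321600 four-KiB blocks is exactly 150 GiB:

```shell
blocks=39321600   # from the resize2fs output above
bsize=4096        # "4k" blocks
echo "$((blocks * bsize / 1024 ** 3)) GiB"   # -> 150 GiB, matching lvdisplay's LV Size
```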

 

ODA Patching/Upgrade Triage

This blog defines the appropriate logs, files, and command output to collect when triaging ODA-related patching issues.

Recently we attempted to upgrade from 12.2.1.2 to 12.2.1.3 on an ODA X7. We went through the usual steps as part of the upgrade process:

unzip p27648057_122130_Linux-x86-64_1of3.zip

ls -l  oda-sm-12.2.1.3.0-180504-server*

/opt/oracle/dcs/bin/odacli update-repository -f oda-sm-12.2.1.3.0-180504-server1of3.zip, p27648057_122130_Linux-x86-64_2of3.zip,

p27648057_122130_Linux-x86-64_3of3.zip

/opt/oracle/dcs/bin/odacli update-dcsagent -v 12.2.1.3.0

/opt/oracle/dcs/bin/odacli describe-job -i jobid

rpm -qa |grep dcs-agent

dcs-agent-18.1.3.0.0_LINUX.X64_180504-86.x86_64

/opt/oracle/dcs/bin/odacli update-server -v 12.2.1.3.0
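When a job fails, `odacli describe-job` returns a long JSON task list; a small filter pulls out just the failing tasks' messages. The field layout below matches the excerpted output in this post, but treat the parsing as a sketch, since the JSON shape can differ between DCS releases:

```shell
# Print taskResult for every task whose status is Failure.
# Feed it saved describe-job output on stdin.
failed_tasks() {
  awk -F'"' '/"status" : "Failure"/ {f = 1}
             f && /"taskResult"/    {print $4; f = 0}'
}
# Usage: failed_tasks < job.json   (job.json: a file holding saved describe-job output)
```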

The update-server step failed, as the following job log shows:

{
    "updatedTime" : 1530126568790,
    "startTime" : 1530126562856,
    "endTime" : 1530126568788,
    "taskId" : "TaskZJsonRpcExt_225",
    "status" : "Success",
    "taskResult" : "Successfully created the yum repos for patchingos",
    "taskName" : "Creating repositories using yum",
    "taskDescription" : null,
    "parentTaskId" : "TaskSequential_224",
    "jobId" : "5dd32b38-4d4b-4a3a-bf45-bb71cc8bf801",
    "tags" : [ ],
    "reportLevel" : "Info"
  }, {
    "updatedTime" : 1530126894836,
    "startTime" : 1530126568799,
    "endTime" : 1530126894834,
    "taskId" : "TaskZJsonRpcExt_228",
    "status" : "Success",
    "taskResult" : "Successfully updated the OS",
    "taskName" : "Applying OS Patches",
    "taskDescription" : null,
    "parentTaskId" : "TaskParallel_227",
    "jobId" : "5dd32b38-4d4b-4a3a-bf45-bb71cc8bf801",
    "tags" : [ ],
    "reportLevel" : "Info"
  }, {
    "updatedTime" : 1530126899457,
    "startTime" : 1530126894980,
    "endTime" : 1530126899455,
    "taskId" : "TaskZJsonRpcExt_236",
    "status" : "Success",
    "taskResult" : "Successfully updated the Firmware Disk",
    "taskName" : "Applying Firmware Disk Patches",
    "taskDescription" : null,
    "parentTaskId" : "TaskSequential_235",
    "jobId" : "5dd32b38-4d4b-4a3a-bf45-bb71cc8bf801",
    "tags" : [ ],
    "reportLevel" : "Info"
  }, {
    "updatedTime" : 1530127898145,
    "startTime" : 1530127248252,
    "endTime" : 1530127898144,
    "taskId" : "TaskSequential_121",
    "status" : "Failure",
    "taskResult" : "DCS-10001:Internal error encountered:  apply patch using OpatchAuto on node odanode1.",
    "taskName" : "task:TaskSequential_121",
    "taskDescription" : null,
    "parentTaskId" : "TaskSequential_256",
    "jobId" : "5dd32b38-4d4b-4a3a-bf45-bb71cc8bf801",
    "tags" : [ ],
    "reportLevel" : "Error"
  },
....
{
    "updatedTime" : 1530127898142,
    "startTime" : 1530127463976,
    "endTime" : 1530127898141,
    "taskId" : "TaskSequential_162",
    "status" : "Failure",
    "taskResult" : "DCS-10001:Internal error encountered:  apply patch using OpatchAuto on nodeodanode1.",
    "taskName" : "task:TaskSequential_162",
    "taskDescription" : null,
    "parentTaskId" : "TaskSequential_121",
    "jobId" : "5dd32b38-4d4b-4a3a-bf45-bb71cc8bf801",
    "tags" : [ ],
    "reportLevel" : "Error"
  }, {
    "updatedTime" : 1530127898139,
    "startTime" : 1530127463979,
    "endTime" : 1530127898137,
    "taskId" : "TaskZJsonRpcExt_163",
    "status" : "Failure",
    "taskResult" : "DCS-10001:Internal error encountered:  apply patch using OpatchAuto on node odanode1",
    "taskName" : "clusterware upgrade",
    "taskDescription" : null,
    "parentTaskId" : "TaskSequential_162",
    "jobId" : "5dd32b38-4d4b-4a3a-bf45-bb71cc8bf801",
    "tags" : [ ],
    "reportLevel" : "Error"
  } ],
  "createTimestamp" : 1530126459722,
  "resourceList" : [ ],
  "description" : "Server Patching"
}
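With job output this long, it helps to pull out only the failing tasks. A rough sketch, assuming the describe-job output has been saved to a file named job.json (a name chosen here purely for illustration):

```shell
# Show each failing task's status line plus the taskResult line that follows it.
# job.json is a hypothetical dump of 'odacli describe-job -i <jobid>' output.
grep -A1 '"status" : "Failure"' job.json
```

In the layout above, taskResult immediately follows status, so -A1 surfaces the DCS error text next to each failure.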

When opening an SR, make sure you collect the following information for support. This will minimize the back-and-forth and reduce the mean time to resolution.

1.  Provide the output from the commands below:

odacli describe-component

System Version 

—————

12.2.1.3.0

Component                                Installed Version    Available Version   

—————————————- ——————– ——————–

OAK                                       12.2.1.3.0            up-to-date          

GI                                        12.2.0.1.171017       12.2.0.1.180116     

DB                                        12.2.0.1.171017       12.2.0.1.180116     

DCSAGENT                                  18.1.3.0.0            up-to-date          

ILOM                                      4.0.0.28.r121827      4.0.0.22.r120818    

BIOS                                      41017600              41017100            

OS                                        6.9                   up-to-date          

FIRMWARECONTROLLER                        QDV1RE14              up-to-date          

ASR                                       5.7.7                 up-to-date          

# odacli describe-latestpatch

componentType   availableVersion    

————— ——————–

gi              12.2.0.1.180116     

gi              12.2.0.1.180116     

db              12.2.0.1.180116     

db              12.1.0.2.180116     

db              11.2.0.4.180116     

oak             12.2.1.3.0          

oak             12.2.1.3.0          

asr             5.7.7               

ilom            4.0.0.22.r120818    

os              6.9                 

hmp             2.4.1.0.9           

bios            41017100            

firmwarecontroller 13.00.00.00         

firmwarecontroller 4.650.00-7176       

firmwarecontroller kpyagr3q            

firmwarecontroller qdv1re14            

firmwaredisk    0r3q                

firmwaredisk    a122                

firmwaredisk    a122                

firmwaredisk    a374                

firmwaredisk    c122                

firmwaredisk    c122                

firmwaredisk    c376                

firmwaredisk    0112                

dcsagent        18.1.3.0.0          

2.  Provide/upload the most recent of the following configuration and diagnostic log files:

/u01/app/12.2.0.1/oracle/cfgtoollogs/opatchautodb/systemconfig* 

/tmp/opatchAutoAnalyzePatch.log 

/u01/app/12.2.0.1/oracle/cfgtoollogs/opatchauto/core/opatch/opatch<date>.log 

/u01/app/12.2.0.1/oracle/cfgtoollogs/opatchautodb/systemconfig<date>.log 

/u01/app/12.2.0.1/oracle/cfgtoollogs/opatchauto/opatchauto<date>.log

/u01/app/oracle/crsdata/odanode1/crsconfig/crspatch_<hostname><date>.log
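The list above can be gathered in one pass. A hedged sketch (the globs stand in for the <date> and <hostname> parts, which vary per system; files are staged first so a path that is missing on a given node does not abort the archive):

```shell
# Stage whichever of the requested logs exist, then tar them for the SR upload.
stage=$(mktemp -d)
for f in /u01/app/12.2.0.1/oracle/cfgtoollogs/opatchautodb/systemconfig* \
         /tmp/opatchAutoAnalyzePatch.log \
         /u01/app/12.2.0.1/oracle/cfgtoollogs/opatchauto/core/opatch/opatch*.log \
         /u01/app/12.2.0.1/oracle/cfgtoollogs/opatchauto/opatchauto*.log \
         /u01/app/oracle/crsdata/*/crsconfig/crspatch_*.log; do
    if [ -e "$f" ]; then cp "$f" "$stage/"; fi
done
tar czf /tmp/oda_sr_logs.tar.gz -C "$stage" .
echo "Upload /tmp/oda_sr_logs.tar.gz to the SR"
```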