Technical Service Bulletin 2021-449, repost from Cloudera
Tue January 19, 2021 11:54 AM
Lynn Chou
Technical Service Bulletin 2021-449
Kudu tablet server might crash in certain workflows where a tablet is dropped right after ALTER TABLE statement
DDL and DML operations can accumulate in the write-ahead log (WAL) of a Kudu tablet replica during normal operation. When a tablet replica shuts down (for example, right before the replica is removed), information on the first 50 accumulated operations is printed to the tablet server's INFO log file.
A bug was introduced with the fix for KUDU-2690. The code contains a flipped if-condition that dereferences an invalid pointer while reporting on a pending ALTER TABLE operation in the tablet replica's WAL. The issue manifests as kudu-tserver processes crashing with SIGSEGV (segmentation fault).
The issue occurs only when at least one pending ALTER TABLE operation has accumulated in the tablet replica's WAL at the time the tablet replica is shut down. One example scenario is an ALTER TABLE request (for example, adding a column) immediately followed by a request to drop a tablet (for example, dropping a range partition). Another is shutting down a tablet server while it is still processing an ALTER TABLE request for one of its tablet replicas. Slow file system operations increase the chance that the issue occurs.
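To make the bug class described above concrete, here is a toy Python sketch of a flipped guard in a reporting routine. It is not the Kudu source code: `PendingOp`, `describe`, `describe_flipped`, and `report_pending` are hypothetical names, and the assumption that an ALTER operation can lack a `details` field is made up purely for illustration. In Python the flipped guard raises `AttributeError`, whereas the equivalent mistake in C++ can dereference an invalid pointer and crash the process.

```python
from dataclasses import dataclass
from typing import List, Optional

# The bulletin says only the first 50 accumulated operations are reported.
MAX_REPORTED_OPS = 50


@dataclass
class PendingOp:
    """Hypothetical stand-in for an operation pending in a replica's WAL."""
    kind: str                      # e.g. "WRITE" or "ALTER_SCHEMA"
    details: Optional[str] = None  # assumed to be absent for some operations


def describe(op: PendingOp) -> str:
    # Correct guard: only touch `details` when it is present.
    if op.details is not None:
        return f"{op.kind}: {op.details.strip()}"
    return f"{op.kind}: <no details>"


def describe_flipped(op: PendingOp) -> str:
    # Flipped guard: touches `details` exactly when it is absent.
    if op.details is None:
        return f"{op.kind}: {op.details.strip()}"  # raises AttributeError
    return f"{op.kind}: <no details>"


def report_pending(ops: List[PendingOp]) -> List[str]:
    """Return log lines for at most the first MAX_REPORTED_OPS operations."""
    return [describe(op) for op in ops[:MAX_REPORTED_OPS]]
```

The flipped variant behaves correctly for every operation that carries details, which is why such a bug can go unnoticed until the one kind of operation without them is pending at shutdown.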
Component affected: Kudu
Products affected: CDH
Releases affected:
CDH 6.2.0, 6.2.1
CDH 6.3.0, 6.3.1, 6.3.2, 6.3.3
Users affected: Kudu clusters running the impacted releases
Impact: In the worst case, multiple kudu-tserver processes can crash in a Kudu cluster, making data unavailable until the affected tablet servers are restarted.
Severity: High
Action required:
Workaround
Avoid dropping range partitions and tablets right after issuing an ALTER TABLE request. Wait for pending ALTER TABLE requests to complete before dropping tablets or shutting down tablet servers.
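The workaround amounts to "alter, confirm completion, then drop". A minimal Python sketch of that ordering follows. It assumes nothing about any particular Kudu client: `alter`, `is_alter_done`, and `drop` are hypothetical callables that you would wire to whatever your client or tooling provides for issuing the request and checking whether the alter has finished.

```python
import time
from typing import Callable


def wait_until_done(is_done: Callable[[], bool],
                    timeout_s: float = 60.0,
                    poll_interval_s: float = 0.5) -> bool:
    """Poll is_done() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_done():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


def alter_then_drop(alter: Callable[[], None],
                    is_alter_done: Callable[[], bool],
                    drop: Callable[[], None],
                    timeout_s: float = 60.0,
                    poll_interval_s: float = 0.5) -> bool:
    """Issue the alter, then drop only once the alter is confirmed complete.

    Returns True if the drop was issued, False if the alter was still
    pending at the timeout (in which case the drop is skipped).
    """
    alter()
    if not wait_until_done(is_alter_done, timeout_s, poll_interval_s):
        return False  # alter still pending: do not drop
    drop()
    return True
```

Skipping the drop on timeout is a deliberate choice here: a delayed drop is safer than one that races a pending ALTER TABLE. The same wait would apply before a planned tablet server shutdown.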
Solution
Upgrade to CDH 6.3.4 or to CDP.
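For anyone auditing several clusters, the list of affected releases above can be turned into a quick check. This is only a sketch: the function name `is_affected` is hypothetical, and it assumes the version string begins with a plain `major.minor.patch` triple. Any suffix after the triple is ignored.

```python
import re

# Affected CDH releases as listed in this bulletin.
AFFECTED_CDH_RELEASES = {
    "6.2.0", "6.2.1",
    "6.3.0", "6.3.1", "6.3.2", "6.3.3",
}

_VERSION_PREFIX = re.compile(r"(\d+)\.(\d+)\.(\d+)(?!\d)")


def is_affected(cdh_version: str) -> bool:
    """Return True if the leading major.minor.patch is a listed release."""
    match = _VERSION_PREFIX.match(cdh_version.strip())
    if match is None:
        raise ValueError(f"unrecognized CDH version: {cdh_version!r}")
    return ".".join(match.groups()) in AFFECTED_CDH_RELEASES
```

For example, `is_affected("6.3.2")` returns `True`, while `is_affected("6.3.4")` returns `False`.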
#Cloudera
#Hadoop
#OpenSourceOfferings