Sunday, June 25, 2017

My First Steps with MariaDB ColumnStore

This is a "Howto" kind of post, and some speculations and historical references aside, it will show you how to build MariaDB ColumnStore (current version 1.0.9) from GitHub source, how to install and configure it for basic usage, as well as how to resolve some minor problems you may get in the process. Keep reading and eventually you'll get the real Howto below :)

* * *

I try not to care about any software issues besides good old MySQL, InnoDB storage engine internals, query optimization and some MyRocks or a little bit of Galera clusters. But as I work in MariaDB Support it is not possible to ignore other things production DBAs use and care about these days. So, recently I noted increased number of issues related to MariaDB ColumnStore and had to step in to resolve some of them. To do this I had first to make sure this software is "supportable by me", that is, I can build it from source on my own hardware (so that I can apply patches if needed and is not dependent of any packaging problems), install, start and stop, do some basic things with it and know what to do to troubleshoot non-trivial problems.

Actually I tried to check if it is "supportable by me" long time ago. First during the company meeting in Berlin back in April 2016 I was presented with some basic installation and usage steps, then a year ago I tried to build one of the very first public releases of ColumnStore, and this ended up with MCOL-142  that was later fixed by the team. Starting from version 1.0.6 and my successful build of MariaDB ColumnStore based on GitHub instructions I was sure that I can supported it if needed.

This week I found myself in a situation that I do need to install MariaDB ColumnStore to help customer, but I had only my Ubuntu 14.04 netbook at hand. Based on my records, I had few minor problems building the engine itself there, as of version 1.0.7:

...
-- Performing Test INLINE - Failed
-- Could NOT find LibXml2 (missing:  LIBXML2_LIBRARIES LIBXML2_INCLUDE_DIR)
CMake Error at CMakeLists.txt:91 (MESSAGE):
  Could not find a usable libxml2 development environment!


-- Configuring incomplete, errors occurred!
See also "/home/openxs/git/mariadb-columnstore-engine/CMakeFiles/CMakeOutput.log".
See also "/home/openxs/git/mariadb-columnstore-engine/CMakeFiles/CMakeError.log".

sudo apt-get install build-essential automake libboost-all-dev bison cmake libncurses5-dev libreadline-dev libperl-dev libssl-dev libxml2-dev libkrb5-dev flex
...
Setting up libboost-wave-dev:amd64 (1.54.0.1ubuntu1) ...
Setting up libboost-all-dev (1.54.0.1ubuntu1) ...
Setting up libperl-dev (5.18.2-2ubuntu1.1) ...
Setting up libhwloc-plugins (1.8-1ubuntu1.14.04.1) ...
Processing triggers for libc-bin (2.19-0ubuntu6.9) ...
openxs@ao756:~/git/mariadb-columnstore-engine$
openxs@ao756:~/git/mariadb-columnstore-engine$ cmake . -DSERVER_BUILD_INCLUDE_DIR=../mariadb-columnstore-server/include -DSERVER_SOURCE_ROOT_DIR=../mariadb-columnstore-server
-- Running cmake version 2.8.12.2
-- MariaDB-Columnstore 1.0.7
-- Found LibXml2: /usr/lib/x86_64-linux-gnu/libxml2.so (found version "2.9.1")
-- Boost version: 1.54.0
-- Found the following Boost libraries:
--   system
--   filesystem
--   thread
--   regex
--   date_time
SERVER_BUILD_INCLUDE_DIR = /home/openxs/git/mariadb-columnstore-engine/../mariadb-columnstore-server/include
SERVER_SOURCE_ROOT_DIR = /home/openxs/git/mariadb-columnstore-engine/../mariadb-columnstore-server
-- Configuring done
-- Generating done
-- Build files have been written to: /home/openxs/git/mariadb-columnstore-engine
openxs@ao756:~/git/mariadb-columnstore-engine$

...
[100%] Built target cpimport
[100%] Building CXX object writeengine/server/CMakeFiles/WriteEngineServer.dir/we_getfilesizes.cpp.o
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp: In static member function ‘static int WriteEngine::WE_GetFileSizes::processFileName(messageqcpp::ByteStream&, std::string&, int)’:
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp:275:19: warning: ‘fileSize’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     bs << fileSize;
                   ^
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp:276:29: warning: ‘compressedFileSize’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     bs << compressedFileSize;
                             ^
Linking CXX executable WriteEngineServer
[100%] Built target WriteEngineServer
openxs@ao756:~/git/mariadb-columnstore-engine$ echo $?
0
openxs@ao756:~/git/mariadb-columnstore-engine$ make install
...
[100%] Built target cpimport
Install the project...
-- Install configuration: "RELWITHDEBINFO"
CMake Error at cmake_install.cmake:44 (FILE):
  file cannot create directory: /usr/local/mariadb/columnstore.  Maybe need
  administrative privileges.


make: *** [install] Error 1
So, first I had some missing build dependencies, had to read the manual and install them. Then both server and engine, as of version 1.0.7, were built successfully, but I had a problem during installation. The reason is clear, ColumnStore used to insist on installing into the /usr/local/mariadb/columnstore only, and I insisted on getting it elsewhere (as I usually do for my source code builds, and failed, so had to give up and use sudo). I got it and moved on at that times. If you care, proper steps for installing into any directory as a non-user are described in KB now.

So, based on the previous attempts and my success on Fedora 23, I decided to get recent 1.0.9 version from GitHub and build it on this Ubuntu 14.04 netbook. I followed current  GitHub instructions and few of my older notes (hence some extra options and not just cmake ., but seems these are truly optional and everything works as described):
...
openxs@ao756:~/git/mariadb-columnstore-server$ cd ../mariadb-columnstore-engine/openxs@ao756:~/git/mariadb-columnstore-engine$ cmake . -DSERVER_BUILD_INCLUDE_DIR=../mariadb-columnstore-server/include -DSERVER_SOURCE_ROOT_DIR=../mariadb-columnstore-server
-- Running cmake version 2.8.12.2
-- MariaDB-Columnstore 1.0.9
-- Boost version: 1.54.0
-- Found the following Boost libraries:
--   system
--   filesystem
--   thread
--   regex
--   date_time
SERVER_BUILD_INCLUDE_DIR = /home/openxs/git/mariadb-columnstore-engine/../mariadb-columnstore-server/include
SERVER_SOURCE_ROOT_DIR = /home/openxs/git/mariadb-columnstore-engine/../mariadb-columnstore-server
-- Configuring done
-- Generating done
-- Build files have been written to: /home/openxs/git/mariadb-columnstore-engine
openxs@ao756:~/git/mariadb-columnstore-engine$ time make -j 2
...
[100%] Built target cpimport
[100%] Building CXX object writeengine/server/CMakeFiles/WriteEngineServer.dir/we_getfilesizes.cpp.o
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp: In static member function -?static int WriteEngine::WE_GetFileSizes::processFileName(messageqcpp::ByteStream&, std::string&, int)Б-?:
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp:275:19: warning: Б-?fileSizeБ-? may be used uninitialized in this function [-Wmaybe-uninitialized]
     bs << fileSize;
                   ^
/home/openxs/git/mariadb-columnstore-engine/writeengine/server/we_getfilesizes.cpp:276:29: warning: Б-?compressedFileSizeБ-? may be used uninitialized in this function [-Wmaybe-uninitialized]
     bs << compressedFileSize;
                             ^
Linking CXX executable WriteEngineServer
[100%] Built target WriteEngineServer

real    43m48.955s
user    80m37.430s
sys     4m17.819s

openxs@ao756:~/git/mariadb-columnstore-engine$ sudo make install
...
-- Installing: /usr/local/mariadb/columnstore/bin/cpimport
-- Set runtime path of "/usr/local/mariadb/columnstore/bin/cpimport" to "/usr/local/mariadb/columnstore/lib"

openxs@ao756:~/git/mariadb-columnstore-engine$ cd /usr/local/mariadb/columnstore/bin/
Next step is to run post-install (as root or via sudo in my case):
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./post-install
...
There is literally nothing interesting in the output and everything should just work. Now time to configure settings for MariaDB ColumnStore and run postConfigure script (again via sudo):
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./postConfigure

This is the MariaDB ColumnStore System Configuration and Installation tool.
It will Configure the MariaDB ColumnStore System and will perform a Package
Installation of all of the Servers within the System that is being configured.

IMPORTANT: This tool should only be run on the Parent OAM Module
           which is a Performance Module, preferred Module #1

Prompting instructions:

        Press 'enter' to accept a value in (), if available or
        Enter one of the options within [], if available, or
        Enter a new value


===== Setup System Server Type Configuration =====

There are 2 options when configuring the System Server Type: single and multi

  'single'  - Single-Server install is used when there will only be 1 server configured
              on the system. It can also be used for production systems, if the plan is
              to stay single-server.

  'multi'   - Multi-Server install is used when you want to configure multiple servers now or
              in the future. With Multi-Server install, you can still configure just 1 server
              now and add on addition servers/modules in the future.

Select the type of System Server install [1=single, 2=multi] (2) > 1

Performing the Single Server Install.
Enter System Name (columnstore-1) >

===== Setup Storage Configuration =====


----- Setup Performance Module DBRoot Data Storage Mount Configuration -----

There are 2 options when configuring the storage: internal or external

  'internal' -    This is specified when a local disk is used for the DBRoot storage.
                  High Availability Server Failover is not Supported in this mode

  'external' -    This is specified when the DBRoot directories are mounted.
                  High Availability Server Failover is Supported in this mode.

Select the type of Data Storage [1=internal, 2=external] (1) >


Enter the list (Nx,Ny,Nz) or range (Nx-Nz) of DBRoot IDs assigned to module 'pm1' (1) >
The MariaDB ColumnStore port of '3306' is already in-use on local server
Enter a different port number > 3307


===== Performing Configuration Setup and MariaDB ColumnStore Startup =====

NOTE: Setting 'NumBlocksPct' to 50%
      Setting 'TotalUmMemory' to 25% of total memory.

Running the MariaDB ColumnStore setup scripts

post-mysqld-install Successfully Completed
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock' (2 "No such file or directory")
Error running post-mysql-install, /tmp/post-mysql-install.log
Exiting...
openxs@ao756:/usr/local/mariadb/columnstore/bin$
The process is quite clear and I installed a simple test configuration on a single server, with internal storage used (you may find interesting details on how data are stored in KB articles and in my previous post), single Performance Module (PM) and single DBRoot. The process and details are well described here. It was noted that I have another running instance listening to port 3306 (it is Percona Server that I have on this netbook since my days working for Percona). I've picked up port 3307 for MariaDB instance. Now, it seems post-mysqld-install part was executed successfully and then after waiting for some time there was some failure while connecting to MySQL. I decided that it does not matter (I had some failures on Fedora 23, but eventually it worked, and I was in hurry and proceeded to the KB article on mscadmin and basic usage. I've noted that ColumnStore is NOT started and tried to start it:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./mcsadmin startsystem
startsystem   Sat Jun 24 20:24:54 2017
startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules


   System being started, please wait...........
**** startSystem Failed : check log files
openxs@ao756:/usr/local/mariadb/columnstore/bin$
So, I've got my first real trouble with MariaDB ColumnStore and I have no other option than troubleshoot it. Let's check the content of the error log of MariaDB server first:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ tail ../mysql/db/ao756.err
tail: cannot open Б-?../mysql/db/ao756.errБ-? for reading: Permission denied
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo tail ../mysql/db/ao756.err
2017-06-24 20:25:14 139795698476992 [Note] InnoDB: Highest supported file format is Barracuda.
2017-06-24 20:25:14 139795698476992 [Note] InnoDB: starting tracking changed pages from LSN 1616918
2017-06-24 20:25:14 139795698476992 [Note] InnoDB: 128 rollback segment(s) are active.
2017-06-24 20:25:14 139795698476992 [Note] InnoDB: Waiting for purge to start
2017-06-24 20:25:14 139795698476992 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.35-80.0 started; log sequence number 1616918
2017-06-24 20:25:14 139795698476992 [Note] Plugin 'FEEDBACK' is disabled.
2017-06-24 20:25:14 139795698476992 [ERROR] /usr/local/mariadb/columnstore/mysql//bin/mysqld: unknown variable 'gtid_mode=ON'
2017-06-24 20:25:14 139795698476992 [ERROR] Aborting

170624 20:25:17 mysqld_safe mysqld from pid file /usr/local/mariadb/columnstore/mysql/db/ao756.pid ended
So, the problem is in incompatible setting in some my.cnf file used by default (this setting was used for Percona Server I have installed from packages). After fixing the problem (I had to proceed so went for as dirty fix of commenting out settings I've found in /etc/my,.cnf) it was easy too start ColumnStore:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo vi /etc/my.cnf
openxs@ao756:/usr/local/mariadb/columnstore/bin$ grep gtid /etc/my.cnf#gtid_mode=ON
#enforce_gtid_consistency
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./mcsadmin shutdownsystem

shutdownsystem   Sat Jun 24 20:42:21 2017

This command stops the processing of applications on all Modules within the MariaDB ColumnStore System

   Checking for active transactions
           Do you want to proceed: (y or n) [n]: y

   Stopping System...
   Successful stop of System

   Shutting Down System...
   Successful shutdown of System

openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./mcsadmin startsystem
startsystem   Sat Jun 24 20:43:15 2017
startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules


   System being started, please wait..................
   Successful start of System

openxs@ao756:/usr/local/mariadb/columnstore/bin$ ps aux | grep column
root      6993  0.0  0.0  15336  1484 pts/2    S    20:43   0:00 /bin/bash /usr/local/mariadb/columnstore/bin/run.sh /usr/local/mariadb/columnstore/bin/ProcMon
root      6994  7.1  0.4 1214156 15808 pts/2   Sl   20:43   0:05 /usr/local/mariadb/columnstore/bin/ProcMon
root      7162  0.0  0.0   4456   788 pts/2    S    20:43   0:00 /bin/sh /usr/local/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/usr/local/mariadb/columnstore/mysql/db --pid-file=/usr/local/mariadb/columnstore/mysql/db/ao756.pid --ledir=/usr/local/mariadb/columnstore/mysql//bin
mysql     7380  0.6  4.1 1170124 160416 pts/2  Sl   20:43   0:00 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/mariadb/columnstore/mysql/lib/plugin --user=mysql --log-error=/usr/local/mariadb/columnstore/mysql/db/ao756.err --pid-file=/usr/local/mariadb/columnstore/mysql/db/ao756.pid --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock --port=3307
root      7454  0.0  0.2 506564 10564 pts/2    Sl   20:43   0:00 /usr/local/mariadb/columnstore/bin/controllernode fg
root      7475  0.0  0.2 353284 10176 pts/2    Sl   20:43   0:00 /usr/local/mariadb/columnstore/bin/ServerMonitor
root      7502  0.0  0.4 177956 17984 pts/2    Sl   20:43   0:00 /usr/local/mariadb/columnstore/bin/workernode DBRM_Worker1 fg
openxs    7951  0.0  0.0  14652   960 pts/2    S+   20:44   0:00 grep --color=auto column
openxs@ao756:/usr/local/mariadb/columnstore/bin$
 So, it seems the system started as expected, a lot of processes are running, but when I tried to create my first ColumnStore table I've got a message that the storage engine in not supported! Same for InfiniDB.

I'll skip some troubleshooting steps and ideas I had, and just state that:

You should make sure none of MySQL configuration files that are read by default should have any incompatible settings at the moment you run postConfigure script.

You can surely re-run postConfigure when everything is fixed, and this is a proper output for completed, correct attempt:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ sudo ./postConfigure

This is the MariaDB ColumnStore System Configuration and Installation tool.
It will Configure the MariaDB ColumnStore System and will perform a Package
Installation of all of the Servers within the System that is being configured.

IMPORTANT: This tool should only be run on the Parent OAM Module
           which is a Performance Module, preferred Module #1

Prompting instructions:

        Press 'enter' to accept a value in (), if available or
        Enter one of the options within [], if available, or
        Enter a new value


===== Setup System Server Type Configuration =====

There are 2 options when configuring the System Server Type: single and multi

  'single'  - Single-Server install is used when there will only be 1 server configured
              on the system. It can also be used for production systems, if the plan is
              to stay single-server.

  'multi'   - Multi-Server install is used when you want to configure multiple servers now or
              in the future. With Multi-Server install, you can still configure just 1 server
              now and add on addition servers/modules in the future.

Select the type of System Server install [1=single, 2=multi] (1) >

Performing the Single Server Install.
Enter System Name (columnstore-1) >

===== Setup Storage Configuration =====


----- Setup Performance Module DBRoot Data Storage Mount Configuration -----

There are 2 options when configuring the storage: internal or external

  'internal' -    This is specified when a local disk is used for the DBRoot storage.
                  High Availability Server Failover is not Supported in this mode

  'external' -    This is specified when the DBRoot directories are mounted.
                  High Availability Server Failover is Supported in this mode.

Select the type of Data Storage [1=internal, 2=external] (1) >

Enter the list (Nx,Ny,Nz) or range (Nx-Nz) of DBRoot IDs assigned to module 'pm1' (1) >


===== Performing Configuration Setup and MariaDB ColumnStore Startup =====

NOTE: Setting 'NumBlocksPct' to 50%
      Setting 'TotalUmMemory' to 25% of total memory.

Running the MariaDB ColumnStore setup scripts

post-mysqld-install Successfully Completed
post-mysql-install Successfully Completed

Starting MariaDB Columnstore Database Platform

MariaDB ColumnStore Database Platform Starting, please wait ..... DONE

System Catalog Successfull Created

MariaDB ColumnStore Install Successfully Completed, System is Active

Enter the following command to define MariaDB ColumnStore Alias Commands

. /usr/local/mariadb/columnstore/bin/columnstoreAlias

Enter 'mcsmysql' to access the MariaDB ColumnStore SQL console
Enter 'mcsadmin' to access the MariaDB ColumnStore Admin console

openxs@ao756:/usr/local/mariadb/columnstore/bin$

Node new messages at the end. If you do NOT see any of them do not blindly assume that everything is working as expected. Also note that the tool remembereds previous chopices.

Now I've got access to the engine:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ ../mysql/bin/mysql -uroot test --host=127.0.0.1 --port=3307
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.23-MariaDB Columnstore 1.0.9-1
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> show engines;
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                          | Transactions | XA   | Savepoints |
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
| Columnstore        | YES     | Columnstore storage engine                                                                       | YES          | NO   | NO         |
| CSV                | YES     | CSV storage engine                                                                               | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                                                            | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                        | NO           | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                            | NO           | NO   | NO         |
| InfiniDB           | YES     | Columnstore storage engine (deprecated: use columnstore)                                         | YES          | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                   | YES          | NO   | YES        |
| Aria               | YES     | Crash-safe tables with MyISAM heritage                                                           | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Percona-XtraDB, Supports transactions, row-level locking, foreign keys and encryption for tables | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                               | NO           | NO   | NO         |
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
10 rows in set (0.00 sec)

MariaDB [test]> create table taa(id int, c1 int, c2 date, c3 int) engine=columnstore;
Query OK, 0 rows affected (3.81 sec)

MariaDB [test]> insert into taa values(1,1,now(),1), (2,2,NULL,2),(3,NULL,NULL,3);
Query OK, 3 rows affected, 1 warning (1.15 sec)
Records: 3  Duplicates: 0  Warnings: 1

MariaDB [test]> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1265
Message: Data truncated for column 'c2' at row 1
1 row in set (0.00 sec)

MariaDB [test]> select * from taa;
+------+------+------------+------+
| id   | c1   | c2         | c3   |
+------+------+------------+------+
|    1 |    1 | 2017-06-24 |    1 |
|    2 |    2 | NULL       |    2 |
|    3 | NULL | NULL       |    3 |
+------+------+------------+------+
3 rows in set (0.15 sec)

MariaDB [test]> create table tab like taa;
ERROR 1178 (42000): The storage engine for the table doesn't support The syntax or the data type(s) is not supported by Columnstore. Please check the Columnstore syntax guide for supported syntax or data types.MariaDB [test]> create table tab engine=columnstore select * from taa;
ERROR 1178 (42000): The storage engine for the table doesn't support The syntax or the data type(s) is not supported by Columnstore. Please check the Columnstore syntax guide for supported syntax or data types.MariaDB [test]> create table tab(id int, c1 int, c2 date, c3 int) engine=columnstore;
Query OK, 0 rows affected (3.40 sec)

MariaDB [test]> alter table taa engine=InnoDB;
ERROR 1030 (HY000): Got error 1815 "Unknown error 1815" from storage engine Columnstore
MariaDB [test]>
Note some limitation we have in ColumnStore comparing to "normal" MariaDB it is based on. Some things do not work, some unique error messages and bugs are present, surely. You can read about current known limitations here.

My real goal was to check if NULL values are properly loaded by the cpimport tool, so I tried:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ ./cpimport -e 1 -s '\t' -n1 test tab
Locale is : C
Column delimiter : \t

Using table OID 3005 as the default JOB ID
Input file(s) will be read from : STDIN
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T211331_S295214_Job_3005.xml
I/O error : Permission denied
I/O error : Permission denied
2017-06-24 21:13:31 (16045) ERR  :  file /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T211331_S295214_Job_3005.xml does not exist [1055]
Error in loading job information;  The File does not exist. ; cpimport.bin is terminating.
openxs@ao756:/usr/local/mariadb/columnstore/bin$

openxs@ao756:/usr/local/mariadb/columnstore/bin$ cat /tmp/taa.txt | sudo ./cpimport -e 1 -s '\t' -n1 test tab
Locale is : C
Column delimiter : \t

Using table OID 3005 as the default JOB ID
Input file(s) will be read from : STDIN
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T211615_S997504_Job_3005.xml
Log file for this job: /usr/local/mariadb/columnstore/data/bulk/log/Job_3005.log
2017-06-24 21:16:16 (16582) INFO : successfully loaded job file /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T211615_S997504_Job_3005.xml
2017-06-24 21:16:16 (16582) INFO : Job file loaded, run time for this step : 0.0409038 seconds
2017-06-24 21:16:16 (16582) INFO : PreProcessing check starts
2017-06-24 21:16:16 (16582) INFO : PreProcessing check completed
2017-06-24 21:16:16 (16582) INFO : preProcess completed, run time for this step : 0.209377 seconds
2017-06-24 21:16:16 (16582) INFO : No of Read Threads Spawned = 1
2017-06-24 21:16:16 (16582) INFO : No of Parse Threads Spawned = 3
2017-06-24 21:16:16 (16582) INFO : Reading input from STDIN to import into table test.tab...
2017-06-24 21:16:16 (16582) INFO : For table test.tab: 4 rows processed and 4 rows inserted.
2017-06-24 21:16:16 (16582) WARN : Column test.tab.c2; Number of invalid dates replaced with zero value : 1
2017-06-24 21:16:17 (16582) INFO : Bulk load completed, total run time : 1.28894 seconds

openxs@ao756:/usr/local/mariadb/columnstore/bin$
The problem is I have the error for a date column, and the error is caused by the first line in the file with column titles, as produced by mysql command line client by default. One has to add -N option to get proper data loaded:
openxs@ao756:/usr/local/mariadb/columnstore/bin$ ../mysql/bin/mysql -uroot test --host=127.0.0.1 --port=3307 -N -e"select * from taa" >/tmp/taa.txt
openxs@ao756:/usr/local/mariadb/columnstore/bin$ cat /tmp/taa.txt
1 1 2017-06-24 1

2 2 NULL 2
3 NULL NULL 3
openxs@ao756:/usr/local/mariadb/columnstore/bin$ cat /tmp/taa.txt | sudo ./cpimport -e 1 -s '\t' -n1 test tab
Locale is : C
Column delimiter : \t

Using table OID 3005 as the default JOB ID
Input file(s) will be read from : STDIN
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T212114_S554015_Job_3005.xml
Log file for this job: /usr/local/mariadb/columnstore/data/bulk/log/Job_3005.log
2017-06-24 21:21:14 (18107) INFO : successfully loaded job file /usr/local/mariadb/columnstore/data/bulk/tmpjob/3005_D20170624_T212114_S554015_Job_3005.xml
2017-06-24 21:21:14 (18107) INFO : Job file loaded, run time for this step : 0.0412772 seconds
2017-06-24 21:21:14 (18107) INFO : PreProcessing check starts
2017-06-24 21:21:14 (18107) INFO : PreProcessing check completed
2017-06-24 21:21:14 (18107) INFO : preProcess completed, run time for this step : 0.205652 seconds
2017-06-24 21:21:14 (18107) INFO : No of Read Threads Spawned = 1
2017-06-24 21:21:14 (18107) INFO : No of Parse Threads Spawned = 3
2017-06-24 21:21:14 (18107) INFO : Reading input from STDIN to import into table test.tab...
2017-06-24 21:21:15 (18107) INFO : For table test.tab: 3 rows processed and 3 rows inserted.
2017-06-24 21:21:15 (18107) INFO : Bulk load completed, total run time : 1.28646 seconds
openxs@ao756:/usr/local/mariadb/columnstore/bin$ ../mysql/bin/mysql -uroot test --host=127.0.0.1 --port=3307 -N -e"select * from tab"
+------+------+------------+------+
|    1 |    1 | 2017-06-24 |    1 |
|    2 |    2 | NULL       |    2 |
|    3 | NULL | NULL       |    3 |
+------+------+------------+------+
To sunmmarize:
  1. These days one can easliy build, install and configure current version of MariaDB ColumnStore from GitHub sources.
  2. As long as you are reding KB artciles and GitHub documentation carefully, there is no problem to install and configure even non-default environment, where other MySQL versions co-exist.
  3. Youi should make sure node of the configuration files that are read by default contains any incompatible setting (MySQL 5.7-specific ones may fit into that calregory, or anything about GTID-based replication in MySQL).
  4. cpimport is a cool tol for paralles/fast data loads into MariaDB ColumnStore, but it does assume that input is a CSV or tab-separated file, and it tries to load all rows.
  5. Read error messages carefully.
  6. Make sure you read this article if you want to use ColumnStore.

Sunday, June 18, 2017

MySQL Support Engineer's Chronicles, Issue #6

Previous post in series was published almost 4 months ago, but I do not plan to end it. So, let me quickly discuss some of problems I worked on or was interested in so far in June, and provide some useful links.

Back on June 2 I had to find out what exact files are created by MariaDB's ColumnStore when I create a table in this storage engine. Actually in recent versions one can check the tables in the INFORMATION_SCHEMA, but if still wonders why are all these directories with numbers in the names (/usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/193.dir/000.dir/FILE000.cdf), please, check also this storage architecture review and take into account the following hint by the famous LinuxJedi:
partA: oid>>24
partB: (oid&0x00ff0000)>>16
partC: (oid&0x0000ff00) >> 8
partD: oid&0x000000ff
partE: partition
fName: segment
Those numbers in the directory names are decimal integers, 3 digits, zero-padded. oid is object_id.

If you plan to try to work with MariaDB ColumnStore, please, make sure to check also this KB article, MariaDB ColumnStore System Usage.

Later I had to work on some out-of-memory issues, and found this article on how to get the details on transparent huge pages usage on RedHat systems very useful.

No matter how much I hate clusters, I have to deal at least with Galera-based ones. One of the problems to deal with, recently, was about many threads "hanging" in "wsrep in pre-commit stage" status. In the process of investigation I had found this bug report, lp:1197771, and I am still wondering what is the proper status for it ("Fix Committed" to nothing does not impress me). Check also this discussion.

The most important discussions for MySQL Community, IMHO, happened around June 9 in relation to Bug #86607, "InnoDB crashed when master thread evict dict_table_t object". Great finding and detailed analysis by Alibaba's engineer, Jianwei Zhao, as well as his detailed MTR test case (that worked on a bit modified source code with a couple of lines like os_thread_sleep(2000000); added here and there), were all rejected by my deal old friend Sinisa... It caused a lot of reactions and comments on Facebook. Further references to magic "Oracle policies"  had not helped. What really helped is a real verification by Umesh Shastry later. As many colleagues noted, one had just to try, and test fails even on a non-modified code, quite often. It's quite possible that MariaDB will fix this bug faster, so you may want to monitor this related bug report, MDEV-13051. To summarize, it's not about any policies, but all about the proper care and attitude, and had always been like that...

Another great by to follow, Bug #86215, was reported back in May by Mark Callaghan. He spent a lot of time checking how well MySQL performs, historically, on different versions, for 4.0 to 8.0. Recent findings are summarized in this blog post. Based on the results, it seems that for most tests huge drop in performance happened in MySQL 5.7, and so far version 8.0.x had made few improvements on top of that. Check also this summary of tests and related blog posts. I wonder what would be the results of analysis for this bug and related posts by the great Dimitri Kravtchuk, whom the bug is assigned to currently.

Those who may have to work with MariaDB or MySQL on Windows and deal with crashes or hangs, may find this Micsoft's article useful. It explains how to get call stacks for threads with Visual Studio 2017. Thanks to Vladislav Vaintroub, for the link. I badly need new box with Windows 10 it seems, to check all this and other cool stuff, like shadow backups...

MariaDB's MaxScale 2.1 GA version had appeared recently, with a new cache filter feature. You can find the documentation here and there, and some comparison of it's initial implementation with ProxySQL's cache in this great blog post.

Yet another MySQL bug, reported by my colleague Hartmut Holzgraefe long time ago, attracted my attention while working on issues. This was Bug #76793, "Different row size limitations in Anaconda and Barracuda file formats". It was verified more than 2 years ago, but still nothing happens, neither in the code, nor in the manual to describe and explain the differences. Sad, but true.

The last but not the least, Support team of MariaDB is hiring now. We need one more engineer for our APAC team. If you want to work on complex and interesting issues from reasonable customers with properly set expectations, in a wisely managed team, and get a chance to work with me for 3-4 hours on a daily basis (yes, I am around/working during some APAC hours often enough, at least this summer) - please, get in touch and send your CV!

Saturday, April 15, 2017

How To Find What Thread Had Executed FTWRL

This week one of MariaDB Support customers asked how to find what thread had executed FLUSH TABLES WITH READ LOCK (FTWRL) and thus blocked all changes to data. I decided to list several ways to find this out (as eventually customer wanted to know how to find this out not only in MariaDB 10.1, but also in MySQL 5.6 etc).

Let me start with a quick summary. I know the following ways (other that having all queries logged in general query log, slow log, by some audit plugin or at client side, and checking the log) to find the thread that executed FLUSH TABLES WITH READ LOCK successfully:
  1. In MariaDB starting from 10.0.7 you can use METADATA_LOCK_INFO plugin.
  2. In MySQL starting from 5.7 you can use performance_schema.metadata_locks table.
  3. In MySQL starting from 5.6 (or MariaDB 10.x.y) you can use performance_schema.events_statements_history table.
  4. In all versions of MySQL or MariaDB you can attach gdb and check threads one by one (and I explain one of the ways to do this in the post).
Now, time to present the details. Unfortunately, when FTWRL is executed successfully, it is not visible neither in SHOW PROCESSLIST nor in related INFORMATION_SCHEMA table:
MariaDB [test]> select version();
+-----------------+
| version()       |
+-----------------+
| 10.1.18-MariaDB |
+-----------------+
1 row in set (0.00 sec)

MariaDB [test]> flush tables with read lock;
Query OK, 0 rows affected (0.06 sec)

MariaDB [test]> show full processlist\G
*************************** 1. row ***************************
      Id: 10
    User: root
    Host: localhost
      db: test
 Command: Query
    Time: 0
   State: init
    Info: show full processlist
Progress: 0.000
*************************** 2. row ***************************
      Id: 11
    User: root
    Host: localhost
      db: test
 Command: Query
    Time: 743
   State: Waiting for global read lock
    Info: delete from t0
Progress: 0.000
2 rows in set (0.00 sec)

MariaDB [test]> select * from information_schema.processlist\G*************************** 1. row ***************************
           ID: 11
         USER: root
         HOST: localhost
           DB: test
      COMMAND: Query
         TIME: 954
        STATE: Waiting for global read lock
         INFO: delete from t0
      TIME_MS: 954627.587
        STAGE: 0
    MAX_STAGE: 0
     PROGRESS: 0.000
  MEMORY_USED: 67464
EXAMINED_ROWS: 0
     QUERY_ID: 1457
  INFO_BINARY: delete from t0
          TID: 8838
*************************** 2. row ***************************
           ID: 10
         USER: root
         HOST: localhost
           DB: test
      COMMAND: Query
         TIME: 0
        STATE: Filling schema table
         INFO: select * from information_schema.processlist
      TIME_MS: 0.805
        STAGE: 0
    MAX_STAGE: 0
     PROGRESS: 0.000
  MEMORY_USED: 84576
EXAMINED_ROWS: 0
     QUERY_ID: 1461
  INFO_BINARY: select * from information_schema.processlist
          TID: 8424
2 rows in set (0.02 sec)
There is usually nothing useful in the INNODB STATUS also:
MariaDB [test]> show engine innodb status\G
...
------------
TRANSACTIONS
------------
Trx id counter 20439
Purge done for trx's n:o < 20422 undo n:o < 0 state: running but idle
History list length 176
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 11, OS thread handle 0x7f7f5cdb8b00, query id 1457 localhost root Waiting for global read lock
delete from t0
---TRANSACTION 0, not started
MySQL thread id 10, OS thread handle 0x7f7f5ce02b00, query id 1462 localhost root init
show engine innodb status
--------
...
No wonder, global read lock that FTWRL sets is not an InnoDB lock.

In this case we have just 2 user threads connected, so it's easy to find out what thread could set the global read lock we are waiting for. With dozens or hundreds of threads connected one has to guess which one it can be (often those that sleep for a longest time are blamed), unless he knows for sure how to find the exact one. So, let me list some of the ways to find this out.

I'd like to start with MariaDB 10.x.y. It provides a special METADATA_LOCK_INFO plugin. Surely you can not install it when FTWRL is already executed (as plugin ends up recorded in mysql.plugins table when installed):
MariaDB [test]> INSTALL SONAME 'metadata_lock_info';
ERROR 1223 (HY000): Can't execute the query because you have a conflicting read lock
MariaDB [test]> unlock tables;
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> INSTALL SONAME 'metadata_lock_info';
Query OK, 0 rows affected (0.37 sec)

MariaDB [test]> select * from mysql.plugin;
+--------------------+-----------------------+
| name               | dl                    |
+--------------------+-----------------------+
| BLACKHOLE          | ha_blackhole.so       |
| METADATA_LOCK_INFO | metadata_lock_info.so |
+--------------------+-----------------------+
2 rows in set (0.00 sec)
But when plugin is installed, we can get the information we need from the information_schema.metadata_lock_info table:
MariaDB [test]> flush tables with read lock;
Query OK, 0 rows affected (0.01 sec)

MariaDB [test]> select * from information_schema.metadata_lock_info;
+-----------+------------+---------------+------------------+--------------+------------+
| THREAD_ID | LOCK_MODE  | LOCK_DURATION | LOCK_TYPE        | TABLE_SCHEMA | TABLE_NAME |
+-----------+------------+---------------+------------------+--------------+------------+
|        10 | MDL_SHARED | NULL          | Global read lock |              |            |
|        10 | MDL_SHARED | NULL          | Commit lock      |              |            |
+-----------+------------+---------------+------------------+--------------+------------+
2 rows in set (0.00 sec)

MariaDB [test]> select connection_id();
+-----------------+
| connection_id() |
+-----------------+
|              10 |
+-----------------+
1 row in set (0.01 sec)

MariaDB [test]> show processlist;
+----+------+-----------+------+---------+------+------------------------------+------------------+----------+
| Id | User | Host      | db   | Command | Time | State                        | Info             | Progress |
+----+------+-----------+------+---------+------+------------------------------+------------------+----------+
| 10 | root | localhost | test | Query   |    0 | init                         | show processlist |    0.000 |
| 12 | root | localhost | test | Query   |   47 | Waiting for global read lock | delete from t0   |    0.000 |
+----+------+-----------+------+---------+------+------------------------------+------------------+----------+
2 rows in set (0.00 sec)
We are looking for a row where lock_type is "Global read lock":
MariaDB [test]> select thread_id from information_schema.metadata_lock_info where lock_type = 'Global read lock';
+-----------+
| thread_id |
+-----------+
|        10 |
+-----------+
1 row in set (0.00 sec)
Now, let me proceed to MySQL 5.7. It had introduced the performance_schema.metadata_locks table, and we can use it as follows (I've used Percona Server 5.7.17 for this example, but it's not any different from upstream MySQL 5.7 in this regard):
openxs@ao756:~/dbs/maria10.1$ mysql -uroot -proot test
mysql: [Warning] Using a password on the command line interface can be insecure.
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 12
Server version: 5.7.17-13-log Percona Server (GPL), Release '13', Revision 'fd33d43'

Copyright (c) 2009-2016 Percona LLC and/or its affiliates
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select @@performance_schema;
+----------------------+
| @@performance_schema |
+----------------------+
|                    1 |
+----------------------+
1 row in set (0.00 sec)

mysql> UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'global_instrumentation';
Query OK, 0 rows affected (0.18 sec)
Rows matched: 1  Changed: 0  Warnings: 0

mysql> UPDATE performance_schema.setup_instruments SET ENABLED = 'YES' WHERE NAME = 'wait/lock/metadata/sql/mdl';
Query OK, 1 row affected (0.04 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> flush tables with read lock;
Query OK, 0 rows affected (0.06 sec)

mysql> select * from performance_schema.metadata_locks;
+-------------+--------------------+----------------+-----------------------+-------------+---------------+-------------+-------------------+-----------------+----------------+
| OBJECT_TYPE | OBJECT_SCHEMA      | OBJECT_NAME    | OBJECT_INSTANCE_BEGIN | LOCK_TYPE   | LOCK_DURATION | LOCK_STATUS | SOURCE            | OWNER_THREAD_ID | OWNER_EVENT_ID |
+-------------+--------------------+----------------+-----------------------+-------------+---------------+-------------+-------------------+-----------------+----------------+
| GLOBAL      | NULL               | NULL           |       139917187957472 | SHARED      | EXPLICIT      | GRANTED     | lock.cc:1128      |              39 |             10 || COMMIT      | NULL               | NULL           |       139917300586224 | SHARED      | EXPLICIT      | GRANTED     | lock.cc:1212      |              39 |             10 |
| TABLE       | performance_schema | metadata_locks |       139917192091536 | SHARED_READ | TRANSACTION   | GRANTED     | sql_parse.cc:6406 |              39 |             11 |
+-------------+--------------------+----------------+-----------------------+-------------+---------------+-------------+-------------------+-----------------+----------------+
3 rows in set (0.01 sec)

mysql> show processlist;
+----+----------+--------------------+------+---------+------+----------+------------------+-----------+---------------+
| Id | User     | Host               | db   | Command | Time | State    | Info             | Rows_sent | Rows_examined |
+----+----------+--------------------+------+---------+------+----------+------------------+-----------+---------------+
| 11 | maxscale | 192.168.1.37:55399 | NULL | Sleep   |    4 |          | NULL             |         0 |             0 |
| 12 | root     | localhost          | test | Query   |    0 | starting | show processlist |         0 |             0 |
+----+----------+--------------------+------+---------+------+----------+------------------+-----------+---------------+
2 rows in set (0.02 sec)
In this case we are looking for the value "GLOBAL" in OBJECT_TYPE column and "SHARED" in LOCK_TYPE column. In this row we see OWNER_TREAD_ID, but the value, 39, is not present in the SHOW PROCESSLIST output. We need to search for the proper thread id in the performance_schema.threads table, like this:
mysql> select t.processlist_id from performance_schema.threads t join performance_schema.metadata_locks ml on ml.owner_thread_id = t.thread_id where ml.object_type='GLOBAL' and ml.lock_type='SHARED';
+----------------+
| processlist_id |
+----------------+
|             12 |
+----------------+
1 row in set (0.03 sec)
Now, what if we run older MySQL 5.6.x that does not have metadata_locks table? If we make sure that the history of statements executed by each thread is maintained in performance_schema, we can easily search the history for the FTWRL statement in the performance_schema.events_statements_history table! Steps are also simple:
mysql> update performance_schema.setup_consumers set enabled = 'YES' where NAME = 'events_statements_history';
ERROR 1223 (HY000): Can't execute the query because you have a conflicting read lock
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)

mysql> update performance_schema.setup_consumers set enabled = 'YES' where NAME = 'events_statements_history';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1  Changed: 0  Warnings: 0

mysql> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from performance_schema.events_statements_history where sql_text like 'flush tables%'\G
*************************** 1. row ***************************
              THREAD_ID: 39
               EVENT_ID: 21
           END_EVENT_ID: 21
             EVENT_NAME: statement/sql/flush
                 SOURCE: socket_connection.cc:95
            TIMER_START: 94449505549959000
              TIMER_END: 94449505807116000
             TIMER_WAIT: 257157000
              LOCK_TIME: 0
               SQL_TEXT: flush tables with read lock
                 DIGEST: 03682cc3e0eaed3d95d665c976628d02
            DIGEST_TEXT: FLUSH TABLES WITH READ LOCK
...
    NESTING_EVENT_LEVEL: 0
1 row in set (0.00 sec)
 Again, if we need thread id for the processlist, we have to join with performance_schema.threads:
mysql> select t.processlist_id from performance_schema.threads t join performance_schema.events_statements_history h on h.thread_id = t.thread_id where h.digest_text like 'FLUSH TABLES%';
+----------------+
| processlist_id |
+----------------+
|             12 |
+----------------+
1 row in set (0.01 sec)
Note that using digest_text while searching in history is a bit more reliable, as the text is already normalized, so you do not need to care about extra spaces etc. On the other hand, it requires statements_digest consumer to be enabled:

mysql> select * from performance_schema.setup_consumers where enabled = 'YES';
+---------------------------+---------+
| NAME                      | ENABLED |
+---------------------------+---------+
| events_statements_current | YES     |
| events_statements_history | YES     |
| global_instrumentation    | YES     |
| thread_instrumentation    | YES     |
| statements_digest         | YES     |
+---------------------------+---------+
5 rows in set (0.00 sec)
Now, what if you do not have performance_schema enabled (or compiled in), use older MySQL version or it's too late to run any updates or install any plugins? As usual, gdb to the rescue!

We already know that global shared metadata lock is set, so one can try to apply the approach I've already written about in this post. It looks a bit complicated though, so I decided to try something easier and just traverse the linked list of all MySQL threads in gdb in a way similar to what Shane Bester did years ago while looking for SQL statements executed by thread in the core file.

So, in MariaDB 10.1 with performance_schema disabled I, again, executed the following (surely, delete was executed in another session after FTWRL):
openxs@ao756:~/dbs/maria10.1$ bin/mysql -uroot --socket=/tmp/mariadb.sock test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 13
Server version: 10.1.18-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> select @@performance_schema;
+----------------------+
| @@performance_schema |
+----------------------+
|                    0 |
+----------------------+
1 row in set (0.00 sec)

MariaDB [test]> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> show processlist;
+----+------+-----------+------+---------+------+------------------------------+------------------+----------+
| Id | User | Host      | db   | Command | Time | State                        | Info             | Progress |
+----+------+-----------+------+---------+------+------------------------------+------------------+----------+
| 13 | root | localhost | test | Query   |    0 | init                         | show processlist |    0.000 |
| 14 | root | localhost | test | Query   |    9 | Waiting for global read lock | delete from t0   |    0.000 |
+----+------+-----------+------+---------+------+------------------------------+------------------+----------+
2 rows in set (0.00 sec)
and then found the PID for MariaDB server and attached gdb to the process:
openxs@ao756:~$ ps aux | grep 'mysqld ' | grep maria
openxs    7444  0.1  5.7 748028 220744 ?       Sl   кві14   1:47 /home/openxs/dbs/maria10.1/bin/mysqld --no-defaults --basedir=/home/openxs/dbs/maria10.1 --datadir=/home/openxs/dbs/maria10.1/data --plugin-dir=/home/openxs/dbs/maria10.1/lib/plugin --log-error=/home/openxs/dbs/maria10.1/data/ao756.err --pid-file=ao756.pid --socket=/tmp/mariadb.sock --port=3307
openxs@ao756:~$ sudo gdb -p 7444
...
Loaded symbols for /home/openxs/dbs/maria10.1/lib/plugin/metadata_lock_info.so
0x00007f7f5a2f084d in poll () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) p threads
$1 = {<base_ilist> = {first = 0x7f7f3fbf3008, last = {
      _vptr.ilink = 0x7f7f5cea3510 <vtable for ilink+16>,
      prev = 0x7f7f3ffa1018, next = 0x0}}, <No data fields>}
As you can see, we have a global variable threads. threads->first is a pointer to (the structure that describes) the first thread (of THD * type) in the double linked list of all MySQL threads, so we can do the following:
(gdb) set $thd=(THD *)(threads->first)
(gdb) p $thd
$3 = (THD *) 0x7f7f3fbf3008
(gdb) p $thd->thread_id$4 = 14
(gdb) p $thd->global_read_lock
$5 = {m_state = Global_read_lock::GRL_NONE, m_mdl_global_shared_lock = 0x0,
  m_mdl_blocks_commits_lock = 0x0}
I had not checked any version before MySQL 5.6 yet, but I assume that all of them have list of threads, each thread's structure has a thread_id item and I suppose to see global_read_lock item as well, if not as a structure with related MDL information, then with something else (to be studied later).

In the case above, I clearly see that thread with 14 in the processlist had not set global read lock. Now, let me proceed to the next thread:
(gdb) set $thd=(THD *)($thd->next)
(gdb) p $thd
$6 = (THD *) 0x7f7f3ffa1008
(gdb) p $thd->thread_id
$7 = 13
(gdb) p $thd->global_read_lock
$8 = {m_state = Global_read_lock::GRL_ACQUIRED_AND_BLOCKS_COMMIT,
  m_mdl_global_shared_lock = 0x7f7f2e90e080,
  m_mdl_blocks_commits_lock = 0x7f7f2e90e180}
So, it is thread with id 13 that had executed FTWRL and had not yet released the global read lock. I was lucky to find it fast, but I can repeat steps above to move on to the next list item until the next is 0x0. Shane had written a similar code as a Python macro for gdb is his post I mentioned. One day I may grow up and write something similar for this case.

Other approach was invented and described by in his blog post on the same topic of getting SQL statements from core files. One can just create a big enough input file for gdb that will switch to process threads one by one and try to get information from MySQL thread structure there, assuming it was created (and ignoring errors), like this:
(gdb) thread 1[Switching to thread 1 (Thread 0x7f7f5ce6d7c0 (LWP 7444))]
#0  0x00007f7f5a2f084d in poll () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) p do_command::thd->thread_id
No frame is currently executing in block do_command(THD*).
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f7f5ce02b00 (LWP 9232))]
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
238     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: No such file or directory.
(gdb) p do_command::thd->thread_id
$9 = 14
(gdb) p do_command::thd->global_read_lock
$10 = {m_state = Global_read_lock::GRL_NONE, m_mdl_global_shared_lock = 0x0,
  m_mdl_blocks_commits_lock = 0x0}
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f7f5cd6eb00 (LWP 9230))]
#0  0x00007f7f5a2f084d in poll () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) p do_command::thd->thread_id
$11 = 13
(gdb) p do_command::thd->global_read_lock
$12 = {m_state = Global_read_lock::GRL_ACQUIRED_AND_BLOCKS_COMMIT,
  m_mdl_global_shared_lock = 0x7f7f2e90e080,
  m_mdl_blocks_commits_lock = 0x7f7f2e90e180}

...
And so on, for 1000 threads or more, if we expect that many. With some smart use of shell and tools like grep and awk on the resulting output we can probably eventually get just what we need - thread id of the thread that did FTWRL and is still keeping global read lock.

One day I may work on this kind of scripts, but for now I just wanted to show how gdb can help you to get this information.

Tuesday, April 11, 2017

Fun with Bugs #52 - On Some Bugs Fixed in MySQL 5.7.18

I had not reviewed MySQL 5.7 release notes for quite a some time in this series. Last time I checked it was MySQL 5.7.15. So, as soon as I noted new release, 5.7.18, I decided to check the release notes for any interesting fixed bug (reported by Community users in public) in the areas I am interested in: InnoDB, replication, optimizer and few others.

Note that recently most of the bugs fixed are internal only, found by Oracle engineers that never cared (or are not allowed, who knows) to report them in public, so this blog post is not even remotely a full review of what's fixed in MySQL 5.7.18 and is not a replacement for reading the detailed release notes.

So, here is the result. I've noted the following InnoDB bugs from Community users are fixed (as usual, I explicitly state who reported and verified the bug):
  • Bug #83245 - "Log parsing buffer overflow". This regression bug was reported by my former colleague from Percona, Tomislav Plavcic (based on the original findings by Peter Zaitsev), and verified by Umesh Shastry.
  • Bug #82968 "InnoDB's estimate for rows-in-table can return rows=1 due to a race condition", was reported by my colleague Sergey Petrunya while working on my MariaDB bug report, MDEV-10649 (based, in turn, on a problem one of our key customers hit in production). The bug was kindly verified by Sinisa Milivojevic.
  • Bug #80060 - "Concurrent TRUNCATE TABLEs cause stalls". It's great to see this case reported by Domas Mituzas fixed at least in 5.7.x. But it's even better to just read all the comments and arguments in that great and funny discussion he had with Sinisa Milivojevic who was working on verification...
  • Bug #80580 - "count(*) much slower on 5.7 than 5.6". This regression (not tagged as such!) bug was reported by Qi Xiaobin as optimizer one (but it turned out to be in InnoDB), and verified by  Umesh Shastry. Simple public test case there was provided by me.
  • Bug #84202 is not public, so let me simply quote (after telling you again that I hate such cases):
    "InnoDB: The row_search_mvcc() function unnecessarily traversed the entire table for a range query, which occurred when the record was not in the transaction read view. (Bug #84202, Bug #23481444, Bug #25251375)"
These bugs reported by Community users were fixed in replication:
  • Bug #83537 - "WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS timeout value handles some inputs badly". It was reported by Dan Reif and verified by Umesh Shastry. Now fractional and negative timeouts are processed in a reasonable and expected way.
  • Bug #84674 - "Having an unresolvable hostname in group_repl should not block group replication", by Kenny Gryp. I've already reviewed active group replication bugs in a separate post some time ago. Nice to see some of them fixed. The bug was probably verified by Nuno Carvalho.
  • Bug #83918 - "RBR: Wrong field length in error message". It was reported by Luis Soares from Oracle,  who probably fixed it himself. Nice example when Oracle engineers do not hesitate to report bugs in public!
  • Bug #83270 - "Performance degradation when the server has many gtids", was reported by Daniël van Eeden and quickly verified by Umesh Shastry.
The following interesting bugs in other categories were also fixed:
  • Bug #84786 - "--performance-schema-digests-size=1 leads to SIGSEGV". It was reported by Roel Van de Paar and promptly verified by Miguel Solorzano.
  • Bug #84263 - "mysql.server (MySQL Server Startup Script) can not work,and export some error.". This bug was reported by Song Boomballa. See also this my related bug report, Bug #84173 - "mysqld_safe --no-defaults & silently does NOT work any more". Nice to know this regression is finally fixed. 
  • Bug #84172 - "The pid-file value is ignored in the /etc/my.cnf option file". This nice packaging bug/difference was noted and reported by Monty Solomon.
  • Bug #83253 - "performance_schema digests the same query to multiple different digests", was reported by my former colleague Justin Swanhart. The bug was verified and probably fixed by Mayank Prasad.
  • Bug #83110 - "found_rows() returns 1 when no rows found". This nice regression bug was found and reported by Yamada Isami, and verified by Umesh Shastry
  • Bug #83005 - "When the Optimiser is using index for group-by it often gives wrong results", was reported by Yoseph Phillips and verified, again by Umesh Shastry. Somehow this regression affecting versions 5.6+ was not noted for a long time.
  • Bug #82313 - "Wrong result for EXPLAIN SELECT COUNT(*) FROM TABLE;". This is yet another optimizer regression bug in 5.7 found by Justin Swanhart. I wonder if MySQL 8 (that seemed to be affected) would include the fix... Release notes fro 8.0.1 are empty at the moment!
  • Bug #81854 - "Force index is skipped while executing select count(*)", was reported by Zhai Weixiang and verified by Umesh Shastry.
  • Bug #78244 - "SELECT DISTINCT, wrong results combined with use_index_extensions=off". It was reported by Daniel G and verified by Miguel Solorzano.
  • Bug #41908 - "Trying to start MySQL while another instance is starting: confusing error messag". The bug was reported by  Roel Van de Paar more than 8 years ago! It was verified by Sveta Smirnova. Somehow it seems that 5.7.18 had introduced a lot of changes into the way mysqld_safe works...
That's all fixed bugs that I've considered relevant for myself and this post. I've just built MySQL 5.7.18 from source on my Ubuntu 14.04 netbook while I write this post, so I do not have any personal experience with this new release, yet. But I know it works:
openxs@ao756:~/dbs/5.7$ bin/mysqld_safe --no-defaults --port=3308 --socket=/tmp/mysql57.sock --basedir=/home/openxs/dbs/5.7 --datadir=/home/openxs/dbs/5.7/data &
[1] 5202
openxs@ao756:~/dbs/5.7$ 2017-04-11T10:51:24.224518Z mysqld_safe Logging to '/home/openxs/dbs/5.7/data/ao756.err'.
2017-04-11T10:51:24.264873Z mysqld_safe Starting mysqld daemon with databases from /home/openxs/dbs/5.7/data

openxs@ao756:~/dbs/5.7$ bin/mysql -uroot --socket=/tmp/mysql57.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.18 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>
Time to start testing and looking for something new...

Sunday, April 9, 2017

Accessing Oracle tables via MariaDB CONNECT engine and ODBC

In my previous working environment it was typical to consider MariaDB Server as just a set of "random" storage engines mixed together, sometimes for unclear reasons and with questionable extra value. But even at that times I was already impressed by some features of one of the engines supported only by MariaDB, CONNECT. PIVOT table type specifically was of interest in a couple of customer cases I had to work on, but I quickly found out that the engine does not work with Oracle's MySQL or Percona Server, and thus had to forget about it for some time.

Recently while working in MariaDB I've got several more chances to deal with some new CONNECT use cases (and some related problems, that ended up mostly minor or caused by ignorance). This engine can be (and is, widely) used to access tables from virtually any other database management system or data source, as long as there is an ODBC or JDBC driver for it. I had to set up a testing environment to reproduce some of the problems noted while accessing remote tables in Oracle RDBMS, and eventually decided to document setup steps, mistakes and findings for myself and my readers.

I think the topic may require a series of posts, and in this first one I plan to concentrate on creating a basic testing environment to access Oracle RDBMS via ODBC driver and CONNECT engine on my main testing box running Fedora 25, and resolving simplest access issues, like those I described in MDEV-12355 (ended as not a bug thanks to quick help and clarifications from engine creator, Olivier Bertrand).

For pure testing purposes I had not planned to spend time installing and setting up Oracle RDBMS on that Fedora 25 box directly (later I tried and failed), but decided to rely on some existing Docker image. So, I started with installing Docker and dependencies (see this article for some hints, if needed):
[openxs@fc23 ~]$ sudo dnf install docker
...
Installed:
  container-selinux.noarch 2:2.10-1.fc25
  docker.x86_64 2:1.12.6-6.gitae7d637.fc25
  docker-common.x86_64 2:1.12.6-6.gitae7d637.fc25
  oci-register-machine.x86_64 0-2.7.gitbb20b00.fc25
  oci-systemd-hook.x86_64 1:0.1.6-1.gitfe22236.fc25
  skopeo-containers.x86_64 0.1.17-1.dev.git2b3af4a.fc25

Complete!

[openxs@fc23 ~]$ sudo systemctl start docker
I am going to skip details on images that had not worked well (not a topic for this blog). Eventually I ended up working with https://hub.docker.com/r/sath89/oracle-12c/:
[openxs@fc23 server]$ sudo docker run -d -p 8080:8080 -p 1521:1521 -e ORACLE_ALLOW_REMOTE=true sath89/oracle-12c:latest
4515bc4a6fb4804ac8d714115de12e47a9e8f9baabb2300cde672aedb4843c2b
[openxs@fc23 server]$ sudo docker ps
CONTAINER ID        IMAGE                      COMMAND             CREATED             STATUS              PORTS                                            NAMES
4515bc4a6fb4        sath89/oracle-12c:latest   "/entrypoint.sh "   14 seconds ago      Up 8 seconds        0.0.0.0:1521->1521/tcp, 0.0.0.0:8080->8080/tcp   adoring_turing

[openxs@fc23 server]$ sudo docker logs -f 4515bc4a6fb4
ls: cannot access /u01/app/oracle/oradata: No such file or directory
Database not initialized. Initializing database.
Starting tnslsnr
Copying database files
1% complete
3% complete
11% complete
18% complete
26% complete
37% complete
Creating and starting Oracle instance
40% complete
45% complete
50% complete
55% complete
56% complete
60% complete
62% complete
Completing Database Creation
66% complete
70% complete
73% complete
85% complete
96% complete
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/xe/xe.log" for further details.
Configuring Apex console
Database initialized. Please visit http://#containeer:8080/em http://#containeer:8080/apex for extra configuration if needed
Starting web management console
PL/SQL procedure successfully completed.
Starting import from '/docker-entrypoint-initdb.d':
found file /docker-entrypoint-initdb.d//docker-entrypoint-initdb.d/*
[IMPORT] /entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*
Import finished
Database ready to use. Enjoy! ;)
^C
This image initializes new database by default (unless you pass a volume on host, then it is initialized for the first time if the database is not there), so you have to wait (maybe for many minutes) and check the logs before trying to access the database. Note that container's port 1521 is mapped to host's port 1521 (and my Fedora 25 IP address, 192.168.1.85, is then used later below to access "remote" Oracle server on this port).

So, my "Oracle server" is ready (probably). Now, my Fedora 25 host will work as a client, so I had downloaded and installed basic Oracle client software RPMs (including ODBC driver and SQL*Plus) from OTN:
[openxs@fc23 maria10.1]$ rpm -q -a | grep oracle
oracle-instantclient11.2-sqlplus-11.2.0.4.0-1.x86_64
oracle-xe-11.2.0-1.0.x86_64
oracle-instantclient11.2-basic-11.2.0.4.0-1.x86_64
oracle-instantclient11.2-odbc-11.2.0.4.0-1.x86_64
I also installed unixODBC (including -devel, as I built MariaDB from source and one needs unixODBC-devel to have CONNECT engine build properly) as fine manual stated at the beginning:
[openxs@fc23 maria10.1]$ rpm -q -a | grep unixODBC
unixODBC-2.3.4-3.fc25.x86_64
unixODBC-devel-2.3.4-3.fc25.x86_64
One has to set up tnsnames.ora for Oracle client to work with (remote) instance, and /etc/odbcinst.ini + /etc/odbc.ini to inform unixODBC about the known drivers and system data sources (this discussion had some useful hints):
[openxs@fc23 maria10.1]$ cat /etc/oracle/tnsnames.ora
XE =
 ( DESCRIPTION =
  (ADDRESS_LIST =
  (ADDRESS =
  (PROTOCOL = TCP)
  (Host = 192.168.1.85)
  (Port = 1521)
  )
 )
 (CONNECT_DATA = (SID = XE)
 )
)

[openxs@fc23 maria10.1]$ cat /etc/odbcinst.ini
...

[OracleODBC]
Description = Oracle ODBC driver for Oracle 11g
Driver64 = /usr/lib/oracle/11.2/client64/lib/libsqora.so.11.1
FileUsage = 1
Driver Logging = 7

[openxs@fc23 maria10.1]$ cat /etc/odbc.ini
[oracle]
Driver = OracleODBCDSN = OracleODBC
ServerName = XE
UserID = system
Password = oracle
 Surely, I tried to access my Oracle server in Docker container via sqlplus command first:
[openxs@fc23 server]$ /usr/lib/oracle/11.2/client64/bin/sqlplus system/oracle@localhost:1521/xe.oracle.docker/usr/lib/oracle/11.2/client64/bin/sqlplus: error while loading shared libraries: libsqlplus.so: cannot open shared object file: No such file or directory
[openxs@fc23 server]$ ldd /usr/lib/oracle/11.2/client64/bin/sqlplus
        linux-vdso.so.1 (0x00007ffe8813b000)
        libsqlplus.so => not found
        libclntsh.so.11.1 => not found
        libnnz11.so => not found
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f0aadc93000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f0aad98a000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0aad76a000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x00007f0aad551000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f0aad18b000)
        /lib64/ld-linux-x86-64.so.2 (0x0000563b14bfb000)
This is what happens when one does not care to read all the details and forgets to set some environment variables and inform dynamic loader about libraries location. So, I did the following to resolve the problem:
[openxs@fc23 server]$ export ORACLE_HOME=/usr/lib/oracle/11.2/client64
[openxs@fc23 server]$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib
[openxs@fc23 server]$ export PATH=$PATH:$ORACLE_HOME/bin
[openxs@fc23 server]$ sudo ldconfig
[openxs@fc23 server]$ ldd /usr/lib/oracle/11.2/client64/bin/sqlplus
        linux-vdso.so.1 (0x00007ffd31b50000)
        libsqlplus.so => /usr/lib/oracle/11.2/client64/lib/libsqlplus.so (0x00007f88f680f000)
        libclntsh.so.11.1 => /usr/lib/oracle/11.2/client64/lib/libclntsh.so.11.1 (0x00007f88f3ea0000)
        libnnz11.so => /usr/lib/oracle/11.2/client64/lib/libnnz11.so (0x00007f88f3ad3000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f88f38b2000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f88f35a9000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f88f3389000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x00007f88f3170000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f88f2daa000)
        libaio.so.1 => /lib64/libaio.so.1 (0x00007f88f2ba8000)
        /lib64/ld-linux-x86-64.so.2 (0x000056123639a000)

[openxs@fc23 server]$ sqlplus system/oracle@localhost:1521/xe.oracle.docker    
SQL*Plus: Release 11.2.0.4.0 Production on Fri Mar 24 11:00:21 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Standard Edition Release 12.1.0.2.0 - 64bit Production

SQL> select count(*) from scott.emp;

  COUNT(*)
----------
        14

SQL> create table scott.t1(c1 number(30));

Table created.

SQL> insert into scott.t1 values (12345);

1 row created.

SQL> desc scott.t1;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 C1                                                 NUMBER(30)

SQL>

SQL> exit
Disconnected from Oracle Database 12c Standard Edition Release 12.1.0.2.0 - 64bit Production
So, SQL*Plus (that I still miss sometimes, after using it maybe only 10 times over last 12 years) client works. Let's try to access Oracle via ODBC and isql now:
[openxs@fc23 server]$ isql -v oracle
[08004][unixODBC][Oracle][ODBC][Ora]ORA-12154: TNS:could not resolve the connect identifier specified

[ISQL]ERROR: Could not SQLConnect
Surely, one more environment variable is not set, the one pointing out the location of tnsnames.ora file:
[openxs@fc23 server]$ export TNS_ADMIN=/etc/oracle
[openxs@fc23 server]$ isql -v oracle
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> select * from scott.t1;
+---------------------------------+
| C1                              |
+---------------------------------+
| 12345                           |
+---------------------------------+
SQLRowCount returns -1
1 rows fetched
SQL> select count(*) from scott.emp;+-----------------------------------------+
| COUNT(*)                                |
+-----------------------------------------+
| 14                                      |
+-----------------------------------------+
SQLRowCount returns -1
1 rows fetched
SQL> quit
[openxs@fc23 server]$

Looks like now we should be able to use ODBC table type of CONNECT storage engine of MariaDB (that was already loaded with INSTALL SONAME 'ha_connect.so'; check the details here) to access remote Oracle table:
MariaDB [test]> create table oracle_t1 engine=connect table_type=ODBC tabname='t1' dbschema='scott' connection='dsn=oracle';
ERROR 1105 (HY000): Cannot get columns from t1
MariaDB [test]> show warnings\G
*************************** 1. row ***************************
  Level: Error
   Code: 1105
Message: Cannot get columns from t1
*************************** 2. row ***************************
  Level: Error
   Code: 1030
Message: Got error 122 "Internal (unspecified) error in handler" from storage engine CONNECT
2 rows in set (0.00 sec)
Do not be like me and do not report bugs when you see these messages! Go read the fine manual that explicitly mentions Oracle as being case sensitive and converting identifiers to upper case unless they are quoted.

As soon as you use proper case in tabname (and dbname) settings, everything works as expected, at least for my primitive use case:
MariaDB [test]> create table t1_oracle engine=connect table_type=ODBC tabname='t1' dbname='scott' connection='dsn=oracle';
ERROR 1105 (HY000): Cannot get columns from t1
MariaDB [test]> create table t1_oracle engine=connect table_type=ODBC tabname='T1' dbname='scott' connection='dsn=oracle';ERROR 1105 (HY000): Cannot get columns from T1
MariaDB [test]> create table t1_oracle engine=connect table_type=ODBC tabname='T1' dbname='SCOTT' connection='dsn=oracle';Query OK, 0 rows affected (1.29 sec)

MariaDB [test]> select * from t1_oracle;
+-------+
| C1    |
+-------+
| 12345 |
+-------+
1 row in set (0.07 sec)

MariaDB [test]> create table t2_oracle engine=connect table_type=ODBC tabname='SCOTT.T1' connection='dsn=oracle';
Query OK, 0 rows affected (3.54 sec)

MariaDB [test]> select * from t2_oracle;
+-------+
| C1    |
+-------+
| 12345 |
+-------+
1 row in set (0.06 sec)
To summarize, with some initial setup efforts (that should include reading the official CONNECT manual) it is easy to access Oracle tables from MariaDB server via ODBC driver. It may be very useful if you plan to migrate or use some data from Oracle RDBMS.

As a side note, Docker really rocks sometimes. I spent maybe 15+ years installing Oracle RDBMS (starting from version 7.0.16 or so) on all kinds of OSes dozens of times, and I do not miss this experience and do not mind to have it up and running with one simple command any time (rarely) I need it.